10 Techniques to Speed Up Python Runtime

Original article was published on Deep Learning on Medium

Python is a scripting language. Compared with compiled languages like C/C++, Python has some disadvantages in efficiency and performance. However, we could use some techniques to speed up the efficiency of Python code. In this article, I will show you the speed-up techniques I usually used in my work.

The test environment is Python 3.7, with macOS 10.14.6, and 2.3 GHz Intel Core i5.

0. Optimization principles

Before diving into the details of code optimization, we need to understand some basic principles of code optimization.

  1. Make sure that the code can work normally first. Because making the correct program faster is much easier than making the fast program correct.
  2. Weigh the cost of optimization. Optimization comes with a cost. For example, less runtime usually needs more space usage, or less space usage usually need more runtime.
  3. Optimization cannot sacrifice code readability.

1. Proper Data Types Usage in Python

According to the TimeComplexity of Python, the average case of x in s operation of list is O(n). On the other hand, the average case of x in s operation of set is O(1).

We should use defaultdict for the initialization.

2. Replace list comprehension with generator expressions

# Bad: 267ms
nums_squared_list_comprehension = [num**2 for num in range(1000000)]
# Good: 5ms
nums_squared_generator_expression = (num**2 for num in range(1000000))

Another benefit of generator expression is that we can get the result without building and holding the entire list object in memory before iteration. In other words, generator expression saves memory usage.

import sys# Bad: 87632
# Good: 128

3. Replace global variables with local variables

We should put the global variables into the function. The local variable is fast than the global variable.

4. Avoid dot operation

Every time we use . to access the function, it will trigger specific methods, like __getattribute__() and __getattr__(). These methods will use the dictionary operation, which will cause a time cost. We can use from xx import xx to remove such costs.

According to technique 2, We also can assign the global function to a local function.

Furthermore, we could assign the list.append() method to a local function.

The speed of accessing self._value is slower than accessing a local variable. We could assign the class property to a local variable to speed up the runtime.

5. Avoid Unnecessary Abstraction

When use additional processing layers (such as decorators, property access, descriptors) to wrap the code, it will make the code slow. In most cases, it is necessary to reconsider whether it is necessary to use these layers. Some C/C++ programmers might follow the coding style that using the getter/setter function to access the property. But we could use a more simple writing style.

6. Avoid Data Duplication

The value_list is meaningless.

The temp is no need.

When using a + b to concatenate strings, Python will apply for memory space, and copy a and b to the newly applied memory space respectively. This is because the string data type in Python is an immutable object. If concatenating n string, it will generate n-1 intermediate results and every intermediate result will apply for memory space and copy the new string.

On the other hand, join() will save time. It will first calculate the total memory space that needs to be applied, and then apply for the required memory at one time, and copy each string element into the memory.

7. Utilize the Short Circuit Evaluation of if Statement

Python uses a short circuit technique to speed truth value evaluation. If the first statement is false then the whole thing must be false, so it returns that value. Otherwise, if the first value is true it checks the second and returns that value.

Therefore, to save runtime, we can follow the below rules:

  • if a and b: The variable a should have a high probability of False, so Python won’t calculate b.
  • if a or b: The variable a should have a higher probability of True, so Python won’t calculate b.

8. Loop optimization

for loop is faster than while loop.

We use the above example.

We move the sqrt(x) from inner for loop to outer for loop.

9. Use numba.jit

Numba can compile the Python function JIT into machine code for execution, which greatly improves the speed of the code. For more information about numba, see the homepage.

We use the example in technique 7.

We move the sqrt(x) from inner for loop to outer for loop.

10. Use cProfile to Locate Time Cost Function

`cProfile` will output the time usage of each function. So we can find the time cost function.

Check out my other posts on Medium with a categorized view!
Xu Liang