Many programmers love Python. The language is remarkably easy to use, and code written in it is intuitive and readable. Yet conversations about Python keep returning to the same complaint, especially when C experts are involved: "Python is slow." And those who say so are not wrong.
Compared to many other programming languages, Python is, indeed, slow.
Here are the test results that compare the performance of different programming languages in solving various problems.
There are several ways to speed up Python programs. For example, you can use libraries designed to take advantage of multiple processor cores. Those who work with NumPy, pandas, or scikit-learn may want to look at the RAPIDS software package, which lets you use the GPU for scientific computing.
All of these techniques work well when the task at hand can be parallelized, as is the case with data preprocessing or matrix operations.
But what if your code is pure Python? What if you have a large for loop that you absolutely must use, and it cannot be parallelized because the data in it must be processed sequentially? Is there any way to speed up Python itself?
The answer is Cython, a project that can significantly speed up code written in Python.
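To make the kind of loop in question concrete, here is a minimal sketch of a computation where each iteration needs the result of the previous one. The function name exponential_moving_average and the alpha parameter are illustrative assumptions, not something from the original benchmark.

```python
def exponential_moving_average(values, alpha=0.5):
    """Smooth a sequence; each output depends on the previous output."""
    smoothed = []
    previous = None
    for value in values:
        if previous is None:
            previous = value  # the first element seeds the recurrence
        else:
            # This step needs the result of the previous iteration, so the
            # loop cannot simply be split into independent chunks.
            previous = alpha * value + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


print(exponential_moving_average([1.0, 2.0, 3.0, 4.0]))
# prints [1.0, 1.5, 2.25, 3.125]
```

Loops of this shape spend their time in interpreter overhead rather than in work that extra cores could share, which is why they are natural candidates for compilation.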
What is Cython?
At its core, Cython is an intermediate layer between Python and C/C++. It lets you write regular Python code with some minor modifications, which is then translated directly into C code.
The main change to the Python code is adding type information to variables. In regular Python, you declare a variable like this:
x = 0.5
In Cython, you can specify the type when declaring the variable:
cdef float x = 0.5
This construct tells Cython that the variable is a floating point number. Variables are declared in C by the same principle. In ordinary Python, variable types are determined dynamically at run time. Cython can compile untyped Python code too, but the explicit type declarations are what allow it to generate efficient C code, because C requires every variable to have a declared type.
Cython installation is extremely simple:
pip install cython
Types in Cython
Cython has two kinds of type declarations: one for variables and one for functions.
For variables, the following types are available:
cdef int a, b, c
cdef char *s
cdef float x = 0.5 (single precision number)
cdef double x = 63.4 (double precision number)
cdef list names
cdef dict goals_for_each_play
cdef object card_deck
Note that int, char *, float, and double are in fact C/C++ types, while list, dict, and object refer to Python objects.
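The difference between single and double precision can be observed from plain Python without compiling anything. The sketch below uses the standard ctypes module as a stand-in for C storage; this is an illustration of the C types, not of how Cython itself stores variables.

```python
import ctypes

single = ctypes.c_float(63.4)   # 32-bit storage, like `cdef float`
double = ctypes.c_double(63.4)  # 64-bit storage, like `cdef double`

# A Python float is a C double underneath, so the double round-trips exactly,
# while the single precision value is rounded to the nearest 32-bit float.
print(single.value == 63.4)
print(double.value == 63.4)
print(ctypes.sizeof(ctypes.c_float), ctypes.sizeof(ctypes.c_double))
```

Choosing float over double halves the storage per value at the cost of precision, so double is usually the safer default when a Python float is being replaced.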
When working with functions, the following types are available to us:
def - a regular Python function, callable from Python code.
cdef - a Cython function that cannot be called from regular Python code. Such functions can only be called from within Cython code.
cpdef - a function that can be called from both C and Python.
Now that we've covered Cython's types, let's turn to speeding up Python code.
Speeding up code using Cython
Let's start by creating a Python benchmark: a for loop that calculates the factorial of a number. The pure Python code looks like this:
def test(x):
    y = 1
    for i in range(1, x+1):
        y *= i
    return y
The Cython equivalent of this function is very similar to the original. The code must be placed in a file with the .pyx extension. The only change needed is to add type information for the variables and the function:
cpdef int test(int x):
    cdef int y = 1
    cdef int i
    for i in range(1, x+1):
        y *= i
    return y
Note that the function is declared with the cpdef keyword, which allows it to be called from Python. The variable i, which serves as the loop counter, is also given a type. Remember to type all the variables declared in the function, so that the C compiler knows which types to use. Keep in mind that a C int is typically 32 bits wide, so this version returns correct results only up to x = 12; larger factorials overflow.
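One consequence of typing y as a C int is that the result no longer grows without bound the way a Python integer does. The sketch below imitates that in plain Python, assuming a 32-bit int and the wraparound behavior commonly seen in practice (signed overflow is formally undefined in C). The helper name factorial_int32 is hypothetical.

```python
import ctypes
import math


def factorial_int32(x):
    """Factorial with the running product forced into a signed 32-bit int."""
    y = 1
    for i in range(1, x + 1):
        # c_int32 performs no overflow check; it keeps only the low 32 bits.
        y = ctypes.c_int32(y * i).value
    return y


print(factorial_int32(12), math.factorial(12))  # still equal
print(factorial_int32(13), math.factorial(13))  # the 32-bit result has wrapped
```

If larger inputs matter, declaring the variables as long long or double in the Cython version extends the usable range, at the cost of either a wider integer or floating point rounding.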
Now create setup.py, which will help us convert the Cython code to C code:
# distutils was removed in Python 3.12; setuptools provides the same setup()
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize('run_cython.pyx'))
Let's compile:
python setup.py build_ext --inplace
The C code is now ready for use.
If you look in the folder containing the Cython code, you will find all the files needed to run the C code, including run_cython.c. If you're curious, open this file and look at the C code that Cython generated.
Now we are ready to test our ultrafast C code. Below is the code used to test and compare the two versions of the program.
import time

import run_python
import run_cython

number = 10

# perf_counter() is a high-resolution clock intended for timing short intervals
start = time.perf_counter()
run_python.test(number)
end = time.perf_counter()
py_time = end - start
print("Python time = {}".format(py_time))

start = time.perf_counter()
run_cython.test(number)
end = time.perf_counter()
cy_time = end - start
print("Cython time = {}".format(cy_time))

print("Speedup = {}".format(py_time / cy_time))
This code is very simple. We import the compiled module just like a regular Python file and call its function exactly as we would call any ordinary Python function.
Take a look at the following table. You may notice that the Cython version of the program is faster than its Python version in all cases. The larger the task, the greater the acceleration that Cython provides.
Summary
Cython can significantly speed up a great deal of Python code with little effort. The more loops a program contains and the more data it processes, the better the results you can expect from Cython.
Dear readers! Do you use Cython in your projects?