Get started with async in Python

Wednesday, February 26, 2025, 10:00, by InfoWorld
Asynchronous programming, or async, is a feature of many modern languages that allows a program to juggle multiple operations without waiting or getting hung up on any one of them. It’s a smart way to efficiently handle tasks like network or file I/O, where most of the program’s time is spent waiting for tasks to finish.

Consider a web scraping application that opens 100 network connections. You could open one connection, wait for the results, then open the next and wait for the results, and so on. Most of the time the program runs is spent waiting on a network response, not doing actual work.

Async gives you a more efficient method: Open all 100 connections at once, then switch between active connections as they return results. If one connection isn’t returning results, switch to the next one, and so on, until all connections have returned their data. Time typically wasted waiting for any one thing to finish is used to check if other things have finished.

Async syntax is now a standard feature in Python, but not all Python developers are familiar with it. In this article, we’ll explore how asynchronous programming works in Python, and how to put it to use in your code.

When to use asynchronous programming

In general, the best times to use async are when you’re trying to do work that has the following traits:

The work takes a long time to complete.

The delay involves waiting for I/O (disk or network) operations, not computation.

The work involves many I/O operations happening at once, or one or more I/O operations happening when you’re also trying to get other tasks done.

Async lets you set up multiple tasks concurrently and iterate through them efficiently, without blocking the rest of your application.

Some examples of tasks that work well with async:

Web scraping, as described above.

Network services (e.g., a web server or framework).

Programs that coordinate results from multiple sources that take a long time to return values (for instance, simultaneous database queries).

It’s important to note that asynchronous programming is different from multithreading or multiprocessing. Async operations all run in the same thread, but they yield to one another as needed, making async more efficient than threading or multiprocessing for many kinds of tasks. (More on this below.)

Using Python’s async and await keywords

Python has two keywords, async and await, for creating async operations. Consider this script:

def get_server_status(server_addr):
    # A potentially long-running operation...
    return server_status

def server_ops():
    results = []
    results.append(get_server_status('addr1.server'))
    results.append(get_server_status('addr2.server'))
    return results

An async version of the same script—not fully functional or well-formed, just enough to give us an idea of how the syntax works—might look like this.

async def get_server_status(server_addr):
    # A potentially long-running operation...
    return server_status

async def server_ops():
    results = []
    results.append(await get_server_status('addr1.server'))
    results.append(await get_server_status('addr2.server'))
    return results

Functions prefixed with the async keyword become asynchronous functions, also known as coroutines. Coroutines behave differently from regular functions:

Coroutines can use the keyword await, which allows a coroutine to wait for results from another coroutine without blocking. Until results come back from the await-flagged coroutine, Python switches freely among other running coroutines.

Coroutines can only be called from other async functions. If you run server_ops() or get_server_status() as-is from the body of the script, you won’t get their results; you’ll get a Python coroutine object, which can’t be used directly.

In this example, the awaited statements would allow the server_ops() function to yield to other async functions if they were also running. If these two functions were the only async operations in the program, there wouldn’t be any real benefit to using async, since the await statements would force them to complete one at a time.
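As the second point above notes, calling a coroutine like an ordinary function doesn't actually run it. Here is a minimal, runnable sketch of that behavior, with asyncio.sleep() standing in for the long-running operation:

import asyncio

async def get_server_status(server_addr):
    # Stand-in for a potentially long-running operation
    await asyncio.sleep(1)
    return f"{server_addr}: OK"

# Calling the coroutine like a normal function does not run it; it only
# creates a coroutine object (and Python warns that it was never awaited).
print(get_server_status('addr1.server'))
# <coroutine object get_server_status at 0x...>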

So, if we can’t call async functions from non-asynchronous functions, and we can’t run async functions directly, how do we use them? Answer: By using the asyncio library, which bridges async and the rest of Python.

Using async and await with asyncio

Here is an example (again, not fully functional but illustrative) of how one might write a web scraping application using async and asyncio. This script takes a list of URLs and uses multiple instances of an async function from an external library (read_from_site_async()) to download them and aggregate the results.

import asyncio
from web_scraping_library import read_from_site_async

async def main(url_list):
    return await asyncio.gather(*[read_from_site_async(_) for _ in url_list])

urls = ['http://example.com/1', 'http://example.com/2', 'http://example.com/3']  # placeholder URLs
results = asyncio.run(main(urls))
print(results)

In the above example, we use two common asyncio functions:

asyncio.run() is used to launch an async function from the non-asynchronous part of our code, and thus kick off all the program’s async activities. (This is how we run main().)

asyncio.gather() takes one or more async-decorated functions (in this case, several instances of read_from_site_async() from our hypothetical web-scraping library), runs them all, and waits for all of the results to come in.

The idea is to start the read operation for all of the sites at once, then gather the results as they arrive (hence asyncio.gather()). We don’t wait for any one operation to complete before moving to the next one.

In the previous example, we were awaiting each get_server_status() call, which was not an efficient way to use async. In this example, we’re dispatching all the async calls at once, so they can all run side by side, then we wait for them all to complete at once.
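Since web_scraping_library and read_from_site_async() are hypothetical, here is a self-contained sketch of the same pattern that you can actually run. It swaps the scraper for an asyncio.sleep()-based stand-in and uses placeholder URLs, but dispatches and gathers the results in exactly the same way:

import asyncio
import random

async def read_from_site_async(url):
    # Stand-in for a real network fetch: wait a random amount of time
    delay = random.uniform(0.5, 2.0)
    await asyncio.sleep(delay)
    return f"{url}: fetched in {delay:.2f}s"

async def main(url_list):
    # Dispatch all the fetches at once, then wait for all of them to finish
    return await asyncio.gather(*[read_from_site_async(u) for u in url_list])

urls = ['http://example.com/1', 'http://example.com/2', 'http://example.com/3']
results = asyncio.run(main(urls))
print(results)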

Elements of Python async apps

You’ve seen how Python async apps use coroutines as their main ingredient, drawing on the asyncio library to run them. Let’s look at a few other elements that are key to asynchronous applications in Python.

Event loops

The asyncio library creates and manages event loops, the mechanisms that run coroutines until they complete. Only one event loop should be running at a time in a Python process, if only to make it easier for the programmer to keep track of what goes into it.

Task objects

When you submit a coroutine to an event loop for processing, you can get back a Task object, which provides a way to control the behavior of the coroutine from outside the event loop. If you need to cancel the running task, for instance, you can do that by calling the task’s .cancel() method.
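For instance, here is a minimal, self-contained sketch, with long_job() as a made-up stand-in for real work, that starts a task, cancels it from outside the coroutine, and handles the resulting CancelledError:

import asyncio

async def long_job():
    # Simulate a job that would take a long time to finish
    await asyncio.sleep(60)
    return "done"

async def main():
    task = asyncio.create_task(long_job())
    await asyncio.sleep(1)   # let the task get underway
    task.cancel()            # request cancellation from outside the coroutine
    try:
        await task
    except asyncio.CancelledError:
        print("Task was cancelled before it finished")

asyncio.run(main())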

Here’s a slightly different version of the site-scraper script above. This version shows the event loop and tasks at work:

import asyncio
from web_scraping_library import read_from_site_async

tasks = []

async def main(url_list):
    for n in url_list:
        tasks.append(asyncio.create_task(read_from_site_async(n)))
    print(tasks)
    return await asyncio.gather(*tasks)

urls = ['http://example.com/1', 'http://example.com/2', 'http://example.com/3']  # placeholder URLs
loop = asyncio.get_event_loop()
results = loop.run_until_complete(main(urls))
print(results)

This script is more explicit in how it uses the event loop and task objects:

The .get_event_loop() method provides an object we can use to control the event loop directly, by submitting async functions to it programmatically via .run_until_complete(). In the previous script, we could only run a single top-level async function, using asyncio.run(). (The .run_until_complete() method does exactly what it says: It runs all of the supplied tasks until they’re done, then returns their results in a single batch.)

The .create_task() method takes a coroutine to run (here, a call to read_from_site_async() with its URL argument) and gives us back a Task object wrapping it. Here we submit each URL as a separate Task to the event loop, and store the Task objects in a list. Note that we can only do this inside the event loop—that is, inside an async function.

How much control you need over the event loop and its tasks will depend on how complex the application is that you’re building. If you just want to submit a set of fixed jobs to run concurrently, as with our web scraper, you won’t need a whole lot of control—just enough to launch jobs and gather the results.

Here’s a slightly simpler version of this script:

import asyncio
from web_scraping_library import read_from_site_async

tasks = []

async def main(url_list):
    for n in url_list:
        tasks.append(asyncio.create_task(read_from_site_async(n)))
    print(tasks)
    return await asyncio.gather(*tasks)

urls = ['http://example.com/1', 'http://example.com/2', 'http://example.com/3']  # placeholder URLs
results = asyncio.run(main(urls))
print(results)

In this version, we just use asyncio.run() to execute the async function main() and wait for its results. All the details about the event loop are handled automatically by asyncio.run(). Most of the time, this approach works fine.

By contrast, if you’re creating a full-blown web framework, you’ll want far more control over the behavior of the coroutines and the event loop. For instance, you may need to shut down the event loop gracefully in the event of an application crash, or run tasks in a threadsafe manner if you’re calling the event loop from another thread.
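For the threadsafe case, asyncio provides run_coroutine_threadsafe(), which lets code in another thread hand a coroutine to a running event loop. Here is a minimal sketch of that arrangement; fetch_status() is just a placeholder coroutine and the timings are only illustrative:

import asyncio
import threading

async def fetch_status(name):
    await asyncio.sleep(1)      # stand-in for real async work
    return f"{name}: OK"

def submit_from_thread(loop):
    # Runs in a plain thread: hand a coroutine to the running event loop
    future = asyncio.run_coroutine_threadsafe(fetch_status("worker"), loop)
    print(future.result())      # blocks this thread until the coroutine finishes

async def main():
    loop = asyncio.get_running_loop()
    thread = threading.Thread(target=submit_from_thread, args=(loop,))
    thread.start()
    await asyncio.sleep(2)      # keep the event loop running while the thread uses it
    thread.join()

asyncio.run(main())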

Why not just use threads or multiprocessing?

At this point, you may be wondering: Why use async instead of threads or multiprocessing, both of which have been long available in Python?

First, there is a key difference between async and threads or multiprocessing, even apart from how those things are implemented in Python. Async is about concurrency, while threads and multiprocessing are about parallelism. Concurrency involves dividing time efficiently among multiple tasks at once—e.g., checking your email while waiting in line at a register in the grocery store. Parallelism involves multiple agents processing multiple tasks side by side—e.g., having five separate registers open at the grocery store.

Most of the time, async is a good substitute for threading as threading is implemented in Python. This is because Python’s threads are constrained by the global interpreter lock (GIL), which allows only one thread to execute Python bytecode at a time. Compared with threads, async provides some key advantages:

Async functions are far more lightweight than threads. Tens of thousands of asynchronous operations running at once will have far less overhead than tens of thousands of threads.

The structure of async code makes it easier to reason about where tasks pick up and leave off. This means data races and thread safety are less of an issue. Because all tasks in the async event loop run in a single thread, it’s easier for Python (and the developer) to serialize how they access objects in memory.

Async operations can be canceled and manipulated more readily than threads. The Task object we get back from asyncio.create_task() provides us with a handy way to do this.

Multiprocessing in Python, on the other hand, is best for jobs that are heavily CPU-bound rather than I/O-bound. Async actually works hand in hand with multiprocessing, as you can use the event loop’s run_in_executor() method to delegate CPU-intensive jobs to a process pool from a central process, without blocking that central process.
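Here is a brief sketch of that hand-off; crunch_numbers() is just an example CPU-bound workload, and the work is delegated to a ProcessPoolExecutor so the event loop stays responsive:

import asyncio
from concurrent.futures import ProcessPoolExecutor

def crunch_numbers(n):
    # CPU-bound work that would block the event loop if run directly
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Hand the heavy work to a worker process; the event loop stays
        # free to run other coroutines while it waits for the result
        result = await loop.run_in_executor(pool, crunch_numbers, 10_000_000)
    print(result)

if __name__ == "__main__":     # needed for process pools on spawn-based platforms
    asyncio.run(main())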

Multithreading in Python 3.13

Python versions from 3.13 forward feature an experimental free-threaded build that allows Python threads to run with full parallelism, in the same way that multiprocessing does, but without multiprocessing’s cross-process overhead. This highly anticipated feature has been in the works for some time. You can experiment with it if you want to, but it’s still something to use cautiously.

Also, even with these new changes, async tasks will still scale better than threads at the high end. Remember: A thousand async tasks can be handled with far fewer resources and far less overhead than a thousand threads.
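As a rough illustration (a sketch, not a benchmark), the following snippet launches 10,000 concurrent tasks, each simulating a one-second I/O wait; the whole batch typically finishes in little more than a second, with far less overhead than 10,000 OS threads would require:

import asyncio
import time

async def tiny_task(n):
    # Each task just waits briefly, simulating a short I/O operation
    await asyncio.sleep(1)
    return n

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(tiny_task(n) for n in range(10_000)))
    print(f"Ran {len(results)} tasks in {time.perf_counter() - start:.2f}s")

asyncio.run(main())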

Next steps with Python async

The best first step is to build a few simple async apps of your own. Good examples abound, especially now that Python’s async syntax has had several releases to settle down and become more widely used. The official documentation for asyncio is worth reading to see what it offers, even if you don’t plan to use all of its functions. (Also see my recent introduction to using asyncio.)

You might also explore the growing number of async-powered libraries and middleware, many of which provide asynchronous, non-blocking versions of database connectors, network protocols, and the like. The aio-libs repository has some key ones, such as the aiohttp library for web access. It is also worth searching the Python Package Index for libraries with the async keyword. With something like asynchronous programming, the best way to learn is to see how others have done it.
https://www.infoworld.com/article/2264616/get-started-with-async-in-python.html
