4 tips for getting started with free-threaded Python
Wednesday, July 16, 2025, 11:00, by InfoWorld
Until recently, Python threads ran without true parallelism, as threads yielded to each other for CPU-bound operations. The introduction of free-threaded or ‘no-GIL’ builds in Python 3.13 was one of the biggest architectural changes to the CPython interpreter since its creation. Python threads were finally free to run side by side, at full speed. The availability of a free-threaded build is also one of the most potentially disruptive changes to Python, throwing many assumptions about how Python programs handle threading out the window.
Free-threaded Python was experimental until recently, but with the release of Python 3.14 beta 3, it is officially supported. The free-threaded build is still optional, so now is a good time to start experimenting with true parallelism in Python. Here are four tips to help you get started.

Ask yourself how Python threading can help

Python threading has long been unsuited for true parallelism, so until now, you might not have considered it as a possibility. Now that free-threaded Python is officially supported, you can start seriously considering it for your use cases.

Threads in any language take a divide-and-conquer approach to a task. Anything that is “embarrassingly parallel” is a good fit for threads. That said, not every problem splits evenly across threads; some problems lose elsewhere whatever they gain from threading.

As an example, if you have many jobs that each produce a file, giving each job its own thread is less effective when every job also writes its own file. This is because writing files is an inherently serial operation. A better approach would be to divide the jobs across threads and use one dedicated thread for writing to disk. As each job finishes, it sends its output to the disk-writing thread. This way, jobs don’t block each other and aren’t themselves blocked by file writing.

Use the highest-level abstraction available for threads

There are several layers of abstraction available in Python threading:

You can create threads directly and manage them using threading.Thread. However, you’re responsible for the lifetime of each thread, and you also have to manage waiting for threads to finish and getting results from them. This is okay for programs where other operations block until the threads finish running, but not good for more advanced tasks.

The concurrent.futures.ThreadPoolExecutor is a higher-level abstraction. This creates a pool of threads (you can set how many) to respond to incoming requests.
You can submit jobs to the pool and get results back at your leisure, so your program doesn’t block while waiting for jobs to finish.

Both of these are abstractions over a much lower-level _thread module, which in turn is an abstraction for operating system-level thread handling. The higher the level of abstraction you use, the more likely free-threaded Python will behave as expected.

As a general rule, any threads you use in Python (as opposed to somewhere external like an extension module) should be created in Python. You can, in theory, create a thread in a CPython extension and register it with the interpreter, but there’s not much point in duplicating the work the interpreter is meant to do and already does well.

If you’re already using the ProcessPoolExecutor as an abstraction, you can swap that for a ThreadPoolExecutor in the free-threaded build quite easily. The two have the same interfaces, so it amounts to little more than editing an import.

Make sure Cython modules are thread-safe

Easily the biggest stumbling block for free-threaded Python is ensuring CPython extensions, written in C (or something with a C-compatible interface), respect the new free-threaded design.

Cython, a key tool for authoring C extensions for Python, closely tracks changes to the CPython runtime. Recently, Cython’s maintainers added support for CPython’s free-threaded builds, but you still have to ensure your code is thread-safe. Specifically, it needs to be thread-safe when you interact with Python objects.

If you’re already confident your code is thread-safe, you can add a directive to your Cython module and test it out in free-threaded builds:

    # cython: freethreading_compatible = True

This marks the module as being compatible with free-threaded builds. If you run the free-threaded build and import a module that isn’t marked this way, the interpreter will automatically re-enable the GIL for safety’s sake.
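One way to see whether an import has caused the GIL to be re-enabled is to ask the interpreter directly. The following is a minimal sketch, assuming Python 3.13 or later for sys._is_gil_enabled() (it falls back to assuming the GIL is on when that function is missing). The helper name gil_status is hypothetical and not part of any library.

```python
import sys
import sysconfig


def gil_status() -> dict:
    """Report whether this is a free-threaded build and whether the GIL is active.

    gil_status is a hypothetical helper name used only for illustration.
    """
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on Python 3.13+; on older versions
    # there is no free-threaded build, so assume the GIL is on.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(gil_status())
```

Calling gil_status() before and after importing an extension module on a free-threaded build shows whether that import switched gil_enabled back to True.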
To add more thread safety to an existing Cython module, you can use a couple of tools added to Cython to make the job easier:

Critical sections: This context manager takes some Python object and creates for it a CPython critical section, or local lock, for the duration of the context block. It can also be used as a function decorator, typically for class methods (with the lock applied to the class instance). Critical sections automatically prevent deadlocks, but at the cost of not guaranteeing that the lock will be held continuously through the critical section: it might be released and then reacquired if some other object needs it more.

PyMutex locks: These are more robust locks, which you acquire and release explicitly. Note that if you use them in non-free-threaded builds (such as for backward compatibility), reacquiring the GIL during a PyMutex lock entails the risk of a deadlock.

Don’t share iterators or frame objects between threads

Some objects should never be shared between threads because they have internal state that isn’t thread-safe. Two common examples are iterators and frame objects.

An iterator object in Python yields a stream of objects based on some internal state. A generator, for instance, is a common way to create an iterator object. If you create an iterator, don’t attempt to pass it between threads. You can share the objects yielded by an iterator, as long as they’re thread-safe, but don’t share the iterator itself across thread boundaries.

For instance, to create an iterator that produces the letters from a string one after the other, you could do this:

    data = 'abcdefg'
    d_iter = iter(data)
    item = next(d_iter)
    item2 = next(d_iter)
    #... etc.

In this example, d_iter is the iterator object. You could share data and item (or item2, etc.) between threads, but you can’t share d_iter itself between threads, as that’s likely to corrupt its internal state.
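As a rough illustration of sharing yielded items rather than the iterator, the sketch below keeps d_iter in the calling thread and lets a ThreadPoolExecutor hand each letter to a worker. The function names shout and process_letters are hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def shout(letter: str) -> str:
    """Hypothetical per-item job; it receives a yielded item, never the iterator."""
    return letter.upper()


def process_letters(data: str, max_workers: int = 4) -> list[str]:
    """Process each letter of data in a thread pool without sharing the iterator."""
    d_iter = iter(data)  # the iterator stays in the calling thread
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() pulls items from d_iter in the calling thread and passes
        # each item (not the iterator) to a worker; results keep input order.
        return list(pool.map(shout, d_iter))


if __name__ == "__main__":
    print(process_letters("abcdefg"))
```

The workers only ever see individual strings, which are immutable and safe to share, so the iterator's internal state is touched by a single thread.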
Python frame objects contain information about the state of a program at a particular point in its execution. Among other things, they’re used by Python’s debugging mechanisms to produce details about the program when an error condition arises.

Frame objects also aren’t thread-safe. If you access frame objects from within your Python program via sys._current_frames(), which returns the frames of every running thread, you’re likely to experience problems in the free-threaded build. However, inspect.currentframe() and sys._getframe() are safer, as long as you don’t share the frames they return between threads.
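A minimal sketch of the safer pattern: each thread calls inspect.currentframe() for itself, copies out plain values, and never hands the frame object to another thread. The names describe_current_frame, worker, and snapshot_all are hypothetical helpers for illustration.

```python
import inspect
import threading


def describe_current_frame() -> str:
    """Return plain data about this function's own frame, not the frame object."""
    frame = inspect.currentframe()
    try:
        # currentframe() may return None on implementations without frame support.
        if frame is None:
            return "<no frame support>"
        return f"{frame.f_code.co_name}:{frame.f_lineno}"
    finally:
        # Drop the local reference so the frame object does not outlive this call.
        del frame


def worker(results: dict, name: str) -> None:
    # Each thread inspects only its own frame and stores a string.
    results[name] = describe_current_frame()


def snapshot_all(names: list[str]) -> dict:
    """Run one thread per name and collect the string each thread produced."""
    results: dict = {}
    threads = [threading.Thread(target=worker, args=(results, n)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(snapshot_all(["a", "b", "c"]))
```

Only strings cross thread boundaries here; the frame objects themselves stay inside the thread that created them.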
https://www.infoworld.com/article/4018856/4-tips-for-getting-started-with-free-threaded-python.html