Talk given at the June 2008 meeting of the New Zealand Python User Group in Auckland.
Outline: An overview of approaches to parallel/concurrent programming in Python.
Code demonstrated in the presentation can be found here:
http://www.kloss-familie.de/moin/TalksPresentations
Beating the (sh** out of the) GIL - Multithreading vs. Multiprocessing
1. Threading Theory Multiprocessing Others Conclusion Finalise
Beating the (sh** out of the) GIL
Multithreading vs. Multiprocessing
Hair dryer, 1920s. Dark Roasted Blend:
http://www.darkroastedblend.com/2007/01/retro-technology-update.html
Guy K. Kloss | Multithreading vs. Multiprocessing 1/36
Guy K. Kloss
Computer Science
Massey University, Albany
New Zealand Python User Group Meeting
Auckland, 12 June 2008
Outline
1 Threading
2 Theory
3 Multiprocessing
4 Others
5 Conclusion
Source: http://blog.snaplogic.org/?cat=29
What People Think Now
Threading and shared memory are common
(thanks to Windows and Java)
Python supports threads (Yay!)
Python also supports easy forking (Yay!)
The GIL . . . is a problem for pure Python,
non I/O bound applications
Lots of people "understand" threads . . .
. . . and fail to use them properly
What People Think Now
Blog post by Mark Ramm, 14 May 2008
A multi threaded system is particularly important for people
who use Windows, which makes multi-process computing
much more memory intensive than it needs to be. As my
grandma always said Windows can't fork worth a damn. ;)
[. . . ]
So, really it's kinda like shared-memory optimized
micro-processes running inside larger OS level processes, and
that makes multi-threaded applications a lot more
reasonable to wrap your brain around. Once you start down
the path of lock management the non-deterministic character
of the system can quickly overwhelm your brain.
Simple Threading Example
import time
from threading import Thread

import config  # assumed: the talk's own settings module, provides config.segments
from stuff import expensiveFunction  # the talk's own CPU-heavy function


class MyClass(Thread):
    def __init__(self, argument):
        Thread.__init__(self)  # Initialise the thread
        self.argument = argument

    def run(self):
        # Executed in the new thread once start() has been called
        self.value = expensiveFunction(self.argument)


callObjects = []
for i in range(config.segments):
    callObjects.append(MyClass(i))
for item in callObjects:
    item.start()
# Do something else.
time.sleep(15.0)
for item in callObjects:
    item.join()  # wait for the thread before reading its result
    print(item.value)
Our Example with Threading
Our fractal example, now with threading.
Just a humble hair-dryer from the 30s: "One of the first machines used
for permanent wave hairstyling back in the 1920's and 1930's."
Dark Roasted Blend:
http://www.darkroastedblend.com/2007/05/mystery-devices-issue-2.html
The GIL
Global Interpreter Lock
What is it for?
Cooperative multitasking
Interpreter knows when it's "good to switch"
Often more efficient than preemptive multi-tasking
Can be released from native (C) code extensions
(done for I/O intensive operations)
Is it good?
Easy coding
Easy modules/extensions
Large base of available modules already
Speed improvement by a factor of 2
(for single-threaded applications)
Keeps code safe
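A minimal sketch of what the GIL means in practice, assuming CPython with the GIL enabled: the same pure-Python, CPU-bound work is run once sequentially and once in four threads. The names count_primes, run_sequential and run_threaded are illustrative and not from the talk. On such an interpreter the threaded run is usually no faster, since only one thread executes bytecode at a time.

```python
import threading
import time


def count_primes(limit):
    """CPU-bound pure-Python work: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limits):
    """Process every limit one after the other in the calling thread."""
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    """Process every limit in its own thread and collect the results."""
    results = [None] * len(limits)

    def worker(index, limit):
        results[index] = count_primes(limit)

    threads = [threading.Thread(target=worker, args=(i, limit))
               for i, limit in enumerate(limits)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    limits = [20000] * 4
    for label, runner in (("sequential", run_sequential),
                          ("threaded", run_threaded)):
        start = time.perf_counter()
        runner(limits)
        print("%-10s %.2f s" % (label, time.perf_counter() - start))
```

Timings depend on the machine and interpreter build, so none are quoted here; the point is only to compare the two printed numbers.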
The GIL
Alternatives
Other implementations
CPython uses it
Jython doesn't
IronPython doesn't
They use their own/internal threading mechanisms
Is it a design flaw?
Maybe . . . but . . .
Fierce/intense discussions about changing the code base
Solutions that offer other benefits:
Processes create fewer inherent deadlock situations
Processes also scale to multi-host scenarios
Doug Hellmann in Python Magazine 10/2007:
Techniques using low-level, operating system-specific,
libraries for process management are as passé as using
compiled languages for CGI programming. I don't have time
for this low-level stuff any more, and neither do you. Let's
look at some modern alternatives.
GIL-less Python
There was an attempt/patch "way back then ..."
There's a new project now by Adam Olsen
Python 3000 with "free threading" [1]
Using Monitors to isolate state
Design focus: usability
(for common cases, maintainable code)
Optional at compile time using --with-freethread
Sacrificed single-threaded performance
(60-65 %, but equivalent to threaded CPython)
Automatic deadlock detection
(detection/breaking, giving exceptions/stack trace)
Runs on Linux and OS X
Parallelisation in General
CPU vs. I/O bottlenecks
Threading: Good for I/O constraints
This talk aims at CPU constraints
Threads vs. Processes
Threads: Within a process on one host
Processes: Independent, at OS level
Processes:
Are heavier in memory/overhead
Have their own name space and memory
Involve fewer problems with competing access
to resources and their management
But:
On UN*X/Linux: Process overhead is very low
CPython is inefficient in handling threads
Stackless Python is much more efficient at threading
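To illustrate why threading is still a good fit for I/O constraints, here is a small sketch in which time.sleep stands in for a blocking network or disk call (an assumption made so the example runs anywhere). The names fake_download and fetch_all are hypothetical. Blocking calls like this release the GIL, so the waits overlap instead of adding up.

```python
import threading
import time


def fake_download(name, delay, results):
    """Pretend to wait on I/O; sleeping releases the GIL like a blocking read."""
    time.sleep(delay)
    results[name] = "done after %.1f s" % delay


def fetch_all(delays):
    """Run one thread per simulated download and wait for all of them."""
    results = {}
    threads = [threading.Thread(target=fake_download,
                                args=(name, delay, results))
               for name, delay in delays.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    start = time.perf_counter()
    print(fetch_all({"a": 0.3, "b": 0.3, "c": 0.3, "d": 0.3}))
    # Expect roughly the longest single delay, not the sum of all delays.
    print("elapsed: %.2f s" % (time.perf_counter() - start))
```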
Abstraction Level vs. Control
Abstraction levels for parallel computing models [7]
Level  Parallelism  Communication  Synchronisation
4      implicit     implicit       implicit
3      explicit     implicit       implicit
2      explicit     explicit       implicit
1      explicit     explicit       explicit
Explicit: The programmer specifies it in the parallel program
Implicit: A compiler/runtime system derives it from other information
Abstraction Level vs. Control
Low level: Close to hardware
Must specify parallelism
. . . communication
. . . and synchronisation
→ Best means for performance tuning
→ Premature optimisation?
High level: Highest machine independence
More/all handled by computing model
Up to automatic parallelisation approaches
Both extremes have not been very successful to date
Most developments now:
Level 3 for specific purposes
Level 1 for general programming
(esp. in the scientific community)
With Python, a consistent level 2 is possible
Common for Parallel Computing
Message Passing Interface (MPI)
for distributed memory
OpenMP
shared memory multi-threading
The two do not have to be categorised like this
Art by "Teknika Molodezhi," Russia 1966
Dark Roasted Blend: http://www.darkroastedblend.com/2008/01/retro-future-mind-boggling.html
Processing around the GIL
Smart multi-processing
Smart task farming
(py)Processing module
By R. Oudkerk [2]
Written in C (really fast!)
Allows multiple cores and multiple hosts/clusters
Data synchronisation through managers
Easy "upgrade path"
Drop-in replacement (mostly) for the threading module
Transparent to user
Forks processes, but uses Thread API
Supports queues, pipes, locks,
managers (for sharing state), worker pools
VERY fast, see PEP-371 [3]
Jesse Noller proposes moving pyprocessing into core Python
benchmarks available, awesome results!
PEP is officially accepted: Thanks Guido!
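The "drop-in replacement" idea can be sketched with the standard library's multiprocessing module, which is what the processing package became once PEP 371 was accepted. This is an illustrative sketch rather than the code shown in the talk: expensive_function, MyProcess and run_all are hypothetical names. One difference from the threading version matters: an attribute set inside run() stays in the child process, so results travel back through a Queue.

```python
from multiprocessing import Process, Queue


def expensive_function(argument):
    """Hypothetical stand-in for real CPU-bound work."""
    return sum(i * i for i in range(argument))


class MyProcess(Process):
    """Same shape as a Thread subclass, but run() executes in a child process."""

    def __init__(self, argument, results):
        Process.__init__(self)  # Initialise the process
        self.argument = argument
        self.results = results

    def run(self):
        # Memory is not shared with the parent, so send the result back.
        self.results.put((self.argument, expensive_function(self.argument)))


def run_all(arguments):
    """Start one process per argument and gather {argument: result}."""
    results = Queue()
    workers = [MyProcess(argument, results) for argument in arguments]
    for worker in workers:
        worker.start()
    # Read one result per worker before joining, so no child blocks on put().
    collected = dict(results.get() for _ in workers)
    for worker in workers:
        worker.join()
    return collected


if __name__ == "__main__":
    print(run_all([10, 100, 1000]))
```

The `if __name__ == "__main__":` guard is needed on platforms where child processes re-import the main module instead of forking.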
(py)Processing module
(continued)
Some details
Producer/consumer style system
→ workers pull jobs
Hides most details of communication
→ usable default settings
Communication is tweakable
(to improve performance or meet certain requirements)
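The producer/consumer style described above can be sketched with a worker pool from the standard library's multiprocessing module (the successor of the processing package). The job function render_row and the wrapper render are hypothetical; the fractal code from the talk is not reproduced here. The defaults already work, and chunksize is one example of a setting that can be tuned.

```python
from multiprocessing import Pool


def render_row(row):
    """Hypothetical job: the cost grows with the row number."""
    return row, sum(x * x for x in range(row * 1000))


def render(rows, processes=None, chunksize=1):
    """Farm the rows out to a pool; idle workers pull the next job.

    processes=None lets the pool use one worker per detected core.
    A larger chunksize hands out jobs in batches, which reduces
    communication overhead when individual jobs are very small.
    """
    with Pool(processes=processes) as pool:
        # Results arrive in completion order, so key them by row number.
        return dict(pool.imap_unordered(render_row, rows, chunksize))


if __name__ == "__main__":
    image = render(range(8))
    print(sorted(image))
```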
(py)Processing module
Let's see it!
Parallel Python module
By Vitalii Vanovschi [4]
Pure Python
Full "Batteries included" paradigm model:
Spawns automatically across detected cores,
and can spawn to clusters
Uses some thread module methods under the hood
More of a "task farming" approach
(potentially requires rethinking/restructuring)
Automatically deploys code and data,
no difficult/multiple installs
Fault tolerance, secure inter-node communication,
runs everywhere
Very active community,
good documentation, good support
Parallel Python module
Let's see it!
Honourable Mentions
pprocess [5]
IPython for parallel computing [6]
Bulk Synchronous Parallel (BSP) Model [7]
sequence of super steps
(computation, communication, barrier synch)
Reactor based architectures, through Twisted [8]
"Don't call us, we call you"
MPI (pyMPI, Pypar, MPI for Python, pypvm)
requires a constant number of processors for the
duration of the computation
Pyro (distributed object system)
Linda (PyLinda)
Scientific Python (master/slave computing model)
data distribution through call parameters/replication
Things to Note
Which approach is best?
Can't say!
Many of the approaches are complementary
Which one to use when needs to be evaluated
All, however, save you a lot of time over the alternative
of writing everything yourself with low-level libraries.
What an age to be alive!
Problems can arise when objects cannot be pickled
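The pickling caveat can be checked up front. Process-based approaches generally serialise jobs and results with pickle before sending them to another process, so a quick test like the sketch below shows which objects will cause trouble. The helper is_picklable is a hypothetical name, not part of any library.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    candidates = {
        "plain data": {"segment": 3, "points": [1.5, 2.5]},
        "built-in function": len,
        "lambda": lambda x: x * x,   # has no importable name
        "lock": threading.Lock(),    # wraps OS-level state
    }
    for label, obj in candidates.items():
        print("%-18s picklable: %s" % (label, is_picklable(obj)))
```

Typical fixes are to replace lambdas with module-level functions and to pass plain data instead of objects that hold locks, sockets or open files.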
Conclusion
Removing the GIL is not necessarily the best solution
Less efficient (single-threaded) runtime
Problems with shared memory access
Various approaches to beat the GIL
Solutions are complementary in many ways
Many scale beyond a local machine/memory system
Questions?
G.Kloss@massey.ac.nz
Slides and code available here:
http://www.kloss-familie.de/moin/TalksPresentations
References I
[1] A. Olsen, Python 3000 with Free Threading project,
[Online] http://code.google.com/p/python-safethread/
[2] R. Oudkerk, Processing Package,
[Online] http://pypi.python.org/pypi/processing/
[3] J. Noller, PEP-371,
[Online] http://www.python.org/dev/peps/pep-0371/
References II
[4] V. Vanovschi, Parallel Python,
[Online] http://parallelpython.com/
[5] P. Boddie, pprocess,
[Online] http://pypi.python.org/pypi/processing/
[6] Project Website, IPython,
[Online] http://ipython.scipy.org/doc/ipython1/html/parallel_intro.html
[7] K. Hinsen, Parallel Scripting with Python,
Computing in Science & Engineering, Nov/Dec 2007
References III
[8] B. Eckel, Concurrency with Python, Twisted, and Flex,
[Online] http://www.artima.com/weblogs/viewpost.jsp?thread=230001