4. PyCon
http://us.pycon.org/2010/tutorials/
Introduction to Traits
Introduction to Enthought Tool Suite
Fantastic deal (normally $700; at PyCon
get the same material for $275)
Corran Webster
5. Upcoming Training Classes
March 1 – 5, 2009
Python for Scientists and Engineers
Austin, Texas, USA
March 1 – 5, 2009
Python for Quants
London, UK
http://www.enthought.com/training/
7. IPython.kernel
• IPython's interactive kernel provides a simple (but powerful) interface for task-based parallel programming.
• Allows fast development and tuning of task-parallel algorithms to better utilize resources.
8. Getting started --- local cluster
UNIX and OSX (and now WINDOWS)
# run ipcluster to start up a
# controller and a set of engines
$ ipcluster local -n 4
Your cluster is up and running.
...
You can then cleanly stop the cluster from IPython using:
mec.kill(controller=True)
You can also hit Ctrl-C to stop it, or use from the cmd line:
kill -INT 20465
Creates several key-files in ~/.ipython/security :
ipcontroller-engine.furl
ipcontroller-mec.furl
ipcontroller-tc.furl

WINDOWS (manually)
# run ipcontroller and then
# ipengine for each desired engine
> start /B C:\Python25\Scripts\ipcontroller.exe
> start /B C:\Python25\Scripts\ipengine.exe
> start /B C:\Python25\Scripts\ipengine.exe
> start /B C:\Python25\Scripts\ipengine.exe
...
2009-02-11 23:58:26-0600 [-] Log opened.
2009-02-11 23:58:28-0600 [-] Using furl file: C:\Documents and Settings\demo\_ipython\security\ipcontroller-engine.furl
2009-02-11 23:58:28-0600 [-] registered engine with id: 3
2009-02-11 23:58:28-0600 [-] distributing Tasks
2009-02-11 23:58:28-0600 [Negotiation,client] engine registration succeeded, got id: 3
Creates several key-files in %HOME%\_ipython\security :
ipcontroller-engine.furl
ipcontroller-mec.furl
ipcontroller-tc.furl
9. Getting started -- distributed
• Run ipcontroller on a host and create .furl files
  • Creates separate .furl files to be used by the different connections (engine, multiengine client, task client).
  • Places .furl files by default in ~/.ipython/security (UNIX or Mac OSX) or %HOME%\_ipython\security (Windows).
  • Takes --<connection>-furl-file=FILENAME options, where <connection> is engine, multiengine, or task, to place the .furl files somewhere else.
• Ensure the ipcontroller-engine.furl file is available to each host that will run an engine, and run ipengine on these hosts.
  • Either place it in the default security directory
  • Or use the --furl-file=FILENAME option to ipengine
• Ensure the multiengine (task) .furl file is available to each host that will run a multiengine (task) client.
  • Either place it in the default security directory
  • Or pass the FILENAME as the first argument to the constructor
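The default locations listed above can be sketched as a small path helper. This is only an illustration: `default_security_dir` and `furl_path` are hypothetical names, not part of IPython, and the home directories in the usage lines are made up.

```python
import ntpath
import posixpath

# File names as listed on the "local cluster" slide.
FURL_FILES = {
    "engine": "ipcontroller-engine.furl",
    "multiengine": "ipcontroller-mec.furl",
    "task": "ipcontroller-tc.furl",
}


def default_security_dir(platform, home):
    """Return the default .furl directory for a platform (hypothetical helper).

    Windows uses <home>\\_ipython\\security; UNIX and Mac OSX use
    <home>/.ipython/security, as stated in the bullets above.
    """
    if platform.startswith("win"):
        return ntpath.join(home, "_ipython", "security")
    return posixpath.join(home, ".ipython", "security")


def furl_path(platform, home, connection):
    """Full default path of the .furl file for one connection type."""
    join = ntpath.join if platform.startswith("win") else posixpath.join
    return join(default_security_dir(platform, home), FURL_FILES[connection])


# Made-up home directories, for illustration only.
unix_engine_furl = furl_path("linux", "/home/demo", "engine")
windows_task_furl = furl_path("win32", r"C:\Users\demo", "task")
```

A path built this way is what would be handed to ipengine via --furl-file, or passed as the first constructor argument of a client, when the file does not sit in the default directory.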
10. Initialize client
>>> from IPython.kernel import client
MULTIENGINECLIENT
# * allows fine-grained control
# * each engine has an id number
# * more intuitive for beginners
# optional argument can be
# location of mec furl-file
# created by the controller
>>> mec = client.MultiEngineClient()
>>> mec.get_ids()
[0, 1, 2, 3]
mec.map -- parallel map
mec.parallel -- parallel function
mec.execute -- execute in parallel
mec.push -- push data
mec.pull -- pull data
mec.scatter -- spread out
mec.gather -- collect back
mec.kill -- kill engines and controller

TASKCLIENT
# * does not expose individual engines
# * presents a load-balanced,
#   fault-tolerant queue
# optional argument can be
# location of tc furl-file
# created by the controller
>>> tc = client.TaskClient()
tc.map -- parallel map
tc.parallel -- function decorator
tc.run -- run Tasks
tc.get_task_result -- get result
client.MapTask -- function-like
client.StringTask -- code-string
11. MultiEngineClient
SCALAR FUNCTION PARALLEL VECTORIZED FUNCTION
# Using map
>>> def func(x):
...     return x**2.5 * (3*x - 2)
# standard map
>>> result = map(func, range(32))
# mec.map
>>> parallel_result = mec.map(func, range(32))
# mec.parallel
>>> pfunc = mec.parallel()(func)
# or using decorators
>>> @mec.parallel()
... def pfunc(x):
...     return x**2.5 * (3*x - 2)
>>> parallel_result2 = pfunc(range(32))
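To make the scatter / map / gather idea behind a parallel map concrete, here is a pure-Python sketch that needs no running cluster. It assumes a contiguous, near-even partition of the input; `scatter`, `gather`, and `scatter_map_gather` are illustrative helpers written for this sketch, not the IPython.kernel implementation, and the "engines" are simply list slices processed one after another.

```python
def scatter(seq, n_engines):
    """Split seq into n_engines contiguous chunks of near-equal size."""
    seq = list(seq)
    base, extra = divmod(len(seq), n_engines)
    chunks, start = [], 0
    for i in range(n_engines):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more item
        chunks.append(seq[start:start + size])
        start += size
    return chunks


def gather(chunks):
    """Concatenate per-engine results back into one list, preserving order."""
    return [item for chunk in chunks for item in chunk]


def func(x):
    return x**2.5 * (3*x - 2)


def scatter_map_gather(f, seq, n_engines=4):
    """Apply f to every element, one chunk per (simulated) engine."""
    chunks = scatter(seq, n_engines)
    # Each inner list stands in for the work one engine would do on its chunk.
    partial_results = [[f(x) for x in chunk] for chunk in chunks]
    return gather(partial_results)


serial_result = [func(x) for x in range(32)]
sketch_result = scatter_map_gather(func, range(32), n_engines=4)
```

The point of the sketch is that the gathered result has the same order and values as the serial map; only where the work happens changes.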
12. TaskClient – Load Balancing
SCALAR FUNCTION PARALLEL VECTORIZED FUNCTION
# Using map
>>> def func(x):
...     return x**2.5 * (3*x - 2)
# standard map
>>> result = map(func, range(32))
# tc.map
>>> parallel_result = tc.map(func, range(32))
# tc.parallel
>>> pfunc = tc.parallel()(func)
# or using decorators
>>> @tc.parallel()
... def pfunc(x):
...     return x**2.5 * (3*x - 2)
>>> parallel_result2 = pfunc(range(32))
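Why a load-balanced queue can matter is easiest to see with uneven task costs. The following is a small simulation under stated assumptions, not IPython's scheduler: task durations are invented, "static" means each worker is handed a fixed contiguous block up front, and "queue" means each idle worker pulls the next task from a shared queue.

```python
import heapq


def static_makespan(costs, n_workers):
    """Finish time when tasks are pre-assigned in contiguous blocks."""
    base, extra = divmod(len(costs), n_workers)
    start, finish = 0, 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        finish = max(finish, sum(costs[start:start + size]))
        start += size
    return finish


def queue_makespan(costs, n_workers):
    """Finish time when the next task always goes to the earliest-free worker."""
    free_at = [0] * n_workers          # time at which each worker becomes idle
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical durations: four slow tasks followed by eight fast ones.
costs = [8, 8, 8, 8, 1, 1, 1, 1, 1, 1, 1, 1]
static_time = static_makespan(costs, 4)
queue_time = queue_makespan(costs, 4)
```

With these made-up costs the static split leaves one worker holding three slow tasks while others idle, whereas the shared queue spreads the slow tasks across workers.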
13. MultiEngineClient
EXECUTE CODESTRING IN PARALLEL
>>> from enthought.blocks.api import func2str
# decorator that turns python-code into a string
>>> @func2str
... def code():
...     import numpy as np
...     a = np.random.randn(N,N)
...     eigs, vecs = np.linalg.eig(a)
...     maxeig = max(abs(eigs))
>>> mec['N'] = 100
>>> result = mec.execute(code)
>>> print mec['maxeig']
[10.471428625885835, 10.322386155553213, 10.237638983818622, 10.614715948426941]
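The result above is a list with one entry per engine because every engine runs the code string in its own namespace. A toy, single-process model of that push / execute / pull pattern is sketched below. `ToyMultiEngine` is a hypothetical stand-in written for this illustration, not the real client, and the code string uses the standard-library `random` module instead of NumPy so the sketch runs anywhere.

```python
class ToyMultiEngine(object):
    """Hypothetical stand-in: one dict namespace per simulated engine."""

    def __init__(self, n_engines):
        self.namespaces = [{} for _ in range(n_engines)]

    def __setitem__(self, name, value):
        # Like mec['N'] = 100: push the same value to every engine.
        for ns in self.namespaces:
            ns[name] = value

    def execute(self, code):
        # Run the code string once in each engine's own namespace.
        for ns in self.namespaces:
            exec(code, ns)

    def __getitem__(self, name):
        # Like mec['maxeig']: pull the name from every engine, as a list.
        return [ns[name] for ns in self.namespaces]


code = """
import random
values = [random.gauss(0, 1) for _ in range(N)]
maxval = max(abs(v) for v in values)
"""

toy = ToyMultiEngine(4)
toy['N'] = 100
toy.execute(code)
maxvals = toy['maxval']   # four values, one per simulated engine
```

Each simulated engine draws its own random numbers, so the four pulled values differ, just as the four `maxeig` values do in the slide.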
14. TaskClient – Load Balancing Queue
EXECUTE CODESTRING IN PARALLEL
>>> from enthought.blocks.api import func2str
# decorator that turns python-code into a string
>>> @func2str
... def code():
...     import numpy as np
...     a = np.random.randn(N,N)
...     eigs, vecs = np.linalg.eig(a)
...     maxeig = max(abs(eigs))
>>> task = client.StringTask(str(code), push={'N':100},
...                          pull='maxeig')
>>> ids = [tc.run(task) for i in range(4)]
>>> res = [tc.get_task_result(id) for id in ids]
>>> print [x['maxeig'] for x in res]
[10.439989436983467, 10.250842410862729, 10.040835983392991, 10.603885977189803]
15. Parallel FFT On Memory Mapped File
Processors   Time (seconds)   Speed Up
1            11.75            1.0
2            6.06             1.9
4            3.36             3.5
8            2.50             4.7
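The Speed Up column follows directly from the timings as T(1) / T(p). The short sketch below recomputes it and also derives parallel efficiency, speed-up divided by processor count, which is not on the slide and is added here only as a derived quantity.

```python
# Timings from the table above: processors -> wall-clock seconds.
timings = {1: 11.75, 2: 6.06, 4: 3.36, 8: 2.50}

t_serial = timings[1]
rows = []
for procs in sorted(timings):
    speedup = t_serial / timings[procs]   # T(1) / T(p)
    efficiency = speedup / procs          # fraction of ideal linear scaling
    rows.append((procs, timings[procs], speedup, efficiency))

speedups = [round(row[2], 1) for row in rows]
efficiencies = [round(row[3], 2) for row in rows]
```

The recomputed speed-ups match the table (1.0, 1.9, 3.5, 4.7), and efficiency falls from about 0.97 on 2 processors to about 0.59 on 8, which shows where scaling starts to flatten.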