Compiling Python to
Native Code for Speed
and Scale
David Kammeyer
Continuum Analytics
kammeyer@continuum.io
Tuesday, June 4, 13
Continuum Background
• Python for Big Data and Science
• Founded by Travis Oliphant
(Creator of NumPy) and Peter
Wang in 2012
• 45 Employees
About Continuum Analytics
Enterprise Python · Scientific Computing · Data Processing · Data Analysis · Visualisation · Scalable Computing
• Products
• Training
• Support
• Consulting
Products
Anaconda: Easy-to-install Python distribution, including the
most popular open-source scientific and mathematical
libraries. (Free!)
Accelerate: Opens up the full capabilities of the GPU or
multi-core processor to Python.
IOPro: Fast loading of data from files, SQL, and NoSQL
stores, improving performance and reducing memory
overhead.
Wakari: Browser-based Python and Linux environment for
collaborative data analysis, exploration, and visualization.
(Small Instance is Free!)
Open Source Projects
Blaze: High-performance Python library for modern
vector computing, distributed and streaming data
Bokeh: Interactive, grammar-based visualization
system for large datasets
Numba: Vectorizing Python compiler for multicore
and GPU, using LLVM
Numba
• Just-in-time, dynamic compiler for Python
• Optimize data-parallel computations at call time,
to take advantage of local hardware configuration
• Compatible with NumPy, Blaze
• Leverage LLVM ecosystem:
• Optimization passes
• Inter-op with other languages
• Variety of backends (e.g. CUDA for GPU support)
LLVM
[Diagram: front ends such as C, C++, Fortran, and Python compile to LLVM IR; LLVM back ends target x86, ARM, and PTX.]
Simple API
from numba import jit, autojit

#@jit('void(double[:,:], double, double)')
@autojit
def numba_update(u, dx2, dy2):
    # In-place sweep over the interior points of the 2-D grid u
    nx, ny = u.shape
    for i in xrange(1, nx-1):
        for j in xrange(1, ny-1):
            u[i,j] = ((u[i+1,j] + u[i-1,j]) * dy2 +
                      (u[i,j+1] + u[i,j-1]) * dx2) / (2*(dx2+dy2))
Comment out one of jit or autojit (don’t use together)
• jit --- provide type information (fastest to call at run-time)
• autojit --- detects input types, infers output, generates code
if needed, and dispatches (a little more run-time call
overhead)
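The autojit behaviour described above (detect input types, generate code if needed, dispatch) can be pictured with a toy dispatcher. This is only a sketch in plain Python: `toy_autojit` and its `specializations` cache are hypothetical names, and no machine code is generated, so it shows the caching and dispatch logic rather than Numba's actual implementation.

```python
import functools

def toy_autojit(func):
    """Toy model of lazy, type-based dispatch (not Numba's real machinery)."""
    specializations = {}  # maps a tuple of argument types to a callable

    @functools.wraps(func)
    def dispatcher(*args):
        signature = tuple(type(arg) for arg in args)
        if signature not in specializations:
            # A real JIT would run type inference and emit machine code here;
            # this sketch just stores the original function for the signature.
            specializations[signature] = func
        return specializations[signature](*args)

    dispatcher.specializations = specializations
    return dispatcher

@toy_autojit
def add(x, y):
    return x + y

add(1, 2)      # first (int, int) call creates a cache entry
add(3, 4)      # same types, so the cached entry is reused
add(1.0, 2.0)  # new types create a second entry
```

The per-call type lookup is the "little more run-time call overhead" the slide mentions; an explicit jit signature avoids it because the types are fixed up front.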
Example
from math import sin, pi
from numba import jit

@jit('f8(f8)')
def sinc(x):
    if x == 0.0:
        return 1.0
    else:
        return sin(x*pi)/(pi*x)
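For readers on a current Numba release, the same kernel can be written with the `njit` decorator, which infers types lazily on the first call, much as autojit does in these slides. The sketch below assumes a recent Numba; the fallback branch is an addition of this example so the code still runs, uncompiled, when Numba is not installed.

```python
from math import sin, pi

try:
    from numba import njit  # available in current Numba releases
except ImportError:
    def njit(func):
        # Fallback (assumption of this sketch): run the plain Python function.
        return func

@njit
def sinc(x):
    # Normalized sinc: sin(pi*x) / (pi*x), with the removable singularity at 0
    if x == 0.0:
        return 1.0
    return sin(x * pi) / (pi * x)

value_at_zero = sinc(0.0)
value_at_half = sinc(0.5)  # analytically 2/pi
```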
Compile NumPy array expressions
from numba import autojit

@autojit
def formula(a, b, c):
    a[1:,1:] = a[1:,1:] + b[1:,:-1] + c[1:,:-1]

@autojit
def express(m1, m2):
    m2[1:-1:2,0,...,::2] = (m1[1:-1:2,...,::2]
                            * m1[-2:1:-2,...,::2])
    return m2
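To make the slicing in `formula` concrete, here is the same update spelled out as explicit loops over nested lists. It is a plain-Python sketch for 2-D inputs of equal shape; `formula_loops` is a hypothetical name, and the loops only illustrate what the expression computes, not the code Numba generates.

```python
def formula_loops(a, b, c):
    """Loop form of a[1:,1:] = a[1:,1:] + b[1:,:-1] + c[1:,:-1]."""
    rows, cols = len(a), len(a[0])
    for i in range(1, rows):
        for j in range(1, cols):
            # b[1:, :-1] and c[1:, :-1] are read one column to the left
            a[i][j] = a[i][j] + b[i][j - 1] + c[i][j - 1]
    return a

a = [[1, 1, 1], [1, 1, 1]]
b = [[10, 20, 30], [40, 50, 60]]
c = [[1, 2, 3], [4, 5, 6]]
result = formula_loops(a, b, c)
```

Row 0 and column 0 of `a` are left untouched because both slices start at index 1.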
Fast vectorize
NumPy’s ufuncs take “kernels” and
apply the kernel element-by-element
over entire arrays.
Write kernels in Python!
from numbapro import vectorize
from math import sin, pi

@vectorize(['f8(f8)', 'f4(f4)'])
def sinc(x):
    if x == 0.0:
        return 1.0
    else:
        return sin(x*pi)/(pi*x)
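What "apply the kernel element-by-element" means can be shown without any compiler. The sketch below uses nested lists in place of NumPy arrays; `apply_elementwise` and `sinc_kernel` are hypothetical names for this illustration, and a real ufunc also handles broadcasting and dtypes, which this does not.

```python
from math import sin, pi

def sinc_kernel(x):
    # Scalar kernel: takes one element, returns one element
    if x == 0.0:
        return 1.0
    return sin(x * pi) / (pi * x)

def apply_elementwise(kernel, data):
    """Apply a scalar kernel to every element of arbitrarily nested lists."""
    if isinstance(data, list):
        return [apply_elementwise(kernel, item) for item in data]
    return kernel(data)

grid = [[0.0, 0.5], [1.5, 0.0]]
out = apply_elementwise(sinc_kernel, grid)
```

The vectorize decorator plays the role of `apply_elementwise`, with the loop compiled rather than interpreted.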
Create parallel-for loops
“prange” directive that spawns compiled tasks
in threads (like the OpenMP parallel-for pragma)
import numbapro
from numba import autojit, prange

@autojit
def parallel_sum2d(a):
    sum = 0.0
    # Rows are distributed across threads; sum acts as a reduction variable
    for i in prange(a.shape[0]):
        for j in range(a.shape[1]):
            sum += a[i,j]
    return sum
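The structure of a parallel-for reduction (split the outer loop across threads, keep a partial sum per task, combine at the end) can be sketched with the standard library. This is an illustration only: `row_sum`, `threaded_sum2d`, and the `workers` parameter are hypothetical, and because of the interpreter lock pure-Python threads will not show the speed-up that compiled prange tasks can.

```python
from concurrent.futures import ThreadPoolExecutor

def row_sum(row):
    # Inner loop: runs sequentially inside one task
    total = 0.0
    for value in row:
        total += value
    return total

def threaded_sum2d(a, workers=4):
    """Sum a 2-D nested list, one task per row, then reduce the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(row_sum, a))
    return sum(partial_sums)

total = threaded_sum2d([[1.0, 2.0], [3.0, 4.0]])
```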
Example: Mandelbrot Vectorized
from numbapro import vectorize

sig = 'uint8(uint32, f4, f4, f4, f4, uint32, uint32, uint32)'

@vectorize([sig], target='gpu')
def mandel(tid, min_x, max_x, min_y, max_y, width, height, iters):
    pixel_size_x = (max_x - min_x) / width
    pixel_size_y = (max_y - min_y) / height
    # Map the flat element index to pixel coordinates (integer division)
    x = tid % width
    y = tid // width
    real = min_x + x * pixel_size_x
    imag = min_y + y * pixel_size_y
    c = complex(real, imag)
    z = 0.0j
    for i in range(iters):
        z = z * z + c
        # Escape test: |z|^2 >= 4
        if (z.real * z.real + z.imag * z.imag) >= 4:
            return i
    return 255
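To see how the flat index `tid` selects a pixel, the kernel can be run as ordinary Python over a tiny image. This is a CPU reference sketch, not the GPU code path: `mandel_reference` and the 3x2 image size are choices made for this example only.

```python
def mandel_reference(tid, min_x, max_x, min_y, max_y, width, height, iters):
    """Plain-Python version of the kernel, called once per flat pixel index."""
    pixel_size_x = (max_x - min_x) / width
    pixel_size_y = (max_y - min_y) / height
    x = tid % width   # column
    y = tid // width  # row
    c = complex(min_x + x * pixel_size_x, min_y + y * pixel_size_y)
    z = 0.0j
    for i in range(iters):
        z = z * z + c
        if z.real * z.real + z.imag * z.imag >= 4:
            return i
    return 255

width, height = 3, 2
image = [
    [mandel_reference(row * width + col, -2.0, 1.0, -1.0, 1.0, width, height, 20)
     for col in range(width)]
    for row in range(height)
]
```

Here pixel (row 0, column 0) maps to c = -2 - 1j, which escapes immediately, while pixel (row 1, column 2) maps to c = 0, which never escapes and gets the value 255.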
Kind      Time      Speed-up
Python    263.6     1.0x
CPU       2.639     100x
GPU       0.1676    1573x
(GPU: Tesla S2050)
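The speed-up column follows from dividing the Python time by each other time. A short check, using only the figures in the table (the slide does not state the time unit, and the ratio does not depend on it):

```python
timings = {"Python": 263.6, "CPU": 2.639, "GPU": 0.1676}

baseline = timings["Python"]
# Speed-up = baseline time / measured time
speedups = {kind: baseline / t for kind, t in timings.items()}

for kind, factor in speedups.items():
    print(f"{kind}: {factor:.1f}x")
```

Rounded to whole numbers, the ratios agree with the 100x and 1573x shown on the slide.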
Many More Advanced Features!
• Extension classes (jit a class -- autojit coming soon!)
• Struct support (NumPy arrays can be structs)
• SSA -- can refer to local variables as different types
• Typed lists and typed dictionaries and sets coming
soon!
• Calling ctypes and CFFI functions natively
• pycc (create stand-alone dynamic library and
executable)
• pycc --python (create static extension module for
Python)
Availability
• Core is Open Source
• github.com/numba/numba
• GPU Compilation and Parallelization
available in Anaconda Accelerate, €100.
Questions?
http://continuum.io
kammeyer@continuum.io