SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Downloaden Sie, um offline zu lesen
The Joy of SciPy

   David Kammeyer
 PUB February 14, 2013
Brief History

             Person               Package       Year
                                Matrix Object
           Jim Fulton                           1994
                                 in Python
         Jim Hugunin              Numeric       1995
        Perry Greenfield, Rick
         White, Todd Miller      Numarray       2001

        Travis Oliphant            NumPy        2005
SciPy 2001       Travis Oliphant
                     optimize
                      sparse
                   interpolate
                    integrate
                      special
                       signal
                        stats      Founded in 2001 with Travis Vaught
                      fftpack
                        misc




                                                   Eric Jones
                                                     weave
                                                    cluster
Pearu Peterson
                                                      GA*
     linalg
  interpolate
      f2py
SciPy Ecosystem
Community effort
•   Chuck Harris
•   Pauli Virtanen
•   David Cournapeau
•   Stefan van der Walt
•   Dag Sverre Seljebotn
•   Robert Kern
•   Warren Weckesser
•   Ralf Gommers
•   Mark Wiebe
•   Nathaniel Smith
Why Python for Technical Computing
• Syntax (it gets out of your way)
• Over-loadable operators
• Complex numbers built-in early
• Just enough language support for arrays
• “Occasional” programmers can grok it
• Supports multiple programming styles
• Expert programmers can also use it effectively
• Has a simple, extensible implementation
• General-purpose language --- can build a system
• Critical mass
Putting Science back in Comp Sci
 • Much of the software stack is for systems
   programming --- C++, Java, .NET, ObjC, web
    - Complex numbers?
    - Vectorized primitives?
 • Array-oriented programming has been
   supplanted by Object-oriented programming
 • Software stack for scientists is not as helpful
   as it should be
 • Fortran is still where many scientists end up
NumPy: an Array-Oriented Extension
• Data: the array object
  – slicing and shaping
  – data-type map to Bytes

• Fast Math:
  – vectorization
  – broadcasting
  – aggregations
Zen of NumPy
•   strided is better than scattered
•   contiguous is better than strided
•   descriptive is better than imperative
•   array-oriented is better than object-oriented
•   broadcasting is a great idea
•   vectorized is better than an explicit loop
•   unless it’s too complicated --- then use Cython/Numba
•   think in higher dimensions
Memory using Object-oriented

                     Object
   Object                                  Object
                     Attr1
   Attr1                                   Attr1
                     Attr2
   Attr2                                   Attr2
                     Attr3
   Attr3                                   Attr3


                                  Object
                                  Attr1
            Object
                                  Attr2
            Attr1        Object
                                  Attr3
            Attr2         Attr1
            Attr3         Attr2
                          Attr3
Array-oriented (Table) approach
             Attr1   Attr2   Attr3
   Object1
   Object2
   Object3
   Object4
   Object5
   Object6
Benefits of Array-oriented

• Many technical problems are naturally array-
  oriented (easy to vectorize)
• Algorithms can be expressed at a high-level
• These algorithms can be parallelized more
  simply (quite often much information is lost in
  the translation to typical “compiled” languages)
• Array-oriented algorithms map to modern
  hard-ware caches and pipelines.
We need more focus on
complied array-oriented
languages with fast compilers!
What is good about NumPy?
• Array-oriented
• Extensive Dtype System (including structures)
• C-API
• Simple to understand data-structure
• Memory mapping
• Syntax support from Python
• Large community of users
• Broadcasting
• Easy to interface C/C++/Fortran code
New Project



 NumPy
               Blaze
         Next Generation NumPy
              Out-of-core
           Distributed Tables
Overview
                          Processing
            Code
                            Node       Processing
                   Code
                                         Node
   Main            Code   Processing
   Script                   Node
                   Code
                                       Processing
                          Processing     Node
                            Node
Timeline (Available on GitHubNow!)

         Date           Milestone

       July 2012     Pre-alpha release


     December 2012   Early Beta Release


       June 2013        Version 1.0
Spectrogram Demo
Introducing Numba
(lots of kernels to write)
NumPy Users

 • Want to be able to write Python to get fast
     code that works on arrays and scalars
 •   Need access to a boat-load of C-extensions
     (NumPy is just the beginning)


              PyPy doesn’t cut it for us!
Ufuncs


                Generalized
                 UFuncs
                                                      Python
                                                     Function
                 Window
                 Kernel
                  Funcs

                 Function-
                   based
                 Indexing


                 Memory
                                                                Dynamic compilation




                  Filters
                                              Dynamic
                                             Compilation




NumPy Runtime
                I/O Filters



                Reduction
                 Filters


                Computed
                Columns
                              function pointer
SciPy needs a Python compiler

     optimize                    integrate


      special                       ode



      writing more of SciPy at high-level
Numba -- a Python compiler

 • Replays byte-code on a stack with simple type-
   inference
 • Translates to LLVM (using LLVM-py)
 • Uses LLVM for code-gen
 • Resulting C-level function-pointer can be
   inserted into NumPy run-time
 • Understands NumPy arrays
 • Is NumPy / SciPy aware
NumPy + Mamba = Numba
 Python Function                            Machine Code


                          LLVM-PY

                          LLVM 3.1
       ISPC      OpenCL    OpenMP    CUDA     CLANG

         Intel       AMD        Nvidia      Apple
Examples
Software Stack Future?
         Plateaus of Code re-use + DSLs
   SQL                                R
            TDPL                                Matlab


                    Python


             OBJC                C
  FORTRAN                                 C++



                     LLVM
Wakari/Numba Demo
How to pay for all this?
Dual strategy




                Blaze
NumFOCUS
Num(Py) Foundation for Open Code for Usable Science
NumFOCUS

• Mission
  • To initiate and support educational programs
    furthering the use of open source software in
    science.
  • To promote the use of high-level languages and
    open source in science, engineering, and math
    research
  • To encourage reproducible scientific research
  • To provide infrastructure and support for open
    source projects for technical computing
NumFOCUS

Core Projects



  NumPy            SciPy         IPython      Matplotlib

Other Projects (seeking more --- need representatives)


                        Scikits Image
•   Large-scale data analysis products
•   Anaconda, SciPy in a Box
•   Wakari.io -- Cloud Hosted SciPy
•   Python training (data analysis and
    development)
•   NumPy support and consulting
•   Blaze, Numba, and More Development

Weitere ähnliche Inhalte

Was ist angesagt?

How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersMapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersAshraf Uddin
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...StampedeCon
 
C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...
C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...
C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...Simplilearn
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkBo Yang
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)NamHyuk Ahn
 
20160331_Automate the boring stuff with python
20160331_Automate the boring stuff with python20160331_Automate the boring stuff with python
20160331_Automate the boring stuff with pythonSungman Jang
 
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Simplilearn
 
Python: Modules and Packages
Python: Modules and PackagesPython: Modules and Packages
Python: Modules and PackagesDamian T. Gordon
 
PyTorch Introduction
PyTorch IntroductionPyTorch Introduction
PyTorch IntroductionYash Kawdiya
 
C++ Standard Template Library
C++ Standard Template LibraryC++ Standard Template Library
C++ Standard Template LibraryIlio Catallo
 

Was ist angesagt? (20)

NumPy/SciPy Statistics
NumPy/SciPy StatisticsNumPy/SciPy Statistics
NumPy/SciPy Statistics
 
Python Basics.pdf
Python Basics.pdfPython Basics.pdf
Python Basics.pdf
 
Clustering
ClusteringClustering
Clustering
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 
Clustering
ClusteringClustering
Clustering
 
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersMapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
 
C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...
C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...
C++ Inheritance Tutorial | Introduction To Inheritance In C++ Programming Wit...
 
Bloom filter
Bloom filterBloom filter
Bloom filter
 
Pandas
PandasPandas
Pandas
 
Chapter 05 classes and objects
Chapter 05 classes and objectsChapter 05 classes and objects
Chapter 05 classes and objects
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)
 
20160331_Automate the boring stuff with python
20160331_Automate the boring stuff with python20160331_Automate the boring stuff with python
20160331_Automate the boring stuff with python
 
Hadoop
HadoopHadoop
Hadoop
 
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
 
Python: Modules and Packages
Python: Modules and PackagesPython: Modules and Packages
Python: Modules and Packages
 
PyTorch Introduction
PyTorch IntroductionPyTorch Introduction
PyTorch Introduction
 
C++ Standard Template Library
C++ Standard Template LibraryC++ Standard Template Library
C++ Standard Template Library
 

Andere mochten auch

Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingScientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingEnthought, Inc.
 
Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]Samarth Shah
 
SciPy - Scientific Computing Tool
SciPy - Scientific Computing ToolSciPy - Scientific Computing Tool
SciPy - Scientific Computing ToolMarcelo Cure
 
Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.Piotr Milanowski
 
Introduction to NumPy & SciPy
Introduction to NumPy & SciPyIntroduction to NumPy & SciPy
Introduction to NumPy & SciPyShiqiao Du
 
Getting started with pandas
Getting started with pandasGetting started with pandas
Getting started with pandasmaikroeder
 
Making your code faster cython and parallel processing in the jupyter notebook
Making your code faster   cython and parallel processing in the jupyter notebookMaking your code faster   cython and parallel processing in the jupyter notebook
Making your code faster cython and parallel processing in the jupyter notebookPyData
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyKimikazu Kato
 
pandas - Python Data Analysis
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data AnalysisAndrew Henshaw
 
Data Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - PythonData Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - PythonChetan Khatri
 
Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)PyData
 
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...Ryan Rosario
 
pandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Pythonpandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for PythonWes McKinney
 

Andere mochten auch (16)

Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingScientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
 
Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]
 
Scipy, numpy and friends
Scipy, numpy and friendsScipy, numpy and friends
Scipy, numpy and friends
 
SciPy - Scientific Computing Tool
SciPy - Scientific Computing ToolSciPy - Scientific Computing Tool
SciPy - Scientific Computing Tool
 
Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.
 
Data Visulalization
Data VisulalizationData Visulalization
Data Visulalization
 
Introduction to NumPy & SciPy
Introduction to NumPy & SciPyIntroduction to NumPy & SciPy
Introduction to NumPy & SciPy
 
Getting started with pandas
Getting started with pandasGetting started with pandas
Getting started with pandas
 
Making your code faster cython and parallel processing in the jupyter notebook
Making your code faster   cython and parallel processing in the jupyter notebookMaking your code faster   cython and parallel processing in the jupyter notebook
Making your code faster cython and parallel processing in the jupyter notebook
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
 
pandas - Python Data Analysis
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data Analysis
 
Data Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - PythonData Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - Python
 
Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)
 
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
 
pandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Pythonpandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Python
 
Mining Scipy Lectures
Mining Scipy LecturesMining Scipy Lectures
Mining Scipy Lectures
 

Ähnlich wie The Joy of SciPy

Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Fwdays
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataTravis Oliphant
 
Using SWIG to Control, Prototype, and Debug C Programs with Python
Using SWIG to Control, Prototype, and Debug C Programs with PythonUsing SWIG to Control, Prototype, and Debug C Programs with Python
Using SWIG to Control, Prototype, and Debug C Programs with PythonDavid Beazley (Dabeaz LLC)
 
Array computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyDataArray computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyDataTravis Oliphant
 
Python for Science and Engineering: a presentation to A*STAR and the Singapor...
Python for Science and Engineering: a presentation to A*STAR and the Singapor...Python for Science and Engineering: a presentation to A*STAR and the Singapor...
Python for Science and Engineering: a presentation to A*STAR and the Singapor...pythoncharmers
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019Travis Oliphant
 
Overview of python misec - 2-2012
Overview of python   misec - 2-2012Overview of python   misec - 2-2012
Overview of python misec - 2-2012Tazdrumm3r
 
Numba: Flexible analytics written in Python with machine-code speeds and avo...
Numba:  Flexible analytics written in Python with machine-code speeds and avo...Numba:  Flexible analytics written in Python with machine-code speeds and avo...
Numba: Flexible analytics written in Python with machine-code speeds and avo...PyData
 
Toward a gui remote-sensing environment built over OTB
Toward a gui remote-sensing environment built over OTBToward a gui remote-sensing environment built over OTB
Toward a gui remote-sensing environment built over OTBmelaneum
 
OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11OpenSAF Foundation
 
Standardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationStandardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationTravis Oliphant
 
Machine learning from software developers point of view
Machine learning from software developers point of viewMachine learning from software developers point of view
Machine learning from software developers point of viewPierre Paci
 

Ähnlich wie The Joy of SciPy (20)

Numba
NumbaNumba
Numba
 
Numba lightning
Numba lightningNumba lightning
Numba lightning
 
Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyData
 
Using SWIG to Control, Prototype, and Debug C Programs with Python
Using SWIG to Control, Prototype, and Debug C Programs with PythonUsing SWIG to Control, Prototype, and Debug C Programs with Python
Using SWIG to Control, Prototype, and Debug C Programs with Python
 
Array computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyDataArray computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyData
 
Python for Science and Engineering: a presentation to A*STAR and the Singapor...
Python for Science and Engineering: a presentation to A*STAR and the Singapor...Python for Science and Engineering: a presentation to A*STAR and the Singapor...
Python for Science and Engineering: a presentation to A*STAR and the Singapor...
 
London level39
London level39London level39
London level39
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
 
Overview of python misec - 2-2012
Overview of python   misec - 2-2012Overview of python   misec - 2-2012
Overview of python misec - 2-2012
 
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
 
Numba: Flexible analytics written in Python with machine-code speeds and avo...
Numba:  Flexible analytics written in Python with machine-code speeds and avo...Numba:  Flexible analytics written in Python with machine-code speeds and avo...
Numba: Flexible analytics written in Python with machine-code speeds and avo...
 
Toward a gui remote-sensing environment built over OTB
Toward a gui remote-sensing environment built over OTBToward a gui remote-sensing environment built over OTB
Toward a gui remote-sensing environment built over OTB
 
OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
 
Standardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationStandardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft Presentation
 
PyCon Estonia 2019
PyCon Estonia 2019PyCon Estonia 2019
PyCon Estonia 2019
 
Machine learning from software developers point of view
Machine learning from software developers point of viewMachine learning from software developers point of view
Machine learning from software developers point of view
 

Kürzlich hochgeladen

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Kürzlich hochgeladen (20)

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

The Joy of SciPy

  • 1. The Joy of SciPy David Kammeyer PUB February 14, 2013
  • 2. Brief History Person Package Year Matrix Object Jim Fulton 1994 in Python Jim Hugunin Numeric 1995 Perry Greenfield, Rick White, Todd Miller Numarray 2001 Travis Oliphant NumPy 2005
  • 3. SciPy 2001 Travis Oliphant optimize sparse interpolate integrate special signal stats Founded in 2001 with Travis Vaught fftpack misc Eric Jones weave cluster Pearu Peterson GA* linalg interpolate f2py
  • 5. Community effort • Chuck Harris • Pauli Virtanen • David Cournapeau • Stefan van der Walt • Dag Sverre Seljebotn • Robert Kern • Warren Weckesser • Ralf Gommers • Mark Wiebe • Nathaniel Smith
  • 6. Why Python for Technical Computing • Syntax (it gets out of your way) • Over-loadable operators • Complex numbers built-in early • Just enough language support for arrays • “Occasional” programmers can grok it • Supports multiple programming styles • Expert programmers can also use it effectively • Has a simple, extensible implementation • General-purpose language --- can build a system • Critical mass
  • 7. Putting Science back in Comp Sci • Much of the software stack is for systems programming --- C++, Java, .NET, ObjC, web - Complex numbers? - Vectorized primitives? • Array-oriented programming has been supplanted by Object-oriented programming • Software stack for scientists is not as helpful as it should be • Fortran is still where many scientists end up
  • 8. NumPy: an Array-Oriented Extension • Data: the array object – slicing and shaping – data-type map to Bytes • Fast Math: – vectorization – broadcasting – aggregations
  • 9. Zen of NumPy • strided is better than scattered • contiguous is better than strided • descriptive is better than imperative • array-oriented is better than object-oriented • broadcasting is a great idea • vectorized is better than an explicit loop • unless it’s too complicated --- then use Cython/Numba • think in higher dimensions
  • 10. Memory using Object-oriented Object Object Object Attr1 Attr1 Attr1 Attr2 Attr2 Attr2 Attr3 Attr3 Attr3 Object Attr1 Object Attr2 Attr1 Object Attr3 Attr2 Attr1 Attr3 Attr2 Attr3
  • 11. Array-oriented (Table) approach Attr1 Attr2 Attr3 Object1 Object2 Object3 Object4 Object5 Object6
  • 12. Benefits of Array-oriented • Many technical problems are naturally array- oriented (easy to vectorize) • Algorithms can be expressed at a high-level • These algorithms can be parallelized more simply (quite often much information is lost in the translation to typical “compiled” languages) • Array-oriented algorithms map to modern hard-ware caches and pipelines.
  • 13. We need more focus on complied array-oriented languages with fast compilers!
  • 14. What is good about NumPy? • Array-oriented • Extensive Dtype System (including structures) • C-API • Simple to understand data-structure • Memory mapping • Syntax support from Python • Large community of users • Broadcasting • Easy to interface C/C++/Fortran code
  • 15. New Project NumPy Blaze Next Generation NumPy Out-of-core Distributed Tables
  • 16. Overview Processing Code Node Processing Code Node Main Code Processing Script Node Code Processing Processing Node Node
  • 17. Timeline (Available on GitHubNow!) Date Milestone July 2012 Pre-alpha release December 2012 Early Beta Release June 2013 Version 1.0
  • 19. Introducing Numba (lots of kernels to write)
  • 20. NumPy Users • Want to be able to write Python to get fast code that works on arrays and scalars • Need access to a boat-load of C-extensions (NumPy is just the beginning) PyPy doesn’t cut it for us!
  • 21. Ufuncs Generalized UFuncs Python Function Window Kernel Funcs Function- based Indexing Memory Dynamic compilation Filters Dynamic Compilation NumPy Runtime I/O Filters Reduction Filters Computed Columns function pointer
  • 22. SciPy needs a Python compiler optimize integrate special ode writing more of SciPy at high-level
  • 23. Numba -- a Python compiler • Replays byte-code on a stack with simple type- inference • Translates to LLVM (using LLVM-py) • Uses LLVM for code-gen • Resulting C-level function-pointer can be inserted into NumPy run-time • Understands NumPy arrays • Is NumPy / SciPy aware
  • 24. NumPy + Mamba = Numba Python Function Machine Code LLVM-PY LLVM 3.1 ISPC OpenCL OpenMP CUDA CLANG Intel AMD Nvidia Apple
  • 26. Software Stack Future? Plateaus of Code re-use + DSLs SQL R TDPL Matlab Python OBJC C FORTRAN C++ LLVM
  • 28. How to pay for all this?
  • 29. Dual strategy Blaze
  • 30. NumFOCUS Num(Py) Foundation for Open Code for Usable Science
  • 31. NumFOCUS • Mission • To initiate and support educational programs furthering the use of open source software in science. • To promote the use of high-level languages and open source in science, engineering, and math research • To encourage reproducible scientific research • To provide infrastructure and support for open source projects for technical computing
  • 32. NumFOCUS Core Projects NumPy SciPy IPython Matplotlib Other Projects (seeking more --- need representatives) Scikits Image
  • 33. Large-scale data analysis products • Anaconda, SciPy in a Box • Wakari.io -- Cloud Hosted SciPy • Python training (data analysis and development) • NumPy support and consulting • Blaze, Numba, and More Development