Raster processing with scipy.ndimage
Henry Walshaw
henry@pythoncharmers.com
@om_henners
Getting set up

arcpy for ArcGIS 10 requires numpy 1.3.0
This means we’re restricted to scipy 0.7.1, matplotlib 1.0.1
and PIL 1.1.7 (which is still the latest version)
If you’re not using ArcGIS, or you’re using virtualenv, get the
latest versions!
All code for this talk can be downloaded from GitHub:
https://github.com/om-henners/ndimage_talk.git
Why process with scipy?

Open Source scientific algorithms
Easy to set up for large concurrent processing on local PCs
and in the cloud (see PiCloud)
It’s free!*
  *Well, aside from development cost
Getting data in and out


We’ll be using the arcpy.RasterToNumPyArray and
arcpy.NumPyArrayToRaster functions

Alternatives include GDAL Python bindings, scipy image read
functions, and many others
See getting_data_in.py and getting_data_out.py
Calculating median filter


Once you’ve got an array it’s one line of code to perform a
basic filter function:
scipy.ndimage.median_filter(a, size=9)
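As a minimal sketch of that one-liner in context, the example below builds a small synthetic array (a hypothetical stand-in for a raster already read into numpy, not data from the talk) containing a single spike, then applies the 9x9 median filter. It uses the top-level `scipy.ndimage.median_filter` name, which newer SciPy releases prefer over the `scipy.ndimage.filters` namespace.

```python
import numpy as np
from scipy import ndimage

# Hypothetical stand-in for a raster: a flat surface with one spike
# that a median filter should remove.
a = np.zeros((20, 20), dtype=float)
a[10, 10] = 100.0

# The one-line filter call, with a 9x9 moving window.
filtered = ndimage.median_filter(a, size=9)

print(a.max(), filtered.max())
```

The output array has the same shape as the input, so it can be written straight back out as a raster with the same extent and cell size.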
Getting more complex

n-dimensional processing
We can supply a size or filter
shape, but now we have to be
even more aware of edge
effects
Still the same one line of code
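The points above can be sketched as follows, assuming a small random 3-D stack (hypothetical data, with the first axis standing in for bands or time steps). The `size`, `footprint`, `mode` and `cval` arguments are the standard ndimage filter parameters; the particular shapes and values are illustrative only.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 3-D stack (band or time, row, column) of random values.
rng = np.random.default_rng(0)
stack = rng.random((4, 30, 30))

# size: a 3x3x3 neighbourhood applied along every axis.
by_size = ndimage.median_filter(stack, size=3)

# footprint: a boolean mask choosing the neighbours. Here a 2-D "plus"
# shape with a length-1 leading axis, so each slice is filtered on its own.
plus = ndimage.generate_binary_structure(2, 1)[np.newaxis, :, :]
by_footprint = ndimage.median_filter(stack, footprint=plus)

# Edge effects are controlled by mode; "reflect" is the default.
reflect = ndimage.median_filter(stack, size=3, mode="reflect")
constant = ndimage.median_filter(stack, size=3, mode="constant", cval=0.0)
```

Interior cells are identical whatever the mode; only cells whose window overlaps the array boundary change, which is why the choice matters more as the window grows.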
Generic filters in scipy

Most Spatial Analyst operations are in ndimage
For everything else there’s generic_filter
Flattens the target region and passes through to a callback
function
Can be used to handle null data as ndimage can’t handle
masked arrays (yet)
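One way to handle null data with generic_filter, sketched under the assumption that NoData cells have already been converted to NaN (the talk does not say how its nulls are encoded). The callback name `nan_mean` and the sample array are hypothetical.

```python
import numpy as np
from scipy import ndimage

def nan_mean(window):
    """Mean of the flattened window, skipping NaN (NoData) cells."""
    valid = window[~np.isnan(window)]
    return valid.mean() if valid.size else np.nan

# Hypothetical raster with a single NoData cell in the middle.
raster = np.arange(25, dtype=float).reshape(5, 5)
raster[2, 2] = np.nan

# generic_filter passes each 3x3 neighbourhood to the callback as a 1-D array.
smoothed = ndimage.generic_filter(raster, nan_mean, size=3, mode="nearest")
```

Because the callback runs in Python once per cell, this is much slower than the built-in filters, so it is best kept for cases the built-ins cannot express.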
Calculating slope

We can implement a simple slope calculation using generic
filter over a 3x3 footprint
We can use the form described in the ESRI documentation, where a to i
are the cells of the 3x3 window read row by row and e is the centre cell:
slope = atan( sqrt( [dz/dx]**2 + [dz/dy]**2 ) ) where
[dz/dx] = ((c + 2f + i) - (a + 2d + g)) / (8 * x_cellsize)
[dz/dy] = ((g + 2h + i) - (a + 2b + c)) / (8 * y_cellsize)
Slower than a standard ndimage filter, but faster than
arcpy!
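A minimal sketch of that slope calculation with generic_filter, not necessarily the code from the talk's repository. It assumes the flattened 3x3 window arrives in row order (a, b, c, d, e, f, g, h, i), and the function name, the synthetic DEM and the `mode="nearest"` edge handling are illustrative choices.

```python
import numpy as np
from scipy import ndimage

def slope_radians(window, x_cellsize, y_cellsize):
    """Slope of the centre cell from its flattened 3x3 neighbourhood."""
    a, b, c, d, _e, f, g, h, i = window
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * x_cellsize)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * y_cellsize)
    return np.arctan(np.sqrt(dz_dx ** 2 + dz_dy ** 2))

# Hypothetical DEM: a plane rising one unit per cell in the x direction.
dem = np.tile(np.arange(10, dtype=float), (10, 1))

slope = ndimage.generic_filter(
    dem, slope_radians, size=3, mode="nearest",
    extra_arguments=(1.0, 1.0),  # x_cellsize, y_cellsize
)
slope_degrees = np.degrees(slope)
```

On this plane every interior cell comes out at 45 degrees; the edge columns differ because `mode="nearest"` repeats the border values.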
Calculating a variable focal maximum
Calculate the focal maximum for every point in an array, with
the focal annulus defined by a radius from another raster
Don’t use a generic filter (there’s no need)
Don’t use ArcGIS Spatial Analyst (it’s really slow)

Method                   100x100 random raster
generic filter           4.7 secs
arcpy Spatial Analyst    6.7 secs
Do use inbuilt functions

It’s faster to calculate the focal maximum using
ndimage.maximum_filter for every possible buffer value
than to roll your own function
100x100 array? 0.009 seconds
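The approach can be sketched as below: run `ndimage.maximum_filter` once per distinct radius, then pick each cell's answer from the matching result. This is an illustration under stated assumptions rather than the talk's own code: the helper names are hypothetical, the inputs are random, and the footprint is a filled circle for simplicity (an annulus mask would slot in the same way).

```python
import numpy as np
from scipy import ndimage

def disc(radius):
    """Boolean circular footprint for an integer radius in cells."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2

def variable_focal_max(values, radii):
    """Focal maximum where each cell's radius comes from a second array."""
    out = np.empty_like(values)
    for r in np.unique(radii):
        # One fast built-in filter pass for this radius over the whole array.
        focal = ndimage.maximum_filter(values, footprint=disc(int(r)))
        # Keep the result only where this radius applies.
        mask = radii == r
        out[mask] = focal[mask]
    return out

# Hypothetical inputs: a 100x100 random raster and radii of 1 to 5 cells.
rng = np.random.default_rng(42)
values = rng.random((100, 100))
radii = rng.integers(1, 6, size=values.shape)

result = variable_focal_max(values, radii)
```

The cost scales with the number of distinct radii rather than the number of cells, so it works best when the radius raster takes only a modest number of values.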
Last words on scipy.ndimage

It’s awesome, free and threadsafe
It ties into any number of other packages (scikit-learn, scikit-image)
You can do just about anything in ndimage that you can in
Spatial Analyst
  If you have time to code it
  And code it right
John Hunter (1968 - 2012)



http://numfocus.org/johnhunter/

Raster Processing with Scipy.ndimage (Dev Meet Up II)
