See the Earth as it could be.
Enabling Global-Scale Geospatial Machine Learning
FOSS4G NA 2018
Simeon Fitch
Co-Founder & VP of R&D
Astraea, Inc.
Overview
• Context
• Problem Statement
• Introducing RasterFrames
• Example Problem
• Numerical and Performance Results
• Take-Aways
With exploding population growth and finite resources,
we need to have tools to better plan for sustainable
growth.
By automating the processes around Remote Sensing,
High Performance Computing, and Machine Learning,
we empower individuals to ask complex questions of
the world.
[Diagram labels: HPC, ML, RS]
Think Locally, Compute Globally
• Model development is a creative, iterative, and interactive process. How do we do this on a global scale?
• Good tools minimize cognitive friction and pay attention to ergonomics
• At a minimum, we need: Solve for local → Scale to global
• Global-scale remote sensing data pose particular challenges
Why This is Hard: Data Dimensionality
Temporal
Spatial
Spectral
Metadata
Why This is Hard: Data Density
[Chart: Football Field Multiband Image in Bytes, by sensor; y axis: Multiband Bytes, log scale from 0 to 1,000,000]
• MODIS NBAR: 500 meter, 7 band
• Landsat: 30 meter, 8 band
• Planet: 3 meter, 4 band
• NAIP: 1 meter, 4 band
• Digital Globe: 0.3 meter, 8 band
Why This is Hard: Data Velocity
[Chart: EOSDIS Holdings and Projected Growth]
Source: Katie Baynes, NASA Goddard. “NASA’s EOSDIS Cumulus”. 2017. https://goo.gl/eQX9om
Why This is Hard: Compute & Mental Model
• Traditional cluster computing (e.g. MPI) scales, but requires
special expertise
• Python Pandas & R DataFrames are very accessible, but not
scalable
• Spark DataFrames provide the best of both worlds, but aren’t
imagery friendly, until now
Introducing RasterFrames
• Incubating LocationTech project
• Provides ability to work with global-scale remote sensing imagery in a convenient yet scalable format
• Integrates with multiple data sources and libraries, including Spark ML, GeoTrellis Map Algebra and GeoMesa Spark-JTS
• Python, Scala and SQL APIs
[Diagram labels: RasterFrame, GeoTrellis Layers, Map Algebra, Layer Operations, Statistical Analysis, TileLayerRDD, Machine Learning, Visualization, GeoTIFF, Spark DataSource, Spark DataFrame, spatial join, Geospatial Queries]
RasterFrame Anatomy
Standard Tile Operations
• localAggStats
• localAggMax
• localAggMin
• localAggMean
• localAggDataCells
• localAggNoDataCells
• localAdd
• localSubtract
• localMultiply
• localDivide
• localAlgebra
• tileDimensions
• aggHistogram
• aggStats
• aggMean
• aggDataCells
• aggNoDataCells
• tileMean
• tileSum
• tileMin
• tileMax
• tileHistogram
• tileStats
• dataCells
• noDataCells
• box2D
• tileToArray
• arrayToTile
• assembleTile
• explodeTiles
• cellType
• convertCellType
• withNoData
• renderAscii
Polyglot API
SQL:
SELECT spatial_key,
    rf_localAggMin(red) as red_min,
    rf_localAggMax(red) as red_max,
    rf_localAggMean(red) as red_mean
FROM df
GROUP BY spatial_key

Scala:
df.groupBy("spatial_key").agg(
    localAggMin($"red") as "red_min",
    localAggMax($"red") as "red_max",
    localAggMean($"red") as "red_mean")

Python:
df.groupBy(df.spatial_key).agg(
    localAggMin(df.red).alias('red_min'),
    localAggMax(df.red).alias('red_max'),
    localAggMean(df.red).alias('red_mean'))
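To make the grouped queries above concrete, here is a minimal pure-Python sketch of what a "local" aggregate could look like. It assumes "local" has its usual map-algebra meaning of cell-by-cell. The helper `local_agg` and the sample rows are hypothetical illustrations, not part of the RasterFrames API.

```python
from collections import defaultdict

def local_agg(tiles, combine):
    """Fold a list of equally sized tiles cell by cell with `combine`."""
    result = [row[:] for row in tiles[0]]
    for tile in tiles[1:]:
        for r, row in enumerate(tile):
            for c, value in enumerate(row):
                result[r][c] = combine(result[r][c], value)
    return result

# Hypothetical rows of (spatial_key, red tile); two tiles share key (0, 0).
rows = [
    ((0, 0), [[1, 5], [3, 7]]),
    ((0, 0), [[4, 2], [6, 0]]),
    ((0, 1), [[9, 9], [9, 9]]),
]

# Equivalent of GROUP BY spatial_key.
groups = defaultdict(list)
for key, tile in rows:
    groups[key].append(tile)

# One output tile per key, each cell aggregated across the group's tiles.
red_min = {key: local_agg(tiles, min) for key, tiles in groups.items()}
red_max = {key: local_agg(tiles, max) for key, tiles in groups.items()}

print(red_min[(0, 0)])
print(red_max[(0, 0)])
```

The result of each aggregate is itself a tile, so the output keeps the spatial layout of the inputs.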
User Manual With Examples
Motivating Example: Global Ranking NDVI
On any given day, where in the world should we look for
high NDVI value(s)?
Real goal: Compute something on global imagery to
present RasterFrames and explore its scalability
Isn’t NDVI the
“Hello World”
of FOSS4G?
Compute Pipeline
Implementation: Query & Ingest
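import java.time.LocalDate
import spark.implicits._ // enables the $"column" syntax

// Load the MODIS scene catalog and keep the granules for the day of interest
val catalog = spark.read
  .format("modis-catalog")
  .load()
val granules = catalog
  .where($"acquisitionDate" === LocalDate.of(2017, 6, 7))

// Download the red (B01) and near-infrared (B02) bands, then pair them by location
val b01 = granules.select(download_tiles(modis_band_url("B01")))
val b02 = granules.select(download_tiles(modis_band_url("B02")))
val joined = b01.join(b02, "spatial_key")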
Implementation: Computing NDVI
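import geotrellis.raster._
import org.apache.spark.sql.functions.udf

// NDVI = (NIR - red) / (NIR + red), computed cell by cell
val ndvi = udf((b2: Tile, b1: Tile) ⇒ {
  // Work in floating point so the ratio keeps its fractional part
  val nir = b2.convert(FloatConstantNoDataCellType)
  val red = b1.convert(FloatConstantNoDataCellType)
  (nir - red) / (nir + red)
})

val withNDVI = joined
  .withColumn("ndvi", ndvi($"B02_tile", $"B01_tile"))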
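The same per-cell arithmetic can be sketched without Spark or GeoTrellis. The version below is a pure-Python illustration that uses NaN as a stand-in NoData value. The names `ndvi_cell` and `ndvi_tile` and the sample reflectance values are hypothetical.

```python
import math

NODATA = float("nan")  # stand-in for a floating-point NoData value

def ndvi_cell(nir, red):
    """NDVI for one cell; NoData if an input is missing or NIR + red is zero."""
    if math.isnan(nir) or math.isnan(red):
        return NODATA
    total = nir + red
    if total == 0:
        return NODATA
    return (nir - red) / total

def ndvi_tile(nir_tile, red_tile):
    """Apply ndvi_cell to every pair of corresponding cells."""
    return [[ndvi_cell(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_tile, red_tile)]

# Hypothetical 2x2 tiles of reflectance values.
nir = [[5000.0, 3000.0], [0.0, NODATA]]
red = [[1000.0, 3000.0], [0.0, 500.0]]
result = ndvi_tile(nir, red)
print(result)
```

Valid NDVI values fall between -1 and 1. Cells where either band is missing, or where both bands are zero, stay NoData rather than raising an error.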
Implementation: Computing Histograms
[Charts: Global NDVI Histogram for 2017-06-07, Global Red Band Histogram for 2017-06-07, and Global NIR Band Histogram for 2017-06-07; y axis: Count (x100000)]
val hist = withNDVI.select(
  aggHistogram($"B01_tile"),
  aggHistogram($"B02_tile"),
  aggHistogram($"ndvi")
)
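One way to picture a column-wide histogram is as per-tile bin counts that are merged into a single global count. The sketch below shows that idea in pure Python. The fixed bin width, the use of None for NoData, and the helper `tile_histogram` are assumptions for illustration and say nothing about how aggHistogram chooses its bins.

```python
import math
from collections import Counter

def tile_histogram(tile, bin_width):
    """Count the data cells of one tile per bin index, skipping None (NoData)."""
    counts = Counter()
    for row in tile:
        for value in row:
            if value is not None:
                counts[math.floor(value / bin_width)] += 1
    return counts

# Hypothetical NDVI tiles; None marks a NoData cell.
tiles = [
    [[0.1, 0.6], [None, 0.7]],
    [[-0.2, 0.55], [0.9, 0.1]],
]

# Merge the per-tile counts into one global histogram.
global_hist = Counter()
for tile in tiles:
    global_hist.update(tile_histogram(tile, bin_width=0.5))

for bin_index in sorted(global_hist):
    print(bin_index * 0.5, global_hist[bin_index])
```

Because bin counts simply add, each tile can be summarized independently and the partial results combined in any order, which is what makes this kind of aggregate easy to distribute.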
Implementation: Scoring Tiles
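// Global NDVI statistics, taken from the third histogram selected above
val ndviStats = hist.first()._3.stats

// For each tile, the (min, max) z-score of its cells relative to the global mean
val zscoreRange = udf((t: Tile) ⇒ {
  val mean = ndviStats.mean
  val stddev = math.sqrt(ndviStats.variance)
  t.mapDouble(c ⇒ (c - mean) / stddev).findMinMaxDouble
})

val scored = withNDVI
  .withColumn("zscores", zscoreRange($"ndvi"))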
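The scoring step can be worked through on a toy example. This pure-Python sketch computes a global mean and standard deviation, then the (min, max) z-score of each tile. Using the population variance is an arbitrary choice here, and `global_stats`, `zscore_range`, and the sample tiles are hypothetical.

```python
import math

def global_stats(tiles):
    """Mean and population standard deviation over every cell of every tile."""
    values = [v for tile in tiles for row in tile for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

def zscore_range(tile, mean, stddev):
    """(min, max) of the z-scores of one tile's cells."""
    z = [(v - mean) / stddev for row in tile for v in row]
    return min(z), max(z)

# Two hypothetical tiles standing in for a global set of NDVI tiles.
tiles = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[8.0, 8.0], [8.0, 12.0]],
]

mean, stddev = global_stats(tiles)
ranges = [zscore_range(tile, mean, stddev) for tile in tiles]
print(mean, stddev)
print(ranges)
```

A tile's maximum z-score says how far its highest cell sits above the global mean, in units of the global standard deviation, which makes tiles comparable on one scale.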
Implementation: Results
// Rank tiles by their maximum NDVI z-score, highest first
val ordered = scored
  .select(
    $"B01_extent" as "extent",
    $"zscores._2" as "zscoreMax"
  )
  .orderBy(desc("zscoreMax"))

// Take the top 20 of the ranked rows and convert each extent to a lat/lng polygon
val features = ordered
  .limit(20)
  .select($"extent", $"zscoreMax")
  .map { case (extent, zscoreMax) ⇒
    val geom = extent.toPolygon().reproject(Sinusoidal, LatLng)
    Feature(geom, Map("zscoreMax" -> zscoreMax))
  }
  .collect

val results = JsonFeatureCollection(features).toJson
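The ranking and export step can be sketched with only the Python standard library. The example below sorts scored extents, keeps the top N, and writes a GeoJSON FeatureCollection. It assumes the extents are already in longitude/latitude, so the Sinusoidal reprojection is deliberately left out. The helpers and sample values are hypothetical.

```python
import json

def extent_to_polygon(xmin, ymin, xmax, ymax):
    """Closed GeoJSON polygon ring for a bounding box."""
    ring = [[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax], [xmin, ymin]]
    return {"type": "Polygon", "coordinates": [ring]}

def top_features(scored, n):
    """Keep the n rows with the highest zscoreMax as GeoJSON features."""
    ranked = sorted(scored, key=lambda row: row["zscoreMax"], reverse=True)[:n]
    features = [
        {
            "type": "Feature",
            "geometry": extent_to_polygon(*row["extent"]),
            "properties": {"zscoreMax": row["zscoreMax"]},
        }
        for row in ranked
    ]
    return {"type": "FeatureCollection", "features": features}

# Hypothetical scored tiles: extent is (xmin, ymin, xmax, ymax) in lon/lat.
scored = [
    {"extent": (10.0, 0.0, 20.0, 10.0), "zscoreMax": 1.2},
    {"extent": (-60.0, -10.0, -50.0, 0.0), "zscoreMax": 3.4},
    {"extent": (100.0, 20.0, 110.0, 30.0), "zscoreMax": 2.1},
]

collection = top_features(scored, 2)
results = json.dumps(collection)
print(results)
```

Sorting before truncating matters: taking the first N rows of an unsorted set would return arbitrary tiles rather than the highest-scoring ones.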
Results: Histograms
[Charts: Global Red Band Histogram for 2017-06-07, Global NDVI Histogram for 2017-06-07, and Global NIR Band Histogram for 2017-06-07; y axis: Count (x100000)]
Results: Top NDVI for 2017-06-07
Results: Benchmarks
[Chart: run time by cluster size]
CPU Cores    Time (minutes)
8            31.47
16           16.99
24           12.23
32           9.64
40           8.31
80           5.53
120          6.41
160          6.33
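The benchmark figures can be turned into speedup and parallel efficiency relative to the smallest cluster. The sketch below assumes the eight time labels on the chart pair up, in order, with the eight core counts on its axis. The choice of the 8-core run as the baseline is also an assumption.

```python
# Values read from the chart, paired in order (an assumption about its layout).
cores = [8, 16, 24, 32, 40, 80, 120, 160]
minutes = [31.47, 16.99, 12.23, 9.64, 8.31, 5.53, 6.41, 6.33]

baseline_cores, baseline_minutes = cores[0], minutes[0]

rows = []
for n, t in zip(cores, minutes):
    speedup = baseline_minutes / t                 # how many times faster than 8 cores
    efficiency = speedup / (n / baseline_cores)    # 1.0 would be perfect scaling
    rows.append((n, t, speedup, efficiency))

fastest = min(rows, key=lambda row: row[1])

for n, t, speedup, efficiency in rows:
    print(f"{n:>4} cores  {t:>6.2f} min  speedup {speedup:4.2f}  efficiency {efficiency:4.2f}")
print("fastest run:", fastest[0], "cores")
```

Under that pairing, run time falls steadily up to 80 cores and then levels off, so the extra cores at 120 and 160 bring no further gain for this job.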
RasterFrame Take-Aways
• DataFrames lower cognitive friction when modeling. Good
Ergonomics!
• Rich set of raster processing primitives
• Support for descriptive and predictive analysis
• Via spark-shell, Jupyter Notebook, Zeppelin, etc., you can interact with data and iterate on a solution
• It scales!
• Many more examples at http://rasterframes.io
Getting Started
• Try it out via Jupyter Notebooks:
docker pull s22s/rasterframes-notebooks
• Documentation: http://rasterframes.io
• Code: https://github.com/locationtech/rasterframes
• Chat: https://gitter.im/s22s/raster-frames
• Social: @metasim on GitHub & Twitter
• Company: http://www.astraea.earth
Shout Outs
• Thanks to LocationTech
• For Incubating RasterFrames; mentoring by Jim Hughes & Rob Emanuele
• The teams behind GeoTrellis, GeoMesa, JTS, & SFCurve
• Thanks to NASA, USGS, & NOAA
• Supporting public access to massive curated data sets is not easy!
• Upcoming Astraea Presentations
• Machine Learning, FOSS, and open data to map deforestation trends in the Brazilian Amazon
Courtney Whalen & Jason Brown
Tuesday, May 15, 2018 - 4:30 to 5:05 (right after this presentation)
Gateway 1
• Using Deep Learning to Derive 3D Cities from Satellite Imagery
Eric Culbertson
Wednesday, May 16, 2018 - 2:00 to 2:35
Gateway 2
• Please visit Astraea
• Booth #14

 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 

RasterFrames: Enabling Global-Scale Geospatial Machine Learning

  • 1. See the Earth as it could be. Enabling Global-Scale Geospatial Machine Learning FOSS4G NA 2018 Simeon Fitch Co-Founder & VP of R&D Astraea, Inc.
  • 2. Overview • Context • Problem Statement • Introducing RasterFrames • Example Problem • Numerical and Performance Results • Take-Aways
  • 3. With exploding population growth and finite resources, we need to have tools to better plan for sustainable growth. By automating the processes around Remote Sensing, High Performance Computing, and Machine Learning, we empower individuals to ask complex questions of the world. [Diagram: HPC, ML, RS]
  • 4. Think Locally, Compute Globally • Model development is a creative, iterative, and interactive process. How do we do this on a global scale? • Good tools minimize cognitive friction; attentive to good ergonomics • At a minimum, we need: Solve for local → Scale to global • Global-scale remote sensing data provide particular challenges
  • 5. Why This is Hard: Data Dimensionality (Temporal, Spatial, Spectral, Metadata)
  • 6. Why This is Hard: Data Density [Chart: Football Field Multiband Image in Bytes; y-axis: multiband bytes, log scale] Sensors compared: MODIS NBAR (500 meter, 7 band), Landsat (30 meter, 8 band), Planet (3 meter, 4 band), NAIP (1 meter, 4 band), Digital Globe (0.3 meter, 8 band)
  • 7. Why This is Hard: Data Velocity [Chart: EOSDIS Holdings and Projected Growth] Source: Katie Baynes, NASA Goddard. “NASA’s EOSDIS Cumulus”. 2017. https://goo.gl/eQX9om
  • 8. Why This is Hard: Compute & Mental Model • Traditional cluster computing (e.g. MPI) scales, but requires special expertise • Python Pandas & R DataFrames are very accessible, but not scalable • Spark DataFrames provide the best of both worlds, but aren’t imagery friendly, until now
  • 9. • Incubating LocationTech project • Provides ability to work with global-scale remote sensing imagery in a convenient yet scalable format • Integrates with multiple data sources and libraries, including Spark ML, GeoTrellis Map Algebra and GeoMesa Spark-JTS • Python, Scala and SQL APIs [Diagram labels: GeoTrellis Layers, Map Algebra, Layer Operations, Statistical Analysis, TileLayerRDD, Machine Learning, Visualization, GeoTIFF, RasterFrame, Spark DataSource, Spark DataFrame, spatial join, Geospatial Queries]
  • 10. RasterFrame Anatomy
  • 11. Standard Tile Operations • localAggStats • localAggMax • localAggMin • localAggMean • localAggDataCells • localAggNoDataCells • localAdd • localSubtract • localMultiply • localDivide • localAlgebra • tileDimensions • aggHistogram • aggStats • aggMean • aggDataCells • aggNoDataCells • tileMean • tileSum • tileMin • tileMax • tileHistogram • tileStats • dataCells • noDataCells • box2D • tileToArray • arrayToTile • assembleTile • explodeTiles • cellType • convertCellType • withNoData • renderAscii
  • 12. Polyglot API
      SQL:
        SELECT spatial_key,
          rf_localAggMin(red) as red_min,
          rf_localAggMax(red) as red_max,
          rf_localAggMean(red) as red_mean
        FROM df
        GROUP BY spatial_key
      Scala:
        df.groupBy("spatial_key").agg(
          localAggMin($"red") as "red_min",
          localAggMax($"red") as "red_max",
          localAggMean($"red") as "red_mean")
      Python:
        df.groupBy(df.spatial_key).agg(
          localAggMin(df.red).alias('red_min'),
          localAggMax(df.red).alias('red_max'),
          localAggMean(df.red).alias('red_mean'))
  • 13. User Manual With Examples
  • 14. Motivating Example: Global Ranking NDVI. On any given day, where in the world should we look for high NDVI value(s)? Real goal: Compute something on global imagery to present RasterFrames and explore its scalability. Isn’t NDVI the “Hello World” of FOSS4G?
  • 16. Implementation: Query & Ingest
      val catalog = spark.read
        .format("modis-catalog")
        .load()
      val granules = catalog
        .where($"acquisitionDate" === LocalDate.of(2017, 6, 7))
      val b01 = granules.select(download_tiles(modis_band_url("B01")))
      val b02 = granules.select(download_tiles(modis_band_url("B02")))
      val joined = b01.join(b02, "spatial_key")
  • 17. Implementation: Computing NDVI
      val ndvi = udf((b2: Tile, b1: Tile) ⇒ {
        val nir = b2.convert(FloatConstantNoDataCellType)
        val red = b1.convert(FloatConstantNoDataCellType)
        (nir - red) / (nir + red)
      })
      val withNDVI = joined
        .withColumn("ndvi", ndvi($"B02_tile", $"B01_tile"))
  • 18. Implementation: Computing Histograms [Figures: Global NDVI Histogram, Global Red Band Histogram, and Global NIR Band Histogram for 2017-06-07; y-axis: Count x100000]
      val hist = withNDVI.select(
        aggHistogram($"B01_tile"),
        aggHistogram($"B02_tile"),
        aggHistogram($"ndvi")
      )
  • 19. Implementation: Scoring Tiles
      val ndviStats = hist.first()._3.stats
      val zscoreRange = udf((t: Tile) ⇒ {
        val mean = ndviStats.mean
        val stddev = math.sqrt(ndviStats.variance)
        t.mapDouble(c ⇒ (c - mean) / stddev).findMinMaxDouble
      })
      val scored = withNDVI
        .withColumn("zscores", zscoreRange($"ndvi"))
  • 20. Implementation: Results
      val ordered = scored
        .select(
          $"B01_extent" as "extent",
          $"zscores._2" as "zscoreMax"
        )
        .orderBy(desc("zscoreMax"))
      // Take the top 20 from the ordered result, which carries the
      // "extent" and "zscoreMax" columns selected below.
      val features = ordered
        .limit(20)
        .select($"extent", $"zscoreMax")
        .map { case (extent, zscoreMax) ⇒
          val geom = extent.toPolygon().reproject(Sinusoidal, LatLng)
          Feature(geom, Map("zscoreMax" -> zscoreMax))
        }
        .collect
      val results = JsonFeatureCollection(features).toJson
  • 21. Results: Histograms [Figures: Global Red Band Histogram, Global NDVI Histogram, and Global NIR Band Histogram for 2017-06-07; y-axis: Count x100000]
  • 22. Results: Top NDVI for 2017-06-07
  • 24. RasterFrame Take-Aways • DataFrames lower cognitive friction when modeling. Good Ergonomics! • Rich set of raster processing primitives • Support for descriptive and predictive analysis • Interact with data and iterate on solutions via spark-shell, Jupyter Notebook, Zeppelin, etc. • It scales! • Many more examples at http://rasterframes.io
  • 25. Getting Started • Try it out via Jupyter Notebooks: docker pull s22s/rasterframes-notebooks • Documentation: http://rasterframes.io • Code: https://github.com/locationtech/rasterframes • Chat: https://gitter.im/s22s/raster-frames • Social: @metasim on GitHub & Twitter • Company: http://www.astraea.earth
  • 26. Shout Outs • Thanks to LocationTech • For Incubating RasterFrames; mentoring by Jim Hughes & Rob Emanuele • The teams behind GeoTrellis, GeoMesa, JTS, & SFCurve • Thanks to NASA, USGS, & NOAA • Supporting public access to massive curated data sets is not easy! • Upcoming Astraea Presentations • Machine Learning, FOSS, and open data to map deforestation trends in the Brazilian Amazon Courtney Whalen & Jason Brown Tuesday, May 15, 2018 - 4:30 to 5:05 (right after this presentation) Gateway 1 • Using Deep Learning to Derive 3D Cities from Satellite Imagery Eric Culbertson Wednesday, May 16, 2018 - 2:00 to 2:35 Gateway 2 • Please visit Astraea • Booth #14
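
The NDVI step on slide 17 comes down to cell-wise arithmetic, NDVI = (NIR - Red) / (NIR + Red). The sketch below shows that arithmetic in plain Python, without Spark or GeoTrellis, so it can be run anywhere. The `Tile` alias, the use of `None` for NoData, and the choice to treat a zero denominator as NoData are assumptions made for illustration; they are not taken from the RasterFrames or GeoTrellis APIs.

```python
from typing import List, Optional

# Hypothetical stand-in for a raster tile: rows of cells, None marks NoData.
Tile = List[List[Optional[float]]]


def ndvi(nir: Tile, red: Tile) -> Tile:
    """Cell-wise (nir - red) / (nir + red).

    Cells where either input is NoData, or where nir + red is zero,
    are emitted as NoData (an assumption of this sketch).
    """
    out: Tile = []
    for nir_row, red_row in zip(nir, red):
        row: List[Optional[float]] = []
        for n, r in zip(nir_row, red_row):
            if n is None or r is None or n + r == 0:
                row.append(None)
            else:
                row.append((n - r) / (n + r))
        out.append(row)
    return out


# Tiny 2x2 example tiles (made-up reflectance values).
nir_tile: Tile = [[0.5, 0.8], [0.0, None]]
red_tile: Tile = [[0.1, 0.2], [0.0, 0.3]]
result = ndvi(nir_tile, red_tile)
```

In the talk's Scala version the same expression is written once over whole tiles, and Spark applies it to every row of the joined DataFrame through a UDF.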

Editor's Notes

  1. Unlock the wealth of information in global remote sensing data. Do we all agree that geospatial raster data has a wealth of potential information that can be gleaned from it? My role at Astræa is to apply the art and discipline of software engineering to make data scientists efficient and effective in solving these problems.
  2. To empower, think about how people approach problems. Let’s think about the context for solving problems. We want our models to make a big impact; you must aim for global impact. This is hard for reasons that are both obvious and not so obvious.
  3. Spatial: 500m, 30m, 1m, 0.3m. Temporal: weeks, days, hours. Spectral: 4 bands, 7 bands, 34 bands, 200+ bands. Active sensors (SAR, LiDAR). Metadata: Coordinate Reference System, Temporal/Spatial Extent, QA Flags, Calibration parameters.
  4. The dreaded hockey stick. Thanks to Baynes and the EOSDIS team.
  5. The prior challenges are kind of obvious. This is what adds the friction. Need better ergonomics. Who likes DataFrames? Who’s familiar with Spark? Spark as a frontrunner in compute over industry data.
  6. To effectively and efficiently deliver the power of high-performance computing, advanced machine learning, and remote sensing to our users, RasterFrames provides the ability to work with global EO data in a data frame format, familiar to most data scientists.
  7. Just a Spark DataFrame, but with special components. “Tile” and “TileLayerMetadata” are types from the GeoTrellis library. STK is “Space Time Key”. Conceptually you can also think of it as a map layer.
  8. Regularly growing API.
  9. Explain NDVI?: Normalized Difference Vegetation Index. Somewhat contrived example for the purposes of highlighting some of RF's features. I’m not a data scientist... The results haven’t been validated; this is just a computational proxy for real analysis.
  10. We are not specifying a region of interest. We are computing this for the whole world.
  11. Code examples are in Scala (my native language). They look very similar in Python. This front-end section is currently data source (MODIS on PDS) specific. RasterFrames readers integrate directly with the Spark DataSource API. Aim: nice ergonomics.
  12. “Tile” and associated operations come from the GeoTrellis library. UDF == User Defined Function. Also a gateway to scoring by CNN.
  13. Another example of a UDF.
  14. Top 20 tiles with highest NDVI z-score. Not validated, but some interesting points of note for further investigation.
  15. r3.xlarge (8 cores, 30GB RAM).
  16. Please thank your civil servants.
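
The scoring and ranking described in the slides (global NDVI statistics, a per-tile z-score range, then the top tiles by maximum z-score) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the talk's implementation: tiles are flat lists of NDVI values keyed by a made-up tile id, NoData is not handled, and the global variance is computed directly as a population variance instead of being read from an aggregate histogram.

```python
import math
from typing import Dict, List, Tuple


def zscore_range(cells: List[float], mean: float, stddev: float) -> Tuple[float, float]:
    """Return (min, max) of the z-scores of one tile's cells."""
    z = [(c - mean) / stddev for c in cells]
    return min(z), max(z)


def top_tiles(tiles: Dict[str, List[float]], n: int) -> List[Tuple[str, float]]:
    """Rank tiles by their maximum z-score against global statistics."""
    all_cells = [c for cells in tiles.values() for c in cells]
    mean = sum(all_cells) / len(all_cells)
    # Population variance over every cell of every tile (sketch assumption).
    variance = sum((c - mean) ** 2 for c in all_cells) / len(all_cells)
    stddev = math.sqrt(variance)
    scored = [(key, zscore_range(cells, mean, stddev)[1])
              for key, cells in tiles.items()]
    # Highest maximum z-score first, keep the first n.
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical tile ids and NDVI values.
tiles = {"a": [0.1, 0.2], "b": [0.7, 0.9], "c": [0.3, 0.4]}
ranked = top_tiles(tiles, 2)
```

Because the z-score is a monotonic transform of NDVI for fixed global statistics, ranking by maximum z-score orders tiles the same way as ranking by maximum NDVI; the z-score adds a scale that is comparable across dates.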