1. Three years of the ExtremeEarth project
Online workshop - December 9th 2021
Theofilos Kakantousis
Desta Haileselassie Hagos
Logical Clocks, KTH
The ExtremeEarth platform: scalable deep learning
pipelines with Earth observation data and Hopsworks
2. ExtremeEarth
From Copernicus Big Data
to Extreme Earth Analytics
This project has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement No 825258.
3. Contents
1. ExtremeEarth platform architecture
2. End-to-end scalable deep learning
pipelines with Hopsworks
3. Exploitation of results
4. Research
5. Background
• The Copernicus programme produces more than three petabytes (PB) of Earth Observation (EO)
data annually from Sentinel satellites.*
• Data and Information Access Services (DIAS) provide centralised access to Copernicus data and
processing tools.
• European Space Agency (ESA) Thematic Exploitation Platforms (TEPs) make sure complex data
streams are exploited to their full potential.
○ Food Security, Polar
• Hopsworks Data-Intensive AI platform brings scalable AI support for Earth Observation data.
* https://workshop.copernicus.eu/sites/default/files/content/attachments/ajax/copernicus_overview.pdf
7. ExtremeEarth architecture goals
• ExtremeEarth brings together these components
○ Under the same architecture…
○ … and infrastructure.
○ Reduce cost and increase productivity by providing a seamless end-user experience without
having to manage different services
• Combine
○ EO data access from DIASes
○ End-user facing EO data products from TEPs
○ Scalable AI capabilities of Hopsworks
9. ExtremeEarth architecture deep dive 1/2
• Infrastructure provided by Creodias and
managed by the TEPs
○ OpenStack cluster with GPU support
• Data layer with multiple data sources
○ Raw Creodias data
○ Intermediate TEP data
○ Training datasets
• Processing layer provided by Hopsworks.
○ Core AI engine
○ Develop PB-scale machine learning
algorithms with deep learning
architectures.
○ Platform that provides support for
semantic data tools
10. ExtremeEarth architecture deep dive 2/2
• Product layer
○ Hopsworks serves AI products to
external clients
• User interface
○ Hopsworks is integrated with the
TEPs via APIs
○ TEP users make direct use of AI
models developed in Hopsworks.
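One way a TEP-side client could call a model served by Hopsworks is an HTTP request with a JSON body. The sketch below only builds such a request and does not send it. It assumes the deployed model accepts the TensorFlow Serving REST payload format (`{"instances": [...]}`); the host, the path layout, the model name `sea_ice_classifier` and the API key header value are hypothetical placeholders, not the project's actual endpoints.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, instances, api_key):
    """Build (but do not send) a prediction request.

    The path layout and the authorization header value are illustrative
    assumptions; the real endpoint and authentication scheme should be
    taken from the platform documentation.
    """
    url = f"{base_url.rstrip('/')}/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",
        },
        method="POST",
    )


# Hypothetical usage: one 2x2 single-band image patch as model input.
request = build_predict_request(
    "https://example.invalid/inference",
    "sea_ice_classifier",
    [[[0.1, 0.2], [0.3, 0.4]]],
    "PLACEHOLDER",
)
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.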
13. ExtremeEarth running in production
• Hopsworks installed alongside TEP
infrastructure on CREODIAS
○ https://hopsworks.polartep.io
• Provides easy EO data access and
machine learning development tooling to
developers and data scientists.
• Deep learning architectures developed on
this Hopsworks cluster for the Food
Security and Polar use cases.
15. Hopsworks
Open source platform to develop end-to-end machine learning pipelines at scale
for Enterprise AI.
Use your tools of choice and serve at the lowest latency on any cloud, at any
scale.
The Data Platform for AI
16. Organizations are struggling to deploy AI because of Data
● “87% identified data as the reason their organizations failed to successfully implement AI.”* (VentureBeat)
* https://venturebeat.com/2021/03/24/employees-attribute-ai-project-failure-to-poor-data-quality/
[Diagram: steps between stored data and a deployed AI application]
● Where the data is (storage)
● Discover and access the data
● Clean, join and aggregate the data
● Extract the data
● Transform the data into features
● Validate the data
● Make the process repeatable 🔁
● Serve for real-time applications or train 🏆
17. Growing Consensus on How to Manage Complexity of AI
[Diagram*: components of a machine learning system]
● Data Collection, Data Validation, Feature Engineering
● Hardware Management, Distributed Training, Hyperparameter Tuning
● Pipeline Management, Model Serving, A/B Testing, Monitoring
● Flow labels: Data, Model, Prediction, φ(x)
* Diagram from Google’s paper Hidden Technical Debt in Machine Learning Systems
18. Growing Consensus on How to Manage Complexity of AI
[Same diagram* as on the previous slide, with added overlays labelled FEATURE STORE and FEATURE ENGINEERING]
* Diagram from Google’s paper Hidden Technical Debt in Machine Learning Systems
19. Growing Consensus on How to Manage Complexity of AI
[Same diagram* as on the previous slides, with overlays labelled FEATURE STORE, FEATURE ENGINEERING, and ML PLATFORM (TRAIN and SERVE)]
* Diagram from Google’s paper Hidden Technical Debt in Machine Learning Systems
20. Scalable end-to-end deep learning pipelines
● Horizontally scalable infrastructure that enables developers to manage the lifecycle of EO
machine learning applications
21. End-to-end machine learning components
[Diagram labels]
● Data sources: Data Warehouse, Data Lake, Streaming
● Feature Engineering
● Feature Store: Offline Feature Store (Batch Access), Online Feature Store (Feature Vectors)
● Train/Test Data (S3, HDFS, etc)
● Model Training, Model Repository, Deploy
● Model Serving, Online Application, Monitor
● Batch Scoring, Result Sink (DB)
● Storage: HopsFS, Scaleout Metadata
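The split between an offline and an online feature store in the diagram above can be illustrated with a toy in-memory class. This is a minimal sketch of the concept only, not the Hopsworks Feature Store API: the class `TinyFeatureStore`, its methods, and the tile and feature names are all hypothetical. The offline side keeps every historical row for batch access and training; the online side keeps only the latest values per entity so a serving application can fetch a feature vector quickly.

```python
from collections import defaultdict


class TinyFeatureStore:
    """Toy illustration of the offline/online split; not the Hopsworks API."""

    def __init__(self):
        self.offline = defaultdict(list)  # entity id -> [(timestamp, features)]
        self.online = {}                  # entity id -> latest features

    def insert(self, entity_id, timestamp, features):
        # Append to the full history, then refresh the latest-value view.
        self.offline[entity_id].append((timestamp, dict(features)))
        latest = max(self.offline[entity_id], key=lambda row: row[0])
        self.online[entity_id] = latest[1]

    def training_rows(self):
        """Batch access: every historical row, ordered by entity and time."""
        return [
            (entity_id, ts, feats)
            for entity_id in sorted(self.offline)
            for ts, feats in sorted(self.offline[entity_id], key=lambda r: r[0])
        ]

    def feature_vector(self, entity_id, names):
        """Online access: latest values in a fixed order for model serving."""
        row = self.online[entity_id]
        return [row[name] for name in names]


store = TinyFeatureStore()
store.insert("tile_42", 1, {"ndvi_mean": 0.31, "cloud_cover": 0.10})
store.insert("tile_42", 2, {"ndvi_mean": 0.35, "cloud_cover": 0.05})
store.insert("tile_7", 1, {"ndvi_mean": 0.60, "cloud_cover": 0.00})
```

Here `store.training_rows()` would feed model training, while `store.feature_vector("tile_42", ["ndvi_mean", "cloud_cover"])` is the kind of lookup an online application would make at prediction time.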
22. Hopsworks - one open source platform with all the tools
[Diagram labels]
● Consumers: Applications, API, Dashboards
● Data Preparation & Ingestion: Hopsworks Datasource; Batch (Apache Spark); Streaming (Apache Spark, Apache Flink); Apache Kafka
● Hopsworks Feature Store
● Experimentation & Model Training: Distributed ML & DL (Pip, Conda, TensorFlow, scikit-learn, PyTorch, Jupyter Notebooks, TensorBoard)
● Deploy & Productionalize: Model Serving (Kubernetes); Model Monitoring (Kafka + Spark Streaming)
● Orchestration in Airflow
● Filesystem & metadata storage in HopsFS
24. Distributed deep learning with Hopsworks
# RUNS ON THE WORKERS
def train():
    import tensorflow as tf

    def input_fn():
        ...  # return dataset

    model = …
    optimizer = …
    model.compile(…)
    history = model.fit(…)

    # Final-epoch metrics, returned to the driver.
    metrics = {
        'train_loss': history.history['loss'][-1],
        'train_accuracy': history.history['accuracy'][-1],
        'val_loss': history.history['val_loss'][-1],
        'val_accuracy': history.history['val_accuracy'][-1],
    }
    # Estimator-based variant shown on the slide (keras_estimator is not defined here):
    # tf.estimator.train_and_evaluate(keras_estimator, input_fn)
    return metrics

# RUNS ON THE DRIVER
from hops import experiment

experiment.mirrored(train, name='distributed',
                    metric_key='val_accuracy')
[Diagram: a Driver (TF_CONFIG) and eight workers, W1 to W8, on top of HopsFS, which holds Metrics, TensorBoard, Checkpoints, Training Data, Models and Logs]
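The diagram labels the driver with TF_CONFIG, the environment variable TensorFlow's multi-worker strategies read to discover the cluster. As a rough sketch of what each worker receives, the snippet below builds one TF_CONFIG value per worker by hand. The helper `make_tf_config` and the host addresses are hypothetical; how Hopsworks actually assembles and ships this value is not shown on the slide.

```python
import json


def make_tf_config(worker_hosts, task_index):
    """Return the TF_CONFIG JSON string for one worker."""
    if not 0 <= task_index < len(worker_hosts):
        raise ValueError("task_index must refer to an entry in worker_hosts")
    return json.dumps({
        # Every worker sees the same cluster description ...
        "cluster": {"worker": list(worker_hosts)},
        # ... and differs only in its own task index.
        "task": {"type": "worker", "index": task_index},
    })


# Hypothetical addresses for the eight workers W1..W8 in the diagram.
hosts = [f"10.0.0.{i}:2222" for i in range(1, 9)]
configs = [make_tf_config(hosts, i) for i in range(len(hosts))]

# On worker i, the value would be exported before TensorFlow builds its
# distribution strategy: os.environ["TF_CONFIG"] = configs[i]
```

Because the cluster section is identical everywhere, a single coordinator that knows all worker addresses can generate every worker's configuration.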
25. Hyperparameter tuning with Maggy
● Library for distribution-transparent machine
learning experiments on Apache Spark
● Not bound to stage-based algorithms, unlike
existing frameworks.
● Directed hyperparameter search (ASHA,
Bayesian optimization) on TensorFlow, PyTorch,
scikit-learn and XGBoost
● Unified, real-time logging in Jupyter
notebooks.
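ASHA builds on successive halving: train many configurations on a small budget, keep the best fraction, and give the survivors a larger budget. The sketch below shows the simpler synchronous variant in plain Python to make the idea concrete. It is not Maggy's implementation or API; ASHA additionally promotes configurations asynchronously rather than waiting for a whole round to finish. The function names, the `eta` reduction factor and the toy scoring function are assumptions for illustration.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Return the configuration that survives repeated halving.

    evaluate(config, budget) must return a score where higher is better.
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank all survivors at the current budget, best first.
        scored = sorted(
            survivors, key=lambda c: evaluate(c, budget), reverse=True
        )
        # Keep the top 1/eta and give them eta times more budget.
        survivors = scored[:max(1, len(scored) // eta)]
        budget *= eta
    return survivors[0]


def toy_evaluate(config, budget):
    # Hypothetical stand-in for training: the score peaks at lr = 0.01.
    return -abs(config["lr"] - 0.01) * (1 + 1.0 / budget)


rng = random.Random(0)
candidates = [{"lr": 10 ** rng.uniform(-4, -1)} for _ in range(16)]
best = successive_halving(candidates, toy_evaluate)
```

With 16 candidates and `eta=2`, the rounds evaluate 16, 8, 4 and 2 configurations at budgets 1, 2, 4 and 8, so most of the compute goes to the promising ones.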
26. Ablation studies with Maggy
● Parallel ablation studies: without
changing your inner training loop in
TensorFlow/Keras, evaluate (in
parallel) the effect of different
layers, dataset features, etc.
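A leave-one-component-out ablation study can be sketched in a few lines: train once with everything enabled, then once per removed component, and compare scores. The snippet below is a sequential toy version, not Maggy's API. The component names and the `toy_score` function are hypothetical stand-ins for real layers, input features and training runs. Each trial is independent of the others, which is what makes running them in parallel possible.

```python
def ablation_trials(components):
    """Yield (ablated_name, remaining_components); None marks the full model."""
    yield None, list(components)
    for name in components:
        yield name, [c for c in components if c != name]


def toy_score(active):
    # Hypothetical stand-in for training and validating a model.
    contribution = {"conv_block_2": 0.08, "dropout": 0.01, "band_nir": 0.15}
    return round(0.60 + sum(contribution[c] for c in active), 2)


components = ["conv_block_2", "dropout", "band_nir"]
results = {name: toy_score(active) for name, active in ablation_trials(components)}

# Impact of a component = score of the full model minus score without it.
baseline = results[None]
impact = {
    name: round(baseline - score, 2)
    for name, score in results.items()
    if name is not None
}
most_important = max(impact, key=impact.get)
```

The training function itself never changes between trials; only the list of active components passed to it does.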
30. Exploitation
● Hopsworks is now extended with EO data support
● Creates opportunities to onboard new use cases for AI with EO data
o Hopsworks as the AI platform for other research projects, such as H2020 DeepCube
● Hopsworks as a product offering
o With the Polar and Food Security TEPs ExtremeAI platform
o Can be seamlessly integrated with further DIASes
o Offered as SaaS at hopsworks.ai on public clouds such as Amazon AWS and Microsoft Azure
32. Publications
o The ExtremeEarth Software Architecture for Copernicus Earth Observation Data. (Conference
paper)
▪ Published: Conference on Big Data from Space (BiDS21).
o ExtremeEarth Meets Data From Space (Journal paper).
▪ Published: IEEE Journal of Selected Topics in Applied Earth Observations and Remote
Sensing (JSTARS) (2021).
o Maggy: Scalable Asynchronous Parallel Hyperparameter Search. (Conference paper)
▪ Published: The 1st Workshop on Distributed Machine Learning (DistributedML'20).
o AutoAblation: Automated Parallel Ablation Studies for Deep Learning. (Conference paper)
▪ Published: The 1st Workshop on Machine Learning and Systems (EuroMLSys'21).
o Scalable Artificial Intelligence for Earth Observation Data Using Hopsworks. (Journal paper)
▪ Under preparation: IEEE Journal of Selected Topics in Applied Earth Observations and
Remote Sensing (JSTARS) (2021). ⇒ Will be submitted soon.
• Published papers: http://earthanalytics.eu/publications.html
33. Blog posts
o AI Software Architecture for Copernicus Data with Hopsworks.
▪ July 2021 (link)
o End-to-end Deep Learning Pipelines with Earth Observation Data in Hopsworks.
▪ October 2021 (link)