Publishing and Serving Machine Learning Models with DLHub
1. Ryan Chard, Logan Ward, Zhuozhao Li, Yadu Babuji,
Anna Woodard, Steven Tuecke, Kyle Chard, Ben Blaiszik, and Ian Foster
PEARC 2019
https://www.dlhub.org
3. Scientific ML
- Where are the model and trained weights?
- How do I run the model on my data?
- How do I scale my model to run on my cluster?
- Should I run the model on my data?
- How do I share my model with the community?
- How can I build on this work?
- How can I create pipelines composed of many models?
4. Scientific ML
Unique scientific requirements:
- Publication, citation, reuse
- Reproducibility
- Research infrastructure
- Scalability
- Low latency
- Research ecosystem
- Workflows
Need general solutions to support vanguard model types, implementations, dependencies, runtimes, data, and infrastructures
5. Data and Learning Hub for Science
• Collect, publish, and categorize models from many disciplines (materials science, physics, chemistry, genomics, etc.)
• Serve model inference on demand via an API to simplify sharing, consumption, and access
• Enable new science through reuse, real-time model-in-the-loop integration, and synthesis and ensembling of existing models
https://github.com/DLHub-Argonne
6. DLHub Model Repository
• Register model metadata, weights, and files to improve discoverability and reusability
• Containerize the model to enhance interoperability
• Identify the model with a permanent identifier (e.g., DOI, minid)
• Version the model and its data pre-/post-processing steps
[Figure: a user collects data, trains a model locally, and registers it with the DLHub model repository via the DLHub SDK]
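The registration step above bundles descriptive metadata with the trained weights. As a rough sketch of the kinds of fields the slide lists (identifier, version, files, processing steps), assuming hypothetical field names rather than the actual DLHub schema:

```python
# Sketch of a model registration record. Field names are hypothetical
# illustrations of the metadata categories the slide describes; this is
# not the real DLHub schema.

def register_model(title, authors, weights_file, preprocessing,
                   postprocessing, version="1.0", identifier=None):
    """Assemble a metadata record for a trained model."""
    return {
        "citation": {"title": title, "creators": authors},
        "dlhub": {
            "version": version,
            # A permanent identifier (e.g., a DOI or minid) would be
            # minted by the service; None means "not yet assigned".
            "identifier": identifier,
        },
        "servable": {
            "files": {"model": weights_file},
            "preprocessing": preprocessing,
            "postprocessing": postprocessing,
        },
    }

record = register_model(
    title="Formation-enthalpy predictor",
    authors=["Doe, Jane"],
    weights_file="model.h5",
    preprocessing=["featurize_composition"],
    postprocessing=["unscale_output"],
)
```

Versioning both the model file and the processing steps in one record is what makes a published servable reproducible end to end.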
7. DLHub Model Serving
• Servables are self-contained images
• Deploy servables for on-demand inference
• Scale deployments based on load
• Inference can be performed via the SDK, CLI, or REST requests
[Figure: a user finds a model, calls DLHub with input data (e.g., sends compositions), and receives predicted properties]
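The find-call-receive loop above can be mimicked with a tiny in-process registry. This is a stand-in for the interaction pattern only; the names and dispatch mechanism are illustrative, not the DLHub API:

```python
# Minimal stand-in for the serving loop: a registry maps servable names
# to callables; a user looks one up and invokes it with their inputs.
# Illustrates the pattern only -- not the DLHub API.

SERVABLES = {}

def deploy(name, fn):
    """Register a callable under a servable name."""
    SERVABLES[name] = fn

def run(name, inputs):
    """Find the servable and return its prediction for the inputs."""
    servable = SERVABLES[name]   # "Find Model"
    return servable(inputs)      # "Call DLHub" -> "Receive Predictions"

# Toy "property predictor": counts distinct elements in each composition.
deploy("toy/composition-size", lambda comps: [len(c) for c in comps])

preds = run("toy/composition-size", [{"Fe": 2, "O": 3}, {"Na": 1, "Cl": 1}])
# preds == [2, 2]
```

In the real system the lookup goes through a search index and the call crosses a REST boundary, but the contract (name in, predictions out) is the same.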
8. DLHub Servables
- Self-contained images
  - Embed model architecture, weights, and dependencies
  - Support almost any model type and implementation
  - repo2docker builds servables with almost arbitrary dependencies (apt, pip, R, etc.)
- Servables include the DLHub SDK as a shim for loading and interacting with models
  - Recognizes the model type from metadata and loads it appropriately
  - Facilitates secure data staging so the servable can download data directly on the user's behalf
- Deploy servables for scalable inference
  - Kubernetes pods
  - docker2singularity for HPC
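The shim's "recognize the model type and load it appropriately" step is essentially a dispatch on metadata. A minimal sketch, assuming hypothetical loader names and metadata fields (not DLHub internals):

```python
# Sketch of the shim pattern: read the model type from servable metadata
# and dispatch to a matching loader. Loader functions are hypothetical
# placeholders, not DLHub internals.

LOADERS = {}

def loader(model_type):
    """Decorator registering a loader for one model type."""
    def wrap(fn):
        LOADERS[model_type] = fn
        return fn
    return wrap

@loader("keras")
def load_keras(metadata):
    return f"keras model from {metadata['files']['model']}"

@loader("sklearn")
def load_sklearn(metadata):
    return f"sklearn model from {metadata['files']['model']}"

def load_servable(metadata):
    """Recognize the model type from metadata and load it appropriately."""
    model_type = metadata["type"]
    if model_type not in LOADERS:
        raise ValueError(f"no loader for model type {model_type!r}")
    return LOADERS[model_type](metadata)

model = load_servable({"type": "keras", "files": {"model": "weights.h5"}})
```

A registry like this is what lets one shim front "almost any model type": supporting a new framework means registering one more loader, not changing the serving layer.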
10. DLHub Features
• Security model
  ○ Provided from publication through inference
  ○ Globus Auth: log in with one of hundreds of supported identity providers (e.g., institutions, ORCID, Google)
• DLHub CLI and SDK
  ○ Describe, publish, share, and invoke models
• DLHub model search
  ○ Rich model metadata
  ○ Metadata stored in a flexible search index built on Globus Search
11. DLHub Architecture
• Management Service for users to publish, search, and run inference
• Task Managers (TMs) support deployment on various compute resources
  ○ Built on Parsl, a Python library that supports parallel execution on many sites
• Executors on execution sites invoke servables
• Optimizations: memoization, data staging with Globus, batch submissions, and scalability through deployment of model replicas
12. DLHub Performance: Scale Testing
[Figure: scaling performance of the IPP and HTEX executors]
• Deployed the servables on PetrelKube, a 14-node Kubernetes cluster
• Parsl IPyParallel (IPP) and HighThroughput (HTEX) executors
• 10,000 batch inferences of a "no-op" servable
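Batch submission, used in the scale test above and listed earlier among the architecture's optimizations, amortizes per-request overhead by packing many inputs into each call. A client-side sketch (batch sizes are illustrative, not DLHub defaults):

```python
# Sketch of client-side batching: group inputs into fixed-size chunks so
# each round trip carries many inference requests. Sizes are illustrative.

def batches(items, size):
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def run_batched(servable_fn, inputs, batch_size=100):
    """Invoke the servable once per batch and concatenate the results."""
    results = []
    for chunk in batches(inputs, batch_size):
        results.extend(servable_fn(chunk))   # one request per batch
    return results

noop = lambda chunk: chunk                   # stands in for a "no-op" servable
out = run_batched(noop, list(range(250)), batch_size=100)  # 3 requests
```

With a fixed per-request latency, 250 inputs cost 3 round trips instead of 250, which is why batching dominates throughput results in tests like the one above.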
13. DLHub Performance: Serving General Models
[Figure: latency of the IPP and HTEX executors]
• Deployed the executors on PetrelKube, a 14-node Kubernetes cluster
• Parsl IPyParallel (IPP) and HighThroughput (HTEX) executors
• 1,000 repeated inferences of a "no-op" servable
14. Using DLHub is Easy!
Python SDK: $ pip install dlhub_sdk
Command Line Interface: $ pip install dlhub_cli
15. Marking up a Model – Python SDK
[Figure: workflow — the user marks up an existing model with the SDK, which extracts metadata for known model types; the model is sent to DLHub (via Globus or HTTPS), where it is containerized, identifiers are minted, and the search index is populated]
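The "extracts metadata for known model types" step above can be sketched as a small heuristic: inspect the model file and seed a metadata record from what is recognized. The extension-to-framework mapping below is a hypothetical illustration, not how the DLHub SDK actually detects types:

```python
from pathlib import Path

# Sketch of automated metadata extraction: infer a model's framework from
# its file name and seed a metadata record. The extension mapping is a
# hypothetical heuristic, not the DLHub SDK's detection logic.

KNOWN_TYPES = {".h5": "keras", ".pkl": "sklearn", ".pt": "pytorch"}

def extract_metadata(path):
    """Build a starter metadata record for a model file."""
    suffix = Path(path).suffix
    model_type = KNOWN_TYPES.get(suffix, "unknown")
    return {"files": {"model": path}, "type": model_type}

meta = extract_metadata("weights.h5")
# meta["type"] == "keras"
```

Automating even this much spares the user from hand-writing boilerplate fields, leaving only the citation and description metadata to fill in.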
16. Python SDK – Automated Metadata Generation
[Figure: examples of the citation, DLHub, and servable metadata generated by the SDK]
Access Control
• Public
• Globus users
• Globus groups
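The three access-control modes above (public, named Globus users, Globus groups) amount to a short policy check. A minimal sketch with illustrative policy field names:

```python
# Sketch of the three access-control modes listed above: public, an
# explicit user allow-list, or group membership. Policy field names are
# illustrative, not DLHub's actual policy format.

def can_access(policy, user, user_groups=()):
    """Return True if `user` may see or invoke the servable."""
    if policy.get("public"):
        return True
    if user in policy.get("users", ()):
        return True
    return any(g in policy.get("groups", ()) for g in user_groups)

allowed = can_access({"groups": ["matsci"]}, "bob", user_groups=["matsci"])
# allowed == True
```

Delegating identity and group membership to Globus Auth means the service only evaluates a policy like this; it never manages passwords or group rosters itself.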
18. Denoising Tomography Data with TomoGAN
• Tomography data yield important insights for a number of different fields; however, the raw data are noisy
• TomoGAN denoises tomography data using a generative adversarial network (GAN)
• A powerful tool for quickly denoising measurements at scale
20. DLHub Summary
Model deposit and discovery
- Developed a model schema to promote discovery
- Implemented advanced search and filtering
- Built an ingest flow: models are dynamically staged, packaged, dockerized, published, and indexed
Model serving
- Deployed capabilities for users to run inference with the SDK and CLI
- Automated testing of containers
- Implemented caching and batching
Support for multiple execution sites
- PetrelKube: Parsl, TF Serving, SageMaker
- Other: AWS, OSG
Authentication
- Protected model metadata and inference with Globus Auth
- Secured data staging
Future work
- Build a web UI to create pipelines and invoke models
- Cache at the servable level within pipelines
- Couple DLHub to data sources (MDF, etc.)
- Integrate with ML frontend tools (DeepForge), optimization tools (DeepHyper), and more
- Create an interface for training and retraining models
21. Thanks to our sponsors!
[Sponsor logos: U.S. Department of Energy, ALCF, DF, Parsl, Globus, IMaD, DLHub, Argonne LDRD]
22. Learning Systems
Model Repositories
- Catalog and aggregate models
- Enable discovery and citation
- Capture provenance
- Record performance data
- Mint identifiers
Model Serving
- On-demand model inference
- Scalable deployments
- Standardized interfaces
- Low latency vs. ease of use
Drawbacks:
1. Current serving platforms are not usable on most HPC platforms
2. No single integrated system provides both model repositories and model serving