Alluxio Day XV
September 15, 2022
For more on Alluxio Day: https://www.alluxio.io/alluxio-day/
For more Alluxio events: https://alluxio.io/events/
Speaker: Lu Qiu (Machine Learning Engineer and PMC Maintainer, Alluxio)
This talk walks through three levels of using Alluxio to speed up your cloud training, with production use cases from Microsoft, Alibaba, and Boss Zhipin.
- Level 1: Speed up data ingestion from cloud storage
- Level 2: Speed up data preprocessing and training workloads
- Level 3: Speed up full training workloads with a unified data orchestration layer
2. Lu Qiu
● Machine Learning Engineer @ Alluxio
● Alluxio PMC maintainer
● Master of Data Science @ GWU
● Responsible for integrating Alluxio with deep learning
● Areas: Alluxio fault-tolerance, journal system, metrics system, POSIX API, and Alluxio integration with cloud
3. Agenda
● Alluxio and its POSIX API
● Accelerating Cloud Training with Alluxio
○ Round 1: Accelerating storage reads
○ Round 2: Data preprocessing & training
○ Round 3: Data orchestration layer
7. ALLUXIO: DATA LOCALITY
Local performance for remote data with intelligent multi-tiering
[Diagram: hot/warm/cold data tiered across RAM, SSD, and HDD; read & write buffering, transparent to the app; policies for pinning, promotion/demotion, and TTL; spanning on-premises and public cloud; serving model training, big data ETL, and big data queries]
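Those tiering policies can be driven per path from the Alluxio CLI. A minimal sketch, with hypothetical paths:
$ bin/alluxio fs pin /s3/hot-training-set      # keep this path resident in Alluxio storage
$ bin/alluxio fs setTtl /s3/tmp 86400000       # expire cached entries after 24h (TTL in ms)
$ bin/alluxio fs free /s3/cold-archive         # evict from cache; data remains in under storage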
8. ALLUXIO: METADATA LOCALITY
Synchronization of changes across clusters
[Diagram: a mutation replaces the old file at path /file1 with a new file; the Alluxio Master synchronizes the metadata change across on-premises and public cloud clusters serving model training, big data ETL, and big data queries]
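How often clients re-synchronize metadata with the under storage is configurable; the alluxio.user.file.metadata.sync.interval property controls it (the value below is illustrative):
# conf/alluxio-site.properties
# -1 never re-syncs after the first load, 0 syncs on every access,
# a positive interval bounds the staleness of cached metadata
alluxio.user.file.metadata.sync.interval=1min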
10. Alluxio POSIX API
Accessing remote/distributed data as local directories
Connecting to:
● HDFS
● Amazon S3
● Azure
● Google Cloud
● Ceph
● NFS
● Many more
[Diagram: multiple under storages (HDFS #1, HDFS #2, object store, NFS) mounted into a single Alluxio namespace exposed through the POSIX API]
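One way to get that local-directory view is the Alluxio FUSE integration, which mounts the Alluxio namespace as a local path; a minimal sketch, with a hypothetical mount point:
$ integration/fuse/bin/alluxio-fuse mount /mnt/alluxio /
$ ls /mnt/alluxio          # the Alluxio namespace now appears as ordinary local files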
13. Round 1: Accelerating Under Storage Data Access
[Diagram: training clusters on a Kubernetes cloud cluster, each node caching data on SSD; read buffering, transparent to the app; policies for pinning, promotion/demotion, and TTL; backed by the under storage]
14. One Click to Mount UFS to Alluxio
All data located in s3://<bucket_name>/ will be cached by Alluxio, providing data locality for training jobs.
$ bin/alluxio fs mount /s3 s3://<bucket_name>/ \
    --option aws.accessKeyId=<access_key> \
    --option aws.secretKey=<secret_key>
One Click to Load All Training Data into Alluxio
$ bin/alluxio fs distributedLoad /s3
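A quick sanity check after mounting and loading (exact output varies by version): listing the path reports, among other fields, how much of each file is cached in Alluxio.
$ bin/alluxio fs ls /s3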
16. Alluxio @ Microsoft
Task
● More than 400 tasks need to read data from Azure and write data back to Azure
● The total data size is larger than 1 TB
Previously, they copied data directly from the cloud to the training nodes.
Challenges
● Easy to exceed the request rate limit: Azure blob-fuse requires downloading data from Azure to local storage before starting the tasks, and uploading data back to Azure after finishing them
● Large amounts of data input and output easily cause I/O errors
● GPUs sit idle while waiting for I/O operations
https://www.alluxio.io/resources/videos/speed-up-large-scale-ml-dl-offline-inference-job-with-alluxio/
17. Alluxio @ Microsoft
Alluxio speeds up training by 18%, reducing I/O wait time and improving training performance:
● Pre-cache data to improve performance
● Dynamically cache data during training
● Share data across multiple tasks
Streaming reads disperse I/O requests and avoid exceeding the cloud storage request limit.
Automatic retries reduce the I/O error rate.
https://www.alluxio.io/resources/videos/speed-up-large-scale-ml-dl-offline-inference-job-with-alluxio/
19. Round 2: Data Preprocessing to Training Speed-Up
[Diagram: a big data ETL cluster and training clusters sharing data through Alluxio; read buffering, transparent to the app; policies for pinning, promotion/demotion, and TTL]
20. Alluxio @ Boss Zhipin
Task
● Use Spark/Flink to process data
● Train models on top of the processed data
Previous solution
● Spark/Flink + Ceph + model training
Problems
● Writing temporary files into Ceph puts high pressure on Ceph
● Ceph read/write pressure cannot be controlled, making the cluster unstable
Solution with Alluxio
● Spark/Flink + Alluxio + Ceph + Alluxio + model training (see the mount sketch below)
● Alluxio supports multiple data sources and multiple compute/training frameworks
● Multiple independent Alluxio clusters support multi-tenancy, customized configuration, and access control
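Ceph is commonly exposed through its S3-compatible gateway (RGW), so one plausible way to wire it into this pipeline is mounting it into Alluxio like an S3 bucket; a sketch, where the endpoint, bucket, and credentials are all placeholders:
$ bin/alluxio fs mount /ceph s3://<bucket>/ \
    --option alluxio.underfs.s3.endpoint=http://<rgw-host>:7480 \
    --option aws.accessKeyId=<access_key> \
    --option aws.secretKey=<secret_key>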
22. Round 2: Data Preprocessing to Training Speed-Up
● Improve under storage stability
● Speed up the whole data preprocessing to training pipeline
● Launch more Alluxio clusters to meet burst ETL/training requirements
23. Round 2: Data Preprocessing to Training Speed-Up
[Diagram: data preprocessing and model training both access shared data through Alluxio's POSIX interface]
25. Data Preprocessing
[Diagram: a big data ETL cluster and training clusters reading data through Alluxio; read buffering, transparent to the app; policies for pinning, promotion/demotion, and TTL; backed by the under storage system]
26. Data Preprocessing
[Diagram: the big data ETL cluster writes preprocessed data into Alluxio with write buffering; policies for pinning, promotion/demotion, and TTL; data is persisted to the under storage and served to the training clusters]
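The buffering behavior sketched above maps to Alluxio's write types; ASYNC_THROUGH writes to Alluxio first and persists to the under storage asynchronously, which fits this preprocessing-to-training handoff. A sketch of the client-side default (the values listed are the standard options):
# conf/alluxio-site.properties
# Options: MUST_CACHE (Alluxio only), CACHE_THROUGH (cache + synchronous persist),
#          THROUGH (persist only), ASYNC_THROUGH (cache + asynchronous persist)
alluxio.user.file.writetype.default=ASYNC_THROUGH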
28. Alluxio @ Momo
Momo runs multiple Alluxio clusters comprising thousands of Alluxio nodes and storing more than 100 TB of data. Alluxio serves Momo's search and training tasks, and Momo continues to develop new use cases for Alluxio.
● Alluxio supports multiple under storages and multiple compute/training frameworks
● Accelerates compute/training tasks
● Reduces the metadata and data overhead on the under storage
29. Alluxio @ Momo
Training on billions of images
- 2 billion small files
- PyTorch + Alluxio + Ceph
- Reduce metadata and data interactions with Ceph to improve performance
30. Alluxio @ Momo
Speed up recommendation system model loading (see the sketch after this list)
● Upload the recommendation system model to HDFS
● Distributed-load the model from HDFS into Alluxio
● The recommendation system loads the model from Alluxio concurrently
Speed up loading indexes for the ANN system
● Create indexes
● Upload indexes to HDFS (or an object store)
● Nodes load indexes from Alluxio
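A hedged sketch of that model-loading flow from the CLI, assuming HDFS is already mounted into Alluxio at /hdfs and a FUSE mount exists at /mnt/alluxio (both paths are hypothetical):
$ bin/alluxio fs distributedLoad /hdfs/models/rec-model   # warm the model onto Alluxio workers
$ ls /mnt/alluxio/hdfs/models/rec-model                   # serving nodes then read it concurrently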
31. Alluxio may help you if
● You run distributed training
● You have a large amount of data (>= TB) or a large number of small files/images
● Network I/O cannot keep up with your GPUs
● You have multiple data sources and multiple training/compute frameworks
● You need to keep the under storage stable and avoid exceeding request rate limits
● You share data between multiple training tasks
32. Community-Driven Project
● Community-driven cooperation. Special thanks to excellent engineers from Microsoft, Shopee, Tencent, Ant Financial, Alibaba, Bilibili, and Nanjing University.
● In production at Microsoft, Shopee, Bilibili, Momo, Boss Zhipin, and others.