Kaarthik Sivashanmugam, Wee Hyong Tok
Microsoft
Infrastructure for Deep Learning in Apache Spark
#UnifiedAnalytics #SparkAISummit
Agenda
• Evolution of data infrastructure
• ML workflow: Data prep & DNN training
• Intro to deep learning and computing needs
• Distributed deep learning and challenges
• Unified platform using Spark
– Infra considerations, challenges
• ML Pipelines
Organization’s Data
• Sources: Video, Feeds, Call Logs, Data, Web logs, Products, Images, …
• Store: Database / Data Warehouse
Machine Learning: Typical E2E Process
• Prepare, Experiment, Deploy, …
• Orchestrate
+ Machine Learning and Deep Learning workloads
How long does it take to train ResNet-50 on ImageNet?
• Before 2017: 14 days (NVIDIA M40 GPU)
Training ResNet-50 on ImageNet
• Apr 2017: 1 hour (Tesla P100 x 256; Facebook, Caffe2)
• Sept 2017: 31 mins (1,600 CPUs; UC Berkeley, TACC, UC Davis; TensorFlow)
• Nov 2017: 15 mins (Tesla P100 x 1,024; Preferred Networks, ChainerMN)
• July 2018: 6.6 mins (Tesla P40 x 2,048; Tencent, TensorFlow)
• Nov 2018: 2.0 mins (Tesla V100 x 3,456; Sony, Neural Network Library (NNL))
• Apr 2019: 1.2 mins (Tesla V100 x 2,048; Fujitsu, MXNet)
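To put the training times on this slide in perspective, they can be expressed as speedups over the 14-day baseline. The following is a minimal sketch: the dictionary only restates the slide's figures, and the names (`BASELINE_MINUTES`, `training_minutes`, `speedup`) are illustrative, not from the talk.

```python
# Training times for ResNet-50 on ImageNet as listed on the slide, in minutes.
BASELINE_MINUTES = 14 * 24 * 60  # 14 days on an NVIDIA M40 GPU

training_minutes = {
    "Apr 2017 (Facebook, Caffe2)": 60.0,
    "Sept 2017 (UC Berkeley, TACC, UC Davis)": 31.0,
    "Nov 2017 (ChainerMN)": 15.0,
    "July 2018 (Tencent)": 6.6,
    "Nov 2018 (Sony)": 2.0,
    "Apr 2019 (Fujitsu)": 1.2,
}


def speedup(minutes: float) -> float:
    """Return how many times faster a run is than the 14-day baseline."""
    return BASELINE_MINUTES / minutes


if __name__ == "__main__":
    for label, minutes in training_minutes.items():
        print(f"{label}: {minutes} min, about {speedup(minutes):,.0f}x faster")
```

The comparison is rough, since each entry uses different hardware, batch sizes, and software.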
Considerations for Deep Learning @ Scale
• CPU vs. GPU
• Single vs. multi-GPU
• MPI vs. non-MPI
• Infiniband vs. Ethernet
Credits: Mathew Salvaris
https://azure.microsoft.com/en-us/blog/gpus-vs-cpus-for-deployment-of-deep-learning-models/
“Things” you need to deal with when training machine learning/deep learning models
• Gather results
• Secure access
• Scale resources
• Schedule jobs
• Dependencies and containers
• Provision VM clusters
• Distribute data
• Handling failures
Machine Learning and Deep Learning
Top figure source;
Bottom figure from NVIDIA
ML
DL
Lots of ML Frameworks …
• TensorFlow
• PyTorch
• Scikit-Learn
• MXNet
• Chainer
• Keras
Design Choices for Big Data and Machine Learning/Deep Learning
• Laptop
• Spark + separate infrastructure for ML/DL training/inference
• Cloud
• Spark
Execution Models for Spark and Deep Learning
Spark (Task 1, Task 2, Task 3)
• Independent tasks
• Embarrassingly parallel and massively scalable
• Re-run crashed task
Distributed Learning (Task 1, Task 2, Task 3; data parallelism, model parallelism)
• Non-independent tasks
• Some parallel processing
• Optimizing communication between nodes
• Re-run all tasks
Credits – Reynold Xin, Project Hydrogen – State of Art Deep Learning on Apache Spark
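The difference in failure handling between the two execution models ("re-run crashed task" versus "re-run all tasks") can be illustrated with a small simulation. This is a sketch in plain Python under simplifying assumptions: tasks are ordinary callables, a crash is a `RuntimeError`, and the function names (`run_independent`, `run_gang`, `make_flaky`) are hypothetical. It does not use Spark or any deep learning framework.

```python
from typing import Callable, List


def make_flaky(fail_times: int) -> Callable[[], int]:
    """Build a task that crashes `fail_times` times and then succeeds."""
    state = {"remaining": fail_times}

    def task() -> int:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("simulated crash")
        return 1

    return task


def run_independent(tasks: List[Callable[[], int]], max_attempts: int = 3) -> int:
    """Independent tasks: retry only the task that crashed.

    Returns the total number of task launches."""
    launches = 0
    for task in tasks:
        for _ in range(max_attempts):
            launches += 1
            try:
                task()
                break  # this task is done; the others are unaffected
            except RuntimeError:
                continue
        else:
            raise RuntimeError("task failed after all retries")
    return launches


def run_gang(tasks: List[Callable[[], int]], max_attempts: int = 3) -> int:
    """Non-independent tasks: any crash restarts every task.

    Returns the total number of task launches."""
    launches = 0
    for _ in range(max_attempts):
        launches += len(tasks)  # all workers start together
        try:
            for task in tasks:
                task()
            return launches
        except RuntimeError:
            continue  # one worker crashed, so the whole group restarts
    raise RuntimeError("job failed after all retries")


if __name__ == "__main__":
    print(run_independent([make_flaky(0), make_flaky(0), make_flaky(1)]))
    print(run_gang([make_flaky(0), make_flaky(0), make_flaky(1)]))
```

With three tasks and a single crash, the independent model launches one extra task while the gang model launches a full extra round, and the gap widens as the number of workers grows.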
Spark + ML/DL
• www.aka.ms/spark
• Sparkflow
• TensorFlowOnSpark
• Project Hydrogen
• HorovodRunner
Microsoft Machine Learning for Apache Spark v0.16
Microsoft’s Open Source Contributions to Apache Spark
www.aka.ms/spark
Azure/mmlspark
• Cognitive Services
• Spark Serving
• Model Interpretability
• LightGBM Gradient Boosting
• Deep Networks with CNTK
• HTTP on Spark
Demo - Azure Databricks and Deep Learning
Demo – Distributed Deep Learning using TensorFlow with HorovodRunner
Physics of Machine Learning and Deep Learning
What do you need for training / distributed training?
• CPU
• GPU
• Network
• Storage
• Deep Learning Framework
• Memory
GPU Device Interconnect
• NVLink
• GPUDirect P2P
• GPUDirect RDMA
• Standard network stack
Interconnect topology sample
Credits: CUDA-MPI Blog (https://bit.ly/2KnmN58)
From CUDA to NCCL1 to NCCL2
• Multi-core CPU
• CUDA: GPU
• NCCL 1: Multi-GPU
• NCCL 2: Multi-GPU, Multi-Node
Multi-GPU Communication Library
Credits: NCCL Tutorial (https://bit.ly/2KpPP44)
NCCL 2.x (multi-node)
Credits: NCCL Tutorial (https://bit.ly/2KpPP44)
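NCCL provides collective operations such as all-reduce, which distributed training uses to sum gradients across GPUs. Ring all-reduce is one well-known way to implement that collective. The following sketch simulates it in plain Python: each "worker" is a list, and a "send" is a list copy. It does not call NCCL and is not NCCL's implementation. The function name `ring_allreduce` is illustrative, and the buffer length is assumed to be divisible by the number of workers.

```python
from typing import List


def ring_allreduce(buffers: List[List[float]]) -> List[List[float]]:
    """Simulate a ring all-reduce (sum) over equally sized worker buffers.

    Every worker ends up holding the element-wise sum of all buffers."""
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all buffers must have the same length")
    if length % n != 0:
        raise ValueError("buffer length must be divisible by the worker count")
    chunk = length // n
    data = [list(b) for b in buffers]  # work on copies

    def span(c: int) -> slice:
        """Slice covering chunk number c."""
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: after n - 1 steps, worker i holds the full
    # sum for chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot what every worker sends so all sends happen "at once".
        outgoing = [data[i][span((i - step) % n)] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            s = span((i - step) % n)
            data[dst][s] = [x + y for x, y in zip(data[dst][s], outgoing[i])]

    # Phase 2, all-gather: pass the completed chunks around the ring.
    for step in range(n - 1):
        outgoing = [data[i][span((i + 1 - step) % n)] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            data[dst][span((i + 1 - step) % n)] = outgoing[i]

    return data


if __name__ == "__main__":
    print(ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
```

In each step a worker sends only one chunk to its neighbor, so the data each worker transfers stays roughly constant as workers are added. That property is one reason interconnect bandwidth and topology matter for multi-GPU training.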
Spark & GPU
• Options for using GPUs with Spark:
1. Native support (cluster manager, GPU tasks): SPARK-24615
2. Use cores/memory as proxy for GPU resources and allow GPU-enabled code execution
3. Code implementation/generation for GPU offload
• Considerations
– Flexibility
– Data management
– Multi-GPU execution
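For the native-support option, the work tracked in SPARK-24615 became available in Apache Spark 3.0 as resource configuration properties, which postdates this talk. The sketch below builds such a configuration as a plain dictionary and derives the GPU-imposed limit on concurrent tasks per executor. The two property names are from Spark 3.0 and later. The helper functions `gpu_scheduling_conf` and `concurrent_gpu_tasks` are hypothetical and exist only for illustration.

```python
from typing import Dict


def gpu_scheduling_conf(gpus_per_executor: int, gpus_per_task: int) -> Dict[str, str]:
    """Build Spark 3.0+ properties that request GPUs for executors and tasks."""
    if gpus_per_executor < 1 or gpus_per_task < 1:
        raise ValueError("GPU amounts must be positive integers")
    if gpus_per_task > gpus_per_executor:
        raise ValueError("a task cannot need more GPUs than its executor has")
    return {
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(gpus_per_task),
    }


def concurrent_gpu_tasks(conf: Dict[str, str]) -> int:
    """Upper bound on tasks per executor imposed by GPUs alone.

    CPU cores can lower the real limit; this ignores them."""
    executor_gpus = int(conf["spark.executor.resource.gpu.amount"])
    task_gpus = int(conf["spark.task.resource.gpu.amount"])
    return executor_gpus // task_gpus


if __name__ == "__main__":
    conf = gpu_scheduling_conf(gpus_per_executor=4, gpus_per_task=1)
    print(conf)
    print(concurrent_gpu_tasks(conf))
```

The dictionary entries could be passed as `--conf key=value` pairs to `spark-submit`. A real deployment also needs the cluster manager to expose GPUs to executors, which this sketch does not cover.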
Infrastructure Considerations
• Data format, storage and reuse
– Co-locate Data Engineering storage infrastructure (cluster-local)
– DL Framework support for HDFS (reading from HDFS does not mean data-locality-aware computation)
– Sharing data between Spark and Deep Learning (HDFS, Spark-TF connector, Parquet/Petastorm)
• Job execution
– Gang scheduling – Refer to SPARK-24374
– Support for GPU (and other accelerators) – Refer to SPARK-24615
– Cluster sharing with other types of jobs (CPU-only cluster vs. CPU+GPU cluster)
– Quota management
– Support for Docker containers
– MPI vs. non-MPI
– Different GPU generations
• Node, GPU connectivity
– Infiniband, RDMA
– GPU Interconnect options
– Interconnect-aware scheduling, minimize distribution, repacking
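Gang scheduling, listed under job execution above, means a distributed training job starts only when all of its tasks can be placed at once. The following toy sketch shows that all-or-nothing rule and packs tasks onto as few nodes as possible, in the spirit of "minimize distribution". It is not Spark's scheduler. The function `gang_schedule` and the node names are hypothetical.

```python
from typing import Dict, List, Optional


def gang_schedule(free_slots: Dict[str, int], tasks_needed: int) -> Optional[List[str]]:
    """All-or-nothing placement.

    Returns one node name per task, or None if the whole job cannot start."""
    if sum(free_slots.values()) < tasks_needed:
        return None  # never start a partial job; wait for more capacity

    placement: List[str] = []
    # Fill the nodes with the most free slots first so the job spans few nodes.
    for node, slots in sorted(free_slots.items(), key=lambda kv: (-kv[1], kv[0])):
        take = min(slots, tasks_needed - len(placement))
        placement.extend([node] * take)
        if len(placement) == tasks_needed:
            break
    return placement


if __name__ == "__main__":
    print(gang_schedule({"node-a": 2, "node-b": 4}, tasks_needed=5))
    print(gang_schedule({"node-a": 1, "node-b": 1}, tasks_needed=3))
```

Starting only part of a job would leave the launched workers idle while they hold GPUs, waiting for peers that may never arrive.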
ML Pipelines
• Using machine learning pipelines, data scientists, data engineers, and IT professionals can collaborate on different steps/phases
• Enables use of the best technology for each phase of the ML/DL workflow
Demo – Azure ML Pipelines & Databricks
DON’T FORGET TO RATE
AND REVIEW THE SESSIONS
SEARCH SPARK + AI SUMMIT

More Related Content

What's hot

Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiApache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiDatabricks
 
Writing Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark APIWriting Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark APIDatabricks
 
Data Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet EncryptionData Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet EncryptionDatabricks
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Databricks
 
Downscaling: The Achilles heel of Autoscaling Apache Spark Clusters
Downscaling: The Achilles heel of Autoscaling Apache Spark ClustersDownscaling: The Achilles heel of Autoscaling Apache Spark Clusters
Downscaling: The Achilles heel of Autoscaling Apache Spark ClustersDatabricks
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on KubernetesDatabricks
 
Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0Spark Summit
 
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Databricks
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Databricks
 
Lambda architecture: from zero to One
Lambda architecture: from zero to OneLambda architecture: from zero to One
Lambda architecture: from zero to OneSerg Masyutin
 
Accelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeAccelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeDatabricks
 
Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1Databricks
 
Parallelizing with Apache Spark in Unexpected Ways
Parallelizing with Apache Spark in Unexpected WaysParallelizing with Apache Spark in Unexpected Ways
Parallelizing with Apache Spark in Unexpected WaysDatabricks
 
Improving Apache Spark for Dynamic Allocation and Spot Instances
Improving Apache Spark for Dynamic Allocation and Spot InstancesImproving Apache Spark for Dynamic Allocation and Spot Instances
Improving Apache Spark for Dynamic Allocation and Spot InstancesDatabricks
 
Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi...
 Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi... Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi...
Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi...Databricks
 
Using Production Profiles to Guide Optimizations
Using Production Profiles to Guide OptimizationsUsing Production Profiles to Guide Optimizations
Using Production Profiles to Guide OptimizationsDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Spark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit
 
End-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache SparkEnd-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache SparkDatabricks
 
Speed up UDFs with GPUs using the RAPIDS Accelerator
Speed up UDFs with GPUs using the RAPIDS AcceleratorSpeed up UDFs with GPUs using the RAPIDS Accelerator
Speed up UDFs with GPUs using the RAPIDS AcceleratorDatabricks
 

What's hot (20)

Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiApache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
 
Writing Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark APIWriting Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark API
 
Data Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet EncryptionData Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet Encryption
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
 
Downscaling: The Achilles heel of Autoscaling Apache Spark Clusters
Downscaling: The Achilles heel of Autoscaling Apache Spark ClustersDownscaling: The Achilles heel of Autoscaling Apache Spark Clusters
Downscaling: The Achilles heel of Autoscaling Apache Spark Clusters
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 
Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0
 
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
Lambda architecture: from zero to One
Lambda architecture: from zero to OneLambda architecture: from zero to One
Lambda architecture: from zero to One
 
Accelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeAccelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks Runtime
 
Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1
 
Parallelizing with Apache Spark in Unexpected Ways
Parallelizing with Apache Spark in Unexpected WaysParallelizing with Apache Spark in Unexpected Ways
Parallelizing with Apache Spark in Unexpected Ways
 
Improving Apache Spark for Dynamic Allocation and Spot Instances
Improving Apache Spark for Dynamic Allocation and Spot InstancesImproving Apache Spark for Dynamic Allocation and Spot Instances
Improving Apache Spark for Dynamic Allocation and Spot Instances
 
Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi...
 Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi... Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi...
Extending Apache Spark APIs Without Going Near Spark Source or a Compiler wi...
 
Using Production Profiles to Guide Optimizations
Using Production Profiles to Guide OptimizationsUsing Production Profiles to Guide Optimizations
Using Production Profiles to Guide Optimizations
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Spark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos Erotocritou
 
End-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache SparkEnd-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache Spark
 
Speed up UDFs with GPUs using the RAPIDS Accelerator
Speed up UDFs with GPUs using the RAPIDS AcceleratorSpeed up UDFs with GPUs using the RAPIDS Accelerator
Speed up UDFs with GPUs using the RAPIDS Accelerator
 

Similar to Infrastructure for Deep Learning in Apache Spark

Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Wee Hyong Tok
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesJen Aman
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDatabricks
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesJen Aman
 
Tuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache SparkTuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache SparkDatabricks
 
Deep learning and Apache Spark
Deep learning and Apache SparkDeep learning and Apache Spark
Deep learning and Apache SparkQuantUniversity
 
CaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterCaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterJen Aman
 
High Performance Deep learning with Apache Spark
High Performance Deep learning with Apache SparkHigh Performance Deep learning with Apache Spark
High Performance Deep learning with Apache SparkRui Liu
 
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit
 
Integrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache SparkIntegrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache SparkDatabricks
 
Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014mahchiev
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache SparkDona Mary Philip
 
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...Databricks
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayNick Pentreath
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSpark Summit
 
夏俊鸾:Spark——基于内存的下一代大数据分析框架
夏俊鸾:Spark——基于内存的下一代大数据分析框架夏俊鸾:Spark——基于内存的下一代大数据分析框架
夏俊鸾:Spark——基于内存的下一代大数据分析框架hdhappy001
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Databricks
 
Jumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksJumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksDatabricks
 
Lessons learned from running Spark on Docker
Lessons learned from running Spark on DockerLessons learned from running Spark on Docker
Lessons learned from running Spark on DockerDataWorks Summit
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Djamel Zouaoui
 

Similar to Infrastructure for Deep Learning in Apache Spark (20)

Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best Practices
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best Practices
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best Practices
 
Tuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache SparkTuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache Spark
 
Deep learning and Apache Spark
Deep learning and Apache SparkDeep learning and Apache Spark
Deep learning and Apache Spark
 
CaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterCaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark Cluster
 
High Performance Deep learning with Apache Spark
High Performance Deep learning with Apache SparkHigh Performance Deep learning with Apache Spark
High Performance Deep learning with Apache Spark
 
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
 
Integrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache SparkIntegrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache Spark
 
Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
 
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI Day
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
 
夏俊鸾:Spark——基于内存的下一代大数据分析框架
夏俊鸾:Spark——基于内存的下一代大数据分析框架夏俊鸾:Spark——基于内存的下一代大数据分析框架
夏俊鸾:Spark——基于内存的下一代大数据分析框架
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
 
Jumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksJumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on Databricks
 
Lessons learned from running Spark on Docker
Lessons learned from running Spark on DockerLessons learned from running Spark on Docker
Lessons learned from running Spark on Docker
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2



Infrastructure for Deep Learning in Apache Spark

  • 1. WIFI SSID:SparkAISummit | Password: UnifiedAnalytics
  • 2. Infrastructure for Deep Learning in Apache Spark. Kaarthik Sivashanmugam, Wee Hyong Tok (Microsoft). #UnifiedAnalytics #SparkAISummit
  • 3. Agenda: Evolution of data infrastructure; ML workflow: data prep & DNN training; Intro to deep learning and computing needs; Distributed deep learning and challenges; Unified platform using Spark (infra considerations, challenges); ML Pipelines
  • 4. Organization's data: video feeds, call logs data, web logs, products, images, ……; database / data warehouse
  • 5. Machine Learning, typical E2E process: … Prepare, Experiment, Deploy; Orchestrate
  • 6. + Machine Learning and Deep Learning workloads
  • 7. How long does it take to train ResNet-50 on ImageNet? Before 2017: 14 days on an NVIDIA M40 GPU
  • 8. Training ResNet-50 on ImageNet. 2017: Apr, 1 hour, Tesla P100 x 256 (Facebook, Caffe2); Sept, 31 mins, 1,600 CPUs (UC Berkeley, TACC, UC Davis; TensorFlow); Nov, 15 mins, Tesla P100 x 1,024 (Preferred Networks, ChainerMN). 2018: July, 6.6 mins, Tesla P40 x 2,048 (Tencent, TensorFlow); Nov, 2.0 mins, Tesla V100 x 3,456 (Sony, Neural Network Library (NNL)). 2019: Apr, 1.2 mins, Tesla V100 x 2,048 (Fujitsu, MXNet)
  • 9. Considerations for Deep Learning @ Scale: CPU vs. GPU; single vs. multi-GPU; MPI vs. non-MPI; InfiniBand vs. Ethernet. Credits: Mathew Salvaris, https://azure.microsoft.com/en-us/blog/gpus-vs-cpus-for-deployment-of-deep-learning-models/
  • 10. “Things” you need to deal with when training machine learning/deep learning models: gather results; secure access; scale resources; schedule jobs; dependencies and containers; provision VM clusters; distribute data; handle failures
  • 11. Machine Learning, typical E2E process: … Prepare, Experiment, Deploy; Orchestrate
  • 12. Machine Learning and Deep Learning. Top figure source; Bottom figure from NVIDIA ML DL
  • 13. Lots of ML Frameworks …: TensorFlow, PyTorch, Scikit-Learn, MXNet, Chainer, Keras
  • 14. Design Choices for Big Data and Machine Learning/Deep Learning: Laptop; Spark + separate infrastructure for ML/DL training/inference; Cloud; Spark
  • 15. Execution Models for Spark and Deep Learning. Spark (Task 1, Task 2, Task 3): independent tasks; embarrassingly parallel and massively scalable. Distributed learning (data parallelism, model parallelism): non-independent tasks; some parallel processing; optimizing communication between nodes. Credits: Reynold Xin, "Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark"
  • 16. Execution Models for Spark and Deep Learning. Spark (Task 1, Task 2, Task 3): independent tasks; embarrassingly parallel and massively scalable. Distributed learning (Task 1, Task 2, Task 3): non-independent tasks; some parallel processing; optimizing communication between nodes. Credits: Reynold Xin, "Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark"
  • 17. Execution Models for Spark and Deep Learning. Spark (Task 1, Task 2, Task 3): independent tasks; embarrassingly parallel and massively scalable; on failure, re-run the crashed task. Distributed learning (Task 1, Task 2, Task 3): non-independent tasks; some parallel processing; optimizing communication between nodes; on failure, re-run all tasks. Credits: Reynold Xin, "Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark"
  • 18. Spark + ML/DL: Sparkflow, TensorFlowOnSpark, Project Hydrogen, HorovodRunner. www.aka.ms/spark
  • 19. Microsoft Machine Learning for Apache Spark v0.16 (Azure/mmlspark), Microsoft’s open source contributions to Apache Spark: Cognitive Services; Spark Serving; Model Interpretability; LightGBM Gradient Boosting; Deep Networks with CNTK; HTTP on Spark. www.aka.ms/spark
  • 20. Demo – Azure Databricks and Deep Learning
  • 21. Demo – Distributed Deep Learning using TensorFlow with HorovodRunner
  • 22. What do you need for training / distributed training? The "physics" of machine learning and deep learning: CPU, GPU, network, storage, memory, deep learning framework
  • 23. GPU Device Interconnect: NVLink; GPUDirect P2P; GPUDirect RDMA; standard network stack. Interconnect topology sample. Credits: CUDA-MPI Blog (https://bit.ly/2KnmN58)
  • 24. From CUDA to NCCL 1 to NCCL 2 (multi-GPU communication library): multi-core CPU; GPU (CUDA); multi-GPU (NCCL 1); multi-GPU, multi-node (NCCL 2). Credits: NCCL Tutorial (https://bit.ly/2KpPP44)
  • 25. NCCL 2.x (multi-node) Credits: NCCL Tutorial (https://bit.ly/2KpPP44)
  • 26. NCCL 2.x (multi-node) Credits: NCCL Tutorial (https://bit.ly/2KpPP44)
  • 27. Spark & GPU. Options for using GPUs with Spark: (1) native support (cluster manager, GPU tasks): SPARK-24615; (2) use cores/memory as a proxy for GPU resources and allow GPU-enabled code execution; (3) code implementation/generation for GPU offload. Considerations: flexibility; data management; multi-GPU execution
  • 28. Infrastructure Considerations. Data format, storage and reuse: co-locate Data Engineering storage infrastructure (cluster-local); DL framework support for HDFS (reading from HDFS does not mean data-locality-aware computation); sharing data between Spark and Deep Learning (HDFS, Spark-TF connector, Parquet/Petastorm). Job execution: gang scheduling (refer to SPARK-24374); support for GPUs and other accelerators (refer to SPARK-24615); cluster sharing with other types of jobs (CPU-only cluster vs. CPU+GPU cluster); quota management; support for Docker containers; MPI vs. non-MPI; different GPU generations. Node and GPU connectivity: InfiniBand, RDMA; GPU interconnect options; interconnect-aware scheduling, minimize distribution, repacking
  • 29. ML Pipelines: using machine learning pipelines, data scientists, data engineers, and IT professionals can collaborate on different steps/phases; pipelines enable use of the best tech for different phases in the ML/DL workflow
  • 30. Demo – Azure ML Pipelines & Databricks
  • 31. What do you need for training / distributed training? The "physics" of machine learning and deep learning: CPU, GPU, network, storage, memory, deep learning framework
  • 32. Infrastructure for Deep Learning in Apache Spark. Kaarthik Sivashanmugam, Wee Hyong Tok (Microsoft)
  • 33. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT
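
A small worked example for slides 7 and 8: the sketch below turns the quoted ResNet-50 training times into speed-up factors relative to the 14-day single-GPU baseline. The times and device counts are transcribed from the slides; the function and variable names are made up for illustration. Because the runs use different hardware, frameworks and batch sizes, the factors are only a rough wall-clock comparison, not a measure of scaling efficiency.

```python
# Wall-clock speed-up of the ResNet-50/ImageNet results on slide 8,
# relative to the "before 2017" baseline on slide 7.

BASELINE_MINUTES = 14 * 24 * 60  # 14 days on one NVIDIA M40 GPU

# (label, training time in minutes, number of devices), as read off the slide.
RESULTS = [
    ("Facebook, Caffe2 (Apr 2017)", 60.0, 256),
    ("UC Berkeley / TACC / UC Davis (Sept 2017)", 31.0, 1600),
    ("Preferred Networks, ChainerMN (Nov 2017)", 15.0, 1024),
    ("Tencent, TensorFlow (July 2018)", 6.6, 2048),
    ("Sony, NNL (Nov 2018)", 2.0, 3456),
    ("Fujitsu, MXNet (Apr 2019)", 1.2, 2048),
]


def speedup(minutes, baseline=BASELINE_MINUTES):
    """Return how many times faster a run is than the baseline."""
    return baseline / minutes


def report():
    """Return (label, speed-up rounded to one decimal, devices) per result."""
    return [(label, round(speedup(m), 1), n) for label, m, n in RESULTS]


if __name__ == "__main__":
    for label, factor, devices in report():
        print(f"{label}: {factor}x faster on {devices} devices")
```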
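
The failure-handling contrast on slide 17 ("re-run crashed task" vs. "re-run all tasks") can be illustrated with a toy simulation. This is a plain-Python sketch, not Spark or Horovod code: the helper names are hypothetical, tasks run sequentially rather than on a cluster, and each flaky task is assumed to fail exactly once.

```python
def make_flaky_task(fail_once):
    """Build a task that raises the first time it runs for ids in fail_once.

    Returns the task function and a list that records every invocation.
    """
    already_failed = set()
    calls = []

    def task(i):
        calls.append(i)
        if i in fail_once and i not in already_failed:
            already_failed.add(i)
            raise RuntimeError(f"task {i} crashed")
        return i * i  # stand-in for real work

    return task, calls


def run_independent(task, n):
    """Spark-style execution: tasks are independent, so only a crashed
    task is retried (one retry is enough under the fail-once assumption)."""
    results = []
    for i in range(n):
        try:
            results.append(task(i))
        except RuntimeError:
            results.append(task(i))
    return results


def run_all_or_nothing(task, n):
    """Distributed-training-style execution: tasks depend on each other,
    so a single failure discards the attempt and all tasks are re-run."""
    while True:
        results, failed = [], False
        for i in range(n):
            try:
                results.append(task(i))
            except RuntimeError:
                failed = True
        if not failed:
            return results


if __name__ == "__main__":
    task, calls = make_flaky_task({1})
    run_independent(task, 3)
    print("independent tasks, invocations:", len(calls))

    task, calls = make_flaky_task({1})
    run_all_or_nothing(task, 3)
    print("all-or-nothing, invocations:", len(calls))
```

With three tasks and one crash, the independent model does one extra invocation while the all-or-nothing model repeats the whole group, which is why gang scheduling (SPARK-24374 on slide 28) matters for distributed training jobs.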
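
Option 2 on slide 27 (using cores as a proxy for GPU resources) can be sketched as a small configuration calculation. The idea assumed here is that Spark runs at most `spark.executor.cores // spark.task.cpus` tasks concurrently per executor, so choosing `spark.task.cpus` large enough caps concurrent tasks at the number of GPUs on the node. The helper names are hypothetical, and the sketch does not cover how each task picks its GPU.

```python
def gpu_proxy_conf(executor_cores, gpus_per_executor):
    """Pick spark.task.cpus so concurrent tasks never exceed the GPU count.

    Rounds cores-per-task up, so the number of task slots is at most
    gpus_per_executor even when the cores do not divide evenly.
    """
    if gpus_per_executor < 1:
        raise ValueError("need at least one GPU per executor")
    if executor_cores < gpus_per_executor:
        raise ValueError("fewer cores than GPUs: cores cannot act as a proxy")
    task_cpus = -(-executor_cores // gpus_per_executor)  # ceiling division
    return {
        "spark.executor.cores": str(executor_cores),
        "spark.task.cpus": str(task_cpus),
    }


def concurrent_tasks(conf):
    """Task slots per executor implied by the two settings."""
    return int(conf["spark.executor.cores"]) // int(conf["spark.task.cpus"])


if __name__ == "__main__":
    conf = gpu_proxy_conf(executor_cores=16, gpus_per_executor=4)
    print(conf, "->", concurrent_tasks(conf), "concurrent tasks")
```

The resulting dictionary could be passed to a Spark session builder or `spark-submit --conf`; native GPU scheduling (option 1, SPARK-24615) removes the need for this kind of workaround.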