Deep learning models can be distributed across a cluster to reduce training time and handle large datasets. Deeplearning4j is an open-source deep learning library for Java that runs on Spark, allowing models to be trained in a distributed fashion across a Spark cluster. Training a model this way means distributing stochastic gradient descent (SGD) across nodes, with the key challenge being efficient all-reduce communication between them. Engineering high-performance distributed training, for example with parameter servers, is important to reduce communication bottlenecks.
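The parameter-averaging scheme Deeplearning4j uses on Spark can be sketched in plain Java: each worker fits a copy of the model on its data shard, and the driver averages the resulting parameter vectors before the next round. This is a minimal illustration of the averaging step only; the class and method names are hypothetical, not part of the DL4J API.

```java
import java.util.Arrays;

public class ParameterAveraging {
    // Average the parameter vectors produced by each worker after a local
    // SGD round (illustrative of what a ParameterAveragingTrainingMaster does).
    static double[] average(double[][] workerParams) {
        int dim = workerParams[0].length;
        double[] avg = new double[dim];
        for (double[] params : workerParams) {
            for (int i = 0; i < dim; i++) avg[i] += params[i];
        }
        for (int i = 0; i < dim; i++) avg[i] /= workerParams.length;
        return avg;
    }

    public static void main(String[] args) {
        // Two workers, each with a 2-dimensional parameter vector
        double[][] workers = { {1.0, 2.0}, {3.0, 4.0} };
        System.out.println(Arrays.toString(average(workers))); // prints [2.0, 3.0]
    }
}
```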
1. Deep learning on a mixed cluster with Deeplearning4j and Spark
Barcelona Spark meetup, Dec 9, 2016
(right after NIPS)
francois@garillot.net @huitseeker
4. The bad thing about doing a talk right after NIPS
You guys are scary.
5. The good thing about doing a talk right after NIPS
You guys don't need to be told SkyNet is a fantasy (for now).
6. Paying algorithms
Anomaly detection in many forms (bad guys / predictive maintenance / market rally)
Fraud detection
Network intrusion
Fintech securities churn prediction
Video object detection (security)
7. Models that are being neglected in benchmarks and implementation efforts
LSTMs
Autoencoders
8. How to deal with this in the Spark world?
Experiment with trained-model application: TensorFrames
Which deep learning frameworks let you train on Spark?
17. Cluster training in the enterprise
It's really about multi-tenancy and economies of scale:
a big pool of machines shared among everybody amortizes better,
if only because you can reuse it for other workloads.
Minor reason: enterprises may not have GPUs.
20. Cluster training in your (experimenter) case?
It's a fun problem: AllReduce.
Ultimately solved for people with a large amount of images,
but that solution is not open-source (it lives inside Facebook, Google, Amazon, Microsoft¹, Baidu).
¹: 1-bit SGD is under a non-commercial license in CNTK 2.0
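For intuition, ring all-reduce (the collective at the heart of those closed-source solutions) can be simulated in plain Java: each of N workers only ever exchanges 1/N-sized chunks with its ring neighbour, yet every worker ends up holding the full elementwise sum of all N gradient vectors. This is a toy single-process sketch under that assumption; the names are illustrative, not any library's API.

```java
import java.util.Arrays;

public class RingAllReduce {
    // Simulate ring all-reduce over n workers; on return every row of grads
    // equals the elementwise sum of the original rows.
    static void allReduce(double[][] grads) {
        int n = grads.length;
        int chunk = grads[0].length / n;   // assumes vector length divisible by n
        // Phase 1: scatter-reduce. After n-1 steps, worker w holds the fully
        // reduced chunk (w+1) mod n, having moved only one chunk per step.
        for (int s = 0; s < n - 1; s++) {
            double[][] snap = snapshot(grads);            // simultaneous exchange
            for (int w = 0; w < n; w++) {
                int c = ((w - 1 - s) % n + n) % n;        // chunk received this step
                int left = (w - 1 + n) % n;               // ring neighbour that sent it
                for (int j = c * chunk; j < (c + 1) * chunk; j++)
                    grads[w][j] += snap[left][j];
            }
        }
        // Phase 2: all-gather. The fully reduced chunks circulate around the
        // ring until every worker has every chunk.
        for (int s = 0; s < n - 1; s++) {
            double[][] snap = snapshot(grads);
            for (int w = 0; w < n; w++) {
                int c = ((w - s) % n + n) % n;
                int left = (w - 1 + n) % n;
                for (int j = c * chunk; j < (c + 1) * chunk; j++)
                    grads[w][j] = snap[left][j];          // overwrite with reduced chunk
            }
        }
    }

    static double[][] snapshot(double[][] a) {
        double[][] copy = new double[a.length][];
        for (int i = 0; i < a.length; i++) copy[i] = a[i].clone();
        return copy;
    }

    public static void main(String[] args) {
        double[][] grads = { {1, 1, 0, 0}, {0, 2, 1, 0} };
        allReduce(grads);
        System.out.println(Arrays.toString(grads[0])); // prints [1.0, 3.0, 1.0, 0.0]
    }
}
```

The bandwidth win is that each worker sends 2(N-1)/N times the vector size in total, independent of N, instead of the full vector to every peer.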
27. Scientific computing on the JVM
libnd4j: vectorization, 32-bit addressing, linalg (BLAS!)
JavaCPP: generates JNI bindings to your C++ libs
ND4J: numpy for the JVM, native superfast arrays
DataVec: one-stop interface to an NDArray
DeepLearning4J: orchestration, backprop, layer definition
ScalNet: gateway drug, inspired by (and closely following) Keras
RL4J: reinforcement learning for the JVM
28. With Spark
JavaSparkContext sc = ...;
JavaRDD<DataSet> trainingData = ...;
MultiLayerConfiguration networkConfig = ...;
//Create the TrainingMaster instance
int examplesPerDataSetObject = 1;
TrainingMaster trainingMaster =
    new ParameterAveragingTrainingMaster.Builder(examplesPerDataSetObject)
        //(other configuration options)
        .build();
//Create the SparkDl4jMultiLayer instance
SparkDl4jMultiLayer sparkNetwork =
    new SparkDl4jMultiLayer(sc, networkConfig, trainingMaster);
//Fit the network using the training data:
sparkNetwork.fit(trainingData);
30. Even if you don't care about deep learning
(from Kazuaki Ishizaki @ IBM Japan)
SPARK-6442: better linear algebra than Breeze
ND4J will have sparse representations soon
31. Even if you don't care about deep learning II
Meta-RDDs
32. Killing the bottlenecks
Spark has already changed its networking backend once.
Better support is needed for parameter servers and their fault tolerance.
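At its core, a parameter server reduces to a push/pull pattern: workers push gradients, the server applies them to a shared parameter table, and workers pull fresh values before the next step. A hypothetical minimal sketch in plain Java, assuming server-side SGD updates; this is not DL4J's or Spark's implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TinyParameterServer {
    // Shared parameter table; ConcurrentHashMap gives atomic per-key updates
    // so concurrent workers can push without a global lock.
    private final Map<String, Double> params = new ConcurrentHashMap<>();
    private final double learningRate;

    TinyParameterServer(double learningRate) { this.learningRate = learningRate; }

    // A worker pushes a gradient for one parameter; the SGD step is applied
    // server-side: param <- param - lr * gradient.
    void push(String key, double gradient) {
        params.merge(key, -learningRate * gradient, Double::sum);
    }

    // A worker pulls the current value before computing its next gradient.
    double pull(String key) { return params.getOrDefault(key, 0.0); }

    public static void main(String[] args) {
        TinyParameterServer ps = new TinyParameterServer(0.1);
        ps.push("w0", 2.0);   // worker A's gradient
        ps.push("w0", -1.0);  // worker B's gradient
        System.out.println(ps.pull("w0")); // net of both SGD steps, about -0.1
    }
}
```

The fault-tolerance question the slide raises is exactly what this sketch omits: replicating the parameter table and recovering in-flight pushes when a server shard dies.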
33. A Last Word (from Andrew Y. Ng)
Get involved!
Don't just read papers: reproduce research results.
Also: we're happy to mentor contributions, and there's a book!