SlideShare ist ein Scribd-Unternehmen logo
1 von 34
Downloaden Sie, um offline zu lesen
Deep learning on a mixed
cluster with Deeplearning4j
and Spark
Barcelona Spark meetup, Dec 9, 2016
(right after NIPS)
francois@garillot.net @huitseeker
Agenda
Intro
Why Deep Learning on a
Cluster
Big Data Architecture
Deeplearning4j
Spark challenges
Introduction : Deep Learning
in the trenches today
The bad thing about doing a
talk right after NIPS
you guys are scary.
The good thing about doing a
talk right after NIPS
You guys don't need to be told SkyNet is a fantasy (for now).
Paying algorithms
Anomaly detection in many forms (bad guys / predictive
maintenance / market rally)
Fraud detection
Network intrusion
Fintech secutiries churn prediction
Video object detection (security)
Models that are being
neglected in benchmarks and
implementation efforts
LSTMs
Autoencoders
How to deal with this in the
Spark world ?
experiment with trained model application: Tensorframes,
what are the deep learning frameworks that let you train
?
Why Deep Learning on a
cluster ?
Practically ... let's look at benchmarks
Practically ... let's look at benchmarks
Practically ... let's look at benchmarks
Practically ... let's look at benchmarks
Training, but how ?
New Amazon GPU instances
Training, but how ?
Training, but how ?
Cluster training in the
enterprise
it's really about multi-tenancy & economies of scale
a big bunch of machines shared among everybody shares
better
if only because you can reuse it for other workloads
Minor reasons
enterprises may not have
GPUs
Distributing training
basically distributing SGD (R)
challenge is AllReduce Communication
Sparse updates, async
communications
Distributing training : good
engineering matters
Cluster training in your
(experimentor) case ?
it's a fun problem : AllReduce
Ultimately solved for people with a large amount of images
that solution is not open-source (but at Facebook, Google,
Amazon, Microsoft¹, Baidu)
¹: 1-bit SGD is under non-commercial license in CNTK 2.0
Big Data architecture
With a parameter server
With Spark
Spark does the initial ETL
Spark ingests the nal result
In the middle : parameter
server.
Spark cluster modes
Mesos GPU support merged
devices cgroups !
YARN GPU support through
tags
Spark Standalone : ?
Deeplearning4j
Deeplearning4j
the rst commercial-grade, open-source, distributed deep-
learning library written for Java and Scala
Skymind its commercial support arm
Scienti c computing on the JVM
libnd4j : Vectorization, 32-bit addressing, linalg (BLAS!)
JavaCPP: generates JNI bindings to your CPP libs
ND4J : numpy for the JVM, native superfast arrays
Datavec : one-stop interface to an NDArray
DeepLearning4J: orchestration, backprop, layer de nition
ScalNet: gateway drug, inspired from (and closely following)
Keras
RL4J : Reinforcement learning for the JVM
With Spark
JavaSparkContent sc = ...;
JavaRDD<DataSet> trainingData = ...;
MultiLayerConfiguration networkConfig = ...;
//Create the TrainingMaster instance
int examplesPerDataSetObject = 1;
TrainingMaster trainingMaster =
new ParameterAveragingTrainingMaster.Builder(examplesPerDataSetObjec
.(other configuration options)
.build();
//Create the SparkDl4jMultiLayer instance
SparkDl4jMultiLayer sparkNetwork =
new SparkDl4jMultiLayer(sc, networkConfig, trainingMaster);
//Fit the network using the training data:
sparkNetwork.fit(trainingData);
Spark Challenges
Even if you don't care about Deep
learning
(from Kazuaki Ishizaki @ IBM Japan)
SPARK-6442 : better linear algebra than
breeze
ND4J will have sparse representations soon
Even if you don't care about Deep
learning II
Meta-
RDDs
Killing the bottlenecks
Spark has already changed its networking backend once.
better support for parameters servers and their fault
tolerance.
A Last Word (from Andrew Y. Ng)
get involved !
don't just read papers, reproduce research
results
Also
We're happy to mentor contributions, and there's a book !
Questions ?

Weitere ähnliche Inhalte

Was ist angesagt?

Boolan machine learning summit
Boolan machine learning summitBoolan machine learning summit
Boolan machine learning summitAdam Gibson
 
Hadoop summit 2016
Hadoop summit 2016Hadoop summit 2016
Hadoop summit 2016Adam Gibson
 
Big Data Analytics Tokyo
Big Data Analytics TokyoBig Data Analytics Tokyo
Big Data Analytics TokyoAdam Gibson
 
Advanced deeplearning4j features
Advanced deeplearning4j featuresAdvanced deeplearning4j features
Advanced deeplearning4j featuresAdam Gibson
 
Self driving computers active learning workflows with human interpretable ve...
Self driving computers  active learning workflows with human interpretable ve...Self driving computers  active learning workflows with human interpretable ve...
Self driving computers active learning workflows with human interpretable ve...Adam Gibson
 
Deploying signature verification with deep learning
Deploying signature verification with deep learningDeploying signature verification with deep learning
Deploying signature verification with deep learningAdam Gibson
 
Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformShivaji Dutta
 
A Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysA Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysTaylor Riggan
 
DeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoTDeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoTRomeo Kienzler
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016MLconf
 
Deep learning in production with the best
Deep learning in production   with the bestDeep learning in production   with the best
Deep learning in production with the bestAdam Gibson
 
Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production Paolo Platter
 
World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018Adam Gibson
 
Snorkel: Dark Data and Machine Learning with Christopher Ré
Snorkel: Dark Data and Machine Learning with Christopher RéSnorkel: Dark Data and Machine Learning with Christopher Ré
Snorkel: Dark Data and Machine Learning with Christopher RéJen Aman
 
Best Practices for On-Demand HPC in Enterprises
Best Practices for On-Demand HPC in EnterprisesBest Practices for On-Demand HPC in Enterprises
Best Practices for On-Demand HPC in Enterprisesgeetachauhan
 
Deep learning with Tensorflow in R
Deep learning with Tensorflow in RDeep learning with Tensorflow in R
Deep learning with Tensorflow in Rmikaelhuss
 
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ..."Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...Edge AI and Vision Alliance
 
Hadoop Summit 2014 - San Jose - Introduction to Deep Learning on Hadoop
Hadoop Summit 2014 - San Jose - Introduction to Deep Learning on HadoopHadoop Summit 2014 - San Jose - Introduction to Deep Learning on Hadoop
Hadoop Summit 2014 - San Jose - Introduction to Deep Learning on HadoopJosh Patterson
 
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGData Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGThamme Gowda
 

Was ist angesagt? (20)

Boolan machine learning summit
Boolan machine learning summitBoolan machine learning summit
Boolan machine learning summit
 
Hadoop summit 2016
Hadoop summit 2016Hadoop summit 2016
Hadoop summit 2016
 
Big Data Analytics Tokyo
Big Data Analytics TokyoBig Data Analytics Tokyo
Big Data Analytics Tokyo
 
Advanced deeplearning4j features
Advanced deeplearning4j featuresAdvanced deeplearning4j features
Advanced deeplearning4j features
 
Self driving computers active learning workflows with human interpretable ve...
Self driving computers  active learning workflows with human interpretable ve...Self driving computers  active learning workflows with human interpretable ve...
Self driving computers active learning workflows with human interpretable ve...
 
Deploying signature verification with deep learning
Deploying signature verification with deep learningDeploying signature verification with deep learning
Deploying signature verification with deep learning
 
Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data Platform
 
A Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysA Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate Arrays
 
DeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoTDeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoT
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016
 
Deep learning in production with the best
Deep learning in production   with the bestDeep learning in production   with the best
Deep learning in production with the best
 
Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production
 
World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018
 
Snorkel: Dark Data and Machine Learning with Christopher Ré
Snorkel: Dark Data and Machine Learning with Christopher RéSnorkel: Dark Data and Machine Learning with Christopher Ré
Snorkel: Dark Data and Machine Learning with Christopher Ré
 
Amazon Deep Learning
Amazon Deep LearningAmazon Deep Learning
Amazon Deep Learning
 
Best Practices for On-Demand HPC in Enterprises
Best Practices for On-Demand HPC in EnterprisesBest Practices for On-Demand HPC in Enterprises
Best Practices for On-Demand HPC in Enterprises
 
Deep learning with Tensorflow in R
Deep learning with Tensorflow in RDeep learning with Tensorflow in R
Deep learning with Tensorflow in R
 
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ..."Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
 
Hadoop Summit 2014 - San Jose - Introduction to Deep Learning on Hadoop
Hadoop Summit 2014 - San Jose - Introduction to Deep Learning on HadoopHadoop Summit 2014 - San Jose - Introduction to Deep Learning on Hadoop
Hadoop Summit 2014 - San Jose - Introduction to Deep Learning on Hadoop
 
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGData Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
 

Andere mochten auch

Deep Learning for Java (DL4J)
Deep Learning for Java (DL4J)Deep Learning for Java (DL4J)
Deep Learning for Java (DL4J)신동 강
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningBigDataCloud
 
RBM with DL4J for Deep Learning
RBM with DL4J for Deep Learning RBM with DL4J for Deep Learning
RBM with DL4J for Deep Learning 신동 강
 
Neural Networks and Deep Learning
Neural Networks and Deep LearningNeural Networks and Deep Learning
Neural Networks and Deep LearningAsim Jalis
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsBenjamin Le
 

Andere mochten auch (7)

Deep Learning for Java (DL4J)
Deep Learning for Java (DL4J)Deep Learning for Java (DL4J)
Deep Learning for Java (DL4J)
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
 
RBM with DL4J for Deep Learning
RBM with DL4J for Deep Learning RBM with DL4J for Deep Learning
RBM with DL4J for Deep Learning
 
Deep Learning on Hadoop
Deep Learning on HadoopDeep Learning on Hadoop
Deep Learning on Hadoop
 
Neural Networks and Deep Learning
Neural Networks and Deep LearningNeural Networks and Deep Learning
Neural Networks and Deep Learning
 
Distributed Deep Learning on Hadoop Clusters
Distributed Deep Learning on Hadoop ClustersDistributed Deep Learning on Hadoop Clusters
Distributed Deep Learning on Hadoop Clusters
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 

Ähnlich wie Deep learning on a mixed cluster with Deeplearning4j and Spark

Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...DataWorks Summit
 
Machine Learning with Scala
Machine Learning with ScalaMachine Learning with Scala
Machine Learning with ScalaSusan Eraly
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkDESMOND YUEN
 
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...Databricks
 
Image Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLImage Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLAmazon Web Services
 
Sjug #26 ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
Sjug #26   ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23Sjug #26   ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
Sjug #26 ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23Tomasz Sikora
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillotsparktc
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillotsparktc
 
APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...
APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...
APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...Deep Learning Italia
 
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Greg Makowski
 
Neural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningNeural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningAsim Jalis
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkDatabricks
 
Integrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache SparkIntegrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache SparkDatabricks
 
Deep learning for FinTech
Deep learning for FinTechDeep learning for FinTech
Deep learning for FinTechgeetachauhan
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...MLconf
 
Build a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimizationBuild a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimizationCraig Chao
 
Meetup deeplearningitalia-milano-valerio-morfino
Meetup deeplearningitalia-milano-valerio-morfinoMeetup deeplearningitalia-milano-valerio-morfino
Meetup deeplearningitalia-milano-valerio-morfinoDeep Learning Italia
 
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...Databricks
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayNick Pentreath
 

Ähnlich wie Deep learning on a mixed cluster with Deeplearning4j and Spark (20)

Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
 
Machine Learning with Scala
Machine Learning with ScalaMachine Learning with Scala
Machine Learning with Scala
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for Spark
 
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it...
 
Image Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLImage Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDL
 
Sjug #26 ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
Sjug #26   ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23Sjug #26   ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
Sjug #26 ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillot
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillot
 
APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...
APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...
APACHE SPARK PER IL MACHINE LEARNING: INTRODUZIONE ED UN CASO DI STUDIO_ Meet...
 
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
 
Started with-apache-spark
Started with-apache-sparkStarted with-apache-spark
Started with-apache-spark
 
Neural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningNeural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep Learning
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
 
Integrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache SparkIntegrating Deep Learning Libraries with Apache Spark
Integrating Deep Learning Libraries with Apache Spark
 
Deep learning for FinTech
Deep learning for FinTechDeep learning for FinTech
Deep learning for FinTech
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
 
Build a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimizationBuild a deep learning pipeline on apache spark for ads optimization
Build a deep learning pipeline on apache spark for ads optimization
 
Meetup deeplearningitalia-milano-valerio-morfino
Meetup deeplearningitalia-milano-valerio-morfinoMeetup deeplearningitalia-milano-valerio-morfino
Meetup deeplearningitalia-milano-valerio-morfino
 
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI Day
 

Mehr von François Garillot

Growing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your WorkloadGrowing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your WorkloadFrançois Garillot
 
Mobility insights at Swisscom - Understanding collective mobility in Switzerland
Mobility insights at Swisscom - Understanding collective mobility in SwitzerlandMobility insights at Swisscom - Understanding collective mobility in Switzerland
Mobility insights at Swisscom - Understanding collective mobility in SwitzerlandFrançois Garillot
 
Delivering near real time mobility insights at swisscom
Delivering near real time mobility insights at swisscomDelivering near real time mobility insights at swisscom
Delivering near real time mobility insights at swisscomFrançois Garillot
 
Spark Streaming : Dealing with State
Spark Streaming : Dealing with StateSpark Streaming : Dealing with State
Spark Streaming : Dealing with StateFrançois Garillot
 
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache SparkA Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache SparkFrançois Garillot
 
Ramping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developersRamping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developersFrançois Garillot
 
Diving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data PoolDiving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data PoolFrançois Garillot
 
Scala Collections : Java 8 on Steroids
Scala Collections : Java 8 on SteroidsScala Collections : Java 8 on Steroids
Scala Collections : Java 8 on SteroidsFrançois Garillot
 

Mehr von François Garillot (8)

Growing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your WorkloadGrowing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your Workload
 
Mobility insights at Swisscom - Understanding collective mobility in Switzerland
Mobility insights at Swisscom - Understanding collective mobility in SwitzerlandMobility insights at Swisscom - Understanding collective mobility in Switzerland
Mobility insights at Swisscom - Understanding collective mobility in Switzerland
 
Delivering near real time mobility insights at swisscom
Delivering near real time mobility insights at swisscomDelivering near real time mobility insights at swisscom
Delivering near real time mobility insights at swisscom
 
Spark Streaming : Dealing with State
Spark Streaming : Dealing with StateSpark Streaming : Dealing with State
Spark Streaming : Dealing with State
 
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache SparkA Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
 
Ramping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developersRamping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developers
 
Diving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data PoolDiving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data Pool
 
Scala Collections : Java 8 on Steroids
Scala Collections : Java 8 on SteroidsScala Collections : Java 8 on Steroids
Scala Collections : Java 8 on Steroids
 

Kürzlich hochgeladen

How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 

Kürzlich hochgeladen (20)

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 

Deep learning on a mixed cluster with Deeplearning4j and Spark

  • 1. Deep learning on a mixed cluster with Deeplearning4j and Spark Barcelona Spark meetup, Dec 9, 2016 (right after NIPS) francois@garillot.net @huitseeker
  • 2. Agenda Intro Why Deep Learning on a Cluster Big Data Architecture Deeplearning4j Spark challenges
  • 3. Introduction : Deep Learning in the trenches today
  • 4. The bad thing about doing a talk right after NIPS you guys are scary.
  • 5. The good thing about doing a talk right after NIPS You guys don't need to be told SkyNet is a fantasy (for now).
  • 6. Paying algorithms Anomaly detection in many forms (bad guys / predictive maintenance / market rally) Fraud detection Network intrusion Fintech secutiries churn prediction Video object detection (security)
  • 7. Models that are being neglected in benchmarks and implementation efforts LSTMs Autoencoders
  • 8. How to deal with this in the Spark world ? experiment with trained model application: Tensorframes, what are the deep learning frameworks that let you train ?
  • 9. Why Deep Learning on a cluster ?
  • 10. Practically ... let's look at benchmarks
  • 11. Practically ... let's look at benchmarks
  • 12. Practically ... let's look at benchmarks
  • 13. Practically ... let's look at benchmarks
  • 14. Training, but how ? New Amazon GPU instances
  • 17. Cluster training in the enterprise it's really about multi-tenancy & economies of scale a big bunch of machines shared among everybody shares better if only because you can reuse it for other workloads Minor reasons enterprises may not have GPUs
  • 18. Distributing training basically distributing SGD (R) challenge is AllReduce Communication Sparse updates, async communications
  • 19. Distributing training : good engineering matters
  • 20. Cluster training in your (experimentor) case ? it's a fun problem : AllReduce Ultimately solved for people with a large amount of images that solution is not open-source (but at Facebook, Google, Amazon, Microsoft¹, Baidu) ¹: 1-bit SGD is under non-commercial license in CNTK 2.0
  • 23. With Spark Spark does the initial ETL Spark ingests the nal result In the middle : parameter server.
  • 24. Spark cluster modes Mesos GPU support merged devices cgroups ! YARN GPU support through tags Spark Standalone : ?
  • 26. Deeplearning4j the rst commercial-grade, open-source, distributed deep- learning library written for Java and Scala Skymind its commercial support arm
  • 27. Scienti c computing on the JVM libnd4j : Vectorization, 32-bit addressing, linalg (BLAS!) JavaCPP: generates JNI bindings to your CPP libs ND4J : numpy for the JVM, native superfast arrays Datavec : one-stop interface to an NDArray DeepLearning4J: orchestration, backprop, layer de nition ScalNet: gateway drug, inspired from (and closely following) Keras RL4J : Reinforcement learning for the JVM
  • 28. With Spark JavaSparkContent sc = ...; JavaRDD<DataSet> trainingData = ...; MultiLayerConfiguration networkConfig = ...; //Create the TrainingMaster instance int examplesPerDataSetObject = 1; TrainingMaster trainingMaster = new ParameterAveragingTrainingMaster.Builder(examplesPerDataSetObjec .(other configuration options) .build(); //Create the SparkDl4jMultiLayer instance SparkDl4jMultiLayer sparkNetwork = new SparkDl4jMultiLayer(sc, networkConfig, trainingMaster); //Fit the network using the training data: sparkNetwork.fit(trainingData);
  • 30. Even if you don't care about Deep learning (from Kazuaki Ishizaki @ IBM Japan) SPARK-6442 : better linear algebra than breeze ND4J will have sparse representations soon
  • 31. Even if you don't care about Deep learning II Meta- RDDs
  • 32. Killing the bottlenecks Spark has already changed its networking backend once. better support for parameters servers and their fault tolerance.
  • 33. A Last Word (from Andrew Y. Ng) get involved ! don't just read papers, reproduce research results Also We're happy to mentor contributions, and there's a book !