Future of AI on the JVM
Scala Days Amsterdam 2015
Adam Gibson
Creator of Deeplearning4j (and 4s :)
What is AI?
● Not Terminator (despite our name)
● Many subfields
● Our focus: Machine learning
Big Data?
Problem Space
● Spam Classification
● Summarization
● Face Detection
● Eye Tracking
● Targeted Ads
● Recommendation Engines
Current State of ML
● Simpler models
● Most of industry barely uses logistic regression
● Many problems are binary
o e.g. fraud, spam
● Some unsupervised (clustering, recommendations)
● Lots of ML frameworks on JVM
ML Frameworks on JVM...
● Apache Mahout
● Spark’s MLlib
● Weka (is that R?)
ML GUIs
● Prediction.io
● Encog
Problems
● Monolithic
● Makes assumptions about data
● Hard to use
● No separation of concerns
Ring a Bell?
● We call that “Monolithic”
● Separate ML concerns:
o Data Pipelines/Vectorization
o Scoring
o Model Training
o Evaluation
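A minimal sketch of what separating these concerns could look like. Each stage is an independent function that only exchanges plain vectors and labels with the others. The function names, the tiny spam data set, and the perceptron are illustrative assumptions, not code from Deeplearning4j or any other framework.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def vectorize(text: str, vocabulary: Sequence[str]) -> Vector:
    """Data pipeline stage: turn raw text into a bag-of-words count vector."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocabulary]


def score(weights: Vector, features: Vector) -> int:
    """Scoring stage: apply a linear model; the last weight is the bias."""
    activation = sum(w * x for w, x in zip(weights, features)) + weights[-1]
    return 1 if activation > 0 else 0


def train(examples: Sequence[Tuple[Vector, int]], epochs: int = 20) -> Vector:
    """Training stage: a plain perceptron that only ever sees vectors."""
    weights = [0.0] * (len(examples[0][0]) + 1)
    for _ in range(epochs):
        for features, label in examples:
            error = label - score(weights, features)
            for i, value in enumerate(features):
                weights[i] += error * value
            weights[-1] += error
    return weights


def evaluate(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Evaluation stage: accuracy, computed without touching the model."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical toy data: 1 = spam, 0 = not spam.
vocabulary = ["free", "winner", "meeting", "report"]
raw = [("free winner free", 1), ("winner free prize", 1),
       ("meeting report", 0), ("report for meeting", 0)]

examples = [(vectorize(text, vocabulary), label) for text, label in raw]
weights = train(examples)
predictions = [score(weights, features) for features, _ in examples]
accuracy = evaluate(predictions, [label for _, label in examples])
```

Because the stages share only vectors and labels, any one of them can be swapped (a different vectorizer, a different learner, a different metric) without changing the rest.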
Micro-Services + ML?
● Kinda like micro-services
● Reduce lock-in
● Take math, data cleaning, model training, choosing algorithms ...
● … and separate them
Math
● Parametric Models (Matrices!)
● Non Parametric (Random forest)
● Focusing on Matrices (the hard part of ML systems)
Matrices
● NDArrays (more than 2 dimensions)
● Tensors (think of pages of matrices)
● Example: a 2 x 2 x 2 tensor is two 2 x 2 matrices stacked as pages
● Applies to graphs w/ sparse representations
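The "pages of matrices" picture can be sketched with plain nested lists. This illustration uses no library at all and is not ND4J's API; the helper name `shape` is an assumption made for the example.

```python
# A 2 x 2 x 2 tensor written as two "pages", each page a 2 x 2 matrix.
tensor = [
    [[1, 2],
     [3, 4]],  # page 0
    [[5, 6],
     [7, 8]],  # page 1
]


def shape(array):
    """Return the dimensions of a regular nested list, outermost first."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)


# Indexing goes page first, then row, then column.
page, row, col = 1, 0, 1
element = tensor[page][row][col]

print(shape(tensor))     # dimensions of the whole tensor
print(shape(tensor[0]))  # dimensions of a single page
print(element)
```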
Chips/Hardware/Matrices
● CPUs - We work with these
● GPUs - CUDA ditto
● FPGAs
o Intel agreed to buy Altera, an FPGA maker, for about $17 billion this month
o The edge, the cloud
Why New Chips?
● See the numbers yourself:
● http://www.slideshare.net/airbots/cuda-29330283
● http://devblogs.nvidia.com/parallelforall/bidmach-machine-learning-limit-gpus/
● http://jcuda.org
Mixed clusters
● GPUs aren’t good for all workloads
● Because latency
● Need to upload data: not good for small problems
● Mixed CPU/GPU clusters are best bet
Data Pipelines
● More data will be binary
● Frameworks today can’t process binary well
● Binary data has different semantics
● Moving windows for audio
● 3d for images ...
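As a sketch of the "moving windows for audio" idea: a 1-D stream of samples is cut into fixed-size, overlapping windows, and each window becomes one input vector. The function name and parameters are assumptions for illustration, not part of any framework named in this talk.

```python
def moving_windows(samples, window_size, step):
    """Split a 1-D sample sequence into fixed-size, possibly overlapping windows.

    Trailing samples that do not fill a whole window are dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(samples) - window_size
    return [samples[start:start + window_size]
            for start in range(0, last_start + 1, step)]


# Hypothetical signal of 8 samples; windows of 4 samples, advancing 2 at a time.
signal = [0, 1, 2, 3, 4, 5, 6, 7]
windows = moving_windows(signal, window_size=4, step=2)
```

With a step smaller than the window size, consecutive windows overlap, which is the usual choice when a feature may straddle a window boundary.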
People Roll Their Own b/c
● Current frameworks assume clean data :(
● Pipelines are brittle, hard to maintain
● Moving towards being composable (reuse)
Dedicated Libraries
● Let’s focus on vectorization -- now!
● Because IoT
● Because more access to raw media
● Should fit into current big data frameworks
Scoring
● AUC
● F1
● Different Loss Functions
● Hyper parameter optimization
All independent
● These things work for different models
● Shouldn’t be tied to a particular system
● Should be embeddable
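To illustrate why scoring can live apart from any particular model, here is a minimal sketch of F1 and AUC that needs only labels and model outputs. The function names, the example numbers, and the 0.5 threshold are assumptions for illustration.

```python
def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for binary 0/1 labels."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(labels, scores):
    """Chance that a random positive outranks a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outputs from any model: only labels and scores are required.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]

f1 = f1_score(labels, predictions)
area = auc(labels, scores)
```

The pairwise AUC loop is quadratic in the number of examples; it is chosen here for clarity, not speed.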
Training
● Split Train/Test
● Sample data (no, not all the data ;) to validate model
● Increasingly compute intensive
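A minimal sketch of the train/test split, assuming a simple shuffled hold-out. The function name, the 20% default, and the fixed seed are illustrative choices.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and hold out a fraction of them for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    test_size = max(1, int(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical data set of 10 rows.
rows = list(range(10))
train_rows, test_rows = train_test_split(rows, test_fraction=0.2)
```

The same function can draw a small validation sample from a large data set by calling it on the rows you are willing to spend on validation.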
Deep Learning
● Most done in Python...
● Normal training time is measured in hours or days, sometimes weeks!?
● Work being done in HPC (Model parallelism)
● DistBelief (Data parallelism)
Automatic Learning
● Good at unstructured data
● Images, Text, Audio and Sensors
● Quick, baseline feature engineering
● Not good at feature introspection
Or are they?
t-SNE
Where Does Scala Fit In?
● Akka - Real time streaming analytics/micro services
● Spark - Dataframes/number crunching
● JVM Key/Value Stores
● Pistachio (powers Yahoo’s ad network)
o http://yahooeng.tumblr.com/post/118860853846/distributed-word2vec-on-top-of-pistachio
The Way We Learn Now
● Monolithic ML frameworks
● No per-chip optimizations
● No Tensors (come on guys, it’s 2015...)
● Need isolation and less lock-in
● JVM is the platform to make it happen
Other Links
● http://deeplearning4j.org/
● http://nd4j.org/
● https://github.com/deeplearning4j/Canova
Questions?
● adam@skymind.io
● @agibsonccc
● github.com/agibsonccc
