SlideShare ist ein Scribd-Unternehmen logo
1 von 95
Project “Deep Water”
H2O’s Integration with TensorFlow
Jo-fai (Joe) Chow
Data Scientist
joe@h2o.ai
@matlabulous
TensorFlow Paris Meetup
30th November, 2016
About Me
• Civil (Water) Engineer
• 2010 – 2015
• Consultant (UK)
• Utilities
• Asset Management
• Constrained Optimization
• Industrial PhD (UK)
• Infrastructure Design Optimization
• Machine Learning +
Water Engineering
• Discovered H2O in 2014
• Data Scientist
• From 2015
• Virgin Media (UK)
• Domino Data Lab (Silicon Valley)
• H2O.ai (Silicon Valley)
2
Agenda
• Introduction
• About TensorFlow
• TensorFlow Use Cases
• About H2O.ai
• Project Deep Water
• Motivation
• Benefits
• H2O + TensorFlow Live Demo
• Conclusions
3
About TensorFlow
4
About TensorFlow
• Open source machine learning
framework by Google
• Python / C++ API
• TensorBoard
• Data Flow Graph Visualization
• Multi CPU / GPU
• v0.8+ distributed machines support
• Multi devices support
• desktop, server and Android devices
• Image, audio and NLP applications
• HUGE Community
• Support for Spark, Windows …
5
TensorFlow Wrappers
• TFLearn – Simplified interface
• keras – TensorFlow + Theano
• tensorflow.rb – Ruby wrapper
• TensorFlow.jl – Julia wrapper
• TensorFlow for R – R wrapper
• … and many more!
• See: github.com/jtoy/awesome-
tensorflow
6
TensorFlow Use Cases
Very Cool TensorFlow Use Cases
Some of the cool things you can do with TensorFlow
7
8
playground.tensorflow.org
Neural Style Transfer in TensorFlow
• Neural Style
• “… a technique to train a deep
neural network to separate artistic
style from image structure, and
combine the style of one image
with the structure of another”
• Original Paper
• “A Neural Algorithm of Artistic
Style”
• TensorFlow Implementation
• [Link]
9
Sorting Cucumbers
• Problem
• Sorting cucumbers is a laborious
process.
• In a Japanese farm, the farmer’s wife
can spend up to eight hours a day
sorting cucumbers during peak
harvesting period.
• Solution
• Farmer’s son (Makoto Koike) used
TensorFlow, Arduino and Raspberry Pi
to create an automatic cucumber
sorting system.
10
Sorting Cucumbers
• Classification Problem
• Input: cucumber photos (side, top,
bottom)
• Output: one of nine classes
• Google’s Blog Post [Link]
• YouTube Video [Link]
11
12
13
14
Of course
there are more TensorFlow use cases
The key message here is …
15
TensorFlow
democratizes
the power of deep learning
16
About H2O.ai
What exactly is H2O?
17
Company Overview
Founded 2011 Venture-backed, debuted in 2012
Products • H2O Open Source In-Memory AI Prediction Engine
• Sparkling Water
• Steam
Mission Operationalize Data Science, and provide a platform for users to build beautiful data products
Team 70 employees
• Distributed Systems Engineers doing Machine Learning
• World-class visualization designers
Headquarters Mountain View, CA
18
Bring AI To Business
Empower Transformation
Financial Services, Insurance and
Healthcare as Our Vertical Focus
Community as Our Foundation
H2O is an open source platform
empowering business transformation
Users In Various Verticals Adore H2O
Financial Insurance MarketingTelecom Healthcare
20
21
www.h2o.ai/customers
Cloud Integration
Big Data Ecosystem
H2O.ai Makes A Difference as an AI Platform
Open Source Flexible Interface
Scalability and Performance GPU EnablementRapid Model Deployment
Smart and Fast Algorithms
H2O Flow
• 100% open source
• Highly portable models
deployed in Java (POJO) and
Model Object Optimized
(MOJO)
• Automated and streamlined
scoring service deployment
with Rest API
• Distributed In-Memory
Computing Platform
• Distributed Algorithms
• Fine-Grain MapReduce
0
10000
20000
30000
40000
50000
60000
70000
1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16
# H2O Users
H2O Community Growth
Tremendous Momentum Globally
65,000+ users globally
(Sept 2016)
• 65,000+ users from
~8,000 companies in 140
countries. Top 5 from:
Large User Circle
* DATA FROM GOOGLE ANALYTICS EMBEDDED IN THE END USER PRODUCT
23
0
2000
4000
6000
8000
10000
1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16
# Companies Using H2O ~8,000+ companies
(Sept 2016)
+127%
+60%
H2O Community Support
24
Google forum – h2osteam community.h2o.ai
Please try
H2O for Kaggle Competitions
25
H2O for Academic Research
26
http://www.sciencedirect.com/science/article/pii/S0377221716308657
https://arxiv.org/abs/1509.01199
H2O
democratizes
artificial intelligence & big data science
27
Our Open Source Products
100% Open Source. Big Data Science for Everyone!
28
H2O.ai Offers AI Open Source Platform
Product Suite to Operationalize Data Science with Visual Intelligence
In-Memory, Distributed
Machine Learning
Algorithms with Speed and
Accuracy
H2O Integration with Spark.
Best Machine Learning on
Spark.
100% Open Source
29
Visual Intelligence and UX Framework For Data Interpretation and
Story Telling on top of Beautiful Data Products
Operationalize and
Streamline Model Building,
Training and Deployment
Automatically and Elastically
State-of-the-art
Deep Learning on GPUs with
TensorFlow, MXNet or Caffe
with the ease of use of H2O
Deep
Water
H2O.ai Offers AI Open Source Platform
Product Suite to Operationalize Data Science with Visual Intelligence
In-Memory, Distributed
Machine Learning
Algorithms with Speed and
Accuracy
H2O Integration with Spark.
Best Machine Learning on
Spark.
100% Open Source
30
Visual Intelligence and UX Framework For Data Interpretation and
Story Telling on top of Beautiful Data Products
Operationalize and
Streamline Model Building,
Training and Deployment
Automatically and Elastically
State-of-the-art
Deep Learning on GPUs with
TensorFlow, MXNet or Caffe
with the ease of use of H2O
Deep
Water
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
31
Supervised Learning
• Generalized Linear Models: Binomial,
Gaussian, Gamma, Poisson and Tweedie
• Naïve Bayes
Statistical
Analysis
Ensembles
• Distributed Random Forest: Classification
or regression models
• Gradient Boosting Machine: Produces an
ensemble of decision trees with increasing
refined approximations
Deep Neural
Networks
• Deep learning: Create multi-layer feed
forward neural networks starting with an
input layer followed by multiple layers of
nonlinear transformations
Algorithms Overview
Unsupervised Learning
• K-means: Partitions observations into k
clusters/groups of the same spatial size.
Automatically detect optimal k
Clustering
Dimensionality
Reduction
• Principal Component Analysis: Linearly transforms
correlated variables to independent components
• Generalized Low Rank Models: extend the idea of
PCA to handle arbitrary data consisting of numerical,
Boolean, categorical, and missing data
Anomaly
Detection
• Autoencoders: Find outliers using a
nonlinear dimensionality reduction using
deep learning
32
Distributed Algorithms
• Foundation for In-Memory Distributed Algorithm
Calculation - Distributed Data Frames and columnar
compression
• All algorithms are distributed in H2O: GBM, GLM, DRF, Deep
Learning and more. Fine-grained map-reduce iterations.
• Only enterprise-grade, open-source distributed algorithms
in the market
User Benefits
Advantageous Foundation
• “Out-of-box” functionalities for all algorithms (NO MORE
SCRIPTING) and uniform interface across all languages: R,
Python, Java
• Designed for all sizes of data sets, especially large data
• Highly optimized Java code for model exports
• In-house expertise for all algorithms
Parallel Parse into Distributed Rows
Fine Grain Map Reduce Illustration: Scalable
Distributed Histogram Calculation for GBM
FoundationforDistributedAlgorithms
33
H2O Deep Learning in Action
34
H2O in Action
Quick Demo (5 minutes)
35
Simple Demo – Iris
36
H2O + R
37
Please try
H2O Flow (Web)
38
H2O Flow
39
Key Learning Resources
• Help Documentations
• docs.h2o.ai
• Meetups
• bit.ly/h2o_meetups
• YouTube Channel
• bit.ly/h2o_youtube
40
H2O.ai Offers AI Open Source Platform
Product Suite to Operationalize Data Science with Visual Intelligence
In-Memory, Distributed
Machine Learning
Algorithms with Speed and
Accuracy
H2O Integration with Spark.
Best Machine Learning on
Spark.
100% Open Source
41
Visual Intelligence and UX Framework For Data Interpretation and
Story Telling on top of Beautiful Data Products
Operationalize and
Streamline Model Building,
Training and Deployment
Automatically and Elastically
State-of-the-art
Deep Learning on GPUs with
TensorFlow, MXNet or Caffe
with the ease of use of H2O
Deep
Water
Both TensorFlow and H2O are widely used
42
TensorFlow democratizes the power of
deep learning.
H2O democratizes artificial intelligence &
big data science.
There are other open source libraries like MXNet and Caffe too.
Let’s have a party, this will be fun!
43
Deep Water
Next-Gen Distributed Deep Learning with H2O
H2O integrates with existing GPU backends
for significant performance gains
One Interface - GPU Enabled - Significant Performance Gains
Inherits All H2O Properties in Scalability, Ease of Use and Deployment
Recurrent Neural Networks
enabling natural language processing,
sequences, time series, and more
Convolutional Neural Networks enabling
Image, video, speech recognition
Hybrid Neural Network Architectures
enabling speech to text translation, image
captioning, scene parsing and more
Deep Water
44
Deep Water Architecture
Node 1 Node N
Scala
Spark
H2O
Java
Execution Engine
TensorFlow/mxnet/Caffe
C++
GPU CPU
TensorFlow/mxnet/Caffe
C++
GPU CPU
RPC
R/Py/Flow/Scala client
REST API
Web server
H2O
Java
Execution Engine
grpc/MPI/RDMA
Scala
Spark
45
46
Using H2O Flow to train Deep
Water Model
Same H2O R/Python Interface
47
H2O + TensorFlow
Live Demo
48
Deep Water H2O + TensorFlow Demo
• H2O + TensorFlow
• Dataset – Cat/Dog/Mouse
• TensorFlow as GPU backend
• Train a LeNet (CNN) model
• Interfaces
• Python (Jupyter Notebook)
• Web (H2O Flow)
• Code and Data
• github.com/h2oai/deepwater
49
Data – Cat/Dog/Mouse Images
50
Data – CSV
51
Available Networks in Deep Water
• LeNet (This Demo)
• AlexNet
• VGGNet
• Inception (GoogLeNet)
• ResNet (Deep Residual
Learning)
• Build Your Own
52
ResNet
Deep Water in Action
Quick Demo (5 minutes)
53
54
55
56
57
58
Using GPU for training
59
60
Saving the custom network
structure as a file
61
Creating the custom network
structure with size = 28x28
and channels = 3
62
Specifying the custom
network structure for
training
H2O + TensorFlow
Now try it with H2O Flow
63
Want to try Deep Water?
64
• Build it
• github.com/h2oai/deepwater
• Ubuntu 16.04
• CUDA 8
• cuDNN 5
• …
• Pre-built Amazon Machine
Images (AMIs)
• Info to be confirmed
H2O.ai Offers AI Open Source Platform
Product Suite to Operationalize Data Science with Visual Intelligence
In-Memory, Distributed
Machine Learning
Algorithms with Speed and
Accuracy
H2O Integration with Spark.
Best Machine Learning on
Spark.
100% Open Source
65
Visual Intelligence and UX Framework For Data Interpretation and
Story Telling on top of Beautiful Data Products
Operationalize and
Streamline Model Building,
Training and Deployment
Automatically and Elastically
State-of-the-art
Deep Learning on GPUs with
TensorFlow, MXNet or Caffe
with the ease of use of H2O
Deep
Water
Want to find out more about Sparkling Water
and Steam?
• R Addicts Paris
• Tomorrow 6:30pm
• Three H2O Talks:
• Introduction to H2O
• Demos: H2O + R + Steam
• Auto Machine Learning using H2O
• Sparkling Water 2.0
66
Conclusions
67
Project “Deep Water”
• H2O + TensorFlow
• a powerful combination of two
widely used open source machine
learning libraries.
• All Goodies from H2O
• inherits all H2O properties in
scalability, ease of use and
deployment.
• Unified Interface
• allows users to build, stack and
deploy deep learning models from
different libraries efficiently.
68
• 100% Open Source
• the party will get bigger!
Deep Water Roadmap (Q4 2016)
69
H2O’s Mission
70
Making Machine Learning Accessible to Everyone
Photo credit: Virgin Media
Deep Water – Current Contributors
71
Merci beaucoup!
• Organizers & Sponsors
• Jiqiong, Natalia & Renat
• Dailymotion
• Code, Slides & Documents
• bit.ly/h2o_meetups
• bit.ly/h2o_deepwater
• docs.h2o.ai
• Contact
• joe@h2o.ai
• @matlabulous
• github.com/woobe
72
Extra Slides (Iris Demo)
73
74
75
76
77
78
79
80
81
82
83
Extra Slides (Deep Water Demo)
84
85
86
87
88
89
90
91
92
93
94
95

Weitere ähnliche Inhalte

Was ist angesagt?

Scalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2OScalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2O
odsc
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
Sri Ambati
 

Was ist angesagt? (20)

Introduction to Data Science with H2O- Mountain View
Introduction to Data Science with H2O- Mountain ViewIntroduction to Data Science with H2O- Mountain View
Introduction to Data Science with H2O- Mountain View
 
Scalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2OScalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2O
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R Meetup
 
H2O at BelgradeR Meetup
H2O at BelgradeR MeetupH2O at BelgradeR Meetup
H2O at BelgradeR Meetup
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
 
H2O at Poznan R Meetup
H2O at Poznan R MeetupH2O at Poznan R Meetup
H2O at Poznan R Meetup
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and ShinyMaking Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
 
Automatic and Interpretable Machine Learning in R with H2O and LIME (Milan Ed...
Automatic and Interpretable Machine Learning in R with H2O and LIME (Milan Ed...Automatic and Interpretable Machine Learning in R with H2O and LIME (Milan Ed...
Automatic and Interpretable Machine Learning in R with H2O and LIME (Milan Ed...
 
Stacked Ensembles in H2O
Stacked Ensembles in H2OStacked Ensembles in H2O
Stacked Ensembles in H2O
 
Kaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New OpportunitiesKaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New Opportunities
 
GPU Accelerated Machine Learning
GPU Accelerated Machine LearningGPU Accelerated Machine Learning
GPU Accelerated Machine Learning
 
Introduction to data science with H2O-Chicago
Introduction to data science with H2O-ChicagoIntroduction to data science with H2O-Chicago
Introduction to data science with H2O-Chicago
 
Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine Learning
 
World's Fastest Machine Learning With GPUs
World's Fastest Machine Learning With GPUsWorld's Fastest Machine Learning With GPUs
World's Fastest Machine Learning With GPUs
 
H2O Random Grid Search - PyData Amsterdam
H2O Random Grid Search - PyData AmsterdamH2O Random Grid Search - PyData Amsterdam
H2O Random Grid Search - PyData Amsterdam
 
ISAX
ISAXISAX
ISAX
 
H2O Machine Learning and Kalman Filters for Machine Prognostics
H2O Machine Learning and Kalman Filters for Machine PrognosticsH2O Machine Learning and Kalman Filters for Machine Prognostics
H2O Machine Learning and Kalman Filters for Machine Prognostics
 

Andere mochten auch

Andere mochten auch (7)

H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Deep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno CandelDeep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno Candel
 
H2O World - Benchmarking Open Source ML Platforms - Szilard Pafka
H2O World - Benchmarking Open Source ML Platforms - Szilard PafkaH2O World - Benchmarking Open Source ML Platforms - Szilard Pafka
H2O World - Benchmarking Open Source ML Platforms - Szilard Pafka
 
Tensor flow
Tensor flowTensor flow
Tensor flow
 
H20: A platform for big math
H20: A platform for big math H20: A platform for big math
H20: A platform for big math
 
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
 
Cloud Foundry vs Docker vs Kubernetes - http://bit.ly/2rzUM2U
Cloud Foundry vs Docker vs Kubernetes - http://bit.ly/2rzUM2UCloud Foundry vs Docker vs Kubernetes - http://bit.ly/2rzUM2U
Cloud Foundry vs Docker vs Kubernetes - http://bit.ly/2rzUM2U
 

Ähnlich wie Project "Deep Water"

Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Ian Gomez
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
Sri Ambati
 
SharePoint 2013 Dev Features
SharePoint 2013 Dev FeaturesSharePoint 2013 Dev Features
SharePoint 2013 Dev Features
Ricardo Wilkins
 
How Joomla and Microsoft are a Great Open Source Success
How Joomla and Microsoft are a Great Open Source SuccessHow Joomla and Microsoft are a Great Open Source Success
How Joomla and Microsoft are a Great Open Source Success
Cory Fowler
 

Ähnlich wie Project "Deep Water" (20)

Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2O
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R Meetup
 
Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
 
Day 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramDay 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers Program
 
SharePoint 2016 - nextgenportal
SharePoint 2016 - nextgenportalSharePoint 2016 - nextgenportal
SharePoint 2016 - nextgenportal
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
 
SharePoint 2013 Dev Features
SharePoint 2013 Dev FeaturesSharePoint 2013 Dev Features
SharePoint 2013 Dev Features
 
Machine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2OMachine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2O
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Azure for Hackathons
Azure for HackathonsAzure for Hackathons
Azure for Hackathons
 
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligenceTour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
 
Intro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LAIntro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LA
 
How Joomla and Microsoft are a Great Open Source Success
How Joomla and Microsoft are a Great Open Source SuccessHow Joomla and Microsoft are a Great Open Source Success
How Joomla and Microsoft are a Great Open Source Success
 
3 Steps to Accelerate to Cloud
3 Steps to Accelerate to Cloud3 Steps to Accelerate to Cloud
3 Steps to Accelerate to Cloud
 

Mehr von Jo-fai Chow

Kaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunitiesKaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunities
Jo-fai Chow
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
Jo-fai Chow
 
Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)
Jo-fai Chow
 
Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I, Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I,
Jo-fai Chow
 
Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)
Jo-fai Chow
 
Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)
Jo-fai Chow
 

Mehr von Jo-fai Chow (13)

Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Using H2O Random Grid Search for Hyper-parameters Optimization
Using H2O Random Grid Search for Hyper-parameters OptimizationUsing H2O Random Grid Search for Hyper-parameters Optimization
Using H2O Random Grid Search for Hyper-parameters Optimization
 
Introduction to Generalised Low-Rank Model and Missing Values
Introduction to Generalised Low-Rank Model and Missing ValuesIntroduction to Generalised Low-Rank Model and Missing Values
Introduction to Generalised Low-Rank Model and Missing Values
 
Improving Model Predictions via Stacking and Hyper-parameters Tuning
Improving Model Predictions via Stacking and Hyper-parameters TuningImproving Model Predictions via Stacking and Hyper-parameters Tuning
Improving Model Predictions via Stacking and Hyper-parameters Tuning
 
Kaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunitiesKaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunities
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
 
Designing Sustainable Drainage Systems
Designing Sustainable Drainage SystemsDesigning Sustainable Drainage Systems
Designing Sustainable Drainage Systems
 
Developing a New Decision Support System for SuDS
Developing a New Decision Support System for SuDSDeveloping a New Decision Support System for SuDS
Developing a New Decision Support System for SuDS
 
Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)
 
Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I, Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I,
 
Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)
 
Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Project "Deep Water"

  • 1. Project “Deep Water” H2O’s Integration with TensorFlow Jo-fai (Joe) Chow Data Scientist joe@h2o.ai @matlabulous TensorFlow Paris Meetup 30th November, 2016
  • 2. About Me • Civil (Water) Engineer • 2010 – 2015 • Consultant (UK) • Utilities • Asset Management • Constrained Optimization • Industrial PhD (UK) • Infrastructure Design Optimization • Machine Learning + Water Engineering • Discovered H2O in 2014 • Data Scientist • From 2015 • Virgin Media (UK) • Domino Data Lab (Silicon Valley) • H2O.ai (Silicon Valley) 2
  • 3. Agenda • Introduction • About TensorFlow • TensorFlow Use Cases • About H2O.ai • Project Deep Water • Motivation • Benefits • H2O + TensorFlow Live Demo • Conclusions 3
  • 5. About TensorFlow • Open source machine learning framework by Google • Python / C++ API • TensorBoard • Data Flow Graph Visualization • Multi CPU / GPU • v0.8+ distributed machines support • Multi devices support • desktop, server and Android devices • Image, audio and NLP applications • HUGE Community • Support for Spark, Windows … 5
  • 6. TensorFlow Wrappers • TFLearn – Simplified interface • keras – TensorFlow + Theano • tensorflow.rb – Ruby wrapper • TensorFlow.jl – Julia wrapper • TensorFlow for R – R wrapper • … and many more! • See: github.com/jtoy/awesome- tensorflow 6
  • 7. TensorFlow Use Cases Very Cool TensorFlow Use Cases Some of the cool things you can do with TensorFlow 7
  • 9. Neural Style Transfer in TensorFlow • Neural Style • “… a technique to train a deep neural network to separate artistic style from image structure, and combine the style of one image with the structure of another” • Original Paper • “A Neural Algorithm of Artistic Style” • TensorFlow Implementation • [Link] 9
  • 10. Sorting Cucumbers • Problem • Sorting cucumbers is a laborious process. • In a Japanese farm, the farmer’s wife can spend up to eight hours a day sorting cucumbers during peak harvesting period. • Solution • Farmer’s son (Makoto Koike) used TensorFlow, Arduino and Raspberry Pi to create an automatic cucumber sorting system. 10
  • 11. Sorting Cucumbers • Classification Problem • Input: cucumber photos (side, top, bottom) • Output: one of nine classes • Google’s Blog Post [Link] • YouTube Video [Link] 11
  • 12. 12
  • 13. 13
  • 14. 14
  • 15. Of course there are more TensorFlow use cases The key message here is … 15
  • 18. Company Overview Founded 2011 Venture-backed, debuted in 2012 Products • H2O Open Source In-Memory AI Prediction Engine • Sparkling Water • Steam Mission Operationalize Data Science, and provide a platform for users to build beautiful data products Team 70 employees • Distributed Systems Engineers doing Machine Learning • World-class visualization designers Headquarters Mountain View, CA 18
  • 19. Bring AI To Business Empower Transformation Financial Services, Insurance and Healthcare as Our Vertical Focus Community as Our Foundation H2O is an open source platform empowering business transformation
  • 20. Users In Various Verticals Adore H2O Financial Insurance MarketingTelecom Healthcare 20
  • 22. Cloud Integration Big Data Ecosystem H2O.ai Makes A Difference as an AI Platform Open Source Flexible Interface Scalability and Performance GPU EnablementRapid Model Deployment Smart and Fast Algorithms H2O Flow • 100% open source • Highly portable models deployed in Java (POJO) and Model Object Optimized (MOJO) • Automated and streamlined scoring service deployment with Rest API • Distributed In-Memory Computing Platform • Distributed Algorithms • Fine-Grain MapReduce
  • 23. 0 10000 20000 30000 40000 50000 60000 70000 1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16 # H2O Users H2O Community Growth Tremendous Momentum Globally 65,000+ users globally (Sept 2016) • 65,000+ users from ~8,000 companies in 140 countries. Top 5 from: Large User Circle * DATA FROM GOOGLE ANALYTICS EMBEDDED IN THE END USER PRODUCT 23 0 2000 4000 6000 8000 10000 1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16 # Companies Using H2O ~8,000+ companies (Sept 2016) +127% +60%
  • 24. H2O Community Support 24 Google forum – h2osteam community.h2o.ai Please try
  • 25. H2O for Kaggle Competitions 25
  • 26. H2O for Academic Research 26 http://www.sciencedirect.com/science/article/pii/S0377221716308657 https://arxiv.org/abs/1509.01199
  • 28. Our Open Source Products 100% Open Source. Big Data Science for Everyone! 28
  • 29. H2O.ai Offers AI Open Source Platform Product Suite to Operationalize Data Science with Visual Intelligence In-Memory, Distributed Machine Learning Algorithms with Speed and Accuracy H2O Integration with Spark. Best Machine Learning on Spark. 100% Open Source 29 Visual Intelligence and UX Framework For Data Interpretation and Story Telling on top of Beautiful Data Products Operationalize and Streamline Model Building, Training and Deployment Automatically and Elastically State-of-the-art Deep Learning on GPUs with TensorFlow, MXNet or Caffe with the ease of use of H2O Deep Water
  • 30. H2O.ai Offers AI Open Source Platform Product Suite to Operationalize Data Science with Visual Intelligence In-Memory, Distributed Machine Learning Algorithms with Speed and Accuracy H2O Integration with Spark. Best Machine Learning on Spark. 100% Open Source 30 Visual Intelligence and UX Framework For Data Interpretation and Story Telling on top of Beautiful Data Products Operationalize and Streamline Model Building, Training and Deployment Automatically and Elastically State-of-the-art Deep Learning on GPUs with TensorFlow, MXNet or Caffe with the ease of use of H2O Deep Water
  • 31. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 31
  • 32. Supervised Learning • Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie • Naïve Bayes Statistical Analysis Ensembles • Distributed Random Forest: Classification or regression models • Gradient Boosting Machine: Produces an ensemble of decision trees with increasing refined approximations Deep Neural Networks • Deep learning: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations Algorithms Overview Unsupervised Learning • K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detect optimal k Clustering Dimensionality Reduction • Principal Component Analysis: Linearly transforms correlated variables to independent components • Generalized Low Rank Models: extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data Anomaly Detection • Autoencoders: Find outliers using a nonlinear dimensionality reduction using deep learning 32
  • 33. Distributed Algorithms • Foundation for In-Memory Distributed Algorithm Calculation - Distributed Data Frames and columnar compression • All algorithms are distributed in H2O: GBM, GLM, DRF, Deep Learning and more. Fine-grained map-reduce iterations. • Only enterprise-grade, open-source distributed algorithms in the market User Benefits Advantageous Foundation • “Out-of-box” functionalities for all algorithms (NO MORE SCRIPTING) and uniform interface across all languages: R, Python, Java • Designed for all sizes of data sets, especially large data • Highly optimized Java code for model exports • In-house expertise for all algorithms Parallel Parse into Distributed Rows Fine Grain Map Reduce Illustration: Scalable Distributed Histogram Calculation for GBM FoundationforDistributedAlgorithms 33
  • 34. H2O Deep Learning in Action 34
  • 35. H2O in Action Quick Demo (5 minutes) 35
  • 36. Simple Demo – Iris 36
  • 40. Key Learning Resources • Help Documentations • docs.h2o.ai • Meetups • bit.ly/h2o_meetups • YouTube Channel • bit.ly/h2o_youtube 40
  • 41. H2O.ai Offers AI Open Source Platform Product Suite to Operationalize Data Science with Visual Intelligence In-Memory, Distributed Machine Learning Algorithms with Speed and Accuracy H2O Integration with Spark. Best Machine Learning on Spark. 100% Open Source 41 Visual Intelligence and UX Framework For Data Interpretation and Story Telling on top of Beautiful Data Products Operationalize and Streamline Model Building, Training and Deployment Automatically and Elastically State-of-the-art Deep Learning on GPUs with TensorFlow, MXNet or Caffe with the ease of use of H2O Deep Water
  • 42. Both TensorFlow and H2O are widely used 42
  • 43. TensorFlow democratizes the power of deep learning. H2O democratizes artificial intelligence & big data science. There are other open source libraries like MXNet and Caffe too. Let’s have a party, this will be fun! 43
  • 44. Deep Water Next-Gen Distributed Deep Learning with H2O H2O integrates with existing GPU backends for significant performance gains One Interface - GPU Enabled - Significant Performance Gains Inherits All H2O Properties in Scalability, Ease of Use and Deployment Recurrent Neural Networks enabling natural language processing, sequences, time series, and more Convolutional Neural Networks enabling Image, video, speech recognition Hybrid Neural Network Architectures enabling speech to text translation, image captioning, scene parsing and more Deep Water 44
  • 45. Deep Water Architecture Node 1 Node N Scala Spark H2O Java Execution Engine TensorFlow/mxnet/Caffe C++ GPU CPU TensorFlow/mxnet/Caffe C++ GPU CPU RPC R/Py/Flow/Scala client REST API Web server H2O Java Execution Engine grpc/MPI/RDMA Scala Spark 45
  • 46. 46 Using H2O Flow to train Deep Water Model
  • 47. Same H2O R/Python Interface 47
  • 49. Deep Water H2O + TensorFlow Demo • H2O + TensorFlow • Dataset – Cat/Dog/Mouse • TensorFlow as GPU backend • Train a LeNet (CNN) model • Interfaces • Python (Jupyter Notebook) • Web (H2O Flow) • Code and Data • github.com/h2oai/deepwater 49
  • 52. Available Networks in Deep Water • LeNet (This Demo) • AlexNet • VGGNet • Inception (GoogLeNet) • ResNet (Deep Residual Learning) • Build Your Own 52 ResNet
  • 53. Deep Water in Action Quick Demo (5 minutes) 53
  • 54. 54
  • 55. 55
  • 56. 56
  • 57. 57
  • 58. 58 Using GPU for training
  • 59. 59
  • 60. 60 Saving the custom network structure as a file
  • 61. 61 Creating the custom network structure with size = 28x28 and channels = 3
  • 62. 62 Specifying the custom network structure for training
  • 63. H2O + TensorFlow Now try it with H2O Flow 63
  • 64. Want to try Deep Water? 64 • Build it • github.com/h2oai/deepwater • Ubuntu 16.04 • CUDA 8 • cuDNN 5 • … • Pre-built Amazon Machine Images (AMIs) • Info to be confirmed
  • 65. H2O.ai Offers AI Open Source Platform Product Suite to Operationalize Data Science with Visual Intelligence In-Memory, Distributed Machine Learning Algorithms with Speed and Accuracy H2O Integration with Spark. Best Machine Learning on Spark. 100% Open Source 65 Visual Intelligence and UX Framework For Data Interpretation and Story Telling on top of Beautiful Data Products Operationalize and Streamline Model Building, Training and Deployment Automatically and Elastically State-of-the-art Deep Learning on GPUs with TensorFlow, MXNet or Caffe with the ease of use of H2O Deep Water
  • 66. Want to find out more about Sparkling Water and Steam? • R Addicts Paris • Tomorrow 6:30pm • Three H2O Talks: • Introduction to H2O • Demos: H2O + R + Steam • Auto Machine Learning using H2O • Sparkling Water 2.0 66
  • 68. Project “Deep Water” • H2O + TensorFlow • a powerful combination of two widely used open source machine learning libraries. • All Goodies from H2O • inherits all H2O properties in scalability, ease of use and deployment. • Unified Interface • allows users to build, stack and deploy deep learning models from different libraries efficiently. 68 • 100% Open Source • the party will get bigger!
  • 69. Deep Water Roadmap (Q4 2016) 69
  • 70. H2O’s Mission 70 Making Machine Learning Accessible to Everyone Photo credit: Virgin Media
  • 71. Deep Water – Current Contributors 71
  • 72. Merci beaucoup! • Organizers & Sponsors • Jiqiong, Natalia & Renat • Dailymotion • Code, Slides & Documents • bit.ly/h2o_meetups • bit.ly/h2o_deepwater • docs.h2o.ai • Contact • joe@h2o.ai • @matlabulous • github.com/woobe 72
  • 73. Extra Slides (Iris Demo) 73
  • 74. 74
  • 75. 75
  • 76. 76
  • 77. 77
  • 78. 78
  • 79. 79
  • 80. 80
  • 81. 81
  • 82. 82
  • 83. 83
  • 84. Extra Slides (Deep Water Demo) 84
  • 85. 85
  • 86. 86
  • 87. 87
  • 88. 88
  • 89. 89
  • 90. 90
  • 91. 91
  • 92. 92
  • 93. 93
  • 94. 94
  • 95. 95

Hinweis der Redaktion

  1. Open Source: 100% open source. Deep learning is in public domain and users should have full access to it in open source. Seamless integration with Big Data frameworks, such as Apache Spark. We have Sparkling Water. Also, H2O co-located in the company’s Hadoop cluster, so analysts can discover insights in the data without extracting it or taking samples Flexible interface: R, Python and Scala, Java Smart Algorithms: Deep Learning differs in the depth of features they support. Our support is the capabilities listed below save time and manual programming for the data scientist, which leads to be better models. • Automatic standardization on of the predictors • Automatic initialization of model parameters (weights & biases) • Automatic adaptive learning rates • Ability to specify manual learning rate with annealing and momentum • Automatic handling of categorical and missing data • Regularization on techniques to manage complexity • Automatic parameter optimization on with grid search • Early stopping based on validation on datasets • Automatic cross-validation Distributed Computing Platform Highly scalable Superior performance Rapid Model Deployment: POJO is very portable, and makes model deployment very fast. Steam has a streamlined scoring server deployment mechanism, which makes model deployment so much faster. GPU Enablement: Deep Learning algorithms can be so much faster on GPU’s than on CPU’s. This enablement will give H2O implementations a boost in performance over other implementations. Cloud integration: Amazon Web Services, Microsoft Azure, Google Cloud
  2. Date # Users 1-Jan-15 11000 1-Jul-15 25000 1-Jan-16 29000 1-Oct-16 65858 Date # Companies 1-Jan-15 1800 1-Jul-15 3810 1-Jan-16 4890 1-Oct-16 7803
  3. Works on Hadoop & Spark Key capabilities Distributed in-memory algorithms Columnar compression Feature Engineering EDA Data Munging Supervised Learning Algorithms Unsupervised Learning Algorithms Model deployment via POJO
  4. GBM authors from Stanford on Scientific Advisory Council