AI meets Big Data
How to cross the chasm
Google TPU
AI history → Perceptron
1943 W. McCulloch, W. Pitts, “Neuron” as logical element
1958 F. Rosenblatt, “Perceptron” model, neural networks
1969 M. Minsky, S. Papert, trigger first AI winter
[Diagram: feed-forward perceptron; OR function versus XOR function]
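The OR versus XOR contrast on this slide can be reproduced with a minimal sketch of a single-layer perceptron. The training loop, learning rate and epoch count below are illustrative assumptions, not taken from the slides. A single linear threshold unit can separate OR but cannot separate XOR.

```python
# Minimal sketch of a single-layer perceptron with a step activation.
# Learning rate and epoch count are illustrative assumptions.

def predict(w, b, x1, x2):
    """Step activation on a weighted sum of the two inputs."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

def train_perceptron(samples, epochs=50, lr=0.1):
    """Classic perceptron rule: nudge weights by the prediction error."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            err = target - predict(w, b, x1, x2)
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def accuracy(samples, w, b):
    hits = sum(1 for (x1, x2), t in samples if predict(w, b, x1, x2) == t)
    return hits / len(samples)

OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

or_acc = accuracy(OR_DATA, *train_perceptron(OR_DATA))
xor_acc = accuracy(XOR_DATA, *train_perceptron(XOR_DATA))
print("OR accuracy:", or_acc)
print("XOR accuracy:", xor_acc)
```

OR is linearly separable, so the rule converges on it. XOR is not, so no choice of weights classifies all four points correctly.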
AI history → AI winter
1943 W. McCulloch, W. Pitts, neuron as logical element
1958 F. Rosenblatt, Perceptron model, neural networks
1969 M. Minsky, S. Papert, trigger first AI winter
1980 Boom of expert systems, Q&A using logical rules, Prolog
1987-1993 the second AI winter, desktop computers, LISP machines expensive
1993-2001 Moore’s law, Deep Blue chess-playing, Stanford DARPA challenge
AI beats human in games - 2016
Go: AlphaGo beats L. Sedol 4:1 in 2016
Chess: Komodo beats H. Nakamura 2:1 in 2016
Image Classification - 2016
Human performance: 95%, AI performance: 97%
https://arxiv.org/pdf/1602.07261.pdf
The ability to understand the content of an image by using machine learning
Breast Cancer Diagnoses - 2017
Pathologist performance: 73%, AI performance: 92%
https://research.googleblog.com/2017/03/assisting-pathologists-in-detecting.html
Doctors often use additional tests to find or diagnose breast cancer
The pathologist ended up spending 30 hours on this task on 130 slides
A closeup of a lymph node biopsy.
Machine Learning Problem Types
Structured data
80% of world’s data is unstructured
Fishing in the sea versus fishing in the lake
Business Intelligence helps find answers to questions you know.
Data Science helps you find the question itself.
Data Warehouse | Data Lake
Structured data & schema-on-write | Any kind of data & schema-on-read
SQL-ish queries on database tables | Parallel processing on big data
Extract, Transform, Load | Extract, Load, Transform-on-the-fly
Expensive for large data | Low cost on commodity hardware
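The schema-on-write versus schema-on-read distinction can be illustrated with a minimal sketch. The schema, field names and records below are hypothetical and chosen only for illustration. A warehouse-style writer validates before storing, while a lake-style reader stores raw lines and applies the schema at query time.

```python
import json

# Hypothetical schema used only for this illustration.
SCHEMA = {"user_id": int, "amount": float}

def write_with_schema(table, record):
    """Schema-on-write: validate first, reject records that do not fit."""
    for field, ftype in SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"field {field!r} does not match the schema")
    table.append({field: record[field] for field in SCHEMA})

def read_with_schema(raw_lines):
    """Schema-on-read: raw data is stored as-is, the schema is applied on read."""
    for line in raw_lines:
        doc = json.loads(line)
        try:
            yield {"user_id": int(doc["user_id"]), "amount": float(doc["amount"])}
        except (KeyError, TypeError, ValueError):
            continue  # this record cannot be projected onto the schema

warehouse = []
write_with_schema(warehouse, {"user_id": 1, "amount": 9.5})
try:
    write_with_schema(warehouse, {"user_id": "2", "amount": "n/a"})
    rejected = False
except ValueError:
    rejected = True

lake = [
    '{"user_id": 1, "amount": 9.5}',
    '{"user_id": "2", "amount": "3", "note": "free text"}',
    '{"event": "click"}',
]
rows = list(read_with_schema(lake))
print(rejected, len(warehouse), rows)
```

The lake accepts every line at ingestion time. Only the reader decides which records fit the question being asked.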
More Data + Bigger Models
[Chart: accuracy versus scale (data size, model size) for neural networks and other approaches, 1990s]
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
More Data + Bigger Models + More Computation
[Chart: accuracy versus scale (data size, model size) for neural networks and other approaches, now, with more compute]
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
More Data + Bigger Models + More Computation
= Better Results in Machine Learning
Routing and price optimization: millions of “trip” events each day globally
Movie recommendation: 400+ billion viewing-related events per day
Price optimization: five billion data points for Price Tip feature
How to start?
Single machine, ML specialist, small data
Single machine, ML specialist, big data
Train and evaluate machine learning models at scale
Single machine Data center
How to run more experiments faster and in parallel?
How to share and reproduce research?
How to go from research to real products?
Distributed Machine Learning
[Diagram: data size versus model size, from single machine to data center]
Model parallelism: training very large models
Data parallelism: speeds up the training
Exploring several model architectures, hyper-parameter optimization, training several independent models
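Data parallelism can be sketched as follows: every worker computes a gradient on its own shard of the data, and the averaged gradient drives one shared update. The one-parameter linear model, the toy data and the learning rate are assumptions made for illustration. The shards are processed in a loop here, whereas a real setup would run them on separate machines.

```python
# Minimal sketch of synchronous data parallelism for the model y ≈ w * x.
# Data, shard layout and learning rate are illustrative assumptions.

def gradient(w, shard):
    """Gradient of the mean squared error on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr):
    """Each shard's gradient would be computed on its own worker."""
    grads = [gradient(w, shard) for shard in shards]
    avg_grad = sum(grads) / len(grads)  # the "all-reduce" step
    return w - lr * avg_grad

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # generated with w = 3
shards = [data[0:4], data[4:8]]  # two equally sized shards

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
print("learned w:", w)
```

With equally sized shards, the averaged gradient equals the full-batch gradient, so the result matches single-machine training while the per-step work is split across workers.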
Compute Workload for Training and Evaluation
[Chart: I/O intensive versus compute intensive, single machine versus data center]
I/O Workload for Simulation and Testing
[Chart: I/O intensive versus compute intensive, single machine versus data center]
Distributed Machine Learning
The new rising star
11/24/17
TensorFlow standalone
TensorFlow on YARN
TensorFlow on multi-colored YARN
TensorFlow on Spark (TensorFrames)
TensorFlow on Kubernetes
TensorFlow on Mesos
Distributed TensorFlow on Hadoop, Mesos, Kubernetes, Spark
https://www.slideshare.net/jwiegelmann/distributed-tensorflow-on-hadoop-mesos-kubernetes-spark
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Google, 2015
TFX: A TensorFlow-Based Production-Scale Machine Learning Platform
http://stevenwhang.com/tfx_paper.pdf
Google, 2017
https://eng.uber.com/michelangelo/
Michelangelo: Uber’s Machine Learning Platform
http://searchbusinessanalytics.techtarget.com/feature/Machine-learning-platforms-comparison-Amazon-Azure-Google-IBM
Pricing for 890,000 real-time predictions w/o training
AWS: Compute Fees + Prediction Fees = $8.40 + $96.44 = $104.84 per month
Google: Prediction $0.10 per thousand predictions, plus $0.40 per hour = $377 per month
Azure: Packages $0, $100.13, $1,000.06, $9,999.98 = $1,000 per month
Q3, 2017
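The AWS and Google totals can be re-derived from the listed rates with a short calculation. The rates come from the slide. The 720 hours per month (30 days of 24 hours) is an assumption that happens to reproduce the Google figure.

```python
# Re-deriving the monthly totals from the listed rates.
PREDICTIONS_PER_MONTH = 890_000
HOURS_PER_MONTH = 30 * 24  # assumption: 30-day month, always-on endpoint

# AWS: compute fees plus prediction fees, as listed.
aws_total = round(8.40 + 96.44, 2)

# Google: $0.10 per thousand predictions plus $0.40 per hour.
google_prediction_fee = PREDICTIONS_PER_MONTH / 1000 * 0.10
google_hourly_fee = HOURS_PER_MONTH * 0.40
google_total = round(google_prediction_fee + google_hourly_fee, 2)

print("AWS per month:", aws_total)
print("Google per month:", google_total)
```

Under these assumptions the hourly fee, not the per-prediction fee, accounts for most of the Google total.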
COMMERCE USE CASE
Who are you?
Who do you know?
What can you afford?
Where are you?
What have you purchased?
What do you like?
What content do you prefer?
Why have you contacted us?
Marketing Tools, Touchpoints, Aftersales
Capture Data across Digital Channels
Each of these customer interactions produces data
Breaking Down Data Silos
Connect all your data tools and other sources, and gain a 360 degree view of your data
Get actionable insights and serve customers personal, relevant content along their journey
Real-time processing and decision making
One Data Platform
[Diagram: Marketing Tools, Touchpoints, Historical and Aftersales data feed Data Analytics, Machine Learning and Data Apps]
Actionable data insights that businesses can use to...
✓ Better understand and better engage your customers
✓ Respond to the convergence of customer expectations
✓ Driver of brand perception
360 degree view of your customers
Customer 360: Social, Apps, CRM, Billing, Channels, Service Call, Location, Devices, Network, Ordering
Where AI can help…
+ Predicting lifetime value
+ Churn estimation
+ Customer segmentation
+ Cross/Upselling
+ Recommendations
+ Demand forecasting
+ Market Basket Analysis
+ Sentiment analysis
+ Loyalty programs
+ Reactivation likelihood
+ Discount targeting
+ Call to action
+ Risk analysis
+ In-store traffic patterns
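As one concrete instance from the list above, market basket analysis can be sketched with support and confidence counts over purchase baskets. The baskets and item names below are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, used only for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "jam"},
]

item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def support(a, b):
    """Share of all baskets that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]

print("support(bread, butter):", support("bread", "butter"))
print("confidence(butter -> bread):", confidence("butter", "bread"))
```

Rules with high support and high confidence are the usual candidates for cross-selling and recommendations.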
AUTOMOTIVE USE CASE
High-level Development Process for Autonomous Vehicles
1 Collect sensor data
2 Model Engineering
3 Autonomous Driving
[Diagram: Data Logger, Big Data, Data Center, Trained Model, Control Unit]
Agenda
Sensors: Udacity Lincoln MKZ
Camera: 3x Blackfly GigE Camera, 20 Hz
Lidar: Velodyne HDL-32E, 9.5 Hz
IMU: Xsens, 400 Hz
GPS: 2x fixed, 1 Hz
CAN bus: 1.1 kHz
Robot Operating System
Data: 3 GB per minute
https://github.com/udacity/self-driving-car
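The stated rate of 3 GB per minute translates into storage needs as in the short calculation below. The 8-hour recording day and the decimal conversion (1 TB = 1000 GB) are assumptions for illustration.

```python
# Storage estimate from the stated recording rate.
GB_PER_MINUTE = 3  # from the slide

def recorded_gb(hours):
    """Raw data volume in GB for a given number of recording hours."""
    return GB_PER_MINUTE * 60 * hours

per_hour_gb = recorded_gb(1)
per_day_tb = recorded_gb(8) / 1000  # assumption: 8 hours of driving, 1 TB = 1000 GB

print("GB per hour:", per_hour_gb)
print("TB per 8-hour day:", per_day_tb)
```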
Sensors Spec
Sensor | Blinding, sunlight, darkness | Rain, fog, snow | Non-metal objects | Wind/high velocity | Resolution | Range | Data
Ultrasonic | yes | yes | yes | no | + | + | +
Lidar | yes | no | yes | yes | +++ | ++ | +
Radar | yes | yes | no | yes | ++ | +++ | +
Camera | no | no | yes | yes | +++ | +++ | +++
Machine Learning 101
Observations → State Estimation → Modeling & Prediction → Planning → Controls
f(x): Observations → Controls
Machine Learning for Autonomous Driving
+ Sensor Fusion: clustering, segmentation, pattern recognition
+ Road: ego-motion, image processing and pattern recognition
+ Localization: simultaneous localization and mapping
+ Situation Understanding: detection and classification
+ Trajectory Planning: motion planning and control
+ Control Strategy: reinforcement and supervised learning
+ Driver Model: image processing and pattern recognition
Machine Learning Cycle
+ Data collection for training/test
+ Feature engineering (I/O workload)
+ Model development and architecture
+ Training and evaluation (compute workload)
+ Re-Simulation and testing (I/O workload)
+ Model deployment, versioning
+ Scaling and monitoring
+ Model tuning
Flux – Open Machine Learning Stack
Training & Test data
Compute + Network + Storage
Deploy model
ML Development & Catalog & REST API
ML-Specialists
Feature Engineering
Training & Evaluation
Re-Simulation & Testing
CaffeOnSpark
Sample Model Prediction Batch Regression Cluster
Dataset Correlation Centroid Anomaly Test Scores
✓ Mainly open source
✓ No vendor lock-in
✓ Scale-out architecture
✓ Multi-user support
✓ Resource management
✓ Job scheduling
✓ Speed-up training
✓ Speed-up simulation
Feature Engineering
+ Hadoop InputFormat and Record Reader for Rosbag
+ Process Rosbag with Spark, YARN, MapReduce, Hadoop Streaming API, …
+ Spark RDDs are cached and optimized for analysis
[Diagram: Rosbag → Record Reader → RDD of ROS messages → DataFrame, DataSet, SQL, Spark APIs, NumPy for advanced analytics; processing engine on computer, network, storage]
Training & Evaluation
+ TensorFlow ROSRecordDataset
+ Protocol Buffers to serialize records
+ Save time because data conversion is not needed
+ Save storage because data duplication is not needed
[Diagram: Rosbag → ROS messages → ROS Dataset → training engine (machine learning); computer, network, storage]
Re-Simulation & Testing
+ Use Spark for preprocessing, transformation, cleansing, aggregation, time window selection before publishing to ROS topics
+ Use Re-Simulation framework of choice to subscribe to the ROS topics
[Diagram: Rosbag → engine → publish → ROS topic (core) → subscribe → Re-Simulation with framework of choice; computer, network, storage]
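The time window selection step before publishing can be sketched in plain Python. The message tuples, topic names and the `publish` callback are hypothetical stand-ins. A real setup would read the messages from a Rosbag and hand them to a ROS publisher.

```python
# Minimal sketch of time window selection before replaying recorded messages.
# Messages are (timestamp_seconds, topic, payload) tuples; all values are hypothetical.

def select_window(messages, start, end):
    """Keep messages with start <= timestamp < end, ordered by timestamp."""
    selected = [m for m in messages if start <= m[0] < end]
    return sorted(selected, key=lambda m: m[0])

def replay(messages, publish):
    """Hand each message to a publish callback (stand-in for a ROS publisher)."""
    for _, topic, payload in messages:
        publish(topic, payload)

recorded = [
    (12.0, "/camera", "frame-3"),
    (10.0, "/camera", "frame-1"),
    (10.5, "/imu", "imu-1"),
    (20.0, "/camera", "frame-9"),
]

published = []
replay(
    select_window(recorded, 10.0, 15.0),
    lambda topic, payload: published.append((topic, payload)),
)
print(published)
```

Filtering before publishing keeps the re-simulation framework unaware of how the window was chosen. It only subscribes to the topics.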
HOW TO START?
Data Science
+ Big Data Strategy, Consulting, Data Lab, Data Science as a Service
+ Data Collection, Cleaning, Analyzing, Modeling, Validation, Visualization
+ Business Case Validation, Prototyping, MVPs, Dashboards
Machine Learning
+ Classification, Regression, Clustering, Collaborative Filtering, Anomaly Detection
+ Supervised/Unsupervised/Reinforcement Learning, Deep Learning, CNN
+ Model Training, Evaluation, Testing, Simulation, Inference
Data Engineering
+ Continuous Integration and Deployment
+ Data Pipelines (Acquisition, Ingestion, Analytics, Visualization)
+ Distributed Data Architectures
+ Data Processing Backend
+ Hadoop Ecosystem
+ Test Automation and Testing
Data Operations
+ Architecture, DevOps, Cloud Building
+ App. Management Hadoop Ecosystem
+ Managed Infrastructure Services
+ Compute, Network, Storage, Firewall, Load Balancer, DDoS Protection
Think Big: Business Strategy, Data Strategy, Technology Strategy
Start Small: Agile Delivery Model, Business Case Validation, Prototypes, MVPs, Data Exploration, Data Acquisition
Value Proposition
thank you
https://www.slideshare.net/jwiegelmann

Reviewing and summarization of university ranking system  to.pptxReviewing and summarization of university ranking system  to.pptx
Reviewing and summarization of university ranking system to.pptx
 
GENUINE Babe,Call Girls IN Baderpur Delhi | +91-8377087607
GENUINE Babe,Call Girls IN Baderpur  Delhi | +91-8377087607GENUINE Babe,Call Girls IN Baderpur  Delhi | +91-8377087607
GENUINE Babe,Call Girls IN Baderpur Delhi | +91-8377087607
 
Intro_University_Ranking_Introduction.pptx
Intro_University_Ranking_Introduction.pptxIntro_University_Ranking_Introduction.pptx
Intro_University_Ranking_Introduction.pptx
 

AI meets Big Data

  • 1. AI meets Big Data: How to cross the chasm
  • 2.
  • 4. AI history → Perceptron. 1943: W. McCulloch, W. Pitts, “neuron” as logical element. 1958: F. Rosenblatt, “Perceptron” model, neural networks. 1969: M. Minsky, S. Papert, trigger first AI winter. Diagram labels: OR function, XOR function, feed forward.
  • 5. AI history → AI winter. 1943: W. McCulloch, W. Pitts, neuron as logical element. 1958: F. Rosenblatt, Perceptron model, neural networks. 1969: M. Minsky, S. Papert, trigger first AI winter. 1980: boom of expert systems, Q&A using logical rules, Prolog. 1987-1993: the second AI winter, desktop computers, LISP machines expensive. 1993-2001: Moore’s law, Deep Blue chess-playing, Stanford DARPA challenge.
  • 6. AI beats humans in games (2016). Go: AlphaGo beats Lee Sedol 4:1. Chess: Komodo beats H. Nakamura 2:1.
  • 7. Image Classification (2016): the ability to understand the content of an image by using machine learning. Human performance: 95%. AI performance: 97%. Source: https://arxiv.org/pdf/1602.07261.pdf
  • 8. Breast Cancer Diagnoses (2017). Pathologist performance: 73%. AI performance: 92%. Doctors often use additional tests to find or diagnose breast cancer. The pathologist ended up spending 30 hours on this task on 130 slides. Image: a closeup of a lymph node biopsy. Source: https://research.googleblog.com/2017/03/assisting-pathologists-in-detecting.html
  • 10. Structured data: 80% of the world’s data is unstructured
  • 11. Fishing in the sea versus fishing in the lake. Data Warehouse: structured data & schema-on-write; SQL-ish queries on database tables; Extract, Transform, Load; expensive for large data. Data Lake: any kind of data & schema-on-read; parallel processing on big data; Extract, Load, Transform-on-the-fly; low cost on commodity hardware. Business Intelligence helps find answers to questions you know. Data Science helps you find the question itself.
  • 12. More Data + Bigger Models (1990s). Chart: accuracy versus scale (data size, model size) for neural networks and other approaches. Source: https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
  • 13. More Data + Bigger Models + More Computation (now). Chart: accuracy versus scale (data size, model size) for neural networks and other approaches, with more compute. Source: https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
  • 14. More Data + Bigger Models + More Computation = Better Results in Machine Learning
  • 15. Millions of “trip” events each day globally (routing and price optimization). 400+ billion viewing-related events per day (movie recommendation). Five billion data points for the Price Tip feature (price optimization).
  • 18. Single machine, ML specialist, small data
  • 19. Single machine, ML specialist, small data. X X
  • 20. Single machine, ML specialist, big data. X X
  • 21. Train and evaluate machine learning models at scale (single machine, data center). How to run more experiments faster and in parallel? How to share and reproduce research? How to go from research to real products?
  • 22. Distributed Machine Learning. Axes: data size, model size. Diagram labels: model parallelism; single machine; data center; data parallelism. Notes: training very large models; exploring several model architectures, hyperparameter optimization, training several independent models; speeds up the training.
  • 23. Compute Workload for Training and Evaluation. Diagram labels: I/O intensive, compute intensive; single machine, data center.
  • 24. I/O Workload for Simulation and Testing. Diagram labels: I/O intensive, compute intensive; single machine, data center.
  • 28. Distributed TensorFlow on Hadoop, Mesos, Kubernetes, Spark (11/24/17): TensorFlow Standalone; TensorFlow on YARN; TensorFlow on multi-colored YARN; TensorFlow on Spark; TensorFrames; TensorFlow on Kubernetes; TensorFlow on Mesos. Source: https://www.slideshare.net/jwiegelmann/distributed-tensorflow-on-hadoop-mesos-kubernetes-spark
  • 29. Hidden Technical Debt in Machine Learning Systems (Google, 2015). Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  • 30. Hidden Technical Debt in Machine Learning Systems (Google, 2015), continued.
  • 31. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform (Google, 2017). Source: http://stevenwhang.com/tfx_paper.pdf
  • 34. Pricing for 890,000 real-time predictions without training (Q3 2017). AWS: compute fees + prediction fees = $8.40 + $96.44 = $104.84 per month. Google: prediction $0.10 per thousand predictions, plus $0.40 per hour = $377 per month. Azure: packages $0, $100.13, $1,000.06, $9,999.98 = $1,000 per month.
  • 36. Who are you? Who do you know? What can you afford? Where are you? What have you purchased? What do you like? What content do you prefer? Why have you contacted us?
  • 37. Capture Data across Digital Channels: marketing tools, touchpoints, aftersales. Each of these customer interactions produces data.
  • 38. Breaking Down Data Silos. Connect all your data tools and other sources, and gain a 360 degree view of your data. Get actionable insights and serve customers personal, relevant content along their journey. Real-time processing and decision making. One Data Platform: marketing tools, touchpoints, historical, aftersales; data analytics, machine learning, data apps.
  • 39. 360 degree view of your customers. Actionable data insights that businesses can use to... ✓ Better understand and better engage your customers ✓ Respond to the convergence of customer expectations ✓ Driver of brand perception. Customer 360: Social Apps CRM Billing Channels Service Call Location Devices Network Ordering
  • 40. Where AI can help… + Predicting lifetime value + Churn estimation + Customer segmentation + Cross/Upselling + Recommendations + Demand forecasting + Market Basket Analysis + Sentiment analysis + Loyalty programs + Reactivation likelihood + Discount targeting + Call to action + Risk analysis + In store traffic patterns
  • 42. High-level Development Process for Autonomous Vehicles. 1: Collect sensor data. 2: Model engineering. 3: Autonomous driving. Diagram labels: Data Logger, Control Unit, Big Data, Trained Model, Data Center.
  • 43. Sensors (Udacity Lincoln MKZ). Camera: 3x Blackfly GigE Camera, 20 Hz. Lidar: Velodyne HDL-32E, 9.5 Hz. IMU: Xsens, 400 Hz. GPS: 2x fixed, 1 Hz. CAN bus: 1.1 kHz. Robot Operating System. Data: 3 GB per minute. Source: https://github.com/udacity/self-driving-car
  • 44. Sensors Spec
      Sensor     | blinding, sunlight, darkness | rain, fog, snow | non-metal objects | wind/high velocity | resolution | range | data
      Ultrasonic | yes | yes | yes | no  | +   | +   | +
      Lidar      | yes | no  | yes | yes | +++ | ++  | +
      Radar      | yes | yes | no  | yes | ++  | +++ | +
      Camera     | no  | no  | yes | yes | +++ | +++ | +++
  • 45. Machine Learning 101. Stages: Observations, State Estimation, Modeling & Prediction, Planning, Controls. Summary diagram: f(x), with Controls and Observations.
  • 46. Machine Learning for Autonomous Driving + Sensor Fusion: clustering, segmentation, pattern recognition + Road: ego-motion, image processing and pattern recognition + Localization: simultaneous localization and mapping + Situation Understanding: detection and classification + Trajectory Planning: motion planning and control + Control Strategy: reinforcement and supervised learning + Driver Model: image processing and pattern recognition
  • 47. Machine Learning Cycle. Diagram labels: Data collection for training/test; Feature engineering; I/O workload; Model development and architecture; Compute workload; I/O workload; Training and evaluation; Re-Simulation and Testing; Scaling and monitoring; Model deployment, versioning; Model tuning. Steps numbered 1, 2, 3.
  • 48. Flux: Open Machine Learning Stack. Diagram labels: Training & Test data; Compute + Network + Storage; Deploy model; ML Development & Catalog & REST API; ML-Specialists; Feature Engineering, Training, Evaluation, Re-Simulation, Testing; CaffeOnSpark; Sample Model Prediction Batch Regression Cluster Dataset Correlation Centroid Anomaly Test Scores. ✓ Mainly open source ✓ No vendor lock-in ✓ Scale-out architecture ✓ Multi-user support ✓ Resource management ✓ Job scheduling ✓ Speed-up training ✓ Speed-up simulation
  • 49. Feature Engineering + Hadoop InputFormat and RecordReader for Rosbag + Process Rosbag with Spark, YARN, MapReduce, Hadoop Streaming API, … + Spark RDDs are cached and optimized for analysis. Diagram labels: Ros bag; Processing Engine; Compute, Network, Storage; Advanced Analytics; RDD Record Reader; RDD, DataFrame, DataSet; SQL, Spark APIs; NumPy; Ros Msg.
  • 50. Training & Evaluation + TensorFlow ROSRecordDataset + Protocol Buffers to serialize records + Saves time because data conversion is not needed + Saves storage because data duplication is not needed. Diagram labels: Training Engine; Machine Learning; Ros bag; Compute, Network, Storage; ROS Dataset; Ros msg.
  • 51. Re-Simulation & Testing + Use Spark for preprocessing, transformation, cleansing, aggregation, and time window selection before publishing to ROS topics + Use the Re-Simulation framework of choice to subscribe to the ROS topics. Diagram labels: Engine; Re-Simulation with framework of choice; Compute, Network, Storage; Ros bag; Ros topic; core; subscribe; publish.
  • 53. Headings: Data Science, Machine Learning. + Classification, Regression, Clustering, Collaborative Filtering, Anomaly Detection + Supervised/Unsupervised Reinforcement Learning, Deep Learning, CNN + Model Training, Evaluation, Testing, Simulation, Inference + Big Data Strategy, Consulting, Data Lab, Data Science as a Service + Data Collection, Cleaning, Analyzing, Modeling, Validation, Visualization + Business Case Validation, Prototyping, MVPs, Dashboards
  • 54. Headings: Data Engineering, Data Operations. + Architecture, DevOps, Cloud Building + App. Management Hadoop Ecosystem + Managed Infrastructure Services + Compute, Network, Storage, Firewall, Load Balancer, DDoS Protection + Continuous Integration and Deployment + Data Pipelines (Acquisition, Ingestion, Analytics, Visualization) + Distributed Data Architectures + Data Processing Backend + Hadoop Ecosystem + Test Automation and Testing
  • 55. Think Big: Business Strategy, Data Strategy, Technology Strategy, Agile Delivery Model. Start Small: Business Case Validation; Prototypes, MVPs; Data Exploration; Data Acquisition. Value Proposition.
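
Slide 4 sets the OR function next to the XOR function in the context of Rosenblatt's perceptron. The following is a minimal sketch, not taken from the deck, of why that contrast matters: a single perceptron trained with the classic update rule settles on weights for OR, but never reaches an error-free pass on XOR, which is not linearly separable. The names `train_perceptron` and `predict`, the learning rate, and the epoch limit are illustrative assumptions.

```python
from typing import List, Tuple

Sample = Tuple[Tuple[int, int], int]

OR_SAMPLES: List[Sample] = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_SAMPLES: List[Sample] = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def predict(w: List[float], b: float, x: Tuple[int, int]) -> int:
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_perceptron(samples: List[Sample], epochs: int = 50, lr: float = 1.0):
    """Run the perceptron update rule; return (weights, bias, converged)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(w, b, x)
            if delta != 0:
                # Move the decision boundary toward the misclassified point.
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
                errors += 1
        if errors == 0:
            # A full pass without mistakes: every sample is classified correctly.
            return w, b, True
    return w, b, False


if __name__ == "__main__":
    for name, samples in (("OR", OR_SAMPLES), ("XOR", XOR_SAMPLES)):
        w, b, converged = train_perceptron(samples)
        print(name, "converged:", converged)
```

Raising the epoch limit does not help in the XOR case, because no single line separates its two classes; a network with a hidden layer is needed instead.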
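
Slide 22 names data parallelism as one form of distributed machine learning. As a rough sketch of the idea, assuming a toy one-parameter linear model and equally sized shards (neither comes from the deck), each worker computes a gradient on its own slice of the data and the averaged gradient drives one shared update. The helper names `gradient`, `split_into_shards`, and `data_parallel_step` are hypothetical, and the workers run sequentially here rather than on separate machines.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pairs for the toy model y = w * x


def gradient(w: float, shard: List[Sample]) -> float:
    """Gradient of the mean squared error of y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def split_into_shards(data: List[Sample], workers: int) -> List[List[Sample]]:
    """Split the data into equally sized shards, one per worker."""
    if len(data) % workers != 0:
        raise ValueError("this sketch assumes the data divides evenly")
    size = len(data) // workers
    return [data[i * size:(i + 1) * size] for i in range(workers)]


def data_parallel_step(w: float, data: List[Sample], workers: int, lr: float) -> float:
    """One synchronous update: per-shard gradients are averaged, then applied."""
    shards = split_into_shards(data, workers)
    # In a real cluster each gradient would be computed on a different machine.
    grads = [gradient(w, shard) for shard in shards]
    return w - lr * sum(grads) / len(grads)


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated by y = 2x
    w = 0.0
    for _ in range(200):
        w = data_parallel_step(w, data, workers=2, lr=0.01)
    print(round(w, 4))  # approaches the true slope of 2.0
```

With equally sized shards, the average of the per-shard gradients equals the full-batch gradient, so the result matches single-machine training while the gradient work is spread across workers.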
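
The monthly totals on slide 34 can be checked with simple arithmetic. The sketch below sums the two AWS fees as given and recomputes the Google figure from the stated unit prices. It assumes a 30-day month (720 billable hours), which the slide does not state; under that assumption the result matches the slide's $377. The function names are illustrative only.

```python
PREDICTIONS_PER_MONTH = 890_000
HOURS_PER_MONTH = 30 * 24  # assumption: 30-day month, not stated on the slide


def aws_monthly(compute_fee: float = 8.40, prediction_fee: float = 96.44) -> float:
    """AWS total as listed on the slide: compute fees plus prediction fees."""
    return round(compute_fee + prediction_fee, 2)


def google_monthly(
    predictions: int = PREDICTIONS_PER_MONTH,
    per_thousand: float = 0.10,
    per_hour: float = 0.40,
    hours: int = HOURS_PER_MONTH,
) -> float:
    """Google total: per-thousand prediction fee plus an hourly fee."""
    prediction_cost = predictions / 1000 * per_thousand  # 890 thousand * $0.10
    hourly_cost = hours * per_hour                       # 720 hours * $0.40
    return round(prediction_cost + hourly_cost, 2)


if __name__ == "__main__":
    print("AWS:", aws_monthly())
    print("Google:", google_monthly())
```

Of the Google total, $89 comes from the predictions themselves and $288 from the hourly charge, so the hourly fee dominates at this volume.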