Building Intelligent Applications & Experimental
ML with Uber’s Data Science Workbench
Adam Hudson & Atul Gupte
Uber Inc.
/ Data at Uber
/ Analytics Stack
/ Machine Learning at Uber
/ Data Science Workbench
/ Real-world Impact
Contents
Engineer turned Product Manager
Previously: building FarmVille & the mobile advertising platform @ Zynga
Currently: Product Manager for Data Science Workbench & Data
Warehouse
/ About Atul
/ Data at Uber
Uber's mission is to bring reliable
transportation - to everyone, everywhere
Data informs every decision at the company
Uber’s massive data holds deep, hidden insights.
We surface them
6,000+ data scientists, engineers, and operations
managers rely on us to support the business
Data is what differentiates Uber
but data at Uber is unlike data anywhere else.
Delicate marketplace with
network effects
Bits to atoms
Business
New LOBs spun up in a snap
Pluggable mobility platform
Spatio-temporal
Analytics
Sheer scale
Real-time. Real-world.
ML is Uber’s brain
Apps/Machine generated
queries
Varied skills: BI to DNN
Consumers
Internal and external
6,000 and growing
What makes Uber unique
MISSION
Move the world with
global data, local
insights, and intelligent
decisions.
Data Platform Team
/ Data Analytics Stack
The Data Team
Ingest
Workflow
Management
Store
Produce Model
Ad-Hoc &
Streaming
Analytics
Business
Intelligence
Machine
Learning
Metadata/
Knowledge
Experimentation
/
Segmentation
Visualization
Data
Infrastructure
Data Platforms
Data Services
& Analytics
Disperse
Kafka
Schemaless
SOA
BI Apps Ad-hoc Experimentation ML Notebooks
Cluster
Management
All-Active
Observability
Security
Raw
Data
Raw
Tables
Hadoop
Hive Presto Spark
Modeled
Tables
Vertica
Vertica
Warehouse
AthenaX
Apollo
Streaming
Real-time
Metadata/Workflow Management
Data Infrastructure
/ Machine Learning At Uber
The hype
● Ability of a machine to learn without being explicitly programmed
● Identify hidden patterns in the world based on current and historical
data and use it to predict the future
● Ability of a machine to get better at a task with data and experience
● Learn from mistakes and improve when given newer/more information
Demand prediction
Object detection/tracking
Motion prediction
Route planning
Pick-up clustering
Voice recognition
Supply modeling
Occupancy
modeling
Route planning, ETA, road modeling, low-latency image classifier
Elasticity estimation, ETA, route
optimization, demand prediction
Speech generation, natural language generation,
image classifiers, drop-off clustering
2. prototype
3. productionize
1. define
4. measure
Launch and Iterate
Typical ML Workflow
UNDERSTAND
BUSINESS NEED(S)
DEFINE MINIMUM
VIABLE PRODUCT (MVP)
○ Customers + cross-functional team
○ Define objectives and key results
○ Data-driven
○ Research
○ Ruthless prioritization
2. prototype
3. productionize
4. measure
1. define
Problem Definition
UNDERSTAND
BUSINESS NEED(S)
DEFINE MINIMUM
VIABLE PRODUCT (MVP)
2. prototype
1. define
GET DATA
DATA PREPARATION
TRAIN MODELS
EVALUATE MODELS
3. productionize
4. measure
validation
computational cost
interpretability
SQL, Spark
data cleansing and pre-processing,
R / Python
CPU or GPU
Exploration
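The prototype loop on this slide (get data, prepare it, train a model, evaluate it) can be sketched in a few lines of Python. This is only an illustrative sketch, not Uber's code: the synthetic data stands in for a SQL or Spark query, and the function names `get_data`, `prepare`, `train`, and `evaluate` are hypothetical, as is the choice of a one-feature least-squares model and mean absolute error as the validation metric.

```python
import random


def get_data(n=200, seed=0):
    """GET DATA: synthetic (feature, target) rows; about 5% have a missing feature."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 3.0 * x + 2.0 + rng.gauss(0, 0.5)  # assumed ground truth plus noise
        rows.append((x if rng.random() > 0.05 else None, y))
    return rows


def prepare(rows):
    """DATA PREPARATION: drop rows whose feature is missing."""
    return [(x, y) for x, y in rows if x is not None]


def train(rows):
    """TRAIN MODELS: fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(model, rows):
    """EVALUATE MODELS: mean absolute error on held-out rows."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in rows) / len(rows)


rows = prepare(get_data())
split = int(0.8 * len(rows))  # 80/20 train/hold-out split
train_rows, test_rows = rows[:split], rows[split:]
model = train(train_rows)
mae = evaluate(model, test_rows)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} holdout MAE={mae:.2f}")
```

Holding out the last 20% of rows keeps the validation step honest. Computational cost and interpretability, the other two evaluation criteria on the slide, would be judged separately.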
UNDERSTAND
BUSINESS NEED(S)
2. prototype
1. define
DATA PREPARATION
TRAIN MODELS
EVALUATE MODELS
4. measure
GET DATA
PRODUCTIONIZE
MODELS
3. productionize
DEPLOY MODELS
Engineers + Data Scientists,
Java or Go,
unit tests
MAKE PREDICTIONS
Real-time or batch
Experimentation and
rollout monitoring;
Retraining strategy
DEFINE MINIMUM
VIABLE PRODUCT (MVP)
Production
UNDERSTAND
BUSINESS NEED(S)
DEFINE MINIMUM
VIABLE PRODUCT (MVP)
2. prototype
1. define
DATA PREPARATION
TRAIN MODELS
EVALUATE MODELS
GET DATA
DEPLOY MODELS
PRODUCTIONIZE
MODELS
MONITOR
PREDICTIONS
4. measure
MAKE PREDICTIONS
3. productionize
Automatically detect
degradations
GATHER AND
ANALYZE INSIGHTS
Deep-dive analyses
inform future product
roadmap
Measure
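One simple way to "automatically detect degradations" is to compare a rolling error metric against the error measured at launch. The sketch below assumes a regression-style model and a fixed tolerance. The class name `DegradationMonitor`, the window size, and the threshold rule are all hypothetical choices for illustration, not a description of Uber's monitoring.

```python
from collections import deque


class DegradationMonitor:
    """Flags degradation when rolling mean absolute error exceeds baseline * tolerance."""

    def __init__(self, baseline_mae, window=50, tolerance=1.5):
        self.baseline_mae = baseline_mae
        self.tolerance = tolerance
        self.errors = deque(maxlen=window)  # keeps only the most recent errors

    def record(self, predicted, actual):
        """Store the absolute error of one prediction once the true value is known."""
        self.errors.append(abs(predicted - actual))

    def rolling_mae(self):
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def degraded(self):
        # Only judge once the window is full, to avoid noisy early alerts.
        return (len(self.errors) == self.errors.maxlen
                and self.rolling_mae() > self.baseline_mae * self.tolerance)


monitor = DegradationMonitor(baseline_mae=1.0, window=5, tolerance=1.5)

# Predictions that track the actual values closely.
for predicted, actual in [(10, 10.5), (12, 11), (9, 9.5), (11, 12), (10, 10)]:
    monitor.record(predicted, actual)
healthy = monitor.degraded()

# Predictions after the underlying data has shifted.
for predicted, actual in [(10, 14), (12, 7), (9, 13), (11, 16), (10, 5)]:
    monitor.record(predicted, actual)
drifted = monitor.degraded()

print(f"degraded before shift: {healthy}, after shift: {drifted}")
```

A flag like this would typically feed the retraining strategy mentioned on the Production slide. The deep-dive analyses remain a manual step.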
/ Data Science Workbench
Senior Software Engineer
Previously: Big data and big network R&D in gaming, social media &
finance
Currently: Developer on Data Science Workbench
/ About Adam
A growing Data Science community was facing
problems with many aspects of their workflows
Our world in 2016 NEW
Getting Started
Collaboration
Shared Standards
Moving Models to
Production
Scalability
Available Features
Data Access
To unleash the productivity of
Uber’s Data Science
community
Mission
Similar Offerings
We Wanted More!
● Diverse customers working from same data
○ Data scientists
○ Developers
○ Interns
○ Operations
○ External parties
● Scalability with access to internal data, computation and accounts
● Acceptable licensing cost for large number of casual users
Introducing Data Science Workbench
eng.uber.com/dsw
Our World Today
Getting Started
Collaboration
Shared Standards
Scalability
Available Features
Data Access
Fully hosted 1-click Jupyter Notebook & RStudio IDE
Pre-baked Environments
Sharing options on notebooks; 1-click Shiny dashboard publication
All internal data sources / Multi-DC / Secure / GDPR Compliant
Various Session Sizes, Types (CPU, GPU) / Access to Compute Engines
Documentation Support
Common Use-Cases
● Large-scale data exploration
● Feature generation and model training
● Ad-hoc analysis and prototypes
● Review and collaboration
Product Walkthrough
RStudio and Shiny are trademarks of RStudio, Inc
"Jupyter" is a trademark of the NumFOCUS foundation, of which Project Jupyter is a part.
"Python" is a registered trademark of the PSF. The Python logos (in several variants) are use trademarks of the PSF as well.
The World of Tomorrow!
Getting Started
Collaboration
Customized team environments
Social media-like interface; more flexible dashboards
Distributed deep learning
Low friction workflow
Available Features
Available Features
Moving Models
to Production
DSW Impact
Safety
Trip classification
Risk
Driver account check
Driver referral risk scoring
Uber Eats
Restaurant recommendations
Support
NLP model for support tickets
Operations
Lifetime value (LTV) model
more!
And with that, I will pass you back to Atul to
discuss the impact that DSW is having.
/ … one more thing
We’re hiring!
Excited to build the data platform that moves the world?
Come join us!
http://t.uber.com/datahire
San Francisco, Palo Alto, Seattle, Bangalore
Proprietary and confidential © 2018 Uber Technologies, Inc. All rights reserved. No part of this
document may be reproduced or utilized in any form or by any means, electronic or mechanical,
including photocopying, recording, or by any information storage or retrieval systems, without
permission in writing from Uber. This document is intended only for the use of the individual or entity
to whom it is addressed and contains information that is privileged, confidential or otherwise exempt
from disclosure under applicable law. All recipients of this document are notified that the information
contained herein includes proprietary and confidential information of Uber, and recipient may not
make use of, disseminate, or in any way disclose this document or any of the enclosed information
to any person other than employees of addressee to the extent necessary for consultations with
authorized personnel of Uber.
Thank you!
and remember, t.uber.com/datahire
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIDataWorks Summit
 
Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...DataWorks Summit
 
Achieving a 360 degree view of manufacturing
Achieving a 360 degree view of manufacturingAchieving a 360 degree view of manufacturing
Achieving a 360 degree view of manufacturingDataWorks Summit
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelDataWorks Summit
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle TechnologiesOleksii Movchaniuk
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...DataWorks Summit
 
Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes John Archer
 
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...DataWorks Summit
 
Rob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoopRob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoopGhassan Al-Yafie
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big DataInfochimps, a CSC Big Data Business
 
Designing Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDesigning Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDataWorks Summit
 
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Yellowfin
 
IoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldIoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldDataWorks Summit
 
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...DataWorks Summit
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data editionMark Kerzner
 
Big Data Scotland 2017
Big Data Scotland 2017Big Data Scotland 2017
Big Data Scotland 2017Ray Bugg
 
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the EnterpriseData Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the EnterpriseDataWorks Summit
 

Was ist angesagt? (20)

Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AI
 
Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...
 
BDaas- BigData as a service
BDaas- BigData as a service  BDaas- BigData as a service
BDaas- BigData as a service
 
Achieving a 360 degree view of manufacturing
Achieving a 360 degree view of manufacturingAchieving a 360 degree view of manufacturing
Achieving a 360 degree view of manufacturing
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle Technologies
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
 
Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes
 
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
 
OpenPOWER Update
OpenPOWER UpdateOpenPOWER Update
OpenPOWER Update
 
Rob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoopRob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoop
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
 
Designing Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDesigning Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted Analytics
 
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
 
IoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldIoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected World
 
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data edition
 
Big Data Scotland 2017
Big Data Scotland 2017Big Data Scotland 2017
Big Data Scotland 2017
 
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the EnterpriseData Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
 

Ähnlich wie Building intelligent applications, experimental ML with Uber’s Data Science Workbench

Building Intelligent Applications, Experimental ML with Uber’s Data Science W...
Building Intelligent Applications, Experimental ML with Uber’s Data Science W...Building Intelligent Applications, Experimental ML with Uber’s Data Science W...
Building Intelligent Applications, Experimental ML with Uber’s Data Science W...Databricks
 
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Karthik Murugesan
 
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleData Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleDatabricks
 
Serverless projects at Myplanet
Serverless projects at MyplanetServerless projects at Myplanet
Serverless projects at MyplanetDaniel Zivkovic
 
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSBig Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSMatt Stubbs
 
SplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
SplunkLive! Amsterdam 2015 Breakout - Getting Started with SplunkSplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
SplunkLive! Amsterdam 2015 Breakout - Getting Started with SplunkSplunk
 
1° Sessione Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga...
1° Sessione Oracle CRUI: Analytics Data Lab,  the power of Big Data Investiga...1° Sessione Oracle CRUI: Analytics Data Lab,  the power of Big Data Investiga...
1° Sessione Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga...Jürgen Ambrosi
 
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligenceTour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligenceAlex Danvy
 
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...Sarah Aerni
 
Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...
Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...
Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...Carie John
 
Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...
Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...
Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...Carie John
 
Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...
Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...
Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...Carie John
 
Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...Carie John
 
Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...Carie John
 
Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...Carie John
 
Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...Carie John
 
Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...
Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...
Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...Carie John
 

Ähnlich wie Building intelligent applications, experimental ML with Uber’s Data Science Workbench (20)

Building Intelligent Applications, Experimental ML with Uber’s Data Science W...
Building Intelligent Applications, Experimental ML with Uber’s Data Science W...Building Intelligent Applications, Experimental ML with Uber’s Data Science W...
Building Intelligent Applications, Experimental ML with Uber’s Data Science W...
 
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
 
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleData Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Serverless projects at Myplanet
Serverless projects at MyplanetServerless projects at Myplanet
Serverless projects at Myplanet
 
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSBig Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
 
AI at Scale in Enterprises
AI at Scale in Enterprises AI at Scale in Enterprises
AI at Scale in Enterprises
 
SplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
SplunkLive! Amsterdam 2015 Breakout - Getting Started with SplunkSplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
SplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
 
1° Sessione Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga...
1° Sessione Oracle CRUI: Analytics Data Lab,  the power of Big Data Investiga...1° Sessione Oracle CRUI: Analytics Data Lab,  the power of Big Data Investiga...
1° Sessione Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga...
 
Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017
 
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligenceTour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
 
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Scienc...
 
Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...
Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...
Tableau reseller partner in Djibouti Bilytica Best business Intelligence Comp...
 
Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...
Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...
Tableau reseller partner in Fiji Bilytica Best business Intelligence Company ...
 
Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...
Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...
Tableau reseller partner in Brunei Bilytica Best business Intelligence Compan...
 
Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Cambodia Bilytica Best business Intelligence Comp...
 
Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Estonia Bilytica Best business Intelligence Compa...
 
Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...
Tableau reseller partner in Croatia Bilytica Best business Intelligence Compa...
 
Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...
Tableau reseller partner in Ethiopia Bilytica Best business Intelligence Comp...
 
Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...
Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...
Tableau reseller partner in Botswana Bilytica Best business Intelligence Comp...
 

Mehr von DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Building intelligent applications, experimental ML with Uber’s Data Science Workbench

  • 1. Building Intelligent Applications & Experimental ML with Uber’s Data Science Workbench Adam Hudson & Atul Gupte Uber Inc.
  • 2. / Data at Uber / Analytics Stack / Machine Learning at Uber / Data Science Workbench / Real-world Impact Contents
  • 3. Engineer turned Product Manager Previously: building FarmVille & the mobile advertising platform @ Zynga Currently: Product Manager for Data Science Workbench & Data Warehouse / About Atul
  • 4. / Data at Uber
  • 5. Uber's mission is to bring reliable transportation - to everyone, everywhere
  • 6. Data informs every decision at the company
  • 7. Uber’s massive data holds deep, hidden insights. We surface them
  • 8. 6,000+ data scientists, engineers, and operations managers rely on us to support the business
  • 9. Data is what differentiates Uber but, data at Uber is unlike anywhere else.
  • 10. Delicate marketplace with network effects Bits to atoms Business New LOBs spun up in a snap Pluggable mobility platform Spatio-temporal Analytics Sheer scale Real-time. Real-world. ML is Uber’s brain Apps/Machine generated queries Varied skills: BI to DNN Consumers Internal and external 6,000 and growing What makes Uber unique
  • 11. MISSION Move the world with global data, local insights, and intelligent decisions. Data Platform Team
  • 13. The Data Team Ingest Workflow Management Store Produce Model Ad-Hoc & Streaming Analytics Business Intelligence Machine Learning Metadata/ Knowledge Experimentation / Segmentation Visualization Data Infrastructure Data Platforms Data Services & Analytics Disperse
  • 14. Kafka Schemaless SOA BI Apps Ad-hoc Experimentation ML Notebooks Cluster Management All-Active Observability Security Raw Data Raw Tables Hadoop Hive Presto Spark Modeled Tables Vertica Vertica Warehouse AthenaX Apollo Streaming Real-time Metadata/Workflow Management Data Infrastructure
  • 16. The hype ● Ability of a machine to learn without being explicitly programmed ● Identify hidden patterns in the world based on current and historical data and use it to predict the future ● Ability of a machine to get better at a task with data and experience ● Learn from mistakes and improve when given newer/more information
  • 17. Demand prediction Object detection/tracking Motion prediction Route planning Pick-up clustering Voice recognition Supply modeling Occupancy modeling Route planning, ETA, road modeling, low-latency image classifier Elasticity estimation, ETA, route optimization, demand prediction Speech generation, natural language generation, image classifiers, drop-off clustering
  • 18. 2. prototype 3. productionize 1. define 4. measure Launch and Iterate Typical ML Workflow
  • 19. UNDERSTAND BUSINESS NEED(S) DEFINE MINIMUM VIABLE PRODUCT (MVP) ○ Customers + cross-functional team ○ Define objectives and key results ○ Data-driven ○ Research ○ Ruthless prioritization 2. prototype 3. productionize 4. measure 1. define Problem Definition
  • 20. UNDERSTAND BUSINESS NEED(S) DEFINE MINIMUM VIABLE PRODUCT (MVP) 2. prototype 1. define GET DATA DATA PREPARATION TRAIN MODELS EVALUATE MODELS 3. productionize 4. measure validation computational cost interpretability SQL, Spark data cleansing and pre-processing, R / Python CPU or GPU Exploration
  • 21. UNDERSTAND BUSINESS NEED(S) 2. prototype 1. define DATA PREPARATION TRAIN MODELS EVALUATE MODELS 4. measure GET DATA PRODUCTIONIZE MODELS 3. productionize DEPLOY MODELS Engineers + Data Scientists, Java or Go, unit tests MAKE PREDICTIONS Real-time or batch Experimentation and rollout monitoring; Retraining strategy DEFINE MINIMUM VIABLE PRODUCT (MVP) Production
  • 22. UNDERSTAND BUSINESS NEED(S) DEFINE MINIMUM VIABLE PRODUCT (MVP) 2. prototype 1. define DATA PREPARATION TRAIN MODELS EVALUATE MODELS GET DATA DEPLOY MODELS PRODUCTIONIZE MODELS MONITOR PREDICTIONS 4. measure MAKE PREDICTIONS 3. productionize Automatically detect degradations GATHER AND ANALYZE INSIGHTS Deep-dive analyses inform future product roadmap Measure
  • 23. / Data Science Workbench
  • 24. Senior Software Engineer Previously: Big data and big network R&D in gaming, social media & finance Currently: Developer on Data Science Workbench / About Adam
  • 25. A growing Data Science community was facing problems with many aspects of their workflows Our world in 2016 NEW Getting Started Collaboration Shared Standards Moving Models to Production Scalability Available Features Data Access
  • 26. To unleash the productivity of Uber’s Data Science community Mission
  • 28. We Wanted More! ● Diverse customers working from same data ○ Data scientists ○ Developers ○ Interns ○ Operations ○ External parties ● Scalability with access to internal data, computation and accounts ● Acceptable licensing cost for large number of casual users
  • 29. Introducing Data Science Workbench eng.uber.com/dsw
  • 31. Our World Today Getting Started Collaboration Shared Standards Scalability Available Features Data Access Fully hosted 1-click Jupyter Notebook & RStudio IDE Pre-baked Environments Sharing options on notebooks; 1-click Shiny dashboard publication All internal data sources / Multi-DC / Secure / GDPR Compliant Various Session Sizes, Types (CPU, GPU)/Access to Compute Engines Documentation Support
  • 32. Common Use-Cases ● Large-scale data exploration ● Feature generation and model training ● Ad-hoc analysis and prototypes ● Review and collaboration
  • 34. RStudio and Shiny are trademarks of RStudio, Inc "Jupyter" is a trademark of the NumFOCUS foundation, of which Project Jupyter is a part. "Python" is a registered trademark of the PSF. The Python logos (in several variants) are use trademarks of the PSF as well.
  • 35. RStudio and Shiny are trademarks of RStudio, Inc
  • 39. The World of Tomorrow! Getting Started Collaboration Customized team environments Social media-like interface; more flexible dashboards Distributed deep learning Low friction workflow Available Features Available Features Moving Models to Production
  • 40. DSW Impact Safety Trip classification Risk Driver account check Driver referral risk scoring Uber Eats Restaurant recommendations Support NLP model for support tickets Operations Lifetime value (LTV) model more! And with that, I will pass you back to Atul to discuss the impact that DSW is having.
  • 41. / … one more thing
  • 42. We’re hiring! Excited to build the data platform that moves the world? Come join us! http://t.uber.com/datahire San Francisco, Palo Alto, Seattle, Bangalore
  • 43. Proprietary and confidential © 2018 Uber Technologies, Inc. All rights reserved. No part of this document may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval systems, without permission in writing from Uber. This document is intended only for the use of the individual or entity to whom it is addressed and contains information that is privileged, confidential or otherwise exempt from disclosure under applicable law. All recipients of this document are notified that the information contained herein includes proprietary and confidential information of Uber, and recipient may not make use of, disseminate, or in any way disclose this document or any of the enclosed information to any person other than employees of addressee to the extent necessary for consultations with authorized personnel of Uber. Thank you! and remember, t.uber.com/datahire Questions?