SlideShare ist ein Scribd-Unternehmen logo
1 von 18
Downloaden Sie, um offline zu lesen
© 2015 PayPal Inc. All rights reserved..
Data Science with Big Data
in a Corporate Environment
Tasks, Challenges, and Tradeoffs
Nachum Shacham
Principal Data Scientist
PayPal
11/10/2015
© 2015 PayPal Inc. All rights reserved.
Data Science From 10K ft
2
ML Algorithm Design
Optimization
Scores
Performance
Charts
Training
set
Test
set(s)
Model
Fitting
Validate, calculate KPIs
© 2015 PayPal Inc. All rights reserved.
Predictive Modeling Meets Big Data:Bridging the Gap
3
Data
Warehouse
Model(s) selection
and tuning
Data Munging
Deploy
Feature
Engineering
Observational
big-data
Data
Exploration
Validation
Data
Sourcing
Deep, wide, distributed
,sparse, partiallyunknown,
redundant,unreliable
• Unified  format  (“data  frame”)
• Well  understood
• Statistically  known  and  “well  behaved”
Data Science with Big Data: Quantitative change à Process adjustment
© 2015 PayPal Inc. All rights reserved. 4
Predictive Models for Serving Customers
Rewards
Tech
Support
Customer’s
future
needs
Unblanced
Precision/recall
latency
Personalization
Model
Organic
External
Cohort
Info
gathering
vs UX
Features
DATA
Metrics
Deplyment
© 2015 PayPal Inc. All rights reserved.
Data Exploration: Knowing What’s in the Data and Fitting the Pieces Together
5
Understaning Data Contents &
meaning
Big data visualization
Units
Language Join keys
Null values: codes, quantities
Correlations
Distributions and outliers
© 2015 PayPal Inc. All rights reserved. 6
Data Munging: Reshaping the Data à Decisions, Decisions
Unify values
Money, time
Dummy
variables
Sub
groups
categorical
variable
Missing values
Join keys do
not match
Unify null symbols
Impute / discard
Reformat
keys
Ignore
rows
DiscardTransform
Nullify
Irrelevant/
redundant
values
Transform
Remove
outliers
Standardize
Skewed
distributions
© 2015 PayPal Inc. All rights reserved.
•R
•data.table
•Dplyr
•Lubridate
•complete
•{t, l, s}apply
•{g}sub
•grep{l}
•ggplot2
7
Data Munging Packages & Functions
•Python
•pandas
•re
•Numpy
•Matplotlib
•csv
•datetime
© 2015 PayPal Inc. All rights reserved. 8
Feature Engineering
Transform raw data to better represent the problem to the model
Better features à Simpler models, better results
Manual Feature Manipulation
Feature learning
algorithms
Integrated in
model fitting
• Dummy  variables
• Text  integration
• Sub/super  sampling  
(unbalanced  sets)
© 2015 PayPal Inc. All rights reserved. 9
Model Tuning
Learning Rate
Parameter Grid
Ensemble: degree, type
Interpretability vs Accuacy
© 2015 PayPal Inc. All rights reserved.
Model Building Using H2O
10
Rich library of ML algorithms Auto detect feature
format
Interoperate with
ordinary R: split
work for complete
munging Runs through
parameter grid
Big Server
© 2015 PayPal Inc. All rights reserved.
• Who should judge the value of the results?
•Evaluation criteria
• What are the cost/value of FP, TP, FN, TP?
•FP: PR cost
•FN: Financialloss
•TP: Intended gain
• ContinuousModel evaluation in
production
11
Model Evaluation
© 2015 PayPal Inc. All rights reserved.
• Model results’ deployment destinations
•Humans (decision makers, analysts)
•Computers (recommendations)
•Databases (scores, leads)
• Latencies: hours to milliseconds
•From events to model
•From model output to action
12
Model Deployment
© 2015 PayPal Inc. All rights reserved.
• Multiple aligned plots
• Large number of data points on frame
• Statisticalplots
• Network plots
• Model fitting on plot data
13
Big Data Visualization
© 2015 PayPal Inc. All rights reserved.
• Detection of concept drift post deployment
•Changein feature set distribution
•Changein conditionalprobability P(y | X)
• Auto correction for concept drift
•Repeat training on time window of recent data
•Time window: fixed vs. variable size
• Changedetection
•CUSUM
•StatisticalProcess Control
14
Model Drift
© 2015 PayPal Inc. All rights reserved.
Transaction & Behavioral: Site Speed impact
15
NOT  ALL  SPEED  IMPROVEMENTS  
ARE  CREATED  EQUAL
load  time
Bounce  rate
© 2015 PayPal Inc. All rights reserved.
System Log Data:Query Cost Comparison
16
TCO CONSUMPTION EXEC  COST
© 2015 PayPal Inc. All rights reserved.
Summary: The “Extra” Steps Make The Difference
17
Details: the Critical Factor
Platform à
Performance
Decisions:
understand the
impact Delayed
Incomplete
Irrelevant
© 2015 PayPal Inc. All rights reserved.
THANK YOU
18

Weitere ähnliche Inhalte

Was ist angesagt?

H2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno CandelH2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno CandelSri Ambati
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 Dataiku
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSSri Ambati
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsDomino Data Lab
 
Dealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AIDealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AIData Products Meetup
 
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignReal-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignJuliet Hougland
 
Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014
Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014
Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014The Hive
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...Dataiku
 
Summary introduction to data engineering
Summary introduction to data engineeringSummary introduction to data engineering
Summary introduction to data engineeringNovita Sari
 
Machine Learning with Big Data using Apache Spark
Machine Learning with Big Data using Apache SparkMachine Learning with Big Data using Apache Spark
Machine Learning with Big Data using Apache SparkInSemble
 
Jeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityJeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityDatabricks
 
FrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and CheaplyFrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and CheaplyDatabricks
 
Scaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With LuminaireScaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With LuminaireDatabricks
 
Wizard Driven AI Anomaly Detection with Databricks in Azure
Wizard Driven AI Anomaly Detection with Databricks in AzureWizard Driven AI Anomaly Detection with Databricks in Azure
Wizard Driven AI Anomaly Detection with Databricks in AzureDatabricks
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...Domino Data Lab
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaDatabricks
 

Was ist angesagt? (20)

H2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno CandelH2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno Candel
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
 
Dealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AIDealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AI
 
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignReal-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
 
Before Kaggle
Before KaggleBefore Kaggle
Before Kaggle
 
Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014
Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014
Agile Data Science by Russell Jurney_ The Hive_Janruary 29 2014
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
 
Summary introduction to data engineering
Summary introduction to data engineeringSummary introduction to data engineering
Summary introduction to data engineering
 
Machine Learning with Big Data using Apache Spark
Machine Learning with Big Data using Apache SparkMachine Learning with Big Data using Apache Spark
Machine Learning with Big Data using Apache Spark
 
Jeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityJeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and Quality
 
Architecting for Data Science
Architecting for Data ScienceArchitecting for Data Science
Architecting for Data Science
 
FrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and CheaplyFrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and Cheaply
 
Scaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With LuminaireScaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With Luminaire
 
Wizard Driven AI Anomaly Detection with Databricks in Azure
Wizard Driven AI Anomaly Detection with Databricks in AzureWizard Driven AI Anomaly Detection with Databricks in Azure
Wizard Driven AI Anomaly Detection with Databricks in Azure
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 

Andere mochten auch

Big Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, PaypalBig Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, PaypalInnovation Enterprise
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterMat Keep
 
Driving Sales Execution with Compelling Messaging
Driving Sales Execution with Compelling MessagingDriving Sales Execution with Compelling Messaging
Driving Sales Execution with Compelling MessagingRussell Scherwin
 
Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1GurinderG
 
EAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using HadoopEAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using HadoopDataWorks Summit
 
Big- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPalBig- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPalCodemotion Tel Aviv
 
H2O World - Generalized Low Rank Models - Madeleine Udell
H2O World - Generalized Low Rank Models - Madeleine UdellH2O World - Generalized Low Rank Models - Madeleine Udell
H2O World - Generalized Low Rank Models - Madeleine UdellSri Ambati
 
H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...
H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...
H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...Sri Ambati
 
H2O World - GLM - Tomas Nykodym
H2O World - GLM - Tomas NykodymH2O World - GLM - Tomas Nykodym
H2O World - GLM - Tomas NykodymSri Ambati
 
H2O World - NCS Continuous Media Optimization w/H2O - Satya Satyamoorthy
H2O World - NCS Continuous Media Optimization w/H2O - Satya SatyamoorthyH2O World - NCS Continuous Media Optimization w/H2O - Satya Satyamoorthy
H2O World - NCS Continuous Media Optimization w/H2O - Satya SatyamoorthySri Ambati
 
PayPal Real Time Analytics
PayPal  Real Time AnalyticsPayPal  Real Time Analytics
PayPal Real Time AnalyticsAnil Madan
 
2014 09 30_sparkling_water_hands_on
2014 09 30_sparkling_water_hands_on2014 09 30_sparkling_water_hands_on
2014 09 30_sparkling_water_hands_onSri Ambati
 
PayPal Behavioral Analytics on Hadoop
PayPal Behavioral Analytics on HadoopPayPal Behavioral Analytics on Hadoop
PayPal Behavioral Analytics on HadoopDataWorks Summit
 
H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...
H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...
H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...Sri Ambati
 
Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...
Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...
Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...Sri Ambati
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphDataWorks Summit
 
H2O World - Top 10 Data Science Pitfalls - Mark Landry
H2O World - Top 10 Data Science Pitfalls - Mark LandryH2O World - Top 10 Data Science Pitfalls - Mark Landry
H2O World - Top 10 Data Science Pitfalls - Mark LandrySri Ambati
 
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...
eCommerce and ePayments markets in Russia : trends , analytics , perspect...Data Insight
 
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelH2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelSri Ambati
 
H2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing ZhaoH2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing ZhaoSri Ambati
 

Andere mochten auch (20)

Big Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, PaypalBig Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, Paypal
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL Cluster
 
Driving Sales Execution with Compelling Messaging
Driving Sales Execution with Compelling MessagingDriving Sales Execution with Compelling Messaging
Driving Sales Execution with Compelling Messaging
 
Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1
 
EAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using HadoopEAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using Hadoop
 
Big- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPalBig- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPal
 
H2O World - Generalized Low Rank Models - Madeleine Udell
H2O World - Generalized Low Rank Models - Madeleine UdellH2O World - Generalized Low Rank Models - Madeleine Udell
H2O World - Generalized Low Rank Models - Madeleine Udell
 
H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...
H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...
H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge ...
 
H2O World - GLM - Tomas Nykodym
H2O World - GLM - Tomas NykodymH2O World - GLM - Tomas Nykodym
H2O World - GLM - Tomas Nykodym
 
H2O World - NCS Continuous Media Optimization w/H2O - Satya Satyamoorthy
H2O World - NCS Continuous Media Optimization w/H2O - Satya SatyamoorthyH2O World - NCS Continuous Media Optimization w/H2O - Satya Satyamoorthy
H2O World - NCS Continuous Media Optimization w/H2O - Satya Satyamoorthy
 
PayPal Real Time Analytics
PayPal  Real Time AnalyticsPayPal  Real Time Analytics
PayPal Real Time Analytics
 
2014 09 30_sparkling_water_hands_on
2014 09 30_sparkling_water_hands_on2014 09 30_sparkling_water_hands_on
2014 09 30_sparkling_water_hands_on
 
PayPal Behavioral Analytics on Hadoop
PayPal Behavioral Analytics on HadoopPayPal Behavioral Analytics on Hadoop
PayPal Behavioral Analytics on Hadoop
 
H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...
H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...
H2O World - A Look Under Progressive's Big Data Hood - Pawan Divakarla & Bria...
 
Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...
Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...
Driving In-Store Sales with Real-Time Personalization - Cyril Nigg, Catalina ...
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
 
H2O World - Top 10 Data Science Pitfalls - Mark Landry
H2O World - Top 10 Data Science Pitfalls - Mark LandryH2O World - Top 10 Data Science Pitfalls - Mark Landry
H2O World - Top 10 Data Science Pitfalls - Mark Landry
 
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
 
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelH2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
 
H2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing ZhaoH2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing Zhao
 

Ähnlich wie H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham

ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...DATAVERSITY
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudCloudera, Inc.
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsDenodo
 
How to add Artificial Intelligence Capabilities to Existing Software Platforms
How to add Artificial Intelligence Capabilities to Existing Software PlatformsHow to add Artificial Intelligence Capabilities to Existing Software Platforms
How to add Artificial Intelligence Capabilities to Existing Software PlatformsHarish Nalagandla
 
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughtonReal-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughtonSynerzip
 
10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the GlobeDataWorks Summit
 
How to Revamp your Legacy Applications For More Agility and Better Service - ...
How to Revamp your Legacy Applications For More Agility and Better Service - ...How to Revamp your Legacy Applications For More Agility and Better Service - ...
How to Revamp your Legacy Applications For More Agility and Better Service - ...NRB
 
Come fare business con i big data in concreto
Come fare business con i big data in concretoCome fare business con i big data in concreto
Come fare business con i big data in concretoHP Enterprise Italia
 
Extending BI with Big Data Analytics
Extending BI with Big Data AnalyticsExtending BI with Big Data Analytics
Extending BI with Big Data AnalyticsDatameer
 
Data Analytics in Digital Transformation
Data Analytics in Digital TransformationData Analytics in Digital Transformation
Data Analytics in Digital TransformationMukund Babbar
 
Kamanja: Driving Business Value through Real-Time Decisioning Solutions
Kamanja: Driving Business Value through Real-Time Decisioning SolutionsKamanja: Driving Business Value through Real-Time Decisioning Solutions
Kamanja: Driving Business Value through Real-Time Decisioning SolutionsGreg Makowski
 
The Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science TeamThe Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science TeamSenturus
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyDataWorks Summit
 
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...DataWorks Summit
 
ICP for Data- Enterprise platform for AI, ML and Data Science
ICP for Data- Enterprise platform for AI, ML and Data ScienceICP for Data- Enterprise platform for AI, ML and Data Science
ICP for Data- Enterprise platform for AI, ML and Data ScienceKaran Sachdeva
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityDATAVERSITY
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data editionMark Kerzner
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarImpetus Technologies
 

Ähnlich wie H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham (20)

ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
How to add Artificial Intelligence Capabilities to Existing Software Platforms
How to add Artificial Intelligence Capabilities to Existing Software PlatformsHow to add Artificial Intelligence Capabilities to Existing Software Platforms
How to add Artificial Intelligence Capabilities to Existing Software Platforms
 
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughtonReal-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
 
10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe
 
How to Revamp your Legacy Applications For More Agility and Better Service - ...
How to Revamp your Legacy Applications For More Agility and Better Service - ...How to Revamp your Legacy Applications For More Agility and Better Service - ...
How to Revamp your Legacy Applications For More Agility and Better Service - ...
 
Big data for Telco: opportunity or threat?
Big data for Telco: opportunity or threat?Big data for Telco: opportunity or threat?
Big data for Telco: opportunity or threat?
 
Come fare business con i big data in concreto
Come fare business con i big data in concretoCome fare business con i big data in concreto
Come fare business con i big data in concreto
 
Extending BI with Big Data Analytics
Extending BI with Big Data AnalyticsExtending BI with Big Data Analytics
Extending BI with Big Data Analytics
 
Data Analytics in Digital Transformation
Data Analytics in Digital TransformationData Analytics in Digital Transformation
Data Analytics in Digital Transformation
 
Kamanja: Driving Business Value through Real-Time Decisioning Solutions
Kamanja: Driving Business Value through Real-Time Decisioning SolutionsKamanja: Driving Business Value through Real-Time Decisioning Solutions
Kamanja: Driving Business Value through Real-Time Decisioning Solutions
 
The Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science TeamThe Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science Team
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
 
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
 
ICP for Data- Enterprise platform for AI, ML and Data Science
ICP for Data- Enterprise platform for AI, ML and Data ScienceICP for Data- Enterprise platform for AI, ML and Data Science
ICP for Data- Enterprise platform for AI, ML and Data Science
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data edition
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 

Mehr von Sri Ambati

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxSri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thSri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMsSri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the WaySri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OSri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersSri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email AgainSri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
 

Mehr von Sri Ambati (20)

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation Journey
 

Kürzlich hochgeladen

CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 

Kürzlich hochgeladen (20)

CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 

H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham

  • 1. © 2015 PayPal Inc. All rights reserved.. Data Science with Big Data in a Corporate Environment Tasks, Challenges, and Tradeoffs Nachum Shacham Principal Data Scientist PayPal 11/10/2015
  • 2. © 2015 PayPal Inc. All rights reserved. Data Science From 10K ft 2 ML Algorithm Design Optimization Scores Performance Charts Training set Test set(s) Model Fitting Validate, calculate KPIs
  • 3. © 2015 PayPal Inc. All rights reserved. Predictive Modeling Meets Big Data:Bridging the Gap 3 Data Warehouse Model(s) selection and tuning Data Munging Deploy Feature Engineering Observational big-data Data Exploration Validation Data Sourcing Deep, wide, distributed ,sparse, partiallyunknown, redundant,unreliable • Unified  format  (“data  frame”) • Well  understood • Statistically  known  and  “well  behaved” Data Science with Big Data: Quantitative change à Process adjustment
  • 4. © 2015 PayPal Inc. All rights reserved. 4 Predictive Models for Serving Customers Rewards Tech Support Customer’s future needs Unblanced Precision/recall latency Personalization Model Organic External Cohort Info gathering vs UX Features DATA Metrics Deplyment
  • 5. © 2015 PayPal Inc. All rights reserved. Data Exploration: Knowing What’s in the Data and Fitting the Pieces Together 5 Understaning Data Contents & meaning Big data visualization Units Language Join keys Null values: codes, quantities Correlations Distributions and outliers
  • 6. © 2015 PayPal Inc. All rights reserved. 6 Data Munging: Reshaping the Data à Decisions, Decisions Unify values Money, time Dummy variables Sub groups categorical variable Missing values Join keys do not match Unify null symbols Impute / discard Reformat keys Ignore rows DiscardTransform Nullify Irrelevant/ redundant values Transform Remove outliers Standardize Skewed distributions
  • 7. © 2015 PayPal Inc. All rights reserved. •R •data.table •Dplyr •Lubridate •complete •{t, l, s}apply •{g}sub •grep{l} •ggplot2 7 Data Munging Packages & Functions •Python •pandas •re •Numpy •Matplotlib •csv •datetime
  • 8. © 2015 PayPal Inc. All rights reserved. 8 Feature Engineering Transform raw data to better represent the problem to the model Better features à Simpler models, better results Manual Feature Manipulation Feature learning algorithms Integrated in model fitting • Dummy  variables • Text  integration • Sub/super  sampling   (unbalanced  sets)
  • 9. © 2015 PayPal Inc. All rights reserved. 9 Model Tuning Learning Rate Parameter Grid Ensemble: degree, type Interpretability vs Accuacy
  • 10. © 2015 PayPal Inc. All rights reserved. Model Building Using H2O 10 Rich library of ML algorithms Auto detect feature format Interoperate with ordinary R: split work for complete munging Runs through parameter grid Big Server
  • 11. © 2015 PayPal Inc. All rights reserved. • Who should judge the value of the results? •Evaluation criteria • What are the cost/value of FP, TP, FN, TP? •FP: PR cost •FN: Financialloss •TP: Intended gain • ContinuousModel evaluation in production 11 Model Evaluation
  • 12. © 2015 PayPal Inc. All rights reserved. • Model results’ deployment destinations •Humans (decision makers, analysts) •Computers (recommendations) •Databases (scores, leads) • Latencies: hours to milliseconds •From events to model •From model output to action 12 Model Deployment
  • 13. © 2015 PayPal Inc. All rights reserved. • Multiple aligned plots • Large number of data points on frame • Statisticalplots • Network plots • Model fitting on plot data 13 Big Data Visualization
  • 14. © 2015 PayPal Inc. All rights reserved. • Detection of concept drift post deployment •Changein feature set distribution •Changein conditionalprobability P(y | X) • Auto correction for concept drift •Repeat training on time window of recent data •Time window: fixed vs. variable size • Changedetection •CUSUM •StatisticalProcess Control 14 Model Drift
  • 15. © 2015 PayPal Inc. All rights reserved. Transaction & Behavioral: Site Speed impact 15 NOT  ALL  SPEED  IMPROVEMENTS   ARE  CREATED  EQUAL load  time Bounce  rate
  • 16. © 2015 PayPal Inc. All rights reserved. System Log Data:Query Cost Comparison 16 TCO CONSUMPTION EXEC  COST
  • 17. © 2015 PayPal Inc. All rights reserved. Summary: The “Extra” Steps Make The Difference 17 Details: the Critical Factor Platform à Performance Decisions: understand the impact Delayed Incomplete Irrelevant
  • 18. © 2015 PayPal Inc. All rights reserved. THANK YOU 18