SlideShare a Scribd company logo
1 of 13
Donald Gennetten
Rahul Gupta
Data Engineer
Data Engineer
Using H2O for Mobile Transaction Forecasting &
Anomaly Detection
Problems are usually identifiable through
elevated failures or volume anomalies
Easy to detect, measure, and alert Hard to detect, measure, and alert
Elevated Failure Rate Low Volume Anomaly
No
AlertAlert
Why not set volume alerts?
Unlike failure alerts, volume-based thresholds
vary by event type, hour, minute, day of week,
week of the year, holiday, and much more.
100+ customer event types
x
24 hours/day
x
7 days/week
x
52 weeks/year
Over 873k distinct thresholds to calculate, set
and maintain.
Machine Learning should be used
when:
• You cannot effectively code the solution
• You cannot scale
Solving the problem required going
beyond modeling
Visualize/Alert Pilot
Develop
Platform
ModelingDefine DataIdentify Business Case
Our goal was to deliver Machine Learning for Production Monitoring that:
• Followed Governance Requirements
• Used Available Data Science and Machine Learning Resources
• Leveraged Platform Engineering and Open Source Technology
• Ensured Usability and Scalability
Sparkling Water allowed us to rapidly test
and deploy machine learning
• Sparkling Water combines the fast, scalable ML algorithms of H2O, the H2O Flow UI, Scala, and
Python with the capabilities of Apache Spark
• In-memory processing supports big data environment needs
• Spark + Python + Scala enables a unified coding pipeline
• Grid search options allow for greater efficiency
• Test models
• Optimize hyperparameters
• H2O Flow facilitates ad-hoc experimentation
• REST API is easily integrated into production software
GBM provided greater flexibility and
benefits over traditional methods
• Traditional time series techniques assume stationary data (no trends/seasonality), constant variance
over time
• Univariate time series consists of single, sequential observations over equal time increments
• GBM model accepts external explanatory variables
• # accounts having payment due
• Incidents
• Change orders
• Payment due dates
• GBM also enables data filtering/exclusion (e.g., incident data for training set)
We developed an open source, cloud-
based platform for rapid delivery
1 Retrieve volumes for training
2 Provide holiday and other static
data
3 Store forecast with
actual volumes
4 Detect & flag anomalies
Amazon S3 Sparkling Water InfluxDB Amazon EC2 Grafana
5 Display volumes, forecast & anomalies
What does it look like?
Monitoring teams are easily able to visually inspect forecasted and actual volumes in real-time
Forecasts are
available for future
dates to aid in capacity
planning
Now
What does anomalous volume look like?
Small changes in expected volume are easy to detect, measure, and alert
~12% of expected
events were missing
after a planned change
to the streaming data
platform
Alerts triggered due to lower than expected volume; Root cause analysis determined a platform
release was casing dropped data and a code roll back was required to resolve the issue
Does it improve incident detection times?
Anomaly detection alerts are sent ahead of escalation and detection times, including when other
alarms aren't triggered
Anomaly detected at
11:15 p.m. when Login
volumes spiked ~20k
higher than expected
Incident response teams were alerted at 11:17 p.m., more than 4 minutes before other incident
alarms
Solar events as a predictor?
Variation from predicted login volume was easily quantified during the August 21st solar eclipse;
Interest appears to have been lost within 15 minutes of totality
A
A. 12:06 p.m. EDT
(9:06 a.m. PDT)
the solar eclipse
starts in Salem,
Oregon
B. 2:41 p.m. EDT
(11:41 a.m. PDT)
totality begins in
Columbia, South
Carolina
C. 4:06 p.m. EDT
(1:06 p.m. PDT)
eclipse ends
B C
Variation from forecast

More Related Content

What's hot

H2O Driverless AI Workshop
H2O Driverless AI WorkshopH2O Driverless AI Workshop
H2O Driverless AI WorkshopSri Ambati
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Sri Ambati
 
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Sri Ambati
 
Challenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in ProductionChallenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in Productioniguazio
 
Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...
Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...
Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...Sri Ambati
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoSri Ambati
 
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...Sri Ambati
 
Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...Bas Geerdink
 
Building A Product Assortment Recommendation Engine
Building A Product Assortment Recommendation EngineBuilding A Product Assortment Recommendation Engine
Building A Product Assortment Recommendation EngineDatabricks
 
H2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicH2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicSri Ambati
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OSri Ambati
 
Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation PlatformKarthik Murugesan
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab CreateTuri, Inc.
 
Machine Learning with H2O
Machine Learning with H2OMachine Learning with H2O
Machine Learning with H2OSri Ambati
 
Practical Machine Learning
Practical Machine LearningPractical Machine Learning
Practical Machine LearningLynn Langit
 
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer insteadForget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer insteadData Con LA
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learningVinoth Kannan
 
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Sri Ambati
 
Storage Challenges for Production Machine Learning
Storage Challenges for Production Machine LearningStorage Challenges for Production Machine Learning
Storage Challenges for Production Machine LearningNisha Talagala
 

What's hot (20)

H2O Driverless AI Workshop
H2O Driverless AI WorkshopH2O Driverless AI Workshop
H2O Driverless AI Workshop
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
 
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
 
Challenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in ProductionChallenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in Production
 
Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...
Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...
Real-Time AI: Designing for Low Latency and High Throughput - Dr. Sergei Izra...
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
 
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...
 
Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...
 
Building A Product Assortment Recommendation Engine
Building A Product Assortment Recommendation EngineBuilding A Product Assortment Recommendation Engine
Building A Product Assortment Recommendation Engine
 
H2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicH2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom Kraljevic
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation Platform
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
 
Machine Learning with H2O
Machine Learning with H2OMachine Learning with H2O
Machine Learning with H2O
 
Practical Machine Learning
Practical Machine LearningPractical Machine Learning
Practical Machine Learning
 
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer insteadForget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learning
 
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
 
Storage Challenges for Production Machine Learning
Storage Challenges for Production Machine LearningStorage Challenges for Production Machine Learning
Storage Challenges for Production Machine Learning
 
Data engineering design patterns
Data engineering design patternsData engineering design patterns
Data engineering design patterns
 

Similar to Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One

Justifying Capacity Management Efforts
Justifying Capacity Management EffortsJustifying Capacity Management Efforts
Justifying Capacity Management EffortsPrecisely
 
Why advanced monitoring is key for healthy
Why advanced monitoring is key for healthyWhy advanced monitoring is key for healthy
Why advanced monitoring is key for healthyDenodo
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptQingsong Yao
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Sri Ambati
 
Understanding VMware Capacity
Understanding VMware CapacityUnderstanding VMware Capacity
Understanding VMware CapacityPrecisely
 
On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...Jorge Cardoso
 
Migrating to the Cloud – Is Application Performance Monitoring still required?
Migrating to the Cloud – Is Application Performance Monitoring still required?Migrating to the Cloud – Is Application Performance Monitoring still required?
Migrating to the Cloud – Is Application Performance Monitoring still required?eG Innovations
 
Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...
Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...
Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...Greg Eva
 
Building Reliability - The Realities of Observability
Building Reliability - The Realities of ObservabilityBuilding Reliability - The Realities of Observability
Building Reliability - The Realities of ObservabilityAll Things Open
 
Baltimore share point user group june 2015
Baltimore share point user group june 2015Baltimore share point user group june 2015
Baltimore share point user group june 2015Toby McGrail
 
Auditing in the Cloud
Auditing in the CloudAuditing in the Cloud
Auditing in the Cloudtcarrucan
 
(BDT316) Offloading ETL to Amazon Elastic MapReduce
(BDT316) Offloading ETL to Amazon Elastic MapReduce(BDT316) Offloading ETL to Amazon Elastic MapReduce
(BDT316) Offloading ETL to Amazon Elastic MapReduceAmazon Web Services
 
Introducing Elevate Capacity Management
Introducing Elevate Capacity ManagementIntroducing Elevate Capacity Management
Introducing Elevate Capacity ManagementPrecisely
 
Mission IT operations for a good night's sleep
Mission IT operations for a good night's sleepMission IT operations for a good night's sleep
Mission IT operations for a good night's sleepwwwally
 
Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...
Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...
Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...CloudSmartz
 
Oracle Demantra - Demand Planning Overview
Oracle Demantra - Demand Planning OverviewOracle Demantra - Demand Planning Overview
Oracle Demantra - Demand Planning OverviewMihir Jhaveri (26,000+)
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
I pushed in production :). Have a nice weekend
I pushed in production :). Have a nice weekendI pushed in production :). Have a nice weekend
I pushed in production :). Have a nice weekendNicolas Carlier
 
Making the Case for Legacy Data in Modern Data Analytics Platforms
Making the Case for Legacy Data in Modern Data Analytics PlatformsMaking the Case for Legacy Data in Modern Data Analytics Platforms
Making the Case for Legacy Data in Modern Data Analytics PlatformsPrecisely
 

Similar to Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One (20)

Justifying Capacity Management Efforts
Justifying Capacity Management EffortsJustifying Capacity Management Efforts
Justifying Capacity Management Efforts
 
Why advanced monitoring is key for healthy
Why advanced monitoring is key for healthyWhy advanced monitoring is key for healthy
Why advanced monitoring is key for healthy
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.ppt
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...
 
Understanding VMware Capacity
Understanding VMware CapacityUnderstanding VMware Capacity
Understanding VMware Capacity
 
On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...
 
Migrating to the Cloud – Is Application Performance Monitoring still required?
Migrating to the Cloud – Is Application Performance Monitoring still required?Migrating to the Cloud – Is Application Performance Monitoring still required?
Migrating to the Cloud – Is Application Performance Monitoring still required?
 
Agile planning and estimating
Agile planning and estimatingAgile planning and estimating
Agile planning and estimating
 
Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...
Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...
Capitaliser sur la valeur de l’IoT : comment démarrer sa transformation numér...
 
Building Reliability - The Realities of Observability
Building Reliability - The Realities of ObservabilityBuilding Reliability - The Realities of Observability
Building Reliability - The Realities of Observability
 
Baltimore share point user group june 2015
Baltimore share point user group june 2015Baltimore share point user group june 2015
Baltimore share point user group june 2015
 
Auditing in the Cloud
Auditing in the CloudAuditing in the Cloud
Auditing in the Cloud
 
(BDT316) Offloading ETL to Amazon Elastic MapReduce
(BDT316) Offloading ETL to Amazon Elastic MapReduce(BDT316) Offloading ETL to Amazon Elastic MapReduce
(BDT316) Offloading ETL to Amazon Elastic MapReduce
 
Introducing Elevate Capacity Management
Introducing Elevate Capacity ManagementIntroducing Elevate Capacity Management
Introducing Elevate Capacity Management
 
Mission IT operations for a good night's sleep
Mission IT operations for a good night's sleepMission IT operations for a good night's sleep
Mission IT operations for a good night's sleep
 
Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...
Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...
Disaster Recovery & Business Resilience Trends - CloudSmartz | Smarter Transf...
 
Oracle Demantra - Demand Planning Overview
Oracle Demantra - Demand Planning OverviewOracle Demantra - Demand Planning Overview
Oracle Demantra - Demand Planning Overview
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
I pushed in production :). Have a nice weekend
I pushed in production :). Have a nice weekendI pushed in production :). Have a nice weekend
I pushed in production :). Have a nice weekend
 
Making the Case for Legacy Data in Modern Data Analytics Platforms
Making the Case for Legacy Data in Modern Data Analytics PlatformsMaking the Case for Legacy Data in Modern Data Analytics Platforms
Making the Case for Legacy Data in Modern Data Analytics Platforms
 

More from Sri Ambati

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxSri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thSri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMsSri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the WaySri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OSri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersSri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email AgainSri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
 

More from Sri Ambati (20)

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation Journey
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One

  • 1.
  • 2. Donald Gennetten Rahul Gupta Data Engineer Data Engineer
  • 3. Using H2O for Mobile Transaction Forecasting & Anomaly Detection
  • 4. Problems are usually identifiable through elevated failures or volume anomalies Easy to detect, measure, and alert Hard to detect, measure, and alert Elevated Failure Rate Low Volume Anomaly No AlertAlert
  • 5. Why not set volume alerts? Unlike failure alerts, volume-based thresholds vary by event type, hour, minute, day of week, week of the year, holiday, and much more. 100+ customer event types x 24 hours/day x 7 days/week x 52 weeks/year Over 873k distinct thresholds to calculate, set and maintain. Machine Learning should be used when: • You cannot effectively code the solution • You cannot scale
  • 6. Solving the problem required going beyond modeling Visualize/Alert Pilot Develop Platform ModelingDefine DataIdentify Business Case Our goal was to deliver Machine Learning for Production Monitoring that: • Followed Governance Requirements • Used Available Data Science and Machine Learning Resources • Leveraged Platform Engineering and Open Source Technology • Ensured Usability and Scalability
  • 7. Sparkling Water allowed us to rapidly test and deploy machine learning • Sparkling Water combines the fast, scalable ML algorithms of H2O, the H2O Flow UI, Scala, and Python with the capabilities of Apache Spark • In-memory processing supports big data environment needs • Spark + Python + Scala enables a unified coding pipeline • Grid search options allow for greater efficiency • Test models • Optimize hyperparameters • H2O Flow facilitates ad-hoc experimentation • REST API is easily integrated into production software
  • 8. GBM provided greater flexibility and benefits over traditional methods • Traditional time series techniques assume stationary data (no trends/seasonality), constant variance over time • Univariate time series consists of single, sequential observations over equal time increments • GBM model accepts external explanatory variables • # accounts having payment due • Incidents • Change orders • Payment due dates • GBM also enables data filtering/exclusion (e.g., incident data for training set)
  • 9. We developed an open source, cloud- based platform for rapid delivery 1 Retrieve volumes for training 2 Provide holiday and other static data 3 Store forecast with actual volumes 4 Detect & flag anomalies Amazon S3 Sparkling Water InfluxDB Amazon EC2 Grafana 5 Display volumes, forecast & anomalies
  • 10. What does it look like? Monitoring teams are easily able to visually inspect forecasted and actual volumes in real-time Forecasts are available for future dates to aid in capacity planning Now
  • 11. What does anomalous volume look like? Small changes in expected volume are easy to detect, measure, and alert ~12% of expected events were missing after a planned change to the streaming data platform Alerts triggered due to lower than expected volume; Root cause analysis determined a platform release was casing dropped data and a code roll back was required to resolve the issue
  • 12. Does it improve incident detection times? Anomaly detection alerts are sent ahead of escalation and detection times, including when other alarms aren't triggered Anomaly detected at 11:15 p.m. when Login volumes spiked ~20k higher than expected Incident response teams were alerted at 11:17 p.m., more than 4 minutes before other incident alarms
  • 13. Solar events as a predictor? Variation from predicted login volume was easily quantified during the August 21st solar eclipse; Interest appears to have been lost within 15 minutes of totality A A. 12:06 p.m. EDT (9:06 a.m. PDT) the solar eclipse starts in Salem, Oregon B. 2:41 p.m. EDT (11:41 a.m. PDT) totality begins in Columbia, South Carolina C. 4:06 p.m. EDT (1:06 p.m. PDT) eclipse ends B C Variation from forecast