This talk describes a production environment that hosts a large random forest model on a cluster of MLeap runtimes. A microservice architecture with a Postgres database backend manages configuration. The architecture provides full traceability and model governance through the entire lifecycle while cutting execution time by nearly two-thirds.

Kount provides certainty in digital interactions like online credit card transactions. Our production environment has extreme requirements for availability: we process hundreds of transactions per second, have no scheduled downtime, and achieve 99.99% annual uptime. One of our scores uses a random forest classifier with 250 trees and 100,000 nodes per tree. Our original implementation serialized a scikit-learn model, which by itself takes 1 GB in memory. It required exactly identical environments in training, where the model was serialized, and in production, where it was deserialized and evaluated. This is risky when maintaining high uptime with no planned downtime.

The improved solution load balances across a cluster of API servers hosting MLeap runtimes. These model execution runtimes scale separately from the data pre-processing pipeline, which is the more expensive step in our application. Each pre-processing application is connected to multiple MLeap runtimes to provide complete redundancy and independent scaling.

We extend model governance into the production environment using a set of services wrapped around a Postgres backend. These services manage model promotion and each model's role across several production, QA, and integration environments.

Finally, we describe a "shadow" pipeline in production that can replace any or all portions of transaction evaluation with alternative models and software. A Kafka message bus provides copies of live production transactions to the shadow servers, where results are logged for analysis. Since this shadow environment is managed through the same services, code and models can be directly promoted or retired after being test-run on live data streams.
Speaker: Noah Pritikin
2. Machine Learning Applications
“Real-Time”:
• Detecting credit-card fraud
• Financial markets
• Online advertising
• Recommender systems
• Robotics
• …
Not “Real-Time”:
• Agriculture
• Automated medical diagnosis
• Computer vision
• Insurance
• Marketing
• Sentiment analysis
• User behavior analytics
• Weather forecasting
• …
I am defining “Real-Time” as <100 ms for the context of this presentation.
3. Agenda
What is Kount?
Data Pipeline Context
“Real-Time” Architecture / Model Governance
Statistical Metrics and Monitoring
Q&A
5. Fighting Fraud, Boosting Revenue
Industry-Leading Technology & Experience
Developing fraud-fighting technology since 1999
AI/Machine Learning Implemented in 2007
Dozens of Patented Technologies
Continuous Innovation
A SaaS-based, all-in-one fraud mitigation platform safeguarding some of the world’s largest:
• Merchants
• Payment Service Providers
• Ecommerce Platforms
$80M Investment from CVC Growth Partners
7. Data Pipeline Context
Highly-available client-facing infrastructure / services → Kount Data Lake → Data Science (“Magical Fairy Dust!”) → Machine Learning Model (MLeap Pipeline) → Machine Learning Execution Platform (MLeap API Servers)
9. First iteration was our baseline for improvement.
We were faced with a technical problem to solve…
Kount Boost Technology™ was released to production in October 2017.
First iteration of the architecture based on Python3 / Scikit-learn worked, but…
• Lacked portability
• Challenging to scale into the future
• Lacked multiple model support
• Limited model governance
Built in-house Apache Spark cluster in January 2018.
• Began iterating on Boost Technology™ model improvements (e.g. feature engineering, tuning model hyperparameters, etc.).
Spark ML-generated models depend on a SparkContext, but “real-time” predictions are required!
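MLeap sidesteps the SparkContext dependency by serializing the fitted pipeline to a portable bundle that a standalone runtime evaluates, so a prediction becomes a plain HTTP call carrying a LeapFrame (MLeap's JSON frame of schema plus rows). A minimal stdlib sketch of building such a request — the feature names, types, and endpoint are hypothetical, not Kount's actual schema:

```python
import json

def make_leap_frame(features: dict) -> dict:
    """Build a LeapFrame-style JSON payload (schema + rows) from feature values.

    The double/string typing below is illustrative; a real schema mirrors the
    training pipeline's input columns exactly.
    """
    fields, row = [], []
    for name, value in features.items():
        fields.append({"name": name,
                       "type": "double" if isinstance(value, (int, float)) else "string"})
        row.append(value)
    return {"schema": {"fields": fields}, "rows": [row]}

# Hypothetical feature vector for one transaction:
frame = make_leap_frame({"amount": 125.50, "card_country": "US", "velocity_1h": 3.0})
payload = json.dumps(frame)
# An MLeap API server hosting the bundle would score this via an HTTP POST, e.g.
# (endpoint illustrative):
#   urllib.request.Request("http://mleap-api/transform", data=payload.encode())
```

Because the runtime only needs the bundle and this JSON contract, the training environment and the serving environment no longer have to match library-for-library.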
10. “Real-Time” Architecture Overview
Feature Extraction separated from Transaction Prediction
Hosting multiple models allows for blue-green deployments
Centralized model governance
Load balancer deployed in a “sidecar proxy” implementation, allowing for a simpler Feature Extraction instance design
• Backend health checks make a prediction on a test transaction
MLeap API instances run a GC-optimized Java 8 configuration
JVM metrics (e.g. Jolokia, etc.)
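The "health check makes a prediction" idea can be sketched as scoring a canned test transaction and validating the response before the sidecar admits the backend to rotation. Everything here — the frame contents, the score living in the last column, and the [0, 1] range — is an assumption for illustration:

```python
import json

# Canned test transaction the health check scores on every probe (values illustrative).
TEST_FRAME = {"schema": {"fields": [{"name": "amount", "type": "double"}]},
              "rows": [[42.0]]}

def backend_is_healthy(response_body: str) -> bool:
    """Return True if the MLeap instance produced a sane prediction.

    Reads the last column of the returned row as the model output; a real
    check would target the pipeline's named output column instead.
    """
    try:
        frame = json.loads(response_body)
        score = frame["rows"][0][-1]
    except (ValueError, KeyError, IndexError):
        return False
    return isinstance(score, (int, float)) and 0.0 <= score <= 1.0

# The sidecar would POST TEST_FRAME to the local instance and feed the body in:
assert backend_is_healthy('{"rows": [[42.0, 0.87]]}')
assert not backend_is_healthy("upstream timeout")
```

Checking an actual prediction (rather than just a TCP connect) catches a backend whose JVM is up but whose model failed to load.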
12. Dark Production Infrastructure
An entirely separate parallel infrastructure in production
NO customer impact
NO “real-time” requirements
Parallelization is implemented via a message bus (e.g. Kafka, Kinesis, ZeroMQ, etc.)
Optimize cost by processing only a fraction of production traffic (e.g. 1/3)
Only logs the raw predictions returned from MLeap for later analysis
Dark production infrastructure enables model governance / validation.
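One way to process only a fraction of traffic is to hash the transaction ID, so the same transaction deterministically lands in (or out of) the shadow sample regardless of which consumer sees it. A stdlib sketch using the 1/3 ratio from the slide — the sampling scheme and the commented consumer loop are illustrative, not Kount's implementation:

```python
import hashlib

SHADOW_FRACTION = 3  # process ~1/3 of production traffic (per the slide)

def in_shadow_sample(transaction_id: str) -> bool:
    """Deterministically select ~1/N of transactions by hashing their ID."""
    digest = hashlib.sha256(transaction_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % SHADOW_FRACTION == 0

# In the dark pipeline, a message-bus consumer (Kafka/Kinesis/ZeroMQ) would do
# something like (function names hypothetical):
#   for msg in consumer:
#       if in_shadow_sample(msg.key):
#           log_raw_prediction(score_with_candidate_model(msg.value))
sampled = sum(in_shadow_sample(f"txn-{i}") for i in range(30_000))
print(sampled / 30_000)  # close to 1/3
```

Hash-based sampling keeps the shadow result set reproducible across replays of the same message stream, which matters when comparing candidate models on identical traffic.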
13. Tools Enabling Model Governance
Centrally track state of machine learning models – end-to-end!
1. Train model & verify quality
2. Add model to governance data store
3. Deploy model to dark production infrastructure (MLeap API instances)
4. Dark production infrastructure test?
   • Bad → back to training
   • Good → continue
5. Deploy to available production MLeap API instances
6. Migrate production traffic to the MLeap API instances hosting the new model
7. Replaced a model?
   • Yes → unload the retired model from MLeap API instances
   • No → end
8. End
15. “Real-Time” Architecture Performance – Transforming LEAP frames
This is NOT machine learning model performance (e.g. TOC curve, ROC curve, PR curve, etc.)
A “real-time” system requires metrics that measure its systemic performance.
16. Averages + Distributions!
Due to “real-time” requirements, averages don’t cut it (by themselves…)
Distributions provide critical visibility when monitoring low-latency systems.
17. Applied Statistics
Boost without MLeap (previous):
Average: 19.27 ms | 95th percentile: 24 ms | 99th percentile: 37 ms | Standard deviation: 5.31 ms
Boost with MLeap (current):
Average: 7.00 ms | 95th percentile: 9 ms | 99th percentile: 16 ms | Standard deviation: 2.41 ms
Improvement with MLeap! The 99th percentile saw a ~56% improvement!
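These metrics are cheap to compute, and doing so shows why averages alone mislead: two latency samples can have similar means while their 99th percentiles differ wildly. A stdlib sketch with synthetic data (not Kount's measurements); the nearest-rank percentile here is one simple convention among several:

```python
import statistics

def latency_report(samples_ms):
    """Average, tail percentiles, and spread for a latency sample (milliseconds)."""
    xs = sorted(samples_ms)
    def pct(p):
        # Nearest-rank percentile; monitoring systems vary in exact method.
        return xs[min(len(xs) - 1, int(p / 100 * len(xs)))]
    return {"avg": statistics.fmean(xs), "p95": pct(95), "p99": pct(99),
            "stdev": statistics.stdev(xs)}

steady = [7.0] * 100                  # flat 7 ms service
spiky = [5.0] * 98 + [60.0, 45.0]     # lower average, ugly tail
print(latency_report(steady))  # avg 7.0, p99 7.0
print(latency_report(spiky))   # avg below 7.0, but p99 is 60.0
```

The spiky service "wins" on average yet violates a <100 ms-style SLO far more often at the tail, which is exactly what the distribution view surfaces.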
18. Consider Improvements to Your “Real-Time” Architecture!
MLeap…
Model governance…
Dark Production Infrastructure (assisting with model testing)…
Latency Metrics (emphasize the use of distributions)…
Further reading…
• “Deploying Apache Spark Supervised Machine Learning Models to Production with MLeap” - https://medium.com/@combust/9e0fb57f79db
• MLeap GitHub repo - https://github.com/combust/mleap
• MLeap documentation - http://mleap-docs.combust.ml/