Aerospike for machine learning

Using Aerospike and Machine Learning
Brian Bulkowski
CTO, Founder
@bbulkow

© 2016 Aerospike Inc. All rights reserved.
What is Aerospike?
Large-scale DHT Database ( 10B ++ objects, 100T++, O(1) get / put )
… with queries, data structures, UDF, fast clients ...
... On Linux ...
High availability clustering & rebalancing ( proven 5 9’s, no load balancer )
Very high performance C code – reads and writes
( 2M++ TPS from Flash, 4M++ TPS from DRAM PER SERVER )
KVS++ provides query, UDF, table/columns, aggregations, SQL
Direct attach storage; persistence through replication and Flash
Cloud-savvy – runs with EC2, GCE others; Docker, more …
Dual License: Open Source for devs, Enterprise for deployment

Architecture Overview – Flash based system of engagement
LEGACY DATABASE
(Mainframe)
XDR
Decisioning Engine
DATA WAREHOUSE/
DATA LAKE
LEGACY RDBMS
HDFS BASED
BUSINESS
TRANSACTIONS
Web views
( Payments )
( Mobile Queries )
( Recommendation )
( And More )
High Performance NoSQL
“REAL-TIME BIG DATA”
“DECISIONING”
500 Business Trans per sec
× 5000 Calculations per sec
= 2.5 M Database Transactions per sec

CREDIT CARD
PROCESSING SYSTEM
FRAUD DETECTION &
PROTECTION APP
ACCOUNT
BEHAVIOR
ACCOUNT
STATISTICS
STATIC DATA
RULE 1 – PASSED ✔
RULE 2 – PASSED ✔
RULE 3 – FAILED ✗
HISTORICAL
DATA
RULES
RULE 1
RULE 2
RULE 3
…
Challenge
■ Overall SLA 750 ms
■ Loss of Business due to latency
■ Every Credit Card transaction requires
hundreds of DB reads/writes
Need to scale reliably
■ 10 → 100 TB
■ 10B → 100B objects
■ 200k → 1 Million+ TPS
Selected NoSQL
■ Built for Flash
■ Predictable Low latency at High Throughput
■ Immediate consistency, no data loss
■ Cross data center (XDR) support
■ 20 Server Cluster
■ Dell 730xd w/ 4NVMe SSDs
Example - Fraud Prevention

■ 3 node cluster, Intel S3700 SSDs
■ Followed religiously all DataStax recommendations
■ Standard YCSB, includes instructions to reproduce for your workload
■ http://www.aerospike.com/blog/comparing-nosql-databases-aerospike-and-cassandra/
Aerospike vs Cassandra ( 2016 )

Online Learning
Leveraging Aerospike to Power Real-time Analytics

Nielsen Marketing Cloud Webinar
Brent Keator
VP Infrastructure
Nielsen Marketing Cloud
Kevin Lyons
Senior VP Data Science
Nielsen Marketing Cloud
YouTube: Nielsen Marketing Cloud Aerospike Webinar 2016
Aerospike: https://aerospike.com/webinars

Models that build profitable marketing audiences at scale...
Finding more of your best
customers: High-income business
professional

The Modeling Process, simplified

2012
30 - 40 models
leveraging billions of events
Creating 100 million+ scores
2015
over 1000 models
‘leveraging’ trillions of events
Creating 150 billion+ scores / day
The Challenge

A system that creates as many models as we want, when
we want them, and that dynamically adapts in real time
to changing conditions
▪ Automatically creates, validates, ships, and
monitors models, with a capacity that scales
to 10s of thousands of models
The Opportunity
What we really need:

In other words, we simply need ….

Batch (Creation): Given a complete data set, a batch model is created in its entirety, all at once
Online Learning (Evolution): Online models evolve & adapt over time, in reaction to a changing environment, with each and every event
Introducing Online Learning

Batch Models (Offline)
● large-scale data storage
● large-scale data movement
● painful data aggregation
● lots of manual everything
● Harder to build models, but easier to evaluate
Online Learning
● limited data storage, mostly for monitoring
● event-level data streams
● light data aggregation
● lots of automatic everything
● Easier to build, but harder to evaluate (& support)
Batch Models (Offline) vs. Online Learning

● Outperformed both L2 and Elastic Net
● Leverages small (‘micro’) batches
● Validates and monitors models in real time
● Alerts team when models are not behaving
Some Techno Mumbo Jumbo
Stochastic gradient descent with L1 regularization
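
To make the bullets above concrete, here is a minimal sketch of micro-batch stochastic gradient descent with L1 regularization, written as a logistic regression in plain Python. The slides do not say which loss function, learning rate, or L1 update rule the team used, so the logistic loss, the proximal (soft-threshold) step, the class name `OnlineL1Logistic`, and the default hyperparameters are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


class OnlineL1Logistic:
    """Hypothetical model: logistic regression updated one micro-batch at a time."""

    def __init__(self, n_features, lr=0.1, l1=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr  # learning rate (assumed value)
        self.l1 = l1  # L1 penalty strength (assumed value)

    def predict_proba(self, x):
        return sigmoid(self.b + sum(wi * xi for wi, xi in zip(self.w, x)))

    def partial_fit(self, batch):
        """Update the model on one micro-batch of (features, label) pairs."""
        n = len(batch)
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for x, y in batch:
            err = self.predict_proba(x) - y  # gradient of log loss w.r.t. the logit
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        # Plain gradient step on the bias (left unpenalized here).
        self.b -= self.lr * grad_b / n
        # Gradient step followed by the L1 shrinkage step on each weight.
        for j in range(len(self.w)):
            stepped = self.w[j] - self.lr * grad_w[j] / n
            self.w[j] = soft_threshold(stepped, self.lr * self.l1)
```

Each call to `partial_fit` consumes one small batch and then discards it, so the model never needs the full data set in memory. The shrinkage step can set a weight to exactly zero, which is the usual reason L1 is chosen when feature selection matters.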

eXelate.com @eXelate
Technical Solutions
How do we do it?

eXpresso Serving Cluster
10B+ events/day
300+ nodes across
4 data centers
eXtream Modeling Cluster
160B models/day
100+ nodes across
4 data centers
JGroups Distributed Messaging
Serving Layer

Our Aerospike “Citrusleaf” Use-Cases
Unique User DataStore
53 Servers across
4 data centers
Specs
Memory: 512GB
CPU: e5-2620v2 (Dual-Socket)
Disk: Intel S3710 (13-15 1.2TB SSDs)
Network: Aggregated 10GB NICs
2-Namespaces
Online Learning (Models DataStore)
9 Servers across
3 data centers
Specs
Memory: 32GB
CPU: e5-2620 (Dual-Socket)
Disk: 1-240GB SSDs
Network: Aggregated 1GB NICs
1-Namespace
Online Learning

Batch Models (Offline)
● Batch
● Predefined ratio
● Predefined feature selection
● One-time validation
Online Learning
● Streaming
● Downsampling
● Automated feature selection
● Ongoing data cleaning
● Ongoing validation
The Online Learning Challenge

● All necessary data already exists in eXtream
● The cluster’s processing resources can be better utilized
● eXtream addresses most performance / scalability requirements
● Scoring mechanism already exists
eXtream as a Framework for Online Learning
Why it works...

● Labeling Mechanism - customer-defined target audience
Events Classification

● Downsampling mechanism
● Burst tolerance
● Duplicate entries
Dataset Preparation
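
As a rough illustration of two of the steps above, the sketch below drops duplicate events and downsamples the negative class while keeping every positive event. The slides do not describe the real mechanism, so the event shape `(event_id, features, label)`, the function name `prepare_stream`, and the `keep_negative_prob` parameter are hypothetical. Burst tolerance is not covered here.

```python
import random


def prepare_stream(events, keep_negative_prob, seed=0):
    """Yield de-duplicated events, keeping all positives and a random
    fraction of negatives.

    events: iterable of (event_id, features, label) tuples (assumed shape).
    keep_negative_prob: probability of keeping each negative event.
    """
    rng = random.Random(seed)
    # Unbounded set for simplicity; a long-running stream would need a
    # bounded or expiring structure instead.
    seen = set()
    for event_id, features, label in events:
        if event_id in seen:
            continue  # duplicate entry, skip it
        seen.add(event_id)
        if label == 1 or rng.random() < keep_negative_prob:
            yield event_id, features, label
```

Because the function is a generator, it processes one event at a time and fits naturally in front of a micro-batch trainer.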

● Blacklist
● Whitelist
● Automatic Tuning
Features Selection

● Sliding window of recent events
● 60/40 not-converted/converted ratio
● Various accuracy metrics (lift, precision, recall, confusion matrix)
● Decide if the model is ready for making predictions
Model Validation
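
The validation bullets above could be sketched as a small sliding-window tracker. This is an illustrative assumption, not the eXtream implementation: the class name `SlidingValidator`, the 0.5 score threshold, and the readiness cut-offs are all hypothetical, and lift is computed here using one common definition (precision divided by the positive base rate in the window).

```python
from collections import deque


class SlidingValidator:
    """Tracks (prediction, label) pairs over the most recent events."""

    def __init__(self, window=1000, threshold=0.5):
        # deque with maxlen drops the oldest event automatically.
        self.events = deque(maxlen=window)
        self.threshold = threshold

    def add(self, score, label):
        self.events.append((score >= self.threshold, label == 1))

    def metrics(self):
        tp = sum(1 for pred, actual in self.events if pred and actual)
        fp = sum(1 for pred, actual in self.events if pred and not actual)
        fn = sum(1 for pred, actual in self.events if not pred and actual)
        tn = len(self.events) - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        base_rate = (tp + fn) / len(self.events) if self.events else 0.0
        lift = precision / base_rate if base_rate else 0.0
        return {
            "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
            "precision": precision,
            "recall": recall,
            "lift": lift,
        }

    def ready(self, min_precision=0.6, min_recall=0.3, min_events=100):
        # Hypothetical readiness rule: enough recent events and
        # metrics above assumed cut-offs.
        m = self.metrics()
        return (
            len(self.events) >= min_events
            and m["precision"] >= min_precision
            and m["recall"] >= min_recall
        )
```

Calling `add` for each scored event and checking `ready()` periodically gives an ongoing answer to whether a model should start serving predictions.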

● Two phases (Scoring, Re-code)
● Scale vs Accuracy tradeoff
Predictions Mechanism

Scalability / Performance
Thousands of Concurrent Models: training, validation, scoring
High Throughput: billions of training events per day

Why do we need it?
● Store the models in one common place
● Persistence
● Built-in replication
Scalability / Performance
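
A minimal sketch of what "store the models in one common place" might look like against a key-value client. The helpers take any object exposing `put(key, bins)` and `get(key)` in the style of the Aerospike Python client, where a key is a `(namespace, set, primary_key)` tuple and `get` returns `(key, meta, bins)`. The namespace, set, and bin names are hypothetical; the slides do not say how the models are laid out.

```python
NAMESPACE = "models"  # hypothetical namespace name
SET_NAME = "online"   # hypothetical set name


def save_model(client, model_id, weights, bias):
    """Write one model's parameters as a single record."""
    key = (NAMESPACE, SET_NAME, model_id)
    bins = {"weights": list(weights), "bias": float(bias)}
    client.put(key, bins)


def load_model(client, model_id):
    """Read a model's parameters back; returns (weights, bias)."""
    key = (NAMESPACE, SET_NAME, model_id)
    _key, _meta, bins = client.get(key)
    return bins["weights"], bins["bias"]


# With the real client, a connection would typically be created like this
# (host and port are placeholders):
#   import aerospike
#   client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
```

Keeping each model in one record means a scoring node can fetch the latest parameters with a single read, and replication of that record is handled by the cluster rather than by application code.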

XDR Replication Map
Inter-DC Network
Bare-Metal Cloud
LVS/GSLB/XDR =
HA
Online Learning Datastore
Replication

Monitoring - Why do we need it?
thousands of models
automatically created by users
some models won’t converge

eXelate.com @eXelate
Thank You!
