building intelligent data products
what actually is fraud
architecting flexible data ‘plumbing’
building solid data products on top of them
stephen whitworth
2 years at Hailo as data scientist/jack of some trades out of
university
product and marketplace analytics, agent based
modelling, data engineering, ‘ML’ services
data science/engineering at ravelin, specifically
focused on our detection capabilities
what is ravelin?
online fraud detection and prevention platform
stream application/server data to our events API
we give fraud probability + beautiful data visualisation
backed by techstars/passion/playfair/amadeus/indeed.com
founder/wonga founder amongst other great investors
fraud?
$14B
a dollar for every year the universe has existed
same day delivery
on-demand services
‘victimless crime’
police ill-equipped to handle
low barrier to entry from dark net
3D secure - conversion killer
traditional: human generated rules, born of deep expertise
order-centric view of the world
hybrid: augment expertise by learning rules from data
cards don’t commit fraud, people do
building good plumbing
receive firehose through API
decode arbitrary data and store
extract hundreds of features
run through N models and rule engine to get probability
http/slack/whatever notification to customer
in 100-300ms (ish)
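
The steps on this slide can be sketched roughly as below. This is a minimal illustration, not Ravelin's implementation: the feature names, the two toy models, the single rule, and the way model and rule scores are combined (average, then take the maximum with the rule score) are all assumptions made for the example.

```python
import json
import time
from typing import Callable, Dict, List

Features = Dict[str, float]
Model = Callable[[Features], float]  # returns a fraud probability in [0, 1]


def decode(raw: bytes) -> dict:
    """Decode an arbitrary JSON event; unknown fields are simply kept."""
    return json.loads(raw)


def extract_features(event: dict) -> Features:
    """Toy feature extraction (hypothetical names); a real system has hundreds."""
    return {
        "amount": float(event.get("amount", 0.0)),
        "orders_last_hour": float(event.get("orders_last_hour", 0)),
        "new_card": 1.0 if event.get("new_card") else 0.0,
    }


def rule_engine(features: Features) -> float:
    """Hypothetical hand-written rule: many orders on a new card is suspicious."""
    if features["new_card"] and features["orders_last_hour"] >= 5:
        return 0.9
    return 0.0


def score(raw: bytes, models: List[Model]) -> dict:
    """Run one event through decode -> features -> N models + rules."""
    started = time.perf_counter()
    features = extract_features(decode(raw))
    model_scores = [model(features) for model in models]
    # Assumption: average the models, then let a rule raise the score.
    average = sum(model_scores) / len(model_scores) if model_scores else 0.0
    probability = max(average, rule_engine(features))
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {"fraud_probability": probability, "elapsed_ms": elapsed_ms}


def amount_model(f: Features) -> float:
    return min(f["amount"] / 1000.0, 1.0)


def velocity_model(f: Features) -> float:
    return min(f["orders_last_hour"] / 10.0, 1.0)


event = json.dumps({"amount": 250, "orders_last_hour": 6, "new_card": True}).encode()
result = score(event, [amount_model, velocity_model])
```

Measuring `elapsed_ms` inside `score` is one simple way to check each event against a latency budget like the one on the slide.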
BUZZWORDS ABOUND
go
postgres
AWS
microservices
zookeeper
NSQ
python
event-driven
elasticsearch
bigquery
dynamodb
redis
instrumentation
different databases for different needs
kudos if you get The Office reference
postgres: solid, start here
dynamodb: very high throughput, low latency data
bigquery: to answer any question you could possibly have
elasticsearch: rich querying in a reasonable amount of time
graph db: haven’t decided, recommendations?
asynchronous systems/firehoses
nice deployment patterns
‘lambda architecture’ - the append only log
services store their own interpretation of events
services are almost entirely decoupled
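
A minimal sketch of the append-only log idea, assuming an in-memory list in place of a real queue. The class names and event fields are hypothetical. Each service reads the same events at its own pace and stores only its own interpretation of them.

```python
from collections import Counter
from typing import Dict, List, Set


class EventLog:
    """In-memory stand-in for an append-only log; events are never mutated."""

    def __init__(self) -> None:
        self._events: List[dict] = []

    def append(self, event: dict) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset: int) -> List[dict]:
        return self._events[offset:]  # a copy, so readers cannot alter the log


class OrderCountService:
    """Keeps its own view: number of orders per customer."""

    def __init__(self) -> None:
        self.offset = 0
        self.orders: Counter = Counter()

    def consume(self, log: EventLog) -> None:
        for event in log.read_from(self.offset):
            if event["type"] == "order":
                self.orders[event["customer"]] += 1
            self.offset += 1


class CardService:
    """Keeps a different view of the same events: cards seen per customer."""

    def __init__(self) -> None:
        self.offset = 0
        self.cards: Dict[str, Set[str]] = {}

    def consume(self, log: EventLog) -> None:
        for event in log.read_from(self.offset):
            if "card" in event:
                self.cards.setdefault(event["customer"], set()).add(event["card"])
            self.offset += 1


log = EventLog()
log.append({"type": "order", "customer": "c1", "card": "x"})
log.append({"type": "login", "customer": "c1"})
log.append({"type": "order", "customer": "c1", "card": "y"})

order_counts = OrderCountService()
card_view = CardService()
order_counts.consume(log)
card_view.consume(log)
```

Neither service knows the other exists, which is the decoupling the slide describes. A new service can be added later and replay the log from offset 0.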
asynchronous systems/firehoses
error propagation is challenging
no guarantees of SLA - at least as slow as your queue
hard to know who or what is consuming your data
building data products
‘a random forest is like a room full of experts who have seen different cases of fraud from different perspectives’
precision: of all of my fraud predictions, what % were correct?
recall: out of all of the fraudsters, what % did I catch?
implicit tradeoff between conversion and fraud loss
‘accuracy’ a useless metric for fraud
99.8% ACCURATE
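
A small worked example of why accuracy misleads here. The data is made up for illustration: it assumes 1,000 orders of which 2 (0.2%) are fraudulent, chosen to match the 99.8% figure above.

```python
def metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = fraud)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against division by zero when nothing is flagged or no fraud exists.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Assumed data: 2 fraudulent orders followed by 998 legitimate ones.
y_true = [1, 1] + [0] * 998

# A "model" that never flags anything.
never_flags = [0] * 1000
lazy = metrics(y_true, never_flags)

# A model that flags 4 orders and catches both fraudsters.
flags_four = [1, 1, 1, 1] + [0] * 996
useful = metrics(y_true, flags_four)
```

Both models score 99.8% accuracy, but the first has 0% recall while the second has 100% recall at 50% precision. Only precision and recall tell them apart.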
keep model interfaces simple
hide arbitrarily complex transformations behind it
blend global and client specific models
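
One way to read these three points together, as a sketch only: every model exposes a single `predict(event)` method, any transformation happens behind it, and a blended model is just another model with the same interface. The class names, the toy scoring logic, and the weighted-average blend are assumptions for illustration.

```python
from typing import Iterable, Protocol


class FraudModel(Protocol):
    """The whole interface: an event in, a fraud probability out."""

    def predict(self, event: dict) -> float: ...


class GlobalModel:
    """Hypothetical model trained across all clients."""

    def predict(self, event: dict) -> float:
        # The transformation (scaling the amount) is hidden from callers.
        return min(event.get("amount", 0) / 1000.0, 1.0)


class ClientModel:
    """Hypothetical model tuned to one client's traffic."""

    def __init__(self, risky_countries: Iterable[str]) -> None:
        self.risky_countries = set(risky_countries)

    def predict(self, event: dict) -> float:
        return 0.8 if event.get("country") in self.risky_countries else 0.1


class BlendedModel:
    """Weighted average of a global and a client-specific model."""

    def __init__(self, global_model: FraudModel, client_model: FraudModel,
                 client_weight: float) -> None:
        self.global_model = global_model
        self.client_model = client_model
        self.client_weight = client_weight

    def predict(self, event: dict) -> float:
        w = self.client_weight
        return (w * self.client_model.predict(event)
                + (1 - w) * self.global_model.predict(event))


blended = BlendedModel(GlobalModel(), ClientModel({"XX"}), client_weight=0.5)
p = blended.predict({"amount": 500, "country": "XX"})
```

Because `BlendedModel` satisfies the same interface, callers do not need to know whether they hold one model or a blend.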
building and training statistical models
currently batch
will combine with online
RANDOM FORESTS
MONITORING
probabilistic, not deterministic
dogfood - use live robot customers
run models in ‘dark mode’ to determine performance
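
A minimal sketch of running a model in ‘dark mode’, assuming models are plain callables that return a fraud probability. The runner, the log format, and the disagreement measure are all hypothetical. The candidate model scores every event, but only the live model's score is returned to the customer.

```python
from typing import Callable, List, Optional

Scorer = Callable[[dict], float]


class DarkModeRunner:
    """Serve the live model; record the shadow model's score on the side."""

    def __init__(self, live: Scorer, shadow: Scorer) -> None:
        self.live = live
        self.shadow = shadow
        self.shadow_log: List[dict] = []

    def score(self, event: dict) -> float:
        live_score = self.live(event)
        shadow_score: Optional[float]
        try:
            shadow_score = self.shadow(event)
        except Exception:
            # A failing shadow model must never affect the customer.
            shadow_score = None
        self.shadow_log.append(
            {"event": event, "live": live_score, "shadow": shadow_score}
        )
        return live_score


def disagreement_rate(log: List[dict], threshold: float = 0.5) -> float:
    """Fraction of events where the two models fall on different sides of the threshold."""
    compared = [r for r in log if r["shadow"] is not None]
    if not compared:
        return 0.0
    differ = sum(
        1 for r in compared
        if (r["live"] >= threshold) != (r["shadow"] >= threshold)
    )
    return differ / len(compared)


def live_model(event: dict) -> float:
    return 0.9 if event["amount"] > 500 else 0.1


def candidate_model(event: dict) -> float:
    return 0.9 if event["amount"] > 200 else 0.1


runner = DarkModeRunner(live_model, candidate_model)
served = [runner.score({"amount": a}) for a in (100, 300, 600, 50)]
rate = disagreement_rate(runner.shadow_log)
```

Once labels for these events arrive, the logged shadow scores can be compared against them before the candidate is promoted.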
why not deep learning? ..yet
ability to debug random forests
had nice results with keras
serialisation and deployment: an unsolved problem
in beta and signing up clients
looking for on-demand services/marketplaces
talk to me afterwards
obligatory: we are hiring!
senior machine learning engineers/data scientists
stephen.whitworth@ravelin.com or talk to me after
@sjwhitworth
www.ravelin.com - @ravelinhq
