Fraud Analytics
Alejandro Correa Bahnsen, PhD
Data Scientist
About me
• PhD in Machine Learning at Luxembourg University
• Data Scientist at Easy Solutions
• Worked for 8+ years as a data scientist at GE Money, Scotiabank and SIX Financial Services
• Bachelor's and Master's in Industrial Engineering
• Organizer of the Big Data & Data Science Bogota Meetup
About us
A leading global provider of electronic fraud prevention for financial institutions and enterprise customers
• 280+ customers in 26 countries
• 75 million users protected
• 22+ billion online connections monitored in the last 12 months
[Figure: industry recognition logos]
Our Approach: Total Fraud Protection®
Does fraud affect me?
~1 billion USD
~171 million USD
~3 billion USD
[Figure: Europe fraud evolution, card-not-present (Internet) transactions, 2007 to 2012; vertical axis from € 0 to € 800]
[Figure: US fraud evolution, card-not-present (Internet) transactions, 2001 to 2012; vertical axis from $0 to $4,000]
Card Present vs. Card Not Present Fraud Rates
Year | Card Not Present | Card Present
2006 | 1.10% | 0.09%
2007 | 1.30% | 0.08%
2008 | 1.10% | 0.08%
2009 | 0.90% | 0.06%
2010 | 0.88% | 0.05%
2011 | 0.87% | 0.05%
US Online Banking and US Mobile Banking (billions of transactions)
Year | Online Banking | Mobile Banking
2009 | 23.3 | 1.2
2010 | 26.8 | 3.0
2011 | 30.0 | 5.6
2012 | 33.3 | 9.4
2013 | 35.0 | 14.0
Mobile banking continues to grow while traditional channels lose users
Which channels do you use for banking operations, balance inquiries, utility payments, tax payments, or other payments or purchases?
Mobile Security Challenges
The main reason given by those who do NOT use the Internet for transactions or purchases is fear of electronic fraud
Why do you NOT use the Internet for banking operations, payments, or purchases?
There is a need for better fraud detection strategies
“War is ninety percent information”
• Napoleon Bonaparte
BigData?
Big data (Data Science) is like teenage sex:
everyone talks about it,
nobody really knows how to do it,
everyone thinks everyone else is doing it,
so everyone claims they are doing it...
Man on the Moon
Distance: 356,000 km
Never been there before
Must return to Earth
Man on the Moon – Small Data!!
Apollo XI, 1969
Speed: 3,500 km/hour
Weight: 13,500 kg
Lots of complex data
Computer program: 64 KB, 2 KB RAM, Fortran
Must work the first time
Man on the Moon – Small Data!!
iPhone 6
128 GB, 2 GB RAM
BigData Analytics
BigData Analytics is the use of methods and tools of Machine Learning and Artificial Intelligence with the objective of making data-driven decisions
Fraud detection and prevention
Credit card fraud detection
Estimate the probability that a transaction is fraudulent by analyzing customer patterns and recent fraudulent behavior.
Issues when constructing a fraud detection system:
• Skewness of the data
• Cost-sensitivity
• Short response time required of the system
• Dimensionality of the search space
• Feature preprocessing
• Model selection
[Figure: diagram with the labels "Network" and "Fraud??"]
Data
• Large European card processing company
• 2012 & 2013 card present transactions
• 20 million transactions
• 40,000 frauds
• 0.467% fraud rate
• ~2 million EUR lost due to fraud on the test dataset
[Figure: months from Jan to Dec divided into a Train period and a Test period]
If-Then rules (Expert rules)
• “Purpose is to use facts and rules, taken from the knowledge of many human experts, to help make decisions.”
• Example of rules
  • More than 4 ATM transactions in one hour?
  • More than 2 transactions in 5 minutes?
  • Magnetic stripe transaction then internet transaction?
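The three example rules can be written down directly as code. The following is a minimal sketch, not the system described in the slides: the function name `triggered_rules`, the dictionary keys `time`, `type` and `entry_mode`, and the choice to count the current transaction toward each threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def triggered_rules(history, trx):
    """Return the expert rules triggered by `trx`, given earlier transactions on the same card.

    Each transaction is a dict with hypothetical keys 'time', 'type' and 'entry_mode'.
    """
    earlier = [h for h in history if h["time"] <= trx["time"]]

    def within(window):
        # Earlier transactions that happened at most `window` before the current one.
        return [h for h in earlier if trx["time"] - h["time"] <= window]

    rules = []
    # Rule 1: more than 4 ATM transactions in one hour (current transaction included).
    atm_count = sum(h["type"] == "ATM" for h in within(timedelta(hours=1)))
    atm_count += int(trx["type"] == "ATM")
    if atm_count > 4:
        rules.append("more than 4 ATM transactions in one hour")
    # Rule 2: more than 2 transactions in 5 minutes (current transaction included).
    if len(within(timedelta(minutes=5))) + 1 > 2:
        rules.append("more than 2 transactions in 5 minutes")
    # Rule 3: a magnetic stripe transaction immediately followed by an internet transaction.
    if earlier:
        previous = max(earlier, key=lambda h: h["time"])
        if previous["entry_mode"] == "magnetic stripe" and trx["type"] == "Internet":
            rules.append("magnetic stripe transaction then internet transaction")
    return rules

# Illustrative data only.
t0 = datetime(2015, 1, 1, 12, 0)
history = [
    {"time": t0, "type": "POS", "entry_mode": "magnetic stripe"},
    {"time": t0 + timedelta(minutes=2), "type": "POS", "entry_mode": "magnetic stripe"},
]
trx = {"time": t0 + timedelta(minutes=4), "type": "Internet", "entry_mode": "manual"}
flags = triggered_rules(history, trx)
```

Here `flags` holds the second and third rules: three transactions fall within five minutes, and the internet transaction follows a magnetic stripe one.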
Misclassification | Recall | Precision | F1-Score
1.04% | 31% | 17% | 22%
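As a quick consistency check on these figures, the F1-score is the harmonic mean of precision and recall, so the reported 22% follows from the other two numbers:

```latex
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    = \frac{2 \cdot 0.17 \cdot 0.31}{0.17 + 0.31}
    = \frac{0.1054}{0.48}
    \approx 0.22
```

The low misclassification rate is less informative than it looks. Taking the 0.467% fraud rate reported for the data, a model that never raises an alert would misclassify only about 0.467% of transactions while catching no fraud at all.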
Financial evaluation
Credit card fraud detection is a cost-sensitive problem, because the cost of a false positive differs from the cost of a false negative.
• False positives: when a transaction is predicted as fraudulent but is in fact legitimate, the financial institution incurs an administrative cost.
• False negatives: when a fraud goes undetected, the amount of that transaction is lost.
Moreover, it is not enough to assume a constant cost difference between false positives and false negatives, because transaction amounts vary considerably.
Cost matrix
 | Actual Positive ($y_i = 1$) | Actual Negative ($y_i = 0$)
Predicted Positive ($c_i = 1$) | $C_{TP_i} = C_a$ | $C_{FP_i} = C_a$
Predicted Negative ($c_i = 0$) | $C_{FN_i} = Amt_i$ | $C_{TN_i} = 0$

$$\mathrm{Cost}(f(S)) = \sum_{i=1}^{N} \Big( y_i \big( c_i C_{TP_i} + (1 - c_i) C_{FN_i} \big) + (1 - y_i) \big( c_i C_{FP_i} + (1 - c_i) C_{TN_i} \big) \Big)$$
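The cost formula and cost matrix translate almost line for line into code. This is a minimal sketch: the function name `total_cost`, the argument names, and the example values (including an administrative cost of 5) are made up for illustration; only the structure of the sum and the four cost entries come from the slides.

```python
def total_cost(y_true, y_pred, amounts, admin_cost):
    """Example-dependent cost of a set of predictions.

    y_true[i] is 1 for fraud, y_pred[i] is 1 for an alert, amounts[i] is Amt_i,
    and admin_cost is the administrative cost C_a.
    """
    cost = 0.0
    for y, c, amt in zip(y_true, y_pred, amounts):
        c_tp = admin_cost  # alert on a real fraud: administrative cost
        c_fp = admin_cost  # alert on a legitimate transaction: administrative cost
        c_fn = amt         # missed fraud: the transaction amount is lost
        c_tn = 0.0         # legitimate transaction left alone: no cost
        cost += y * (c * c_tp + (1 - c) * c_fn) + (1 - y) * (c * c_fp + (1 - c) * c_tn)
    return cost

# Illustrative data: one caught fraud, one missed fraud, one false alert, one true negative.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
amounts = [100.0, 250.0, 40.0, 60.0]
cost = total_cost(y_true, y_pred, amounts, admin_cost=5.0)  # 5 + 250 + 5 + 0 = 260.0
```

The missed fraud of 250 dominates the total, which is the point of the slide: the cost of an error depends on the amount of the transaction, not only on the type of error.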
If-Then rules (Expert rules)
Cost: 1.24 €
Total losses: 1.94 €
Misclassification | Recall | Precision | F1-Score
1.04% | 31% | 17% | 22%
Fraud Analytics
Fraud Analytics is the use of statistical and mathematical techniques (Machine Learning) to discover patterns in data in order to make predictions
Features
Raw features
Attribute name | Description
Transaction ID | Transaction identification number
Time | Date and time of the transaction
Account number | Identification number of the customer
Card number | Identification of the credit card
Transaction type | e.g. Internet, ATM, POS, ...
Entry mode | e.g. Chip and PIN, magnetic stripe, ...
Amount | Amount of the transaction in Euros
Merchant code | Identification of the merchant type
Merchant group | Merchant group identification
Country | Country of the transaction
Country 2 | Country of residence
Type of card | e.g. Visa debit, Mastercard, American Express...
Gender | Gender of the card holder
Age | Card holder age
Bank | Issuer bank of the card
Transaction aggregation strategy
Raw Features
TrxId | Time | Type | Country | Amt
1 | 1/1 18:20 | POS | Lux | 250
2 | 1/1 20:35 | POS | Lux | 400
3 | 1/1 22:30 | ATM | Lux | 250
4 | 2/1 00:50 | POS | Ger | 50
5 | 2/1 19:18 | POS | Ger | 100
6 | 2/1 23:45 | POS | Ger | 150
7 | 3/1 06:00 | POS | Lux | 10
Aggregated Features
TrxId | No. Trx last 24h | Amt last 24h | No. Trx last 24h, same type and country | Amt last 24h, same type and country
1 | 0 | 0 | 0 | 0
2 | 1 | 250 | 1 | 250
3 | 2 | 650 | 0 | 0
4 | 3 | 900 | 0 | 0
5 | 3 | 700 | 1 | 50
6 | 2 | 150 | 2 | 150
7 | 3 | 400 | 0 | 0
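The aggregation strategy can be sketched as a look-back over each card's earlier transactions. The code below is an illustration under stated assumptions: dates are read as day/month, the year 2015 is arbitrary, the window is a strict 24 hours before each transaction, and the names `raw` and `aggregate` are hypothetical.

```python
from datetime import datetime, timedelta

# Raw transactions from the table: (TrxId, time, type, country, amount).
raw = [
    (1, datetime(2015, 1, 1, 18, 20), "POS", "Lux", 250),
    (2, datetime(2015, 1, 1, 20, 35), "POS", "Lux", 400),
    (3, datetime(2015, 1, 1, 22, 30), "ATM", "Lux", 250),
    (4, datetime(2015, 1, 2, 0, 50), "POS", "Ger", 50),
    (5, datetime(2015, 1, 2, 19, 18), "POS", "Ger", 100),
    (6, datetime(2015, 1, 2, 23, 45), "POS", "Ger", 150),
    (7, datetime(2015, 1, 3, 6, 0), "POS", "Lux", 10),
]

def aggregate(transactions, hours=24):
    """For each transaction, count and sum the earlier transactions in the look-back window."""
    window_length = timedelta(hours=hours)
    rows = []
    for trx_id, time, trx_type, country, _ in transactions:
        # Earlier transactions that fall within the window before this one.
        window = [t for t in transactions if timedelta(0) < time - t[1] <= window_length]
        # Of those, the ones with the same transaction type and country.
        same = [t for t in window if t[2] == trx_type and t[3] == country]
        rows.append((
            trx_id,
            len(window), sum(t[4] for t in window),
            len(same), sum(t[4] for t in same),
        ))
    return rows

features = aggregate(raw)
```

For transactions 1 to 6 this reproduces the aggregated values in the table. For transaction 7 a strict 24-hour window gives 2 transactions and an amount of 250 rather than the 3 and 400 shown, so the window for that row may have been defined differently on the slide.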
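Periodic features
When is a customer expected to make a new transaction?
Considering a von Mises distribution with a period of 24 hours such that
$$P(\mathit{time}) \sim \mathrm{vonmises}(\mu, \sigma) = \frac{e^{\sigma \cos(\mathit{time} - \mu)}}{2\pi I_0(\sigma)}$$
where $\mu$ is the mean (the time of day around which transactions concentrate), $\sigma$ controls the spread (in this form it acts as the concentration parameter, so larger values mean transactions cluster more tightly around $\mu$), and $I_0$ is the modified Bessel function of the first kind of order 0.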
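A periodic distribution matters here because time of day wraps around: the arithmetic mean of 23:00 and 01:00 is 12:00, even though both transactions happened around midnight. The sketch below evaluates the von Mises density for a time given in hours. The function names, the power-series evaluation of the Bessel function, and the mapping of 24 hours onto a full circle are choices made for this illustration; the parameter called `concentration` plays the role of the second parameter in the formula.

```python
import math

def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def von_mises_pdf(hour, mean_hour, concentration):
    """Von Mises density (per radian) at a time of day, with 24 hours mapped to one full circle."""
    angle = 2 * math.pi * hour / 24
    mu = 2 * math.pi * mean_hour / 24
    return math.exp(concentration * math.cos(angle - mu)) / (2 * math.pi * bessel_i0(concentration))

# A customer who usually transacts around 00:30 (illustrative values).
before_midnight = von_mises_pdf(23.5, mean_hour=0.5, concentration=2.0)
after_midnight = von_mises_pdf(1.5, mean_hour=0.5, concentration=2.0)
midday = von_mises_pdf(12.5, mean_hour=0.5, concentration=2.0)
```

Both 23:30 and 01:30 are one hour away from the mean, so they receive the same density, while 12:30 receives a much lower one. With a concentration of 0 the density is uniform over the day.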
[Figure: periodic features]
[Figure: scatter plots of the amount of the transaction against the number of transactions in the last day, with normal and fraudulent transactions marked]
[Figure: the same plot with a third axis, the number of ATM transactions in the last week]
Fraud Analytics
Algorithms
Fuzzy Rules
Neural Nets
Naive Bayes
*Random Forests
RF – with Cost-Proportionate Rejection Sampling
*Cost-Sensitive Random Patches
Decision Trees
Decision Trees
A decision tree is a classification model that iteratively creates binary decision rules that optimize a chosen criterion (Gini, entropy, …).
X1 = Amount of the transaction
X2 = Number of transactions last day
[Figure: example tree starting at the initial node, with the splits X2 < 10 / X2 ≥ 10, X1 < 100 / X1 ≥ 100, X1 < 50 / X1 ≥ 50, and X2 < 15 / X2 ≥ 15]
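To make "creates binary decision rules" concrete, the sketch below searches for the single threshold on one feature that gives the largest reduction in Gini impurity. It is a simplified illustration, not a full tree learner: the names `gini` and `best_split` and the toy data are assumptions, and a real tree would repeat this search over all features and recurse into each side.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = fraud, 0 = normal)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, gain) for the rule `value < threshold` with the largest impurity reduction."""
    n = len(labels)
    parent = gini(labels)
    best = (None, 0.0)
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x < threshold]
        right = [y for x, y in zip(values, labels) if x >= threshold]
        if not left or not right:
            continue  # a rule must send examples to both sides
        weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
        gain = parent - weighted
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Toy data: X2 = number of transactions last day, label 1 = fraud.
x2 = [1, 2, 3, 12, 14, 15]
is_fraud = [0, 0, 0, 1, 1, 1]
threshold, gain = best_split(x2, is_fraud)
```

On this toy data the best rule is `X2 < 12`, which separates the two classes completely and removes all of the parent node's impurity of 0.5.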
Random Forests
A Random Forest combines many different decision trees, each one trained on a random subset of the initial dataset.
Random Forests & Random Patches
[Figure: a training set and the subsets drawn from it under bagging, random forest, and random patches]
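The three schemes differ in what each tree gets to see. The sketch below draws the index sets only and follows the usual textbook definitions of these methods rather than anything stated on the slide. The helper names, the sizes, and the fixed seed are assumptions for illustration.

```python
import random

def bootstrap_rows(n_rows, rng):
    """Draw n_rows row indices with replacement (a bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def random_features(n_features, k, rng):
    """Draw k distinct feature indices without replacement."""
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)     # fixed seed so the sketch is repeatable
n_rows, n_features = 8, 4  # illustrative sizes only

# Bagging: each tree sees a bootstrap sample of the rows and all features.
bagging = (bootstrap_rows(n_rows, rng), list(range(n_features)))

# Random forest: rows as in bagging; a fresh feature subset is drawn at every split.
forest_rows = bootstrap_rows(n_rows, rng)
forest_split_features = random_features(n_features, 2, rng)

# Random patches: each tree sees a bootstrap sample of the rows and one fixed
# feature subset for the whole tree.
patches = (bootstrap_rows(n_rows, rng), random_features(n_features, 2, rng))
```

Because rows are drawn with replacement, some indices repeat and others are left out of a given tree's sample.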
Cost-Sensitive Decision Trees
• Standard decision trees create rules that optimize an impurity measure such as Gini or entropy
• However, this assumes that all misclassification errors carry the same cost
• Not true in fraud detection
• Instead, the cost-sensitive decision tree minimizes the cost of each rule, $\mathrm{Cost}(f(\text{node}))$
[Figure: example tree starting at the initial node, with the splits X2 < 10 / X2 ≥ 10, X1 < 100 / X1 ≥ 100, X1 < 50 / X1 ≥ 50, and X2 < 15 / X2 ≥ 15]
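One way to read "minimizes the cost of each rule" is to score a node by the cost of the cheaper of its two possible predictions, and to score a split by how much it lowers that cost. The sketch below follows that reading and should be taken as an assumption rather than the exact criterion behind the slides. It uses a cost matrix in which every alert costs an administrative amount and every missed fraud costs the transaction amount. The names `node_cost` and `cost_gain` and the toy data are hypothetical.

```python
def node_cost(labels, amounts, admin_cost):
    """Cost of the cheaper single prediction for every example in a node."""
    # Predict "normal" for all: each fraud in the node is missed and its amount is lost.
    cost_all_negative = sum(amt for y, amt in zip(labels, amounts) if y == 1)
    # Predict "fraud" for all: every example triggers an alert with an administrative cost.
    cost_all_positive = admin_cost * len(labels)
    return min(cost_all_negative, cost_all_positive)

def cost_gain(values, labels, amounts, threshold, admin_cost):
    """Reduction in cost obtained by the rule `value < threshold`."""
    left = [(y, a) for x, y, a in zip(values, labels, amounts) if x < threshold]
    right = [(y, a) for x, y, a in zip(values, labels, amounts) if x >= threshold]

    def cost(side):
        return node_cost([y for y, _ in side], [a for _, a in side], admin_cost)

    return node_cost(labels, amounts, admin_cost) - (cost(left) + cost(right))

# Toy data: X2 = number of transactions last day, label 1 = fraud.
x2 = [1, 2, 12, 14]
is_fraud = [0, 0, 1, 1]
amounts = [20, 30, 500, 5]
gain = cost_gain(x2, is_fraud, amounts, threshold=12, admin_cost=10)
```

Without the split, alerting on all four transactions costs 40. With the rule `X2 < 12`, the left side costs nothing and the right side costs 20 in alerts, so the split saves 20. A fraud of 500 and a fraud of 5 count very differently here, which an impurity measure such as Gini would not capture.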
[Figure: bar chart of % Savings and % Frauds (0% to 100%) for Expert Rules, Fuzzy Rules, Neural Nets, Naïve Bayes, Random Forests, RF - CP Random Sampling, and CS Random Patches]
Conclusions
• Fraud Analytics (ML) models are significantly better than expert rules
• Models should be evaluated taking into account the real financial costs of the application
• Algorithms should be developed to incorporate those financial costs
Questions?
Alejandro Correa Bahnsen, PhD
Data Scientist
acorrea@Easysol.net

Weitere ähnliche Inhalte

Was ist angesagt?

Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperGarvit Burad
 
How to apply graph analytics for bank loan fraud detection?
How to apply graph analytics for bank loan fraud detection?How to apply graph analytics for bank loan fraud detection?
How to apply graph analytics for bank loan fraud detection?Linkurious
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAndrea Dal Pozzolo
 
Analysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detectionAnalysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detectionJustluk Luk
 
Online Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningOnline Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningStefano Tempesta
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detectionPEIPEI HAN
 
Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Andrea Dal Pozzolo
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)k.surya kumar
 
Data Quality Success Stories
Data Quality Success StoriesData Quality Success Stories
Data Quality Success StoriesDATAVERSITY
 
Real-time fraud detection in credit card transactions
Real-time fraud detection in credit card transactionsReal-time fraud detection in credit card transactions
Real-time fraud detection in credit card transactionsMariusz Rafało
 
AI powered decision making in banks
AI powered decision making in banksAI powered decision making in banks
AI powered decision making in banksPankaj Baid
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learningijtsrd
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learningdataalcott
 
Artificial Intelligence and Digital Banking - What about fraud prevention ?
Artificial Intelligence and Digital Banking - What about fraud prevention ?Artificial Intelligence and Digital Banking - What about fraud prevention ?
Artificial Intelligence and Digital Banking - What about fraud prevention ?Jérôme Kehrli
 
Enterprise Fraud Management: How Banks Need to Adapt
Enterprise Fraud Management: How Banks Need to AdaptEnterprise Fraud Management: How Banks Need to Adapt
Enterprise Fraud Management: How Banks Need to AdaptCapgemini
 
Anti Money Laundering Framework
Anti Money Laundering FrameworkAnti Money Laundering Framework
Anti Money Laundering Frameworknikatmalik
 
Artificial Intelligence for Banking Fraud Prevention
Artificial Intelligence for Banking Fraud PreventionArtificial Intelligence for Banking Fraud Prevention
Artificial Intelligence for Banking Fraud PreventionJérôme Kehrli
 

Was ist angesagt? (20)

Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research Paper
 
How to apply graph analytics for bank loan fraud detection?
How to apply graph analytics for bank loan fraud detection?How to apply graph analytics for bank loan fraud detection?
How to apply graph analytics for bank loan fraud detection?
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
Analysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detectionAnalysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detection
 
Online Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningOnline Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine Learning
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detection
 
Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)
 
Data Quality Success Stories
Data Quality Success StoriesData Quality Success Stories
Data Quality Success Stories
 
Trends in AML Compliance and Technology
Trends in AML Compliance and TechnologyTrends in AML Compliance and Technology
Trends in AML Compliance and Technology
 
Fraud analytics
Fraud analyticsFraud analytics
Fraud analytics
 
Real-time fraud detection in credit card transactions
Real-time fraud detection in credit card transactionsReal-time fraud detection in credit card transactions
Real-time fraud detection in credit card transactions
 
Dark data
Dark dataDark data
Dark data
 
AI powered decision making in banks
AI powered decision making in banksAI powered decision making in banks
AI powered decision making in banks
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learning
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
Artificial Intelligence and Digital Banking - What about fraud prevention ?
Artificial Intelligence and Digital Banking - What about fraud prevention ?Artificial Intelligence and Digital Banking - What about fraud prevention ?
Artificial Intelligence and Digital Banking - What about fraud prevention ?
 
Enterprise Fraud Management: How Banks Need to Adapt
Enterprise Fraud Management: How Banks Need to AdaptEnterprise Fraud Management: How Banks Need to Adapt
Enterprise Fraud Management: How Banks Need to Adapt
 
Anti Money Laundering Framework
Anti Money Laundering FrameworkAnti Money Laundering Framework
Anti Money Laundering Framework
 
Artificial Intelligence for Banking Fraud Prevention
Artificial Intelligence for Banking Fraud PreventionArtificial Intelligence for Banking Fraud Prevention
Artificial Intelligence for Banking Fraud Prevention
 

Andere mochten auch

2013 credit card fraud detection why theory dosent adjust to practice
2013 credit card fraud detection why theory dosent adjust to practice2013 credit card fraud detection why theory dosent adjust to practice
2013 credit card fraud detection why theory dosent adjust to practiceAlejandro Correa Bahnsen, PhD
 
Fraud analytics detección y prevención de fraudes en la era del big data sl...
Fraud analytics detección y prevención de fraudes en la era del big data   sl...Fraud analytics detección y prevención de fraudes en la era del big data   sl...
Fraud analytics detección y prevención de fraudes en la era del big data sl...Alejandro Correa Bahnsen, PhD
 
Example-Dependent Cost-Sensitive Credit Card Fraud Detection
Example-Dependent Cost-Sensitive Credit Card Fraud DetectionExample-Dependent Cost-Sensitive Credit Card Fraud Detection
Example-Dependent Cost-Sensitive Credit Card Fraud DetectionAlejandro Correa Bahnsen, PhD
 
Classifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural NetworksClassifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural NetworksAlejandro Correa Bahnsen, PhD
 
Maximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learningMaximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learningAlejandro Correa Bahnsen, PhD
 
PhD Defense - Example-Dependent Cost-Sensitive Classification
PhD Defense - Example-Dependent Cost-Sensitive ClassificationPhD Defense - Example-Dependent Cost-Sensitive Classification
PhD Defense - Example-Dependent Cost-Sensitive ClassificationAlejandro Correa Bahnsen, PhD
 
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Alejandro Correa Bahnsen, PhD
 
Ensembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesEnsembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesAlejandro Correa Bahnsen, PhD
 

Andere mochten auch (12)

2013 credit card fraud detection why theory dosent adjust to practice
2013 credit card fraud detection why theory dosent adjust to practice2013 credit card fraud detection why theory dosent adjust to practice
2013 credit card fraud detection why theory dosent adjust to practice
 
Fraud analytics detección y prevención de fraudes en la era del big data sl...
Fraud analytics detección y prevención de fraudes en la era del big data   sl...Fraud analytics detección y prevención de fraudes en la era del big data   sl...
Fraud analytics detección y prevención de fraudes en la era del big data sl...
 
Analytics - compitiendo en la era de la informacion
Analytics - compitiendo en la era de la informacionAnalytics - compitiendo en la era de la informacion
Analytics - compitiendo en la era de la informacion
 
Example-Dependent Cost-Sensitive Credit Card Fraud Detection
Example-Dependent Cost-Sensitive Credit Card Fraud DetectionExample-Dependent Cost-Sensitive Credit Card Fraud Detection
Example-Dependent Cost-Sensitive Credit Card Fraud Detection
 
Classifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural NetworksClassifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural Networks
 
Maximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learningMaximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learning
 
2011 advanced analytics through the credit cycle
2011 advanced analytics through the credit cycle2011 advanced analytics through the credit cycle
2011 advanced analytics through the credit cycle
 
Modern Data Science
Modern Data ScienceModern Data Science
Modern Data Science
 
PhD Defense - Example-Dependent Cost-Sensitive Classification
PhD Defense - Example-Dependent Cost-Sensitive ClassificationPhD Defense - Example-Dependent Cost-Sensitive Classification
PhD Defense - Example-Dependent Cost-Sensitive Classification
 
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
 
Demystifying machine learning using lime
Demystifying machine learning using limeDemystifying machine learning using lime
Demystifying machine learning using lime
 
Ensembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesEnsembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slides
 

Ähnlich wie Fraud Detection with Cost-Sensitive Predictive Analytics

Build Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and GraphsBuild Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and GraphsNeo4j
 
ACAMS NY Chapter Presentation - C-AML: Exploring the New Frontier of Crypto-AML
ACAMS NY Chapter Presentation -  C-AML: Exploring the New Frontier of Crypto-AMLACAMS NY Chapter Presentation -  C-AML: Exploring the New Frontier of Crypto-AML
ACAMS NY Chapter Presentation - C-AML: Exploring the New Frontier of Crypto-AMLMadeline Ross
 
Fraud Detection in Real-time @ Apache Big Data con
Fraud Detection in Real-time @ Apache Big Data conFraud Detection in Real-time @ Apache Big Data con
Fraud Detection in Real-time @ Apache Big Data conSeshika Fernando
 
Fraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data ConFraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data ConSeshika Fernando
 
How Eastern Bank Uses Big Data to Better Serve and Protect its Customers
How Eastern Bank Uses Big Data to Better Serve and Protect its CustomersHow Eastern Bank Uses Big Data to Better Serve and Protect its Customers
How Eastern Bank Uses Big Data to Better Serve and Protect its CustomersBrian Griffith
 
How AI is preventing account fraud at web scale
How AI is preventing account fraud at web scaleHow AI is preventing account fraud at web scale
How AI is preventing account fraud at web scaleAmir Moghimi
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Fighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial IntelligenceFighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial IntelligenceDataWorks Summit
 
Data analysis for credit card fraud detection.pptx
Data analysis for credit card fraud detection.pptxData analysis for credit card fraud detection.pptx
Data analysis for credit card fraud detection.pptxKRNL1
 
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...Bernhard Haslhofer
 
Innovationstag Digital Banking Liechtenstein 2016
Innovationstag Digital Banking Liechtenstein 2016Innovationstag Digital Banking Liechtenstein 2016
Innovationstag Digital Banking Liechtenstein 2016Roman Dinkel
 
Next Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNext Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNeo4j
 
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"Lviv Startup Club
 
Machine learning for bestt group - 20170714
Machine learning for bestt group - 20170714Machine learning for bestt group - 20170714
Machine learning for bestt group - 20170714IBM Thailand Co Ltd
 
Cybercrime, Digital Investigation and Public Private Partnership by Francesca...
Cybercrime, Digital Investigation and Public Private Partnership by Francesca...Cybercrime, Digital Investigation and Public Private Partnership by Francesca...
Cybercrime, Digital Investigation and Public Private Partnership by Francesca...Tech and Law Center
 
Understanding the Card Fraud Lifecycle : A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle :  A Guide For Private Label IssuersUnderstanding the Card Fraud Lifecycle :  A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle : A Guide For Private Label IssuersChristopher Uriarte
 
2018 oct executive_forum_sysman_214
2018 oct executive_forum_sysman_2142018 oct executive_forum_sysman_214
2018 oct executive_forum_sysman_214Alex Petrov
 
Graphs for Finance - A technological background
Graphs for Finance - A technological backgroundGraphs for Finance - A technological background
Graphs for Finance - A technological backgroundNeo4j
 
credit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractcredit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractVenkat Projects
 
Abuse prevention in the globally distributed economy presentation
Abuse prevention in the globally distributed economy presentationAbuse prevention in the globally distributed economy presentation
Abuse prevention in the globally distributed economy presentationJustin Dorfman
 

Ähnlich wie Fraud Detection with Cost-Sensitive Predictive Analytics (20)

Build Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and GraphsBuild Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and Graphs
 
ACAMS NY Chapter Presentation - C-AML: Exploring the New Frontier of Crypto-AML
ACAMS NY Chapter Presentation -  C-AML: Exploring the New Frontier of Crypto-AMLACAMS NY Chapter Presentation -  C-AML: Exploring the New Frontier of Crypto-AML
ACAMS NY Chapter Presentation - C-AML: Exploring the New Frontier of Crypto-AML
 
Fraud Detection in Real-time @ Apache Big Data con
Fraud Detection in Real-time @ Apache Big Data conFraud Detection in Real-time @ Apache Big Data con
Fraud Detection in Real-time @ Apache Big Data con
 
Fraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data ConFraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data Con
 
How Eastern Bank Uses Big Data to Better Serve and Protect its Customers
How Eastern Bank Uses Big Data to Better Serve and Protect its CustomersHow Eastern Bank Uses Big Data to Better Serve and Protect its Customers
How Eastern Bank Uses Big Data to Better Serve and Protect its Customers
 
How AI is preventing account fraud at web scale
How AI is preventing account fraud at web scaleHow AI is preventing account fraud at web scale
How AI is preventing account fraud at web scale
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Fighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial IntelligenceFighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial Intelligence
 
Data analysis for credit card fraud detection.pptx
Data analysis for credit card fraud detection.pptxData analysis for credit card fraud detection.pptx
Data analysis for credit card fraud detection.pptx
 
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
 
Innovationstag Digital Banking Liechtenstein 2016
Innovationstag Digital Banking Liechtenstein 2016Innovationstag Digital Banking Liechtenstein 2016
Innovationstag Digital Banking Liechtenstein 2016
 
Next Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNext Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4j
 
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"
 
Machine learning for bestt group - 20170714
Machine learning for bestt group - 20170714Machine learning for bestt group - 20170714
Machine learning for bestt group - 20170714
 
Cybercrime, Digital Investigation and Public Private Partnership by Francesca...
Cybercrime, Digital Investigation and Public Private Partnership by Francesca...Cybercrime, Digital Investigation and Public Private Partnership by Francesca...
Cybercrime, Digital Investigation and Public Private Partnership by Francesca...
 
Understanding the Card Fraud Lifecycle : A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle :  A Guide For Private Label IssuersUnderstanding the Card Fraud Lifecycle :  A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle : A Guide For Private Label Issuers
 
2018 oct executive_forum_sysman_214
2018 oct executive_forum_sysman_2142018 oct executive_forum_sysman_214
2018 oct executive_forum_sysman_214
 
Graphs for Finance - A technological background
Graphs for Finance - A technological backgroundGraphs for Finance - A technological background
Graphs for Finance - A technological background
 
credit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractcredit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstract
 
Abuse prevention in the globally distributed economy presentation
Abuse prevention in the globally distributed economy presentationAbuse prevention in the globally distributed economy presentation
Abuse prevention in the globally distributed economy presentation
 

Fraud Detection with Cost-Sensitive Predictive Analytics

  • 1. Fraud Analytics Alejandro Correa Bahnsen, PhD Data Scientist
  • 2. About me • PhD in Machine Learning at Luxembourg University • Data Scientist at Easy Solutions • Worked for 8+ years as a data scientist at GE Money, Scotiabank and SIX Financial Services • Bachelor and Master in Industrial Engineering • Organizer of the Big Data & Data Science Bogota Meetup
  • 3. About us • A leading global provider of electronic fraud prevention for financial institutions and enterprise customers • Industry recognition • 280+ customers in 26 countries • 75 million users protected • 22+ billion online connections monitored in the last 12 months
  • 4. Our Approach: Total Fraud Protection®
  • 5. Does fraud affect me? • ~1 billion USD • ~171 million USD • ~3 billion USD
  • 7. Europe fraud evolution: card-not-present (Internet) transactions, 2007-2012 [chart; vertical axis from € 0 to € 800]
  • 8. US fraud evolution: card-not-present (Internet) transactions, 2001-2012 [chart; vertical axis from $0 to $4,000]
  • 9. Card Present vs. Card Not Present Fraud Rates, 2006-2011
      Card Not Present: 1.10%, 1.30%, 1.10%, 0.90%, 0.88%, 0.87%
      Card Present: 0.09%, 0.08%, 0.08%, 0.06%, 0.05%, 0.05%
      US Online Banking, billions of transactions, 2009-2013: 23.3, 26.8, 30.0, 33.3, 35.0
      US Mobile Banking, billions of transactions, 2009-2013: 1.2, 3.0, 5.6, 9.4, 14.0
  • 10. Mobile banking keeps growing while traditional channels lose users. Survey question: Which channels do you use for banking operations, balance inquiries, service payments, tax payments, or other payments or purchases?
  • 11. Security challenges on mobile devices
  • 12. The main reason given by those who do NOT use the Internet for transactions or purchases is fear of electronic fraud. Survey question: Why do you NOT use the Internet for banking operations, payments or purchases?
  • 13. There is a need for better fraud detection strategies
  • 15. “War is ninety percent information” • Napoleon Bonaparte
  • 19. Big data (Data Science) is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...
  • 22. Man on the Moon – Small Data!! • Distance: 356,000 km • Never been there before • Must return to Earth • Apollo XI: speed 3,500 km/hour, weight 13,500 kg • Lots of complex data • Computer program: 64 KB, 2 KB RAM, Fortran • Must work the first time
  • 23. Man on the Moon – Small Data!! • Apollo XI, 1969: 64 KB, 2 KB RAM • iPhone 6: 128 GB, 2 GB RAM
  • 25. Big Data Analytics is the use of methods and tools of Machine Learning and Artificial Intelligence with the objective of making data-driven decisions
  • 27. Credit card fraud detection • Estimate the probability that a transaction is fraudulent by analyzing customer patterns and recent fraudulent behavior • Issues when constructing a fraud detection system: • Skewness of the data • Cost-sensitivity • Short response time of the system • Dimensionality of the search space • Feature preprocessing • Model selection
  • 29. Data • Large European card processing company • 2012 & 2013 card-present transactions • 20MM transactions • 40,000 frauds • 0.467% fraud rate • ~2MM EUR lost due to fraud on the test dataset • [Figure: split of the data into train and test sets by month, Jan-Dec]
  • 30. If-Then rules (Expert rules) • “Purpose is to use facts and rules, taken from the knowledge of many human experts, to help make decisions.” • Examples of rules: • More than 4 ATM transactions in one hour? • More than 2 transactions in 5 minutes? • Magnetic stripe transaction followed by an Internet transaction?
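The three example rules on slide 30 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Trx` record, its field values ("ATM", "Internet", "magstripe") and the function names are hypothetical, and the history is assumed to belong to a single card.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Trx:
    time: datetime
    trx_type: str    # hypothetical labels, e.g. "ATM", "POS", "Internet"
    entry_mode: str  # hypothetical labels, e.g. "chip", "magstripe", "online"


def recent(history, now, window):
    """Transactions strictly before `now` and no older than `window`."""
    return [t for t in history if now - window <= t.time < now]


def triggered_rules(history, new):
    """Return the names of the expert rules triggered by the new transaction."""
    hits = []

    # Rule 1: more than 4 ATM transactions in one hour (counting the new one).
    atm_last_hour = [t for t in recent(history, new.time, timedelta(hours=1))
                     if t.trx_type == "ATM"]
    if new.trx_type == "ATM" and len(atm_last_hour) + 1 > 4:
        hits.append("more_than_4_atm_in_1h")

    # Rule 2: more than 2 transactions in 5 minutes (counting the new one).
    if len(recent(history, new.time, timedelta(minutes=5))) + 1 > 2:
        hits.append("more_than_2_in_5min")

    # Rule 3: the previous transaction used the magnetic stripe and the new
    # one is an Internet transaction.
    earlier = [t for t in history if t.time < new.time]
    previous = max(earlier, key=lambda t: t.time, default=None)
    if (previous is not None and previous.entry_mode == "magstripe"
            and new.trx_type == "Internet"):
        hits.append("magstripe_then_internet")

    return hits


start = datetime(2013, 1, 1, 12, 0)  # arbitrary example date
history = [Trx(start + timedelta(minutes=10 * i), "ATM", "chip") for i in range(4)]
fifth_atm = Trx(start + timedelta(minutes=50), "ATM", "chip")
print(triggered_rules(history, fifth_atm))
```

Rules like these are easy to audit, but each threshold is fixed by hand and ignores the amount at stake.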
  • 31. If-Then rules (Expert rules) • Misclassification: 1.04% • Recall: 31% • Precision: 17% • F1-Score: 22%
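As a quick consistency check on the figures of slide 31, the F1-score is the harmonic mean of precision and recall, so the reported 22% follows from the reported 17% and 31%:

```latex
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    = \frac{2 \cdot 0.17 \cdot 0.31}{0.17 + 0.31}
    = \frac{0.1054}{0.48}
    \approx 0.22
```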
  • 32. Financial evaluation • Credit card fraud detection is a cost-sensitive problem, because the cost of a false positive differs from the cost of a false negative. • False positives: when a transaction is predicted as fraudulent but is in fact legitimate, the financial institution incurs an administrative cost. • False negatives: when a fraud goes undetected, the amount of that transaction is lost. • Moreover, it is not enough to assume a constant cost difference between false positives and false negatives, because transaction amounts vary considerably.
  • 33. Financial evaluation: cost matrix
      Cost matrix for transaction i:
                                     Actual Positive (y_i = 1)    Actual Negative (y_i = 0)
      Predicted Positive (c_i = 1)   C_{TP_i} = C_a               C_{FP_i} = C_a
      Predicted Negative (c_i = 0)   C_{FN_i} = Amt_i             C_{TN_i} = 0
      Total cost:
      $Cost(f(S)) = \sum_{i=1}^{N} \Big( y_i \big( c_i C_{TP_i} + (1 - c_i) C_{FN_i} \big) + (1 - y_i) \big( c_i C_{FP_i} + (1 - c_i) C_{TN_i} \big) \Big)$
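The cost measure of slide 33 translates almost term by term into Python. The sketch below is not the author's implementation: the function name `total_cost`, its argument names and the example figures are assumptions made for illustration, and C_a is read here as the administrative cost mentioned on slide 32.

```python
def total_cost(y_true, y_pred, amounts, admin_cost):
    """Sum the example-dependent cost over all transactions.

    y_true[i]  -- 1 if transaction i is a fraud, 0 otherwise (y_i)
    y_pred[i]  -- 1 if transaction i is flagged as fraud, 0 otherwise (c_i)
    amounts[i] -- amount of transaction i (Amt_i)
    admin_cost -- cost of reviewing a flagged transaction (C_a)
    """
    cost = 0.0
    for y, c, amt in zip(y_true, y_pred, amounts):
        c_tp, c_fp, c_fn, c_tn = admin_cost, admin_cost, amt, 0.0
        cost += y * (c * c_tp + (1 - c) * c_fn) + (1 - y) * (c * c_fp + (1 - c) * c_tn)
    return cost


# Made-up example: one caught fraud, one missed fraud, one false alarm,
# one correctly ignored legitimate transaction.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
amounts = [100, 250, 40, 10]

model_cost = total_cost(y_true, y_pred, amounts, admin_cost=5)
do_nothing_cost = total_cost(y_true, [0, 0, 0, 0], amounts, admin_cost=5)
print(model_cost, do_nothing_cost)
```

With these made-up numbers the predictions cost 260 (5 + 250 + 5 + 0), while flagging nothing would cost the two fraud amounts, 350. Two classifiers with the same error rate can therefore differ widely in cost, depending on which transactions they get wrong.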
  • 34. If-Then rules (Expert rules) • Cost: 1.24 € • Total losses: 1.94 € • Misclassification: 1.04% • Recall: 31% • Precision: 17% • F1-Score: 22%
  • 36. Fraud Analytics is the use of statistical and mathematical techniques (Machine Learning) to discover patterns in data in order to make predictions
  • 37. Features: raw features
      Attribute name     Description
      Transaction ID     Transaction identification number
      Time               Date and time of the transaction
      Account number     Identification number of the customer
      Card number        Identification of the credit card
      Transaction type   e.g. Internet, ATM, POS, ...
      Entry mode         e.g. chip and PIN, magnetic stripe, ...
      Amount             Amount of the transaction in euros
      Merchant code      Identification of the merchant type
      Merchant group     Merchant group identification
      Country            Country of the transaction
      Country 2          Country of residence
      Type of card       e.g. Visa debit, Mastercard, American Express, ...
      Gender             Gender of the card holder
      Age                Card holder age
      Bank               Issuer bank of the card
  • 38. Features: transaction aggregation strategy
      Raw features:
      TrxId   Time         Type   Country   Amt
      1       1/1 18:20    POS    Lux       250
      2       1/1 20:35    POS    Lux       400
      3       1/1 22:30    ATM    Lux       250
      4       2/1 00:50    POS    Ger       50
      5       2/1 19:18    POS    Ger       100
      6       2/1 23:45    POS    Ger       150
      7       3/1 06:00    POS    Lux       10
      Aggregated features (A = No. trx last 24h, B = Amt last 24h, C = No. trx last 24h same type and country, D = Amt last 24h same type and country):
      TrxId   A   B     C   D
      1       0   0     0   0
      2       1   250   1   250
      3       2   650   0   0
      4       3   900   0   0
      5       3   700   1   50
      6       2   150   2   150
      7       3   400   0   0
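The aggregation on slide 38 can be reproduced with a short Python sketch. The assumptions are loud ones: all seven rows are taken to belong to one card, the dates are read as day/month with an arbitrary year, the window is a strict 24 hours before each transaction, and the names `RAW` and `aggregate` are made up for this example.

```python
from datetime import datetime, timedelta

# (trx_id, time, type, country, amount); the year 2013 is an arbitrary choice.
RAW = [
    (1, datetime(2013, 1, 1, 18, 20), "POS", "Lux", 250),
    (2, datetime(2013, 1, 1, 20, 35), "POS", "Lux", 400),
    (3, datetime(2013, 1, 1, 22, 30), "ATM", "Lux", 250),
    (4, datetime(2013, 1, 2, 0, 50), "POS", "Ger", 50),
    (5, datetime(2013, 1, 2, 19, 18), "POS", "Ger", 100),
    (6, datetime(2013, 1, 2, 23, 45), "POS", "Ger", 150),
    (7, datetime(2013, 1, 3, 6, 0), "POS", "Lux", 10),
]


def aggregate(transactions, window=timedelta(hours=24)):
    """For each transaction, count and sum the earlier ones inside the window."""
    rows = []
    for trx_id, time, trx_type, country, _amount in transactions:
        # Earlier transactions that fall inside the look-back window.
        prior = [t for t in transactions if time - window <= t[1] < time]
        # Subset sharing the transaction type and the country.
        same = [t for t in prior if t[2] == trx_type and t[3] == country]
        rows.append((trx_id,
                     len(prior), sum(t[4] for t in prior),
                     len(same), sum(t[4] for t in same)))
    return rows


for row in aggregate(RAW):
    print(row)
```

Under these assumptions the sketch reproduces rows 1 to 6 of the slide. For row 7 a strict 24-hour window yields 2 transactions totalling 250 rather than the 3 and 400 shown, so the slide may rely on a different window or on times that did not survive in the transcript.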
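  • 39. Periodic features • When is a customer expected to make a new transaction? • Consider the time of the transaction as following a von Mises distribution with a period of 24 hours, such that $P(time) \sim vonmises(\mu, \sigma) = \frac{e^{\sigma \cos(time - \mu)}}{2 \pi I_0(\sigma)}$, where $\mu$ is the mean, $\sigma$ controls the spread (the slide calls it the standard deviation; in this form it acts as a concentration parameter, so larger values give a narrower peak), and $I_0$ is the modified Bessel function of the first kind of order 0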
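A small Python sketch shows why a periodic treatment matters for the time of day on slide 39. It is an illustration only: the helper names are hypothetical, the Bessel function is approximated by its power series to keep the example free of dependencies, and the second parameter is called `kappa` to stress that it acts as a concentration.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            term *= (x / 2.0) ** 2 / (k * k)  # ratio between consecutive terms
        total += term
    return total


def hour_to_angle(hour):
    """Map an hour of the day in [0, 24) to an angle in radians."""
    return 2.0 * math.pi * hour / 24.0


def circular_mean_hour(hours):
    """Mean hour of the day computed on the circle rather than on the line."""
    s = sum(math.sin(hour_to_angle(h)) for h in hours)
    c = sum(math.cos(hour_to_angle(h)) for h in hours)
    return (math.atan2(s, c) % (2.0 * math.pi)) * 24.0 / (2.0 * math.pi)


def von_mises_pdf(hour, mean_hour, kappa):
    """Von Mises density (per radian) of a transaction at `hour`."""
    delta = hour_to_angle(hour) - hour_to_angle(mean_hour)
    return math.exp(kappa * math.cos(delta)) / (2.0 * math.pi * bessel_i0(kappa))


# A customer who usually transacts around midnight.
hours = [23.0, 1.0]
print(sum(hours) / len(hours))    # arithmetic mean: noon, which is misleading
print(circular_mean_hour(hours))  # circular mean: midnight (0, or 24 up to rounding)
```

The arithmetic mean of 23:00 and 01:00 is noon, while the circular mean is midnight; a density centered on the circular mean then gives a high value to a 00:30 transaction and a low one to a 12:00 transaction.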
  • 41. [Scatter plot: amount of the transaction vs. number of transactions last day; legend: normal transaction, fraud]
  • 42. [Scatter plot: amount of the transaction vs. number of transactions last day; legend: normal transaction, fraud]
  • 43. [Plot: amount of the transaction, number of transactions last day and number of ATM transactions last week; legend: normal transaction, fraud]
  • 44. Fraud Analytics Algorithms • Fuzzy Rules • Neural Nets • Naive Bayes • *Random Forests • RF – with Cost-Proportionate Rejection Sampling • *Cost-Sensitive Random Patches • Decision Trees
  • 45. Decision Trees • A decision tree is a classification model that iteratively creates binary decision rules that maximize certain criteria (Gini, entropy, …). • X1 = amount of the transaction, X2 = number of transactions last day • [Tree diagram: the initial node splits into X2<10 and X2≥10, with further splits on X1<50 / X1≥50, X1<100 / X1≥100 and X2<15 / X2≥15]
  • 46. Random Forests • A Random Forest is made by combining many different decision trees, each one trained on a random subset of the initial dataset.
  • 47. Random Forests & Random Patches • [Figure: a training set compared with the samples drawn from it under bagging, random forest and random patches]
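The difference between the sampling schemes named on slide 47 can be sketched as follows. This is a simplified illustration, not the figure's exact procedure: the function names and the 50% fractions are hypothetical, and the random forest case is left out because it draws rows as in bagging and picks its random features inside each tree, at every split.

```python
import random


def bagging_sample(n_rows, n_features, rng):
    """Bagging: draw rows with replacement, keep every feature."""
    rows = [rng.randrange(n_rows) for _ in range(n_rows)]
    return rows, list(range(n_features))


def random_patch(n_rows, n_features, rng, row_frac=0.5, feature_frac=0.5):
    """Random patches: draw a subset of rows and a subset of features."""
    n_sampled_rows = max(1, int(n_rows * row_frac))
    n_sampled_features = max(1, int(n_features * feature_frac))
    rows = [rng.randrange(n_rows) for _ in range(n_sampled_rows)]
    features = sorted(rng.sample(range(n_features), n_sampled_features))
    return rows, features


rng = random.Random(0)  # fixed seed so the example is repeatable
print(bagging_sample(8, 4, rng))
print(random_patch(8, 4, rng))
```

Each base tree of the ensemble would then be trained only on the rows and feature columns of its own sample.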
  • 48. Cost-Sensitive Decision Trees • Standard decision trees create rules that maximize either the Gini or the entropy measure • However, this assumes that all misclassification errors carry the same cost • This is not true in fraud detection • Instead, the cost-sensitive decision tree minimizes the cost of each rule, $Cost(f(node))$ • [Tree diagram: splits on X1 (amount of the transaction) and X2 (number of transactions last day)]
  • 50. Conclusions • Fraud Analytics (ML) models are significantly better than expert rules • Models should be evaluated taking into account the real financial costs of the application • Algorithms should be developed to incorporate those financial costs
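One way to read "minimizes the cost of each rule" on slide 48 is to score a node by the cheaper of its two possible constant predictions and to pick the split with the lowest summed cost. The sketch below follows that reading as an assumption; the slide does not spell out the criterion, and the function names, the data and the administrative cost of 5 are invented for the example. Costs follow the matrix of slide 33: a flagged transaction costs the administrative cost, a missed fraud costs its amount.

```python
def node_cost(labels, amounts, admin_cost):
    """Cost of a leaf that predicts one class for all of its transactions."""
    # Predict "legitimate" everywhere: every fraud in the node is missed.
    cost_predict_legit = sum(amt for y, amt in zip(labels, amounts) if y == 1)
    # Predict "fraud" everywhere: every transaction is reviewed.
    cost_predict_fraud = admin_cost * len(labels)
    return min(cost_predict_legit, cost_predict_fraud)


def best_split(values, labels, amounts, admin_cost):
    """Return (threshold, cost) of the cheapest split `value < threshold`.

    The threshold is None when no split beats the unsplit node.
    """
    best = (None, node_cost(labels, amounts, admin_cost))
    for threshold in sorted(set(values))[1:]:
        left = [i for i, v in enumerate(values) if v < threshold]
        right = [i for i, v in enumerate(values) if v >= threshold]
        cost = sum(
            node_cost([labels[i] for i in side], [amounts[i] for i in side], admin_cost)
            for side in (left, right)
        )
        if cost < best[1]:
            best = (threshold, cost)
    return best


# X2 = number of transactions last day, with made-up labels and amounts.
x2 = [1, 2, 3, 12, 15, 20]
is_fraud = [0, 0, 0, 1, 1, 0]
amount = [20, 30, 10, 500, 300, 40]
print(best_split(x2, is_fraud, amount, admin_cost=5))
```

Here the unsplit node costs 30 (reviewing all six transactions), and the split X2 < 12 lowers that to 15 by reviewing only the three busiest cases.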
  • 52. Questions? • Alejandro Correa Bahnsen, PhD • Data Scientist • acorrea@Easysol.net

Editor's notes

  1. Analytics at work. Davenport 2010.
  2. In 2015, the Internet and mobile technology have solidified their status in Latin America as the most popular channels for banking operations, payments and purchases. Bank branches continue to lose usage, and fewer than 30% of users regularly use traditional channels such as ATMs or voice response systems. Users of Internet transactions clearly prefer to eliminate cash from their transactions as much as possible, and credit card use appears to follow this trend, because users increasingly prefer to handle their operations on computers and mobile devices. Although the use of mobile devices for financial operations continues to grow, users are still reluctant to use these devices in the same way as their computers, even though they are more convenient because they can be carried everywhere. The Internet remains the most frequently used channel, with an average of 3.8 uses per person per month. Anecdote about banks in Colombia and their queues.
  3. And users are right to be so cautious. A study conducted by Arxan Technologies says that 95% of the top financial mobile applications for Android (and 70% of those for iOS) have been hacked. In 2014, Trend Micro found that 77% of the 50 most downloaded free applications on Google Play had fake versions, making it very difficult for users to tell which ones are authentic and which are fraudulent.
  4. Analyzing the views and opinions of those who regularly use the Internet for banking and purchases is of great importance when designing a strategy that tries to exploit the full potential of this channel. It is equally important, however, to examine those users who, for a variety of reasons, do not use the Internet for finance or e-commerce. The main reason these users gave for not taking advantage of online banking services was fear of electronic fraud. If we consider that online banking portals offer users greater convenience and that their lower operating costs benefit the institutions, then it is imperative that banks keep investigating ways to promote the adoption of electronic banking channels. Fraud prevention in these channels is not only a way to prevent economic losses and protect the reputation of the institutions; strong fraud protection can also give previously skeptical users the confidence they need to incorporate these channels into their normal banking routine, and give banks with higher adoption rates a wider competitive advantage.
  5. Analytics at work. Davenport 2010.
  6. http://tagul.com/
  7. The famous French general didn’t even live in the information age, and yet he attributed most of his military success to having the right information. When you’re battling for a competitive advantage in business, analytics data can be equally important to your success.
  8. http://www.kurzweilai.net/googles-self-driving-car-gathers-nearly-1-gbsec
  9. http://www.visualnews.com/2012/06/19/how-much-data-created-every-minute/?view=infographic
  10. http://www.visualnews.com/2012/06/19/how-much-data-created-every-minute/?view=infographic
  11. http://www.visualnews.com/2012/06/19/how-much-data-created-every-minute/?view=infographic
  12. http://www.visualnews.com/2012/06/19/how-much-data-created-every-minute/?view=infographic
  13. The famous French general didn’t even live in the information age, and yet he attributed most of his military success to having the right information. When you’re battling for a competitive advantage in business, analytics data can be equally important to your success.