Fraud Detection in Real-time @ Apache Big Data con

This is the slide deck used for the talk I gave at #apachebigdata.

Download whitepaper at: http://wso2.com/whitepapers/fraud-detection-and-prevention-a-data-analytics-approach/

Published in: Data & Analytics

  1. Seshika Fernando, Technical Lead
     Catch them in the Act: Fraud Detection in Real-time
  2. Fraud: A Trillion Dollar Problem
     Survey results
     - $3.5 – 4 Trillion in Global Losses per year (5% of Global GDP)
     Payment Fraud Only
     - Merchants are losing around $250B globally
     - Cost of Fraud is around 0.68% of Revenue for Retailers (2014)
     - Steep rise in Fraud in eCommerce (0.85% of Revenue) and mCommerce (1.36% of Revenue) with a movement of payments to newer channels
  3. Why WSO2 Analytics Platform?
     [Diagram labels: Domain Knowledge, Batch Analytics, Interactive Analytics, Real-time Analytics, Predictive Analytics, Fraud Detection Toolkit]
  4. Solution: Many Ways
     Fraud = Anomaly
     We provide many methods of Anomaly Detection in order to capture known and unknown types of fraudulent behavior
     - Generic Rules
     - Fraud Scoring
     - Advanced Techniques
     Capturing anomalous behavior using mathematical modelling
  5. Capturing Domain Expertise
     An example from Payment Fraud Domain
     Fraudsters…
     - Use stolen cards
     - Buy Expensive stuff
     - In Large Quantities
     - Very quickly
     - At odd hours
     - Ship to many places
     - Provide weird email addresses
     CEP Queries
  6. Generic Rules
     Convert all pre-existing knowledge about Fraudulent Behavior within a domain to Generic Rules
     - Blacklists/Whitelists
     - Moving Averages
     - Known Patterns
     - Outliers
  7. Queries for Expensive Purchases
       define table PremiumProducts (itemNo string);

       from TransactionStream[(itemNo == PremiumProducts.itemNo) in PremiumProducts]
       select *
       insert into FraudStream;
  8. Queries for Large Quantities
       define table QuantityAverages (itemNo string, avgQty int, stdevQty int);

       from TransactionStream[(itemNo == av.itemNo and qty > (av.avgQty + 3 * av.stdevQty)) in QuantityAverages as av]
       select *
       insert into FraudStream;
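The rule in this query flags a transaction when its quantity exceeds the item's average by more than three standard deviations. A minimal Python sketch of the same check, not the Siddhi implementation: the `QuantityStats` class, the `quantity_averages` dictionary, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuantityStats:
    """Per-item statistics, mirroring the avgQty and stdevQty columns."""
    avg_qty: float
    stdev_qty: float


# Hypothetical stand-in for the QuantityAverages table, keyed by itemNo.
quantity_averages = {"item-42": QuantityStats(avg_qty=2.0, stdev_qty=1.0)}


def is_quantity_outlier(qty, stats, k=3.0):
    """True when qty is more than k standard deviations above the average."""
    return qty > stats.avg_qty + k * stats.stdev_qty


def is_suspicious(txn):
    """Apply the rule to one transaction; items without statistics pass."""
    stats = quantity_averages.get(txn["itemNo"])
    return stats is not None and is_quantity_outlier(txn["qty"], stats)
```

With these sample values the cut-off is 2 + 3 * 1 = 5, so a quantity of 6 is flagged and a quantity of 5 is not.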
  9. Queries for Large Quantities (Learning)
       define table QuantityAverages (itemNo string, avgQty int, stdevQty int);

       from TransactionStream#window.time(8 hours)
       select itemNo, avg(qty) as avgQty, stdev(qty) as stdevQty
       group by itemNo
       update QuantityAverages as av on itemNo == av.itemNo;

       from TransactionStream[(itemNo == av.itemNo and qty > (av.avgQty + 3 * av.stdevQty)) in QuantityAverages as av]
       select *
       insert into FraudStream;
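The learning step keeps the per-item average and standard deviation up to date over an 8 hour time window. A rough Python sketch of that idea follows; it is not how Siddhi implements its time window. The `QuantityLearner` class is hypothetical, timestamps are assumed to be seconds arriving in non-decreasing order, and the population standard deviation (`statistics.pstdev`) is an assumption about which variant is intended.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev

WINDOW_SECONDS = 8 * 60 * 60  # the 8 hour window from the query


class QuantityLearner:
    """Keeps recent (timestamp, qty) pairs per item and derives statistics."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # itemNo -> deque of (ts, qty)

    def observe(self, item_no, qty, ts):
        """Record a transaction and evict events that left the window."""
        events = self._events[item_no]
        events.append((ts, qty))
        while events and events[0][0] <= ts - self.window_seconds:
            events.popleft()

    def stats(self, item_no):
        """Return (average, standard deviation), or None if nothing is known."""
        quantities = [qty for _, qty in self._events.get(item_no, ())]
        if not quantities:
            return None
        return mean(quantities), pstdev(quantities)
```

The tuple returned by `stats` plays the role of one row of the `QuantityAverages` table.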
  10. Queries for Transaction Velocity
        from e1 = TransactionStream -> e2 = TransactionStream[e1.cardNo == e2.cardNo] <3:>
        within 5 min
        select e1.cardNo, e1.txnID, e2[0].txnID, e2[1].txnID, e2[2].txnID
        insert into FraudStream;
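The pattern above matches a transaction followed by at least three more on the same card within 5 minutes. The sketch below approximates that with a per-card sliding window in Python. It does not reproduce Siddhi's pattern-matching semantics exactly; the `VelocityDetector` class is hypothetical and timestamps are assumed to be seconds in non-decreasing order.

```python
from collections import defaultdict, deque


class VelocityDetector:
    """Flags a card once it has 1 + min_followups transactions in the window."""

    def __init__(self, min_followups=3, window_seconds=5 * 60):
        self.min_count = 1 + min_followups
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # cardNo -> deque of (ts, txnID)

    def observe(self, card_no, txn_id, ts):
        """Record a transaction; return the burst's txn IDs or None."""
        recent = self._recent[card_no]
        recent.append((ts, txn_id))
        # Drop transactions older than the window relative to this event.
        while recent and recent[0][0] < ts - self.window_seconds:
            recent.popleft()
        if len(recent) >= self.min_count:
            return [tid for _, tid in recent]
        return None
```

A non-`None` return value corresponds to an event that would be inserted into `FraudStream`.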
  11. The False Positive Trap
      - So what if I buy Expensive stuff (Rich guy)
      - And why can’t I buy a lot (Gift giver)
      - Very Quickly (Busy man)
      - At odd hours (Night owl)
      - Ship to many places (Many girlfriends?)
      Blocking genuine customers could be counterproductive and costly
  12. Fraud Scoring
      - Use combinations of rules
      - Give weights to each rule
      - Derive a single number that reflects many fraud indicators
      - Use a threshold to reject transactions
      - You just bought a Diamond Ring?
      - You bought 20 Diamond Rings, in 15 minutes at 3am from a blacklisted IP address?
  13. Fraud Scoring
      Score = 0.001 * itemPrice
            + 0.1 * itemQuantity
            + 2.5 * isFreeEmail
            + 5 * riskyCountry
            + 8 * suspiciousIPRange
            + 5 * suspiciousUsername
            + 3 * highTransactionVelocity
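The score is a weighted sum of fraud indicators that is then compared with a threshold. A minimal Python sketch using the weights from the slide: the `REJECT_THRESHOLD` value of 10, the ring price of 5000, and the treatment of indicator flags as 0 or 1 are assumptions for illustration, not values from the talk.

```python
# Weights taken from the scoring formula on the slide.
WEIGHTS = {
    "itemPrice": 0.001,
    "itemQuantity": 0.1,
    "isFreeEmail": 2.5,
    "riskyCountry": 5,
    "suspiciousIPRange": 8,
    "suspiciousUsername": 5,
    "highTransactionVelocity": 3,
}

REJECT_THRESHOLD = 10.0  # hypothetical cut-off, not from the talk


def fraud_score(features):
    """Weighted sum of indicators; missing indicators count as 0."""
    return sum(w * float(features.get(name, 0)) for name, w in WEIGHTS.items())


def should_reject(features, threshold=REJECT_THRESHOLD):
    """True when the combined score exceeds the threshold."""
    return fraud_score(features) > threshold


one_ring = {"itemPrice": 5000, "itemQuantity": 1}
many_rings = {
    "itemPrice": 5000,
    "itemQuantity": 20,
    "suspiciousIPRange": 1,
    "highTransactionVelocity": 1,
}
```

Under these assumptions a single expensive purchase scores about 5.1 and passes, while the 20-ring burst from a suspicious IP range scores about 18 and is rejected.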
  14. Learn from Data
      Utilize Machine Learning Techniques to identify ‘unknown’ point anomalies
      K-means Clustering
  15. Markov Models for Fraud Detection
      Use Markov Models to discover fraudulent behavior through rare activity sequences
      Markov Models are stochastic models used to model randomly changing systems
  16. Markov Modelling: Process
      [Process diagram labels: Events, Classify Events, Update Probability Matrix, Probability Matrix, Compare Incoming Sequences, Alerts]
  17. Markov Model: Classification
      Example: Each transaction is classified under the following three qualities and expressed as a 3-letter token, e.g., HNN
      - Amount spent: Low, Normal and High
      - Whether the transaction includes a high-ticket item: Normal and High
      - Time elapsed since the last transaction: Large, Normal and Small
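The classification step turns each transaction into a 3-letter token. The Python sketch below shows one way this could look. All four threshold constants and the `classify` function are hypothetical, since the talk does not say where the Low/Normal/High or Large/Normal/Small boundaries lie.

```python
# Hypothetical boundaries; the talk does not specify them.
LOW_AMOUNT = 20.0          # below this, amount is "L"
HIGH_AMOUNT = 1000.0       # above this, amount is "H"
SMALL_GAP_SECONDS = 600    # below this, elapsed time is "S"
LARGE_GAP_SECONDS = 86400  # above this, elapsed time is "L"


def classify(amount, has_high_ticket_item, seconds_since_last):
    """Return a 3-letter token: amount, high-ticket flag, elapsed time."""
    if amount < LOW_AMOUNT:
        amount_code = "L"
    elif amount > HIGH_AMOUNT:
        amount_code = "H"
    else:
        amount_code = "N"

    ticket_code = "H" if has_high_ticket_item else "N"

    if seconds_since_last > LARGE_GAP_SECONDS:
        gap_code = "L"
    elif seconds_since_last < SMALL_GAP_SECONDS:
        gap_code = "S"
    else:
        gap_code = "N"

    return amount_code + ticket_code + gap_code
```

With these boundaries, a 5000 purchase without a high-ticket item, made an hour after the previous transaction, yields the token `HNN` used as the example on the slide.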
  18. Markov Models: Probability Matrix
      - Create a State Transition Probability Matrix

             LNL       LNH       LNS       LHL       HHL       HHS       HNS
        LNL  0.976788  0.542152  0.20706   0.095459  0.007166  0.569172  0.335481
        LNH  0.806876  0.609425  0.188628  0.651126  0.113801  0.630711  0.099825
        LNS  0.07419   0.83973   0.951471  0.156532  0.12045   0.201713  0.970792
        LHL  0.452885  0.634071  0.328956  0.786087  0.676753  0.063064  0.225353
        HHL  0.386206  0.255719  0.451524  0.469597  0.810013  0.444638  0.612242
        HHS  0.204606  0.832722  0.043194  0.459342  0.960486  0.796382  0.34544
        HNS  0.757737  0.371359  0.326846  0.970243  0.771326  0.015835  0.574333
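A state transition probability matrix can be estimated by counting how often one token follows another and normalizing each row so that it sums to 1. The Python sketch below shows that counting approach. It is an assumption about how such a matrix might be built, not the method used in the talk, and the `transition_matrix` function and sample sequences are illustrative.

```python
from collections import defaultdict


def transition_matrix(sequences):
    """Estimate P(next | current) from lists of state tokens."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        # Count every adjacent (current, next) pair in the sequence.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix


# Illustrative per-card token sequences.
history = [["LNL", "LNL", "LNS"], ["LNL", "HHS"]]
matrix = transition_matrix(history)
```

Here `LNL` is followed once each by `LNL`, `LNS`, and `HHS`, so each of those transitions gets probability 1/3.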
  19. Markov Models: Probability Comparison
      - Compare the probabilities of incoming transaction sequences with thresholds and flag fraud as appropriate
      - Can use direct probabilities or more complex metrics
        - Miss Rate Metric
        - Miss Probability Metric
        - Entropy Reduction Metric
      - Update Markov Probability table with incoming transactions
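For the "direct probabilities" option, one simple reading is to multiply the transition probabilities along an incoming sequence and flag the sequence when the product falls below a threshold. The Python sketch below illustrates only that reading and does not cover the Miss Rate, Miss Probability, or Entropy Reduction metrics. The sample matrix, the threshold of 0.05, and the floor probability for unseen transitions are all assumptions.

```python
# Hypothetical two-state matrix; each row sums to 1.
SAMPLE_MATRIX = {
    "LNL": {"LNL": 0.9, "HHS": 0.1},
    "HHS": {"LNL": 0.7, "HHS": 0.3},
}


def sequence_probability(seq, matrix, floor=1e-6):
    """Product of transition probabilities along seq.

    Transitions missing from the matrix get a small floor probability
    (an assumption) so that a sequence never scores exactly zero.
    """
    probability = 1.0
    for current, nxt in zip(seq, seq[1:]):
        probability *= matrix.get(current, {}).get(nxt, floor)
    return probability


def is_rare(seq, matrix, threshold=0.05):
    """True when the sequence is less likely than the threshold."""
    return sequence_probability(seq, matrix) < threshold
```

With the sample matrix, `["LNL", "LNL", "LNL"]` has probability 0.81 and is not flagged, while `["LNL", "HHS", "HHS"]` has probability 0.03 and is.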
  20. Dig Deeper
      Access historical data using
      - expressive querying
      - easy filtering
      - useful visualizations
      to isolate incidents and unearth connections
  21. Use case: Payment Fraud
      [Architecture diagram labels: Transactions, Payment System, Batch Analytics, Interactive Analytics, Real-time Analytics, Predictive Analytics, Alerts, Dashboard]
  22. Use case: Anti Money Laundering
      [Architecture diagram labels: Bank Txns, Core Banking System, Batch Analytics, Interactive Analytics, Real-time Analytics, Predictive Analytics, Alerts, Dashboard]
  23. Use case: Identity Fraud
      [Architecture diagram labels: Events, Batch Analytics, Interactive Analytics, Real-time Analytics, Predictive Analytics, Alerts, Dashboard]
  24. References
      - WSO2 Whitepaper on Fraud Detection: http://wso2.com/whitepapers/fraud-detection-and-prevention-a-data-analytics-approach/
      - True Cost of Fraud 2014: http://www.lexisnexis.com/risk/downloads/assets/true-cost-fraud-2014.pdf
      - Stop Billions in Fraud Losses using Machine Learning: https://www.forrester.com/Stop+Billions+In+Fraud+Losses+With+Machine+Learning/fulltext/-/E-res120912
      - Big Data In Fraud Management: Variety Leads To Value And Improved Customer Experience: https://www.forrester.com/Big+Data+In+Fraud+Management+Variety+Leads+To+Value+And+Improved+Customer+Experience/fulltext/-/E-RES103841
      - Predictions 2015: Identity Management, Fraud Management, And Cybersecurity Converge: https://www.forrester.com/Predictions+2015+Identity+Management+Fraud+Management+And+Cybersecurity+Converge/fulltext/-/E-RES120014
      - Markov Modelling for Fraud Detection: https://pkghosh.wordpress.com/2013/10/21/real-time-fraud-detection-with-sequence-mining/
  25. Contact us!
