Fin Crime detection using autoencoders
Liubomyr Bregman
Richard Bobek
L'viv, Ukraine
3 Nov 2018
AI & Big Data Day
Lessons learnt
1. Skilled operators
2. Less opportunity for “insider” or “opportunistic” attack
3. Need for ‘out-of-band’ systems for notifications
PwC 2
Global Payment Fraud – lessons learnt and investigation highlights
Lessons learnt
1. Dedicated scenarios within FIs
2. Leverage the convergence between cyber, fraud and ML
3. Leverage advanced analytics – evolving threat landscape
"If Hollywood releases another iteration of the 'Oceans 11' franchise, they should base it on the recent attack against the Central Bank of Bangladesh (BB)"*
* BAE Systems
https://en.wikipedia.org/wiki/Bangladesh_Bank_robbery
Bangladesh cyber heist
The attackers attempted to steal $951m in 35 separate fraudulent transactions. 30 orders (worth $850m) were stopped by the US Fed, but 5 orders (worth $101m) went through. Of these, $20m was blocked by a recipient bank in Sri Lanka.
Vietnam Swift fraud attempt
Further analysis of the Bangladesh cyber heist led to the conclusion that the same attackers appear to have struck previously, using similar tools written to target a bank in Vietnam just a couple of months before the Bangladesh attack.
There are many “creative” new strategies in financial crime
Some of the known financial crime strategies:
• Cheque fraud
• Credit card fraud
• Mortgage fraud
• Medical fraud
• Corporate fraud
• Securities fraud (including insider trading)
• Bank fraud
• Insurance fraud
• Market manipulation
• Payment (point of sale) fraud
• Health care fraud
• Theft
• Scams or confidence tricks
• Tax evasion
• Bribery
• Embezzlement
• Identity theft
• Money laundering
• Forgery and counterfeiting
PwC 3
There are many fraud detection and AML software products on the market
By type, the financial fraud detection software market can be split into:
• Anti Money Laundering Detection Software
• Identity Theft Detection Software
• Credit/Debit Card Fraud Detection Software
• Wire Transfer Fraud Detection Software
• Others
PwC 4
Traditionally, fin crime is approached by reporting, expert knowledge and assessment
The steps are usually:
1. A report of alerts is generated by a rule decision engine*. This report shows the transactions / clients detected by (usually) orthogonal rules.
2. Experts assess the alerts and decide on appropriate action, for example an investigation of the client's activities.
* Historically, these can be large financial abuse management systems, transaction monitoring systems, in-house scripts, etc.
PwC 5
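The first step above (a rule decision engine producing a report of alerts) can be sketched as follows. This is a minimal illustration only: the `Transaction` fields, the two rules and their thresholds are hypothetical and not taken from any real monitoring system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Transaction:
    client_id: str
    amount: float
    country: str


@dataclass(frozen=True)
class Rule:
    name: str
    fires: Callable[[Transaction], bool]


# Hypothetical rules and thresholds, for illustration only.
RULES = [
    Rule("large_amount", lambda t: t.amount >= 10_000),
    Rule("high_risk_country", lambda t: t.country in {"XX", "YY"}),
]


def generate_alerts(transactions, rules=RULES) -> List[Tuple[Transaction, List[str]]]:
    """Return (transaction, names of rules that fired) for every transaction with a hit."""
    report = []
    for t in transactions:
        hits = [rule.name for rule in rules if rule.fires(t)]
        if hits:
            report.append((t, hits))
    return report


transactions = [
    Transaction("c1", 120.0, "DE"),
    Transaction("c2", 15_000.0, "DE"),
    Transaction("c3", 50.0, "XX"),
]
alerts = generate_alerts(transactions)
for t, hits in alerts:
    print(t.client_id, hits)
```

The report is then what the experts in step 2 work through; every rule that fired is kept so the investigator can see why the alert was raised.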
Most financial institutions struggle with similar problems in detecting financial crime
• Huge streams of data
• Scenarios are far from perfect
• Fraud schemes keep developing
• Investigations are costly
• Scenarios are limited in number and costly
• More data science is better
PwC 6
Machine learning approaches aim to increase the automation and recall of the process
Rule based / expert based
Supervised (investigation needed)
1. Optimal rules
   a) Rule-based model creation
   b) Threshold optimization
   c) Rule optimization
   d) Alert prioritization
2. Deep learning approaches
   a) Pattern discovery
Unsupervised
1. Segmentation
2. Anomaly detection
3. Semi-supervised approach
PwC 7
The key problem is the unbalanced dataset (and some terminology)
600M transactions → 2M alerts → 2K SARs
0.1% true positives and 99.9% false positives
Only 13 scenarios
Around 90 features
~12 segments
~700 thresholds
PwC 8
More precise numbers from past projects in different banks

Benchmarking for alert volumes: alerting and escalation

Peer          Customers   L1 (Alerts)  Alert Rate (L1/Cust.)  L2 (Cases)  L1 to L2 Rate  L3 (SAR-Rec)  L2 to L3 Rate  SAR Rate (SARs/Cust.)
Peer 1        55,000,000  320,000      0.58%                  32,000      10%            6,400         20%            0.012%
Peer 2        4,500,000   60,500       1.34%                  6,340       10%            3,340         53%            0.074%
Peer 3        9,900,000   148,000      1.49%                  40,000      27%            670           2%             0.007%
Peer 4        40,000,000  50,000       0.13%                  12,000      24%            375           3%             0.001%
Peer Average:                          0.89%                              17.88%                       19.37%

Benchmarking for AML TM investigations

Peer    Number of FTEs  Annual spend  Maturity (0 (low) to 3 (high))
Peer 1  12,000          $2bn          2+
Peer 2  5,000           $800m         2
Peer 3  10,000          $1.2bn        3
Peer 4  210             $50m          0-1
Peer 5  150             $125m         1-2
Peer 6  2,500           $300m         1-2
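
The rates in the alerting and escalation benchmark can be recomputed from the raw counts. The sketch below uses only the customer, L1, L2 and L3 figures quoted for Peers 1 to 4; the dictionary and function names are illustrative.

```python
# (customers, L1 alerts, L2 cases, L3 SAR recommendations) per peer, as quoted above.
peers = {
    "Peer 1": (55_000_000, 320_000, 32_000, 6_400),
    "Peer 2": (4_500_000, 60_500, 6_340, 3_340),
    "Peer 3": (9_900_000, 148_000, 40_000, 670),
    "Peer 4": (40_000_000, 50_000, 12_000, 375),
}


def rates(customers, l1, l2, l3):
    """Funnel rates for one peer, as fractions (not percentages)."""
    return {
        "alert_rate": l1 / customers,   # L1 / customers
        "l1_to_l2": l2 / l1,            # share of alerts escalated to cases
        "l2_to_l3": l3 / l2,            # share of cases ending in a SAR recommendation
        "sar_rate": l3 / customers,     # SARs / customers
    }


per_peer = {name: rates(*row) for name, row in peers.items()}


def peer_average(key):
    """Unweighted mean of one rate across the peers."""
    return sum(r[key] for r in per_peer.values()) / len(per_peer)


for name, r in per_peer.items():
    print(name, {k: f"{v:.3%}" for k, v in r.items()})
print("Peer average alert rate:", f"{peer_average('alert_rate'):.2%}")
```

The peer averages are unweighted means of the per-peer rates, which is how the quoted 0.89%, 17.88% and 19.37% come out.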
PwC 9
What is normality
Normal or not?
PwC 10
How does it work: Normality
Normality is a measure of concentration (density), separated from anomaly by a sensitivity threshold
[Figure labels: Anomaly, Sensitivity, Density, Normal]
PwC 11
How does it work: Abnormality of anomalies
How far from normality? How far from other abnormalities?
[Figure labels: Abnormality of anomalies, Normal]
PwC 12
How does it work: Similarity of anomalies
Some anomalies are similar and form a separate cluster
Investigating one anomaly and finding fraud makes the other anomalies in the cluster more likely to be fraud
[Figure labels: Anomaly cluster, Similarity of anomalies, Normal]
PwC 13
How does it work: Stability of normal and anomalous patterns
When the definition of normality remains “stable” over time, the analytical set is considered “operational”
[Figure labels: Anomaly cluster, Normal]
PwC 14
PwC 15
There are many ways to detect anomalies
Non parametric
• Density-based techniques (k-nearest neighbor, local outlier factor, and many more variations of this concept).
• Fuzzy logic-based outlier detection.
• Cluster analysis-based outlier detection.
Parametric
• Subspace- and correlation-based outlier detection for high-dimensional data.
• Bayesian networks.
• Deviations from association rules and frequent item sets.
And more
• Ensemble techniques, using feature bagging, score normalization and different sources of diversity.
PwC 16
Non parametric: Density-based techniques
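One simple density-based score is the distance to the k-th nearest neighbour: points in sparse regions get a large score. The sketch below is a brute-force illustration on made-up 2-D points; `knn_scores` and the choice k=3 are assumptions, not the method used in the talk.

```python
import math


def knn_scores(points, k=3):
    """Anomaly score of each point = Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical data: a tight cluster of "normal" points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (5.0, 5.0)]
scores = knn_scores(points, k=3)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, scores)
```

Brute force is quadratic in the number of points, so at transaction volumes a spatial index or an approximate neighbour search would be needed.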
PwC 17
Parametric
Prediction
Error term is anomaly score
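A parametric detector fits a model and uses the prediction error as the anomaly score. The sketch below assumes the simplest case, a straight line fitted by least squares, on hypothetical numbers; any other predictive model could take its place.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: monthly income vs. monthly outgoing payments (in thousands).
income = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
payments = [0.9, 2.1, 2.9, 4.1, 9.0, 6.0]

intercept, slope = fit_line(income, payments)

# The error term (absolute residual) is the anomaly score.
scores = [abs(y - (intercept + slope * x)) for x, y in zip(income, payments)]
most_anomalous = scores.index(max(scores))
print(most_anomalous, [round(s, 2) for s in scores])
```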
PwC
Autoencoders
PwC 18
Autoencoders are a powerful method for anomaly detection in financial crime
What is this?
x → H → R
x: input (observation)
H: internal representation (neural network, hidden layer), H = f(x)
R: output (reconstruction), R = g(H) = g(f(x))
Approach to training
Target output = observed input: g(f(x)) ≈ x
Measure of quality
Loss function L(x, g(f(x))), e.g. RMSE
Traditionally used for anomaly detection & dimension reduction
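The encoder f, decoder g and RMSE loss can be made concrete with a very small network. The following is a minimal NumPy sketch under stated assumptions: synthetic data, one tanh hidden layer with 2 units, and plain full-batch gradient descent. The layer size, learning rate and step count are illustrative choices, not those of the case study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 observations of 4 features that lie near a 2-D plane.
latent = rng.normal(size=(500, 2))
X = latent @ rng.normal(size=(2, 4)) + 0.01 * rng.normal(size=(500, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each feature

n_in, n_hidden = X.shape[1], 2
W_enc = 0.1 * rng.normal(size=(n_in, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = 0.1 * rng.normal(size=(n_hidden, n_in))
b_dec = np.zeros(n_in)


def f(x):
    """Encoder: H = f(x)."""
    return np.tanh(x @ W_enc + b_enc)


def g(h):
    """Decoder: R = g(H)."""
    return h @ W_dec + b_dec


def rmse(x, r):
    return float(np.sqrt(np.mean((x - r) ** 2)))


rmse_before = rmse(X, g(f(X)))

lr = 0.01
for _ in range(3000):
    H = f(X)
    R = g(H)
    dR = (R - X) / len(X)          # gradient of the mean squared error w.r.t. R
    dH = dR @ W_dec.T
    dZ = dH * (1.0 - H ** 2)       # derivative of tanh
    W_dec -= lr * (H.T @ dR)
    b_dec -= lr * dR.sum(axis=0)
    W_enc -= lr * (X.T @ dZ)
    b_enc -= lr * dZ.sum(axis=0)

rmse_after = rmse(X, g(f(X)))
print(rmse_before, rmse_after)
```

The target of training is the input itself, so no labels are needed; the per-observation reconstruction error is what later serves as the anomaly score.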
PwC 19
PwC 20
Is it the same principle as compression?
No, compression usually has no loss
Compression is generic
Autoencoding is trained on specific cases
[Figure: original observation (Labrador with brown collar) → encoding → decoding → decoded observation (still a dog)]
PwC 21
Which strategy should we apply?
Strategies
1. Train on goods, predict anomalies by loss (difference between input and reconstruction)
2. Train on all, assuming that the neural network will not learn the bads due to their low number of observations
[Figure labels: Transactions (Goods, Bads, Unknown bads), strategies 1 and 2, X → H → R]
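Strategy 1 can be illustrated without a full neural network. The sketch below swaps in PCA, which behaves like a linear autoencoder, as a stand-in: it is fitted on goods only and then scores every observation by reconstruction RMSE. The data are synthetic and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: goods lie close to a 2-D plane in 5-D feature space,
# bads are scattered off that plane.
basis = rng.normal(size=(2, 5))
goods = rng.normal(size=(1000, 2)) @ basis + 0.05 * rng.normal(size=(1000, 5))
bads = 3.0 * rng.normal(size=(10, 5))

# Strategy 1: fit on goods only (PCA via SVD as a linear stand-in for f and g).
mean = goods.mean(axis=0)
_, _, vt = np.linalg.svd(goods - mean, full_matrices=False)
components = vt[:2]  # top-2 principal directions, shape (2, 5)


def reconstruction_rmse(x):
    """Per-observation RMSE between input and its reconstruction."""
    h = (x - mean) @ components.T    # encode
    r = h @ components + mean        # decode
    return np.sqrt(np.mean((x - r) ** 2, axis=1))


loss_goods = reconstruction_rmse(goods)
loss_bads = reconstruction_rmse(bads)
print(loss_goods.mean(), loss_bads.mean())
```

For strategy 2 the only change would be to fit on goods and bads together; with so few bads the fitted plane barely moves, which is the assumption that strategy relies on.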
Why do we need a model of g(f(x)) = x?
We do not
We need the internal representation H of x
With a deep H (multiple layers), an autoencoder can approximate any mapping from X to R arbitrarily well (Hinton & Salakhutdinov, 2006)
[Figure: neuron with inputs x1, x2, x3 and bias; weighted inputs a1W1, a2W2]
Neuron output: $\frac{1}{1 + e^{-(a_1 W_1 + a_2 W_2 + \mathrm{bias})}}$
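The neuron above is a weighted sum of its inputs plus a bias, passed through a sigmoid. A minimal sketch, with made-up inputs and weights:

```python
import math


def neuron(inputs, weights, bias):
    """Sigmoid neuron: 1 / (1 + exp(-(sum of a_i * W_i + bias)))."""
    z = sum(a * w for a, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical activations and weights, for illustration only.
output = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(output)
```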
PwC 22
How can I understand what’s happening inside?
Why would you want to?
Ok, then … We simulate: sweep one input over its observed range and keep the others fixed
x1 ∈ [min(x1), max(x1)]
x2 = x2
x3 = x3
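The simulation can be sketched as a one-feature sweep: x1 moves from its minimum to its maximum while x2 and x3 stay fixed, and the reconstruction error is recorded at each step. `toy_reconstruct` below is a hypothetical stand-in for a trained g(f(x)); with a real model, its prediction function would be passed in instead.

```python
def sweep_feature(reconstruct, x, index, lo, hi, steps=5):
    """Vary x[index] from lo to hi, keep the other inputs fixed,
    and return (value, rmse) pairs for the reconstruction."""
    results = []
    for s in range(steps):
        value = lo + (hi - lo) * s / (steps - 1)
        probe = list(x)
        probe[index] = value
        recon = reconstruct(probe)
        rmse = (sum((a - b) ** 2 for a, b in zip(probe, recon)) / len(probe)) ** 0.5
        results.append((value, rmse))
    return results


def toy_reconstruct(x):
    """Hypothetical model that has 'learnt' x1 ≈ x2: it reconstructs both
    as their mean and passes x3 through unchanged."""
    m = (x[0] + x[1]) / 2
    return [m, m, x[2]]


curve = sweep_feature(toy_reconstruct, x=[1.0, 1.0, 0.3], index=0, lo=0.0, hi=2.0)
for value, rmse in curve:
    print(value, round(rmse, 3))
```

The error is lowest where x1 agrees with what the model expects given x2 and x3, and grows as x1 moves away, which shows how sensitive the model is to that input.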
PwC 23
So what do I get using this?
1. It learns only the probable inputs
2. You can play with the loss function
As a result we get a powerful anomaly detector
An autoencoder is able to learn the structure of the data manifold
Combined, these force H to capture information about the structure of the data-generating distribution
Applying expert knowledge through the loss function, e.g.
$L = \frac{1}{k} \sum_{n=1}^{k} W_n (x_n - \hat{x}_n)^2$, where $\hat{x}_n$ is the reconstruction of input feature $x_n$, with weights such as $W_n = 0.5;\ 3;$
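A feature-weighted squared-error loss of this kind lets expert knowledge decide which features matter most when the reconstruction is wrong. The sketch below assumes the loss is the weighted mean of squared differences between each input feature and its reconstruction; the example values are made up.

```python
def weighted_mse(x, x_hat, weights):
    """L = (1/k) * sum over n of W_n * (x_n - x_hat_n)^2."""
    k = len(x)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, x_hat)) / k


# Hypothetical example: the second feature is judged six times as important.
loss = weighted_mse(x=[1.0, 2.0], x_hat=[0.0, 1.0], weights=[0.5, 3.0])
print(loss)
```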
PwC 24
PwC 25
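How do we say in the end what is an anomaly?
1. Comparison of input & output: the RMSE between input and output is compared with an anomaly threshold that separates goods from bads
2. Classification: using the final layer of the encoder as input for the classifier
[Figure: autoencoder with inputs X1, X2, X3, a hidden layer and outputs X1, X2, X3]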
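For the first option, the anomaly threshold can be set so that a fixed share of observations with the highest RMSE is flagged. A minimal sketch, assuming a flagged share of 1% and hypothetical scores:

```python
def flag_top_fraction(scores, fraction=0.01):
    """Return (threshold, flags); flags marks the highest-scoring `fraction`
    of observations (ties at the threshold are flagged as well)."""
    n_flag = max(1, round(len(scores) * fraction))
    threshold = sorted(scores, reverse=True)[n_flag - 1]
    return threshold, [s >= threshold for s in scores]


# Hypothetical reconstruction errors for 200 observations.
scores = [float(i) for i in range(200)]
threshold, flags = flag_top_fraction(scores, fraction=0.01)
print(threshold, sum(flags))
```

Lowering the threshold flags more observations and raises both the number of true hits and the investigation workload, so the share is a business decision as much as a statistical one.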
26
Finally, we train and validate a classification algorithm to predict anomalies in advance
1. Anomaly labeling with autoencoders (normality vs. anomaly by RMSE)
2. Time series (measured attribute X over time)
3. Translating the problem to classification (sliding windows over the time series give positive examples and counter examples)
4. Boosted decision tree to predict failure and define predictive rules
5. Validating the results (ROC: TP vs. FP)
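Translating the problem to classification can be sketched with a sliding window: each window of past values becomes one training example, labelled by whether the next point was marked anomalous. The window length and the labelling rule below are assumptions made for illustration.

```python
def sliding_windows(series, anomaly_flags, window=3):
    """Build (features, label) pairs: features are `window` consecutive values,
    label says whether the point right after the window is an anomaly."""
    examples = []
    for start in range(len(series) - window):
        features = series[start:start + window]
        label = anomaly_flags[start + window]
        examples.append((features, label))
    return examples


# Hypothetical measured attribute X over time, with one labelled anomaly.
series = [1, 2, 3, 4, 50, 5]
anomaly_flags = [False, False, False, False, True, False]

examples = sliding_windows(series, anomaly_flags, window=3)
for features, label in examples:
    print(features, "positive example" if label else "counter example")
```

The resulting pairs can be fed to any classifier, for instance the boosted decision tree mentioned in step 4.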
PwC 27
Case study Asian Bank:
A deep neural network was built for anomaly detection
[Figures: neural network illustration; accuracy and loss of the resulting solution]
Layers: 12 nodes → 32 nodes → 8 nodes → 8 nodes → 32 nodes → 12 nodes
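To get a feel for the size of such a network, the trainable parameters can be counted from the layer sizes. This sketch assumes the six sizes are consecutive fully connected layers with biases (input 12, output 12), which the slide does not state explicitly.

```python
layer_sizes = [12, 32, 8, 8, 32, 12]


def dense_param_count(sizes):
    """Weights plus biases of consecutive fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


total = dense_param_count(layer_sizes)
print(total)
```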
PwC 28
Case study Asian Bank
Comparison of input & output of Neural Network
Anomaly and actual SAR
Anomaly but not a fraud
Sensitivity top 1%
Transactions
The final ROC curve results in 80% AUC vs
Prioritization is possible
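AUC can be read as the probability that a randomly chosen true case gets a higher anomaly score than a randomly chosen non-case, which is why a good AUC makes prioritization possible. A minimal sketch of that pairwise definition, on made-up scores and labels:

```python
def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs in which the positive
    has the higher score; ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores; label 1 marks an actual SAR.
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

The pairwise loop is quadratic, so for large alert volumes a rank-based computation would be preferable.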
PwC 29
Measuring the sensitivity of the autoencoder to its input
This publication has been prepared for general guidance on matters of interest only, and does not constitute professional advice. You should not act upon the information contained in this publication
without obtaining specific professional advice. No representation or warranty (express or implied) is given as to the accuracy or completeness of the information contained in this publication, and, to the
extent permitted by law, PricewaterhouseCoopers Česká republika, s.r.o., its members, employees and agents do not accept or assume any liability, responsibility or duty of care for any consequences
of you or anyone else acting, or refraining to act, in reliance on the information contained in this publication or for any decision based on it.
© 2018 PricewaterhouseCoopers Česká republika, s.r.o. All rights reserved. “PwC” is the brand under which member firms of PricewaterhouseCoopers International Limited (PwCIL) operate and
provide services. Together, these firms form the PwC network. Each firm in the network is a separate legal entity and does not act as agent of PwCIL or any other member firm. PwCIL does not provide
any services to clients. PwCIL is not responsible or liable for the acts or omissions of any of its member firms nor can it control the exercise of their professional judgment or bind them in any way.
Thank you!

Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"

  • 1. Fin Crime detection using autoencoders Liubomyr Bregman Richard Bobek L'viv, Ukraine 3 Nov 2018 AI & Big Data Day
  • 2. Lessons learnt 1. Skilled operators 2. Less opportunity for “insider” or “opportunistic” attack 3. Need for ‘out-of-band’ systems for notifications PwC 2 Global Payment Fraud – lessons learnt and investigations highlights Lessons learnt 1. Dedicated scenarios within FIs 2. Leverage the convergence between cyber, fraud and ML 3. Leverage advanced analytics – evolving threat landscape *BAE Systems https://en.wikipedia.org/wiki/Bangladesh_Bank_robbery If Hollywood releases another iteration of the 'Oceans 11' franchise, they should base it on the recent attack against the Central Bank of Bangladesh (BB)* Bangladesh cyber heist The attackers attempted to steal $951m in 35 separate fraudulent transactions. 30 orders (worth $850m) were stopped by the US Fed, but 5 orders (worth $101m) went through. A further $20m was blocked by a recipient bank in Sri Lanka Vietnam Swift fraud attempt Further analysis of the Bangladesh cyber heist, led to the conclusion that the same attackers appear to have struck previously, using similar tools written for targeting a bank in Vietnam just a couple months before the Bangladesh attack.
  • 3. There are many “creative” new strategies in fin crimes Some of the known financial crime strategies: Cheque fraud Credit card fraud Mortgage fraud Medical fraud Corporate fraud Securities fraud (including insider trading) Bank fraud Insurance fraud Market manipulation Payment (point of sale) fraud Health care fraud Theft Scams or confidence tricks Tax evasion Bribery Embezzlement Identity theft Money laundering Forgery and counterfeiting PwC 3
  • 4. There are many fraud detection and AML software on the market Market segment by Type, Financial Fraud Detection Software can be split into Anti Money Laundering Detection Software Identity Theft Detection Software Credit/Debit Card Fraud Detection Software Others Wire Transfer Fraud Detection Software PwC 4
  • 5. Traditionally, fin crime is approached by reporting and expert knowledge and assessment The steps are usually: * Historically those can be large financial abuse management systems, transaction monitoring systems, in-house development scripts, etc.. Report of alerts is generated by rule decision engine*. This report is showing transactions / clients detected by (usually) orthogonal rules. Experts assess the alerts and decides on appropriate action. This can be for example investigation of the activities of the client. 1 2 PwC 5
  • 6. Most financial institutions struggle with similar problems in detecting financial crime Huge streams of data Scenarios are far from perfect Fraud schemas are developing Investigations are costly Number of scenarios are limited and costly More data science is better PwC 6
  • 7. Machine learning approaches aim to increase the automation and recall of the process Rule based expert based Supervised (Investigation needed) Unsupervised 1. Optimal rules 1. Segmentations 2. Anomaly detection 3. Semi-Supervised approach 2. Deep learning approaches a) Pattern discovery a) Rule based Models Creation b) Threshold optimizations c) Rule optimization ways d) Alert prioritization PwC 7
  • 8. The key problem is the unbalanced dataset and some terminology 0.1% True positive and 99.9% False positives Only 13 scenarios Around 90 features ~12 segments ~700 threshold 600M transaction 2M Alerts 2K SARs PwC 8
  • 9. More precise numbers from past projects in different banks Alerting and escalation Customers L1 (Alerts) Alert Rate (L1/Cust.) L2 (Cases) L1 to L2 Rate L3 (SAR-Rec) L2 to L3 Rate SAR Rate (SARs/Cust.) Peer 1 55,000,000 320,000 0.58% 32,000 10% 6,400 20% 0.012% Peer 2 4,500,000 60,500 1.34% 6,340 10% 3,340 53% 0.074% Peer 3 9,900,000 148,000 1.49% 40,000 27% 670 2% 0.007% Peer 4 40,000,000 50,000 0.13% 12,000 24% 375 3% 0.001% Peer Average: 0.89% 17.88% 19.37% Benchmarking for alert volumes Benchmarking for AML TM investigations Number FTE Annual spend Maturity (0 (low) to 3 (high)) Peer 1 12,000 $2bn 2+ Peer 2 5,000 $800m 2 Peer 3 10,000 $1.2bn 3 Peer 4 210 $50m 0-1 Peer 5 150 $125m 1-2 Peer 6 2,500 $300m 1-2 PwC 9
  • 10. What is normality Normal or not? PwC 10
  • 11. Anomaly Sensitivity Density How does it work: Normality Normality is a measure of concentration separated from anomaly by sensitivity threshold Normal Normal PwC 11
  • 12. How does it work: Abnormality of anomalies. How far from normality? How far from other abnormalities?
  • 13. How does it work: Similarity of anomalies. Some anomalies are similar and form a separate cluster (an anomaly cluster). Investigating one anomaly and finding a fraud makes the other anomalies in the cluster more likely to be fraud.
  • 14. How does it work: Stability of normal and anomalous patterns. When the definition of normality remains "stable" over time, the analytical set is considered "operational".
  • 15. There are many ways to detect anomalies. Non-parametric: • Density-based techniques (k-nearest neighbor, local outlier factor, and many more variations of this concept). • Fuzzy logic-based outlier detection. • Cluster analysis-based outlier detection. Parametric: • Subspace- and correlation-based outlier detection for high-dimensional data. • Bayesian networks. • Deviations from association rules and frequent item sets. And more: • Ensemble techniques, using feature bagging, score normalization and different sources of diversity.
  • 16. Non-parametric: Density-based techniques
  • 19. Autoencoders present a powerful method for anomaly detection in financial crimes. What is this? An input (observation) $x$ is mapped to an internal representation $H$ (neural network, hidden layer) and then to an output (reconstruction) $R$. Approach to training: the target output is the observed input, $H = f(x)$, $R = g(H) = g(f(x)) \approx x$. Measure of quality: a loss function $L(x, g(f(x)))$, e.g. RMSE. Traditionally used for anomaly detection and dimension reduction.
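A minimal sketch of the encode, decode, and loss loop, under strong simplifying assumptions: a single linear hidden unit instead of a deep network, two input features, and toy "good" data lying on the line x2 = 0.5 * x1. None of these choices come from the slides; they only keep the example small enough to run in plain Python.

```python
import math

def encode(x, w):
    """H = f(x): a single linear hidden unit."""
    return sum(wi * xi for wi, xi in zip(w, x))

def decode(h, v):
    """R = g(H): map the hidden value back to input space."""
    return [vi * h for vi in v]

def rmse(x, r):
    """Root mean squared error between input and reconstruction."""
    return math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, r)) / len(x))

def train(data, epochs=2000, lr=0.05):
    """Fit encoder weights w and decoder weights v by stochastic gradient descent."""
    w, v = [0.5, 0.5], [0.5, 0.5]
    for _ in range(epochs):
        for x in data:
            h = encode(x, w)
            err = [xi - ri for xi, ri in zip(x, decode(h, v))]
            # Gradient of the squared error ||x - v*h||^2 (factor 2 folded into lr).
            back = sum(e * vi for e, vi in zip(err, v))
            v = [vi + lr * e * h for vi, e in zip(v, err)]
            w = [wi + lr * back * xi for wi, xi in zip(w, x)]
    return w, v

# Toy "goods": every observation satisfies x2 = 0.5 * x1.
goods = [(t / 10, 0.5 * t / 10) for t in range(-10, 11)]
w, v = train(goods)

def reconstruction_error(x):
    return rmse(x, decode(encode(x, w), v))

normal_error = reconstruction_error((0.35, 0.175))   # follows the learned pattern
anomaly_error = reconstruction_error((0.4, -0.8))    # breaks the pattern
print(normal_error, anomaly_error)
```

The observation that follows the pattern is reconstructed almost exactly, while the one that breaks it is not, and that gap in reconstruction error is what the anomaly score relies on.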
  • 20. Is it the same principle as compression? No: compression usually has no loss; compression is generic, while autoencoding is trained on specific cases. [Figure: original observation (Labrador with brown collar), encoding into a binary code, decoding, decoded observation (still a dog)]
  • 21. Which strategy should we apply? Strategy 1: train on goods, predict anomaly by loss (the difference between input and reconstruction). Strategy 2: train on all, assuming that the neural network will not learn the bads due to their low number of observations. [Figure: transactions split into goods, bads, and unknown bads, with the portion used by strategies 1 and 2]
  • 22. Why do we need a model of $g(f(x)) = x$? We do not. We need the internal representation $H$ of $x$. With a deep $H$ (multiple layers), an autoencoder can approximate any mapping from $X$ to $R$ arbitrarily well (Hinton & Salakhutdinov, 2006). Neuron: inputs $x_1, x_2, x_3$ and a bias; the activation is $\frac{1}{1 + e^{-(a_1 W_1 + a_2 W_2 + \text{bias})}}$.
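The neuron formula on this slide is the logistic (sigmoid) activation applied to a weighted sum. A small sketch, with illustrative input and weight values that are not from the slides:

```python
import math

def neuron(inputs, weights, bias):
    """Sigmoid activation of the weighted sum a1*W1 + a2*W2 + ... + bias."""
    z = sum(a * w for a, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

balanced = neuron([1.0, 1.0], [2.0, 1.0], bias=-3.0)  # weighted sum is 0
active = neuron([1.0, 1.0], [2.0, 1.0], bias=0.0)     # weighted sum is 3
print(balanced, active)
```

The output always lies strictly between 0 and 1 and equals 0.5 when the weighted sum is zero.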
  • 23. How can I understand what's happening inside? Why do you want to? OK, then ... we simulate: x1 ∈ {min(x1) : max(x1)}, x2 = x2, x3 = x3. That is, one input is varied over its observed range while the others are held fixed.
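The simulation described here, varying one input across its range while the others stay fixed, can be sketched as a simple sweep. The `toy_score` function below is a hypothetical stand-in for a trained autoencoder's reconstruction error; all names and values are illustrative.

```python
def sweep_feature(model, observation, index, lo, hi, steps=5):
    """Vary one feature across [lo, hi] while holding the others fixed."""
    results = []
    for i in range(steps):
        value = lo + (hi - lo) * i / (steps - 1)
        probe = list(observation)   # copy so the original observation is untouched
        probe[index] = value
        results.append((value, model(probe)))
    return results

def toy_score(x):
    """Hypothetical anomaly score: large when x1 departs from 2 * x2."""
    return abs(x[0] - 2 * x[1])

observation = [1.0, 0.5, 3.0]
curve = sweep_feature(toy_score, observation, index=0, lo=0.0, hi=2.0)
for value, score in curve:
    print(value, score)
```

Plotting such a curve for each feature shows which inputs the score reacts to and in which range it stays low.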
  • 24. So what do I get using this? 1. It learns only the probable inputs. 2. You can play with the loss function. Those combined force $H$ to capture information about the structure of the data-generating distribution: the autoencoder is able to learn the structure of the manifold. Applying expert knowledge to the loss, e.g. $L = \frac{1}{k}\sum_{n=1}^{k} W_n (x_n - \hat{x}_n)^2$ with feature weights $W_n = 0.5;\ 3;\ \ldots$, where $\hat{x} = g(f(x))$ is the reconstruction. As a result we get a powerful anomaly detector.
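A sketch of a loss that encodes expert knowledge through per-feature weights, assuming the loss is a weighted mean of squared reconstruction errors. That reading of the slide's formula, and the example values, are assumptions.

```python
def weighted_loss(x, x_hat, weights):
    """Weighted mean squared error between input x and reconstruction x_hat."""
    k = len(x)
    return sum(w * (xi - ri) ** 2 for w, xi, ri in zip(weights, x, x_hat)) / k

x = [1.0, 2.0]        # observed input
x_hat = [0.0, 1.0]    # reconstruction, off by 1 on both features
weights = [0.5, 3.0]  # second feature judged six times more important

loss = weighted_loss(x, x_hat, weights)
print(loss)
```

With these weights the same error on the second feature costs six times as much as on the first, so the network is pushed to reconstruct the features that experts consider most relevant.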
  • 25. How do we say in the end what is an anomaly? 1. Comparison of input and output: the RMSE between the input (X1, X2, X3) and the output (X1, X2, X3) separates goods from bads at an anomaly threshold. 2. Classification: using the final layer of the encoder as input for a classifier.
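A sketch of the first option, thresholding the reconstruction error. It assumes the threshold is taken as a high quantile of the errors observed on known goods; the quantile choice and the error values are hypothetical.

```python
def anomaly_threshold(good_errors, quantile=0.95):
    """Pick the RMSE threshold as a high quantile of errors seen on goods."""
    ordered = sorted(good_errors)
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[index]

def classify(errors, threshold):
    """Label each observation by comparing its RMSE to the threshold."""
    return ["anomaly" if e > threshold else "normal" for e in errors]

# Hypothetical reconstruction errors (RMSE) on transactions known to be good.
good_errors = [0.02, 0.03, 0.05, 0.04, 0.06, 0.03, 0.02, 0.05, 0.04, 0.07]
threshold = anomaly_threshold(good_errors)

labels = classify([0.03, 0.25, 0.07], threshold)
print(threshold, labels)
```

In practice the quantile would be tuned against investigation capacity, since it directly controls how many alerts are raised.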
  • 26. Finally, we train and validate a classification algorithm to predict anomalies in advance. 1. Anomaly labeling with autoencoders (anomaly vs. normality by RMSE). 2. Time series (measured attribute X over time). 3. Translating the problem to classification (sliding windows over the measured attribute X yield positive examples and counter-examples). 4. Boosted decision tree to predict failure and define predictive rules. 5. Validating the results (ROC: TP vs. FP).
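Step 3, translating the time series into a classification problem, can be sketched with a sliding window. The sketch assumes that each window of past values is labelled by whether the next observation was flagged as an anomaly; the series and labels are made up.

```python
def sliding_windows(series, labels, window):
    """Turn a time series into (features, label) rows.

    Each row holds `window` consecutive values; its label says whether the
    observation right after the window was flagged as an anomaly.
    """
    rows = []
    for start in range(len(series) - window):
        features = series[start:start + window]
        target = labels[start + window]
        rows.append((features, target))
    return rows

series = [1, 1, 2, 1, 9, 1]   # measured attribute X over time
flags = [0, 0, 0, 0, 1, 0]    # anomaly labels, e.g. from the autoencoder step
rows = sliding_windows(series, flags, window=3)

positives = [features for features, target in rows if target == 1]
print(rows)
print(positives)
```

Rows with label 1 are the positive examples and the rest are counter-examples; a classifier such as a boosted decision tree could then be trained on them.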
  • 27. Case study Asian Bank: a deep neural network was built for anomaly detection. [Figure: neural network illustration with layers of 12, 32, 8, 8, 32 and 12 nodes; accuracy and loss of the resulting solution]
  • 28. Case study Asian Bank. Comparison of input and output of the neural network: anomaly and actual SAR vs. anomaly but not a fraud. Sensitivity: top 1% of transactions. The final ROC curve yields an AUC of 80%. Prioritization is possible.
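A sketch of how the two reported measures could be computed from anomaly scores and known SAR labels: the ROC AUC via pairwise ranking, and the share of true SARs among the top-scored transactions. The scores and labels below are invented for illustration and do not reproduce the case-study result.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_at_top(scores, labels, share):
    """Share of true positives among the top `share` of highest-scored items."""
    k = max(1, int(share * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(y for _, y in ranked[:k]) / k

# Hypothetical anomaly scores and SAR labels (1 = actual SAR).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
sars = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

auc = roc_auc(scores, sars)
top_precision = precision_at_top(scores, sars, share=0.2)
print(auc, top_precision)
```

Sorting alerts by score and investigating from the top is the prioritization the slide refers to: the higher the precision in the top slice, the fewer wasted investigations.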
  • 29. Measuring the sensitivity of the autoencoder to its input
  • 30. This publication has been prepared for general guidance on matters of interest only, and does not constitute professional advice. You should not act upon the information contained in this publication without obtaining specific professional advice. No representation or warranty (express or implied) is given as to the accuracy or completeness of the information contained in this publication, and, to the extent permitted by law, PricewaterhouseCoopers Česká republika, s.r.o., its members, employees and agents do not accept or assume any liability, responsibility or duty of care for any consequences of you or anyone else acting, or refraining to act, in reliance on the information contained in this publication or for any decision based on it. © 2018 PricewaterhouseCoopers Česká republika, s.r.o. All rights reserved. “PwC” is the brand under which member firms of PricewaterhouseCoopers International Limited (PwCIL) operate and provide services. Together, these firms form the PwC network. Each firm in the network is a separate legal entity and does not act as agent of PwCIL or any other member firm. PwCIL does not provide any services to clients. PwCIL is not responsible or liable for the acts or omissions of any of its member firms nor can it control the exercise of their professional judgment or bind them in any way. Thank you!