AI and machine learning help detect, predict and prevent fraud - IBM Watson Data Science Meetup

AI and machine learning
help detect, predict and
prevent fraud
Nina Lozo
Data & AI Technical Professional
IBM South East Europe,
nina.lozo@rs.ibm.com
IBM Watson Data Science Meetup

72%
of businesses cite fraud as a
growing concern and
63% report the same or
higher levels of fraudulent
losses over the past 12 months
– Experian Global Fraud Report
Data Science and AI / March 2019 / © 2019 IBM Corporation

Fraud Detection improvement with Data Science

Fraud Detection without data science


Enforcement funnels

Data Science in Fraud
Detection
A data science platform is complementary to other fraud detection systems
• React: Predictive models can quickly
determine changing patterns in fraud
and react to them in real time
• Improve: Data science can help
derive new fraud detection rules,
which can be used to improve the
business process
• Achieve more: Data science can
increase the rate of fraud detection

Challenge #1:
Digital transformation exacerbates data issues
Data silos
• Multiple fraud departments
• Internal point solutions
Data overload
• Different channel, product or fraud type
Incomplete view
• Unable to see patterns or behaviors across business lines

Need for an end-to-end data science platform
Provide one consistent experience
• Across multiple departments
• Ready access from public clouds, private clouds, and on-premises
• Strong governance and security
Collaborate across LOB and SMEs
• Across data scientists, risk analysts, investigators and other subject matter experts
See a single view of latest insight
• Continuously accommodate real-time data, monitor and detect fraudulent activities
• Adapt as the patterns change and spot anomalies

Challenge #2:
Increasingly difficult to predict fraud
Fraud is rare
• Can be as low as 1% of activities
• Class imbalance when training detection models
Faster turnaround
• Transactions executed more quickly
• Less time to identify, counteract, and recover
Fast-evolving fraudster schemes
• Unceasing new and mutating forms
• New routes and channels

More data science for more people
• Visual programming tool for LOB experts and analysts (citizen data scientists)
• Increase efficiency in detecting “known” threats
Faster discovery and deployment
• Full end-to-end AI lifecycle: build, deploy, and manage
• More time to identify, counteract, and recover
Deep learning and advanced analytics
• Enable deep learning and neural networks
• Get latest models and frameworks in fraud prediction
Need to move from fraud detection to fraud prediction
Need to upskill the team to do more data science
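
To make the "fraud is rare" point concrete, here is a minimal sketch of why plain accuracy misleads at a 1% fraud rate. The population size is hypothetical, and inverse-frequency class weighting is one common remedy assumed here for illustration, not a technique the slides prescribe.

```python
# Hypothetical population: 100,000 activities, 1% fraudulent (rate from the slide).
total = 100_000
fraud = total // 100          # 1,000 fraudulent activities
legit = total - fraud         # 99,000 legitimate activities

# A "model" that never flags anything is right on every legitimate activity...
accuracy_never_flag = legit / total
# ...but it catches no fraud at all.
recall_never_flag = 0 / fraud

# Assumed remedy: weight each class inversely to its frequency so that
# both classes contribute equally to the training loss.
weight_fraud = total / (2 * fraud)
weight_legit = total / (2 * legit)

print(f"accuracy of never flagging: {accuracy_never_flag:.2%}")
print(f"recall of never flagging:   {recall_never_flag:.2%}")
print(f"class weights: fraud={weight_fraud:.1f}, legit={weight_legit:.3f}")
```

With these weights, the total weight of the fraud class equals the total weight of the legitimate class, so a model can no longer score well by ignoring fraud.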

Challenge #3:
Fraud detection is costly
False positives
• Require manual investigations
• Poor customer experience
Loss payout
• Millions of dollars and resources reserved for payouts
• Poor customer experience
Reputation risk
• Potential damage to company reputation

Fewer false positives
• Leverage more unstructured data like text and images
• Enable deep learning and neural networks
Faster model training and development
• Leverage pre-trained APIs to jump start development process
• Easy tooling to train models in a few clicks
Need to make faster and more accurate predictions
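
The tension between false positives (manual investigations) and missed fraud (loss payouts) can be sketched with a simple score threshold. The scores and labels below are made up for illustration and are not data from the presentation.

```python
def confusion_at(threshold, scored):
    """Count false positives and false negatives when flagging scores >= threshold."""
    fp = sum(1 for score, is_fraud in scored if score >= threshold and not is_fraud)
    fn = sum(1 for score, is_fraud in scored if score < threshold and is_fraud)
    return fp, fn

# Hypothetical (model score, is_fraud) pairs.
scored = [
    (0.95, True), (0.80, True), (0.62, False), (0.55, True),
    (0.40, False), (0.35, False), (0.20, False), (0.05, False),
]

for threshold in (0.3, 0.5, 0.7):
    fp, fn = confusion_at(threshold, scored)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

Raising the threshold removes false alerts but eventually starts missing fraud, which is why a more accurate model, rather than a stricter cut-off alone, is needed to reduce both.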

Advanced analytics techniques can dramatically
improve the effectiveness and efficiency of fraud
management…
Where once fraud was detected by risk functions flagging
suspect transactions for manual review, firms can now use
neural networks based on unsupervised and supervised
architectures to monitor dubious activities.
– McKinsey & Company
Fraud management: Recovering value through next-generation solutions

AI drives real business value in fraud prevention
• Faster screening updates
• Enhanced screening models
• Enhanced accuracy of fraud profiling
• Enhanced identity verification
• Centralization of fraud processes
• Enhanced fraud analytics tools
• Lower cost of fraud infrastructure
• Reduced fraud false positive rates
• Improved investigations process
• Facilitate investigation case management
• Automated fraud reporting
• Comply with voluntary and mandated regulations while differentiating competitive position
• Reduced costs of payment fraud losses
• Reduced costs of fraud screening & monitoring
• Reduced cost of fraud investigations
• Reduced cost of compliance reporting of fraud

AI enables you to predict likelihood of fraud and proactively act upon insight to drive better prevention
• Capture: Data collection delivers an accurate view of customer attitudes and opinions
• Predict: Predictive capabilities bring repeatability to ongoing decision making, and drive confidence in your results and decisions
• Act: Unique deployment technologies and methodologies maximize the impact of analytics in your operation
[Diagram: a platform spanning data collection and deployment technologies, using data mining, machine learning and deep learning to detect, predict and analyze]

Strategies to stay ahead
Move from fraud detection to fraud prediction
• Easily & seamlessly move from sandbox to production
• Connect data science models with real-time data
• Deploy predictive models into business process
Upskill the team to do more data science
• Create more “citizen data scientists” with visual modeling
• Train advanced ML models without data science degree
• Make fraud detection easier
Stay ahead of fraudsters with latest ML models
• Empower data scientists to get ahead of fraudsters
• Enable deep learning and neural networks
• Get latest models and frameworks in fraud prediction
Make faster and more accurate prediction
• Leverage more unstructured data like text and images
• Easy tooling to train models in a few clicks
• Leverage pre-trained APIs to jump start development process

“With the data mining system, we generated productivity savings of nearly 80 percent.”
Francisco Ruiz
Head of Compliance, Bancolombia
Solution
• Deployed predictive data-modeling software that helped it more easily and quickly detect transactions that were part of potential money-laundering operations
• Solution prevents, detects and reports potentially fraudulent banking activities that may stem from criminals and terrorists
Challenges
• Need to analyze millions of daily transactions to identify current and potential fraud
• Move from a labor-intensive decentralized system to a more automated process
Results
• Reveals 40% more suspicious transactions by automatically identifying the most likely fraudulent activities. Increases reporting capabilities by 200% and analyst productivity by 80%
• Discovers the latest money-laundering techniques by capturing data from 700 branches and 2,300 ATMs in six countries.
• Aggregates multiple transaction activities with centralized reporting for more precision in detecting financial relationships.

"The IBM Data Science Elite team was able to help direct our operating model, and skills, towards a deeper and more integrated structure."
Guy Taylor
Head of Data & Data-Driven Intelligence
Solution
• Deployed predictive data-modeling software that helped it more easily and quickly detect transactions that were part of potential money-laundering operations
• Solution prevents, detects and reports potentially fraudulent banking activities that may stem from criminals and terrorists
Challenges
• Fraudulent activity is very rare relative to all online banking activity, making it difficult to predict, posing a reputational risk to the bank
• Current fraudulent alert system has a very high false positive rate, lowering customer satisfaction
Results
• Reduce number of alerts that fraud responders must review and reduce missed fraudulent activity
• Assist fraud responders in identifying which suspicious activities are most likely to be fraudulent.

Nedbank: Predict Fraudulent Online Banking Activity
OBJECTIVE
• Use supervised machine learning to predict fraudulent activity within Nedbank's mobile banking system
OVERVIEW
• Currently uses a decision-rule based system to flag suspicious transactions for review by fraud responders
• High false positive rate, low false negative rate
• Missed fraudulent activity is costly
• Large volume of alerts places a burden on responders
CHALLENGES
• Fraudulent activity is very rare relative to all online banking activity (0.004% of sessions)
• ~500M actions/month
• Predictors need to be accepted by fraud team
FIRST MODELING APPROACH
• Augment existing system by predicting which alerts on individual activity are correct
• False positives: 94% (current system), 48% (with augmentation)
• False negatives: 4% (current system), 7% (with augmentation)
SECOND MODELING APPROACH
• Predict which user sessions are fraudulent within first 10 seconds
• False positives: 95% (current system), 85% (ML model)
• False negatives: 17% (current system), 6% (ML model)
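
The slide does not say how its false positive and false negative percentages are defined. One plausible reading, assumed here, is that "false positives" is the share of alerts that turn out to be legitimate and "false negatives" is the share of fraudulent activity that is missed. The sketch below uses hypothetical counts chosen to reproduce the 94% and 4% figures; they are not Nedbank data.

```python
def alert_quality(true_pos, false_pos, false_neg):
    """Return (share of alerts that are false, share of fraud that is missed)."""
    false_alert_share = false_pos / (true_pos + false_pos)
    missed_fraud_share = false_neg / (true_pos + false_neg)
    return false_alert_share, missed_fraud_share

# Hypothetical counts: 1,600 alerts of which 96 are real fraud, plus 4 missed cases.
false_alerts, missed = alert_quality(true_pos=96, false_pos=1504, false_neg=4)
print(f"false alerts: {false_alerts:.0%}, missed fraud: {missed:.0%}")

# Rarity check: 0.004% of a hypothetical 1,000,000 sessions.
fraud_sessions = round(1_000_000 * 0.004 / 100)
print(f"fraudulent sessions per million: {fraud_sessions}")
```

Under this reading, a system can miss very little fraud and still bury responders in alerts, which matches the "high false positive rate, low false negative rate" description of the rule-based system.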

“Before this solution, the minimum time it took to settle a claim was three days. Now, the low-risk claims that pass down the ‘immediate’ channel can be settled within an hour.”
Anesh Govender
Head of Finance, Reporting and Salvage at Santam
Solution
• Santam chose IBM for the range of functionality, flexibility, and its ability to integrate with an existing system
• The company’s core claims management system resided on a mainframe platform that still met the company’s needs
• The solution integrated different kinds of rules from across the infrastructure, including process rules from company’s business process management software system, decision and agility rules from SPSS software itself, and override
Challenges
• Fraud losses accounted for an annual 6 to 10 percent of premium costs for Santam customers
• Needed a solution that more effectively assessed risk and separated potentially fraudulent claims from lower-risk ones, in order to prevent fraud, reduce other costs and increase efficiency
Results
• Identified a major fraud ring in less than 30 days after implementation.
• Saved more than USD2.5 million in payouts to fraudulent customers, and nearly USD5 million in total repudiations.
• Reduced claims processing time on low-risk claims by nearly 90 percent.

“The [Watson] Studio gives us the ability to process millions and millions of records and to be able to act real time.”
Julio Sánchez
Global Analytics Lead - Accenture Center for IBM Technologies, Accenture
Solution
• Analyzes internal company and external data to determine the risk factors and risk level associated with each service user, and alerts audit managers to risky behavior
Challenges
• Detects the likelihood of fraudulent behavior, such as an individual posing as a legitimate customer who receives a service but won’t pay for it
• The ability to identify and trace new anomalous behaviors through continuous monitoring is vital for companies to take preemptive actions against future costly occurrences of fraud
Results
• Process millions of records and be able to act in real time

Reduces payments on fraudulent claims and improves its ability to collect payments from other insurance companies
Solution
• IPCC implemented solution to rapidly identify and investigate suspicious claims and to expedite handling of unsuspicious claims in order to improve customer satisfaction.
Challenges
• IPCC needed ways to automate the workflows and data gathering related to fraudulent and subrogated automobile claims.
Results
• Accelerated payments collection
• Reduced costs of claims payments
• Yielded annual return on investment (ROI) of 403% for direct and indirect benefits and a payback within 3 months

Why IBM?
• Move from fraud detection to fraud prediction: Single platform to train and deploy models to support both fraud investigation and prediction
• Upskill the team to do more data science: Easy-to-use visual modeling capabilities to enable “citizen data scientists” to work with advanced ML
• Stay ahead of fraudsters with latest ML models: Same best-in-class platform to support deep learning frameworks and models
• Make faster and more accurate prediction: The only platform to provide easy tooling to enable unstructured data for investigation and prediction

Where do we go from here?
An IBM-led AI Journey Workshop provides the strategy and expertise to transform your business into a
cognitive enterprise and unlocks the full potential of your data with AI.
• Briefing & Vision: Identify your unique business challenges and needs. Explore how AI is transforming every industry.
• AI Journey Workshop: Partake in an IBM-led half or full day workshop to explore your use case and scope out potential solutions.
• Design & Validate: Work with IBM subject matter experts to fully define the scope and success criteria for an AI solution.
• Implement & Deliver: Delivery and deployment of the agreed upon AI solution, tailored specifically to your business needs.
• Conclude & Expand: Explore how to further accelerate your organization’s AI Journey with IBM.

Fraud Prediction in an Auto Insurance Claims Triaging System
The auto insurance claims triaging story
– The target we’re driving towards: the claims triaging app
– How to predict auto insurance claims fraud
– Choosing a path through the Watson Studio demo that’s tailored for the customer’s users
Building the fraud prediction model
– Common tooling
• Shopping for data using the catalog
• Preparing data using Data Refinery
– Canvas tool
• Preparing training data
• Training model
– Notebook tool
• Preparing training data
• Training model
– Watson Machine Learning
• Deploying model

Challenge
Processing insurance claims is expensive, time consuming, and risk-intensive. The most significant risk factors are
litigation, which increases in likelihood the longer it takes to adjust a claim, and fraud.
Claims processing becomes especially intense during natural calamities, when insurers need to process a sudden spike
in claims, even to the point of transporting adjusters to the impacted location.
Goals
Using information that's available to the insurance company, develop a data-driven claims process that does the
following:
• Reduces the median time for a claim to be processed
• Minimizes the risk of fraud
Method
Using IBM’s self-service AI tools, develop an intelligent claims processing app

Claims Triaging App
This app presents relevant data to claims
triaging staff
Focal point is the fraud prediction
–Probability assigned
–Reasons presented
Watson Studio provides tools to build the
data assets used by this app: refine data
and build, train, deploy the fraud
probability model
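
As a rough illustration of what the app surfaces for each claim, the sketch below pairs a fraud probability with human-readable reasons and a routing decision. The `TriageResult` type, the channel names and the 0.5 review threshold are all hypothetical and are not taken from the demo app.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TriageResult:
    claim_id: str
    fraud_probability: float
    reasons: List[str]
    channel: str

def triage(claim_id, fraud_probability, reasons, review_threshold=0.5):
    """Route a claim; review_threshold is a hypothetical cut-off."""
    channel = "investigate" if fraud_probability >= review_threshold else "fast-track"
    return TriageResult(claim_id, fraud_probability, list(reasons), channel)

result = triage("C-1001", 0.82, ["No police report", "High number of previous claims"])
print(f"{result.claim_id}: {result.fraud_probability:.0%} -> {result.channel}")
for reason in result.reasons:
    print(f"  reason: {reason}")
```

Presenting the reasons alongside the probability lets triaging staff judge whether a flagged claim deserves investigation instead of trusting a bare score.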

Fraud indicators
Loss event claimed within 15 days of policy expiration
Expired driver's license
Expensive vehicle damages
Frequent changes of residence
High mileage at loss event for a policyholder with a low mileage discount
High number of previous claims
No police report
Building a fraud prediction model
1. Find the data that shows the fraud indicators
2. Prepare a training data set with the fraud indicators
3. Train a fraud prediction model
4. Deploy the model for use in the fraud triaging app
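
Step 2 above, preparing a training data set from the fraud indicators, might look roughly like the following sketch. The field names and the damage and claim-count thresholds are hypothetical; only the 15-day window comes from the slide.

```python
from datetime import date

def fraud_indicators(claim, high_damage=10_000, many_claims=3):
    """Turn a claim record into boolean indicator features.

    Field names and the two thresholds are hypothetical; only the 15-day
    window comes from the slide.
    """
    days_to_expiry = (claim["policy_expiration"] - claim["loss_date"]).days
    return {
        "loss_near_policy_expiration": 0 <= days_to_expiry <= 15,
        "expired_license": claim["license_expiration"] < claim["loss_date"],
        "expensive_damage": claim["damage_estimate"] >= high_damage,
        "many_previous_claims": claim["previous_claims"] >= many_claims,
        "no_police_report": not claim["police_report_filed"],
    }

# Hypothetical claim record.
claim = {
    "loss_date": date(2019, 3, 1),
    "policy_expiration": date(2019, 3, 10),
    "license_expiration": date(2020, 1, 1),
    "damage_estimate": 12_500,
    "previous_claims": 1,
    "police_report_filed": False,
}
features = fraud_indicators(claim)
print(features)
```

Each resulting dictionary is one row of the training set; adding a known fraud label per claim gives a supervised learner what it needs for step 3.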

Watson Studio has Tools for Multiple User Types

Introducing Watson Studio
Collaboration environment for
data-driven discovery
Two basic concepts
Catalog
Records of your organization’s
data assets
Where you find data assets
Projects
Temporary sandboxes for
discovery and development of new
data assets
Where you build data assets

Finding Data in the Catalog: Shop for Data
Metadata-based search
Explore data assets
Social engagement: tagging,
rating, commenting
Find, and add data sets to your
project
Policy-based access control

Projects
Sandbox environment
Gather data assets
Build data products with the data
assets
Collaboration
Tools for different user types
Easy for people to see each other’s
work and pitch in

Canvas Tool
GUI-based data preparation
Flow-based operations
Step-by-step we manipulate the data set until
we have the fields needed for training
Many operations available
Operationalize
Can push data preparation down to database
engines
Save the set of steps as a single data flow
Writable to a staging destination
Staged prepared data can be consumed by
other tools
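
The flow-based idea, where each step transforms the data set and the whole sequence is saved as one reusable flow, can be sketched in plain Python. This is only an analogy: it is not the Canvas tool's API, and the step and field names are hypothetical.

```python
from functools import reduce

def drop_missing(rows):
    """Remove rows that have any missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def add_no_police_report(rows):
    """Derive a boolean training field from an existing column."""
    return [{**r, "no_police_report": not r["police_report_filed"]} for r in rows]

def select_fields(*names):
    """Build a step that keeps only the fields needed for training."""
    return lambda rows: [{n: r[n] for n in names} for r in rows]

def make_flow(*steps):
    """Save an ordered set of steps as a single reusable data flow."""
    return lambda rows: reduce(lambda data, step: step(data), steps, rows)

prepare = make_flow(
    drop_missing,
    add_no_police_report,
    select_fields("claim_id", "no_police_report"),
)

raw = [
    {"claim_id": "C-1", "police_report_filed": False},
    {"claim_id": "C-2", "police_report_filed": None},
    {"claim_id": "C-3", "police_report_filed": True},
]
prepared = prepare(raw)
print(prepared)
```

Because the flow is a single callable, the same preparation can be rerun on new data or handed to another tool without repeating the individual steps.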

Notebooks
Open tool from asset or “Add to
Project” menu
Programmatic environment
Open source tooling, wrapped in a
managed, secure collaboration
environment

Model deployment: Watson Machine Learning
Common tooling for deployed
ML models
Same tooling, regardless of model
build approach
Version control over models and
their deployments
Deployed models can be administered
through a REST API
The scoring endpoint is also a REST
API
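
Calling a REST scoring endpoint from an application might look like the sketch below, which builds the request without sending it. The URL, the token, the bearer-token header and the payload layout (field names plus rows of values) are assumptions for illustration; the exact contract should be taken from the Watson Machine Learning documentation.

```python
import json
import urllib.request

def build_scoring_request(scoring_url, token, fields, rows):
    """Build (but do not send) a POST request for a model scoring endpoint.

    The payload layout and the bearer-token header are assumptions for
    illustration, not the documented service contract.
    """
    payload = {"fields": fields, "values": rows}
    return urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

request = build_scoring_request(
    "https://example.com/deployments/fraud-model/score",  # placeholder URL
    "PLACEHOLDER_TOKEN",
    ["no_police_report", "previous_claims"],
    [[True, 4]],
)
print(request.get_method(), request.full_url)
# Sending would be urllib.request.urlopen(request); omitted in this sketch.
```

Keeping scoring behind a plain HTTP call is what lets the triaging app use the model regardless of which tool was used to build it.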
