Hadoop BIG Data - Fraud Detection with Real-Time Analytics
1. Hadoop – BIG Data
Fraud Detection with real-time Analysis
2. What is Fraud Detection?
Real-time fraud detection with Hadoop and Big Data technologies applies to
many industries, such as Banking, Finance, Insurance, Core Accounts
Receivable, Government, HealthCare, and Retail.
Fraud is a major concern across all industries. You name the industry (Banking,
Insurance, Government, Accounts Receivable, HealthCare, or Retail, for example)
and you’ll find fraud.
In today’s inter-connected world, the sheer volume and complexity of
transactions make it harder than ever to find fraud.
Traditional approaches to fraud prevention aren’t particularly efficient. For
example, improper payments are often managed by analysts who audit a very
small sample of claims and request medical documentation from targeted
submitters. The industry term for this model is pay and chase: claims are
accepted and paid out, and processes then look for intentional or
unintentional overpayments by way of post-payment review of those claims.
3. What is Fraud Detection?
The sheer volume of transactions makes fraud harder to spot, but, ironically,
this same volume of data can help create better fraud-prediction models, an
area where Hadoop and Big Data shine.
4. How is Fraud detection done?
So how is fraud detection done now?
Because of the limitations of traditional technologies, fraud models are built by
sampling data and using the sample to build a set of fraud-prediction and
detection models. Contrast this with a Hadoop-anchored fraud department that
uses the full data set, with no sampling, to build out the models, and the
difference is clear.
For creating fraud-detection models, Hadoop is well suited to:
Handle Volume: That means processing the full data set - no data sampling.
Manage new varieties of data: Data coming from different sources and in
different formats.
Maintain an agile environment: Enable different kinds of analysis and changes
to existing models.
5. How is Fraud detection done?
The limitations of sampling
Faced with expensive hardware and a pretty high commitment in terms of time
and RAM, people tried to make the analytics workload more manageable by
analyzing only a sample of the data.
While sampling is a good idea in theory, in practice it is often an unreliable
tactic. Finding a statistically significant sample can be challenging for sparse
and/or skewed data sets, which are quite common. Poorly chosen samples can
over-represent outliers and anomalous data points, which in turn biases the
results of the analysis.
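The sparse-data problem can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the data set size, the number of fraudulent records, and the 1% sample size are hypothetical and not taken from any real system. It counts how often a simple random sample contains no fraud at all, compared with a scan of the full data set.

```python
import random

def fraud_found(population, sample_size, rng):
    """Count fraudulent records that land in a simple random sample."""
    sample = rng.sample(population, sample_size)
    return sum(1 for is_fraud in sample if is_fraud)

# Hypothetical data set: 100,000 transactions, 20 of them fraudulent (0.02%).
population = [True] * 20 + [False] * 99_980

rng = random.Random(42)  # fixed seed so the run is repeatable
counts = [fraud_found(population, 1_000, rng) for _ in range(200)]

# Share of the 1% samples that contain no fraudulent record at all.
empty_share = sum(1 for c in counts if c == 0) / len(counts)

# A full scan, by contrast, always sees every fraudulent record.
full_scan = sum(population)

print(f"samples with zero fraud cases: {empty_share:.0%}")
print(f"fraud cases seen by full scan: {full_scan}")
```

With these numbers, about four out of five samples are expected to contain no fraud case, so a model built from such a sample would have nothing to learn from.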
6. BEST PRACTICES IN FRAUD MANAGEMENT
A best-practice fraud management approach is integrated from end to end.
Figure 1: Fraud management approach, integrated end to end
7. BEST PRACTICES IN FRAUD MANAGEMENT
COMBATING FRAUD WITH THE TECHNOLOGY AVAILABLE TODAY – Big Data Hadoop
Step 1. Create an enterprise-wide view of patterns and perpetrators.
Step 2. Prevent and detect fraud in an enterprise-wide context.
Step 3. Investigate and resolve fraud in an integrated environment.
The figure below shows how Hadoop can be integrated within an enterprise and used to build fraud patterns,
models, and analytics on the full data set rather than on a sample.
Figure 2: Hadoop in Enterprise
8. BEST PRACTICES IN FRAUD MANAGEMENT
A best-practice fraud management system is integrated from end to end, from data
management to analysis (using multiple analytical techniques), alert generation and
management, and case management.
Hadoop as a queryable archive in support of an enterprise data warehouse.
Hadoop as a data transformation engine.
Hadoop as a data processing engine.
Hadoop as a way to add discovery and sandbox capabilities to a modern-day analytics ecosystem.
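The data processing role can be pictured with the map, shuffle, and reduce pattern that Hadoop MapReduce is built around. The sketch below is plain Python running in a single process, not the Hadoop API. The function names, account IDs, and amounts are all hypothetical, and on a real cluster the framework would distribute each phase across many machines.

```python
from collections import defaultdict

def map_phase(records):
    """Emit one (account_id, amount) pair per transaction record."""
    for account_id, amount in records:
        yield account_id, amount

def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Summarise each account as (transaction count, total amount)."""
    return {key: (len(values), sum(values)) for key, values in grouped.items()}

# Hypothetical transactions: (account_id, amount).
transactions = [
    ("acct-1", 40.0),
    ("acct-2", 15.5),
    ("acct-1", 60.0),
    ("acct-3", 9.5),
    ("acct-2", 4.5),
]

summary = reduce_phase(shuffle_phase(map_phase(transactions)))
for account_id, (count, total) in sorted(summary.items()):
    print(account_id, count, total)
```

Because every record passes through the map phase, the per-account summaries cover the full data set rather than a sample.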
Fraud Models and Hadoop
In most use cases, Hadoop helps the business break through the glass ceiling on the
volume and variety of data that can be incorporated into decision analytics. The more data we
have, the better our models can be.
Mixing non-traditional forms of data with the set of historical transactions can make fraud models even
more robust.
Organisations can move away from market-segment modelling and toward at-transaction or
at-person level modelling. Quite simply, making a forecast based on a segment is
helpful, but making a decision based on particular information about an individual transaction is
better. To do this, we work with a larger set of data than is conventionally possible in the traditional
approach.
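The difference between segment-level and individual-level modelling can be shown with a toy comparison. This is a sketch only: the accounts, amounts, and the use of a simple standard score are illustrative assumptions, not a recommended fraud model. The same new transaction is scored once against the pooled segment history and once against the account's own history.

```python
from statistics import mean, pstdev

# Hypothetical history: past amounts for two accounts in the same segment.
history = {
    "acct-A": [20.0, 22.0, 18.0, 20.0],
    "acct-B": [900.0, 1100.0, 1000.0, 1000.0],
}

def z_score(amount, amounts):
    """Standard score of `amount` against a list of past amounts."""
    spread = pstdev(amounts)
    return 0.0 if spread == 0 else (amount - mean(amounts)) / spread

def segment_z(amount):
    """Score against every transaction in the segment, pooled together."""
    pooled = [a for amounts in history.values() for a in amounts]
    return z_score(amount, pooled)

def account_z(account_id, amount):
    """Score against the history of one account only."""
    return z_score(amount, history[account_id])

new_amount = 500.0  # hypothetical new transaction on acct-A
print(f"segment-level score: {segment_z(new_amount):.2f}")
print(f"account-level score: {account_z('acct-A', new_amount):.2f}")
```

Against the pooled segment the amount looks ordinary, because it sits near the segment average. Against the history of acct-A it is far outside the usual range, which is the signal that individual-level modelling is meant to capture.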
9. BEST PRACTICES IN FRAUD MANAGEMENT
If the data used to identify or bolster new fraud-detection models isn’t available at a moment’s
notice, by the time we discover these new patterns, it could be too late to prevent damage.
Evaluate the benefit to the business of not only building more comprehensive models with more
types of data but also refreshing and enhancing those models faster than ever.
Traditional technologies aren’t as agile, either. Hadoop makes it easy to introduce new variables
into the model.
10. Traditional Statistical Analysis and Hadoop
Traditional statistical analysis applications come with powerful tools for generating workflows.
These applications utilize intuitive graphical user interfaces that allow for better data visualization.
Hadoop follows a similar pattern to these other tools for generating statistical analysis workflows.
As Figure 3 shows, during the final data exploration and visualization step, users can export to
human-readable formats (JSON/CSV) or take advantage of visualization tools.
Figure 3: Generalized statistical analysis workflow with Hadoop
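As an illustration of the export step, the sketch below writes a small set of scored results to both JSON and CSV using only the Python standard library. The field names and values are hypothetical, and in practice the rows would come from the analysis step rather than being hard-coded.

```python
import csv
import io
import json

# Hypothetical scored results, as they might come out of the analysis step.
results = [
    {"transaction_id": "t-001", "score": 0.91, "flagged": True},
    {"transaction_id": "t-002", "score": 0.07, "flagged": False},
]

def to_json(rows):
    """Serialise the rows as an indented JSON array."""
    return json.dumps(rows, indent=2)

def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

json_text = to_json(results)
csv_text = to_csv(results)
print(json_text)
print(csv_text)
```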
11. CLOSING THOUGHTS
Fraud is a major concern across all industries.
Many organisations spend a lot of money and effort on preventing fraud. With the power of modern
technologies such as Big Data and Hadoop, analysing, detecting and preventing fraud has gone to
the next level.
Organisations can continue using their existing IT infrastructure and leverage Big Data Hadoop
technologies for real-time fraud analysis.
Organisations can be truly agile while handling Data in Motion, Data at Rest & Data in Many Forms
with Big Data Hadoop technologies.