Fast Data Mining

Real Time Knowledge Discovery for
Predictive Decision Making

Nino Guarnacci
nino.guarnacci@oracle.com

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

Data Explosion

Web & social networks experienced it first…

Infographic by Go-gulf.com

… but enterprises are now facing it too
• Services and web transaction data (to refine recommendations, detect trends, etc.)
• “Sensor” data:
  • GPS in mobile phones
  • RFIDs
  • NFC
  • Smart meters
  • Etc.
• Log file monitoring and analysis
• Security monitoring

Utilities deploying smart meters? 200x the information flowing to the data center!

93%
executives who would grade themselves C or lower in preparedness

89%

executives who say drawing intelligence from data is top priority

believe their organization is losing revenue as a result of not being able to fully leverage information

67%

Source: Oracle Research Study - From Overload to Impact: An Industry Scorecard on Big Data Business Challenges, July 2012

Obstacles to Managing Data Faster – Latency Gap
While Ensuring Accuracy, Efficiency, and Scale

[Figure: Business Value versus Action Time, marking Business event, Data captured, Analysis completed, and Action taken. Labels: Fragmented event entities, The Gap]

Source: Richard Hackathorn’s Components of Action Time

What is Fast Data?

Turning High Velocity Data into Value

▪ It’s about getting more from in-flight data
▪ It’s about faster action, faster insights
▪ It’s about running your business in real-time


Oracle Fast Data Approach

Filter, Move, Transform, Analyze, and Act at High Velocity

FILTER & CORRELATE
MOVE & TRANSFORM
ANALYZE
ACT

Oracle Fast Data Approach

Filter, Move, Transform, Analyze, and Act at High Velocity

FILTER & CORRELATE

[Diagram labels: Network Status, In-Memory Data Grid, Real Time Streams, Information]

• Parallel multiple streams: JMS, files, Coherence, DB, …
• Different object types: text, Java objects, …
• High throughput for data aggregation and event querying

The Coherence Data Grid holds the data and computes in parallel

Oracle Fast Data Approach

Filter, Move, Transform, Analyze, and Act at High Velocity

- Event Streams -

[Diagram labels: three incoming streams, each marked Event-type]

EPN (Event Processing Network) Elements

Adapter, Channel, Cache, POJO, JSON, Processor, HTTP Pub/Sub

Oracle Event Processing


STREAMS

SLA Detection: Pattern Matching
<TRACE>
  <ID_TRACED_ENTITY>HH310665064IT</ID_TRACED_ENTITY>
  <TRACED_ENTITY>PACCO</TRACED_ENTITY>
  <WHAT_HAPPENED>ESI_SDA</WHAT_HAPPENED>
  <WHEN_HAPPENED>2013-09-12</WHEN_HAPPENED>
  <WHERE_HAPPENED_DETAIL>
    <OFFICE>
      <WHERE_DESCRIPTION>MONZA</WHERE_DESCRIPTION>
      <WHERE_ID>MZ</WHERE_ID>
    </OFFICE>
  </WHERE_HAPPENED_DETAIL>
</TRACE>

[Diagram labels: DATABASE, SPATIAL, TIME WINDOW, Match Pattern]

SELECT
    M.SLA_VIOLATED
FROM
    TRACE IN CHANNEL,
    ENTITIES,
    SPATIAL CONTEXT
MATCH_RECOGNIZE (
    MEASURES
        SLA_VIOLATED
    PATTERN (A B)
    DEFINE
        A (DELIVERY TIME - NOW) < 2 DAYS
        B DISTANCE BETWEEN (LOCATION, DESTINATION) > 600 KM
) as M
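
The pattern in the query above can be read as: for each traced parcel, flag an SLA violation when an event matching A (less than 2 days left before the delivery time) is directly followed by an event matching B (more than 600 km between the current location and the destination). The following is a minimal Python sketch of that two-step match, not Oracle CQL. The `TraceEvent` fields, the haversine distance and the helper names are illustrative assumptions that do not appear on the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class TraceEvent:
    # Hypothetical event shape; field names are not taken from the slides.
    parcel_id: str
    seen_at: datetime        # time of the scan ("NOW" in the query)
    delivery_due: datetime   # promised delivery time
    location: tuple          # (lat, lon) of the scan
    destination: tuple       # (lat, lon) of the delivery address


def km_between(a, b):
    """Great-circle (haversine) distance in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def matches_a(e):
    # A: (DELIVERY TIME - NOW) < 2 DAYS
    return e.delivery_due - e.seen_at < timedelta(days=2)


def matches_b(e):
    # B: DISTANCE BETWEEN (LOCATION, DESTINATION) > 600 KM
    return km_between(e.location, e.destination) > 600


def sla_violations(events):
    """Yield parcel ids where an A event is directly followed by a B event."""
    prev_was_a = {}  # parcel_id -> did the previous event of this parcel match A?
    for e in events:
        if prev_was_a.get(e.parcel_id) and matches_b(e):
            yield e.parcel_id
        prev_was_a[e.parcel_id] = matches_a(e)
```

Keeping one flag per parcel mirrors PATTERN (A B), which needs two consecutive events of the same entity; a single event that satisfies both conditions does not fire on its own.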

Oracle Event Processing

SLA Detection: Filtering & Correlation

ISTREAM(
    SELECT COUNT(*), START_OFFICE, WHERE_HAPPENED, LATITUDE, LONGITUDE
    FROM SPATIAL_CONTEXT, SLA_VIOLATED_OUT_CHANNEL
    PARTITION BY START_OFFICE, WHERE_HAPPENED
    WITHIN 1 HOUR
    GROUP BY START_OFFICE
    HAVING COUNT(*) > 5
)

▪ Aggregate and correlate received filter-events
▪ Partition probable SLA violations by trip path
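
The ISTREAM query above groups SLA violations by start office over a one-hour window and keeps only offices with more than five violations. A minimal Python sketch of that windowed count follows. It is a plain in-memory stand-in, not Oracle CQL; the function name, the tuple layout of the input and the choice to emit on every event that is over the threshold are assumptions made for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)   # WITHIN 1 HOUR
THRESHOLD = 5                 # HAVING COUNT(*) > 5


def offices_over_threshold(violations):
    """violations: iterable of (timestamp, start_office), in time order.

    Yields (timestamp, start_office, count) each time an office has more
    than THRESHOLD violations inside the trailing one-hour window.
    """
    recent = defaultdict(deque)  # start_office -> timestamps still in the window
    for ts, office in violations:
        window = recent[office]
        window.append(ts)
        # Drop timestamps that have slid out of the window.
        while ts - window[0] >= WINDOW:
            window.popleft()
        if len(window) > THRESHOLD:
            yield ts, office, len(window)
```

A deque per office keeps the eviction cost proportional to the number of expired events, which is the usual trade-off for sliding-window counts over a stream.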

Oracle Fast Data Approach

Filter, Move, Transform, Analyze, and Act at High Velocity

Real-time stream analysis: correlate events from different sources, then manage and use them as windows and slides of relational data.

What is Oracle Data Mining?

Automatically sifting through large amounts of data to find previously hidden patterns, discover valuable new insights and make predictions
• Identify most important factor (Attribute Importance)
• Predict customer behavior (Classification)
• Predict or estimate a value (Regression)
• Find profiles of targeted people or items (Decision Trees)
• Segment a population (Clustering)
• Find fraudulent or “rare events” (Anomaly Detection)
• Determine co-occurring items in a “basket” (Associations)

Data Mining Provides

Better Information, Valuable Insights and Predictions

[Figure: Cell Phone Churners vs. Loyal Customers, plotted by Income against Customer Months]

Insight & Prediction

Segment #3:
IF CUST_MO > 7 AND INCOME < $175K, THEN Prediction = Cell Phone Churner, Confidence = 83%, Support = 6/39

Segment #1:
IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Cell Phone Churner, Confidence = 100%, Support = 8/39
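
The two segment rules on this slide can be applied to a single customer record with a few lines of code. The sketch below is a hand-written Python illustration, not output from Oracle Data Mining. The `Rule` class, the `predict_churn` name and the decision to test the narrower Segment #1 rule first are assumptions; the thresholds, confidence and support values are copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    min_months: int     # rule fires when CUST_MO > min_months
    max_income: int     # ... and INCOME < max_income
    confidence: float   # as printed on the slide
    support: str        # as printed on the slide


# Narrower rule first (an assumption), so a customer matching both
# rules is reported under Segment #1.
RULES = [
    Rule("Segment #1", 14, 90_000, 1.00, "8/39"),
    Rule("Segment #3", 7, 175_000, 0.83, "6/39"),
]


def predict_churn(cust_mo, income):
    """Return the first matching churn rule, or None if no rule applies."""
    for rule in RULES:
        if cust_mo > rule.min_months and income < rule.max_income:
            return rule
    return None
```

For example, `predict_churn(10, 120_000)` returns the Segment #3 rule, while a customer with 3 months of tenure matches neither rule and yields `None`.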

A Real Fraud Example

My credit card statement: can you see the fraud?

Date      Time       Category  Merchant      Amount
May 22    1:14 PM    FOOD      Monaco Café   $127.38
May 22    7:32 PM    WINE      Wine Bistro   $28.00
…
June 14   2:05 PM    MISC      Mobil Mart    $75.00
June 14   2:06 PM    MISC      Mobil Mart    $75.00
June 15   11:48 AM   MISC      Mobil Mart    $75.00
June 15   11:49 AM   MISC      Mobil Mart    $75.00
May 28    6:31 PM    WINE      Acton Shop    $31.00
May 29    8:39 PM    FOOD      Crossroads    $128.14
June 16   11:48 AM   MISC      Mobil Mart    $75.00
June 16   11:49 AM   MISC      Mobil Mart    $75.00

Callouts on the slide:
• Total purchases exceed time period average
• Monaco?
• Gas Station?
• All same $75 amount?
• Pairs of $75?
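
One of the hints on this slide, pairs of identical $75 charges at the same merchant a minute apart, is easy to express as a rule. The sketch below is a hand-coded Python check for illustration only; it is not the anomaly detection model used in the talk. The rows are transcribed from the slide, while the year, the five-minute gap and the function names are assumptions.

```python
from datetime import datetime, timedelta

# Rows transcribed from the statement on the slide.
STATEMENT = [
    ("May 22", "1:14 PM", "FOOD", "Monaco Café", 127.38),
    ("May 22", "7:32 PM", "WINE", "Wine Bistro", 28.00),
    ("June 14", "2:05 PM", "MISC", "Mobil Mart", 75.00),
    ("June 14", "2:06 PM", "MISC", "Mobil Mart", 75.00),
    ("June 15", "11:48 AM", "MISC", "Mobil Mart", 75.00),
    ("June 15", "11:49 AM", "MISC", "Mobil Mart", 75.00),
    ("May 28", "6:31 PM", "WINE", "Acton Shop", 31.00),
    ("May 29", "8:39 PM", "FOOD", "Crossroads", 128.14),
    ("June 16", "11:48 AM", "MISC", "Mobil Mart", 75.00),
    ("June 16", "11:49 AM", "MISC", "Mobil Mart", 75.00),
]


def parse(date, time, year=2013):
    # The slide shows no year; 2013 is an assumption.
    return datetime.strptime(f"{date} {year} {time}", "%B %d %Y %I:%M %p")


def repeated_charges(rows, max_gap=timedelta(minutes=5)):
    """Return pairs of consecutive charges with the same merchant and amount
    made within max_gap of each other."""
    parsed = sorted((parse(d, t), merchant, amount) for d, t, _cat, merchant, amount in rows)
    pairs = []
    for (t1, m1, a1), (t2, m2, a2) in zip(parsed, parsed[1:]):
        if m1 == m2 and a1 == a2 and t2 - t1 <= max_gap:
            pairs.append((t1, t2, m1, a1))
    return pairs
```

A fixed rule like this only catches the pattern someone already thought of, which is the contrast the talk draws with mined models that learn what is unusual from the data.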

“Essentially, all models are wrong, …but some are useful.”
- George Box

(One of the most influential statisticians of the 20th century and a pioneer in the areas of quality control, time series analysis, design of experiments and Bayesian inference.)

You Can Think of It Like This…

Traditional SQL
• “Human-driven” queries
• Domain expertise
• Any “rules” must be defined and managed
• SQL Queries: SELECT, DISTINCT, AGGREGATE, WHERE, AND OR, GROUP BY, ORDER BY, RANK

+

Oracle Data Mining
• Automated knowledge discovery, model building and deployment
• Domain expertise to assemble the “right” data to mine
• ODM “Verbs”: PREDICT, DETECT, CLUSTER, CLASSIFY, REGRESS, PROFILE, IDENTIFY FACTORS, ASSOCIATE

Real-time Prediction for a Customer

• On-the-fly, single record apply with new data (e.g. from call center)

SELECT prediction_probability(CLAS_DT_5_2, 'Yes'
         USING 7800 AS bank_funds, 125 AS checking_amount,
               20 AS credit_balance, 55 AS age,
               'Married' AS marital_status,
               250 AS MONEY_MONTLY_OVERDRAWN, 1 AS house_ownership)
FROM dual;

[Diagram labels: Social, Call, Branch, ECM, BI, Get, Web, Email, CRM, Mobile]

Predictive and Recommendation Analytics
Real Time Data Mining Modeling with Streaming Events

• Combine Real Time Event Streaming Data Technologies with the Industry leading Oracle Historical Data Mining:
  – Oracle Data Mining
    • Rich set of Algorithms for Data Mining
    • Predict Customer Behavior
    • Find Profiles of Targeted People or Items, and determine important relationships
    • Immediately Predict Trends and Themes for Data in motion
    • Respond to Prevent Business Threats and take Advantage of Opportunities

Acting Oracle Data Mining: 

Technology Behind the America’s Cup Win
• “The USA holds 250 sensors to collect raw data: pressure sensors on the wing; angle sensors on the adjustable trailing edge of the wing sail to monitor the effectiveness of each adjustment, allowing the crew to ascertain the amount of lift it’s generating; and fiber-optic strain sensors on the mast and wing to allow maximum thrust without over bending them.

• But collecting data was only the beginning. ORACLE Racing also had to manage that data, analyze it, and present useful results……”

Source: http://www.sail-world.com/USA/Americas-Cup:-Oracle-Data-Mining-supports-crew-and-BMW-ORACLE-Racing/68834

Fast Data Mining Demo:

Fraud Prediction in action…

▪ Extract knowledge starting from a CSV file
▪ Execute anomaly detection mining on stored data
▪ Put in place a real-time event processing flow
▪ Consume events from the In-Memory Data Grid
▪ Instantly obtain fraud predictions from streaming data
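
The demo steps above follow a train-then-score shape: learn what normal looks like from stored data, then judge each incoming event as it arrives. The Python sketch below illustrates that shape only. A simple per-card z-score threshold stands in for the Oracle Data Mining anomaly detection model, and a plain iterable stands in for events consumed from the data grid; the CSV columns, sample values and function names are all hypothetical.

```python
import csv
import io
from statistics import mean, stdev

# Hypothetical historical data; in the demo this would come from a CSV file.
HISTORY_CSV = """card,amount
C1,20.00
C1,25.50
C1,22.10
C1,19.90
C1,24.00
C1,21.30
"""


def fit(csv_text):
    """Learn a per-card (mean, standard deviation) profile from stored data."""
    amounts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        amounts.setdefault(row["card"], []).append(float(row["amount"]))
    return {card: (mean(v), stdev(v)) for card, v in amounts.items() if len(v) > 1}


def score(model, events, k=3.0):
    """Yield ((card, amount), is_anomaly) for each streaming event as it arrives."""
    for card, amount in events:
        profile = model.get(card)
        if profile is None:
            # No history for this card: nothing to compare against.
            yield (card, amount), False
            continue
        mu, sigma = profile
        yield (card, amount), abs(amount - mu) > k * sigma
```

Because `score` is a generator, each event is judged as soon as it is read, which matches the idea of obtaining a prediction from data in motion instead of from a batch.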

Q&A

Thanks!

Weitere ähnliche Inhalte

Was ist angesagt?

Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks
Pactera_US
 
Blue Canopy Semantic Web Approach v25 brief
Blue Canopy Semantic Web Approach v25 briefBlue Canopy Semantic Web Approach v25 brief
Blue Canopy Semantic Web Approach v25 brief
Nick Savage
 
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Vishal Bamba
 
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Publicis Sapient Engineering
 
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
pietvz
 

Was ist angesagt? (20)

Harnessing Hadoop Distuption: A Telco Case Study
Harnessing Hadoop Distuption: A Telco Case StudyHarnessing Hadoop Distuption: A Telco Case Study
Harnessing Hadoop Distuption: A Telco Case Study
 
Left Brain, Right Brain: How to Unify Enterprise Analytics
Left Brain, Right Brain: How to Unify Enterprise AnalyticsLeft Brain, Right Brain: How to Unify Enterprise Analytics
Left Brain, Right Brain: How to Unify Enterprise Analytics
 
Dealing with Dark Data
Dealing with Dark DataDealing with Dark Data
Dealing with Dark Data
 
Dark Data Discovery & Governance with File Analysis
Dark Data Discovery & Governance with File AnalysisDark Data Discovery & Governance with File Analysis
Dark Data Discovery & Governance with File Analysis
 
Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks
 
Data science workshop
Data science workshopData science workshop
Data science workshop
 
HPE IDOL 10 (Intelligent Data Operating Layer)
HPE IDOL 10 (Intelligent Data Operating Layer)HPE IDOL 10 (Intelligent Data Operating Layer)
HPE IDOL 10 (Intelligent Data Operating Layer)
 
Finding fraud in large, diverse data sets
Finding fraud in large, diverse data setsFinding fraud in large, diverse data sets
Finding fraud in large, diverse data sets
 
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect...
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
 
Blue Canopy Semantic Web Approach v25 brief
Blue Canopy Semantic Web Approach v25 briefBlue Canopy Semantic Web Approach v25 brief
Blue Canopy Semantic Web Approach v25 brief
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
 
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
 
The Analytics Continuum
The Analytics ContinuumThe Analytics Continuum
The Analytics Continuum
 
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
 
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
 
Detection of Anomalous Behavior
Detection of Anomalous BehaviorDetection of Anomalous Behavior
Detection of Anomalous Behavior
 
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
 
A Modern Data Strategy for Precision Medicine
A Modern Data Strategy for Precision MedicineA Modern Data Strategy for Precision Medicine
A Modern Data Strategy for Precision Medicine
 

Andere mochten auch

Big Data v Data Mining
Big Data v Data MiningBig Data v Data Mining
Big Data v Data Mining
University of Hertfordshire
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
smj
 

Andere mochten auch (7)

A review on data mining
A  review on data miningA  review on data mining
A review on data mining
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : Concepts
 
Data Mining
Data MiningData Mining
Data Mining
 
Big Data v Data Mining
Big Data v Data MiningBig Data v Data Mining
Big Data v Data Mining
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data mining
Data miningData mining
Data mining
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 

Ähnlich wie Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making

8 from zero to insight with real time big data
8 from zero to insight with real time big data8 from zero to insight with real time big data
8 from zero to insight with real time big data
Dr. Wilfred Lin (Ph.D.)
 
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Denodo
 
How Can Analytics Improve Business?
How Can Analytics Improve Business?How Can Analytics Improve Business?
How Can Analytics Improve Business?
Inside Analysis
 

Ähnlich wie Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making (20)

oracleadvancedanalyticsv2otn-2859525.pptx
oracleadvancedanalyticsv2otn-2859525.pptxoracleadvancedanalyticsv2otn-2859525.pptx
oracleadvancedanalyticsv2otn-2859525.pptx
 
Introduction to AutoML and Data Science using the Oracle Autonomous Database ...
Introduction to AutoML and Data Science using the Oracle Autonomous Database ...Introduction to AutoML and Data Science using the Oracle Autonomous Database ...
Introduction to AutoML and Data Science using the Oracle Autonomous Database ...
 
A6 big data_in_the_cloud
A6 big data_in_the_cloudA6 big data_in_the_cloud
A6 big data_in_the_cloud
 
Ask bigger questions
Ask bigger questionsAsk bigger questions
Ask bigger questions
 
CDO - Chief Data Officer Momentum and Trends
CDO - Chief Data Officer Momentum and TrendsCDO - Chief Data Officer Momentum and Trends
CDO - Chief Data Officer Momentum and Trends
 
8 from zero to insight with real time big data
8 from zero to insight with real time big data8 from zero to insight with real time big data
8 from zero to insight with real time big data
 
How to Consume Your Data for AI
How to Consume Your Data for AIHow to Consume Your Data for AI
How to Consume Your Data for AI
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
 
Optimize IT Infrastructure
Optimize IT InfrastructureOptimize IT Infrastructure
Optimize IT Infrastructure
 
eBook: 5 Steps to Secure Cloud Data Governance
eBook: 5 Steps to Secure Cloud Data GovernanceeBook: 5 Steps to Secure Cloud Data Governance
eBook: 5 Steps to Secure Cloud Data Governance
 
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
 
Automatic Data Reconciliation, Data Quality, and Data Observability.pdf
Automatic Data Reconciliation, Data Quality, and Data Observability.pdfAutomatic Data Reconciliation, Data Quality, and Data Observability.pdf
Automatic Data Reconciliation, Data Quality, and Data Observability.pdf
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018
 
Discovering Big Data in the Fog: Why Catalogs Matter
 Discovering Big Data in the Fog: Why Catalogs Matter Discovering Big Data in the Fog: Why Catalogs Matter
Discovering Big Data in the Fog: Why Catalogs Matter
 
Big data oracle_introduccion
Big data oracle_introduccionBig data oracle_introduccion
Big data oracle_introduccion
 
The Power of Data
The Power of DataThe Power of Data
The Power of Data
 
Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2
 
How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?
 
How Can Analytics Improve Business?
How Can Analytics Improve Business?How Can Analytics Improve Business?
How Can Analytics Improve Business?
 
OFSAA - BIGDATA - IBANK
OFSAA - BIGDATA - IBANKOFSAA - BIGDATA - IBANK
OFSAA - BIGDATA - IBANK
 

Mehr von Codemotion

Mehr von Codemotion (20)

Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
 
Pompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending storyPompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending story
 
Pastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storiaPastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storia
 
Pennisi - Essere Richard Altwasser
Pennisi - Essere Richard AltwasserPennisi - Essere Richard Altwasser
Pennisi - Essere Richard Altwasser
 
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
 
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
 
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
 
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 - Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
 
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
 
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
 
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
 
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
 
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
 
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
 
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
 
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
 
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
 
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
 
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
 
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making

  • 1. Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
 Nino Guarnacci, nino.guarnacci@oracle.com. Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
  • 2. Data Explosion
 Web & social networks experienced it first… (Infographic by Go-gulf.com)
  • 3. … but enterprises are now facing it too
    • Services and web transaction data (to refine recommendations, detect trends, etc.)
    • “Sensor” data: GPS in mobile phones, RFIDs, NFC, SmartMeters, etc.
    • Log file monitoring and analysis
    • Security monitoring
    Utilities deploying smart meters? 200x information flowing to the data center!
  • 4. Survey figures shown on the slide: 93%, 89%, 67%. Captions: executives who would grade themselves C or lower in preparedness; executives who say drawing intelligence from data is top priority; executives who believe their organization is losing revenue as a result of not being able to fully leverage information. Source: Oracle Research Study - From Overload to Impact: An Industry Scorecard on Big Data Business Challenges, July 2012
  • 5. Obstacles to Faster Data Management – Latency Gap, While Ensuring Accuracy, Efficiency, and Scale. [Chart: Business Value against Action Time, marking Business event, Data captured, Analysis completed, Action taken; labels “The Gap” and “Fragmented event entities”.] Source: Richard Hackathorn’s Components of Action Time
  • 6. Obstacles to Faster Data Management – Latency Gap, While Ensuring Accuracy, Efficiency, and Scale. [Chart: Business Value against Action Time, marking Business event, Data captured, Analysis completed, Action taken; labels “The Gap” and “Fragmented event entities”.] Source: Richard Hackathorn’s Components of Action Time
  • 7. What is Fast Data? Turning High Velocity Data into Value
    ▪ It’s about getting more from in-flight data
    ▪ It’s about faster action, faster insights
    ▪ It’s about running your business in real-time
  • 8. Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity. Stages: FILTER & CORRELATE, MOVE & TRANSFORM, ANALYZE, ACT
  • 9. Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity. Stage: FILTER & CORRELATE. Diagram labels: Network Status, In-Memory Data Grid, Real Time Streams Information
    • Parallel multiple streams: JMS, files, Coherence, DB, ...
    • Different object types: text, Java objects, …
    • High throughput for data aggregation and event querying
    Coherence Data Grid holds the data and computes in parallel
  • 10. Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity. Event streams carry typed events (event-types). EPN (Event Processing Network) elements shown: Adapter, Channel, Cache, POJO, JSON, Processor, HTTP Pub/Sub
  • 11. Oracle Event Processing. SLA Detection: Pattern Matching
    Inputs: STREAMS, DATABASE, SPATIAL. The pattern is matched within a TIME WINDOW.
    Sample trace event from the stream (shown three times on the slide):
      <TRACE>
        <ID_TRACED_ENTITY>HH310665064IT</ID_TRACED_ENTITY>
        <TRACED_ENTITY>PACCO</TRACED_ENTITY>
        <WHAT_HAPPENED>ESI_SDA</WHAT_HAPPENED>
        <WHEN_HAPPENED>2013-09-12</WHEN_HAPPENED>
        <WHERE_HAPPENED_DETAIL>
          <OFFICE>
            <WHERE_DESCRIPTION>MONZA</WHERE_DESCRIPTION>
            <WHERE_ID>MZ</WHERE_ID>
          </OFFICE>
        </WHERE_HAPPENED_DETAIL>
      </TRACE>
    Schematic query:
      SELECT M.SLA_VIOLATED
      FROM TRACE IN CHANNEL, ENTITIES, SPATIAL CONTEXT
      MATCH_RECOGNIZE (
        MEASURES SLA_VIOLATED
        PATTERN (A B)
        DEFINE A (DELIVERY TIME - NOW) < 2 DAYS
               B DISTANCE BETWEEN (LOCATION, DESTINATION) > 600 KM
      ) as M
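The schematic query on slide 11 can be read as: flag a parcel when one trace event shows less than two days left before delivery (A) and the next event for the same parcel is still more than 600 km from its destination (B). The Python below is only a sketch of that reading, not Oracle Event Processing code. The `Trace` fields, the coordinates, and the haversine distance are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Trace:
    # All field names are hypothetical; the slide only shows the XML trace.
    parcel_id: str
    when: datetime        # time the event was observed
    due: datetime         # promised delivery time
    location: tuple       # (lat, lon) of the office reporting the event
    destination: tuple    # (lat, lon) of the delivery address


def distance_km(p, q):
    """Great-circle distance between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def is_a(t):
    """Condition A: less than 2 days remain before the delivery deadline."""
    return t.due - t.when < timedelta(days=2)


def is_b(t):
    """Condition B: the parcel is still more than 600 km from its destination."""
    return distance_km(t.location, t.destination) > 600


def sla_violations(traces):
    """Yield parcel ids where an A event is immediately followed by a B event."""
    last = {}  # most recent event seen per parcel
    for t in sorted(traces, key=lambda t: t.when):
        prev = last.get(t.parcel_id)
        if prev is not None and is_a(prev) and is_b(t):
            yield t.parcel_id
        last[t.parcel_id] = t
```

Keeping only the previous event per parcel mirrors the two-step pattern (A B); a longer pattern would need a small state machine per parcel instead.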
  • 12. Oracle Event Processing. SLA Detection: Filtering & Correlation
    ▪ Aggregate and correlate received filter-events
    ▪ Partition probable SLA violations by trip path
    Schematic query (SLA_VIOLATED_OUT_CHANNEL carries the output of the pattern-matching query from slide 11):
      ISTREAM(
        SELECT COUNT(*), START_OFFICE, WHERE_HAPPENED, LATITUDE, LONGITUDE
        FROM SPATIAL_CONTEXT, SLA_VIOLATED_OUT_CHANNEL
        PARTITION BY START_OFFICE, WHERE_HAPPENED
        WITHIN 1 HOUR
        GROUP BY START_OFFICE
        HAVING COUNT(*) > 5
      )
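The correlation step on slide 12 amounts to a sliding one-hour count of probable SLA violations per starting office, raising a signal once the count exceeds 5. A minimal Python sketch of that idea follows. The class name `WindowedCounter` and its interface are hypothetical, and it assumes events arrive in time order, which a real event processing engine would handle for you.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class WindowedCounter:
    """Counts events per key inside a sliding time window (hypothetical helper)."""

    def __init__(self, window=timedelta(hours=1), threshold=5):
        self.window = window
        self.threshold = threshold
        self._events = defaultdict(deque)  # key -> timestamps still inside the window

    def add(self, start_office, when):
        """Record one violation; return the count if it exceeds the threshold, else None."""
        q = self._events[start_office]
        q.append(when)
        # Drop timestamps that have fallen out of the window (assumes in-order arrival).
        while q and when - q[0] > self.window:
            q.popleft()
        return len(q) if len(q) > self.threshold else None
```

Feeding it six violations for office "MZ" within the same hour returns a count on the sixth call, matching the `HAVING COUNT(*) > 5` condition.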
  • 13. Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity. Real-time stream analysis: correlate events from different sources, then manage and use them as windowed, sliding relational data.
    What is Oracle Data Mining? Automatically sifting through large amounts of data to find previously hidden patterns, discover valuable new insights and make predictions
    • Identify most important factor (Attribute Importance)
    • Predict customer behavior (Classification)
    • Predict or estimate a value (Regression)
    • Find profiles of targeted people or items (Decision Trees)
    • Segment a population (Clustering)
    • Find fraudulent or “rare events” (Anomaly Detection)
    • Determine co-occurring items in a “basket” (Associations)
  • 14. Data Mining Provides Better Information, Valuable Insights and Predictions. [Chart: Cell Phone Churners vs. Loyal Customers, plotting Income against Customer Months.] Insight & Prediction:
    Segment #1: IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Cell Phone Churner, Confidence = 100%, Support = 8/39
    Segment #3: IF CUST_MO > 7 AND INCOME < $175K, THEN Prediction = Cell Phone Churner, Confidence = 83%, Support = 6/39
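The two segments on slide 14 are IF/THEN rules of the kind a decision tree model produces. As a rough illustration, they can be encoded and applied to a single customer as below. This is a hand-written sketch, not output of Oracle Data Mining; the `Rule` type and `predict` function are hypothetical, and checking the narrower Segment #1 before Segment #3 is an assumption, since the slide does not say how overlapping rules are resolved.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[int, float], bool]  # (customer months, income in dollars)
    prediction: str
    confidence: float
    support: str


# Thresholds, confidence and support are copied from the slide.
RULES = [
    Rule("Segment #1", lambda months, income: months > 14 and income < 90_000,
         "Cell Phone Churner", 1.00, "8/39"),
    Rule("Segment #3", lambda months, income: months > 7 and income < 175_000,
         "Cell Phone Churner", 0.83, "6/39"),
]


def predict(cust_months: int, income: float) -> Optional[Rule]:
    """Return the first matching rule, or None when no segment applies."""
    for rule in RULES:
        if rule.applies(cust_months, income):
            return rule
    return None
```

A customer with 20 months of tenure and a $50K income falls into Segment #1; one with 10 months and $100K falls into Segment #3.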
  • 15. A Real Fraud Example. My credit card statement: can you see the fraud?
    Date     Time      Category  Merchant     Amount
    May 22   1:14 PM   FOOD      Monaco Café  $127.38
    May 22   7:32 PM   WINE      Wine Bistro  $28.00
    …
    June 14  2:05 PM   MISC      Mobil Mart   $75.00
    June 14  2:06 PM   MISC      Mobil Mart   $75.00
    June 15  11:48 AM  MISC      Mobil Mart   $75.00
    June 15  11:49 AM  MISC      Mobil Mart   $75.00
    May 28   6:31 PM   WINE      Acton Shop   $31.00
    May 29   8:39 PM   FOOD      Crossroads   $128.14
    June 16  11:48 AM  MISC      Mobil Mart   $75.00
    June 16  11:49 AM  MISC      Mobil Mart   $75.00
    Callouts: Total purchases exceed the time period average. Gas station? Monaco? All the same $75 amount? Pairs of $75?
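One of the clues on slide 15, pairs of identical $75 charges at the same merchant a minute apart, is easy to check mechanically. The sketch below is a simple hand-coded heuristic for that single clue, not the anomaly detection model the deck refers to. The function name `suspicious_pairs`, the tuple layout, and the five-minute gap are assumptions; amounts are kept in cents to avoid floating-point comparison.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def suspicious_pairs(transactions, max_gap=timedelta(minutes=5)):
    """Find consecutive charges with the same merchant and amount close in time.

    transactions: iterable of (when, merchant, amount_cents) tuples.
    Returns a list of (merchant, amount_cents, earlier, later) tuples.
    """
    by_key = defaultdict(list)
    for when, merchant, amount_cents in transactions:
        by_key[(merchant, amount_cents)].append(when)

    pairs = []
    for (merchant, amount_cents), times in by_key.items():
        times.sort()
        # Compare each charge with the next one for the same merchant and amount.
        for earlier, later in zip(times, times[1:]):
            if later - earlier <= max_gap:
                pairs.append((merchant, amount_cents, earlier, later))
    return pairs
```

Applied to the statement above, this picks out the three Mobil Mart pairs and ignores the restaurant and wine purchases.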
  • 16. “Essentially, all models are wrong, but some are useful.” - George Box (one of the most influential statisticians of the 20th century and a pioneer in the areas of quality control, time series analysis, design of experiments and Bayesian inference)
  • 17. You Can Think of It Like This… Traditional SQL + Oracle Data Mining
    Traditional SQL:
    • “Human-driven” queries
    • Domain expertise
    • Any “rules” must be defined and managed
    • SQL queries: SELECT, DISTINCT, AGGREGATE, WHERE, AND OR, GROUP BY, ORDER BY, RANK
    Oracle Data Mining:
    • Automated knowledge discovery, model building and deployment
    • Domain expertise to assemble the “right” data to mine
    • ODM “verbs”: PREDICT, DETECT, CLUSTER, CLASSIFY, REGRESS, PROFILE, IDENTIFY FACTORS, ASSOCIATE
  • 18. Real-time Prediction for a Customer
    • On-the-fly, single-record apply with new data (e.g. from call center)
      Select prediction_probability(CLAS_DT_5_2, 'Yes'
        USING 7800 as bank_funds, 125 as checking_amount, 20 as credit_balance,
              55 as age, 'Married' as marital_status,
              250 as MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)
      from dual;
    Diagram labels: Social, Call, Branc, ECM, BI, Get, Web, Email, CRM, Mobile
  • 19. Predictive and Recommendation Analytics: Real Time Data Mining Modeling with Streaming Events
    • Combine real-time event streaming data technologies with the industry-leading Oracle historical data mining:
      – Oracle Data Mining
    • Rich set of algorithms for data mining
    • Predict customer behavior
    • Find profiles of targeted people or items, and determine important relationships
    • Immediately predict trends and themes for data in motion
    • Respond to prevent business threats and take advantage of opportunities
  • 20. Acting. Oracle Data Mining: Technology Behind the America’s Cup Win
    • “The USA holds 250 sensors to collect raw data: pressure sensors on the wing; angle sensors on the adjustable trailing edge of the wing sail to monitor the effectiveness of each adjustment, allowing the crew to ascertain the amount of lift it’s generating; and fiber-optic strain sensors on the mast and wing to allow maximum thrust without over bending them.
    • But collecting data was only the beginning. ORACLE Racing also had to manage that data, analyze it, and present useful results……
    Source: http://www.sail-world.com/USA/Americas-Cup:-Oracle-Data-Mining-supports-crew-and-BMW-ORACLE-Racing/68834
  • 21. Fast Data Mining Demo: Fraud Prediction in action…
    ▪ Extract knowledge starting from a CSV file
    ▪ Execute anomaly detection mining on stored data
    ▪ Put in place a real-time event processing flow
    ▪ Consume events from the In-Memory Data Grid
    ▪ Obtain fraud predictions instantly from streaming data
  • 23. Thanks! Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
 Nino Guarnacci, nino.guarnacci@oracle.com