A HIGH THROUGHPUT
COMPLEX EVENT DETECTION TECHNIQUE
WITH BULK EVALUATION
Naotaka Nishimura (University of Tsukuba)
Hideyuki Kawashima (University of Tsukuba)
Hiroyuki Kitagawa (University of Tsukuba)
Outline
Background
Related work (SASE)
State of the art
Chance for further improvement
Proposal: Bulk evaluation
Extension of SASE
Evaluation
Conclusions and Future work
Big Data Streams
-- Volume, Velocity, Variety, Veracity, Value --
Social network: Facebook, 600 TB in a day (VLDB’13 Keynote)
Monitoring system: Cisco CRS-3, 322 Tbps
Science: LHC, 15 PB / year; LSST, 20 TB / day
Quick Review: Data Stream Management System (DSMS)
Example question: how many packets arrived for port 80 in the last minute?
Q1:
SELECT COUNT(*)
FROM eth0[TIME 1 MIN]
WHERE port = 80
The DSMS evaluates Q1 over the stream and returns the result (20 in this example).
Relational schema of relation eth0:
・Destination IP
・Source IP
・Destination Port
・Source Port
・Interface (e.g. eth0)
・Length
・Version (e.g. IPv4)
・Payload
SELECT COUNT(*)
FROM eth0[TIME 1 MIN]
WHERE port = 80
[Figure: DSMS architecture. Data → Input adapter → operator tree for the query (w → σ → α) → Output adapter → Result → Users/Apps.]
SQL is translated to an operator tree.
On arrival of data, the tree is evaluated.
Operators are based on those of relational databases:
w (Window): cuts finite relations out of a stream
σ (Selection): filter
α (Aggregation): such as AVG, MIN, MAX
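The operator tree w → σ → α for Q1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name, the (timestamp, port) tuple layout, and the eviction rule for the 1-minute window are hypothetical and not taken from any particular DSMS.

```python
from collections import deque


class CountPort80LastMinute:
    """Hypothetical operator tree for Q1: window (w) -> selection (sigma) -> aggregation (alpha)."""

    def __init__(self, window_seconds=60, port=80):
        self.window_seconds = window_seconds
        self.port = port
        self.window = deque()  # (timestamp, port) tuples currently inside the window

    def on_arrival(self, timestamp, port):
        # w: add the new tuple and evict tuples that fell out of the time window
        self.window.append((timestamp, port))
        while self.window and self.window[0][0] <= timestamp - self.window_seconds:
            self.window.popleft()
        # sigma: keep only tuples for the requested port
        selected = [t for t in self.window if t[1] == self.port]
        # alpha: COUNT(*)
        return len(selected)


q1 = CountPort80LastMinute()
counts = [q1.on_arrival(ts, port) for ts, port in [(0, 80), (10, 443), (30, 80), (65, 80)]]
print(counts)  # [1, 1, 2, 2]
```

The tree is re-evaluated on every arrival, which mirrors "On arrival of data, the tree is evaluated"; a real system would likely maintain the count incrementally instead of rescanning the window.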
Complex Event Processing (CEP)
Detect a certain pattern from input stream data
Stream data: A1, A2, B3, C4, E5, D6, D7, …
Query pattern: A→B→D
Pattern occurrences (sequences of events specified by the user):
A1→B3→D6, A2→B3→D6, A1→B3→D7, A2→B3→D7, …
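To make the notion of a pattern occurrence concrete, the following sketch enumerates all occurrences of A→B→D by brute force. It assumes that an event is a string whose first character is its type (for example "A1"); the function name is made up for this illustration, and this is not how SASE or the proposal evaluates patterns.

```python
def pattern_occurrences(stream, pattern):
    """Enumerate every subsequence of `stream` whose event types spell `pattern`."""
    results = []

    def extend(prefix, start, depth):
        if depth == len(pattern):
            results.append(tuple(prefix))
            return
        for i in range(start, len(stream)):
            if stream[i][0] == pattern[depth]:
                # Later events must come after the chosen one, hence i + 1
                extend(prefix + [stream[i]], i + 1, depth + 1)

    extend([], 0, 0)
    return results


stream = ["A1", "A2", "B3", "C4", "E5", "D6", "D7"]
for occurrence in pattern_occurrences(stream, "ABD"):
    print("→".join(occurrence))
```

For the stream on this slide the sketch yields the four occurrences listed above.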
Complex Event Processing (CEP)
A use case: order management in a restaurant.
Detect a guest who passed the entrance and then took a seat.
Pattern: Entrance→Seat
[Figure: restaurant floor plan (Entrance, Floor, Toilet, Seat1 to Seat6) and an RFID reading table with columns Place, Time, and TagID, holding readings for tags xx and yy between 9:54:11 and 10:11:11.]

A pattern occurrence is constructed by 2
SASE [1] Overview (1/2)
[1]: High-Performance Complex Event Processing over Streams, ACM SIGMOD 2006
SASE detects specified patterns using an NFA (nondeterministic finite automaton).
NFA (quick review)
An NFA is a finite automaton that can be in multiple states at the same time.
A finite automaton (FA) moves from its current state to a next state on each input symbol. It consists of a set of states, an initial state, accepting states, an input alphabet, and a transition function.
Ex) NFA that detects A→B→D
• Self transition (*): a self-loop that is taken on every event.
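A minimal simulation of such an NFA is sketched below. It assumes states 0 to 3 with self-loops on the intermediate states 1 and 2, and it keeps state 0 active so that a match can start at any event; both the function name and that last choice are assumptions made for this illustration.

```python
def run_nfa(events, pattern="ABD"):
    """Return the events at which the NFA for `pattern` reaches its accepting state."""
    accept = len(pattern)
    active = {0}  # an NFA can be in several states at once
    accepted_at = []
    for event in events:
        symbol = event[0]
        nxt = {0}  # assumption: the initial state stays active
        for state in active:
            if 0 < state < accept:
                nxt.add(state)  # self transition (*), taken on every event
            if state < accept and symbol == pattern[state]:
                nxt.add(state + 1)  # forward transition on the expected type
        if accept in nxt:
            accepted_at.append(event)
            nxt.discard(accept)
        active = nxt
    return accepted_at


print(run_nfa(["A1", "A2", "B3", "C4", "E5", "D6", "D7"]))  # ['D6', 'D7']
```

The output only says that the pattern completed at D6 and at D7; it does not say which A and B events took part in each match.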
SASE Overview (2/2)
Problem of NFA:
An NFA can detect that a specified pattern occurred, but it does not produce the pattern occurrences (the sequences of input events that reached the accepting state).
SASE
Uses a stack structure, the AIS (Active Instance Stack), to output pattern occurrences.
An AIS is prepared for each state.
[Figure: NFA for A→B→D (states 0 to 3, self-loops * on states 1 and 2) with an AIS attached to each of states 1, 2, and 3.]
Behavior of SASE (1/3)
Translate a query pattern to an NFA
[Figure: query pattern A→B→D translated to an NFA: 0 -A→ 1 -B→ 2 -D→ 3, with self-loops (*) on states 1 and 2.]
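Behavior of SASE (2/3)
Prepare an AIS for each state of the NFA
Create a link when an event is pushed
Event arrival sequence (in time order t): a1, c2, b3, a4, d5
[Figure: AIS contents after these events. AIS of state 1: a1, a4; AIS of state 2: b3; AIS of state 3: d5.]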
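Behavior of SASE (3/3)
When the accepting state is reached, create the pattern occurrences by following the link information
[Figure: AIS contents (state 1: a1, a4; state 2: b3; state 3: d5) and the pattern occurrence built from the links: a1→b3→d5.]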
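The three steps above can be sketched as follows. This is a simplified reading of the slides, not the SASE implementation: each pushed instance stores the index of the most recent instance in the previous AIS (the link), the pattern is assumed to consist of distinct single-character event types, and events of the final type trigger result construction immediately instead of being kept. All names are hypothetical.

```python
def build(stacks, level, instance):
    """Follow links backwards from `instance` and return all occurrences ending in it."""
    event, prev_top = instance
    if level == 0:
        return [(event,)]
    occurrences = []
    # Every instance at or below the linked one in the previous AIS arrived earlier
    for j in range(prev_top + 1):
        for prefix in build(stacks, level - 1, stacks[level - 1][j]):
            occurrences.append(prefix + (event,))
    return occurrences


def sase(events, pattern="abd"):
    stacks = [[] for _ in pattern]  # one AIS per pattern position
    last = len(pattern) - 1
    results = []
    for event in events:
        level = pattern.find(event[0])
        if level < 0:
            continue  # event type not used by the pattern (e.g. c2)
        if level > 0 and not stacks[level - 1]:
            continue  # no earlier instance to link to
        prev_top = len(stacks[level - 1]) - 1 if level > 0 else -1
        instance = (event, prev_top)
        if level == last:
            results.extend(build(stacks, level, instance))  # accepting state reached
        else:
            stacks[level].append(instance)
    return results


print(sase(["a1", "c2", "b3", "a4", "d5"]))  # [('a1', 'b3', 'd5')]
```

Because b3 links to a1 (the top of the first AIS when b3 arrived), a4 is correctly excluded from the occurrence ending in d5.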
Problem of SASE we found
Duplicate generation (e.g. b3 → a1)
Subsequent events: b6, d7, a8, d9
IDEA: If we can evaluate d7 and d9 in a lump, the cost for constructing a1-b3 should be reduced (2 to 1).
[Figure: AIS contents and result generation in SASE. Result generation runs once for d7 and once more for d9; each run yields three pattern occurrences and traverses the same links again.]
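The duplication can be made visible by counting how often each a→b prefix is rebuilt when every d event triggers its own result generation. The sketch below is an illustrative counting model for the pattern a→b→d only; the function name and the event encoding (strings such as "a1") are assumptions, and it does not model the actual link traversal cost.

```python
from collections import Counter


def prefix_build_counts(events):
    """Count how many times each a->b prefix is constructed, one pass per d event."""
    counts = Counter()
    for k, event in enumerate(events):
        if event[0] != "d":
            continue
        # Per-event evaluation: rebuild every a->b prefix that precedes this d
        for i in range(k):
            if events[i][0] != "a":
                continue
            for j in range(i + 1, k):
                if events[j][0] == "b":
                    counts[(events[i], events[j])] += 1
    return counts


counts = prefix_build_counts(["a1", "b3", "a4", "b6", "d7", "a8", "d9"])
print(counts[("a1", "b3")])  # 2: built once for d7 and once again for d9
```

Evaluating d7 and d9 together would let the prefix a1-b3 be built once and reused, which is the "2 to 1" reduction named in the idea above.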
Concept: Bulk Evaluation
[Figure: timeline t of events a1, c2, b3, a4, d5, b6, d7, a8, d9, b10, d11, d12, b13. SASE generates results five times, once per D event; the proposal generates results only twice, in bulk.]
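Behavior of Proposal (1/3)
Create a link when an event is pushed to an AIS
Unlike SASE, keep the D events in the final AIS
Event arrival sequence (in time order t): a1, c2, b3, a4, d5, b6, d7, a8, d9
[Figure: AIS contents after these events. AIS of state 1: a1, a4, a8; AIS of state 2: b3, b6; AIS of state 3: d5, d7, d9.]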
Behavior of Proposal (2/3)
Create a cluster on the final AIS
[Figure: AIS contents before and after clustering. AIS of state 1: a1, a4, a8; AIS of state 2: b3, b6; on the final AIS, d5 stays alone while d7 and d9 are grouped.]
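Behavior of Proposal (3/3)
Create pattern occurrences in bulk
The result for d9 is made from the result for d7
[Figure: AIS contents (state 1: a1, a4, a8; state 2: b3, b6; state 3: d5, d7, d9) and the generated pattern occurrences.]
Generated pattern occurrences:
a1→b3→d5
a1→b3→d7, a1→b6→d7, a4→b6→d7
a1→b3→d9, a1→b6→d9, a4→b6→d9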
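One possible reading of the three steps is sketched below. It rests on assumptions the slides do not spell out: final events are clustered when they link to the same instance in the previous AIS, the shared prefixes are built once per cluster, and bulk generation runs once after the whole batch of events has been pushed. The names and the prefix counter are hypothetical and exist only for this illustration.

```python
def build(stacks, level, instance):
    """Follow links backwards and return all occurrences ending in `instance`."""
    event, prev_top = instance
    if level == 0:
        return [(event,)]
    return [prefix + (event,)
            for j in range(prev_top + 1)
            for prefix in build(stacks, level - 1, stacks[level - 1][j])]


def bulk_evaluate(events, pattern="abd"):
    """Assumes distinct single-character event types and at least two pattern positions."""
    stacks = [[] for _ in pattern]  # one AIS per pattern position
    last = len(pattern) - 1
    for event in events:
        level = pattern.find(event[0])
        if level < 0 or (level > 0 and not stacks[level - 1]):
            continue
        prev_top = len(stacks[level - 1]) - 1 if level > 0 else -1
        stacks[level].append((event, prev_top))  # final (D) events are kept too

    # Cluster the final AIS: events with the same link share all their prefixes
    clusters = {}
    for event, prev_top in stacks[last]:
        clusters.setdefault(prev_top, []).append(event)

    results, prefixes_built = [], 0
    for prev_top, finals in clusters.items():
        prefixes = [p for j in range(prev_top + 1)
                    for p in build(stacks, last - 1, stacks[last - 1][j])]
        prefixes_built += len(prefixes)  # built once per cluster, not once per event
        for final in finals:
            results.extend(p + (final,) for p in prefixes)
    return results, prefixes_built


results, prefixes_built = bulk_evaluate(["a1", "c2", "b3", "a4", "d5", "b6", "d7", "a8", "d9"])
for occurrence in results:
    print("→".join(occurrence))
print(prefixes_built)  # 4
```

Under these assumptions the sketch builds four prefixes (one for the cluster of d5, three shared by d7 and d9), whereas generating results separately for d5, d7, and d9 would build seven.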
Environment for Experiment
OS: Windows XP
RAM: 3 GB
CPU: Intel Core 2 Duo E8400 (3 GHz)
Language: Java (JRE 1.7.0_4)
Result of Experiment
Pattern: A→B→D
[Figure: throughput comparison between SASE and the proposal; the improvement reaches a factor of 5.24.]
Conclusions and Future Work
Conclusions
SASE leaves room for further improvement in throughput.
The bulk evaluation scheme improved throughput,
by a factor of 5.24 in the best case.

Future work
Implementing the proposal in Falcon
[Figure: landscape of related systems and topics. Labels: Privacy preservation (encryption, CryptDB); Data mining & machine learning (Jubatus, MADLib, Bismarck, MLBase, Oracle-R, Online LDA, CPD, Spring (DTW), Incr. LOCI); SQL and NoSQL; relational stream and tuple stream (System S, Borealis, Puma, Kafka, STORM, S4); continual query & window (window join, window aggregate); complex event processing (Esper, Norikra, SASE, Cayuga); accelerators (FPGA, GPGPU, Intel MIC, Tilera); Falcon.]
Falcon
- UDP-RX
- Window join (64 cores)
Performance monitor
Basic: 6.7 million tuples per second
Proposal: 14.6 million tuples per second

SMDMS'13

  • 1. A HIGH THROUGHPUT COMPLEX EVENT DETECTION TECHNIQUE WITH BULK EVALUATION. Naotaka Nishimura (University of Tsukuba), Hideyuki Kawashima (University of Tsukuba), Hiroyuki Kitagawa (University of Tsukuba)
  • 2. Outline: Background; Related work (SASE): state of the art, chance for further improvement; Proposal: bulk evaluation (extension of SASE); Evaluation; Conclusions and future work
  • 3. Big Data Streams: Volume, Velocity, Variety, Veracity, Value. Social network: Facebook, 600 TB per day (VLDB’13 keynote). Monitoring system: Cisco CRS-3, 322 Tbps. Science: LHC, 15 PB/year; LSST, 20 TB/day.
  • 4. Quick Review: Data Stream Management System (DSMS). Example query Q1, "How many packets arrived for port 80 in the last minute?": SELECT COUNT(*) FROM eth0[TIME 1 MIN] WHERE port = 80. The DSMS returns the count (20 in the example). Relational schema of relation eth0: Destination IP, Source IP, Destination Port, Source Port, Interface (e.g. eth0), Length, Version (e.g. IPv4), Payload.
  • 5. Query: SELECT COUNT(*) FROM eth0[TIME 1 MIN] WHERE port = 80. DSMS data flow: Data → Input adapter → w → σ → α → Output adapter → Result → Users/Apps. SQL is translated into an operator tree, and the tree is evaluated whenever data arrives. Operators are based on those of relational databases: w (Window) cuts relations off from a stream; σ (Selection) filters; α (Aggregation) computes aggregates such as AVG, MIN, MAX. CEP (complex event processing)
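The operator tree on slide 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DSMS described in the talk: the class name `PortCountQuery`, the field name `dst_port`, and the choice to interpret `port` as the destination port are all hypothetical.

```python
from collections import deque


class PortCountQuery:
    """Hypothetical continuous query: COUNT(*) FROM eth0[TIME 1 MIN] WHERE port = 80."""

    def __init__(self, port=80, window_seconds=60):
        self.port = port
        self.window_seconds = window_seconds
        self.window = deque()  # (timestamp, packet) pairs currently inside the window

    def on_arrival(self, timestamp, packet):
        # w (window): append the new tuple, then expire tuples older than the window.
        self.window.append((timestamp, packet))
        while self.window and self.window[0][0] <= timestamp - self.window_seconds:
            self.window.popleft()
        # sigma (selection): keep tuples whose (assumed) destination port matches.
        selected = (p for _, p in self.window if p["dst_port"] == self.port)
        # alpha (aggregation): COUNT(*).
        return sum(1 for _ in selected)


query = PortCountQuery()
counts = [
    query.on_arrival(0, {"dst_port": 80}),
    query.on_arrival(10, {"dst_port": 22}),
    query.on_arrival(30, {"dst_port": 80}),
    query.on_arrival(61, {"dst_port": 80}),  # the tuple from t=0 has expired by now
]
```

The whole tree is re-evaluated on every arrival, which mirrors "On arrival of data, tree is evaluated"; a real system would maintain the count incrementally.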
  • 6. Complex Event Processing (CEP) Detect a certain pattern from input stream data Stream Data A1 A2 B3 C4 E5 D6 D7 … Query Pattern:A→B→D Pattern occurrences (sequences of events specified by user) A1→B3→D6 A2→B3→D6 A1→B3→D7 A2→B3→D7 …
  • 7. Complex Event Processing (CEP) A case for order management in a restaurant. Detect a guest who passed entrance and took a seat. Pattern: Entrance→Seat RFID Place Seat2 Seat3 Floor Entrance Seat6 Seat5 9:54:11 xx Toilet Seat4 Entrance 10:10:01 xx 10:10:31 yy Floor Seat1 TagID Entrance Toilet Time 10:10:31 yy Seat5 10:11:11 yy A pattern occurrence is constructed by 2
  • 9. SASE [1] Overview (1/2). [1]: High-Performance Complex Event Processing over Streams, ACM SIGMOD 2006. SASE detects specified patterns using an NFA (nondeterministic finite automaton). NFA (quick review): a finite automaton that can be in multiple states at the same time. A finite automaton moves from its current state to a next state on each input symbol; it consists of an initial state, accepting states, a set of states, a set of input symbols, and a transition function. Ex) NFA that detects A→B→D. Self transition: a self-loop transition that is taken on every event.
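The NFA idea on slide 9 can be simulated by tracking the set of states that are active at once. The sketch below is a minimal illustration, assuming that every non-accepting state (including state 0) has a self-loop and that the first character of an event is its type; the function name `nfa_accepts` is hypothetical.

```python
def nfa_accepts(events, pattern="ABD"):
    """Return True if the NFA for the sequence pattern reaches its accepting state."""
    accepting = len(pattern)
    active = {0}  # an NFA may occupy several states at the same time
    for event in events:
        kind = event[0]
        next_active = set()
        for state in active:
            # Self transition: stay in the current state on any event.
            next_active.add(state)
            # Forward transition: advance when the event type matches.
            if state < accepting and kind == pattern[state]:
                next_active.add(state + 1)
        active = next_active
    return accepting in active


found = nfa_accepts(["A1", "A2", "B3", "C4", "E5", "D6", "D7"])
```

The function only answers whether the pattern occurred. It cannot say which events formed the match, which is the limitation that the next slide addresses.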
  • 10. SASE Overview (2/2). Problem of NFA: an NFA can detect specified patterns, but it does not produce pattern occurrences (the sequences of input events that reached the accepting state). SASE uses a stack structure, the AIS (Active Instance Stack), to output pattern occurrences. One AIS is prepared for each state. [Diagram: NFA 0 -A→ 1 -B→ 2 -D→ 3 with self-loops (*), and an AIS under states 1, 2, and 3.]
  • 11. Behavior of SASE (1/3): translate a query pattern into an NFA. [Diagram: pattern A, B, D becomes NFA 0 -A→ 1 -B→ 2 -D→ 3 with self-loops (*).]
  • 12. Behavior of SASE (2/3): prepare an AIS for each state of the NFA, and create a link when an event is pushed. Event arrival sequence (time t): a1 c2 b3 a4 d5. [Diagram: the AIS of state 1 holds a1, a4; the AIS of state 2 holds b3; the AIS of state 3 holds d5.]
  • 13. Behavior of SASE (3/3): when the accepting state is reached, create a pattern occurrence using the link information. [Diagram: following the links from d5 yields the occurrence a1 b3 d5.]
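Slides 11 to 13 can be sketched as follows. This is a simplified reading of the slides, not the SASE implementation: it assumes a pattern of at least two distinct event types, takes the first character of an event as its type, and stores with each pushed event a link to the most recent entry of the previous AIS. The function name `sase_detect` is hypothetical.

```python
def sase_detect(events, pattern="ABD"):
    """Sketch of SASE-style detection with one Active Instance Stack per pattern step."""
    stacks = [[] for _ in pattern]  # stacks[i] holds (event, link) entries
    results = []

    def expand(level, upto):
        # All partial sequences that end in stacks[level][0..upto].
        if level == 0:
            return [[stacks[0][i][0]] for i in range(upto + 1)]
        out = []
        for i in range(upto + 1):
            event, link = stacks[level][i]
            out.extend(prefix + [event] for prefix in expand(level - 1, link))
        return out

    for event in events:
        for level, kind in enumerate(pattern):
            if event[0].upper() != kind:
                continue
            # Link to the most recent entry of the previous AIS (-1 for the first AIS).
            link = len(stacks[level - 1]) - 1 if level > 0 else -1
            if level > 0 and link < 0:
                continue  # no earlier event to extend
            if level == len(pattern) - 1:
                # Accepting state: build occurrences right away by following links.
                results.extend(prefix + [event] for prefix in expand(level - 1, link))
            else:
                stacks[level].append((event, link))
    return results


occurrences = sase_detect(["a1", "c2", "b3", "a4", "d5"])
```

For the slide 12 sequence this produces the single occurrence a1 b3 d5. Note that `expand` is re-run for every D event, so shared prefixes such as a1 b3 are rebuilt each time.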
  • 14. Problem of SASE we found: duplicate generation (e.g. b3 → a1). [Diagram: after b6, d7, a8, d9 arrive, result generation for d7 yields a1 b3 d7, a1 b6 d7, a4 b6 d7, and result generation for d9 yields a1 b3 d9, a1 b6 d9, a4 b6 d9.] IDEA: if we can evaluate d7 and d9 in a lump, the cost for constructing a1-b3 should be reduced (2 to 1).
  • 15. Outline: Background; Related work (SASE); Proposal: bulk evaluation (extension of SASE); Evaluation; Conclusions and future work
  • 16. Concept: Bulk Evaluation. Event sequence (time t): a1 c2 b3 a4 d5 b6 d7 a8 d9 b10 d11 d12 b13. [SASE] Generate Result is shown 5 times. [Proposal] Generate Result is shown 2 times.
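  • 17. Behavior of Proposal (1/3): create a link when an event is pushed to an AIS. Unlike SASE, D events are kept. Event sequence (time t): a1 c2 b3 a4 d5 b6 d7 a8 d9. [Diagram: the AIS of state 1 holds a1, a4, a8; the AIS of state 2 holds b3, b6; the AIS of state 3 holds d5, d7, d9.]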
  • 18. Behavior of Proposal (2/3): create clusters on the final AIS. [Diagram: the D events d5, d7, d9 in the final AIS are grouped into clusters.]
  • 19. Behavior of Proposal (3/3): create pattern occurrences in bulk. The result for d9 is made from the result for d7. [Diagram: occurrences a1 b3 d5; a1 b3 d7, a1 b6 d7, a4 b6 d7; a1 b3 d9, a1 b6 d9, a4 b6 d9.]
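The bulk evaluation idea of slides 17 to 19 might be sketched as below. This is one possible reading of the slides, not the authors' implementation. Assumptions: D events in the final AIS are clustered when they share the same link into the previous AIS, the caller decides when to call `flush()` (the slides do not state the trigger), the pattern has at least two distinct event types, and the names `BulkDetector` and `prefix_builds` are hypothetical.

```python
from collections import defaultdict


class BulkDetector:
    """Sketch of bulk evaluation: keep final-state events and emit results on flush()."""

    def __init__(self, pattern="ABD"):
        self.pattern = pattern
        self.stacks = [[] for _ in pattern]  # one AIS per pattern step
        self.reported = 0                    # final-AIS entries already flushed
        self.prefix_builds = 0               # how often a prefix set was constructed

    def push(self, event):
        for level, kind in enumerate(self.pattern):
            if event[0].upper() != kind:
                continue
            # Link to the most recent entry of the previous AIS (-1 for the first AIS).
            link = len(self.stacks[level - 1]) - 1 if level > 0 else -1
            if level > 0 and link < 0:
                continue  # nothing earlier to extend
            self.stacks[level].append((event, link))  # final-state events are kept too

    def _prefixes(self, level, upto):
        if level == 0:
            return [[self.stacks[0][i][0]] for i in range(upto + 1)]
        out = []
        for i in range(upto + 1):
            event, link = self.stacks[level][i]
            out.extend(p + [event] for p in self._prefixes(level - 1, link))
        return out

    def flush(self):
        final = self.stacks[-1]
        clusters = defaultdict(list)
        for event, link in final[self.reported:]:
            clusters[link].append(event)  # same link means the same set of prefixes
        self.reported = len(final)
        results = []
        for link, cluster in clusters.items():
            # Prefixes are built once per cluster and reused for every event in it.
            prefixes = self._prefixes(len(self.pattern) - 2, link)
            self.prefix_builds += 1
            results.extend(p + [d] for d in cluster for p in prefixes)
        return results


detector = BulkDetector()
for ev in ["a1", "c2", "b3", "a4", "d5", "b6", "d7", "a8", "d9"]:
    detector.push(ev)
bulk_results = detector.flush()
```

With the slide 17 sequence, d7 and d9 share a link, so the prefixes a1 b3, a1 b6, a4 b6 are built once for both, giving two prefix constructions instead of the three that per-event evaluation would need.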
  • 21. Environment for Experiment. OS: Windows XP; RAM: 3 GB; CPU: Intel Core 2 Duo E8400 (3 GHz); Language: Java (JRE 1.7.0_4).
  • 24. Conclusions and Future Work. Conclusions: SASE had room for further improvement in throughput; the bulk evaluation scheme improved throughput by a factor of up to 5.24. Future work: implementing the proposal in Falcon.
  • 25. CryptDB Privacy Preservation Encryption FPGA Privacy ML&DM Jubatus Online LDA CPD SQL Norikra System S Borealis Puma MADLib @UCB Spring (DTW) Data Mining & Machine Learning Esper Kafka SASE STORM Cayuga Window join Bismarck Online @Stanford Intel MIC NoSQL MLBase Oracle-R Incr. LOCI Tilera Accelerator Falcon GPGPU Window aggregate Relational stream Continual query & Window Complex event processing Tuple stream S4
  • 26. Falcon: UDP-RX, window join (64 cores), Performance Monitor. Basic: 6.7 million tuples per second. Proposal: 14.6 million tuples per second.