Optimization of Continuous Queries in Federated
Database and Stream Processing Systems
Yuanzhen Ji1, Zbigniew Jerzak1, Anisoara Nica1, Gregor Hackenbroich1,
Christof Fetzer2
1SAP SE 2TU Dresden
1firstname.lastname@sap.com 2christof.fetzer@tu-dresden.de
March 16, 2015 BTW 2015
Agenda
• Introduction
• Federated Continuous Query Execution
• Query Optimization Problem
• Our Optimization Solution
• Evaluation
• Conclusions
Introduction
• Problem: optimizing continuous queries (CQ) for federated execution over
a native stream processing engine (SPE) and a column-oriented in-memory
database (CIMDB).
– operators: select, join, project, aggregate
• Goal: maximize query throughput (amount of data processed per unit time)
[Figure: SPE and CIMDB; labels: data streams, query results, data flow]
Introduction
• Motivation:
– “No one size fits all” (Cyclops [LHB13], [Ji13])
– obtain the best of both worlds (SPE, CIMDB)
• Application Scenario:
– analyzing energy consumption data collected from smart plugs
installed in households (DEBS 2014 Grand Challenge)
• Main contributions:
– a static cost-based optimizer for federated systems
• extends established optimization techniques
• considers the feasibility property of CQ
– shows the potential of federated CQ execution over an SPE and a CIMDB
• throughput up to 8.5x that of pure SPE-based processing
• throughput up to 1.8x that of pure CIMDB-based processing
Federated Continuous Query Execution
• send relevant input data from SPE to CIMDB
• trigger re-evaluation of query pieces moved to CIMDB
• take results of query pieces executed in CIMDB back to SPE
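The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in sqlite3 in-memory database stands in for the CIMDB, and the function mig_roundtrip, the table readings, and its columns are hypothetical names, not taken from the talk.

```python
import sqlite3


def mig_roundtrip(conn, batch, sql):
    """Hypothetical MIG step: ship a batch, re-run the query piece, return rows."""
    # 1. Send relevant input data from the stream side to the database.
    conn.executemany("INSERT INTO readings(plug_id, load_w) VALUES (?, ?)", batch)
    # 2. Trigger re-evaluation of the query piece placed in the database.
    rows = conn.execute(sql).fetchall()
    # 3. Hand the results back to the stream side.
    return rows


conn = sqlite3.connect(":memory:")  # stand-in for the CIMDB (assumption)
conn.execute("CREATE TABLE readings(plug_id INTEGER, load_w REAL)")

# Query piece that was "moved to the database" in this toy setup.
query_piece = (
    "SELECT plug_id, AVG(load_w) FROM readings "
    "GROUP BY plug_id ORDER BY plug_id"
)

first = mig_roundtrip(conn, [(1, 10.0), (2, 30.0)], query_piece)
second = mig_roundtrip(conn, [(1, 20.0)], query_piece)
print(first)
print(second)
```

Each call re-evaluates the SQL over everything stored so far, so the second result reflects both batches.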
[Figure: SPE and CIMDB connected through MIG operators; the SPE issues an SQL query to the CIMDB; labels: data streams, query results, data flow]
Query Optimization Problem
• Problem: determine the optimal execution
plan for a given CQ
– currently at deployment time
• Feasibility of continuous queries [AN04]:
– feasible execution plan: can keep up
with data arrival rate
– feasible query: has at least one feasible plan
• Feasibility-dependent optimization objective:
– feasible queries: find the feasible plan with least resource consumption
– infeasible queries: find the plan with maximal throughput
• State of the art: existing approaches consider either the feasibility of CQ
but not the federation context, or the federation context but not the
feasibility of CQ.
Optimization Solution
Cost Model – Operator Cost (1)
• Operator cost: CPU cost caused by tuples arrived from data sources within
unit-time
For an operator O with k direct upstream operators:
– l_i: # tuples produced by the i-th upstream operator as a result of
unit-time source arrivals
– c_i: time to process a single tuple from the i-th upstream operator
$u(O) = \sum_{i=1}^{k} l_i c_i$
u(O) > 1 → bottleneck → infeasible plan
Example (k = 2): l_1 = 300, l_2 = 200, c_1 = 0.001, c_2 = 0.002
$u(O) = l_1 c_1 + l_2 c_2 = 300 \cdot 0.001 + 200 \cdot 0.002 = 0.7$
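The operator cost can be checked with a short sketch. The function name operator_utilization and the (l_i, c_i) pair representation are illustrative choices, not part of the talk; the numbers are the ones from the slide example.

```python
def operator_utilization(inputs):
    """Return u(O) = sum of l_i * c_i over (l_i, c_i) pairs, one per upstream operator."""
    return sum(l * c for l, c in inputs)


# Slide example: l_1 = 300, c_1 = 0.001, l_2 = 200, c_2 = 0.002.
u = operator_utilization([(300, 0.001), (200, 0.002)])

# An operator with u(O) > 1 cannot keep up with the arrival rate.
is_bottleneck = u > 1
print(round(u, 3), is_bottleneck)
```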
Optimization Solution
Cost Model – Operator Cost (2)
• A query piece executed in CIMDB and its corresponding MIG operator:
– treated as a composite operator and costed as a whole
– cost includes data transfer (in & out) cost and query execution cost
[Figure: SPE and CIMDB connected through a MIG operator that carries an SQL query; labels: data streams, query results, data flow]
Optimization Solution
Cost Model – Execution Plan Cost
• Execution plan cost: $C(P) = \langle C_b(P), C_u(P) \rangle$
– Two components:
bottleneck cost: $C_b(P) = \max\{u(O_j) : j \in [1, m]\}$
total utilization cost: $C_u(P) = \sum_{j=1}^{m} u(O_j)$
(m: # operators in P)
– P is infeasible if $C_b(P) > 1$
[Figure: example plan with operators O1 to O4: u(O1) = 0.5, u(O2) = 0.3, u(O3) = 1.1, u(O4) = 0.7, giving C_b(P) = 1.1 and C_u(P) = 2.6]
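A minimal sketch of the plan cost, using the four utilizations from the slide example. The helper name plan_cost is an assumption made for illustration.

```python
def plan_cost(utilizations):
    """Return (C_b, C_u): bottleneck cost and total utilization cost of a plan."""
    c_b = max(utilizations)  # the most loaded operator
    c_u = sum(utilizations)  # total load over all operators
    return c_b, c_u


# u(O1)..u(O4) from the slide example.
c_b, c_u = plan_cost([0.5, 0.3, 1.1, 0.7])
feasible = c_b <= 1
print(c_b, round(c_u, 3), feasible)
```

Here O3 is the bottleneck, so the example plan is infeasible even though the other three operators have spare capacity.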
Optimization Solution
Optimal Execution Plan
• An execution plan P of a CQ is an optimal plan iff, for any other plan P’ of
the CQ, one of the following conditions is satisfied:
– Condition 1: P is feasible but P’ is infeasible
(Cb(P) ≤ 1 < Cb(P’))
– Condition 2: Both P and P’ are feasible, and P has no higher total utilization cost
(Cb(P) ≤ 1, Cb(P’) ≤ 1, and Cu(P) ≤ Cu(P’))
– Condition 3: Both P and P’ are infeasible, and P has no higher bottleneck cost
(1 < Cb(P) ≤ Cb(P’))
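One way to read the three conditions is as a sort key over plan costs: feasible plans rank before infeasible ones, feasible plans are compared by Cu, and infeasible plans by Cb. The sketch below encodes that reading; plan_key, best_plan, and the sample costs are hypothetical and not from the talk.

```python
def plan_key(cost):
    """Sort key for a (C_b, C_u) pair; smaller keys are better plans."""
    c_b, c_u = cost
    if c_b <= 1:
        return (0, c_u)  # feasible: prefer least total utilization
    return (1, c_b)      # infeasible: prefer the smallest bottleneck


def best_plan(plans):
    """plans: dict mapping plan name -> (C_b, C_u); returns the best plan's name."""
    return min(plans, key=lambda name: plan_key(plans[name]))


mixed = {"A": (0.9, 2.0), "B": (0.8, 2.4), "C": (1.1, 1.5)}
all_infeasible = {"D": (1.3, 2.0), "E": (1.2, 3.0)}

print(best_plan(mixed))           # A: feasible with the lowest C_u
print(best_plan(all_infeasible))  # E: lowest C_b among infeasible plans
```

Plan C has the lowest Cu in the first set but is infeasible, so it loses to both feasible plans.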
Optimization Solution
Two-Phase Optimization
• Large search space (# possible plans):
– many semantically equivalent logical plans
– a logical plan with n operators -> 2^n possible placement decisions
• Two-phase optimization:
– Phase One: determine the optimal logical plan (consider join ordering,
etc.)
– Phase Two: determine the placement of each operator in the logical plan
produced in Phase One.
• Bottom-up plan construction following dynamic programming (DP) model
• Proved applicability of DP for feasibility-dependent optimization objective
in paper.
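To make the size of the placement search space concrete, the sketch below enumerates all SPE/DB assignments for a small plan and computes 2^n for a few plan sizes. The values 6, 15, and 24 are the numbers shown in parentheses on the optimization-time slide, read here as operator counts (an assumption); the function name placements is illustrative.

```python
from itertools import product


def placements(n_operators, sites=("SPE", "DB")):
    """Yield every assignment of n operators to the two engines."""
    return product(sites, repeat=n_operators)


# Exhaustive enumeration is fine for a tiny plan...
count_small = sum(1 for _ in placements(3))

# ...but the count doubles with every additional operator.
sizes = {n: 2 ** n for n in (6, 15, 24)}

print(count_small)
print(sizes)
```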
Optimization Solution
Pruning in Phase Two
• For each operator O in a logical plan, the optimal sub-plan up to O, where
O is placed in the SPE, can be built from the optimal sub-plans up to the direct
upstream operators of O.
• For a large logical plan: divide into smaller pieces, optimize and compose
in post order.
[Figure: two candidate sub-plans with O2 in the SPE: I1 (O1 in SPE) and I2 (O1 in DB); C(I1) < C(I2)]
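The pruning idea can be sketched for the simplest case, a linear chain of operators. Everything here is a simplification for illustration and not the optimizer from the talk: each operator has a hypothetical utilization per engine, crossing between engines is charged a flat hypothetical transfer utilization, and only the best sub-plan per (operator, engine) pair is kept. A brute-force search over all placements is included as a cross-check.

```python
from itertools import product

SITES = ("SPE", "DB")


def plan_key(cost):
    """Feasible plans first (by C_u), then infeasible plans (by C_b)."""
    c_b, c_u = cost
    return (0, c_u) if c_b <= 1 else (1, c_b)


def extend(cost, u):
    """Add one operator with utilization u to a sub-plan cost (C_b, C_u)."""
    c_b, c_u = cost
    return (max(c_b, u), c_u + u)


def optimize_chain(ops, transfer):
    """Bottom-up DP: keep one best sub-plan per site of the current operator."""
    best = {s: ((ops[0][s], ops[0][s]), (s,)) for s in SITES}
    for op in ops[1:]:
        new_best = {}
        for s in SITES:
            candidates = []
            for prev, (cost, placement) in best.items():
                # Charge the transfer only when the engine changes.
                c = cost if prev == s else extend(cost, transfer)
                candidates.append((extend(c, op[s]), placement + (s,)))
            # Pruning: all but the best candidate for this site are dropped.
            new_best[s] = min(candidates, key=lambda x: plan_key(x[0]))
        best = new_best
    return min(best.values(), key=lambda x: plan_key(x[0]))


def brute_force(ops, transfer):
    """Enumerate all 2^n placements; used only to cross-check the DP."""
    results = []
    for placement in product(SITES, repeat=len(ops)):
        cost = (0.0, 0.0)
        for i, s in enumerate(placement):
            if i > 0 and placement[i - 1] != s:
                cost = extend(cost, transfer)
            cost = extend(cost, ops[i][s])
        results.append((cost, placement))
    return min(results, key=lambda x: plan_key(x[0]))


# Hypothetical per-engine utilizations for a three-operator chain.
ops = [
    {"SPE": 0.2, "DB": 0.5},
    {"SPE": 1.2, "DB": 0.3},
    {"SPE": 0.1, "DB": 0.4},
]
dp_cost, dp_placement = optimize_chain(ops, transfer=0.15)
bf_cost, bf_placement = brute_force(ops, transfer=0.15)
print(dp_placement, dp_cost)
```

In this toy input the second operator overloads the SPE, so the cheapest feasible plan moves only that operator to the database and pays the transfer twice.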
Evaluation
Setup
• Setup: HP Z620 workstation with 24 cores (1.2 GHz per core) and 96 GB
RAM, running SUSE Linux.
• Data: real-world energy consumption data from smart plugs installed in
households (DEBS 2014 Grand Challenge).
• Tested queries:
[Figure: bar chart, max. throughput (thousand/s); bars: SELECT in SPE, All in SPE, All in DB; values as extracted: 26.1, 3.1, 18.7. Line plot: actual throughput (thousand/s) vs. requested throughput (thousand/s)]
Evaluation
Optimizer effectiveness (1)
• Examine 10 source stream data rates picked from
range [1,000, 40,000] (tuples/s)
• measure throughput of devised optimal query
[Figure panels: Max. throughput comparison; Actual vs. requested throughput; legend: SELECT in SPE]
[Query diagram: PROJECT, INNER JOIN, AGGR (avg), AGGR (cnt), SELECT (x2), WINDOW (5 min, x2)]
Evaluation
Optimizer effectiveness (2)
[Figure: bar chart, max. throughput (thousand/s); bars: SELECT in SPE (P1); SEL, JOIN, P in SPE (P2); All in SPE; All in DB; values as extracted: 18.1, 28.6, 6.0, 18.0. Line plot: actual throughput (thousand/s) vs. requested throughput (thousand/s), series P1 and P2. Panel titles: Max. throughput comparison; Actual vs. requested throughput]
• Examine data rates ranging from 1000 to 40,000
tuples/s, at 1000 tuples/s increment
• measure throughput of devised optimal query
[Query diagram: PROJECT, INNER JOIN, AGGR (avg, max) (x2), SELECT (x2), WINDOW (5 min), WINDOW (1 min); plans: SELECT in SPE (P1); SEL, JOIN, P in SPE (P2)]
Evaluation
Influence of Feasibility Check
[Figure: line plot, actual throughput (thousand/s) vs. requested throughput (thousand/s); series: SELECT in SPE (with feasibility check); SEL, JOIN, P in SPE (with feasibility check); SEL in SPE (without feasibility check)]
[Query diagram: PROJECT, INNER JOIN, AGGR (avg, max) (x2), SELECT (x2), WINDOW (5 min), WINDOW (1 min)]
Evaluation
Optimization Time
• Tested with join queries (2-way, 5-way, 8-way).
[Figure: # enumerated plans in Phase Two (log scale) for 2-way (6), 5-way (15), and 8-way (24) join queries; series: with pruning, without pruning; values as extracted: 11, 312, 8411, 64, 327168, 16+ million]
[Figure: optimization time in milliseconds (log scale) for 2-way (6), 5-way (15), and 8-way (24) join queries; series: Phase One, Phase Two; values as extracted: 0.9, 68.6, 100.5, 12.3, 908.6, 61335.3]
[Query diagram: PROJECT, INNER JOIN, AGGR (avg, max) (x2), SELECT (x2), WINDOW (5 min), WINDOW (1 min)]
Conclusion
• Exploits the potential of federated execution of CQ over an SPE and a CIMDB.
• Presents a static optimizer which extends traditional optimization
techniques to consider the feasibility of CQ.
• Evaluation shows promising results.
For the examined queries, the throughput of the devised federated plan is
– up to 8.5 times that of a pure SPE-based plan
– up to 1.8 times that of a pure CIMDB-based plan
References
[AN04] Ayad, A. M. & Naughton, J. F. Static Optimization of Conjunctive Queries with Sliding Windows over
Infinite Streams. SIGMOD, 2004
[FKC+09] Franklin, M. J.; Krishnamurthy, S.; Conway, N.; Li, A.; Russakovsky, A. & Thombre, N. Continuous
Analytics: Rethinking query processing in a network-effect world. CIDR, 2009
[KS09] Kraemer, J. & Seeger, B. Semantics and implementation of continuous sliding window queries over data
streams. ACM TODS, 2009
[BCD+10] Botan, I.; Cho, Y.; Derakhshan, R.; Dindar, N.; Gupta, A.; Haas, L. M.; Kim, K.; Lee, C.; Mundada, G.;
Shan, M.-C.; Tatbul, N.; Yan, Y.; Yun, B. & Zhang, J. A demonstration of the MaxStream federated stream
processing system. ICDE, 2010
[LMB+10] Liu, M.; Mihaylov, S. R.; Bao, Z.; Jacob, M.; Ives, Z. G.; Loo, B. T. & Guha, S. SmartCIS: integrating
digital and physical environments. SIGMOD Record, 2010
[LIM+12] Liarou, E.; Idreos, S.; Manegold, S. & Kersten, M. MonetDB/DataCell: online analytics in a streaming
column-store. PVLDB, 2012
[LHB13] Lim, H.; Han, Y. & Babu, S. How to Fit when No One Size Fits. CIDR, 2013
[Ji13] Ji, Y. Database support for processing complex aggregate queries over data streams. EDBT Workshops,
2013
[CDK+14] Çetintemel, U.; Du, J.; Kraska, T.; Madden, S.; Maier, D.; Meehan, J.; Pavlo, A.; Stonebraker, M.;
Sutherland, E.; Tatbul, N.; Tufte, K.; Wang, H. & Zdonik, S. B. S-Store: A streaming NewSQL system for big
velocity applications. PVLDB, 2014
[DLB+11] Daum, M.; Lauterwald, F.; Baumgärtel, P.; Pollner, N. & Meyer-Wegener, K. Efficient and Cost-aware
Operator Placement in Heterogeneous Stream-processing Environments. DEBS, 2011
Thank you!
Query Optimization Problem
State-of-the-Art
Approach | CQ optimization | Federation context | Optimization granularity | Feasibility-dependent opt.
[VN02, AN04] | √ | | operator | √
Traditional distributed, federated DBMS, e.g., [DH02, BCE+05] | | √ | operator |
MaxStream [BCD+10] | | √ | |
Cyclops [LHB13] | √ | √ | query |
ASPEN [LMB+10] | √ | √ | operator |
Operator placement, e.g., [DLB+11] | √ | √/X | operator, query |
Semantics
• Adopt the abstract semantics defined in [ABW06], which is based on:
– Two data types:
• Stream (S): a possibly infinite bag of elements <s, t>, where s is a
tuple belonging to the schema of S and t is the timestamp of s.
• Time-varying Relation (R): a mapping from T to a finite but
unbounded bag of tuples belonging to the schema of R.
– Three classes of query operators:
• stream-to-relation (S2R) operators: produce one relation from one
stream (e.g., window operators)
• relation-to-relation (R2R) operators: produce one relation from
one or more relations.
• relation-to-stream (R2S) operators: produce one stream from one
relation.
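The three operator classes can be illustrated with a toy pipeline. This is a sketch under stated assumptions: streams are modeled as Python lists of (value, timestamp) pairs, relations as plain lists, the window is a time-based sliding window, and the relation-to-stream step emits the whole current relation (in the style of the Rstream operator of [ABW06]). All function names are hypothetical.

```python
def time_window(stream, now, width):
    """S2R: relation at time `now` = tuples with timestamp in (now - width, now]."""
    return [s for s, t in stream if now - width < t <= now]


def average(relation):
    """R2R: one-tuple relation holding the average (empty in, empty out)."""
    return [sum(relation) / len(relation)] if relation else []


def rstream(relation, now):
    """R2S: emit every tuple of the current relation, stamped with `now`."""
    return [(s, now) for s in relation]


# Stream elements <s, t>: a value s with its timestamp t.
stream = [(10.0, 1), (20.0, 2), (60.0, 4)]

# At time 4 with a window of width 3, only timestamps 2 and 4 qualify.
out = rstream(average(time_window(stream, now=4, width=3)), now=4)
print(out)
```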
Introduction
From DBMS to SPE
• Increasing interest in processing high-velocity data streams generated in
real time using continuous queries (CQ).
→ Need a new processing paradigm
[Figure: DBMS (one-shot queries over stored data, producing query results) next to SPE (continuous query over streaming data, producing query results)]
Introduction
From DBMS to SPE
• However, many applications require:
– persisting input streaming data/query results for on-demand analysis
– combining streaming data with static data during processing.
[Figure: DBMS (one-shot queries, stored data, query results) next to SPE (continuous query, streaming data, query results); the SPE stores data in the DBMS and accesses stored data]
Introduction
Build SPE on Top of DBMS Kernel
• Exploit and merge technologies from both worlds in an integrated way.
– Truviso Continuous Analytics [FKC+09], HP Labs work [CH10], DataCell
[LIM+12], S-Store [CDK+14]
[Figure: combined SPE + DBMS serving one-shot queries over stored data and continuous queries over streaming data; labels: in-memory table, buffers in UDFs]

Weitere ähnliche Inhalte

Was ist angesagt?

Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesNECST Lab @ Politecnico di Milano
 
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTING
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTINGLOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTING
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTINGijccsa
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesNECST Lab @ Politecnico di Milano
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Rafael Ferreira da Silva
 
STUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHM
STUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHMSTUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHM
STUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHMAvay Minni
 
Configuration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwareConfiguration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwarePooyan Jamshidi
 
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersHDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersXiao Qin
 
A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...Rafael Ferreira da Silva
 
Performance Comparision of Dynamic Load Balancing Algorithm in Cloud Computing
Performance Comparision of Dynamic Load Balancing Algorithm in Cloud ComputingPerformance Comparision of Dynamic Load Balancing Algorithm in Cloud Computing
Performance Comparision of Dynamic Load Balancing Algorithm in Cloud ComputingEswar Publications
 
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...Xiao Qin
 
A Comparative Study between Honeybee Foraging Behaviour Algorithm and Round ...
A Comparative Study between Honeybee Foraging Behaviour Algorithm and  Round ...A Comparative Study between Honeybee Foraging Behaviour Algorithm and  Round ...
A Comparative Study between Honeybee Foraging Behaviour Algorithm and Round ...sondhicse
 
capacityshifting1
capacityshifting1capacityshifting1
capacityshifting1Gokul Vasan
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesNECST Lab @ Politecnico di Milano
 
Detecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph KernelDetecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph KernelData Works MD
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsRafael Ferreira da Silva
 
An Efficient Decentralized Load Balancing Algorithm in Cloud Computing
An Efficient Decentralized Load Balancing Algorithm in Cloud ComputingAn Efficient Decentralized Load Balancing Algorithm in Cloud Computing
An Efficient Decentralized Load Balancing Algorithm in Cloud ComputingAisha Kalsoom
 
Load Balancing In Cloud Computing newppt
Load Balancing In Cloud Computing newpptLoad Balancing In Cloud Computing newppt
Load Balancing In Cloud Computing newpptUtshab Saha
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Lu Wei
 
Hadoop fault tolerance
Hadoop  fault toleranceHadoop  fault tolerance
Hadoop fault tolerancePallav Jha
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profilepramodbiligiri
 

Was ist angesagt? (20)

Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policies
 
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTING
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTINGLOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTING
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTING
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policies
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
 
STUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHM
STUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHMSTUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHM
STUDY ON PROJECT MANAGEMENT THROUGH GENETIC ALGORITHM
 
Configuration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwareConfiguration Optimization for Big Data Software
Configuration Optimization for Big Data Software
 
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersHDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
 
A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...
 
Performance Comparision of Dynamic Load Balancing Algorithm in Cloud Computing
Performance Comparision of Dynamic Load Balancing Algorithm in Cloud ComputingPerformance Comparision of Dynamic Load Balancing Algorithm in Cloud Computing
Performance Comparision of Dynamic Load Balancing Algorithm in Cloud Computing
 
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
 
A Comparative Study between Honeybee Foraging Behaviour Algorithm and Round ...
A Comparative Study between Honeybee Foraging Behaviour Algorithm and  Round ...A Comparative Study between Honeybee Foraging Behaviour Algorithm and  Round ...
A Comparative Study between Honeybee Foraging Behaviour Algorithm and Round ...
 
capacityshifting1
capacityshifting1capacityshifting1
capacityshifting1
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policies
 
Detecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph KernelDetecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph Kernel
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
 
An Efficient Decentralized Load Balancing Algorithm in Cloud Computing
An Efficient Decentralized Load Balancing Algorithm in Cloud ComputingAn Efficient Decentralized Load Balancing Algorithm in Cloud Computing
An Efficient Decentralized Load Balancing Algorithm in Cloud Computing
 
Load Balancing In Cloud Computing newppt
Load Balancing In Cloud Computing newpptLoad Balancing In Cloud Computing newppt
Load Balancing In Cloud Computing newppt
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]
 
Hadoop fault tolerance
Hadoop  fault toleranceHadoop  fault tolerance
Hadoop fault tolerance
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profile
 

Andere mochten auch

Shn Overview Updated 2009 06 P21 23
Shn Overview   Updated 2009 06 P21 23Shn Overview   Updated 2009 06 P21 23
Shn Overview Updated 2009 06 P21 23joaovox
 
Visualization-Driven Data Aggregation
Visualization-Driven Data AggregationVisualization-Driven Data Aggregation
Visualization-Driven Data AggregationZbigniew Jerzak
 
MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...
MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...
MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...PlanetData Network of Excellence
 
Does Current Advertising Cause Future Sales?
Does Current Advertising Cause Future Sales?Does Current Advertising Cause Future Sales?
Does Current Advertising Cause Future Sales?Trieu Nguyen
 
Fast Data processing with RFX
Fast Data processing with RFXFast Data processing with RFX
Fast Data processing with RFXTrieu Nguyen
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsDavide Mauri
 
IT Policy - Need of the Hour
IT Policy - Need of the HourIT Policy - Need of the Hour
IT Policy - Need of the HourVijay Dalmia
 
IDP Asia Brochure
IDP Asia BrochureIDP Asia Brochure
IDP Asia Brochureguest0a024
 
OpenEd 2009 OER Organization Stakeholders
OpenEd 2009 OER Organization StakeholdersOpenEd 2009 OER Organization Stakeholders
OpenEd 2009 OER Organization Stakeholderscurtmadison
 
Prezentation \" OS Windiws\"
Prezentation \" OS Windiws\"Prezentation \" OS Windiws\"
Prezentation \" OS Windiws\"KristG
 
Git WorkFlow & Best Practice
Git WorkFlow & Best PracticeGit WorkFlow & Best Practice
Git WorkFlow & Best PracticeHiraq Citra M
 
Cambodian Dinner Night 15/11/08
Cambodian Dinner Night 15/11/08Cambodian Dinner Night 15/11/08
Cambodian Dinner Night 15/11/08camkh12
 
Shn, permaculture pilot, 2008 april, 21 30
Shn, permaculture pilot, 2008 april, 21 30Shn, permaculture pilot, 2008 april, 21 30
Shn, permaculture pilot, 2008 april, 21 30joaovox
 
Amazing number3
Amazing number3Amazing number3
Amazing number3ShdwClaw
 
Tourism Oxford: our rural roots are showing
Tourism Oxford: our rural roots are showingTourism Oxford: our rural roots are showing
Tourism Oxford: our rural roots are showingEmily Robson
 
Shn Overview Updated 2009 06 P11 20
Shn Overview   Updated 2009 06 P11 20Shn Overview   Updated 2009 06 P11 20
Shn Overview Updated 2009 06 P11 20joaovox
 
PowerPoint Training - The power of visuals
PowerPoint Training - The power of visualsPowerPoint Training - The power of visuals
PowerPoint Training - The power of visualsLinda Mkhize-Manashe
 
Marketing research of the future
Marketing research of the futureMarketing research of the future
Marketing research of the futureKristof De Wulf
 
More amazing photoshop tut
More amazing photoshop tutMore amazing photoshop tut
More amazing photoshop tutShdwClaw
 
WordPress Security @ Vienna WordPress + Drupal Meetup
WordPress Security @ Vienna WordPress + Drupal MeetupWordPress Security @ Vienna WordPress + Drupal Meetup
WordPress Security @ Vienna WordPress + Drupal MeetupVeselin Nikolov
 

Andere mochten auch (20)

Shn Overview Updated 2009 06 P21 23
Shn Overview   Updated 2009 06 P21 23Shn Overview   Updated 2009 06 P21 23
Shn Overview Updated 2009 06 P21 23
 
Visualization-Driven Data Aggregation
Visualization-Driven Data AggregationVisualization-Driven Data Aggregation
Visualization-Driven Data Aggregation
 
MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...
MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...
MonetDB/DataCell - Exploiting the Power of Relational Databases for Efficient...
 
Does Current Advertising Cause Future Sales?
Does Current Advertising Cause Future Sales?Does Current Advertising Cause Future Sales?
Does Current Advertising Cause Future Sales?
 
Fast Data processing with RFX
Fast Data processing with RFXFast Data processing with RFX
Fast Data processing with RFX
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream Analytics
 
IT Policy - Need of the Hour
IT Policy - Need of the HourIT Policy - Need of the Hour
IT Policy - Need of the Hour
 
IDP Asia Brochure
IDP Asia BrochureIDP Asia Brochure
IDP Asia Brochure
 
OpenEd 2009 OER Organization Stakeholders
OpenEd 2009 OER Organization StakeholdersOpenEd 2009 OER Organization Stakeholders
OpenEd 2009 OER Organization Stakeholders
 
Prezentation \" OS Windiws\"
Prezentation \" OS Windiws\"Prezentation \" OS Windiws\"
Prezentation \" OS Windiws\"
 
Git WorkFlow & Best Practice
Git WorkFlow & Best PracticeGit WorkFlow & Best Practice
Git WorkFlow & Best Practice
 
Cambodian Dinner Night 15/11/08
Cambodian Dinner Night 15/11/08Cambodian Dinner Night 15/11/08
Cambodian Dinner Night 15/11/08
 
Shn, permaculture pilot, 2008 april, 21 30
Shn, permaculture pilot, 2008 april, 21 30Shn, permaculture pilot, 2008 april, 21 30
Shn, permaculture pilot, 2008 april, 21 30
 
Amazing number3
Amazing number3Amazing number3
Amazing number3
 
Tourism Oxford: our rural roots are showing
Tourism Oxford: our rural roots are showingTourism Oxford: our rural roots are showing
Tourism Oxford: our rural roots are showing
 
Shn Overview Updated 2009 06 P11 20
Shn Overview   Updated 2009 06 P11 20Shn Overview   Updated 2009 06 P11 20
Shn Overview Updated 2009 06 P11 20
 
PowerPoint Training - The power of visuals
PowerPoint Training - The power of visualsPowerPoint Training - The power of visuals
PowerPoint Training - The power of visuals
 
Marketing research of the future
Marketing research of the futureMarketing research of the future
Marketing research of the future
 
More amazing photoshop tut
More amazing photoshop tutMore amazing photoshop tut
More amazing photoshop tut
 
WordPress Security @ Vienna WordPress + Drupal Meetup
WordPress Security @ Vienna WordPress + Drupal MeetupWordPress Security @ Vienna WordPress + Drupal Meetup
WordPress Security @ Vienna WordPress + Drupal Meetup
 

Ähnlich wie Optimization of Continuous Queries in Federated Database and Stream Processing Systems

Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Databricks
 
Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Databricks
 
Super COMPUTING Journal
Super COMPUTING JournalSuper COMPUTING Journal
Super COMPUTING JournalPandey_G
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesIan Foster
 
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfDuy-Hieu Bui
 
The Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingThe Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingUniversity of Washington
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Ian Foster
 
ReComp, the complete story: an invited talk at Cardiff University
ReComp, the complete story:  an invited talk at Cardiff UniversityReComp, the complete story:  an invited talk at Cardiff University
ReComp, the complete story: an invited talk at Cardiff UniversityPaolo Missier
 
Woop - Workflow Optimizer
Woop - Workflow OptimizerWoop - Workflow Optimizer
Woop - Workflow OptimizerMartin Homik
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2Mohit Garg
 
Parallel analytics as a service
Parallel analytics as a serviceParallel analytics as a service
Parallel analytics as a servicePetrie Wong
 
Efficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceEfficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceSpiros Economakis
 
Efficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceEfficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceSpiros Oikonomakis
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsNECST Lab @ Politecnico di Milano
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAlbert Bifet
 

Optimization of Continuous Queries in Federated Database and Stream Processing Systems

  • 1. Optimization of Continuous Queries in Federated Database and Stream Processing Systems Yuanzhen Ji1, Zbigniew Jerzak1, Anisoara Nica1, Gregor Hackenbroich1, Christof Fetzer2 1SAP SE 2TU Dresden 1firstname.lastname@sap.com 2christof.fetzer@tu-dresden.de March 16, 2015 BTW 2015
  • 2. Agenda • Introduction • Federated Continuous Query Execution • Query Optimization Problem • Our Optimization Solution • Evaluation • Conclusions 2
  • 3. Introduction • Problem: optimizing continuous queries (CQ) for federated execution over a native stream processing engine (SPE) and a column-oriented in-memory database (CIMDB). – operators: select, join, project, aggregate • Goal: maximize query throughput (amount of data processed per unit time) [Figure: SPE and CIMDB, with data streams as input, query results as output, and arrows marking data flow]
  • 4. Introduction • Motivation: – “No one size fits all” (Cyclops [LHB13], [Ji13]) – obtain the best of both worlds (SPE, CIMDB) • Application scenario: – analyzing energy consumption data collected from smart plugs installed in households (DEBS 2014 Grand Challenge) • Main contributions: – a static cost-based optimizer for federated systems • extends established optimization techniques • considers the feasibility property of CQ – shows the potential of federated CQ execution over an SPE and a CIMDB • throughput up to 8.5x that of pure SPE-based processing • throughput up to 1.8x that of pure CIMDB-based processing
  • 5. Federated Continuous Query Execution • send relevant input data from SPE to CIMDB • trigger re-evaluation of query pieces moved to CIMDB • take results of query pieces executed in CIMDB back to SPE [Figure: SPE, CIMDB, two MIG operators, and an SQL query; data streams as input, query results as output, arrows marking data flow]
  • 6. Query Optimization Problem • Problem: determine the optimal execution plan for a given CQ – currently at deployment time • Feasibility of continuous queries [AN04]: – feasible execution plan: can keep up with the data arrival rate – feasible query: has at least one feasible plan • Feasibility-dependent optimization objective: – feasible queries: find the feasible plan with the least resource consumption – infeasible queries: find the plan with maximal throughput • State of the art: existing work considers either the feasibility of CQ but not the federation context, or the federation context but not the feasibility of CQ.
  • 7. Optimization Solution: Cost Model – Operator Cost (1) • Operator cost: CPU cost caused by tuples arriving from data sources within unit time. For an operator $O$ with $k$ direct upstream operators: $u(O) = \sum_{i=1}^{k} l_i c_i$ – $l_i$: number of tuples produced by the $i$-th upstream operator as a result of unit-time source arrivals – $c_i$: time to process a single tuple from the $i$-th upstream operator • $u(O) > 1$ ⇒ bottleneck ⇒ infeasible plan • Example with $k = 2$, $l_1 = 300$, $l_2 = 200$, $c_1 = 0.001$, $c_2 = 0.002$: $u(O) = l_1 c_1 + l_2 c_2 = 300 \cdot 0.001 + 200 \cdot 0.002 = 0.7$
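The operator cost on this slide can be sketched in a few lines of Python. This is only an illustration of the formula, not the authors' implementation; the function name `operator_cost` and the list-of-pairs input format are assumptions made for the example.

```python
from typing import Iterable, Tuple


def operator_cost(inputs: Iterable[Tuple[float, float]]) -> float:
    """Return u(O) = sum of l_i * c_i over the direct upstream operators.

    Each pair is (l_i, c_i): tuples arriving from the i-th upstream operator
    per unit time, and the time needed to process one such tuple.
    """
    return sum(l * c for l, c in inputs)


# Values from the slide example: l1=300, c1=0.001, l2=200, c2=0.002.
u = operator_cost([(300, 0.001), (200, 0.002)])
print(round(u, 3))  # 0.7
print(u > 1)        # False: this operator is not a bottleneck
```

A utilization above 1 would mean the operator needs more than one unit of time to process one unit of time's worth of input, which is why the slide labels it a bottleneck.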
  • 8. Optimization Solution: Cost Model – Operator Cost (2) • A query piece executed in CIMDB and its corresponding MIG operator: – treated as a composite operator and costed as a whole – cost includes data transfer (in and out) cost and query execution cost [Figure: SPE with a MIG operator and CIMDB receiving an SQL query; data streams as input, query results as output, arrows marking data flow]
  • 9. Optimization Solution: Cost Model – Execution Plan Cost • Execution plan cost: $C(P) = \langle C_b(P), C_u(P) \rangle$, where $m$ is the number of operators in $P$ – bottleneck cost: $C_b(P) = \max\{u(O_j) : j \in [1, m]\}$ – total utilization cost: $C_u(P) = \sum_{j=1}^{m} u(O_j)$ – $P$ is infeasible if $C_b(P) > 1$ • Example with four operators, $u(O_1) = 0.5$, $u(O_2) = 0.3$, $u(O_3) = 1.1$, $u(O_4) = 0.7$: $C_b(P) = 1.1$, $C_u(P) = 2.6$
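As a quick check of the two cost components, the following Python sketch computes the bottleneck cost and the total utilization cost from a list of per-operator utilizations. The names `plan_cost` and `is_feasible` are hypothetical helpers introduced for this illustration.

```python
from typing import Sequence, Tuple


def plan_cost(utilizations: Sequence[float]) -> Tuple[float, float]:
    """Return (C_b, C_u): the maximum and the sum of operator utilizations."""
    return max(utilizations), sum(utilizations)


def is_feasible(utilizations: Sequence[float]) -> bool:
    """A plan is feasible when no operator utilization exceeds 1."""
    return max(utilizations) <= 1


# Utilizations of O1..O4 from the slide example.
u = [0.5, 0.3, 1.1, 0.7]
cb, cu = plan_cost(u)
print(cb, round(cu, 3))  # 1.1 2.6
print(is_feasible(u))    # False, because O3 has utilization 1.1
```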
  • 10. Optimization Solution: Optimal Execution Plan • An execution plan P of a CQ is an optimal plan iff, for any other plan P’ of the CQ, one of the following conditions is satisfied: – Condition 1: P is feasible but P’ is infeasible (Cb(P) ≤ 1 < Cb(P’)) – Condition 2: both P and P’ are feasible, and P has no higher total utilization cost (Cb(P) ≤ 1, Cb(P’) ≤ 1, and Cu(P) ≤ Cu(P’)) – Condition 3: both P and P’ are infeasible, and P has no higher bottleneck cost (1 < Cb(P) ≤ Cb(P’))
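The three conditions can be read as a single ordering on plans, following the inequalities given in parentheses: feasible plans come before infeasible ones, feasible plans are ranked by Cu, and infeasible plans are ranked by Cb. The Python sketch below encodes that reading as a sort key; `plan_key` and `optimal_plan` are hypothetical names, and each plan is represented simply as a pair (Cb, Cu).

```python
from typing import Dict, Tuple

Cost = Tuple[float, float]  # (C_b, C_u)


def plan_key(cost: Cost) -> Tuple[int, float]:
    """Sort key: feasible plans first (by C_u), then infeasible ones (by C_b)."""
    cb, cu = cost
    return (0, cu) if cb <= 1 else (1, cb)


def optimal_plan(plans: Dict[str, Cost]) -> str:
    """Return the name of a plan that no other plan beats under the conditions."""
    return min(plans, key=lambda name: plan_key(plans[name]))


plans = {
    "A": (0.9, 2.4),  # feasible
    "B": (0.8, 2.9),  # feasible, but higher total utilization than A
    "C": (1.1, 1.5),  # infeasible despite the lowest total utilization
}
print(optimal_plan(plans))                            # A
print(optimal_plan({"C": (1.1, 1.5), "D": (1.6, 1.2)}))  # C
```

Note that plan C never wins against a feasible plan even though its total utilization is lowest, which is the point of Condition 1.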
  • 11. Optimization Solution: Two-Phase Optimization • Large search space (number of possible plans): – many semantically equivalent logical plans – a logical plan with $n$ operators has $2^n$ possible placement decisions • Two-phase optimization: – Phase One: determine the optimal logical plan (consider join ordering, etc.) – Phase Two: determine the placement of each operator in the logical plan produced in Phase One • Bottom-up plan construction following the dynamic programming (DP) model • The applicability of DP to the feasibility-dependent optimization objective is proved in the paper.
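To make the size of the Phase Two search space concrete, the sketch below enumerates every placement of n operators over the two engines by brute force. It is deliberately not the DP-based method from the paper: it assumes each operator has a fixed, independent utilization per engine (the hypothetical lists `cost_spe` and `cost_db`) and ignores data transfer cost and the composite treatment of CIMDB query pieces.

```python
from itertools import product
from typing import List, Sequence, Tuple


def plan_key(utils: Sequence[float]) -> Tuple[int, float]:
    """Feasible plans first (ranked by total utilization), else by bottleneck."""
    cb, cu = max(utils), sum(utils)
    return (0, cu) if cb <= 1 else (1, cb)


def best_placement(cost_spe: List[float],
                   cost_db: List[float]) -> Tuple[Tuple[str, ...], int]:
    """Try all 2^n placements; return the best one and how many were tried."""
    n = len(cost_spe)
    best_key, best_plan, tried = None, None, 0
    for placement in product(("SPE", "DB"), repeat=n):
        tried += 1
        utils = [cost_spe[i] if engine == "SPE" else cost_db[i]
                 for i, engine in enumerate(placement)]
        key = plan_key(utils)
        if best_key is None or key < best_key:
            best_key, best_plan = key, placement
    return best_plan, tried


# Hypothetical utilizations for three operators on each engine.
plan, tried = best_placement([0.2, 1.4, 0.3], [0.6, 0.5, 0.4])
print(plan)   # ('SPE', 'DB', 'SPE')
print(tried)  # 8, i.e. 2**3
```

The count doubles with every added operator, which is why exhaustive enumeration stops being practical for larger logical plans.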
  • 12. Optimization Solution: Pruning in Phase Two • For each operator O in a logical plan, the optimal sub-plan up to O, where O is placed in the SPE, can be built from the optimal sub-plans up to the direct upstream operators of O. • For a large logical plan: divide it into smaller pieces, then optimize and compose them in post order. [Figure: two candidate sub-plans ending in $O_2^{SPE}$: $I_1$ with $O_1^{SPE}$ and $I_2$ with $O_1^{DB}$, where $C(I_1) < C(I_2)$]
  • 13. Evaluation: Setup • Setup: HP Z620 workstation with 24 cores (1.2 GHz per core) and 96 GB RAM, running SUSE Linux. • Data: real-world energy consumption data from smart plugs installed in households (DEBS 2014 Grand Challenge). • Tested queries: [Figure: tested queries]
  • 14. Evaluation: Optimizer effectiveness (1) • Examine 10 source stream data rates picked from the range [1,000, 40,000] tuples/s • Measure throughput of the devised optimal query [Figure: query with operators PROJECT, INNER JOIN, AGGR (avg), AGGR (cnt), SELECT, SELECT, WINDOW (5 min), WINDOW (5 min)] [Figure: actual vs. requested throughput (thousand/s) for the plan SELECT in SPE] [Figure: max. throughput comparison (thousand/s): SELECT in SPE 26.1, All in SPE 3.1, All in DB 18.7]
  • 15. Evaluation: Optimizer effectiveness (2) • Examine data rates ranging from 1,000 to 40,000 tuples/s, in increments of 1,000 tuples/s • Measure throughput of the devised optimal query [Figure: query with operators PROJECT, INNER JOIN, AGGR (avg, max), AGGR (avg, max), SELECT, SELECT, WINDOW (5 min), WINDOW (1 min)] [Figure: actual vs. requested throughput (thousand/s) for the plans SELECT in SPE (P1) and SEL, JOIN, P in SPE (P2)] [Figure: max. throughput comparison (thousand/s): SELECT in SPE (P1) 18.1, SEL, JOIN, P in SPE (P2) 28.6, All in SPE 6.0, All in DB 18.0]
  • 16. Evaluation: Influence of Feasibility Check [Figure: query with operators PROJECT, INNER JOIN, AGGR (avg, max), AGGR (avg, max), SELECT, SELECT, WINDOW (5 min), WINDOW (1 min)] [Figure: actual vs. requested throughput (thousand/s) for three plans: SELECT in SPE (with feasibility check), SEL, JOIN, P in SPE (with feasibility check), and SEL in SPE (without feasibility check)]
  • 17. Evaluation: Optimization Time • Tested with join queries (2-way, 5-way, 8-way). [Figure: query with operators PROJECT, INNER JOIN, AGGR (avg, max), AGGR (avg, max), SELECT, SELECT, WINDOW (5 min), WINDOW (1 min)] [Figure: number of enumerated plans in Phase Two (log scale) for 2-way (6), 5-way (15), and 8-way (24) queries; with pruning: 11, 312, 8411; without pruning: 64, 327168, 16+ million] [Figure: time in milliseconds (log scale) for the same queries; Phase One: 0.9, 68.6, 100.5; Phase Two: 12.3, 908.6, 61335.3]
  • 18. Conclusion • Exploits the potential of federated execution of CQ over an SPE and a CIMDB. • Presents a static optimizer which extends traditional optimization techniques to consider the feasibility of CQ. • Evaluation shows promising results. For the examined queries, the throughput of the devised federated plan is – up to 8.5 times as high as the throughput of a pure SPE-based plan – up to 1.8 times as high as the throughput of a pure CIMDB-based plan
  • 19. References [AN04] Ayad, A. M. & Naughton, J. F., Static Optimization of Conjunctive Queries with Sliding Windows over Infinite Streams, SIGMOD, 2004 [FKC+09] Franklin, M. J.; Krishnamurthy, S.; Conway, N.; Li, A.; Russakovsky, A. & Thombre, N., Continuous Analytics: Rethinking query processing in a network-effect world, CIDR, 2009 [KS09] Kraemer, J. & Seeger, B., Semantics and implementation of continuous sliding window queries over data streams, ACM TODS, 2009 [BCD+10] Botan, I.; Cho, Y.; Derakhshan, R.; Dindar, N.; Gupta, A.; Haas, L. M.; Kim, K.; Lee, C.; Mundada, G.; Shan, M.-C.; Tatbul, N.; Yan, Y.; Yun, B. & Zhang, J., A demonstration of the MaxStream federated stream processing system, ICDE, 2010 [LMB+10] Liu, M.; Mihaylov, S. R.; Bao, Z.; Jacob, M.; Ives, Z. G.; Loo, B. T. & Guha, S., SmartCIS: integrating digital and physical environments, SIGMOD Record, 2010 [LIM+12] Liarou, E.; Idreos, S.; Manegold, S. & Kersten, M., MonetDB/DataCell: online analytics in a streaming column-store, PVLDB, 2012 [LHB13] Lim, H.; Han, Y. & Babu, S., How to Fit when No One Size Fits, CIDR, 2013 [Ji13] Ji, Y., Database support for processing complex aggregate queries over data streams, EDBT Workshops, 2013 [CDK+14] Çetintemel, U.; Du, J.; Kraska, T.; Madden, S.; Maier, D.; Meehan, J.; Pavlo, A.; Stonebraker, M.; Sutherland, E.; Tatbul, N.; Tufte, K.; Wang, H. & Zdonik, S. B., S-Store: A streaming NewSQL system for big velocity applications, PVLDB, 2014 [DLB+11] Daum, M.; Lauterwald, F.; Baumgärtel, P.; Pollner, N. & Meyer-Wegener, K., Efficient and Cost-aware Operator Placement in Heterogeneous Stream-processing Environments, DEBS, 2011
  • 21. Query Optimization Problem: State of the Art CQ optimization Federation context Optimization granularity Feasibility-dependent opt. [VN02, AN04] √ operator √ Traditional distributed, federated DBMS, e.g., [DH02, BCE+05] √ operator MaxStream [BCD+10] √ Cyclops [LHB13] √ √ query ASPEN [LMB+10] √ √ operator Operator placement, e.g., [DLB+11] √ √/X operator query
  • 22. Semantics • Adopt the abstract semantics defined in [ABW06], which is based on: – Two data types: • Stream (S): a possibly infinite bag of elements <s, t>, where s is a tuple belonging to the schema of S and t is the timestamp of s. • Time-varying relation (R): a mapping from T to a finite but unbounded bag of tuples belonging to the schema of R. – Three classes of query operators: • stream-to-relation (S2R) operators: produce one relation from one stream (e.g., window operators) • relation-to-relation (R2R) operators: produce one relation from one or more relations. • relation-to-stream (R2S) operators: produce one stream from one relation.
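The three operator classes can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the semantics of [ABW06] in full: a stream is a list of `(tuple, timestamp)` pairs, a relation at one time instant is a bag represented by `collections.Counter`, the window is time-based, and the relation-to-stream operator emits tuples that are newly present compared with the previous instant. The function names and sample data are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, List, Tuple

Element = Tuple[Hashable, int]  # <s, t>: tuple s with timestamp t


def window(stream: List[Element], t: int, size: int) -> Counter:
    """S2R: bag of tuples whose timestamp lies in (t - size, t]."""
    return Counter(s for s, ts in stream if t - size < ts <= t)


def select(relation: Counter, pred: Callable[[Hashable], bool]) -> Counter:
    """R2R: keep only the tuples that satisfy the predicate."""
    return Counter({s: n for s, n in relation.items() if pred(s)})


def insert_stream(now: Counter, prev: Counter, t: int) -> List[Element]:
    """R2S: emit <s, t> for tuples in the relation at t but not at t - 1."""
    return [(s, t) for s in (now - prev).elements()]


# Hypothetical smart-plug readings: (plug id, load) with integer timestamps.
stream = [(("plug1", 5), 1), (("plug2", 12), 2), (("plug1", 20), 3)]
high = lambda s: s[1] > 10

prev = select(window(stream, 2, 2), high)  # relation at t = 2
now = select(window(stream, 3, 2), high)   # relation at t = 3
print(insert_stream(now, prev, 3))         # [(('plug1', 20), 3)]
```

Composing the three classes in this order (window, then relational operators, then conversion back to a stream) mirrors how a continuous query over a stream is expressed in this model.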
  • 23. Introduction: From DBMS to SPE • Increasing interest in processing high-velocity data streams generated in real time using continuous queries (CQ) ⇒ need for a new processing paradigm [Figure: DBMS (one-shot queries over stored data, producing query results) next to SPE (continuous query over streaming data, producing query results)]
  • 24. Introduction: From DBMS to SPE • However, many applications require: – persisting input streaming data or query results for on-demand analysis – combining streaming data with static data during processing. [Figure: DBMS (one-shot queries, stored data, query results) and SPE (continuous query, streaming data, query results), linked by arrows labeled "store data" and "access stored data"]
  • 25. Introduction: Build SPE on Top of DBMS Kernel • Exploit and merge technologies from both worlds in an integrated way. – Truviso Continuous Analytics [FKC+09], HP Lab work [CH10], DataCell [LIM+12], S-Store [CDK+14] [Figure: combined SPE + DBMS handling one-shot queries over stored data and continuous queries over streaming data, each producing query results; labels: in-memory table, buffers in UDFs]