Philipp M. Grulich (TU Berlin) & Jonas Traub (TU Berlin)
Scotty: Efficient Window Aggregation
for your Stream Processing System
Big Data Track
Jonas Traub (TU Berlin), Philipp M. Grulich (TU Berlin) - Scotty: Efficient Window Aggregation for your Stream Processing System
About Us
Philipp M. Grulich
Research Associate (TU Berlin)
grulich@tu-berlin.de
Jonas Traub
Research Associate (TU Berlin)
jonas.traub@tu-berlin.de
Database Systems and Information Management Research Group at TU Berlin
www.dima.tu-berlin.de
Stream Processing Systems
Source: Rajaraman, A., & Ullman, J. D. (2012). Mining of massive datasets (Vol. 77). Cambridge University Press. Chapter 4, www.mmds.org
Aggregations in Stream Processing Pipelines
A stream processing pipeline is a series of concurrently running operators.
Window Aggregation
Aggregation Examples:
● Arithmetic Operations: Sum, min, max etc.
● Statistics / Analysis
● Reservoir Sampling
● ML Model Updates
● Concept Drift Detection
Apache Flink - Stateful Stream Processing
“Apache Flink is a framework and distributed processing engine for stateful computations
over unbounded and bounded data streams. Flink has been designed to run in all common cluster
environments, perform computations at in-memory speed and at any scale.” (flink.apache.org)
Motivation
Research Background
Cutty: Aggregate Sharing for User-Defined Windows
P. Carbone, J. Traub, A. Katsifodimos, S. Haridi, V. Markl
ACM International Conference on Information and Knowledge Management (CIKM 2016)
Scotty: Efficient Window Aggregation for out-of-order Stream Processing
J. Traub, P. M. Grulich, A. R. Cuéllar, S. Breß, A. Katsifodimos, T. Rabl, V. Markl
IEEE International Conference on Data Engineering (ICDE 2018)
Efficient Window Aggregation with General Stream Slicing
J. Traub, P. M. Grulich, A. R. Cuéllar, S. Breß, A. Katsifodimos, T. Rabl, V. Markl
International Conference on Extending Database Technology (EDBT 2019; Best Paper Award)
Scotty Window Processor:
Efficient Window Aggregations for Flink, Beam, and Storm
https://github.com/TU-Berlin-DIMA/scotty-window-processor
Stream Slicing Example
The number of slices depends on the workload.
We store partial aggregates instead of all tuples. => Small memory footprint.
We assign each tuple to exactly one slice. => O(1) per-tuple complexity.
We require just a few computation steps to calculate final aggregates. => Low latency.
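The three properties of stream slicing named on these slides can be illustrated with a minimal Python sketch. It is not Scotty's implementation: the class `SlicingSumAggregator`, its methods, and the restriction to a sum over sliding time windows whose length is a multiple of the slide are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    """Partial aggregate for the half-open time range [start, end)."""
    start: int
    end: int
    partial_sum: int = 0


class SlicingSumAggregator:
    """Hypothetical sketch: sum over sliding time windows built from slices."""

    def __init__(self, window_length, slide):
        # Assumption: the window length is a multiple of the slide, so every
        # window is a union of whole slices.
        if window_length % slide != 0:
            raise ValueError("window_length must be a multiple of slide")
        self.window_length = window_length
        self.slide = slide
        self.slices = {}  # slice start -> Slice

    def add(self, timestamp, value):
        # Each tuple updates exactly one slice: constant work per tuple.
        start = (timestamp // self.slide) * self.slide
        s = self.slices.setdefault(start, Slice(start, start + self.slide))
        s.partial_sum += value

    def window_result(self, window_start):
        # window_start is expected to be a multiple of the slide.
        # The final aggregate combines a few partial aggregates, not all tuples.
        window_end = window_start + self.window_length
        return sum(
            self.slices[start].partial_sum
            for start in range(window_start, window_end, self.slide)
            if start in self.slices
        )


agg = SlicingSumAggregator(window_length=10, slide=5)
for ts, value in [(1, 4), (3, 1), (6, 2), (12, 7)]:
    agg.add(ts, value)

print(agg.window_result(0))  # window [0, 10): 4 + 1 + 2 = 7
print(agg.window_result(5))  # window [5, 15): 2 + 7 = 9
```

Only three slices are stored for the four tuples, and the slice covering [5, 10) is shared by both windows.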
Workload Characteristics
● Window Types: Context Free, Forward Context Free, Forward Context Aware
● Stream Order: in-order, out-of-order
● Window Measures: time, tuple count, arbitrary
● Aggregation Functions: distributive, algebraic, holistic; associativity, commutativity, invertibility
General Stream Slicing combines generality and efficiency in a single solution.
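The aggregation function classes listed among the workload characteristics can be made concrete with a small Python sketch. It uses the standard meaning of the terms (distributive, algebraic, holistic, invertible); the two slices and all variable names are made up for the example.

```python
from statistics import median

# Two slices of a stream (illustrative values).
slice_a = [1, 2, 1, 4, 3]
slice_b = [1, 5, 2, 2, 3]

# Distributive: partial results are combined with the function itself.
total = sum([sum(slice_a), sum(slice_b)])

# Algebraic: a fixed-size partial state (here sum and count) is sufficient.
partials = [(sum(s), len(s)) for s in (slice_a, slice_b)]
average = sum(p[0] for p in partials) / sum(p[1] for p in partials)

# Holistic: in general no fixed-size partial state exists; the median needs
# access to the values themselves.
med = median(slice_a + slice_b)

# Invertible: a sum can drop a value by subtraction. A max has no such
# inverse; removing the current maximum requires looking at the other values.
total_without_first = total - slice_a[0]

print(total, average, med, total_without_first)  # 24 2.4 2.0 23
```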
Impact of Workload Characteristics
Count-based tumbling window with a length of 5 tuples.
Tuple Count: 1 2 3 4 5 | 6 7 8 9 10 | 11 12 13 14 15
Event Time: 5 12 13 20 35 | 37 42 46 48 51 | 52 57 63 64 65
Value: 1 2 1 4 3 | 1 5 2 2 3 | 6 1 2 2 1
Window Aggregate (sum): 11 | 13 | 12
What if the stream is out-of-order?
Out-of-order Tuple: value 5, event time 49
The tuple belongs between event times 48 and 51, so the last tuple of each following window shifts into the next window:
Second window: 13 + 5 - 3
Third window: 12 + 3 - 1
What if the aggregation function is not invertible?
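The out-of-order example above can be replayed with a short Python sketch. It assumes a sum aggregation, which is invertible, so each affected window is repaired by adding the tuple that enters and subtracting the tuple that is pushed out. The helper names `window_sums` and `insert_out_of_order` are hypothetical and not part of Scotty.

```python
from bisect import bisect_right

WINDOW_LENGTH = 5  # count-based tumbling window of 5 tuples


def window_sums(values):
    """Recompute all window sums from scratch (used here only as a reference)."""
    return [sum(values[i:i + WINDOW_LENGTH])
            for i in range(0, len(values), WINDOW_LENGTH)]


def insert_out_of_order(times, values, sums, time, value):
    """Insert a late tuple and repair the window sums incrementally."""
    pos = bisect_right(times, time)
    times.insert(pos, time)
    values.insert(pos, value)
    window = pos // WINDOW_LENGTH
    entering = value
    while window < len(sums):
        pushed_index = (window + 1) * WINDOW_LENGTH
        if pushed_index >= len(values):
            # The last window is not full yet: it grows, nothing is pushed out.
            sums[window] += entering
            return
        pushed = values[pushed_index]
        # Invertible aggregation: add the new tuple, subtract the shifted one.
        sums[window] += entering - pushed
        entering = pushed
        window += 1
    # The tuple pushed out of the last full window starts a new window.
    sums.append(entering)


times = [5, 12, 13, 20, 35, 37, 42, 46, 48, 51, 52, 57, 63, 64, 65]
values = [1, 2, 1, 4, 3, 1, 5, 2, 2, 3, 6, 1, 2, 2, 1]
sums = window_sums(values)  # [11, 13, 12]

insert_out_of_order(times, values, sums, time=49, value=5)
print(sums)  # [11, 15, 14, 1]
```

For a non-invertible function such as max, the subtraction step is not available, and the affected windows would have to be recomputed from stored tuples or finer partial aggregates.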
Scotty Window Processor:
Efficient Window Aggregations
for Flink, Beam, and Storm
https://github.com/TU-Berlin-DIMA/scotty-window-processor
Key-Facts
Features:
● One window operator for many systems.
● High performance window aggregations with stream slicing.
● Scales to thousands of concurrent windows.
● Aggregate sharing among multiple window queries.
● Adapts to workload characteristics:
○ Window Types
○ Aggregation Functions
○ Window Measures
○ Stream Order
Connectors:
…more coming soon…
Scotty Core
Scotty adapts to workload characteristics
and combines generality and efficiency in a single solution.
Benchmark
Concurrent Windows with Built-in Window Operator:
● Flink performs well with a single window (no overlap; one bucket at a time).
● With overlapping concurrent windows, the throughput drops drastically.
[Figure: throughput (tuples/sec., axis from 0 to 2,500,000) over the number of concurrent windows (1, 10, 20, 50, 100, 500, 1000) for Flink, Storm, and Flink on Beam]
Benchmark
Concurrent Windows with Scotty:
● With Scotty, the throughput is independent of the number of concurrent windows.
[Figure: throughput (tuples/sec., axis from 0 to 2,500,000) over the number of concurrent windows (1, 10, 20, 50, 100, 500, 1000) for Flink+Scotty, Storm+Scotty, and Beam+Flink+Scotty]
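The contrast between the two benchmark results can be illustrated by counting aggregate updates per tuple. The Python sketch below is an assumption-laden toy and not how any of the benchmarked systems is implemented: it uses tumbling windows that all start at time 0, a sum aggregation, and slices whose length is the greatest common divisor of the window lengths (Scotty itself derives slice edges from the window definitions).

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def aggregate_with_buckets(tuples, window_lengths):
    """One bucket per (window length, window index): each tuple updates
    one bucket for every concurrent window query."""
    buckets = defaultdict(int)
    updates = 0
    for ts, value in tuples:
        for length in window_lengths:
            buckets[(length, ts // length)] += value
            updates += 1
    return dict(buckets), updates


def aggregate_with_slices(tuples, window_lengths):
    """Shared slices: each tuple updates exactly one slice; window results
    are combined from the slice partials afterwards."""
    slice_length = reduce(gcd, window_lengths)
    slices = defaultdict(int)
    updates = 0
    for ts, value in tuples:
        slices[ts // slice_length] += value
        updates += 1
    results = defaultdict(int)
    for index, partial in slices.items():
        for length in window_lengths:
            results[(length, index * slice_length // length)] += partial
    return dict(results), updates


stream = [(ts, 1) for ts in range(100)]
queries = [10, 20, 50]  # three concurrent tumbling windows

bucket_results, bucket_updates = aggregate_with_buckets(stream, queries)
slice_results, slice_updates = aggregate_with_slices(stream, queries)

print(bucket_updates, slice_updates)    # 300 100
print(bucket_results == slice_results)  # True
```

With buckets, the per-tuple work grows with the number of concurrent windows; with shared slices it stays at one update per tuple, and the additional work is paid once per slice when results are combined.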
Using Scotty on Flink
1. Clone Scotty and install it with Maven
2. Add Scotty to your Flink project
3. Initialize the Scotty window operator
4. Add window definitions
5. Add Scotty to your Flink job
Implement your own Aggregation Functions
Example:
• Average -> Sum/Count
Input Stream: 6 7 1 2 5
PartialState: sum|count
1. lift: 5 -> 5|1
2. combine: 3|3 and 5|1 -> 8|4
3. lower: 8|4 -> 2
Output Stream: 2
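The lift, combine, and lower steps of the average example can be written down as a minimal Python sketch. The step names come from the slides; the `Partial` type and the function signatures are assumptions for illustration and do not reproduce Scotty's actual interface.

```python
from typing import NamedTuple


class Partial(NamedTuple):
    """Partial state of an average: running sum and tuple count."""
    total: int
    count: int


def lift(value: int) -> Partial:
    # Turn a single input tuple into a partial aggregate.
    return Partial(value, 1)


def combine(a: Partial, b: Partial) -> Partial:
    # Merge two partial aggregates into one.
    return Partial(a.total + b.total, a.count + b.count)


def lower(partial: Partial) -> float:
    # Turn a partial aggregate into the final result.
    return partial.total / partial.count


state = combine(Partial(3, 3), lift(5))
print(state)         # Partial(total=8, count=4)
print(lower(state))  # 2.0
```

Keeping the sum and the count separate until `lower` is what allows partial aggregates of different slices to be merged without access to the original tuples.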
Upcoming Research Projects
The NebulaStream Platform: Data and Application Management for the Internet of Things
https://arxiv.org/pdf/1910.07867.pdf
Agora: An Open Ecosystem for Democratizing Data Science & Artificial Intelligence
https://arxiv.org/pdf/1909.03026.pdf
We are hiring!
Research Associates / PhD Students & Post Docs (m/f/d)
Catch us after the talks or send a mail to jobs@dima.tu-berlin.de
Acknowledgements: This talk is supported by the Berlin Big Data Center (01IS14013A), the Berlin Center for Machine Learning (01IS18037A), and Software Campus (1-3000473-18TP).
Scotty Window Processor
Scotty Features:
● One window operator for many systems.
● High performance with stream slicing.
● Scales to thousands of concurrent windows.
● Aggregate sharing among multiple window queries.
● Adapts to workload characteristics.
Open Source Repository: tu-berlin-dima.github.io/scotty-window-processor

Weitere ähnliche Inhalte

Ähnlich wie code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Processing System

Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingFlink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingJonas Traub
 
Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...
Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...
Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...Flink Forward
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Jonas Traub
 
Data Streaming in IoT and Big Data Analytics
Data Streaming in  IoT and Big Data AnalyticsData Streaming in  IoT and Big Data Analytics
Data Streaming in IoT and Big Data AnalyticsVincenzo Gulisano
 
How to Prepare Weather and Climate Models for Future HPC Hardware
How to Prepare Weather and Climate Models for Future HPC HardwareHow to Prepare Weather and Climate Models for Future HPC Hardware
How to Prepare Weather and Climate Models for Future HPC Hardwareinside-BigData.com
 
From Cloud to Fog: the Tao of IT Infrastructure Decentralization
From Cloud to Fog: the Tao of IT Infrastructure DecentralizationFrom Cloud to Fog: the Tao of IT Infrastructure Decentralization
From Cloud to Fog: the Tao of IT Infrastructure DecentralizationFogGuru MSCA Project
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept Miha Ahronovitz
 
The benefits of fine-grained synchronization in deterministic and efficient ...
The benefits of fine-grained synchronization in  deterministic and efficient ...The benefits of fine-grained synchronization in  deterministic and efficient ...
The benefits of fine-grained synchronization in deterministic and efficient ...Vincenzo Gulisano
 
Blue gene technology
Blue gene technologyBlue gene technology
Blue gene technologyVivek Jha
 
Gridforum Juergen Knobloch Grids For Science 20080402
Gridforum Juergen Knobloch Grids For Science 20080402Gridforum Juergen Knobloch Grids For Science 20080402
Gridforum Juergen Knobloch Grids For Science 20080402vrij
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...Paolo Missier
 
AIAA Future of Fluids 2018 Balaji
AIAA Future of Fluids 2018 BalajiAIAA Future of Fluids 2018 Balaji
AIAA Future of Fluids 2018 BalajiQiqi Wang
 
Distributed stream consistency checking
Distributed stream consistency checkingDistributed stream consistency checking
Distributed stream consistency checkingDaniele Dell'Aglio
 
Data Integration in a Big Data Context
Data Integration in a Big Data ContextData Integration in a Big Data Context
Data Integration in a Big Data ContextAlasdair Gray
 
Active Data PDSW'13
Active Data PDSW'13Active Data PDSW'13
Active Data PDSW'13Gilles Fedak
 
Low Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard PlatformLow Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard Platforma3labdsp
 
Costs of the French PWR
Costs of the French PWRCosts of the French PWR
Costs of the French PWRmyatom
 

Ähnlich wie code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Processing System (20)

Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingFlink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
 
Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...
Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...
Flink Forward Berlin 2018: Jonas Traub & Philipp Grulich - "Efficient Window ...
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
 
Data Streaming in IoT and Big Data Analytics
Data Streaming in  IoT and Big Data AnalyticsData Streaming in  IoT and Big Data Analytics
Data Streaming in IoT and Big Data Analytics
 
How to Prepare Weather and Climate Models for Future HPC Hardware
How to Prepare Weather and Climate Models for Future HPC HardwareHow to Prepare Weather and Climate Models for Future HPC Hardware
How to Prepare Weather and Climate Models for Future HPC Hardware
 
From Cloud to Fog: the Tao of IT Infrastructure Decentralization
From Cloud to Fog: the Tao of IT Infrastructure DecentralizationFrom Cloud to Fog: the Tao of IT Infrastructure Decentralization
From Cloud to Fog: the Tao of IT Infrastructure Decentralization
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
 
Bluegene
BluegeneBluegene
Bluegene
 
The benefits of fine-grained synchronization in deterministic and efficient ...
The benefits of fine-grained synchronization in  deterministic and efficient ...The benefits of fine-grained synchronization in  deterministic and efficient ...
The benefits of fine-grained synchronization in deterministic and efficient ...
 
Blue gene technology
Blue gene technologyBlue gene technology
Blue gene technology
 
Gridforum Juergen Knobloch Grids For Science 20080402
Gridforum Juergen Knobloch Grids For Science 20080402Gridforum Juergen Knobloch Grids For Science 20080402
Gridforum Juergen Knobloch Grids For Science 20080402
 
Bluegene
BluegeneBluegene
Bluegene
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...
 
AIAA Future of Fluids 2018 Balaji
AIAA Future of Fluids 2018 BalajiAIAA Future of Fluids 2018 Balaji
AIAA Future of Fluids 2018 Balaji
 
Multicore computing
Multicore computingMulticore computing
Multicore computing
 
Distributed stream consistency checking
Distributed stream consistency checkingDistributed stream consistency checking
Distributed stream consistency checking
 
Data Integration in a Big Data Context
Data Integration in a Big Data ContextData Integration in a Big Data Context
Data Integration in a Big Data Context
 
Active Data PDSW'13
Active Data PDSW'13Active Data PDSW'13
Active Data PDSW'13
 
Low Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard PlatformLow Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard Platform
 
Costs of the French PWR
Costs of the French PWRCosts of the French PWR
Costs of the French PWR
 

Mehr von Jonas Traub

Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Jonas Traub
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Jonas Traub
 
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...Jonas Traub
 
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Jonas Traub
 
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)

code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Processing System

  • 1. Philipp M. Grulich (TU Berlin) & Jonas Traub (TU Berlin) Scotty: Efficient Window Aggregation for your Stream Processing System Big Data Track
  • 2-3. Jonas Traub (TU Berlin), Philipp M. Grulich (TU Berlin) - Scotty: Efficient Window Aggregation for your Stream Processing System. About Us: Philipp M. Grulich, Research Associate (TU Berlin), grulich@tu-berlin.de; Jonas Traub, Research Associate (TU Berlin), jonas.traub@tu-berlin.de. Database Systems and Information Management Research Group at TU Berlin, www.dima.tu-berlin.de
  • 4-5. Stream Processing Systems. Source: Rajaraman, A., & Ullman, J. D. (2012). Mining of massive datasets (Vol. 77). Cambridge University Press. Chapter 4, www.mmds.org
  • 6-12. Aggregations in Stream Processing Pipelines: A stream processing pipeline is a series of concurrently running operators. (Figure label: Window Aggregation.) Aggregation examples: arithmetic operations (sum, min, max, etc.), statistics / analysis, reservoir sampling, ML model updates, concept drift detection.
  • 13. Apache Flink - Stateful Stream Processing: “Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.” (flink.apache.org)
  • 14. Motivation
  • 15. Research Background: Cutty: Aggregate Sharing for User-Defined Windows. P. Carbone, J. Traub, A. Katsifodimos, S. Haridi, V. Markl. ACM International Conference on Information and Knowledge Management (CIKM 2016). Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing. J. Traub, P. M. Grulich, A. R. Cuéllar, S. Breß, A. Katsifodimos, T. Rabl, V. Markl. IEEE International Conference on Data Engineering (ICDE 2018). Efficient Window Aggregation with General Stream Slicing. J. Traub, P. M. Grulich, A. R. Cuéllar, S. Breß, A. Katsifodimos, T. Rabl, V. Markl. International Conference on Extending Database Technology (EDBT 2019; Best Paper Award). Scotty Window Processor: Efficient Window Aggregations for Flink, Beam, and Storm. https://github.com/TU-Berlin-DIMA/scotty-window-processor
  • 16-24. Stream Slicing Example: The number of slices depends on the workload. We store partial aggregates instead of all tuples, which keeps the memory footprint small. We assign each tuple to exactly one slice, which gives O(1) per-tuple complexity. We require just a few computation steps to calculate final aggregates, which keeps latency low.
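
The three properties named on the stream slicing slides (partial aggregates per slice, one slice per tuple, few steps per final aggregate) can be illustrated with a minimal Python sketch. It is not Scotty's implementation: the class name `SlicedSum`, the fixed slice length, and the requirement that window edges align with slice boundaries are all assumptions made for illustration.

```python
class SlicedSum:
    """Hypothetical sketch: sum over time windows built from non-overlapping slices."""

    def __init__(self, slice_length):
        self.slice_length = slice_length
        self.slices = {}  # slice start time -> partial sum (no tuples are stored)

    def add(self, timestamp, value):
        # Each tuple updates exactly one slice, so the per-tuple cost is O(1).
        start = (timestamp // self.slice_length) * self.slice_length
        self.slices[start] = self.slices.get(start, 0) + value

    def window_sum(self, window_start, window_end):
        # A final aggregate combines one partial sum per covered slice.
        # Assumes window_start and window_end are multiples of slice_length.
        total = 0
        for start in range(window_start, window_end, self.slice_length):
            total += self.slices.get(start, 0)
        return total


slicer = SlicedSum(slice_length=5)
for ts, v in [(1, 2), (3, 4), (6, 1), (12, 5), (14, 3)]:
    slicer.add(ts, v)

# Two overlapping windows of length 10 share the slice starting at time 5.
print(slicer.window_sum(0, 10), slicer.window_sum(5, 15))
```

Because the two windows reuse the same stored slice, adding more overlapping windows adds lookups but no extra per-tuple work in this sketch.
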
  • 25-26. Workload Characteristics: Window types: context free, forward context free, forward context aware. Stream order: in-order, out-of-order. Window measures: time, tuple count, arbitrary. Aggregation functions: distributive, algebraic, holistic; associativity, commutativity, invertibility. General Stream Slicing combines generality and efficiency in a single solution.
  • 27-41. Impact of Workload Characteristics: Example stream values (tuple counts 1 to 15): 1 2 1 4 3 1 5 2 2 3 6 1 2 2 1. Event times: 5 12 13 20 35 37 42 46 48 51 52 57 63 64 65. A count-based tumbling window with a length of 5 tuples yields the aggregates (sums) 11, 13, and 12. What if the stream is out-of-order? An out-of-order tuple with value 5 and event time 49 belongs between event times 48 and 51, so the second window changes by +5 and -3, and the third window changes by +3 and -1. What if the aggregation function is not invertible?
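
The out-of-order example on these slides can be replayed in a few lines of Python. The values, event times, window length, and the late tuple (value 5, event time 49) are taken from the slides; the function names and the patching strategy are illustrative assumptions, not Scotty's code. The sketch relies on sum being invertible: every affected window adds the tuple that shifts in and subtracts the tuple that shifts out.

```python
import bisect

WINDOW_LENGTH = 5
values = [1, 2, 1, 4, 3, 1, 5, 2, 2, 3, 6, 1, 2, 2, 1]
event_times = [5, 12, 13, 20, 35, 37, 42, 46, 48, 51, 52, 57, 63, 64, 65]


def window_sums(vals, length=WINDOW_LENGTH):
    """Sum each complete count-based tumbling window."""
    complete = len(vals) - len(vals) % length
    return [sum(vals[i:i + length]) for i in range(0, complete, length)]


def insert_out_of_order(sums, vals, times, value, time, length=WINDOW_LENGTH):
    """Insert a late tuple and patch the existing window sums incrementally.

    Only windows that already exist in `sums` are patched; a window that
    becomes complete because of the insert is not added (sketch limitation).
    """
    pos = bisect.bisect_right(times, time)
    vals.insert(pos, value)
    times.insert(pos, time)
    patched = list(sums)
    incoming = value
    for w in range(pos // length, len(patched)):
        # The tuple now sitting just past window w was pushed out of it.
        outgoing = vals[(w + 1) * length]
        patched[w] += incoming - outgoing
        incoming = outgoing  # it shifts into the next window
    return patched


sums = window_sums(values)
patched = insert_out_of_order(sums, values, event_times, value=5, time=49)
print(sums, patched)
```

For a non-invertible function such as max, the subtraction step has no equivalent, so the affected windows would have to be recomputed from stored tuples or finer-grained partial aggregates.
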
  • 42. Scotty Window Processor: Efficient Window Aggregations for Flink, Beam, and Storm. https://github.com/TU-Berlin-DIMA/scotty-window-processor
  • 43-44. Key Facts. Features: one window operator for many systems; high-performance window aggregations with stream slicing; scales to thousands of concurrent windows; aggregate sharing among multiple window queries; adapts to workload characteristics (window types, aggregation functions, window measures, stream order). Connectors: …more coming soon…
  • 45-49. Scotty Core: Scotty adapts to workload characteristics and combines generality and efficiency in a single solution.
  • 50-51. Benchmark, concurrent windows with the built-in window operator: Flink performs well with a single window (no overlap; one bucket at a time). With overlapping concurrent windows, the throughput drops drastically. [Figure: throughput (tuples/sec., 0 to 2,500,000) over the number of concurrent windows (1, 10, 20, 50, 100, 500, 1000) for Flink, Storm, and Flink on Beam.]
  • 52. Benchmark, concurrent windows with Scotty: With Scotty, the throughput is independent of the number of concurrent windows. [Figure: throughput (tuples/sec., 0 to 2,500,000) over the number of concurrent windows (1, 10, 20, 50, 100, 500, 1000) for Flink+Scotty, Storm+Scotty, and Beam+Flink+Scotty.]
  • 53-54. Using Scotty on Flink (setup): 1. Clone Scotty and install it with Maven. 2. Add Scotty to your Flink project.
  • 55-57. Using Scotty on Flink (usage): 1. Initialize the Scotty window operator. 2. Add window definitions. 3. Add Scotty to your Flink job.
  • 63. Implement your own Aggregation Functions Example: Average -> Sum/Count, with a partial state written as sum|count. 1. lift: the input tuple 5 becomes the partial state 5|1. 2. combine: the partial states 3|3 and 5|1 merge into 8|4. 3. lower: the partial state 8|4 yields the final average 2 on the output stream.
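The lift, combine, and lower steps for the average example can be sketched as follows. The `Aggregation` interface and the `Partial` class are assumptions made for this illustration; they mirror the three steps named on the slide but are not claimed to match Scotty's actual interface.

```java
public class AverageSketch {
    // Partial state for an average: running sum and count ("sum|count" on the slide).
    static final class Partial {
        final long sum;
        final long count;
        Partial(long sum, long count) { this.sum = sum; this.count = count; }
    }

    // Hypothetical three-method interface mirroring the lift/combine/lower steps.
    interface Aggregation<I, P, O> {
        P lift(I input);            // turn one input tuple into a partial state
        P combine(P left, P right); // merge two partial states
        O lower(P partial);         // turn a partial state into the final result
    }

    static final class Average implements Aggregation<Long, Partial, Double> {
        @Override
        public Partial lift(Long input) {
            return new Partial(input, 1);
        }

        @Override
        public Partial combine(Partial left, Partial right) {
            return new Partial(left.sum + right.sum, left.count + right.count);
        }

        @Override
        public Double lower(Partial partial) {
            return (double) partial.sum / partial.count;
        }
    }

    public static void main(String[] args) {
        Average avg = new Average();
        Partial existing = new Partial(3, 3);             // 3|3 from earlier tuples
        Partial lifted = avg.lift(5L);                    // 5 -> 5|1
        Partial combined = avg.combine(existing, lifted); // 3|3 + 5|1 -> 8|4
        System.out.println(avg.lower(combined));          // prints 2.0
    }
}
```

Keeping the sum and count separate until `lower` is what lets partial results be merged in any grouping: an average alone cannot be combined with another average, but two sum|count pairs can.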
  • 67. Upcoming Research Projects The NebulaStream Platform: Data and Application Management for the Internet of Things https://arxiv.org/pdf/1910.07867.pdf Agora: An Open Ecosystem for Democratizing Data Science & Artificial Intelligence https://arxiv.org/pdf/1909.03026.pdf We are hiring! Research Associates / PhD Students & Post Docs (m/f/d) Catch us after the talks or send an email to jobs@dima.tu-berlin.de
  • 68. Scotty Window Processor Scotty Features: ● One window operator for many systems. ● High performance with stream slicing. ● Scales to thousands of concurrent windows. ● Aggregate sharing among multiple window queries. ● Adapts to workload characteristics. Open Source Repository: tu-berlin-dima.github.io/scotty-window-processor Acknowledgements: This talk is supported by the Berlin Big Data Center (01IS14013A), the Berlin Center for Machine Learning (01IS18037A), and Software Campus (1-3000473-18TP).