1. Adaptive Provisioning of Stream Processing Systems in the Cloud

Javier Cerviño (#1), Eva Kalyvianaki (*2), Joaquín Salvachúa (#3), Peter Pietzuch (*4)
# Universidad Politécnica de Madrid, * Imperial College London
1 jcervino@dit.upm.es, 2 ekalyv@doc.ic.ac.uk, 3 jsalvachua@dit.upm.es, 4 prp@doc.ic.ac.uk

SMDB 2012
2. Data Stream Processing Systems (DSPS)

• Real-time processing of continuous data
  • Financial trading, sensor networks, etc.
• Data from sources arrive as streams
  – Time-ordered sequence of tuples
• Characteristics
  – Tuple arrival rates are not uniform
• Performance requirements
  – Low latency
  – Guaranteed throughput
• Adaptive provisioning
  – Use resources on demand
3. Cloud Computing

The cloud offers elastic computing by providing resources on demand.
– Characteristics
  • Scalability
  • Geographical distribution
  • Virtualization
  • Application Programming Interface (API)
– Amazon EC2
  • Public cloud provider
  • Infrastructure as a Service
  • Images and virtual machines
4. Related Work

• Cloud stream processing [Kleiminger et al., SMDB'11]
• Cloud network performance
  – Do cloud and Internet paths support streaming data into cloud DCs? [Barker et al., MMSys'07], [Wang et al., INFOCOM'10], [Jackson et al., CLOUDCOM'10]
• Cloud computation performance
  – Do best-effort VMs support low-latency, low-jitter and high-throughput stream processing? [Barker et al., MMSys'07]
  – Is the computational power of Amazon EC2 VMs sufficient for standard stream processing tasks? [Dittrich et al., VLDB'10]
5. Contributions

• Explore the suitability of cloud infrastructures for stream processing (case study on Amazon EC2)
  – Measure network and processing latencies, jitter and throughput
• An adaptive algorithm to allocate cloud resources on demand
  – Resizes the number of VMs in a DSPS deployment
• Algorithm evaluation
  – Deploying the algorithm as part of a DSPS on Amazon EC2
6. Outline

1. Cloud Performance
   1. Network Measurements
   2. Processing Measurements
   3. Discussion
2. Adaptive Cloud Stream Processing
   1. Architecture
   2. Algorithm
3. Experimental Evaluation
   1. Description
   2. Results
4. Future Work and Conclusions
7. Outline (next: 1. Cloud Performance)
8. Cloud Performance: Network Measurements

• Goal: explore network parameters that affect stream processing conditions:
  – Jitter, latency and bandwidth
• Experimental set-up
  – Stream engines
    • Mock engines without processing
    • 9 Amazon EC2 instances: 3 in the US, 3 in the EU and 3 in Asia
    • Large Amazon EC2 instances: 7.5 GB RAM and 4 ECU
  – Stream sources
    • 9 distributed PlanetLab nodes: 3 in the US, 3 in the EU and 3 in Asia
  – Dataset
    • Random data at three different data rates: 10 kbps, 100 kbps and 1 Mbps

[Figure: PlanetLab source nodes in Europe, the USA and Asia streaming to cloud processing-engine instances]
9. Cloud Performance: Network Measurements

[Figure: jitter (ms) per PlanetLab node (1-9) at high, medium and low rates]

• Average jitter is less than 2.5 μs
• Some outliers reach almost 4 seconds
• Low jitter, with less than 3% of high outliers
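Jitter here is the variation in tuple inter-arrival times at the engine. As a minimal sketch (a hypothetical helper, not the measurement code behind these plots), per-gap jitter can be derived from receive timestamps:

```python
def jitter_ms(recv_times):
    """Per-gap jitter: deviation of each inter-arrival gap from the
    mean gap, in milliseconds. recv_times are receive timestamps in
    seconds; at least two timestamps are required."""
    gaps = [b - a for a, b in zip(recv_times, recv_times[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return [abs(g - mean_gap) * 1000.0 for g in gaps]
```

Averaging the returned values per node gives a figure comparable to the averages above; the rare multi-second gaps show up as large outliers.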
10. Cloud Performance: Network Measurements

[Figure: network-level round-trip time (ms) vs. application-level round-trip time (ms) for America, Asia and Europe, against the ideal line]

• Application-level delay includes processing time: t_received − t_sent
• Network-level delay between the source and the engine: RTT
• The cloud DC does not increase application-level delay
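One way to obtain the application-level figure (a sketch assuming synchronized clocks; `make_tuple` and `app_level_delay_ms` are hypothetical names, not the deck's code) is to embed the send timestamp in every tuple:

```python
import time

def make_tuple(payload):
    # Sender stamps each tuple with its wall-clock send time.
    return {"sent_at": time.time(), "payload": payload}

def app_level_delay_ms(tup):
    # Engine computes application-level delay on arrival as
    # t_received - t_sent (meaningful only with synchronized clocks).
    return (time.time() - tup["sent_at"]) * 1000.0
```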
11. Cloud Performance: Processing Measurements

• Goals
  – Explore performance variation with time of day (processing and latency)
  – Check whether cloud VMs can scale efficiently with a varying input rate
• Experimental set-up
  – Dataset
    • Esper benchmark tool
    • Stream of shares and stock values for a given symbol at a fixed rate (30,000 tuples/sec)
  – Submitter
    • 10 extra-large Amazon EC2 VMs: 15 GB, 8 ECU
  – Nodes
    • 10 small Amazon EC2 VMs: 1.7 GB, 1 ECU
12. Cloud Performance: Processing Measurements

[Figure: latency (ms) and throughput (tuples/s) over the time of day (24-hour format) on two days]

• Throughput remains relatively stable over the measurement period
• Latency suffers more from unpredictable outliers
• No obvious pattern correlates performance with time of day
13. Cloud Performance: Processing Measurements

[Figure: throughput (tuples/s) vs. input data rate (x10,000 tuples/s) for small and large VM instances]

• Cloud VMs can be used to scale efficiently with an increasing input rate
• The number of VMs required depends on their type, as expected
14. Outline (next: 2. Adaptive Cloud Stream Processing)
15. Adaptive Cloud Stream Processing

• Elastic stream processing system that scales the number of VMs to input stream rates
• Goals
  – Low latency with a given throughput
  – Keep VMs operating at their maximum processing capacity
• Workload is partitioned and balanced across multiple VMs
• Many VMs are available to scale up and down with workload demands
• A collector gathers results from the engines and processes additional queries

[Figure: stream sources 1 and 2 feed sub-queries 1 and 2 to engine VMs; a collector VM merges their results]
16. Adaptive Cloud Stream Processing: Algorithm I

[Figure: a tuple submitter feeds N Esper VMs at a given input rate; the per-VM processing rates are summed to derive the extra rate and, divided by N, the average rate]

• Gathering and calculation
  – Gathers processing rates from VMs
  – Obtains
    • Total extra processing rate (Extra rate)
    • Average processing rate per VM (Average rate)
17. Adaptive Cloud Stream Processing: Algorithm II

[Figure: decision flow — if the extra rate is greater than 0, scale up using Extra rate / Average rate and store the average rate; otherwise scale down using Input rate / maximum average rate; return the new VM count N']

• Decision stage
  – Calculates the new number of machines (N')
  – Scale up
    • Stores the average rate as the maximum average rate
  – Scale down
    • Uses the last maximum average rate
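Both stages can be sketched in Python, following Algorithm 1 on the appendix slide (the per-VM rate readings are passed in as a list; this is an illustrative sketch of the listing, not the deployed code):

```python
import math

def provision(vm_rates, total_in_rate, max_rate_per_vm):
    """One control step: return (N', updated maximum average rate).

    vm_rates        -- observed processing rate (tuples/s) of each VM
    total_in_rate   -- current total input rate (tuples/s)
    max_rate_per_vm -- last stored maximum average rate per VM (state)
    """
    n = len(vm_rates)
    exp_rate_per_vm = total_in_rate // n          # expected share per VM
    total_extra = sum(exp_rate_per_vm - r for r in vm_rates)
    avg_rate_per_vm = sum(vm_rates) // n          # measured average per VM

    if total_extra > 0:    # VMs fall short of the input rate: scale up
        n_new = n + math.ceil(total_extra / avg_rate_per_vm)
        max_rate_per_vm = avg_rate_per_vm         # store as maximum
    elif total_extra < 0:  # spare capacity: scale down
        n_new = math.ceil(total_in_rate / max_rate_per_vm)
    else:
        n_new = n
    return n_new, max_rate_per_vm
```

For example, three VMs each processing 8,000 tuples/s against a 30,000 tuples/s input leave 6,000 tuples/s unhandled, so one extra VM is requested.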
18. Outline (next: 3. Experimental Evaluation)
19. Experimental Evaluation: Description

• Goals
  – Adaptability of the algorithm to varying input rates
  – Implications of adaptation for stream processing performance
• Experimental set-up
  – Integrated with the Esper processing engine
  – Framework to control VMs and collect performance metrics
    • Throughput, processing latency and network latency
    • Collection of shell scripts
  – Deployed on Amazon EC2

[Figure: tuple submitters send random values of different stock symbols into Esper engine VMs on Amazon EC2, each running the same query (maximum value of each stock symbol per second); a controller VM collects and merges all results]
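The continuous query each engine runs (the maximum value of each stock symbol per second) can be sketched in plain Python; this is an illustrative stand-in for the actual Esper EPL statement, which the deck does not show:

```python
from collections import defaultdict

def max_per_symbol_per_second(tuples):
    """tuples: iterable of (timestamp_sec, symbol, value).
    Returns {second: {symbol: max value seen in that second}},
    mimicking a one-second batch window grouped by symbol."""
    windows = defaultdict(dict)
    for ts, symbol, value in tuples:
        sec = int(ts)            # bucket the tuple into its second
        prev = windows[sec].get(symbol)
        if prev is None or value > prev:
            windows[sec][symbol] = value
    return {sec: dict(best) for sec, best in windows.items()}
```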
20. Experimental Evaluation: Results

[Figure: input rate (tuples/sec), tuples dropped and number of VMs over time (sec), for small and large instances]

• Processing latency remains low: 7-28 μs
• The system scales the number of VMs up and down as required by the input rate
• There is a significant reaction delay before VMs are scaled up and down
• VMs are pre-allocated
21. Outline (next: 4. Future Work and Conclusions)
22. Future Work

• Investigate ways to reduce the reaction delay to performance violations
• Predict the future behaviour of input data rates
• Investigate cost models for the allocation of small and large VM instances
• Evaluate our system in other cloud environments
• Extensive evaluation over longer periods of time and different VM types
23. Conclusions

• An adaptive approach to provisioning stream processing systems in the cloud
• Public clouds are suitable for stream processing
• Network latency is the dominating factor in public clouds
• Our approach can adaptively scale the number of VMs to input rates
• Processing latency and data loss remain low

Javier Cerviño
email: jcervino@dit.upm.es

Thank you! Questions?
24. Adaptive Cloud Stream Processing Algorithm

Algorithm 1: Adaptive provisioning of a cloud-based DSPS

Require: totalInRate, N, maxRatePerVM
Ensure: N' s.t. projRatePerVM · N' = totalInRate
 1: expRatePerVM = ⌊totalInRate / N⌋
 2: totalExtraRateForVMs = 0; totalProcRate = 0
 3: for all deployed VMs do
 4:   totalExtraRateForVMs += expRatePerVM − getRate(VM)
 5:   totalProcRate += getRate(VM)
 6: end for
 7: avgRatePerVM = ⌊totalProcRate / N⌋
 8: if totalExtraRateForVMs > 0 then
 9:   N' = N + ⌈totalExtraRateForVMs / avgRatePerVM⌉
10:   maxRatePerVM = avgRatePerVM
11: else if totalExtraRateForVMs < 0 then
12:   N' = ⌈totalInRate / maxRatePerVM⌉
13: end if
14: projRatePerVM = totalInRate / N'
15: return N'
25. Adaptive Cloud Stream Processing Algorithm

getExpectedVMs(totalInRate, currentVMs) {
  // Input rate calculations
  expRatePerVM = totalInRate / currentVMs
  for each deployed VM {
    vmRate = getRate(VM)
    totalExtraRate += (expRatePerVM - vmRate)
    totalProcRate += vmRate
  }
  avgRatePerVM = totalProcRate / currentVMs

  // Increasing input rate: scale up
  if (totalExtraRate > 0) {
    expectedVMs = currentVMs + totalExtraRate / avgRatePerVM
    maxRatePerVM = avgRatePerVM
  }
  // Decreasing input rate: scale down
  else if (totalExtraRate < 0) {
    expectedVMs = totalInRate / maxRatePerVM
  }
  return expectedVMs
}