Stateless load balancing - Early results
August 2013
[research early results]
A distributed algorithm for
STATELESS LOAD BALANCING
Showing results
Abstract: Distributing data packets across stations with scalable and optimal store and retrieval functionality, ensuring load balance without collecting load information from stations.
Keywords: Distributed Systems, Algorithms, Big Data, Cloud, Balancing
University tutor: Prof. Eng. O. Tomarchio
Università di Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica
Company supervisor: Eng. A. Maddalena
Medilink srl, Team Leader - R&D Manager
Trainee: Dr. A. Tino
Università di Catania, Facoltà di Ingegneria Informatica Specialistica
Medilink srl
Sezione Ricerca e Sviluppo
STATUS: SIMULATING...
Some real-scenario simulations have been run over several rings. Early results are shown here while other simulations are still running.
about simulations: state of the art
Ring size: Every simulation creates a fixed number of stations, each of which is assigned its own hash (its identifier in the ring).
Packets volume: When running, a simulation generates a fixed (usually very high) number of random packets to be fed to the ring.
Packet size: When running, a simulation generates packets of fixed length. Each packet is fed to a hash function.
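The setup described above can be sketched in a few lines. This is a minimal sketch, not the simulator's code: ToyRing, make_ring and simulate_loads are hypothetical names, std::hash stands in for the real hash function, and the routing rule (first station identifier at or above the packet hash, wrapping around) is an assumption that the slides do not state. The ring must contain at least one station.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the ring: each station owns one identifier
// (its "personal hash"). Identifiers are kept sorted in ascending order.
struct ToyRing {
    std::vector<std::uint64_t> station_ids;

    // Assumed routing rule (not stated in the slides): a packet goes to the
    // first station whose identifier is >= the packet hash, wrapping to
    // station 0 when the hash is larger than every identifier.
    std::size_t route(std::uint64_t packet_hash) const {
        auto it = std::lower_bound(station_ids.begin(), station_ids.end(), packet_hash);
        if (it == station_ids.end()) return 0;
        return static_cast<std::size_t>(it - station_ids.begin());
    }
};

// Builds a ring of n stations with random identifiers from a seeded generator.
ToyRing make_ring(std::size_t n, std::uint64_t seed) {
    std::mt19937_64 gen(seed);
    ToyRing ring;
    ring.station_ids.resize(n);
    for (auto& id : ring.station_ids) id = gen();
    std::sort(ring.station_ids.begin(), ring.station_ids.end());
    return ring;
}

// Generates m random packets of p bytes each, hashes each one and counts
// how many packets land on every station.
std::vector<std::size_t> simulate_loads(const ToyRing& ring, std::size_t m,
                                        std::size_t p, std::uint64_t seed) {
    std::mt19937_64 gen(seed);
    std::uniform_int_distribution<int> byte(0, 255);
    std::vector<std::size_t> loads(ring.station_ids.size(), 0);
    std::string packet(p, '\0');
    for (std::size_t i = 0; i < m; ++i) {
        for (auto& c : packet) c = static_cast<char>(byte(gen));
        // std::hash is only a placeholder for the real hash function.
        const std::uint64_t h = std::hash<std::string>{}(packet);
        ++loads[ring.route(h)];
    }
    return loads;
}
```

Calling simulate_loads(make_ring(N, seed), M, P, seed) returns one packet count per station; these counts are the load levels whose spread the simulations study.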
Medilink srl
Sezione Ricerca e Sviluppo
Tutor:
Prof. Eng. Orazio Tomarchio
DIIEI
Università di Catania
Supervisor:
Eng. Andrea Maddalena
Software Development
Medilink srl
Research trainee:
Dr. Andrea Tino
Università degli Studi di Catania
Ingegneria Informatica
TUNING SIMULATIONS
Simulations have been designed to be scalable, configurable and flexible. Basic functionality is enabled at the moment; further improvements are planned.
simulation parameters
Number of stations: N
Number of generated packets: M
Packet length/size: P
simulation tech details
Developed in C++
Template based, fast, with support for reals and big-reals
Results output to files: raw + stats
[Figure: ring of stations s1 to s6, with labels T1 to T6, over the hash range 0 to hmax-1]
SIMULATIONS OVERVIEW
Simulating small, medium-sized and large rings against small to large packet volumes and sizes.
Ring size: N (small vs big)
Small rings simulated: N10 to N30
Mid-sized rings simulated: N30 to N50
Big rings simulated: N50 to N100
Packets volume: M (few vs many)
Low volumes: M1,000 to M100,000
High volumes: over M1,000,000
Packet length/size (bytes): P (short vs long)
Simulated range: P10 to P1000
INGREDIENTS 4 SIMULATIONS
Simulations are designed to be flexible, configurable, fast, type independent and extensible.
Intel Threading Building Blocks (TBB)
Intel Core i7 computer architecture
GNU C/C++ compiler (gcc/g++)
Boost C++ Libraries
Tina’s random number generators
Unix systems: RHEL-based CentOS
OpenSSL cryptographic libraries
Circos circo-diagrams drawing library
SIMULATIONS ARCHITECTURE
When simulating an NMP scenario, the whole process consists of different stages.
Configuration through bash scripts: bash scripts change the source code to configure each simulation (dhtlb::SIM_ENV_N, dhtlb::SIM_ENV_M, dhtlb::SIM_ENV_PKTSIZE).
Compilation, needed for fast sims: templated classes (on N, M, P as well) make compilation a required step.
Random hpartitions: bash scripts create random hpartitions for stations, which are passed to the simulations (dhtlb::Ring<N,M,P>::HPart, dhtlb::Ring<N,M,P>::seed).
st1: 0x0385fa6bc
st2: 0x746c6aa6b
...
stn: 0xa12345ddd
Simulation is executed: sims run on a machine dedicated to long-running tasks and use all of its cores.
Quantities are evaluated.
Results on files (dat, tab, kry): at the end of each simulation, data is written and summarized to files.
[Diagram labels: sh, hpp, cpp, obj, out; ./x.sh; g++ -lx; bin/cppsim; ./x.out; running; writing...]
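Why templated parameters force a rebuild can be seen in a small sketch. The names below live in a hypothetical toy namespace and only mirror the identifiers shown on the slide; the member layout and the total_bytes helper are assumptions made for illustration, not the simulator's actual definitions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace toy {

// Values a configuration script might rewrite before each build
// (hypothetical counterparts of the SIM_ENV_* names on the slide).
constexpr std::size_t SIM_ENV_N = 30;          // number of stations
constexpr std::size_t SIM_ENV_M = 1000000;     // number of packets
constexpr std::size_t SIM_ENV_PKTSIZE = 1000;  // bytes per packet

// N, M and P are template parameters, so every size is fixed at compile
// time and changing any of them requires recompiling.
template <std::size_t N, std::size_t M, std::size_t P>
struct Ring {
    static constexpr std::size_t stations = N;
    static constexpr std::size_t packets = M;
    static constexpr std::size_t packet_size = P;

    std::array<std::uint64_t, N> hpart{};  // one partition value per station (assumed layout)
    std::array<std::size_t, N> load{};     // packets assigned to each station

    // Total payload bytes such a simulation would generate.
    static constexpr std::size_t total_bytes() { return M * P; }
};

using ConfiguredRing = Ring<SIM_ENV_N, SIM_ENV_M, SIM_ENV_PKTSIZE>;

}  // namespace toy
```

Because the array sizes depend on N, the compiler can lay out all storage statically, which is one plausible reason for trading a recompilation step for speed.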
ANATOMY OF A SIMULATION (to be continued)
When a simulation starts, many things happen...
[Diagram stages: seed, Memory initialization, Packet generation, Packets handling, Hash/PhiH computation, Routing, Resource collection, Result in memory, Data manipulation, Results on files (dat, tab, kry)]
A packet is created in every parallel cycle!
A considerable amount of memory is needed to run a simulation.
The core part of the simulation is handled in parallel thanks to its Monte Carlo scheme.
ANATOMY OF A SIMULATION (end)
A better scheme makes it possible to run more simulations on the same set of packets.
[Diagram stages: seed, Memory initialization, Packets generation, Packets handling, Hash/PhiH computation, Routing, Resource collection, Result in memory, Data manipulation, Results on files (dat, tab, kry)]
Packets are ALL generated at once first.
Each parallel cycle will find its corresponding packet in a specifically assigned array position.
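The "generate everything first, then index by cycle" scheme can be sketched as follows. This is an illustration under assumptions: generate_packets, run_cycle and run_cycles are hypothetical names, std::hash stands in for the real hash functions, and the cycles are run sequentially here. A parallel loop would hand disjoint index ranges to different workers.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Generates all m packets (p bytes each) up front from one seed, so the
// same packet set can be reused by several simulations.
std::vector<std::string> generate_packets(std::size_t m, std::size_t p,
                                          std::uint64_t seed) {
    std::mt19937_64 gen(seed);
    std::uniform_int_distribution<int> byte(0, 255);
    std::vector<std::string> packets(m, std::string(p, '\0'));
    for (auto& packet : packets)
        for (auto& c : packet) c = static_cast<char>(byte(gen));
    return packets;
}

// One cycle: reads only packets[i] and writes only hashes[i].
// std::hash is a placeholder for the real hash functions.
void run_cycle(const std::vector<std::string>& packets, std::size_t i,
               std::vector<std::size_t>& hashes) {
    hashes[i] = std::hash<std::string>{}(packets[i]);
}

// Runs cycles in [begin, end). Disjoint ranges touch disjoint slots, so
// they could be given to different workers without locking.
void run_cycles(const std::vector<std::string>& packets, std::size_t begin,
                std::size_t end, std::vector<std::size_t>& hashes) {
    for (std::size_t i = begin; i < end; ++i) run_cycle(packets, i, hashes);
}
```

Since each cycle owns exactly one array slot, the result does not depend on the order in which ranges are processed, and regenerating with the same seed reproduces the same packet set.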
OTHER FACTS
The simulator has several aspects to be detailed and many extensible components.
advantages
Packet generation policy can be chosen through compilation flags.
Templated classes support real and big-real hash values as well as the packets' internal representation.
Parallel vs. single implementations: simulations can be run in parallel or not by using different classes.
High performance on Intel multi-core architectures when in parallel mode.
drawbacks
To achieve fast implementations, recompilation is necessary every time a simulation parameter changes. Compilation does not take much time, and bash scripts handle it.
Parallel mode is only available on Intel architectures. Single mode is available on all architectures.
SHOWING RESULTS
Simulations are still ongoing; good results so far, detailed in the next slides.
Showing heavy simulations’ results collected so far.
Showing simulation machine details...
Machine architecture: HP ProLiant DL180 G6
Operating system: CentOS 6 (RHEL)
CPU details: Intel Xeon (4-Core)
Overall sim time: 19d
Overall # of sims: 84
Avg time per sim: 3h
Overall generated pkts #: 14.1M
and still counting... (data updating in time)
GROUPING SIMULATIONS
For a better analysis, simulations are grouped into clusters based on ring size.
N10: 41 sims
N30: 60 sims
N50: 10 sims
N100: 10 sims
Focusing on different rings: from 10 to 100 stations.
Packets volume ranging from 100k to 3M.
N10MxPy SIMULATIONS
Ring size 10. Many packet volumes.
Sims #: 41
Packets volumes: 31x100k, 10x1M
Overall sim time: 13h
Generated pkts #: 13.1M
[Plot: Dispersion against std. deviation for H. Axes: Std deviation for H, Dispersion for H]
[Plot: Dispersion against std. deviation for PHI. Axes: Std deviation for PHI, Dispersion for PHI]
[Plots: Relating HPART variance to PHI variance; Std. deviation for PHI & HPART for each simulation. Axis labels: Std deviation for HPART (mult. coeff: E37), Std deviation for PHI]
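The slides do not say how the standard deviations on these plots are computed. As a hedged illustration only, the sketch below assumes the quantity is the population standard deviation of the per-station packet counts (the load levels); load_std_dev is a hypothetical helper name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Population standard deviation of per-station packet counts.
// Returns 0 for an empty input.
double load_std_dev(const std::vector<std::size_t>& loads) {
    if (loads.empty()) return 0.0;
    double mean = 0.0;
    for (std::size_t v : loads) mean += static_cast<double>(v);
    mean /= static_cast<double>(loads.size());
    double sum_sq = 0.0;
    for (std::size_t v : loads) {
        const double d = static_cast<double>(v) - mean;
        sum_sq += d * d;
    }
    return std::sqrt(sum_sq / static_cast<double>(loads.size()));
}
```

Under this reading, a perfectly balanced ring (every station holding the same number of packets) has a standard deviation of 0, and larger values mean more uneven load levels.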
N30MxPy SIMULATIONS
Ring size 30. Many packet volumes.
Sims #: 60
Packets volumes: 49x1M, 10x3M, 1x10M
Overall sim time: 17d
Generated pkts #: 89.0M
[Plot: Dispersion against std. deviation for H. Axes: Std deviation for H, Dispersion for H]
[Plot: Dispersion against std. deviation for PHI. Axes: Std deviation for PHI, Dispersion for PHI]
[Plots: Relating HPART variance to PHI variance; Std. deviation for PHI & HPART for each simulation. Axis labels: Std deviation for HPART (mult. coeff: E36), Std deviation for PHI]
N50MxPy SIMULATIONS
Ring size 50. Many packet volumes.
Sims #: 10
Packets volumes: 10x1M
Overall sim time: 3d
Generated pkts #: 10.0M
[Plot: Dispersion against std. deviation for H. Axes: Std deviation for H, Dispersion for H]
[Plot: Dispersion against std. deviation for PHI. Axes: Std deviation for PHI, Dispersion for PHI]
[Plots: Relating HPART variance to PHI variance; Std. deviation for PHI & HPART for each simulation. Axis labels: Std deviation for HPART (mult. coeff: E36), Std deviation for PHI]
N30 SIMS CIRCO-DIAGS + LLEVELS (to be continued)
Load balancing performed! Showing load-levels and circo-diagrams for pkt-migrations.
How to read this diagram? See slide “How to read simulations”
[Figures: circo-diagrams and load-level charts for five N30M1MP1k simulations]
N30M1MP1k 2013.11.25:084556
N30M1MP1k 2013.11.25:104210
N30M1MP1k 2013.11.25:124859
N30M1MP1k 2013.11.25:145003
N30M1MP1k 2013.11.25:165131
N30 SIMS CIRCO-DIAGS + LLEVELS (to be continued)
Load balancing performed! Showing load-levels and circo-diagrams for pkt-migrations.
How to read this diagram? See slide “How to read simulations”
[Figures: circo-diagrams and load-level charts for five N30M1MP1k simulations]
N30M1MP1k 2013.11.25:185047
N30M1MP1k 2013.11.25:204218
N30M1MP1k 2013.11.25:224923
N30M1MP1k 2013.11.26:005204
N30M1MP1k 2013.11.26:025536
N30 SIMS CIRCO-DIAGS + LLEVELS (end)
Load balancing performed! Showing load-levels and circo-diagrams for pkt-migrations.
How to read this diagram? See slide “How to read simulations”
[Figures: circo-diagrams and load-level charts for five N30M1MP1k simulations]
N30M1MP1k 2013.11.26:045420
N30M1MP1k 2013.11.26:065230
N30M1MP1k 2013.11.26:084514
N30M1MP1k 2013.11.26:103934
N30M1MP1k 2013.11.26:123917
FOCUSING ON A N30M1MP1k
N30M1MP1k 2013.11.28:063449
Showing an ideal case: almost perfect balancing, levels get more homogeneous.
How to read this diagram? See slide “How to read simulations”
[Figure: H load-levels]
[Figure: PHI load-levels]
FOCUSING ON A N30M1MP1k
N30M1MP1k 2013.11.27:204818
Showing a NOT SO ideal case: almost perfect balancing but one level drops significantly.
How to read this diagram? See slide “How to read simulations”
Sometimes strange behaviors appear: stations with very narrow coverage are not correctly balanced.
Symptom: very high hpart std. dev.
std_dev = 1.2e37 (MD5: 0..2^128-1)
[Figure: H load-levels]
[Figure: PHI load-levels]
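To put the reported value in perspective: the MD5 range holds 2^128 (about 3.4e38) values, so on a 30-station ring the mean share per station is 2^128/30, about 1.13e37. Assuming std_dev is expressed in hash units over that range, as the slide suggests, a value of 1.2e37 is roughly as large as the mean share itself. The sketch below only reproduces this arithmetic; md5_space and mean_partition_width are hypothetical helper names.

```cpp
#include <cmath>
#include <cstddef>

// Size of the MD5 hash space, 2^128, as a floating-point value.
long double md5_space() { return std::ldexp(1.0L, 128); }

// Mean share of the hash space per station when n stations split it.
long double mean_partition_width(std::size_t n) {
    return md5_space() / static_cast<long double>(n);
}
```

With partition widths that spread out, some stations necessarily cover a very narrow slice of the ring, which matches the narrow-coverage behavior noted on the slide.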
SEARCHING FOR HIDDEN PATTERNS
Using circo-diagrams it is possible to reveal hidden patterns.
Wide-coverage stations are more likely to donate packets to other stations.
Narrow-coverage stations are more likely to receive packets from other stations.
HOW TO READ SIMULATIONS
Simulation data are presented using dimension reduction, so that patterns can be grasped more quickly. This slide is a short guide to the most important quantities.
Circo-diagrams show migration flows.
Migration flow: the number of pkts virtually moving from one station to another between 2 different station assignments.
Assignments use the H and PHI hash functions.
Function H: MD5 (128-bit cryptographic hash function).
Function PHI: balancing hash function (secret).
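The migration-flow definition translates directly into a station-by-station count matrix. The sketch below is an illustration under assumptions: it takes the two assignments as plain index vectors (which station each packet lands on under H and under PHI), and migration_flows and net_donation are hypothetical names. How PHI assigns packets is not shown, since the slides keep it secret.

```cpp
#include <cstddef>
#include <vector>

using FlowMatrix = std::vector<std::vector<std::size_t>>;

// by_h[k] and by_phi[k] are the stations that packet k is assigned to
// under H and under PHI. flows[i][j] counts packets moving from station i
// to station j; the diagonal holds packets that stay where they are.
// Throws std::out_of_range if a station index is >= stations.
FlowMatrix migration_flows(const std::vector<std::size_t>& by_h,
                           const std::vector<std::size_t>& by_phi,
                           std::size_t stations) {
    FlowMatrix flows(stations, std::vector<std::size_t>(stations, 0));
    const std::size_t count =
        by_h.size() < by_phi.size() ? by_h.size() : by_phi.size();
    for (std::size_t k = 0; k < count; ++k)
        ++flows.at(by_h[k]).at(by_phi[k]);
    return flows;
}

// Packets donated by station s minus packets it receives.
// Positive: s is a net donor. Negative: s is a net receiver.
long long net_donation(const FlowMatrix& flows, std::size_t s) {
    long long net = 0;
    for (std::size_t j = 0; j < flows.size(); ++j) {
        if (j == s) continue;
        net += static_cast<long long>(flows.at(s).at(j));
        net -= static_cast<long long>(flows.at(j).at(s));
    }
    return net;
}
```

Each off-diagonal entry corresponds to one ribbon of a circo-diagram, and the sign of net_donation separates donor stations from receiver stations.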