An Adaptive Gossip-Based Dissemination Protocol for Multi-Source Message Streams
1. PULP
An Adaptive Gossip-Based
Dissemination Protocol for
Multi-Source Message Streams
Pascal Felber
A.-M. Kermarrec, L. Leonini, E. Rivière, S. Voulgaris
Pascal.Felber@unine.ch
http://iiun.unine.ch/
2. Introduction
l Epidemic protocols are widely used for
information dissemination
l Algorithmic simplicity
l Robustness (failures of nodes and links)
l Adapted to large-scale dynamic networks
l Yet, they have a number of drawbacks
l Bandwidth overutilization (redundant messages)
l High message dissemination latency
l Objectives of the PULP protocol
l Hybrid protocol bandwidth- & latency-efficient
Pulp: An Adaptive Gossip-Based Dissemination Protocol — P. Felber 2
3. Epidemic Protocols: Push vs. Pull
l Push
l At 1st reception, every node forwards message to
f other nodes, at most TTL times
l Low latency, high redundancy
l Pull
l Periodically, every node contacts another node
and asks for missing messages
l High latency, low redundancy
l Both approaches rely on a sampling service
to obtain random nodes
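The trade-off between the two styles can be explored with a small simulation. The following is a minimal sketch, not the authors' evaluation code: the function names are hypothetical, uniform random choice stands in for the sampling service, and rounds are synchronous.

```python
import random

def simulate_push(n, fanout, ttl, seed=0):
    """Push one message from node 0 for `ttl` hops.
    Returns (hit_ratio, duplicate_ratio) where duplicate_ratio is the
    fraction of sent messages that reached an already informed node."""
    rng = random.Random(seed)
    informed = {0}
    frontier = [0]          # nodes that received the message in the last hop
    sent = 0
    for _ in range(ttl):
        next_frontier = []
        for _node in frontier:
            for _ in range(fanout):
                target = rng.randrange(n)   # assumption: uniform sampling service
                sent += 1
                if target not in informed:
                    informed.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    duplicates = sent - (len(informed) - 1)
    return len(informed) / n, duplicates / max(sent, 1)

def simulate_pull(n, rounds, seed=0):
    """Each round, every uninformed node asks one random node for the message.
    Returns the fraction of informed nodes after `rounds` rounds."""
    rng = random.Random(seed)
    informed = {0}
    for _ in range(rounds):
        newly = {node for node in range(n)
                 if node not in informed and rng.randrange(n) in informed}
        informed |= newly
    return len(informed) / n

if __name__ == "__main__":
    print(simulate_push(1000, fanout=5, ttl=4))
    print(simulate_pull(1000, rounds=10))
```

Running both with the same network size shows the pattern stated above: push spreads quickly but a growing share of its messages hit nodes that already have the message, while pull sends no unsolicited copies but needs several periods to get started.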
6. Context and Objectives
l Dissemination of streams of small messages
l Sources may be any node
l Dissemination from all to all
l Variable publication frequency
l High frequency phases (e.g., react to event)
l Idle phases with no new message
l Objectives
l Low network cost, proportional to actual activity
l Low latency
l Robustness to failures and churn
7. The PULP Protocol
l Two-phase hybrid approach
1. Exponential growth phase (push)
l Inform sufficiently many nodes w/out redundancy
2. Quadratic shrinking phase (pull)
l Pull frequency driven by message activity
l Exploit sequences of messages
l Push messages carry information for pull phase
l Limits useless pulls
l Supports complete disseminations with low
cost and low latency
8. PULP: 1st Phase
l Objective: inform sufficiently many nodes
with negligible redundancy
l 4-5% of the network (based on observations)
l Size of network N estimated by sampling service
l Choose TTL and f such that: $c = \frac{1}{N} \sum_{i=1}^{TTL} f^i \approx 4.5\%$
l Subsets of nodes reached by different messages
are not correlated (random neighbor selection)
l Forwarded messages embed information about
previously received messages (drive 2nd phase)
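The choice of TTL and f can be sketched as a small search. This is an illustrative sketch only: the function names and the search bounds are hypothetical, and the coverage formula assumes that the push phase produces no duplicates, so it is an upper bound on the fraction of nodes actually reached.

```python
def push_coverage(n, fanout, ttl):
    """Fraction of an n-node network reached by the push phase,
    assuming no duplicates: (1/n) * sum of fanout**i for i = 1..ttl."""
    return sum(fanout ** i for i in range(1, ttl + 1)) / n

def choose_push_params(n, target=0.045, max_fanout=10, max_ttl=10):
    """Return the (fanout, ttl) pair whose coverage is closest to `target`.
    The bounds max_fanout and max_ttl are arbitrary illustration values."""
    best = None
    for fanout in range(2, max_fanout + 1):
        for ttl in range(1, max_ttl + 1):
            err = abs(push_coverage(n, fanout, ttl) - target)
            if best is None or err < best[0]:
                best = (err, fanout, ttl)
    return best[1], best[2]

if __name__ == "__main__":
    n = 1000
    fanout, ttl = choose_push_params(n)
    print(fanout, ttl, push_coverage(n, fanout, ttl))
```

For comparison, f=3 and TTL=3 in a 1000-node network give (3 + 9 + 27) / 1000 = 3.9%, which is in the range quoted above.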
9. PULP: 2nd Phase
l Objective: limit useless pulls
l The protocol uses information about missing
messages (received during 1st phase)
l Pull frequency adapts according to:
l Missing messages
l Ratio of useful to useless pulls in last period
l Pull frequency increases when more messages are
being disseminated
l When there is little activity, pull frequency
depends on how useful previous pulls have been
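The adaptation rule can be sketched as follows. This is a reading of the AdaptFreq pseudo-code on the algorithm slide, not a reference implementation: the minus sign in the denominator and the "useless <= useful" comparison are assumptions where the extracted pseudo-code lost its operators, and the class name, field names, and period bounds are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PullState:
    pull_period: float = 30.0    # seconds; initial value from the variable list
    prev_missing_size: int = 0   # |missing| at the end of the last adjust period
    prev_useful: int = 0         # useful pull replies in the current period
    prev_useless: int = 0        # useless pull replies in the current period

def adapt_pull_period(state, missing_size, adjust_period, min_period, max_period):
    """Update state.pull_period once per adjust period and return it."""
    if missing_size > state.prev_missing_size:
        # More messages are being disseminated: pull fast enough to absorb
        # the observed growth within one adjust period (assumed reading).
        state.pull_period = adjust_period / (
            missing_size - state.prev_missing_size + state.prev_useful)
    elif missing_size > 0 and state.prev_useless <= state.prev_useful:
        state.pull_period *= 0.9   # pulls were mostly useful: pull more often
    else:
        state.pull_period *= 1.1   # pulls were mostly useless: back off
    # Keep the period within the configured bounds.
    state.pull_period = min(max(state.pull_period, min_period), max_period)
    state.prev_useful = 0
    state.prev_useless = 0
    state.prev_missing_size = missing_size
    return state.pull_period

if __name__ == "__main__":
    s = PullState()
    print(adapt_pull_period(s, missing_size=10, adjust_period=60.0,
                            min_period=0.5, max_period=60.0))
```

The multiplicative 0.9 and 1.1 steps make the period drift slowly during quiet phases, while the first branch lets it react in a single step when a burst of new message IDs appears.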
10. PULP: Data Structures
l Every node maintains a sorted list of
received messages
l Hp: Recent history (last messages)
l Tp ⊆ Hp: Trading window (available for others)
l Tp is embedded in messages sent by p to q
l If ∃ m ∈ Tp \ Hq then q can request m from p
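[Figure: messages received by p in order of reception over time t, oldest first: B3 A7 C3 B1 C6 B2 C5 A5 A6 C7 C8 B5 A9 C4. Hp (recent history) covers the most recent messages and Tp (trading window) lies within Hp.]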
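These structures can be sketched in a few lines. This is a hedged illustration: the class and function names are hypothetical, the sizes are arbitrary, and placing Tp at the newest end of Hp is an assumption (the slide only states Tp ⊆ Hp).

```python
from collections import deque

class History:
    """Ordered record of received message IDs (Hp) with a trading window (Tp)."""

    def __init__(self, history_size, window_size):
        assert 0 < window_size <= history_size
        self.recent = deque(maxlen=history_size)  # Hp, oldest first
        self.window_size = window_size

    def add(self, msg_id):
        # Record a message in reception order; old entries fall off the left.
        if msg_id not in self.recent:
            self.recent.append(msg_id)

    def trading_window(self):
        """Tp: assumed here to be the newest window_size IDs of Hp."""
        return list(self.recent)[-self.window_size:]

def requestable(sender_window, receiver_history):
    """IDs in Tp \\ Hq, in the sender's order: what q can request from p."""
    known = set(receiver_history.recent)
    return [m for m in sender_window if m not in known]

if __name__ == "__main__":
    p = History(history_size=10, window_size=4)
    for m in "B3 A7 C3 B1 C6 B2 C5 A5 A6 C7 C8 B5 A9 C4".split():
        p.add(m)
    q = History(history_size=10, window_size=4)
    q.add("C8")
    q.add("A9")
    print(p.trading_window())                  # Tp
    print(requestable(p.trading_window(), q))  # Tp \ Hq
```

The example replays the message sequence from the figure at node p; node q has seen only C8 and A9, so the other IDs in p's trading window are the ones it can ask for.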
11. PULP: Algorithm
3.4 Pulp: The Protocol
We now present a detailed description of the Pulp algorithm, which combines the push and pull components for disseminating a sequence of messages in a collaborative and decentralized fashion.
Algorithm 1 shows the pseudo-code of the Pulp protocol. Each peer P maintains a history of the messages it has recently received, denoted as HP. It additionally maintains a trading window, denoted as TP, containing the list of messages that are available to other nodes on request.
When a message is pushed to (or generated at) node P for the first time, P registers it in HP and, if the TTL has not been reached yet, forwards it to Fanout random other peers. We stress that obtaining the IP address of randomly selected peers is a trivial task thanks to Cyclon, as described in Section 3.2.
Fig. 3 Data structures of the Pulp algorithm. Note that messages come from multiple sources (here A, B, and C) and each node sorts them based on the order it received them (which is generally different for each node).
In forwarding a message to another peer Q, node P also forwards the IDs of messages in its trading window

Algorithm 1: Pulp algorithm on node P

Variables
  HP: History of (recently) received message IDs
  Δpull: Period of pull operations (initially 30s)
  missing: Set of message IDs known, but not yet received
  prevMissingSize: Size of missing at the end of last Δadjust period
  prevuseful: Number of useful pull replies during current Δadjust period
  prevuseless: Number of useless pull replies during current Δadjust period
  (Δadjust, TTL and Fanout are fixed protocol parameters)

// Invoked when a message is pushed to node P by node Q
function Push(msg, hops, Q, TQ)
  // Forward further if needed
  if msg received for the first time then
    add msg to HP
    if hops > 0 then
      invoke Push(msg, hops-1, P, TP) on Fanout random peers
  // Messages will be pulled at the next pulling period
  missing ← missing ∪ {m ∈ TQ : m ∉ HP} \ {msg}

// Periodic pulling of missing elements
thread PeriodicPull()
  do every Δpull seconds
    // Shuffling reduces the probability of receiving duplicates by pull
    shuffle missing
    invoke Pull(missing, P, TP) on a random node Q

// Invoked when a node Q requests a message from node P
function Pull(requested, Q, TQ)
  m ← 1st element in requested order ∈ TP, or ⊥ if none
  invoke PullReply(m, P, TP) on Q

// Receive a reply to a pull request from node P
function PullReply(msg, Q, TQ)
  if msg = ⊥ ∨ msg ∈ HP then
    prevuseless ← prevuseless + 1
  else
    add msg to HP
    missing ← missing ∪ {m ∈ TQ : m ∉ HP} \ {msg}
    prevuseful ← prevuseful + 1

// Periodic adjustment of pulling period for node P
thread AdaptFreq()
  do every Δadjust seconds
    if |missing| > prevMissingSize then
      Δpull ← Δadjust / (|missing| − prevMissingSize + prevuseful)
    else
      if |missing| > 0 ∧ prevuseless ≤ prevuseful then
        Δpull ← Δpull × 0.9
      else
        Δpull ← Δpull × 1.1
    Δpull ← max(Δpull, Δpull_min)
    Δpull ← min(Δpull, Δpull_max)
    prevuseless ← 0
    prevuseful ← 0
    prevMissingSize ← |missing|
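The push and pull handlers of the pseudo-code can be exercised in a single process. The sketch below is an assumption-laden illustration, not the authors' implementation: message IDs double as message bodies, remote invocations are direct method calls, a shared peer list stands in for the sampling service, the trading window is taken to be the newest entries of the history, and all class and method names are hypothetical.

```python
import random

class Node:
    def __init__(self, name, peers, fanout=3, window=4, seed=0):
        self.name = name
        self.peers = peers      # shared list of all nodes (sampling stand-in)
        self.fanout = fanout
        self.window = window
        self.history = []       # H_P: received IDs in reception order
        self.missing = []       # IDs known but not yet received
        self.useful = 0         # useful pull replies in the current period
        self.useless = 0        # useless pull replies in the current period
        self.rng = random.Random(seed)

    def trading_window(self):
        # T_P: assumed to be the newest `window` entries of H_P.
        return self.history[-self.window:]

    def _learn(self, advertised, msg):
        # missing <- (missing U {m in T_Q : m not in H_P}) \ {msg}
        for m in advertised:
            if m != msg and m not in self.history and m not in self.missing:
                self.missing.append(m)
        if msg in self.missing:
            self.missing.remove(msg)

    def push(self, msg, hops, sender_window):
        if msg not in self.history:            # first reception
            self.history.append(msg)
            if hops > 0:
                others = [p for p in self.peers if p is not self]
                k = min(self.fanout, len(others))
                for peer in self.rng.sample(others, k):
                    peer.push(msg, hops - 1, self.trading_window())
        self._learn(sender_window, msg)

    def pull(self, requested):
        # Serve the first requested ID found in our trading window, else None.
        tw = self.trading_window()
        return next((m for m in requested if m in tw), None), tw

    def periodic_pull(self):
        # One iteration of the periodic pull thread (timer omitted).
        self.rng.shuffle(self.missing)
        target = self.rng.choice([p for p in self.peers if p is not self])
        msg, their_window = target.pull(list(self.missing))
        self.pull_reply(msg, their_window)

    def pull_reply(self, msg, sender_window):
        if msg is None or msg in self.history:
            self.useless += 1
        else:
            self.history.append(msg)
            self._learn(sender_window, msg)
            self.useful += 1

if __name__ == "__main__":
    peers = []
    a, b = Node("a", peers), Node("b", peers)
    peers.extend([a, b])
    a.history = ["A1", "A2"]
    b.push("A2", 0, a.trading_window())   # b learns that A1 exists
    b.periodic_pull()                     # b pulls A1 from a
    print(b.history, b.missing, b.useful)
```

The usage example shows the link between the two phases: a pushed message carries the sender's trading window, which tells the receiver what to pull next.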
12. Evaluation
l Conducted with SPLAY
l 1000 nodes in a cluster
l Reproducing real churn (OverNet trace)
l 300 nodes from PlanetLab
l Heavily loaded machines
l Messages sent from random nodes
l Communication over UDP (i.e., unreliable)
l Metrics: number of receptions (coverage),
latency, evolution of pull frequency
19. Comparison with Pull-/Push-only
[Figure: four panels sharing a horizontal axis from 0 to 600, with a "Message sending rate" strip marking phases of 1 msg/2 s and 1 msg/20 s.
PULP (Fanout=3, TTL=3): Reception Delays Distribution (seconds; max, 90th, 75th, 50th, 25th, and 5th percentiles)
Only-pull: Reception Delays Distribution (seconds)
Only-push (Fanout=5, TTL=4): Reception Delays Distribution (seconds)
Only-push (Fanout=5, TTL=4): Hit and Duplicate ratios (ratio; Hit Ratio, Dup Ratio)]
20. Conclusion
l PULP is a lightweight protocol that combines
push and pull dissemination
l Handles streams of messages from multiple
sources
l Negligible amounts of redundant messages
l Low dissemination latency thanks to adaptive
pull frequency
l Adapted to the conditions of real networks
l Efficient, robust, churn-tolerant