ACM/SIGCOMM 2010 – New Delhi, India
The Little Engine(s)
that could: Scaling
Online Social Networks
Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang
Nikos Laoutaris, Parminder Chhabra and Pablo Rodriguez
Telefonica Research, Barcelona
Problem
Solution
Implementation
Conclusions
Scalability is a pain: the designers’ dilemma
“Scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged”
The designers’ dilemma:
wasted complexity and long
time to market
vs.
short time to market but risk
of death by success
Towards transparent scalability
The advent of the Cloud:
virtualized machinery + gigabit network
Hardware scalability
Application scalability
Elastic resource allocation for the Presentation and the Logic layers: more machines when load increases.
– Presentation and Logic layers: components are stateless and therefore independent and duplicable.
– Data layer: components are interdependent and therefore not duplicable on demand.
Scaling the data backend...
• Full replication
– Load of read requests decreases with the number of servers
– State does not decrease with the number of servers
– Maintains data locality
Operations on the data (e.g. a query) can be resolved in a single
server from the application perspective
[Figure: centralized setup (application logic on a single data store) vs. distributed setup (application logic on several data stores)]
• Horizontal partitioning or sharding
– Load of read requests decreases with the number of servers
– State decreases with the number of servers
– Maintains data locality as long as…
… the splits are disjoint (independent)
Bad news for OSNs!!
Why is sharding not enough for OSNs?
• Shards in an OSN can never be disjoint because:
– Operations on user i require access to the data of other users at least one hop away.
• From graph theory:
– If the social graph has a single connected component, there is no partition into two or more parts that keeps every node in the same part as all of its neighbors.
Data locality cannot be maintained by partitioning a social network!!
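The graph-theory argument can be checked by brute force on a small example. The sketch below assumes a hypothetical four-user friendship graph (it is not the graph drawn in the slides) and enumerates every assignment of users to two servers; all names are illustrative.

```python
from itertools import product

# Hypothetical connected friendship graph: a path 0 - 1 - 2 - 3.
EDGES = [(0, 1), (1, 2), (2, 3)]
USERS = [0, 1, 2, 3]


def cut_edges(assignment, edges):
    """Return the edges whose endpoints live on different servers."""
    return [(u, v) for u, v in edges if assignment[u] != assignment[v]]


def locality_preserving_splits(users, edges, n_servers=2):
    """Enumerate assignments that use at least two servers and cut no edge."""
    good = []
    for servers in product(range(n_servers), repeat=len(users)):
        if len(set(servers)) < 2:
            continue  # everything on one server is not a partition
        assignment = dict(zip(users, servers))
        if not cut_edges(assignment, edges):
            good.append(assignment)
    return good


# For a connected graph the list is empty: every real split cuts an edge.
print(locality_preserving_splits(USERS, EDGES))  # []
```

Only when the graph falls apart into several components (for example edges `[(0, 1), (2, 3)]`) does a split exist that cuts no friendship.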
A very active area…
• Online Social Networks are massive systems spanning hundreds
of machines in multiple datacenters
• Partition and placement to minimize network traffic and
delay
– Karagiannis et al. “Hermes: Clustering Users in Large-Scale
Email Services” -- SoCC 2010
– Agarwal et al. “Volley: Automated Data Placement of Geo-
Distributed Cloud Services” -- NSDI 2010
• Partition and replication towards scalability
– Pujol et al. “Scaling Online Social Networks without Pains” -- NetDB / SOSP 2009
Solution
OSN’s operations 101
[Figure: a social graph of ten users (0–9). One user is “Me”, with an Inbox and a list of Friends. Posts are stored under keys (key-101, key-105, key-121, key-146) and the inbox accumulates the list of keys.]
Inbox: [key-146, key-121, key-105, key-101]
Let’s see what’s going on in the world
1) FETCH inbox [from server A]
R = [key-146, key-121, key-105, key-101]
2) FETCH text WHERE key IN R [from server ?]
Where is the data?
– If the data is in server A, then data locality!
– If the data is in servers B, E and F, then no data locality!
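The two-step read above can be sketched as a count of the servers one request has to touch. This is a minimal illustration, assuming a hypothetical `placement` map from key to server; the scattered placement is made up, since the slides only show that servers B, E and F are involved.

```python
def servers_contacted(inbox_server, keys, placement):
    """Return the set of servers a read must touch: the inbox server first,
    then the server holding each key found in the inbox."""
    touched = {inbox_server}
    for key in keys:
        touched.add(placement[key])
    return touched


R = ["key-146", "key-121", "key-105", "key-101"]

# Case 1: every key is stored on server A, next to the inbox.
local = {key: "A" for key in R}
# Case 2: keys are scattered (hypothetical placement on B, E and F).
scattered = {"key-146": "B", "key-121": "F", "key-105": "E", "key-101": "B"}

print(sorted(servers_contacted("A", R, local)))      # ['A']
print(sorted(servers_contacted("A", R, scattered)))  # ['A', 'B', 'E', 'F']
```

With data locality the whole request is answered by one server; without it, the second FETCH fans out across the cluster.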
OSN’s operations 101
1) Relational databases
– Selects and joins across multiple shards of the database are possible, but performance is poor (e.g. MySQL Cluster, Oracle RAC)
2) Key-value stores (DHT)
– More efficient than relational databases: multi-get primitives transparently fetch data from multiple servers.
– But it’s not a silver bullet:
o lose the SQL query language => programmatic queries
o lose abstraction from data operations
o suffer from high traffic, eventually affecting performance:
• Incast issue
• Multi-get hole
• latency dominated by the worst-performing server
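The last point, latency dominated by the slowest server, can be illustrated with a small simulation. It is a sketch under stated assumptions: per-server latency is drawn from an exponential distribution with a mean of one time unit, which is chosen for illustration only and is not a measurement from the talk.

```python
import random


def multi_get_latency(per_server_latencies):
    """A multi-get completes only when the slowest server has answered."""
    return max(per_server_latencies)


def mean_latency(n_servers, trials=2000, seed=0):
    """Average multi-get latency when one read fans out to n_servers.

    The exponential latency model (mean 1.0) is an assumption.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        latencies = [rng.expovariate(1.0) for _ in range(n_servers)]
        total += multi_get_latency(latencies)
    return total / trials


for n in (1, 4, 16):
    print(n, round(mean_latency(n), 2))
```

The average grows with the fan-out even though no individual server became slower, which is why keeping a request on a single server helps.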
Maintaining data locality
[Figure: the ten-user social graph (users 0–9) split across Server 1 and Server 2]
Random partition (DHT-like): users are assigned to servers randomly.
How do we enforce data locality? Replication.
[Figure: replicas of users 4 and 7 are added next to user 3; label: “#4 Master + Replica”]
That does it for user #3, but what about the others? The very same procedure…
After a while… data locality for reads is guaranteed for any user!
Random partition + replication can lead to high replication overheads!
[Figure: every user (4, 7, 1, 0, 5 and 3, 2, 8, 9, 6) ends up replicated on the other server]
This is full replication!!
Can we do better?
We can leverage the underlying social structure for the partition…
Social Partition and Replication: SPAR
[Figure: social partition of the same ten users, with replicas of users 3 and 2 only]
… only two replicas to achieve data locality
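The difference between the two placements can be reproduced on a toy graph. The sketch below is not the graph from the slides (its edges are not recoverable from the text); it assumes a hypothetical six-user graph made of two groups of friends joined by one edge, and it ignores the extra K replicas kept for redundancy.

```python
def replicas_needed(edges, master):
    """Count the replicas required so that every user finds all of its
    neighbours (as master or replica) on the server holding its own master."""
    replicas = set()  # (user, server) pairs that must hold a replica
    for u, v in edges:
        if master[u] != master[v]:
            replicas.add((v, master[u]))  # u needs a local copy of v
            replicas.add((u, master[v]))  # v needs a local copy of u
    return len(replicas)


# Hypothetical graph: two tight groups of friends joined by the edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

# DHT-like placement, emulated here by user id modulo 2.
random_like = {u: u % 2 for u in range(6)}
# Social placement: each group of friends shares a server.
social = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(replicas_needed(EDGES, random_like))  # 6: every user is replicated
print(replicas_needed(EDGES, social))       # 2: only users 2 and 3
```

On this toy input the random-like placement degenerates into full replication, while the social placement needs two replicas, mirroring the contrast drawn in the slides.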
SPAR Algorithm from 10000 feet
“All partition algorithms are equal, but some are more equal than others”
1) Online (incremental)
a) Dynamics of the SN (add/remove node, edge)
b) Dynamics of the system (add/remove server)
2) Fast (and simple)
a) Local information
b) Hill-climbing heuristic
c) Load-balancing via back-pressure
3) Stable
a) Avoid cascades
4) Effective
a) Optimize for MIN_REPLICA (an NP-hard problem)
b) While maintaining a fixed number of replicas per user for redundancy (e.g. K=2)
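A hill-climbing step of the kind listed above can be sketched for the "add edge" event. This is a simplified illustration, not the authors' implementation: it assumes three candidate moves (leave both masters, move u next to v, move v next to u), recomputes the replica count globally rather than from local information, replaces back-pressure with a hypothetical `max_skew` bound, and ignores the K redundancy replicas.

```python
def replicas_needed(edges, master):
    """Replicas required so every user has all neighbours on its own server."""
    replicas = set()
    for u, v in edges:
        if master[u] != master[v]:
            replicas.add((v, master[u]))
            replicas.add((u, master[v]))
    return len(replicas)


def balanced(master, servers, max_skew):
    """True if master counts on any two servers differ by at most max_skew."""
    counts = {s: 0 for s in servers}
    for server in master.values():
        counts[server] += 1
    return max(counts.values()) - min(counts.values()) <= max_skew


def add_edge(edges, master, servers, u, v, max_skew=2):
    """Greedy step for a new edge (u, v): pick the candidate placement with
    the fewest replicas among those that keep the load balanced."""
    edges = edges + [(u, v)]
    stay = dict(master)                      # 1) leave masters in place
    move_u = {**master, u: master[v]}        # 2) move u next to v
    move_v = {**master, v: master[u]}        # 3) move v next to u
    feasible = [stay] + [m for m in (move_u, move_v)
                         if balanced(m, servers, max_skew)]
    # min() keeps the first of equal candidates, so ties favour no movement.
    best = min(feasible, key=lambda m: replicas_needed(edges, m))
    return edges, best


servers = [0, 1]
master = {0: 0, 1: 0, 4: 0, 2: 1, 3: 1}
edges = [(0, 1), (4, 2)]
edges, master = add_edge(edges, master, servers, 4, 3)
print(master[4], replicas_needed(edges, master))  # 1 0
```

In the usage example, user 4 already had a friend on server 1; the new edge to user 3 makes moving 4 the cheapest option, and moving 3 instead is rejected because it would unbalance the servers.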
Looks good on paper, but... let’s try it for real
Real OSN data
• Twitter
– 2.4M users, 48M edges, 12M tweets for 15 days (Twitter as of Dec08)
• Orkut (MPI)
– 3M users, 223M edges
• Facebook (MPI)
– 60K users, 500K edges
Partition algorithms
• Random (DHT)
• METIS
• MO+
– modularity optimization + hack
• SPAR online
How many replicas are generated?
            Twitter, 16 partitions        Twitter, 128 partitions
            Rep. overhead   % over K=2    Rep. overhead   % over K=2
SPAR        2.44            +22%          3.69            +84%
MO+         3.04            +52%          5.14            +157%
METIS       3.44            +72%          8.03            +302%
Random      5.68            +184%         10.75           +434%
SPAR, 16 partitions: 2.44 replicas per user (on average)
2 to guarantee redundancy
+ 0.44 to guarantee data locality (+22%)
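The "% over K=2" column follows from the average replica count, as the breakdown above suggests. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def overhead_over_k(avg_replicas, k=2):
    """Percentage of replicas beyond the K copies kept for redundancy."""
    return (avg_replicas - k) / k * 100


# SPAR on Twitter with 16 partitions: 2.44 replicas per user on average.
print(round(overhead_over_k(2.44)))  # 22
```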
Replication with random partitioning is too costly!
Other social-based partitioning algorithms have higher replication overhead than SPAR.
Implementation
The SPAR architecture
[Figure: the SPAR architecture, built up from the following components]
Application, 3-tiers
SPAR Middleware (data-store specific)
SPAR Controller: Partition manager and Directory Service
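How the middleware and the Directory Service might cooperate on a read can be sketched as follows. The slides name the components but give no API, so every class, method and server name here is hypothetical, and the data store is replaced by a plain callable.

```python
class DirectoryService:
    """Maps each user to the server that holds its master copy (assumed role)."""

    def __init__(self):
        self._master = {}

    def assign(self, user, server):
        self._master[user] = server

    def lookup(self, user):
        return self._master[user]


class Middleware:
    """Routes an unmodified application query to the user's master server."""

    def __init__(self, directory, stores):
        self.directory = directory
        self.stores = stores  # server name -> callable that runs a query

    def read(self, user, query):
        server = self.directory.lookup(user)
        return self.stores[server](query)


directory = DirectoryService()
directory.assign("alice", "server-1")
stores = {"server-1": lambda q: f"server-1 ran: {q}"}
mw = Middleware(directory, stores)
print(mw.read("alice", "fetch inbox"))  # server-1 ran: fetch inbox
```

Because the partitioning guarantees that a user's neighbours are present on the same server, a single lookup is enough to route the whole query in this sketch.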
SPAR in the wild
• Twitter clone (Laconica, now StatusNet)
– Centralized architecture, PHP + MySQL/Postgres
• Twitter data
– Twitter as of end of 2008, 2.4M users, 12M tweets in 15 days
• The little engines
– 16 commodity desktops: Pentium Duo at 2.33 GHz, 2 GB RAM, connected with a Gigabit Ethernet switch
• Test SPAR on top of:
– MySQL (v5.5)
– Cassandra (v0.5.0)
SPAR on top of MySQL
• Different read request rate (application level):
– 1 call to LDS, 1 join on the notice and notice inbox tables to get the last 20 tweets and their content
– User is queried only once per session (4 min)
– SPAR allows data to be partitioned, improving both query times and cache hit ratios.
– Full replication suffers from a low cache hit ratio and large tables (1B+)
                MySQL Full-Replication   SPAR + MySQL
99th < 150ms    16 req/s                 2500 req/s
SPAR on top of Cassandra
Network bandwidth is not an issue, but network I/O and CPU I/O are:
– Network delays are reduced
• The worst-performing server produces delays
– Memory hit ratio is increased
• Random partition destroys correlations
• Different read request rate (application level):
                Vanilla Cassandra   SPAR + Cassandra
99th < 100ms    200 req/s           800 req/s
Conclusions
SPAR provides the means to achieve transparent scalability
1) For applications using RDBMS (not necessarily limited to OSN)
2) And a performance boost for key-value stores due to the reduction of network I/O
jmps@tid.es
http://research.tid.es/jmps/
@solso
Thanks!
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

The little engine(s) that could: scaling online social networks

  • 1. ACM/SIGCOMM 2010 – New Delhi, India The Little Engine(s) that could: Scaling Online Social Networks Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang Nikos Laoutaris, Parminder Chhabra and Pablo Rodriguez Telefonica Research, Barcelona
  • 2. Outline: Problem, Solution, Implementation, Conclusions
  • 3. Scalability is a pain: the designers’ dilemma. “Scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged.” The designers’ dilemma: wasted complexity and a long time to market vs. a short time to market but the risk of death by success.
  • 4. Towards transparent scalability. The advent of the Cloud: virtualized machinery + gigabit network. Hardware scalability; application scalability.
  • 5. Towards transparent scalability. Elastic resource allocation for the presentation and logic layers: more machines when load increases. Their components are stateless and therefore independent and duplicable. Components of the data layer are interdependent and therefore not duplicable on demand.
  • 6–8. Scaling the data backend... Full replication: the read load per server decreases with the number of servers; the state per server does not decrease with the number of servers; data locality is maintained, i.e. operations on the data (e.g. a query) can be resolved in a single server from the application's perspective. [Figure: centralized (application logic on one data store) vs. distributed (application logic on several data stores).]
  • 9–10. Scaling the data backend... Full replication: read load per server decreases, state per server does not, data locality is maintained. Horizontal partitioning or sharding: the read load per server decreases with the number of servers; the state per server decreases with the number of servers; data locality is maintained as long as the splits are disjoint (independent). Bad news for OSNs!
  • 11. Why is sharding not enough for OSNs? Shards in an OSN can never be disjoint because operations on user i require access to the data of other users, at least those one hop away. From graph theory: if the graph has a single connected component, there is no partition into two or more parts in which every node sits in the same part as all of its neighbors. Data locality cannot be maintained by partitioning a social network alone!
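The graph-theory argument on slide 11 can be checked by brute force on a small example. The sketch below is illustrative only: the toy friendship graph and the function names are assumptions, not part of the talk. It enumerates every assignment of users to two servers that actually uses both servers and confirms that each one separates at least one pair of friends.

```python
from itertools import product

# Hypothetical toy friendship graph (undirected, connected); not from the talk.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
USERS = sorted({u for edge in EDGES for u in edge})


def cut_edges(assignment, edges):
    """Return the friendships whose two endpoints live on different servers."""
    return [(u, v) for u, v in edges if assignment[u] != assignment[v]]


def every_split_breaks_locality(users, edges, n_servers=2):
    """True if every assignment using two or more servers cuts some edge."""
    for placement in product(range(n_servers), repeat=len(users)):
        if len(set(placement)) < 2:
            continue  # all users on one server: not a real partition
        assignment = dict(zip(users, placement))
        if not cut_edges(assignment, edges):
            return False  # a split where all friends are co-located exists
    return True


print(every_split_breaks_locality(USERS, EDGES))
```

For a graph with two disconnected groups of friends the same check returns `False`, because each group can be given its own server; that is exactly the case that does not occur when the social graph forms a single connected component.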
  • 12. A very active area… Online Social Networks are massive systems spanning hundreds of machines in multiple datacenters. Partition and placement to minimize network traffic and delay: Karagiannis et al., “Hermes: Clustering Users in Large-Scale Email Services”, SoCC 2010; Agarwal et al., “Volley: Automated Data Placement for Geo-Distributed Cloud Services”, NSDI 2010. Partition and replication towards scalability: Pujol et al., “Scaling Online Social Networks without Pains”, NetDB / SOSP 2009.
  • 13. Outline: Problem, Solution, Implementation, Conclusions
  • 14–16. OSN’s operations 101. [Figure: social graph of users 0–9; one node is marked “Me” and has an inbox, initially empty ([]), and a set of friends.]
  • 17–19. OSN’s operations 101. [Figure: item key-101 appears and its key is added to the inbox, [] → [key-101]; after items key-105, key-121 and key-146 the inbox is [key-146, key-121, key-105, key-101].]
  • 20–24. OSN’s operations 101. Inbox: [key-146, key-121, key-105, key-101]. Let’s see what’s going on in the world: 1) FETCH inbox [from server A], which returns R = [key-146, key-121, key-105, key-101]; 2) FETCH text WHERE key IN R [from server ?]. Where is the data?
  • 25. OSN’s operations 101. 1) FETCH inbox [from server A]; 2) FETCH text WHERE key IN R [from server ?]. If the items are all in A, then data locality! [Figure: every item is stored on server A.]
  • 26. OSN’s operations 101. 1) FETCH inbox [from server A]; 2) FETCH text WHERE key IN R [from server ?]. If the items are in other servers, then no data locality! [Figure: items are spread over servers A, B, E and F.]
  • 27. OSN’s operations 101. 1) Relational databases: selects and joins across multiple shards of the database are possible, but performance is poor (e.g. MySQL Cluster, Oracle RAC). 2) Key-value stores (DHT): more efficient than relational databases, with multi-get primitives that transparently fetch data from multiple servers. But they are not a silver bullet: you lose the SQL query language (queries become programmatic); you lose abstraction from data operations; and they suffer from high traffic, which eventually affects performance (the incast issue, the multi-get hole, and latency dominated by the worst-performing server).
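The two-step read from slides 20–27 can be sketched to show why placement matters. This is a minimal illustration, not the system from the talk: the key names and the server letters come from the slides, while the item-to-server placements, the latency figures and the function names are made up. It assumes a multi-get that queries all servers in parallel and completes only when the slowest one answers.

```python
# Inbox keys as shown on the slides.
INBOX = ["key-146", "key-121", "key-105", "key-101"]

# Hypothetical placements of the items on servers.
LOCAL = {key: "A" for key in INBOX}
SCATTERED = {"key-146": "E", "key-121": "B", "key-105": "F", "key-101": "A"}

# Made-up per-server response times in milliseconds.
LATENCY_MS = {"A": 2, "B": 3, "E": 40, "F": 5}


def servers_touched(inbox, placement):
    """Servers that must be contacted to fetch the text of every inbox item."""
    return {placement[key] for key in inbox}


def multiget_latency(inbox, placement, latency_ms):
    """Parallel multi-get finishes when the slowest contacted server replies."""
    return max(latency_ms[s] for s in servers_touched(inbox, placement))


print(sorted(servers_touched(INBOX, LOCAL)), multiget_latency(INBOX, LOCAL, LATENCY_MS))
print(sorted(servers_touched(INBOX, SCATTERED)), multiget_latency(INBOX, SCATTERED, LATENCY_MS))
```

With data locality the read touches one server; without it the read fans out to four servers and its latency is set by the slowest of them.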
  • 28. Maintaining data locality. [Figure: social graph of users 0–9.]
  • 29–31. Maintaining data locality. Random partition (DHT-like): users are assigned to servers randomly. [Figure: the ten users split between Server 1 and Server 2.]
  • 32–33. Maintaining data locality. Random partitioning (DHT). How do we enforce data locality? Replication.
  • 34–35. Maintaining data locality. [Figure: replicas of users 4 and 7 are added; user #4 now exists as a master plus a replica.]
  • 36–37. Maintaining data locality. That does it for user #3, but what about the others? The very same procedure…
  • 38–40. Maintaining data locality. After a while… data locality for reads is guaranteed for any user! [Figure: replicas of users 4, 7, 1, 0, 5, 3, 2, 8, 9 and 6.]
  • 41. Maintaining data locality. Data locality for reads is guaranteed for any user, but random partition + replication can lead to high replication overheads. This is full replication!! [Figure: replicas of users 4, 7, 1, 0, 5, 3, 2, 8, 9 and 6.]
  • 42. Maintaining data locality. Can we do better?
  • 43–44. Maintaining data locality. We can leverage the underlying social structure for the partition…
  • 45. Maintaining data locality. Social partition. [Figure: the ten users regrouped into two partitions that follow the social structure.]
  • 46–48. Maintaining data locality. Social Partition and Replication: SPAR … only two replicas (users 3 and 2) to achieve data locality.
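The contrast between slides 41 and 46–48 can be reproduced on a toy graph. The sketch below is an illustration under stated assumptions: the graph (two groups of five friends joined by a single friendship) is hypothetical and is not the graph drawn on the slides, an even/odd split stands in for a random assignment, and the extra replicas kept for redundancy are ignored. It counts the replicas needed so that every user finds all of its friends' data on the server that holds its own master copy.

```python
from itertools import combinations


def replicas_needed(edges, master):
    """Return the set of (user, server) replicas required for read locality.

    For every friendship that crosses servers, each endpoint must be
    replicated on the server that holds the other endpoint's master.
    """
    needed = set()
    for u, v in edges:
        if master[u] != master[v]:
            needed.add((u, master[v]))  # so that v can read u locally
            needed.add((v, master[u]))  # so that u can read v locally
    return needed


# Hypothetical graph: two tightly knit groups joined by one friendship (4, 5).
EDGES = (
    list(combinations(range(0, 5), 2))
    + list(combinations(range(5, 10), 2))
    + [(4, 5)]
)

parity_split = {u: u % 2 for u in range(10)}             # stand-in for random
social_split = {u: 0 if u < 5 else 1 for u in range(10)}  # follows the groups

print(len(replicas_needed(EDGES, parity_split)))
print(len(replicas_needed(EDGES, social_split)))
```

On this graph the split that ignores the social structure needs a replica of every one of the ten users, which amounts to full replication, while the split that follows the two groups needs only two replicas.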
  • 49–50. SPAR algorithm from 10,000 feet. “All partition algorithms are equal, but some are more equal than others.” 1) Online (incremental): handles the dynamics of the social network (add/remove node or edge) and of the system (add/remove server). 2) Fast (and simple): local information; hill-climbing heuristic; load balancing via back-pressure. 3) Stable: avoids cascades. 4) Effective: optimizes for MIN_REPLICA (an NP-hard problem) while maintaining a fixed number of replicas per user for redundancy (e.g. K=2).
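To make "online" and "hill-climbing" concrete, here is one way a greedy step for a new friendship could look. This is a hedged sketch and not the authors' algorithm: the three candidate placements, the `max_load` guard standing in for load balancing, and all names are assumptions, and for brevity it recounts replicas over the whole graph where a real implementation would use only local information.

```python
def count_replicas(edges, master):
    """Number of (user, server) replicas needed for read locality."""
    needed = set()
    for u, v in edges:
        if master[u] != master[v]:
            needed.add((u, master[v]))
            needed.add((v, master[u]))
    return len(needed)


def add_edge_greedy(edges, master, u, v, max_load):
    """Add friendship (u, v) and pick the cheapest of three placements.

    Candidates: leave both masters where they are, move u's master to v's
    server, or move v's master to u's server. A move is only considered if
    the target server would not exceed max_load masters (a crude,
    hypothetical stand-in for load balancing).
    """
    new_edges = edges + [(u, v)]
    candidates = [dict(master)]
    for mover, target in ((u, master[v]), (v, master[u])):
        moved = dict(master)
        moved[mover] = target
        load = sum(1 for server in moved.values() if server == target)
        if load <= max_load:
            candidates.append(moved)
    best = min(candidates, key=lambda m: count_replicas(new_edges, m))
    return new_edges, best


# Hypothetical state: users 0-2 are mutual friends on server 0, user 3 on server 1.
edges = [(0, 1), (0, 2), (1, 2)]
master = {0: 0, 1: 0, 2: 0, 3: 1}

_, roomy = add_edge_greedy(edges, master, 2, 3, max_load=4)
_, tight = add_edge_greedy(edges, master, 2, 3, max_load=3)
print(roomy[3], tight[3])
```

When server 0 has room, moving user 3 next to its new friend removes the need for any replica; when server 0 is full, the step leaves the masters in place and accepts two replicas.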
  • 51. Looks good on paper, but... let’s try it for real. Real OSN data: Twitter (2.4M users, 48M edges, 12M tweets over 15 days; Twitter as of Dec. 2008); Orkut (MPI): 3M users, 223M edges; Facebook (MPI): 60K users, 500K edges. Partition algorithms: Random (DHT); METIS; MO+ (modularity optimization + hack); SPAR online.
  • 52–55. How many replicas are generated?

      Algorithm | Twitter, 16 partitions       | Twitter, 128 partitions
                | Rep. overhead | % over K=2   | Rep. overhead | % over K=2
      SPAR      | 2.44          | +22%         | 3.69          | +84%
      MO+       | 3.04          | +52%         | 5.14          | +157%
      METIS     | 3.44          | +72%         | 8.03          | +302%
      Random    | 5.68          | +184%        | 10.75         | +434%

    SPAR with 16 partitions: 2.44 replicas per user on average, i.e. 2 to guarantee redundancy + 0.44 to guarantee data locality (+22%). Replication with random partitioning is too costly! Other social-based partitioning algorithms have higher replication overhead than SPAR.
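The "% over K=2" column follows from the replication overhead by simple arithmetic: the K=2 replicas are required for redundancy anyway, so the percentage is the extra replicas relative to K. The sketch below recomputes the 16-partition column from the figures on the slide; the function name is an assumption.

```python
K = 2  # replicas per user kept for redundancy, as on the slide

# Replication overhead for Twitter with 16 partitions, copied from the slide.
OVERHEAD_16 = {"SPAR": 2.44, "MO+": 3.04, "METIS": 3.44, "Random": 5.68}


def percent_over_k(replicas_per_user, k=K):
    """Extra replicas beyond the k needed for redundancy, as a percentage of k."""
    return round(100 * (replicas_per_user - k) / k)


for name, replicas in OVERHEAD_16.items():
    print(f"{name}: {replicas} replicas per user, +{percent_over_k(replicas)}% over K={K}")
```

For SPAR this gives (2.44 − 2) / 2 = 0.22, i.e. the +22% quoted on slide 53.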
  • 56. Outline: Problem, Solution, Implementation, Conclusions
  • 57–60. The SPAR architecture. [Figure: 3-tier application; SPAR Middleware (data-store specific); SPAR Controller: partition manager and directory service.]
  • 61. SPAR in the wild. Twitter clone (Laconica, now StatusNet): centralized architecture, PHP + MySQL/Postgres. Twitter data: Twitter as of the end of 2008, 2.4M users, 12M tweets in 15 days. The little engines: 16 commodity desktops (Pentium Duo at 2.33 GHz, 2 GB RAM) connected by a Gigabit Ethernet switch. SPAR is tested on top of MySQL (v5.5) and Cassandra (v0.5.0).
  • 62. SPAR on top of MySQL. Different read request rates (application level): 1 call to the LDS and 1 join on the notice and notice inbox tables to get the last 20 tweets and their content; the user is queried only once per session (4 min). SPAR allows the data to be partitioned, improving both query times and cache hit ratios. Full replication suffers from a low cache hit ratio and large tables (1B+). Request rate sustained with 99th-percentile latency < 150 ms: MySQL with full replication, 16 req/s; SPAR + MySQL, 2500 req/s.
  • 63. SPAR on top of Cassandra. Network bandwidth is not an issue, but network I/O and CPU I/O are: network delay is reduced (the worst-performing server produces delays) and the memory hit ratio is increased (random partitioning destroys correlations). Different read request rates (application level), sustained with 99th-percentile latency < 100 ms: vanilla Cassandra, 200 req/s; SPAR + Cassandra, 800 req/s.
  • 64. Outline: Problem, Solution, Implementation, Conclusions
  • 65. Conclusions. SPAR provides the means to achieve transparent scalability: 1) for applications using an RDBMS (not necessarily limited to OSNs), and 2) with a performance boost for key-value stores due to the reduction of network I/O.
  • 66. Thanks! jmps@tid.es, http://research.tid.es/jmps/, @solso