Scaling to Millions of Concurrent SPARQL Queries on the Cloud

Sep 2010
OWLIM Replication Cluster @ Amazon EC2

Goals
• Test the scalability of OWLIM RC on a really large
cluster
• Can we break the million queries per hour barrier?

INTRODUCTION

Berlin SPARQL Benchmark (BSBM)
• http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/
• Evaluates the performance of RDF query engines in
an e-commerce use case
– searching products and navigating related information
• Randomized query mixes (25 SPARQL queries) are
evaluated continuously
• Different dataset size & number of concurrent clients
– 25M, 100M and 200M triples

Benchmarking AWS
• Extensive performance tests of EC2 instances
– I/O, CPU, Network
– BSBM (SPARQL), RDF materialisation
• High Memory EC2 instances offer (surprisingly) good
performance for RDF related processing
– Comparable to local non-virtualised hardware

Benchmarking AWS – testbeds
Testbed        CPU cores      RAM (GB)   Virtualisation
Local-L        2×2.4 GHz      8          ESX
Local-XL       4×2.9 GHz      12         No
Local-3XL      8×3.3 GHz      48         No
L              2×2 ECU*       7.5        Xen
XL             4×2 ECU*       15         Xen
High-Mem XL    2×3.25 ECU*    17         Xen
High-Mem 2XL   4×3.25 ECU*    34         Xen
High-Mem 4XL   8×3.25 ECU*    68         Xen
High-CPU XL    8×2.5 ECU*     7          Xen
* 1 ECU provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor

Benchmarking AWS – BSBM 100M results
[Chart: query mixes / hour (0-5000) vs. concurrent clients (1, 4, 16, 32, 64); series: Local-L, L-ub, Local-XL, XL-ub, HM-XL-ub, HM-2XL-ub, Local-3XL, Local-3XL-SSD, HM-4XL-ub, HC-XL-ub]

Benchmarking AWS – RDF materialisation
[Chart: materialisation time (sec), 0-6000; series: UMBEL, DBP-SKOS]

OWLIM Replication Cluster
• Improves scalability with respect to concurrent user
requests
• How does it work?
– Each write request is multiplexed to all repository
instances
– Each read request is dispatched to one instance only
– To ensure load-balancing, read requests are sent to the
instance with the shortest execution queue
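
The dispatch rules above can be illustrated with a minimal sketch. This is not OWLIM code: the `Replica` and `Master` classes and their methods are hypothetical names chosen for illustration, and the queues are plain in-memory lists rather than real request queues.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one repository instance (a Slave node)."""
    name: str
    queue: list = field(default_factory=list)  # pending read requests
    log: list = field(default_factory=list)    # writes applied so far


class Master:
    """Hypothetical dispatcher following the rules listed above."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, update):
        # Each write request is multiplexed to all repository instances.
        for replica in self.replicas:
            replica.log.append(update)

    def read(self, query):
        # Each read request goes to one instance only: the one with the
        # shortest execution queue (ties go to the first such instance).
        target = min(self.replicas, key=lambda r: len(r.queue))
        target.queue.append(query)
        return target.name


master = Master([Replica("s1"), Replica("s2"), Replica("s3")])
master.write("update-1")
print([master.read(f"q{i}") for i in range(4)])
```

In this sketch nothing ever drains the queues, so reads simply rotate across the replicas; in a real cluster, queue lengths would shrink as queries complete.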

OWLIM CLUSTER ON EC2 – BENCHMARKS

AWS testbed setup
• OWLIM Replication Cluster
– One Master node, 10-100 Slave nodes
– 100 million triples / 16GB database size
• BSBM 100M dataset
– Each cluster node has a replica of the database
– 1000 concurrent BSBM clients
• Amazon EC2
– Master node – HM-2XL (34GB RAM, 4x3.25 ECU)
– Slave nodes – HM-XL (17 GB RAM, 2x3.25 ECU)
– Ubuntu (x64)

Total QMpH (Query Mix per Hour)
[Chart: total QMpH (0-250,000) vs. cluster size (10-100 HM-XL nodes); BSBM-100M, 1000 concurrent clients]

Total QMpH – summary
• Almost linear scalability of the cluster
• 20 nodes handle more than 1 million SPARQL queries
per hour (40,000 QMpH)
– 1 Query Mix = 25 SPARQL queries
• 100 nodes handle 5 million SPARQL queries per hour
(200,000 QMpH)
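
The conversion between QMpH and SPARQL queries per hour is simple arithmetic. A small sketch using only the figures from this slide (the function name is illustrative):

```python
QUERIES_PER_MIX = 25  # 1 Query Mix = 25 SPARQL queries


def queries_per_hour(qmph):
    """Convert query mixes per hour into individual SPARQL queries per hour."""
    return qmph * QUERIES_PER_MIX


# (cluster size, total QMpH) pairs quoted on this slide
for nodes, qmph in [(20, 40_000), (100, 200_000)]:
    print(f"{nodes} nodes: {queries_per_hour(qmph):,} queries/hour")
```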

QMpH per cluster node
[Chart: QMpH per node (1800-2400) vs. cluster size (10-100 HM-XL nodes); BSBM-100M, 1000 concurrent clients, with a power trendline]

QMpH per cluster node – summary
• Low parallelisation overhead
– Only 10% deterioration in QMpH per cluster node when
the cluster grows 10 times (from 10 to 100 nodes)
– Cluster nodes handle 2,000-2,300 QMpH (a standalone
HM-XL node on EC2 handles ~2,500 QMpH)
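
As a rough check, per-node throughput can be derived from the cluster totals and compared with the standalone figure. The sketch below uses the rounded totals reported in the deck (40,000 QMpH at 20 nodes, 200,000 QMpH at 100 nodes) and the ~2,500 QMpH standalone value; the function names are illustrative.

```python
STANDALONE_QMPH = 2_500  # approximate standalone HM-XL throughput from the slide


def per_node_qmph(total_qmph, nodes):
    """Average query mixes per hour handled by each cluster node."""
    return total_qmph / nodes


def relative_to_standalone(total_qmph, nodes):
    """Per-node throughput as a fraction of the standalone node's throughput."""
    return per_node_qmph(total_qmph, nodes) / STANDALONE_QMPH


for nodes, total in [(20, 40_000), (100, 200_000)]:
    print(nodes, per_node_qmph(total, nodes), relative_to_standalone(total, nodes))
```

Because the inputs are rounded, the results only indicate the order of magnitude of the overhead, not exact measurements.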

What about the cost?
• 100,000 SPARQL queries per $1 on AWS
– ~4,000 Query Mixes / $
• 1 Query Mix = 25 SPARQL queries
– EC2 pricing
• Master node (on-demand HM-2XL) – $1.00/hour
• Slave node (on-demand HM-XL) – $0.50/hour
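
The cost figures can be approximately reproduced from the hourly prices above. The sketch assumes that cost consists only of instance-hours for one Master and N Slaves (storage and data transfer are ignored, which is an assumption not stated in the slides) and uses the 200,000 QMpH total reported for 100 nodes.

```python
MASTER_USD_PER_HOUR = 1.00  # on-demand HM-2XL
SLAVE_USD_PER_HOUR = 0.50   # on-demand HM-XL
QUERIES_PER_MIX = 25


def cluster_cost_per_hour(slaves):
    """Hourly instance cost: one Master plus `slaves` Slave nodes (assumed model)."""
    return MASTER_USD_PER_HOUR + slaves * SLAVE_USD_PER_HOUR


def mixes_per_dollar(total_qmph, slaves):
    """Query mixes completed per USD spent on instances."""
    return total_qmph / cluster_cost_per_hour(slaves)


qm_per_usd = mixes_per_dollar(200_000, 100)
print(round(qm_per_usd), round(qm_per_usd * QUERIES_PER_MIX))
```

Under these assumptions a 100-node cluster costs $51/hour, which gives a little under 4,000 query mixes (roughly 100,000 SPARQL queries) per dollar.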

What about the cost (2)
[Chart: Query Mixes per 1 USD (3400-4600) vs. cluster size (10-100 nodes)]

DETAILED CLUSTER METRICS

Cluster monitoring
• Amazon CloudWatch provides instance level
monitoring for EC2
– CPU load, Bandwidth utilisation, I/O, …
– Minimum granularity of monitoring periods – 1 minute
• OWLIM Cluster metrics
– Monitor Master and a random Slave for ~180 min
– Many test runs
• a single run takes a few minutes
– Idle CPU/IO/network periods on the diagrams correspond to the
time between test runs

CPU load (Master)
[Chart: CPU load (%), 0-80, over time (0-185 min), Master node]

CPU load (Slave)
[Chart: CPU load (%), 0-120, over time (0-155 min), random Slave node]

Network traffic (Master)
[Chart: network traffic (MB/s), 0-35, over time (0-185 min), Master node; series: inbound, outbound]

Network traffic (Slave)
[Chart: network traffic (MB/s), 0.00-0.12, over time (0-156 min), random Slave node; series: inbound, outbound]

I/O (Slave)
[Chart: disk I/O (MB/s), 0.00-3.50, over time (0-170 min), random Slave node; series: Disk Read, Disk Write]

Q & A
Questions?
@ontotext
