1. OLAP*: Effectively & Efficiently Supporting Parallel OLAP over Big Data
Alfredo Cuzzocrea*, Rim Moussa‡ & Guandong Xu†
* cuzzocrea@si.deis.unical.it, ICAR-CNR & Univ. of Calabria, Italy
‡ rim.moussa@esti.rnu.tn, LATICE, Univ. of Tunis, Tunisia
† guandong.xu@uts.edu.au, AAI, Univ. of Technology, Australia
27th Sept. 2013
3rd International Conference on Model & Data Engineering
MEDI’2013, Amantea, Calabria, Italy.
2. Outline
1. Context
2. Parallel Cube Processing
3. Performance Results
4. Related Work
5. Conclusion
6. Future Work
5. Data Partitioning Schemes
-- DWS model (snowflake): data view of the cube

[Diagram: a central Fact Table linked to its surrounding Dimension Tables (snowflake schema).]
6. Different OLAP Business Questions
-- case study: TPC-H Benchmark

BQ 1: Revenue per supplier geography location, per Part brand, per year-quarter-month
BQ 2: Turnover of customers per geography location, per year-quarter-month, per Part brand
7. Data Partitioning Schemes

Which scheme for a computer cluster?
● DW fully replicated
● Fragment the Fact Table & replicate the Dimension Tables
● DHP the Fact Table, fragment some Dimension Tables & replicate the rest

Comparison criteria:
– Cube size
– Workload processing (minimal pre- & post-processing, maximal parallelism)
– DW maintenance
– Storage overhead
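The third scheme can be sketched in a few lines of Python (a toy illustration, not the system's code; the 4-node setting, row layouts and key names are assumptions):

```python
N_NODES = 4  # assumed cluster size, for illustration only

def php_node(custkey: int) -> int:
    """Primary hash partitioning (PHP): Customer is split on its own key."""
    return custkey % N_NODES

def dhp_orders_node(o_custkey: int) -> int:
    """Derived hash partitioning (DHP): an order follows its customer."""
    return php_node(o_custkey)

def dhp_lineitem_node(l_orderkey: int, orders: dict) -> int:
    """A lineitem follows its order, hence (transitively) its customer,
    so fact rows are co-located with the dimension rows they join with."""
    return dhp_orders_node(orders[l_orderkey]["o_custkey"])

orders = {1: {"o_custkey": 42}}      # toy Orders fragment
node = dhp_lineitem_node(1, orders)  # node of the lineitem with l_orderkey=1
assert node == php_node(42)          # co-located with customer 42
```

Co-locating fact rows with the dimension rows they join against is what keeps pre- and post-processing minimal under this scheme.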
8. Performance Results
-- TPC-H*d Benchmark

● Multi-dimensional design of the TPC-H benchmark
  – Minimal changes to the TPC-H relational DB schema
  – Each SQL statement is mapped into an OLAP cube
● TPC-H workload translated into MDX
  – 22 MDX statements for the OLAP cubes' run
  – 22 MDX statements for the OLAP queries' run
10. Performance Results
-- Software Technologies & Hardware

● Software
  – Mondrian ROLAP server: Mondrian-3.5.0
  – JPivot OLAP client
  – Relational DBMS: MySQL 5.1
  – Servlet container
● French grid platform Grid'5000 (G5K)
  – Sophia site
  – Suno nodes: 32 GB of memory each, 2 Intel Xeon E5520 CPUs (2.27 GHz) per node, 4 cores per CPU
12. Performance Results
-- TPC-H*d for SF=10 & single DB backend

        Query workload | Cube-Query workload
                       |      cube |   query
  Q1          2,147.33 |  2,777.49 |    0.29
  Q10         7,100.24 |       n/a |       -
  Q11         2,558.21 |  3,020.27 | 1,604.1
  Q9               n/a |       n/a |     n/a

● Over the 22 business queries: 14 perform as Q1, 4 perform as Q10, 2 perform as Q11, 2 perform as Q9.
● The system under test was unable to build the big cubes related to business queries Q3, Q9, Q10, Q13, Q18 and Q20, either because of memory leaks or system constraints (max crossjoin size: 2,147,483,647).
13. Performance Results (ctnd. 1)
-- TPC-H*d for SF=10 & 4 DB backends

               4 DB backends                |          single DB backend
        Query wkld |      cube |     query  | Query wkld |      cube |   query
  Q1        485.73 |    862.77 |      0.19  |   2,147.33 |  2,777.49 |    0.29
  Q10      2,654.2 | 13,674.02 |  1,599.47  |   7,100.24 |       n/a |       -
  Q11       535.75 |    990.75 |    505.2   |   2,558.21 |  3,020.27 | 1,604.1
  Q9           n/a |       n/a |       n/a  |        n/a |       n/a |     n/a

● LineItem is DHPed along Orders, Orders is DHPed along Customer, Customer is PHPed, and all the rest are replicated.
● Over the 22 business queries: 20 perform as Q1, Q10 and Q11, and 2 perform as Q9.
● Improvements vary from 42.78% to 100%.
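The improvement percentages can be recomputed from the two response-time columns; a minimal sketch (units as reported in the tables):

```python
def improvement(single_backend: float, four_backends: float) -> float:
    """Percentage reduction in response time when going from one
    DB backend to four (times taken from the tables above)."""
    return (1 - four_backends / single_backend) * 100

# Q1, query workload: 2,147.33 on one backend vs 485.73 on four.
print(round(improvement(2147.33, 485.73), 2))  # -> 77.38
```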
14. Performance Results (ctnd. 2)
-- TPC-H*d for SF=10 & 4 DB backends & derived data

        4 DB backends + derived data        |          4 DB backends
        Query wkld |      cube |     query  | Query wkld |       cube |    query
  Q1          1.10 |      1.32 |      0.25  |     485.73 |     862.77 |     0.19
  Q10       127.67 |  9,545.68 |      5.16  |    2,654.2 |  13,674.02 | 1,599.47
  Q11       587.99 |    875.33 |    497.67  |     535.75 |     990.75 |   505.2
  Q9           n/a |       n/a |       n/a  |        n/a |        n/a |      n/a

● Derived data: aggregate tables for sparse cubes or cubes whose size is fixed whatever the SF, and derived attributes for OLAP cubes whose size increases with SF.
● Response times of the business queries of both workloads for which aggregate tables were built were improved.
● The impact of derived attributes is mitigated: performance results show good improvements for Q10 and Q21, and a small impact on Q11 (the saved operations are not complex).
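The aggregate-table idea can be illustrated with a minimal Python sketch (toy rows; the measure and dimension names are assumptions, not the TPC-H*d DDL):

```python
from collections import defaultdict

# Toy fact rows; column names are illustrative.
lineitem = [
    {"brand": "Brand#12", "year": 1995, "revenue": 100.0},
    {"brand": "Brand#12", "year": 1995, "revenue": 50.0},
    {"brand": "Brand#34", "year": 1996, "revenue": 70.0},
]

def build_aggregate_table(rows):
    """Pre-aggregate revenue per (brand, year); the cube is then
    answered from this small table instead of the big fact table."""
    agg = defaultdict(float)
    for r in rows:
        agg[(r["brand"], r["year"])] += r["revenue"]
    return dict(agg)

agg_table = build_aggregate_table(lineitem)
assert agg_table[("Brand#12", 1995)] == 150.0
```

An aggregate table like this has a size bounded by the number of distinct (brand, year) pairs, which is why it pays off for cubes whose size does not grow with SF.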
15. Performance Results
-- Derived Data Calculus

[Table: computation times of the derived data, comparing a single DB backend to each DB backend of the partitioned setup, for l_profit (LineItem fragmented into 4 fragments), ps_excess_YYYY (PartSupp and Time replicated, LineItem fragmented into 4 fragments), ps_isminimum (PartSupp, Supplier, Nation and Region replicated), agg_c1 and agg_c15.]
16. Related Work
● PowerDB (Röhm et al., 2000)
  – TPC-R benchmark (SQL workload) for comparing:
    ● a fully replicated DW schema
    ● partial replication and data partitioning (only the LineItem table is fragmented)
  – PowerDB implements query-routing algorithms (Short-Queries-ASAP, affinity-based routing) for load balancing and inter-query and intra-query parallelism
  – SF=0.1 (300 MB, all database files included)
● cgmOLAP (Chen et al., 2006)
  – Panda project (Chen, 2004)
  – Parallel OLAP cube processor at a rate of 1 TB/hour
17. Related Work (ctnd. 1)
● ParGRES (Paes et al., 2008)
  – Automatic parsing of SQL statements; inter- and intra-query parallelism enabled
  – Subset of the TPC-H workload (Q1, Q3, Q4-Q8, Q12, Q14 and Q19)
  – TPC-H with SF=5 (11 GB including all DB files)
  – RDBMS: PostgreSQL
  – 32-node shared-nothing cluster, Grid'5000 clusters (2 CPUs, 1 GB of memory per node)
18. Related Work (ctnd. 2)
● SmaQSS DBC middleware (Lima et al., 2009)
  – Combination of physical/virtual partitioning and partial replication
  – Partial replication uses chained declustering
  – Subset of the TPC-H workload (Q1, Q5, Q6, Q12, Q14, Q18 and Q21, coded in SQL)
  – TPC-H with SF=5 (11 GB including all DB files)
  – RDBMS: PostgreSQL
  – 32-node shared-nothing cluster, Grid'5000 clusters (2 CPUs, 1 GB of memory per node)
19. Conclusion
● Comparison of different DW fragmentation schemes regarding:
  – cube size,
  – distributed cube processing,
  – storage overhead,
  – DW maintenance
● Implementation of an OLAP mid-tier that
  – connects to a pool of RDBMSs through JDBC
  – uses olap4j, an open Java API for OLAP
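A minimal sketch of what the mid-tier does per request, with Python stubs standing in for the JDBC connections (the two-backend pool and the partial results are illustrative, not the actual implementation):

```python
from concurrent.futures import ThreadPoolExecutor

# Stub backends standing in for the pool of RDBMS connections; each
# one would normally run the rewritten query on its DW fragment.
backends = [
    lambda: [("Brand#12", 100.0)],
    lambda: [("Brand#12", 50.0), ("Brand#34", 70.0)],
]

def scatter_gather(pool_of_backends):
    """Fan the query out to every backend in parallel, then merge
    the partial aggregates into the global answer."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda run: run(), pool_of_backends))
    merged = {}
    for rows in partials:
        for key, value in rows:
            merged[key] = merged.get(key, 0.0) + value
    return merged

assert scatter_gather(backends) == {"Brand#12": 150.0, "Brand#34": 70.0}
```

The merge step only works this simply for decomposable aggregates such as SUM or COUNT; other measures need an extra rewriting step.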
20. Conclusion (ctnd.)
● Performance assessment using the TPC-H*d benchmark and considering the whole workload (22 queries)
● Implementation and experiments revealed
  – MDX language shortcomings
    ● for each sub-query, we manually set parameters (next/previous value of the member if the value is missing)
  – Mondrian ROLAP server limits
    ● no infinite combination of dimensions: the crossjoin size limit is 2,147,483,647 (2^31 - 1, the maximum Java int)
  – Memory leaks
21. Future Work
●
Inspect the core of Mondrian and revise its source code
●
Automate DW partitioning
●
Consider bigger datasets
●
Consider TPC-DS benchmark (99 business queries,
multiple data marts)
22. Thank you for Your Attention
Q&A
OLAP*: Effectively & Efficiently Supporting Parallel
OLAP over Big Data
Alfredo Cuzzocrea, Rim Moussa & Guandong Xu
MEDI'2013@Amantea
27th Sept. 2013