2. OLAP Implementations
1. MOLAP: OLAP implemented with a multi-dimensional
data structure.
2. ROLAP: OLAP implemented with a relational database.
3. HOLAP: OLAP implemented as a hybrid of MOLAP and
ROLAP.
3. MOLAP Implementations
OLAP has historically been implemented using a
multi-dimensional data structure or “cube”.
• Dimensions are key business factors for analysis:
– Geographies (city, district, division, province,...)
– Products (item, product category, product department,...)
– Dates (day, week, month, quarter, year,...)
• Very high performance is achieved by O(1)-time lookup
into the “cube” data structure to retrieve pre-aggregated
results.
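The constant-time lookup of pre-aggregated results can be sketched in Python. This is only an illustration, not how any particular MOLAP product stores its cube: a dictionary stands in for the multi-dimensional array, and the `ALL` marker, the `build_cube` helper and the sample rows are all assumptions made for the example.

```python
from itertools import product

ALL = "*"  # hypothetical marker: "aggregated over this dimension"

# Toy fact rows: (product, region, week, units sold). Values are invented.
facts = [
    ("Milk", "N", "w1", 12),
    ("Milk", "S", "w1", 8),
    ("Bread", "N", "w1", 13),
    ("Bread", "N", "w2", 45),
]

def build_cube(rows):
    """Pre-aggregate every mix of detailed and rolled-up dimensions."""
    cube = {}
    for item, region, week, units in rows:
        # Each dimension is either kept at detail or replaced by ALL,
        # so one fact contributes to 2**3 = 8 cells.
        for key in product((item, ALL), (region, ALL), (week, ALL)):
            cube[key] = cube.get(key, 0) + units
    return cube

cube = build_cube(facts)

# Each query is a single hash lookup; the fact rows are never scanned.
milk_total = cube[("Milk", ALL, ALL)]   # 12 + 8 = 20
north_w1 = cube[(ALL, "N", "w1")]       # 12 + 13 = 25
grand_total = cube[(ALL, ALL, ALL)]     # 12 + 8 + 13 + 45 = 78
```

The cost is paid at load time: every fact is folded into eight cells here, and the number grows quickly with more dimensions and hierarchy levels.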
4. MOLAP Implementations
• No standard query language for querying MOLAP
- No SQL!
• Vendors provide proprietary languages allowing business
users to create queries that involve pivots, drilling down, or
rolling up.
- E.g., Microsoft's MDX
- Languages generally involve extensive visual (click-and-drag)
support.
- Application programming interfaces (APIs) are also provided for probing
the cubes.
5. Aggregations in MOLAP
Sales volume as a function of (i) product, (ii) time,
and (iii) geography
A cube structure created to handle this.
Dimensions: Product, Geography, Time
Hierarchical summarization paths:
– Product: Industry, Category, Product
– Geography: Province, Division, District, City, Zone
– Time: Year, Quarter, Month, Week, Day
[Figure: cube with axes Product (Milk, Bread, Eggs, Butter, Jam, Juice), Geog (N, E, W, S) and Time (w1 to w6); sample cell values 12, 13, 45, 8, 23, 10]
6. Cube operations
• Drill down: get more details
– e.g., given summarized sales as above, find the breakdown of
sales by city within each region, or within Sindh
• Rollup: summarize data
– e.g., given sales data, summarize sales for last year by
product category and region
• Slice and dice: select and project
– e.g., sales of soft drinks in Karachi during the last quarter
• Pivot: change the view of data
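The four operations listed above can be sketched in Python over a small set of cube cells. This is a minimal illustration under stated assumptions: the cell values, the `CATEGORY` and `PROVINCE` lookup tables and all four function names are invented for the example and do not come from any OLAP product.

```python
from collections import defaultdict

# Toy cells keyed by (product, city, quarter); all values are invented.
cells = {
    ("Cola", "Karachi", "Q1"): 10,
    ("Cola", "Karachi", "Q2"): 20,
    ("Cola", "Hyderabad", "Q1"): 4,
    ("Cola", "Lahore", "Q1"): 5,
    ("Juice", "Karachi", "Q1"): 7,
    ("Juice", "Lahore", "Q2"): 3,
}
# Hypothetical hierarchy lookups used when moving between levels.
CATEGORY = {"Cola": "Soft drinks", "Juice": "Juices"}
PROVINCE = {"Karachi": "Sindh", "Hyderabad": "Sindh", "Lahore": "Punjab"}

def rollup(cells):
    """Summarize: product -> category, city -> province, over all quarters."""
    out = defaultdict(int)
    for (item, city, _quarter), units in cells.items():
        out[(CATEGORY[item], PROVINCE[city])] += units
    return dict(out)

def drill_down(cells, province):
    """Get more detail: break one province's sales down by city."""
    out = defaultdict(int)
    for (_item, city, _quarter), units in cells.items():
        if PROVINCE[city] == province:
            out[city] += units
    return dict(out)

def slice_dice(cells, product=None, city=None, quarter=None):
    """Select the cells matching the fixed dimension values (None = any)."""
    wanted = (product, city, quarter)
    return {key: units for key, units in cells.items()
            if all(w is None or w == k for w, k in zip(wanted, key))}

def pivot(table):
    """Swap the two axes of a 2-D view keyed by (row, column)."""
    return {(col, row): value for (row, col), value in table.items()}

by_category = rollup(cells)             # ("Soft drinks", "Sindh") -> 34
sindh_cities = drill_down(cells, "Sindh")  # Karachi 37, Hyderabad 4
cola_khi_q2 = slice_dice(cells, "Cola", "Karachi", "Q2")
by_province = pivot(by_category)        # keys become (province, category)
```

In a real MOLAP cube the rolled-up values would be read from pre-computed cells rather than summed on demand; the loops here only make the meaning of each operation explicit.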
8. Querying the cube: Pivoting
[Figure: two bar charts. First chart: sales (axis 0 to 40,000) of Juices and Soda Drinks for 2001 and 2002. Second chart: sales (axis 0 to 18,000) for 2001 and 2002 of Orange juice, Mango juice, Apple juice, Rola-Kola, 8-UP, Bubbly-UP and Pola-Kola.]
9. MOLAP evaluation
Advantages of MOLAP:
• Instant response (pre-calculated aggregates).
• Impossible to ask a question without an answer.
• Value-added functions (ranking, % change).
10. MOLAP evaluation
Drawbacks of MOLAP:
• Long load time (pre-calculating the cube
may take days!).
• Very sparse cube (wasted space) for
high-cardinality dimensions (sometimes even with
cardinalities in the low hundreds), e.g., the number
of heaters sold in Jacobabad or Sibi.
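The sparsity drawback can be made concrete with a small calculation. The sketch below is an illustration only: the `cube_density` helper and the dimension sizes are invented, and the point is simply that the number of cells is the product of the dimension cardinalities while only some combinations ever occur.

```python
from math import prod

def cube_density(cardinalities, populated_cells):
    """Fraction of cube cells that actually hold a value."""
    total_cells = prod(cardinalities)
    return populated_cells / total_cells

# Invented example: 500 products x 300 cities x 365 days.
cardinalities = [500, 300, 365]
total = prod(cardinalities)          # 54,750,000 cells in a dense array
density = cube_density(cardinalities, populated_cells=2_000_000)
print(f"{total:,} cells, {density:.1%} populated")
```

With these made-up numbers fewer than 4% of the cells are populated, so a dense array would spend most of its space on empty combinations such as heaters sold in a hot city.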
11. MOLAP Implementation issues
Maintenance issue: Every data item received
must be aggregated into every cube (assuming
“to-date” summaries are maintained). A lot of
work.
Storage issue: As dimensions get less detailed
(e.g., year vs. day), cubes get much smaller, but
the storage consequences of building hundreds of
cubes can be significant. A lot of space.
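A rough count shows how "hundreds of cubes" can arise. The sketch below assumes that each dimension can be summarized at any one level of its hierarchy or aggregated away entirely, and it treats each hierarchy as a simple chain, which is a simplification (weeks do not nest cleanly inside months). The level names follow the summarization paths of the aggregation slide; the `cuboid_count` helper is hypothetical.

```python
from math import prod

# Hierarchy levels per dimension; treating each as a chain is an assumption.
levels = {
    "product": ["industry", "category", "product"],
    "geography": ["province", "division", "district", "city", "zone"],
    "time": ["year", "quarter", "month", "week", "day"],
}

def cuboid_count(levels):
    """Number of distinct level combinations that could be materialized."""
    # +1 per dimension for the "all" level (dimension aggregated away).
    return prod(len(names) + 1 for names in levels.values())

n_cuboids = cuboid_count(levels)   # (3+1) * (5+1) * (5+1) = 144
```

If all 144 combinations are materialized, each incoming data item has to be added to one cell in every one of them, which is the maintenance cost noted above, and each combination also needs its own storage.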
Editor's notes
MOLAP: a multi-dimensional implementation of OLAP.
ROLAP: has an underlying table structure, and SQL runs on it.
HOLAP: the best of both worlds. The environment switches between MOLAP and ROLAP for performance.
DOLAP: used for sales force automation; cubes are cut into small pieces and then used for DSS.
OLAP is logically a cube, and traditionally it is implemented with a multi-dimensional data structure. The second feature is dimensions: a cube is built from dimensions such as time, product and geography. Access time is O(1), which is very fast: it stays the same regardless of the data or input size. In MOLAP, all the aggregates are stored in a multi-dimensional array, and accessing an array element takes O(1) time.
A MOLAP implementation has no SQL. When aggregates are stored in a multi-dimensional array, is that array in first, second or third normal form? It may be in first normal form, because the tables are merged to generate the cubes. If there is no SQL, how do you probe the cube? Third-party tools and graphical environments are available for this, and so are APIs.
Cubes support the following operations.
Slice and dice: explore a particular part of the cube.
Pivot: interchange the x-axis and the y-axis. This will become clear shortly.
Now to querying the cube.
The first chart is a histogram of juice and soda sales. It invites a drill-down, because soda drinks appear to have constant and almost equal sales. Drilling down shows that Q3 has the highest juice sales compared with soda, a fact that cannot be known until you drill down. So when there is a lot of data with many dimensions, a decision maker cannot reach a decision unaided.
In the last chart, a few items appear to have continually low sales, which is an alarming situation.
Previously time was on the x-axis and product on the y-axis; now they are interchanged. In 2002, 8-UP has the highest sales, so pivoting reveals hidden facts. When a cube is probed, the results are shown in a GUI environment.
- O(1) access time, because the aggregate is already in main memory. The second advantage is that every possible question is already answered in the form of an aggregate. The third advantage is that MOLAP supports value-added functions, e.g., sorting the results, finding the median or converting them into a pie chart.
MOLAP works well when there are few dimensions; otherwise maintaining the cube is a tough task.
The cube is sparse because not every combination can have a value, e.g., heater sales in Sibi, which are almost nonexistent because of the hot weather.
We say the data is historical, but we cannot go very far back in history, and if events are changing rapidly, looking at the past is not enough. Fresh data therefore has to keep arriving, and when it arrives it must be added to every possible aggregate, which is a lot of work. Updating is thus difficult for MOLAP.
The second issue is space. A common misconception is that aggregating at a coarser grain means less storage is required overall, but that is not the case. Even when going from day to year, the definition of a year can vary within the same company, so many cubes are created and more storage is required. As aggregations are added, the number of cubes increases and the big cube grows until it may no longer fit in main memory. You can buy more main memory, but only up to a point, so one solution is to partition the cube.