An overview of data warehousing and OLAP technology
1. AN OVERVIEW OF DATA WAREHOUSING AND OLAP TECHNOLOGY
PRESENTATION BY
NIKHAT FATIMA(44954)
BANNY SHARMA(44799)
VAMSI KRISHNAYARRAMSETTI(45324)
2. INTRODUCTION TO WAREHOUSING AND OLAP
TECHNOLOGY
DATA WAREHOUSING
DATA WAREHOUSING IS THE PROCESS OF
CONSTRUCTING AND USING A DATA WAREHOUSE. A
DATA WAREHOUSE IS CONSTRUCTED BY INTEGRATING
DATA FROM MULTIPLE HETEROGENEOUS SOURCES, AND IT
SUPPORTS ANALYTICAL REPORTING, STRUCTURED
AND/OR AD HOC QUERIES, AND DECISION MAKING.
DATA WAREHOUSING INVOLVES DATA CLEANING, DATA
INTEGRATION, AND DATA CONSOLIDATION.
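THE THREE STEPS NAMED ABOVE (CLEANING, INTEGRATION, CONSOLIDATION) CAN BE SKETCHED IN PYTHON. THIS IS A MINIMAL ILLUSTRATION, NOT A REAL ETL PIPELINE: THE TWO SOURCES, THEIR FIELD NAMES, AND THE SAMPLE VALUES ARE ALL ASSUMPTIONS MADE UP FOR THE EXAMPLE.

```python
# Hypothetical records from two heterogeneous sources.
# Source names, field names, and values are assumptions for illustration.
crm_rows = [
    {"customer": " Alice ", "city": "delhi", "amount": "120.50"},
    {"customer": "Bob", "city": "Mumbai", "amount": "80"},
]
billing_rows = [
    {"cust_name": "ALICE", "town": "Delhi", "total": 30.0},
    {"cust_name": "Carol", "town": None, "total": 45.0},
]


def clean(name, city, amount):
    """Data cleaning: trim whitespace, normalize case, coerce types."""
    return {
        "customer": name.strip().title(),
        "city": (city or "Unknown").strip().title(),
        "amount": float(amount),
    }


def integrate(crm, billing):
    """Data integration: map both source schemas onto one common schema."""
    unified = [clean(r["customer"], r["city"], r["amount"]) for r in crm]
    unified += [clean(r["cust_name"], r["town"], r["total"]) for r in billing]
    return unified


def consolidate(rows):
    """Data consolidation: keep one total per (customer, city)."""
    totals = {}
    for r in rows:
        key = (r["customer"], r["city"])
        totals[key] = totals.get(key, 0.0) + r["amount"]
    return totals


warehouse = consolidate(integrate(crm_rows, billing_rows))
for (customer, city), total in sorted(warehouse.items()):
    print(customer, city, total)
```

NOTE THAT " Alice " AND "ALICE" END UP AS ONE CUSTOMER ONLY BECAUSE CLEANING RUNS BEFORE CONSOLIDATION; THE ORDER OF THE STEPS MATTERS.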
DATA WAREHOUSES ARE WIDELY USED IN THE
FOLLOWING FIELDS:
• FINANCIAL SERVICES
• BANKING SERVICES
• CONSUMER GOODS
• RETAIL SECTORS
• CONTROLLED MANUFACTURING
ONLINE ANALYTICAL PROCESSING (OLAP)
AN ONLINE ANALYTICAL PROCESSING (OLAP) SERVER IS
BASED ON THE MULTIDIMENSIONAL DATA MODEL. IT
ALLOWS MANAGERS AND ANALYSTS TO GAIN INSIGHT
INTO INFORMATION THROUGH FAST, CONSISTENT,
AND INTERACTIVE ACCESS. THIS
CHAPTER COVERS THE TYPES OF OLAP, THE OPERATIONS ON
OLAP, AND THE DIFFERENCES BETWEEN OLAP, STATISTICAL
DATABASES, AND OLTP.
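TO MAKE THE MULTIDIMENSIONAL DATA MODEL CONCRETE, THE SKETCH BELOW STORES A TINY SALES CUBE AS A PYTHON DICTIONARY AND APPLIES TWO COMMON OLAP OPERATIONS, ROLL-UP AND SLICE. THE DIMENSION NAMES, THE SAMPLE CELLS, AND THE FUNCTION NAMES `roll_up` AND `slice_cube` ARE ASSUMPTIONS CHOSEN FOR ILLUSTRATION; A REAL OLAP SERVER USES FAR MORE ELABORATE STORAGE AND INDEXING.

```python
from collections import defaultdict

# Hypothetical sales cube: (product, region, quarter) -> units sold.
DIMENSIONS = ("product", "region", "quarter")
cube = {
    ("Laptop", "North", "Q1"): 10,
    ("Laptop", "South", "Q1"): 5,
    ("Laptop", "North", "Q2"): 7,
    ("Phone", "North", "Q1"): 20,
    ("Phone", "South", "Q2"): 12,
}


def roll_up(cells, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for coords, units in cells.items():
        totals[tuple(coords[i] for i in positions)] += units
    return dict(totals)


def slice_cube(cells, dimension, member):
    """Fix one dimension to a single member and return the matching cells."""
    i = DIMENSIONS.index(dimension)
    return {coords: units for coords, units in cells.items() if coords[i] == member}


units_by_product = roll_up(cube, keep=("product",))
q1_cells = slice_cube(cube, "quarter", "Q1")
grand_total = roll_up(cube, keep=())

print(units_by_product)
print(q1_cells)
print(grand_total)
```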
TYPES OF OLAP SERVERS
THERE ARE FOUR TYPES OF OLAP SERVERS:
• RELATIONAL OLAP (ROLAP)
• MULTIDIMENSIONAL OLAP (MOLAP)
• HYBRID OLAP (HOLAP)
• SPECIALIZED SQL SERVERS
3. FOLLOWING ARE THE THREE TIERS OF THE
DATA WAREHOUSE ARCHITECTURE
TOP TIER − THIS TIER IS THE FRONT-END CLIENT LAYER. IT
HOLDS THE QUERY TOOLS, REPORTING TOOLS, ANALYSIS TOOLS,
AND DATA MINING TOOLS.
MIDDLE TIER − IN THE MIDDLE TIER, WE HAVE THE OLAP SERVER,
WHICH CAN BE IMPLEMENTED IN EITHER OF THE FOLLOWING WAYS:
• BY RELATIONAL OLAP (ROLAP), WHICH IS AN EXTENDED
RELATIONAL DATABASE MANAGEMENT SYSTEM. ROLAP
MAPS THE OPERATIONS ON MULTIDIMENSIONAL DATA TO
STANDARD RELATIONAL OPERATIONS.
• BY MULTIDIMENSIONAL OLAP (MOLAP), WHICH
DIRECTLY IMPLEMENTS MULTIDIMENSIONAL DATA AND
OPERATIONS.
BOTTOM TIER − THE BOTTOM TIER OF THE ARCHITECTURE IS THE
DATA WAREHOUSE DATABASE SERVER. IT IS USUALLY A RELATIONAL
DATABASE SYSTEM. BACK-END TOOLS AND UTILITIES
FEED DATA INTO THE BOTTOM TIER; THEY
PERFORM THE EXTRACT, CLEAN, LOAD, AND REFRESH
FUNCTIONS.
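THE ROLAP IDEA OF MAPPING MULTIDIMENSIONAL OPERATIONS TO STANDARD RELATIONAL OPERATIONS CAN BE SKETCHED WITH PYTHON'S BUILT-IN `sqlite3` MODULE: A ROLL-UP BECOMES A `GROUP BY` AND A SLICE BECOMES A `WHERE` CLAUSE. THE `sales` TABLE, ITS COLUMNS, THE SAMPLE ROWS, AND THE HELPER NAMES ARE ASSUMPTIONS FOR THIS EXAMPLE, AND SQLITE STANDS IN FOR THE RELATIONAL BOTTOM TIER.

```python
import sqlite3

# In-memory relational store standing in for the bottom tier (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, region TEXT, quarter TEXT, units INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Laptop", "North", "Q1", 10),
        ("Laptop", "South", "Q1", 5),
        ("Laptop", "North", "Q2", 7),
        ("Phone", "North", "Q1", 20),
        ("Phone", "South", "Q2", 12),
    ],
)

ALLOWED_DIMENSIONS = {"product", "region", "quarter"}


def roll_up_sql(connection, dimension):
    """Roll-up expressed as a relational GROUP BY on one dimension."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    # The column name is checked against a whitelist before interpolation.
    query = (
        f"SELECT {dimension}, SUM(units) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return connection.execute(query).fetchall()


def slice_sql(connection, quarter):
    """Slice expressed as a relational selection (WHERE) on one member."""
    query = (
        "SELECT product, region, units FROM sales "
        "WHERE quarter = ? ORDER BY product, region"
    )
    return connection.execute(query, (quarter,)).fetchall()


units_by_region = roll_up_sql(conn, "region")
q2_rows = slice_sql(conn, "Q2")
print(units_by_region)
print(q2_rows)
```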
4. MULTIDIMENSIONAL DATABASE MODEL
• A MULTIDIMENSIONAL DATABASE MODEL IS A TYPE OF DATABASE THAT IS
OPTIMIZED FOR DATA WAREHOUSING AND ONLINE ANALYTICAL PROCESSING
(OLAP) APPLICATIONS.
• MULTIDIMENSIONAL DATABASES ARE FREQUENTLY CREATED USING INPUT
FROM EXISTING RELATIONAL DATABASES. DATA CUBES GENERALIZE TABLES AND
SPREADSHEETS TO MORE DIMENSIONS, AND THE STAR, SNOWFLAKE, AND FACT
CONSTELLATION SCHEMAS ARE COMMON WAYS OF MODELING MULTIDIMENSIONAL DATABASES.
• IN A STAR SCHEMA, EACH DIMENSION IS REPRESENTED BY
A SINGLE DIMENSION TABLE THAT CONTAINS A SET OF ATTRIBUTES.
• THE FACT TABLE IS AT THE CENTER; IT CONTAINS KEYS TO EVERY DIMENSION TABLE
AND THE MEASURE ATTRIBUTES.
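A SMALL STAR SCHEMA CAN BE SKETCHED WITH PYTHON'S BUILT-IN `sqlite3` MODULE: ONE TABLE PER DIMENSION, AND A CENTRAL FACT TABLE HOLDING A KEY TO EACH DIMENSION PLUS A MEASURE. THE TABLE NAMES (`dim_product`, `dim_store`, `fact_sales`), THE COLUMNS, AND THE SAMPLE ROWS ARE ASSUMPTIONS MADE UP FOR THIS ILLUSTRATION.

```python
import sqlite3

star = sqlite3.connect(":memory:")
star.executescript(
    """
    -- One dimension table per dimension, each with its own attributes.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY, name TEXT, category TEXT
    );
    CREATE TABLE dim_store (
        store_id INTEGER PRIMARY KEY, city TEXT
    );
    -- Central fact table: a key to every dimension plus the measure.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        units      INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'Laptop', 'Computers'), (2, 'Phone', 'Mobile');
    INSERT INTO dim_store VALUES (1, 'Delhi'), (2, 'Mumbai');
    INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 20);
    """
)

# A typical star-join: follow the fact table's keys out to each dimension,
# then aggregate the measure by dimension attributes.
units_by_category_city = star.execute(
    """
    SELECT p.category, s.city, SUM(f.units)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_store   AS s ON s.store_id   = f.store_id
    GROUP BY p.category, s.city
    ORDER BY p.category, s.city
    """
).fetchall()

for row in units_by_category_city:
    print(row)
```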
5. CONCLUSIONS AND FUTURE OUTCOMES
• DATA CLEANING IS A PROBLEM THAT IS REMINISCENT OF HETEROGENEOUS DATA
INTEGRATION, A PROBLEM THAT HAS BEEN STUDIED FOR MANY YEARS. BUT HERE THE
EMPHASIS IS ON DATA INCONSISTENCIES INSTEAD OF SCHEMA INCONSISTENCIES.
• THE MANAGEMENT OF DATA WAREHOUSES ALSO PRESENTS NEW CHALLENGES.
DETECTING RUNAWAY QUERIES, AND MANAGING AND SCHEDULING RESOURCES ARE
PROBLEMS THAT ARE IMPORTANT BUT HAVE NOT BEEN WELL SOLVED.
• WE HAVE DESCRIBED THE SUBSTANTIAL TECHNICAL CHALLENGES IN DEVELOPING AND
DEPLOYING DECISION SUPPORT SYSTEMS. WHILE MANY COMMERCIAL PRODUCTS AND
SERVICES EXIST, THERE ARE STILL SEVERAL INTERESTING AVENUES FOR RESEARCH.
6. REFERENCES
• GRAY J. ET AL., “DATA CUBE: A RELATIONAL AGGREGATION OPERATOR
GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS,” DATA MINING
AND KNOWLEDGE DISCOVERY JOURNAL, VOL. 1, NO. 1, 1997.
• AGRAWAL S. ET AL., “ON THE COMPUTATION OF MULTIDIMENSIONAL
AGGREGATES,” PROC. OF VLDB CONF., 1996.