Data Warehousing (DAY 2) Siwawong W. Project Manager 2010.05.25
Agenda
09:00 – 09:15 Registration
09:15 – 09:30 Review 1st Day class
09:30 – 10:00 Data Warehouse: Data Warehousing
10:00 – 10:30 Data Warehouse: DW Structure/Modeling
10:30 – 10:45 Break & Morning Refreshment
10:45 – 12:00 DWs and Data Marts
12:00 – 13:00 Lunch Break
13:00 – 15:00 Query Processing
15:00 – 15:15 Break
15:15 – 16:00 OLAP & DSS
1st Day Review
Data Warehouse: Introduction Review
RDBMS & SQL: Review
Data Warehouse: Data Warehousing
Data Warehouse: Data Warehousing
Data Warehouse: Architecture
Data Warehouse Architecture [Diagram labels: Relational Databases, Legacy Data, Purchased Data, ERP Systems; Extraction, Cleansing; Optimized Loader; Data Warehouse Engine; Metadata Repository; Analyze, Query]
Components of the Warehouse
Loading the Warehouse: Cleaning the data before it is loaded
Source Data [Diagram labels: Operational/Source Data; Sequential, Legacy, Relational, External]
Data Quality – In Reality
Data Integration Across Sources [Diagram: sources Trust, Credit card, Savings, Loans; issues: same data, different name; different data, same name; data found here, nowhere else; different keys, same data]
Data Transformation Example: unifying source applications in the Data Warehouse
encoding: appl A - m,f; appl B - 1,0; appl C - x,y; appl D - male, female
unit: appl A - pipeline - cm; appl B - pipeline - in; appl C - pipeline - feet; appl D - pipeline - yds
field: appl A - balance; appl B - bal; appl C - currbal; appl D - balcurr
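The transformation example above can be sketched in a few lines of Python. This is only an illustration: the function name `to_warehouse`, the target codes `M`/`F`, the choice of centimetres as the warehouse unit, and the meaning of the codes `1,0` and `x,y` are assumptions, not something the slide states.

```python
# Per-application rules based on the slide's example values.
GENDER_CODES = {
    "A": {"m": "M", "f": "F"},
    "B": {"1": "M", "0": "F"},        # assumption: 1 = male, 0 = female
    "C": {"x": "M", "y": "F"},        # assumption: x = male, y = female
    "D": {"male": "M", "female": "F"},
}
# Conversion factors to centimetres for cm, in, feet, yds.
CM_PER_UNIT = {"A": 1.0, "B": 2.54, "C": 30.48, "D": 91.44}
# Each application stores the balance under a different field name.
BALANCE_FIELD = {"A": "balance", "B": "bal", "C": "currbal", "D": "balcurr"}


def to_warehouse(app, record):
    """Map one source record onto a single warehouse representation."""
    return {
        "gender": GENDER_CODES[app][str(record["gender"]).lower()],
        "pipeline_cm": round(record["pipeline"] * CM_PER_UNIT[app], 2),
        "balance": record[BALANCE_FIELD[app]],
    }


row_b = to_warehouse("B", {"gender": 1, "pipeline": 10, "bal": 250.0})
row_d = to_warehouse("D", {"gender": "Female", "pipeline": 2, "balcurr": 10})
```

Keeping the rules in lookup tables rather than in branching code makes it easy to add a fifth source application without touching the function.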
Data Integrity Problems
Data Transformation Terms
Data Transformation Terms
Data Transformation Terms
Loads
Load Techniques
Load Taxonomy
Refresh
When to Refresh?
Refresh Techniques
How To Detect Changes
Data Extraction and Cleansing
Scrubbing Data
Scrubbing Tools
Structuring/Modeling Issues
Data: Heart of the Data Warehouse
Data Warehouse Structure
Data Warehouse Structure: Time is part of the key of each table
Data Granularity in Warehouse
Granularity in Warehouse
Granularity in Warehouse
Vertical Partitioning
Original table: Acct. No, Name, Balance, Date Opened, Interest Rate, Address
Frequently accessed: Acct. No, Balance (smaller table and so less I/O)
Rarely accessed: Acct. No, Name, Date Opened, Interest Rate, Address
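A minimal Python sketch of the vertical split follows. The column identifiers, the sample accounts, and the helper names `split_vertically` and `rejoin` are made up for illustration; a real warehouse would do this with two tables sharing the account number as key.

```python
HOT_COLUMNS = ("acct_no", "balance")  # frequently accessed
COLD_COLUMNS = ("acct_no", "name", "date_opened", "interest_rate", "address")


def split_vertically(rows):
    """Split each row into a narrow 'hot' part and a wide 'cold' part."""
    hot = [{c: r[c] for c in HOT_COLUMNS} for r in rows]
    cold = [{c: r[c] for c in COLD_COLUMNS} for r in rows]
    return hot, cold


def rejoin(hot, cold):
    """Rebuild the full rows by joining both parts on the account number."""
    cold_by_key = {r["acct_no"]: r for r in cold}
    return [{**cold_by_key[h["acct_no"]], **h} for h in hot]


# Invented sample data.
accounts = [
    {"acct_no": 1, "name": "Ann", "balance": 120.0,
     "date_opened": "2009-01-05", "interest_rate": 1.5, "address": "Bangkok"},
    {"acct_no": 2, "name": "Bob", "balance": 80.5,
     "date_opened": "2010-03-12", "interest_rate": 1.25, "address": "Chiang Mai"},
]

hot, cold = split_vertically(accounts)
# A balance query only needs to scan the narrow table.
total_balance = sum(r["balance"] for r in hot)
```

The account number is repeated in both parts so that the original row can still be reassembled when a rarely used column is needed.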
Derived Data
Schema Design
D/W Schema
Dimension Tables
Fact Table
Star Schema [Diagram: central fact table (date, custno, prodno, cityname, ...) surrounded by the tables time, prod, cust, city]
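A star schema like the one in the diagram can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `sales` measure, the `region` attribute on `city`, the name columns, and all sample rows are invented, and the time dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cust (custno INTEGER PRIMARY KEY, custname TEXT);
CREATE TABLE prod (prodno INTEGER PRIMARY KEY, prodname TEXT);
CREATE TABLE city (cityname TEXT PRIMARY KEY, region TEXT);
-- The fact table holds one foreign key per dimension plus the measure.
CREATE TABLE fact (date TEXT, custno INTEGER, prodno INTEGER,
                   cityname TEXT, sales REAL);

INSERT INTO cust VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO prod VALUES (10, 'Cola'), (20, 'Milk');
INSERT INTO city VALUES ('Bangkok', 'Central'), ('Chiang Mai', 'North');
INSERT INTO fact VALUES
    ('2010-05-24', 1, 10, 'Bangkok', 30.0),
    ('2010-05-24', 2, 10, 'Chiang Mai', 12.0),
    ('2010-05-25', 1, 20, 'Bangkok', 10.0),
    ('2010-05-25', 2, 10, 'Bangkok', 5.0);
""")

# A typical star query: join the fact table to the dimensions it needs,
# then group by dimension attributes.
rows = conn.execute("""
    SELECT p.prodname, c.region, SUM(f.sales)
    FROM fact AS f
    JOIN prod AS p ON p.prodno = f.prodno
    JOIN city AS c ON c.cityname = f.cityname
    GROUP BY p.prodname, c.region
    ORDER BY p.prodname, c.region
""").fetchall()
```

Every dimension is one join away from the fact table, which is what gives the schema its star shape.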
Snowflake schema [Diagram: central fact table (date, custno, prodno, cityname, ...) with the tables time, prod, cust, city, and an additional region table]
Fact Constellation [Diagram labels: Hotels, Travel Agents, Promotion, Room Type, Customer, Booking, Checkout]
De-normalization
Creating Arrays
Selective Redundancy
Partitioning
Why Partition?
Criterion for Partitioning
Where to Partition?
Data Warehouse vs. Data Marts: What comes first?
What is Data Mart?
From the Data Warehouse to Data Marts [Diagram labels: Data Warehouse (Organizationally Structured; Normalized, Detailed; More History); Departmentally Structured; Individually Structured (Less History); Data to Information]
Reasons for creating a data mart
Data Warehouse and Data Marts [Diagram labels: OLAP; Data Mart (Lightly summarized, Departmentally structured); Data Warehouse (Organizationally structured, Atomic, Detailed Data Warehouse Data)]
Characteristics of the Departmental Data Mart
Techniques for Creating Departmental Data Mart [Diagram labels: Sales, Mktg., Finance]
Data Mart Centric [Diagram labels: Data Sources, Data Marts, Data Warehouse]
Problems with Data Mart Centric Solution: If you end up creating multiple warehouses, integrating them is a problem.
True Warehouse [Diagram labels: Data Sources, Data Warehouse, Data Marts]
Data Marts Issues
Data Marts Issues (Cont'd)
Query Processing
Query Processing
Indexing Techniques
Bitmap index
Bitmap Index
Query: select * from customer where gender = 'F' and vote = 'Y'
Customer table, rows 1 to 6:
gender:     M F F F F M
vote:       Y Y Y N N N
gender (f): 0 1 1 1 1 0
vote (y):   1 1 1 0 0 0
result:     0 1 1 0 0 0
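The bitmap lookup for this query can be sketched in Python, using one integer per bitmap so that the AND of the two predicates is a single bitwise operation. The helper names are made up, and row ids are 0-based in this sketch.

```python
# Column values from the slide's customer table, one entry per row.
gender = ["M", "F", "F", "F", "F", "M"]
vote = ["Y", "Y", "Y", "N", "N", "N"]


def build_bitmap(column, value):
    """Return an int whose bit i is set when row i holds the given value."""
    bits = 0
    for row_id, cell in enumerate(column):
        if cell == value:
            bits |= 1 << row_id
    return bits


def matching_rows(bits):
    """List the row ids whose bit is set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


gender_f = build_bitmap(gender, "F")
vote_y = build_bitmap(vote, "Y")

# gender = 'F' AND vote = 'Y' becomes a bitwise AND of the two bitmaps.
result = gender_f & vote_y
rows = matching_rows(result)
```

Only the rows listed in `rows` need to be fetched from the table; the predicate itself is evaluated without touching the row data.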
Join Indexes
Star Join Processing [Diagram: the Calls fact table joined step by step with Time, Location, and Plan: C+T, C+T+L, C+T+L+P]
Intelligent Scan
Parallel Query Processing
Pre-computed Aggregates
Pre-computed Aggregates
SQL Extensions
IBM Red Brick Data Warehouse
On-Line Analytical Processing (OLAP)
Limitations of SQL
Typical OLAP Queries
What Is OLAP? (Reference: http://www.arborsoft.com/essbase/wht_ppr/coddTOC.html)
The OLAP Market
Strengths of OLAP
OLAP Is FASMI (Nigel Pendse, Richard Creath - The OLAP Report)
Multi-dimensional Data
Dimensions: Product, Region, Time
Hierarchical summarization paths:
Product: Industry > Category > Product
Region: Country > Region > City > Office
Time: Year > Quarter > Month / Week > Day
[Cube diagram: Product (Toothpaste, Juice, Cola, Milk, Cream, Soap) by Region (W, S, N) by Month (1 to 7)]
Data Cube Lattice
A Visual Operation: Pivot (Rotate) [Cube diagram: axes Month (3/1, 3/2, 3/3, 3/4), Region (NY, LA, SF), Product (Juice, Cola, Milk, Cream); sample cell values 10, 47, 30, 12]
“Slicing and Dicing” [Cube diagram: Product (Household, Telecomm, Video, Audio) by Sales Channel (Retail, Direct, Special) by Regions (India, Far East, Europe); highlighted: the Telecomm slice]
“Slicing and Dicing” (From: http://www.executionmih.com/data-analysis/bi-slice-dice.php)
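Slicing and dicing can be illustrated with a tiny cube held in a Python dictionary. The axis values come from the slide's diagram, but the sales figures and the function names `slice_cube` and `dice_cube` are invented for this sketch.

```python
AXES = ("product", "channel", "region")

# Cube cells keyed by (product, channel, region); the numbers are made up.
cube = {
    ("Telecomm", "Retail", "India"): 40,
    ("Telecomm", "Direct", "Europe"): 25,
    ("Video", "Retail", "India"): 10,
    ("Audio", "Special", "Far East"): 5,
}


def slice_cube(cube, axis, value):
    """Fix one axis to a single value and drop that axis from the keys."""
    i = AXES.index(axis)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


def dice_cube(cube, **allowed):
    """Keep only cells whose coordinates fall in the allowed value sets."""
    def keep(key):
        return all(key[AXES.index(a)] in vals for a, vals in allowed.items())
    return {k: v for k, v in cube.items() if keep(k)}


# "The Telecomm slice": a two-dimensional view over channel and region.
telecomm = slice_cube(cube, "product", "Telecomm")
# A dice: a smaller sub-cube that still has all three dimensions.
retail_india = dice_cube(cube, channel={"Retail"}, region={"India"})
```

A slice reduces the number of dimensions by one, while a dice keeps the dimensionality and only narrows the ranges.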
Roll-up and Drill Down [Diagram: Roll Up moves toward a higher level of aggregation; Drill-Down moves toward low-level details]
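Roll-up and drill-down along a City to Region hierarchy might look like this in Python. The city-to-region mapping, the sample facts, and the function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical hierarchy: each city belongs to one region.
CITY_TO_REGION = {"NY": "East", "LA": "West", "SF": "West"}

# Low-level facts as (city, product, amount); the values are illustrative.
facts = [
    ("NY", "Juice", 10),
    ("LA", "Juice", 47),
    ("SF", "Juice", 30),
    ("NY", "Cola", 12),
]


def roll_up(facts, parent_of):
    """Aggregate city-level facts to the parent (region) level."""
    totals = defaultdict(int)
    for city, product, amount in facts:
        totals[(parent_of[city], product)] += amount
    return dict(totals)


def drill_down(facts, parent_of, region, product):
    """Return the city-level details behind one aggregated cell."""
    return [(c, p, a) for c, p, a in facts
            if parent_of[c] == region and p == product]


by_region = roll_up(facts, CITY_TO_REGION)
west_juice = drill_down(facts, CITY_TO_REGION, "West", "Juice")
```

Roll-up discards detail by summing over the children of each hierarchy node; drill-down goes the other way by returning to the rows that produced a total.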
Nature of OLAP Analysis
Organizationally Structured Data [Diagram labels: marketing, manufacturing, sales, finance]
Multidimensional Spreadsheets
OLAP - Data Cube
Relational OLAP: ROLAP
Relational OLAP: 3 Tier DSS
Database Layer (Data Warehouse): Store atomic data in industry standard RDBMS.
Application Logic Layer (ROLAP Engine): Generate SQL execution plans in the ROLAP engine to obtain OLAP functionality.
Presentation Layer (Decision Support Client): Obtain multi-dimensional reports from the DSS Client.
Advantage/Disadvantage: ROLAP
Multidimensional OLAP: MOLAP
MD-OLAP: 2 Tier DSS
Database Layer and Application Logic Layer (MDDB Engine): Store atomic data in a proprietary data structure (MDDB), pre-calculate as many outcomes as possible, obtain OLAP functionality via proprietary algorithms running against this data.
Presentation Layer (Decision Support Client): Obtain multi-dimensional reports from the DSS Client.
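The idea of pre-calculating as many outcomes as possible can be sketched as materializing every group-by over the dimensions, which gives 2^n aggregates for n dimensions. This is a plain-Python illustration with invented data, not a description of how any particular MDDB engine stores its cube.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "region", "month")

# Atomic facts; the values are made up for the example.
facts = [
    {"product": "Juice", "region": "N", "month": 1, "sales": 10},
    {"product": "Juice", "region": "S", "month": 1, "sales": 20},
    {"product": "Cola", "region": "N", "month": 2, "sales": 5},
]


def precompute(facts):
    """Aggregate sales for every subset of the dimensions."""
    cube = {}
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, size):
            totals = defaultdict(int)
            for row in facts:
                totals[tuple(row[d] for d in dims)] += row["sales"]
            cube[dims] = dict(totals)
    return cube


cube = precompute(facts)
# Queries become dictionary lookups instead of scans over the facts.
grand_total = cube[()][()]
juice_total = cube[("product",)][("Juice",)]
```

The trade-off is the one the slide hints at: lookups are fast, but the number of stored aggregates grows exponentially with the number of dimensions.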
Advantage/Disadvantage: MOLAP
Metadata Repository
Metadata Repository (Cont'd)
Recipe for a Successful Warehouse
For a Successful Warehouse (From Larry Greenfield)
For a Successful Warehouse (Cont'd)
Data Warehouse Pitfalls
Data Warehouse Pitfalls (Cont'd)
DW and OLAP Research Issues
DW and OLAP Research Issues (Cont'd)
References/External Links
(1) Data Warehousing & Data Mining, S. Sudarshan and Krithi Ramamritham, IIT Bombay
(2) Data Warehousing, Hu Yan, e-mail: [email_address]
(3) Data Warehousing Concept: MOLAP, ROLAP and HOLAP, http://www.1keydata.com/datawarehousing/molap-rolap.html
Thank you for your attention! [email_address] www.blueballgroup.com
