Why a Data Warehouse
Concepts in Design
Fred A Kilby, MBA
fred.kilby@pmbsa.com
Copyright © 2009 Fred A. Kilby. All rights reserved
What is a Data Warehouse?
The conglomeration of an organization’s data
warehouse staging and presentation areas,
where operational data is specifically
structured for query and analysis
performance and ease-of-use.
Ralph Kimball (2002), The Data Warehouse Toolkit.
Now in English
A data warehouse is a database organized to allow
fast queries of information.
It brings together data from the organization's
different database systems into a single view.
So what’s the difference?
Transactional Sources
• Centers around transactions
• 2-dimensional reports
– Age by System
• Individual data
• Slow
• “Cut-n-paste” into other applications
Data Warehouse
• Centers around business facts
• Multi-dimensional reports
– Age by Race by Program
• Aggregated data
• Fast
• 3rd-party reporting tools can be used
Measures Facts, not Activities
Facts are business performance measurements
– Meals provided
– Dollars expended
– Hours worked
Facts are numerical and additive
– Sum of dollars spent
– Count of clients served
Facts are stored to represent a measurement at a
particular “grain”
What is a Grain?
A grain is the level of detail at which a business
measurement is stored
Different businesses have different fact needs
– A Social Services grain
• The number of food stamp dollars given to a case each month
– In-Home Support Services grain
• The number of hours of service a client received in a
provider’s pay period
• The number of dollars paid to a provider for a client during a
pay period
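As a rough illustration of grain, the sketch below stores food stamp dollars at a "one row per case per month" grain and rolls them up to a yearly total. The class name, field names, and sample values are made up for the example; they do not come from any real system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodStampFact:
    """One row per case per month: the grain of this fact table."""
    case_id: str   # hypothetical case identifier
    month: str     # "YYYY-MM"
    dollars: int   # additive measure


# Made-up sample rows at the case-per-month grain.
facts = [
    FoodStampFact("C1", "2003-01", 200),
    FoodStampFact("C1", "2003-02", 210),
    FoodStampFact("C2", "2003-01", 150),
    FoodStampFact("C2", "2003-02", 150),
]


def dollars_by_year(rows):
    """Roll the monthly grain up to a yearly total per case."""
    totals = defaultdict(int)
    for row in rows:
        year = row.month[:4]
        totals[(row.case_id, year)] += row.dollars
    return dict(totals)


yearly = dollars_by_year(facts)
print(yearly)
```

Because the fact is additive, a finer grain can always be summed up to a coarser one; the reverse is not possible, which is why the grain has to be chosen carefully up front.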
What is a Dimension?
A dimension is a textual description that
relates to a fact, for example:
– Ethnicity (White, Black, Japanese)
– Language (English, Spanish, Tagalog)
– Gender (Male, Female)
– Date (05/31/2003, 04/15/2003)
– Location (California, Arizona, New Mexico)
Used in Queries
Dimensions are used to restrict and frame queries on
facts, for example:
“Give me a count of all Spanish-speaking white males in
California”
• The fact is the count (a number)
• The dimensions are:
– Spanish (language),
– white (race),
– male (gender),
– and California (location)
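The query above can be sketched in a few lines of code. This is only an illustration with made-up client rows and a hypothetical helper, `count_where`; a real warehouse would run the equivalent filter through its query or reporting tool.

```python
# Each row is one client; the dimension values are descriptive text.
clients = [
    {"language": "Spanish", "race": "White", "gender": "Male", "location": "California"},
    {"language": "Spanish", "race": "White", "gender": "Female", "location": "California"},
    {"language": "English", "race": "White", "gender": "Male", "location": "California"},
    {"language": "Spanish", "race": "White", "gender": "Male", "location": "Arizona"},
    {"language": "Spanish", "race": "White", "gender": "Male", "location": "California"},
]


def count_where(rows, **criteria):
    """Count rows whose dimension values match every given criterion."""
    return sum(
        1 for row in rows
        if all(row[dim] == value for dim, value in criteria.items())
    )


# The fact is the count; the keyword arguments are the dimensions.
spanish_white_males_in_ca = count_where(
    clients, language="Spanish", race="White", gender="Male", location="California"
)
print(spanish_white_males_in_ca)  # 2 for the sample rows above
```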
Identifying Facts and Dimensions
Dimensions
– By Aid Type
– By Program
– By Month
– For (By) Year
Facts
– Count of Cases
– Sum of Aid Payments
– Average per Case
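To show how those "math" words and "grouping" words fit together, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and rows are assumptions made up for the example: the facts become aggregate functions and the dimensions become the GROUP BY columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aid_payment ("
    " case_id TEXT, program TEXT, aid_type TEXT, month TEXT, amount REAL)"
)
# Made-up payments; program and aid type names are placeholders.
conn.executemany(
    "INSERT INTO aid_payment VALUES (?, ?, ?, ?, ?)",
    [
        ("C1", "Program A", "Type 1", "2003-01", 200.0),
        ("C2", "Program A", "Type 1", "2003-01", 100.0),
        ("C3", "Program B", "Type 2", "2003-01", 300.0),
        ("C1", "Program A", "Type 1", "2003-02", 250.0),
    ],
)

# Facts: count of cases, sum of payments, average per case.
# Dimensions: by program, by aid type, by month.
report = conn.execute(
    "SELECT program, aid_type, month,"
    "       COUNT(DISTINCT case_id) AS case_count,"
    "       SUM(amount) AS total_paid,"
    "       SUM(amount) / COUNT(DISTINCT case_id) AS avg_per_case"
    "  FROM aid_payment"
    " GROUP BY program, aid_type, month"
    " ORDER BY program, aid_type, month"
).fetchall()

for row in report:
    print(row)
```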
What makes a Data Warehouse?
Cubes Answer Business Questions
How many Spanish-speaking clients did H&HS
serve in each department for each of the past 3
years?
Which cities currently have the highest concentration
of Asian clients? What has the trend been?
How many people who receive Medi-Cal received a
service in 2003 from health services, by service?
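One way to picture a reporting cube is as a count that can be taken over any chosen combination of dimensions. The sketch below is an assumption-laden simplification: the rows and the `cube` helper are made up, and it counts rows rather than unique clients (it assumes one row per client per program per year). Adding "program" to the dimension list is the equivalent of drilling down from department to program.

```python
from collections import Counter

# Made-up service records.
services = [
    {"department": "Social Services", "program": "Food Stamps", "year": 2002, "language": "Spanish"},
    {"department": "Social Services", "program": "Food Stamps", "year": 2003, "language": "Spanish"},
    {"department": "Social Services", "program": "In-Home Support", "year": 2003, "language": "Spanish"},
    {"department": "Health Services", "program": "Clinic", "year": 2003, "language": "Spanish"},
    {"department": "Health Services", "program": "Clinic", "year": 2003, "language": "English"},
]


def cube(rows, dimensions):
    """Count rows for every combination of the chosen dimensions."""
    return Counter(tuple(row[d] for d in dimensions) for row in rows)


# Restrict on one dimension, then frame the counts by two others.
spanish = [row for row in services if row["language"] == "Spanish"]
by_dept_year = cube(spanish, ["department", "year"])

# Drill down: same question, one level of detail deeper.
by_dept_program_year = cube(spanish, ["department", "program", "year"])

print(by_dept_year)
print(by_dept_program_year)
```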
Reporting Cubes
Drill Down Capable
Multi-Dimensional
Visual Graphs
Where do we start?
• Choose the systems to include
• Identify the exact grain of the business
process
• Identify the dimensions available for use
with each fact table row
• Choose the numeric facts of what is being
measured
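The four steps above might translate into table definitions along these lines. This is a hedged sketch, not a recommended design: it uses Python's built-in sqlite3 module, and every table and column name (`dim_program`, `dim_month`, `fact_aid_payment`, and so on) is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimensions: descriptive text used to restrict and group queries.
    CREATE TABLE dim_program (
        program_key  INTEGER PRIMARY KEY,
        program_name TEXT NOT NULL
    );
    CREATE TABLE dim_month (
        month_key INTEGER PRIMARY KEY,
        year      INTEGER NOT NULL,
        month     INTEGER NOT NULL
    );

    -- Fact table. Grain: one row per case per program per month.
    CREATE TABLE fact_aid_payment (
        case_id     TEXT    NOT NULL,
        program_key INTEGER NOT NULL REFERENCES dim_program (program_key),
        month_key   INTEGER NOT NULL REFERENCES dim_month (month_key),
        aid_dollars REAL    NOT NULL,  -- the numeric, additive fact
        PRIMARY KEY (case_id, program_key, month_key)
    );
    """
)

# List the tables that were created.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

The primary key on the fact table spells out the grain: there can be only one row for a given case, program, and month.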
Key to Success
To ensure success, end-user involvement is
required:
Data warehouse success is tied directly to
user acceptance. If the users haven’t
accepted the data warehouse … then your
efforts have been exercises in futility. (Kimball,
2002)


Editor's Notes

  1. What are some business facts that you need, or would like, to be able to report on?
  2. Here is an example of how to identify Facts and Dimensions on an existing report. The Facts are “math” words: Count of Cases, Sum of Aid Payments, Average of Pay per Case. The Dimensions are “grouping” words: (By) Program, (By) Aid Type, (By) Calendar Month, (For) Fiscal Year.
  3. We start with data from operational sources. We move this data into a staging area where business rules are applied. Code values are translated to a common set (e.g., M vs. male). Formats are changed to fit a standard (e.g., 5.1 vs. 5.1000). These rules make the data from different sources comparable (apples to apples). Once the data is made “standard,” it is loaded into the warehouse’s fact and dimension tables. We create the reporting cubes. Users access the cubes to analyze and report on the data.
  4. The data warehouse exists to help you answer business questions. Reporting Cubes are the tool that helps you answer them.
  5. Now I’m going to show some examples of how this comes together to help you in reporting and analyzing data. While going through these, think of what YOU would like to see. In this report, the facts are Client counts; the dimensions are By Department, By Gender, and By Active Year.
  6. Here we see another report. Again the fact is a count of Clients; the dimensions are By Race Group, By Department, and For Active Year.
  7. These cubes allow drilling down into a greater level of detail. From the previous report we have “drilled” into the Social Services Division, “down” to the program level. Can you tell what the dimensions are here? By Race, By Department, By Program, By Active Year.
  8. We say that these cubes are multi-dimensional. This report shows that we can combine dimensions to find even more interesting information. Notice that the Fact is Unique Client; the Dimensions are By Race Group, By Gender, By Marital Status, and By Department; and the Filter, or Selection, is For Active Year.
  9. Depending on the reporting tool, these reports can easily be converted into visual graphs. Here we see the prior grid report in a graph format. This allows the user to quickly notice interesting information.
  10. Now that the value of the data warehouse can be seen, how do we begin?
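
The staging rules described in note 3 (translating code values to a common set and changing formats to fit a standard) can be sketched as follows. The code mapping, field names, and two-decimal standard are assumptions chosen for illustration only.

```python
from decimal import Decimal

# Hypothetical mapping from source-specific gender codes to a common set.
GENDER_CODES = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}


def standardize(record):
    """Apply staging rules so rows from different sources are comparable."""
    # Rule 1: translate code values to a common set.
    gender = GENDER_CODES.get(record["gender"].strip().lower(), "Unknown")
    # Rule 2: fit numbers to one standard format (two decimal places here).
    hours = Decimal(record["hours"]).quantize(Decimal("0.01"))
    return {"gender": gender, "hours": hours}


# The same information as two different source systems might store it.
source_a = {"gender": "M", "hours": "5.1"}
source_b = {"gender": "male", "hours": "5.1000"}

print(standardize(source_a))
print(standardize(source_a) == standardize(source_b))  # True: apples to apples
```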