Introduction to Data Warehousing


            By
                 MAHESH.AMPOLU
Necessity is the mother of invention




            Why Data Warehouse?
Scenario

Unilever is a company with branches in the UK,
India, America and Japan. The Sales Manager
wants a quarterly sales report. Each branch has a
separate operational system.
Scenario 1 : Unilever company.
[Diagram: the India, UK, America and Japan branch systems on one side; the Sales Manager on the other, asking for sales per item type per branch for the first quarter.]
Solution : Unilever company.

• Extract sales information from each database.
• Store the information in a common repository at a
  single site.
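The two steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the branch databases are in-memory stand-ins, and the table name, column names and unit counts are assumptions, not part of the slides.

```python
import sqlite3

# Hypothetical branch data: each branch keeps its own operational database.
BRANCH_ROWS = {
    "India": [("Soaps", 10), ("Paste", 6)],
    "UK": [("Soaps", 7), ("Paste", 8)],
}

def make_branch_db(rows):
    """Create an in-memory stand-in for one branch's operational system."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (item_type TEXT, units INTEGER)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return db

def consolidate(branch_dbs):
    """Extract sales from every branch and store them in one repository."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales (branch TEXT, item_type TEXT, units INTEGER)"
    )
    for branch, db in branch_dbs.items():
        rows = db.execute("SELECT item_type, units FROM sales").fetchall()
        warehouse.executemany(
            "INSERT INTO sales VALUES (?, ?, ?)",
            [(branch, item, units) for item, units in rows],
        )
    return warehouse

branch_dbs = {name: make_branch_db(rows) for name, rows in BRANCH_ROWS.items()}
warehouse = consolidate(branch_dbs)

# One query against the common repository now answers the manager's question.
report = warehouse.execute(
    "SELECT branch, item_type, SUM(units) FROM sales "
    "GROUP BY branch, item_type ORDER BY branch, item_type"
).fetchall()
```

Once the rows sit in one place, the report is a single GROUP BY instead of one query per branch system.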
Solution : Unilever company.
[Diagram: data from the India, UK, America and Japan branches flows into a single Data Warehouse; query & analysis tools produce the report for the Sales Manager.]
Scenario :

Hindustan Unilever is a small, new company.
The President of the company wants the company to
grow. He needs information so that he can make
correct decisions.
Solution :
• Improve the quality of data before loading it
  into the warehouse.
• Perform data cleaning and transformation
  before loading the data.
• Use query and analysis tools to support ad hoc
  queries.
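A minimal sketch of cleaning before loading, assuming two common problems: inconsistent product spellings and missing unit counts. The sample rows and the fill-with-mean rule are assumptions chosen for illustration; the slides do not prescribe a specific cleaning method.

```python
from statistics import mean

# Hypothetical raw rows as they might arrive from a source system.
raw_rows = [
    {"product": " soaps ", "units": 3},
    {"product": "Paste", "units": None},  # missing value
    {"product": "PASTE", "units": 5},
    {"product": "Soaps", "units": 7},
]

def clean(rows):
    """Normalise product names and fill missing unit counts with the mean."""
    known = [r["units"] for r in rows if r["units"] is not None]
    fill = round(mean(known))
    cleaned = []
    for r in rows:
        cleaned.append({
            # Strip stray spaces and unify capitalisation.
            "product": r["product"].strip().title(),
            # Replace a missing measure with the mean of the known values.
            "units": r["units"] if r["units"] is not None else fill,
        })
    return cleaned

cleaned_rows = clean(raw_rows)
```

After cleaning, " soaps " and "Soaps" count as the same product, so ad hoc queries that group by product no longer split one product across several spellings.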
Solution
[Diagram: the Data Warehouse feeds a query and analysis tool, which shows the President sales over time to support decisions on expansion and improvement.]
What is a Data Warehouse?
Inmon’s definition :
 A data warehouse is a
       - subject-oriented,
       - integrated,
       - time-variant,
       - nonvolatile
collection of data in support of management’s
decision-making process.
Subject-oriented
• A data warehouse is organized around subjects such
  as sales, product and customer.
• It focuses on modeling and analysis of data for
  decision makers.
• It excludes data that is not useful in the decision support
  process.
Integration
• A data warehouse is constructed by integrating
  multiple heterogeneous sources.
• Data preprocessing is applied to ensure
  consistency.
[Diagram: RDBMS, legacy system and flat file sources pass through data processing and data transformation into the Data Warehouse.]
Integration
• In terms of data:
  – encoding structures
  – measurement of attributes
  – physical attributes of data
  – naming conventions
  – data type formats
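The kinds of mismatch listed above can be illustrated with a small mapping step. Everything in this sketch is hypothetical: the gender codes, the length units, the field names and the two sample records are invented to show one source-specific convention of each kind being mapped onto a single warehouse convention.

```python
# Hypothetical source conventions; none of these come from the slides.
GENDER_CODES = {"m": "M", "male": "M", "1": "M",
                "f": "F", "female": "F", "0": "F"}      # encoding structures
TO_CM = {"cm": 1.0, "inch": 2.54}                       # measurement of attributes
FIELD_NAMES = {"cust_nm": "customer_name",              # naming conventions
               "CustomerName": "customer_name",
               "sex": "gender", "gender": "gender"}

def integrate(record, length_unit):
    """Map one source record onto the warehouse's conventions."""
    out = {}
    for key, value in record.items():
        name = FIELD_NAMES.get(key, key)
        if name == "gender":
            value = GENDER_CODES[str(value).lower()]
        elif name == "height":
            # Data type format: accept text or numbers, always store a float in cm.
            value = round(float(value) * TO_CM[length_unit], 1)
        out[name] = value
    return out

source_a = integrate({"cust_nm": "Asha", "sex": "f", "height": "165"}, "cm")
source_b = integrate({"CustomerName": "Ben", "gender": 1, "height": 70}, "inch")
```

Both records end up with the same field names, the same gender codes and heights in centimetres, which is what lets them sit in one table.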
Time-variant
• Provides information from a historical perspective,
  e.g. the past 5-10 years.
• Every key structure contains, either implicitly or
  explicitly, an element of time.
Nonvolatile
• Data once recorded cannot be updated.
• A data warehouse requires two operations in data
  accessing:
   – Initial loading of data
   – Access of data
[Diagram: a "load" arrow into the warehouse and an "access" arrow out of it.]
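One way to picture the load/access restriction is a store that exposes only those two operations. The class and method names below are hypothetical, and a real warehouse would enforce this through its loading process and permissions rather than a Python class.

```python
class NonvolatileStore:
    """Append-only store: rows can be loaded and read, never updated."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Store immutable copies so callers cannot change loaded data later.
        self._rows.extend(tuple(r) for r in rows)

    def access(self):
        # Return a tuple so the caller cannot modify the stored rows.
        return tuple(self._rows)

store = NonvolatileStore()
store.load([("India", "Soaps", 3)])
store.load([("UK", "Soaps", 3)])
snapshot = store.access()
```

There is deliberately no update or delete method: new facts arrive as additional loads, and earlier rows stay as they were recorded.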
Data Warehousing Architecture
Data Warehouse Architecture
• Data warehouse server
  – almost always a relational DBMS, rarely flat
     files
• OLAP servers
  – support and operate on multi-dimensional
     data structures
• Clients
  – Query and reporting tools
  – Analysis tools
  – Data mining tools
OLTP vs OLAP
Data Warehouse Schema
• Star Schema
• Fact Constellation Schema
• Snowflake Schema
Star Schema
• A star schema consists of at least one
  fact table and a number of dimension
  tables.
• The star schema is a highly recommended
  schema for SSAS cubes.
Star Schema
Store Dimension     Fact Table      Time Dimension     Product Dimension
 Store Key           Store Key       Period Key         Product Key
 Store Name          Product Key     Year               Product Desc
 City                Period Key      Quarter
 State               Units           Month
 Region              Price

Benefits: Easy to understand, easy to define hierarchies, reduces the
number of physical joins.
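The star layout above can be sketched in SQLite through Python's sqlite3 module. The table and column names follow the diagram; the sample rows (store names, cities, prices) are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, store_name TEXT,
                          city TEXT, state TEXT, region TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_desc TEXT);
CREATE TABLE time_dim    (period_key INTEGER PRIMARY KEY, year INTEGER,
                          quarter INTEGER, month INTEGER);
CREATE TABLE sales_fact  (store_key   INTEGER REFERENCES store_dim,
                          product_key INTEGER REFERENCES product_dim,
                          period_key  INTEGER REFERENCES time_dim,
                          units INTEGER, price REAL);
""")

# Hypothetical sample rows.
db.execute("INSERT INTO store_dim VALUES (1, 'Store A', 'Pune', 'Maharashtra', 'West')")
db.execute("INSERT INTO store_dim VALUES (2, 'Store B', 'Delhi', 'Delhi', 'North')")
db.execute("INSERT INTO product_dim VALUES (10, 'Soaps')")
db.execute("INSERT INTO time_dim VALUES (100, 1998, 1, 1)")
db.execute("INSERT INTO time_dim VALUES (101, 1998, 1, 2)")
db.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
    [(1, 10, 100, 3, 2.5), (1, 10, 101, 4, 2.5), (2, 10, 100, 5, 2.5)],
)

# Every dimension is one join away from the fact table.
units_by_region = db.execute("""
    SELECT s.region, t.quarter, SUM(f.units)
    FROM sales_fact f
    JOIN store_dim s ON s.store_key = f.store_key
    JOIN time_dim  t ON t.period_key = f.period_key
    GROUP BY s.region, t.quarter
    ORDER BY s.region
""").fetchall()
```

Because region sits directly in the store dimension, a report by region needs only one join per dimension used.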
SnowFlake Schema
• Variant of the star schema model.
• A single, large, central fact table and one
  or more tables for each dimension.
• Dimension tables are normalized, i.e. dimension
  table data is split into additional tables.
SnowFlake Schema
Store Dimension     City Dimension     Fact Table      Time Dimension     Product Dimension
 Store Key           City Key           Store Key       Period Key         Product Key
 Store Name          City               Product Key     Year               Product Desc
 City Key            State              Period Key      Quarter
                     Region             Units           Month
                                        Price

Drawbacks: Time-consuming joins, slow report generation.
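To see where the extra joins come from, here is a reduced sketch of the snowflaked store dimension in SQLite. Only the store, city and fact tables are modelled, and the rows are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE city_dim   (city_key INTEGER PRIMARY KEY, city TEXT,
                         state TEXT, region TEXT);
CREATE TABLE store_dim  (store_key INTEGER PRIMARY KEY, store_name TEXT,
                         city_key INTEGER REFERENCES city_dim);
CREATE TABLE sales_fact (store_key INTEGER REFERENCES store_dim,
                         units INTEGER);
-- Hypothetical sample rows.
INSERT INTO city_dim   VALUES (1, 'Pune', 'Maharashtra', 'West');
INSERT INTO store_dim  VALUES (1, 'Store A', 1), (2, 'Store B', 1);
INSERT INTO sales_fact VALUES (1, 3), (2, 4);
""")

# Region now lives in city_dim, so reaching it takes two joins instead of one.
units_by_region = db.execute("""
    SELECT c.region, SUM(f.units)
    FROM sales_fact f
    JOIN store_dim s ON s.store_key = f.store_key
    JOIN city_dim  c ON c.city_key  = s.city_key
    GROUP BY c.region
""").fetchall()
```

Normalizing removes the repeated city, state and region values from each store row, at the price of a longer join path for every query that needs them.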
Fact Constellation
• Multiple fact tables share dimension tables.
• This schema is viewed as a collection of stars,
  hence it is called a galaxy schema or fact
  constellation.
• Sophisticated applications require such a
  schema.
Fact Constellation
Sales Fact Table     Shipping Fact Table     Product Dimension     Store Dimension
 Store Key            Shipper Key             Product Key           Store Key
 Product Key          Store Key               Product Desc          Store Name
 Period Key           Product Key                                   City
 Units                Period Key                                    State
 Price                Units                                         Region
                      Price
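A reduced sketch of a fact constellation in SQLite: two fact tables (sales and shipping) that share one product dimension. The store and time dimensions are left out to keep the example short, and all rows are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE product_dim   (product_key INTEGER PRIMARY KEY, product_desc TEXT);
CREATE TABLE sales_fact    (product_key INTEGER REFERENCES product_dim,
                            units INTEGER);
CREATE TABLE shipping_fact (shipper_key INTEGER,
                            product_key INTEGER REFERENCES product_dim,
                            units INTEGER);
-- Hypothetical sample rows.
INSERT INTO product_dim   VALUES (1, 'Soaps'), (2, 'Paste');
INSERT INTO sales_fact    VALUES (1, 10), (2, 6), (1, 7);
INSERT INTO shipping_fact VALUES (50, 1, 12), (50, 2, 6);
""")

# The shared dimension lets both fact tables be compared per product.
sold_vs_shipped = db.execute("""
    SELECT p.product_desc,
           (SELECT SUM(sf.units) FROM sales_fact sf
             WHERE sf.product_key = p.product_key),
           (SELECT SUM(sh.units) FROM shipping_fact sh
             WHERE sh.product_key = p.product_key)
    FROM product_dim p
    ORDER BY p.product_desc
""").fetchall()
```

Each fact table is aggregated separately before being lined up on the shared key; joining the two fact tables directly to each other would multiply rows and inflate the sums.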
Building Data Warehouse
• Data Selection
• Data Preprocessing
  – Fill missing values
  – Remove inconsistency
• Data Transformation & Integration
• Data Loading
Data in the warehouse is stored in the form of fact tables
and dimension tables.
Case Study
• Unilever is a new company which produces
  soaps, paste and beverage products, with its
  production unit located at NA.
• Their products are sold in the North, North West and
  Western regions of India.
• They have sales units in India, America, the UK and
  Japan.
• The President of the company wants sales
  information.
Sales Information

Report: The number of units sold.

113


Report: The number of units sold over time and date


 January       February       March         April
 14            41             33            25
Sales Information
Report: The number of items sold for each product over
time

             Jan   Feb   Mar   Apr
Soaps                    6     17
Paste        6     16    6     8
Bread        8     25    21

(Axes: Time, Product)
Sales Information
Report: The number of items sold in each country for each
product over time
                     Jan   Feb   Mar   Apr
India    Soaps                   3     10
         Paste       3     16    6
         Bread       4     16    6
UK       Soaps                   3     7
         Paste       3                 8
         Bread       4     9     15

(Axes: City, Time, Product)
Sales Information
Report: The number of items sold and income in each region for
each product with time.
                     Jan        Feb          Mar          Apr
                     Rs     U   Rs      U    Rs      U    Rs      U

 India   Soaps                               7.44    3    24.80   10

         Paste       7.95   3   42.40   16   15.90   6

         bread       7.32   4   29.98   16   10.98   6

 UK      Soaps                               7.44    3    17.36   7

         paste       7.95   3                             21.20   8

         bread       7.32   4   16.47   9    27.45   15
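The reports above are the same sales data summed at different levels of detail. As a rough sketch, the unit counts from the per-country report can be rolled up in plain Python; the function name roll_up and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict

# (country, product, month, units) taken from the per-country report above.
FACTS = [
    ("India", "Soaps", "Mar", 3), ("India", "Soaps", "Apr", 10),
    ("India", "Paste", "Jan", 3), ("India", "Paste", "Feb", 16),
    ("India", "Paste", "Mar", 6),
    ("India", "Bread", "Jan", 4), ("India", "Bread", "Feb", 16),
    ("India", "Bread", "Mar", 6),
    ("UK", "Soaps", "Mar", 3), ("UK", "Soaps", "Apr", 7),
    ("UK", "Paste", "Jan", 3), ("UK", "Paste", "Apr", 8),
    ("UK", "Bread", "Jan", 4), ("UK", "Bread", "Feb", 9),
    ("UK", "Bread", "Mar", 15),
]
DIMENSIONS = ("country", "product", "month")

def roll_up(facts, keep):
    """Sum the units measure, grouping by the dimensions named in keep."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)

by_month = roll_up(FACTS, ["month"])                     # units sold over time
by_product_month = roll_up(FACTS, ["product", "month"])  # per product over time
grand_total = roll_up(FACTS, [])                         # the single total
```

Dropping a dimension from keep moves one level up: the monthly totals come out as 14, 41, 33 and 25, and dropping every dimension gives the overall 113 units from the first report.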
Sales Measures & Dimensions

• Measures – Units sold, Amount.
• Dimensions – Product, Time, Region.
Sales Data Warehouse Model
Fact Table
Country      Product   Month      Units   Rupees
India        Soaps     January    3       7.95
India        Paste     January    4       7.32
UK           Soaps     January    3       7.95
UK           Paste     January    4       7.32
India        Bread     February   16      42.40
Sales Data Warehouse Model

  City_ID Prod_ID   Month      Units   Rupees
  1       589       1/1/1998   3       7.95
  1       1218      1/1/1998   4       7.32
  2       589       1/1/1998   3       7.95
  2       1218      1/1/1998   4       7.32
  1       589       2/1/1998   16      42.40
Sales Data Warehouse Model
Product Dimension Tables
  Prod_ID     Product_Name        Product_Category_ID
  589         Soaps               1
  590         Paste               1
  288         Bread               2


  Product_Category_Id Product_Category
  1                   domestic
  2                   food
Sales Data Warehouse Model
Region Dimension Table

City_ID       City       Region      Country

1             India      West        India
2             UK         NorthWest   India
Sales Data Warehouse Model
[Diagram: the Sales Fact table at the centre, linked to the Time, Region and Product dimensions; Product is further linked to Product Category.]
Data Warehousing includes
• Building the data warehouse
• Online analytical processing (OLAP)
• Presentation
[Diagram: RDBMS and flat file sources pass through cleaning, selection & integration into the warehouse & OLAP server, which serves the presentation layer and the client.]
Thank You

Data warehousing

  • 1. Introduction to Data Warehousing By MAHESH.AMPOLU
  • 2. Necessity is the mother of invention Why Data Warehouse?
  • 3. Scenario Unilever is a company with branches at UK, India, America and Japan. The Sales Manager wants quarterly sales report. Each branch has a separate operational system.
  • 4. Scenario 1 : Unilever company. India UK Sales per item type per branch Sales for first quarter. Manager America Japan
  • 5. Solution : Unilever company.  Extract sales information from each database.  Store the information in a common repository at a single site.
  • 6. Solution : Unilever company. India Report UK Query & Sales Data Analysis tools Manager Warehouse America Japan
  • 7. Scenario : Hindustan Unilever is a small,new company. President of the company wants his company should grow. He needs information so that he can make correct decisions.
  • 8. Solution :  Improve the quality of data before loading it into the warehouse.  Perform data cleaning and transformation before loading the data.  Use query analysis tools to support adhoc queries.
  • 9. Solution Expansio n sales Data Query and Analysis President Warehouse tool time Improvemen t
  • 10. What is a data warehouse?
  • 11. Inmon’s definition: A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process.
  • 12. Subject-oriented: A data warehouse is organized around subjects such as sales, product and customer. It focuses on modeling and analysis of data for decision makers. It excludes data that is not useful in the decision support process.
  • 13. Integration: A data warehouse is constructed by integrating multiple heterogeneous sources. Data preprocessing is applied to ensure consistency. [Diagram: RDBMS, legacy system and flat-file sources pass through data processing and data transformation into the data warehouse.]
  • 14. Integration: Data is made consistent in terms of encoding structures, measurement of attributes, physical attributes of data, remarks, naming conventions and data type formats.
  • 15. Time-variant: Provides information from a historical perspective, e.g. the past 5-10 years. Every key structure contains, implicitly or explicitly, an element of time.
  • 16. Nonvolatile: Data, once recorded, cannot be updated. A data warehouse requires two data access operations: initial loading of data and access of data.
  • 18. Data Warehouse Architecture. Data warehouse server: almost always a relational DBMS, rarely flat files. OLAP servers: support and operate on multi-dimensional data structures. Clients: query and reporting tools, analysis tools, data mining tools.
  • 20. Data Warehouse Schema: star schema, fact constellation schema, snowflake schema.
  • 21. Star Schema: A star schema consists of at least one fact table and a number of dimension tables. The star schema is a highly recommended schema for SSAS cubes.
  • 22. Star Schema. Fact Table: Store Key, Product Key, Period Key, Units, Price. Store Dimension: Store Key, Store Name, City, State, Region. Time Dimension: Period Key, Year, Quarter, Month. Product Dimension: Product Key, Product Desc. Benefits: easy to understand, easy to define hierarchies, reduces the number of physical joins.
  • 23.
  • 24. Snowflake Schema: A variant of the star schema model. It has a single, large, central fact table and one or more tables for each dimension. Dimension tables are normalized, i.e. dimension table data is split into additional tables.
  • 25. Snowflake Schema. Fact Table: Store Key, Product Key, Period Key, Units, Price. Store Dimension: Store Key, Store Name, City Key. City Dimension: City Key, City, State, Region. Time Dimension: Period Key, Year, Quarter, Month. Product Dimension: Product Key, Product Desc. Drawbacks: time-consuming joins, slow report generation.
  • 26. Fact Constellation: Multiple fact tables share dimension tables. This schema can be viewed as a collection of stars, hence the name galaxy schema or fact constellation. Sophisticated applications require such a schema.
  • 27. Fact Constellation. Sales Fact Table: Store Key, Product Key, Period Key, Units, Price. Shipping Fact Table: Shipper Key, Store Key, Product Key, Period Key, Units, Price. Product Dimension: Product Key, Product Desc. Store Dimension: Store Key, Store Name, City, State, Region.
  • 28. Fact Constellation: same diagram as slide 27.
  • 29. Building a Data Warehouse: Data selection. Data preprocessing: fill in missing values, remove inconsistencies. Data transformation and integration. Data loading. Data in the warehouse is stored in the form of fact tables and dimension tables.
  • 30. Case Study: Unilever is a new company which produces soaps, paste and beverage products, with its production unit located at NA. Their products are sold in the North, North West and Western regions of India. They have sales units in India, America, the UK and Japan. The President of the company wants sales information.
  • 31. Sales Information. Report: the number of units sold: 113. Report: the number of units sold over time: January 14, February 41, March 33, April 25.
  • 32. Sales Information. Report: the number of items sold for each product over time (axes: Product, Time; columns: Jan, Feb, Mar, Apr). Soaps: 6, 17. Paste: 6, 16, 6, 8. Bread: 8, 25, 21.
  • 33. Sales Information. Report: the number of items sold in each country for each product over time (axes: City, Product, Time; columns: Jan, Feb, Mar, Apr). India: Soaps 3, 10; Paste 3, 16, 6; Bread 4, 16, 6. UK: Soaps 3, 7; Paste 3, 8; Bread 4, 9, 15.
  • 34. Sales Information. Report: the number of items sold and income in each region for each product over time (columns: Jan, Feb, Mar, Apr; each entry is rupees (Rs) / units (U)). India: Soaps 7.44 / 3, 24.80 / 10; Paste 7.95 / 3, 42.40 / 16, 15.90 / 6; Bread 7.32 / 4, 29.98 / 16, 10.98 / 6. UK: Soaps 7.44 / 3, 17.36 / 7; Paste 7.95 / 3, 21.20 / 8; Bread 7.32 / 4, 16.47 / 9, 27.45 / 15.
  • 35. Sales Measures & Dimensions. Measures: units sold, amount. Dimensions: product, time, region.
  • 36. Sales Data Warehouse Model: Fact Table
      Country  Product  Month     Units  Rupees
      India    Soaps    January   3      7.95
      India    Paste    January   4      7.32
      UK       Soaps    January   3      7.95
      UK       Paste    January   4      7.32
      India    Bread    February  16     42.40
  • 37. Sales Data Warehouse Model
      City_ID  Prod_ID  Month     Units  Rupees
      1        589      1/1/1998  3      7.95
      1        1218     1/1/1998  4      7.32
      2        589      1/1/1998  3      7.95
      2        1218     1/1/1998  4      7.32
      1        589      2/1/1998  16     42.40
  • 38. Sales Data Warehouse Model: Product Dimension Tables
      Prod_ID  Product_Name  Product_Category_ID
      589      Soaps         1
      590      Paste         1
      288      Bread         2

      Product_Category_Id  Product_Category
      1                    domestic
      2                    food
  • 39. Sales Data Warehouse Model: Region Dimension Table
      City_ID  City   Region     Country
      1        India  West       India
      2        UK     NorthWest  India
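  • 40. Sales Data Warehouse Model. [Diagram: the Sales Fact table is linked to the Time, Product and Region dimensions; Product is linked to Product Category.]
  • 41. Data warehousing includes: building the data warehouse; online analytical processing (OLAP); presentation. [Diagram: RDBMS and flat-file sources pass through cleaning, selection and integration into the warehouse and OLAP server, which serves the presentation layer and the client.]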
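
The sales model on slides 36-40 can be sketched as a small star schema. The following is a minimal illustration using Python's built-in `sqlite3` module, not the deck's own implementation. The table and column names are adapted from the slides, the product category is folded into the product dimension (which is what makes it a star rather than a snowflake), and the city names and fact rows are illustrative assumptions rather than the slide data.

```python
import sqlite3


def build_sales_warehouse():
    """Create an in-memory star schema: one fact table, two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product_dim (
            prod_id          INTEGER PRIMARY KEY,
            product_name     TEXT,
            product_category TEXT
        );
        CREATE TABLE region_dim (
            city_id INTEGER PRIMARY KEY,
            city    TEXT,
            country TEXT
        );
        CREATE TABLE sales_fact (
            city_id INTEGER REFERENCES region_dim(city_id),
            prod_id INTEGER REFERENCES product_dim(prod_id),
            month   TEXT,
            units   INTEGER,
            rupees  REAL
        );
        """
    )
    # Product rows follow slide 38, with the category denormalized into the row.
    conn.executemany(
        "INSERT INTO product_dim VALUES (?, ?, ?)",
        [(589, "Soaps", "domestic"), (590, "Paste", "domestic"), (288, "Bread", "food")],
    )
    # City names are hypothetical; the slides only give country-level labels.
    conn.executemany(
        "INSERT INTO region_dim VALUES (?, ?, ?)",
        [(1, "Mumbai", "India"), (2, "London", "UK")],
    )
    # Illustrative fact rows (assumed values, not copied from the slides).
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
        [
            (1, 589, "1998-01", 3, 7.44),
            (1, 590, "1998-01", 3, 7.95),
            (2, 589, "1998-01", 3, 7.44),
            (2, 288, "1998-01", 4, 7.32),
            (1, 589, "1998-02", 10, 24.80),
        ],
    )
    return conn


def units_by_country_and_product(conn):
    """Join the fact table to its dimensions and total the units measure."""
    return conn.execute(
        """
        SELECT r.country, p.product_name, SUM(f.units)
        FROM sales_fact AS f
        JOIN region_dim  AS r ON r.city_id = f.city_id
        JOIN product_dim AS p ON p.prod_id = f.prod_id
        GROUP BY r.country, p.product_name
        ORDER BY r.country, p.product_name
        """
    ).fetchall()


if __name__ == "__main__":
    for row in units_by_country_and_product(build_sales_warehouse()):
        print(row)
```

Each report query needs only one join per dimension, which is the "reduces the number of physical joins" benefit listed on slide 22. Moving `product_category` into its own table, as on slide 38, would turn this into a snowflake schema and add one more join.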

Editor's notes

  1. We invent something only if there is a need for it. Today we are going to see what data warehousing is. The data warehouse evolved to satisfy certain needs, and we will look at some of those needs now.
  2. We need to extract data from various sources: some may be maintained manually on paper, some on different legacy systems, and integrating the data is laborious work. Many systems provide DTS tools to convert data into the appropriate format and apply the necessary transformations.
  3. We need a subject-oriented, multidimensional data model for the data warehouse, one that facilitates online analysis.
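
Notes 2 and 3 can be illustrated together with a small sketch: records from two branch systems that use different naming conventions and encodings are normalized into one format, then rolled up along chosen dimensions. Everything here is an assumption made for illustration: the field names, the sample rows, and in particular the policy of filling a missing unit count with 0.

```python
from collections import defaultdict

# Hypothetical extracts: each branch system names and encodes its fields differently.
INDIA_ROWS = [
    {"product": "soaps", "month": "Jan", "units": 3},
    {"product": "Paste", "month": "Jan", "units": None},  # missing value
]
UK_ROWS = [
    {"item": "SOAPS", "period": "January", "qty": "3"},
    {"item": "bread", "period": "January", "qty": "4"},
]

# One agreed encoding for months, whatever the source system used.
MONTH_CODES = {"jan": "Jan", "january": "Jan", "feb": "Feb", "february": "Feb"}


def normalize(row, country, product_key, month_key, units_key):
    """Map one source record onto the warehouse's common format."""
    units = row[units_key]
    return {
        "country": country,
        "product": row[product_key].strip().capitalize(),
        "month": MONTH_CODES[row[month_key].strip().lower()],
        # Assumed policy: treat a missing count as 0 units.
        "units": int(units) if units is not None else 0,
    }


def integrate():
    """Combine both branch extracts into one consistent list of records."""
    rows = [normalize(r, "India", "product", "month", "units") for r in INDIA_ROWS]
    rows += [normalize(r, "UK", "item", "period", "qty") for r in UK_ROWS]
    return rows


def roll_up(rows, dimensions):
    """Total the units measure for every combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["units"]
    return dict(totals)


if __name__ == "__main__":
    cleaned = integrate()
    print(roll_up(cleaned, ["product"]))
    print(roll_up(cleaned, ["country", "month"]))
```

Calling `roll_up` with different dimension lists corresponds to the successive reports on slides 31-33, where the same measure is viewed by time, then by product and time, then by country, product and time.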