IAS Inc


Data Warehouse:
The Choice of
Inmon versus Kimball
  Ian Abramson
  IAS Inc.

Agenda
   The 2 Approaches
     Bill Inmon – Enterprise Warehouse (CIF)
     Ralph Kimball – Dimensional Design

 Similarities
 Differences
 Choices

DW History
   1990
       Inmon publishes “Building the Data Warehouse”
   1996
       Kimball publishes “The Data Warehouse Toolkit”
   2002
       Inmon updates his book and defines an architecture for collecting
        disparate sources into a detailed, time-variant data store.
            The top-down approach
       Kimball updates his book and defines multiple databases, called data
        marts, that are organized by business process but share an
        enterprise-standard data bus
            The bottom-up approach

The Data Warehouse Is:
    Bill Inmon, an early and influential practitioner, has formally defined a
     data warehouse in the following terms:
          Subject-oriented
                    The data in the database is organized so that all the data elements relating to the
                     same real-world event or object are linked together;
          Time-variant
                    The changes to the data in the database are tracked and recorded so that reports
                     can be produced showing changes over time;
          Non-volatile
                    Data in the database is never overwritten or deleted; once committed, the data
                     is static and read-only, but retained for future reporting; and
          Integrated
                    The database contains data from most or all of an organization's operational
                     applications, and this data is made consistent.

    Ralph Kimball, a leading proponent of the dimensional approach to
     building data warehouses, provides a succinct definition of a data
     warehouse:
          "A copy of transaction data specifically structured for query and analysis."

Ref: Wikipedia

What are they saying?
   These two influential data warehousing experts represent the
    current prevailing views on data warehousing.

   Kimball, in 1997, stated that
        "...the data warehouse is nothing more than the union of all the data
         marts",
        Kimball indicates a bottom-up data warehousing methodology in which
         individual data marts providing thin views into the organizational data
         could be created and later combined into a larger all-encompassing
         data warehouse.
   Inmon responded in 1998 by saying,
        "You can catch all the minnows in the ocean and stack them together
         and they still do not make a whale,"
        This indicates the opposing view that the data warehouse should be
         designed from the top-down to include all corporate data. In this
         methodology, data marts are created only after the complete data
         warehouse has been created.



What is a Data Warehouse:
   The single organizational repository of
    enterprise-wide data across many or all
    lines of business and subject areas.
     Contains massive amounts of integrated data
     Represents the complete organizational view
      of information needed to run and understand
      the business

What is a Data Mart?
   A specific, subject-oriented or departmental
    view of the organization's information.
    Generally built to satisfy user
    requirements for information
     Multiple data marts for one organization
     A data mart is built using dimensional modeling
     More focused
     Generally smaller, selected facts and dimensions
     Integrated

Data Warehouses vs. Data Marts
             Data Warehouses                        Data Marts
Scope        Application independent                Specific application
             Centralized or enterprise              Decentralized by group
             Planned                                Organic, but may be planned
Data         Historical, detailed, summary          Some history, detailed, summary
             Some denormalization                   High denormalization
Subjects     Multiple subjects                      Single central subject area
Source       Many internal and external sources     Few internal and external sources
Other        Flexible                               Restrictive
             Data oriented                          Project oriented
             Long life                              Short life
             Single complex structure               Multiple simple structures that may
                                                    form a complex structure

The Inmon Model
   Consists of all databases and information
    systems in an organization:
     The CIF (Corporate Information Factory)
   Defines the overall database environment as:
     Operational
     Atomic data warehouse
     Departmental
     Individual
   The warehouse is part of the bigger whole (the CIF)

The Data Warehouse
Level                    Characteristic              Example                    Notes
Operational              Day-to-day operations       Customer Credit Rating     Transactions
Atomic Data Warehouse    Data manipulated & moved    Customer Credit History    Transactions
Departmental             Focused                     Customer by Postal Code    Source is ADW
Individual               Ad hoc                      Delinquent Customers       Source is ADW
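
As a rough illustration of how the departmental and individual levels can draw on the atomic data warehouse (ADW), the sketch below derives both from one list of atomic records. The record layout, the field names, the sample rows, and the 30-day delinquency threshold are all assumptions made for illustration; none of them come from Inmon's material.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CreditEvent:
    """One detailed, historical row in a hypothetical atomic data warehouse."""
    customer_id: int
    postal_code: str
    event_date: date
    days_past_due: int


# Atomic level: detailed credit history, one row per customer per period.
ADW = [
    CreditEvent(1, "M5V", date(2008, 1, 31), 0),
    CreditEvent(1, "M5V", date(2008, 2, 29), 45),
    CreditEvent(2, "M5V", date(2008, 2, 29), 0),
    CreditEvent(3, "K1A", date(2008, 2, 29), 95),
]


def customers_by_postal_code(adw):
    """Departmental level: a focused summary whose source is the ADW."""
    distinct = {(event.customer_id, event.postal_code) for event in adw}
    return Counter(postal_code for _, postal_code in distinct)


def delinquent_customers(adw, min_days_past_due=30):
    """Individual level: an ad hoc question answered from the ADW."""
    return sorted(
        {event.customer_id for event in adw if event.days_past_due >= min_days_past_due}
    )


by_postal_code = customers_by_postal_code(ADW)
delinquent = delinquent_customers(ADW)
```

Both derived views read from the same atomic rows, which is the point of the "Source is ADW" note: neither view goes back to the operational systems.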

Inmon Modeling
   Three levels of data modeling
       ERD (Entity Relationship Diagram)
            Refines entities, attributes and relationships
       Mid-Level model (*DIS*)
            Data Item Sets
            Data sets by department
            Four constructs:
                  Primary data groupings
                  Secondary data groupings
                  Connectors
                  “Type of” data
       Physical data model
            Optimize for performance (de-normalize)

[Figure: Relationship between Levels One and Two of Inmon's data model (Inmon, 2002)]

The Warehouse Architecture

The Inmon Warehouse
   Data Sources: Source DB 1, Source DB 2, file or external data
   Staging: landing and staging area
   The Data Warehouse
   Data Access: data marts, cubes

The Kimball Approach
   The Dimensional Data Model
     Starts with tables
          Facts
          Dimensions
     Facts contain metrics
     Dimensions contain attributes
          May contain repeating groups
     Does not adhere to normalization theory
     User accessible
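
A minimal sketch of the fact/dimension split described above, using plain Python objects in place of database tables. The table names, columns, and sample rows are hypothetical; the only point is that facts carry keys plus metrics, dimensions carry descriptive attributes, and a query joins the two.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductDim:
    """Dimension row: descriptive attributes, deliberately denormalized."""
    product_key: int
    name: str
    category: str  # repeated on every product row rather than normalized out


@dataclass(frozen=True)
class SalesFact:
    """Fact row: keys pointing at dimensions, plus numeric metrics."""
    product_key: int
    date_key: int  # e.g. 20080827
    quantity: int
    amount: float


products = {
    1: ProductDim(1, "Widget", "Hardware"),
    2: ProductDim(2, "Gadget", "Hardware"),
    3: ProductDim(3, "Manual", "Books"),
}

sales = [
    SalesFact(1, 20080827, 2, 20.0),
    SalesFact(2, 20080827, 1, 15.0),
    SalesFact(3, 20080828, 4, 40.0),
]


def amount_by_category(facts, product_dim):
    """Join each fact to its dimension row and sum a metric by an attribute."""
    totals = defaultdict(float)
    for fact in facts:
        totals[product_dim[fact.product_key].category] += fact.amount
    return dict(totals)


totals = amount_by_category(sales, products)
```

The category is stored directly on each product row, which is the kind of departure from normalization theory the slide refers to; it keeps the user-facing query to a single join.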

The Kimball Data Lifecycle
   Data Sources: Source DB 1, Source DB 2, file or external data
   Staging: landing and staging area
   The Data Warehouse
   Data Access: workstation group, end users, cubes

The Dimensional Model

The Kimball Data Bus
   Data is moved to a staging area
     Data is scrubbed and made consistent
   Data marts are created from the staging area
   Each data mart is based on a single business process
   The sum of the data marts can constitute an
    Enterprise Data Warehouse
   Conformed dimensions are the key to success
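
To make the role of conformed dimensions concrete, here is a small sketch in which two data marts (one per business process) share a single product dimension, so their results can be combined on a common attribute. The mart contents, column names, and the `drill_across` helper are illustrative assumptions, not part of Kimball's published material.

```python
from collections import defaultdict

# Conformed dimension: one product table, shared by both data marts.
product_dim = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Manual", "category": "Books"},
}

# Data mart 1: the sales process.
sales_facts = [
    {"product_key": 1, "units_sold": 5},
    {"product_key": 2, "units_sold": 3},
    {"product_key": 1, "units_sold": 2},
]

# Data mart 2: the inventory process.
inventory_facts = [
    {"product_key": 1, "units_on_hand": 10},
    {"product_key": 2, "units_on_hand": 4},
]


def total_by_category(facts, measure):
    """Sum one measure per category, looking the category up in the shared dimension."""
    totals = defaultdict(int)
    for row in facts:
        category = product_dim[row["product_key"]]["category"]
        totals[category] += row[measure]
    return totals


def drill_across():
    """Query each mart separately, then line the results up on the shared attribute."""
    sold = total_by_category(sales_facts, "units_sold")
    on_hand = total_by_category(inventory_facts, "units_on_hand")
    return {
        category: {"units_sold": sold[category], "units_on_hand": on_hand[category]}
        for category in sorted(set(sold) | set(on_hand))
    }


report = drill_across()
```

Because both marts use the same product keys and the same category values, the two result sets line up without any reconciliation step. If each mart kept its own product table, this combination would need a mapping between them.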

The Kimball Design Approach
 Select business process
 Declare the grain
 Choose dimensions
 Identify facts (metrics)

Kimball’s Philosophy
 Make data easily accessible
 Present the organization’s information
  consistently
 Be adaptive and resilient to change
 Protect information
 Serve as the foundation for improved
  decision making

Getting Started with Choices
   Kimball
     Will start with data marts
     Focused on quick delivery to users

   Inmon
     Will focus on the enterprise
     Organizational focus

Kimball vs. Inmon
    Inmon:
        Subject-Oriented
        Integrated
        Non-Volatile
        Time-Variant
        Top-Down
        Integration Achieved via an Assumed Enterprise Data Model
        Characterizes Data Marts as Aggregates
    Kimball:
        Business-Process-Oriented
        Bottom-Up and Evolutionary
        Stresses Dimensional Model, Not E-R
        Integration Achieved via Conformed Dimensions
        Star Schemas Enforce Query Semantics

The Comparison
(Methodology and Architecture)
                              Inmon                                 Kimball
Overall approach              Top-down                              Bottom-up

Architectural structure       Enterprise-wide DW                    Data marts model a
                              feeds departmental DBs                business process;
                                                                    enterprise is achieved
                                                                    with conformed dims
Complexity of method          Quite complex                         Fairly simple




               Reference: http://www.bi-bestpractices.com/view-articles/4768

The Comparison
(Data Modeling)
                              Inmon                                 Kimball
Data orientation              Subject or data driven                Process oriented

Tools                         Traditional (ERDs and                 Dimensional modeling;
                              DIS)                                  departs from traditional
                                                                    relational modeling
End user accessibility        Low                                   High




               Reference: http://www.bi-bestpractices.com/view-articles/4768

The Comparison
(Dimensions)
                           Inmon                                 Kimball
Timeframe                  Continuous & Discrete                 Slowly Changing

Methods                    Timestamps                            Dimension keys




            Reference: http://www.bi-bestpractices.com/view-articles/4768

Inmon Continuous & Discrete Dimension Management
   Define data management via dates in your data
     Continuous time
       When a record is active
       Start and end dates

     Discrete time
       A point in time
       Snapshot
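
The two timestamping styles can be sketched as follows. The row layouts, field names, and sample values are hypothetical, and treating the end date as exclusive is an assumption made here to keep adjacent spans from overlapping; the slides do not specify it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CreditRatingSpan:
    """Continuous time: active from start_date up to, but not including, end_date."""
    customer_id: int
    rating: str
    start_date: date
    end_date: Optional[date]  # None means the row is still active


@dataclass(frozen=True)
class BalanceSnapshot:
    """Discrete time: the value as observed at one point in time."""
    customer_id: int
    snapshot_date: date
    balance: float


ratings = [
    CreditRatingSpan(1, "A", date(2007, 1, 1), date(2008, 3, 1)),
    CreditRatingSpan(1, "B", date(2008, 3, 1), None),
]

snapshots = [
    BalanceSnapshot(1, date(2008, 1, 31), 100.0),
    BalanceSnapshot(1, date(2008, 2, 29), 250.0),
]


def rating_as_of(rows, customer_id, as_of):
    """Return the rating whose active span covers as_of, or None if no span does."""
    for row in rows:
        if row.customer_id != customer_id:
            continue
        if row.start_date <= as_of and (row.end_date is None or as_of < row.end_date):
            return row.rating
    return None


def balance_on(rows, customer_id, snapshot_date):
    """Return the balance recorded in the snapshot taken on exactly that date, or None."""
    for row in rows:
        if row.customer_id == customer_id and row.snapshot_date == snapshot_date:
            return row.balance
    return None
```

A continuous record answers "what was true on any given day", while a snapshot only answers for the dates on which it was taken, so `balance_on` returns `None` for a date between two snapshots.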

Kimball Slowly Changing Dimension Management
   Define data management via versioning
     Type I
          Change the record as required (overwrite)
          No history
     Type II
          Manage all changes (a new row per change)
          History is recorded
     Type III
          Some history is kept in parallel (a prior-value column)
          Limited to a defined amount of history
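
The three types of slowly changing dimension can be sketched on a single customer row as below. The column names, the placeholder default for `valid_from`, and the three helper functions are hypothetical and only follow the commonly described behaviour of each type; a real implementation would sit in the ETL layer and the database.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CustomerRow:
    surrogate_key: int                    # dimension key used by fact tables
    customer_id: int                      # natural (business) key
    city: str
    previous_city: Optional[str] = None   # used by Type III only
    valid_from: date = date(1900, 1, 1)   # placeholder start; used by Type II only
    valid_to: Optional[date] = None       # None marks the current version


def apply_type_1(row, new_city):
    """Type I: overwrite the attribute in place; no history is kept."""
    return replace(row, city=new_city)


def apply_type_2(rows, customer_id, new_city, change_date):
    """Type II: close the current version and add a new row with a new key."""
    result = []
    for row in rows:
        if row.customer_id == customer_id and row.valid_to is None:
            row = replace(row, valid_to=change_date)
        result.append(row)
    next_key = max(r.surrogate_key for r in rows) + 1
    result.append(CustomerRow(next_key, customer_id, new_city, valid_from=change_date))
    return result


def apply_type_3(row, new_city):
    """Type III: keep only the prior value, in a parallel column."""
    return replace(row, previous_city=row.city, city=new_city)


original = CustomerRow(1, 42, "Toronto")
type_1 = apply_type_1(original, "Ottawa")
type_2 = apply_type_2([original], 42, "Ottawa", date(2008, 8, 27))
type_3 = apply_type_3(original, "Ottawa")
```

Only Type II changes the surrogate key, which is why facts loaded before the change keep pointing at the old version of the customer while new facts point at the new one.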

The Comparison
(Philosophy)
                              Inmon                                 Kimball
Primary Audience              IT                                    End Users

Place in the                  Integral part of the                  Transformer and retainer
Organization                  Corporate Information                 of operational data
                              Factory (CIF)
Objective                     Deliver a sound technical             Deliver a solution that
                              solution based on proven              makes it easy for end
                              methods                               users to directly query
                                                                    data and still have
                                                                    reasonable response rate




               Reference: http://www.bi-bestpractices.com/view-articles/4768

How to Choose?
Characteristic        Favors Kimball             Favors Inmon
Nature of the         Tactical                   Strategic
organization's
decision support
requirements
Data integration      Individual business        Enterprise-wide integration
requirements          areas
Structure of data     Business metrics,          Non-metric data and data that
                      performance                will be applied to meet multiple
                      measures, and              and varied information needs
                      scorecards
Scalability           Need to adapt to           Growing scope and changing
                      highly volatile needs      requirements are critical
                      within a limited scope

How to Choose?
Characteristic        Favors Kimball          Favors Inmon
Persistency of data   Source systems are      High rate of change from source
                      relatively stable       systems
Staffing and skills   Small teams of          Larger team(s) of specialists
requirements          generalists
Time to delivery      Need for the first      Organization's requirements allow
                      data warehouse          for longer start-up time
                      application is urgent
Cost to deploy        Lower start-up costs,   Higher start-up costs, with lower
                      with each               subsequent project development
                      subsequent project      costs
                      costing about the
                      same

References
   Breslin, M. Data Warehousing Battle of the Giants: Comparing the Basics of the
    Kimball and Inmon Models. http://www.bi-bestpractices.com/view-articles/4768
   Inmon CIF glossary: http://www.inmoncif.com/library/glossary/#D
   Inmon, W.H. Building the Data Warehouse (Third Edition), New
    York: John Wiley & Sons, 2002.
   Kimball, R. and M. Ross. The Data Warehouse Toolkit: The
    Complete Guide to Dimensional Modeling (Second Edition), New
    York: John Wiley & Sons, 2002.

Thanks and Questions?
   Ian Abramson
    IAS Inc
    416-407-2448
    ian@abramson.ca

080827 abramson inmon vs kimball

  • 1. IAS Inc Data Warehouse: The Choice of Inmon versus Kimball Ian Abramson IAS Inc.
  • 2. IAS Inc Agenda  The 2 Approaches  Bill Inmon – Enterprise Warehouse (CIF)  Ralph Kimball – Dimensional Design  Similarities  Differences  Choices
  • 3. IAS Inc DW History  1990  Inmon publishes “Building the Data Warehouse”  1996  Kimball publishes “The Data Warehouse Toolkit”  2002  Inmon updates book and defines architecture for collection of disparate sources into detailed, time variant data store.  The top down approach  Kimball updates book and defines multiple databases called data marts that are organized by business processes, but use enterprise standard data bus  The bottom-up approach
  • 4. IAS Inc The Data Warehouse Is:  Bill Inmon, an early and influential practitioner, has formally defined a data warehouse in the following terms;  Subject-oriented  The data in the database is organized so that all the data elements relating to the same real-world event or object are linked together;  Time-variant  The changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time;  Non-volatile  Data in the database is never over-written or deleted - once committed, the data is static, read-only, but retained for future reporting; and  Integrated  The database contains data from most or all of an organization's operational applications, and that this data is made consistent  Ralph Kimball, a leading proponent of the dimensional approach to building data warehouses, provides a succinct definition for a data warehouse:  “A copy of transaction data specifically structured for query and analysis.“ Ref: wikipedia 4
  • 5. IAS Inc What are they saying?  These two influential data warehousing experts represent the current prevailing views on data warehousing.  Kimball, in 1997, stated that  "...the data warehouse is nothing more than the union of all the data marts",  Kimball indicates a bottom-up data warehousing methodology in which individual data marts providing thin views into the organizational data could be created and later combined into a larger all-encompassing data warehouse.  Inmon responded in 1998 by saying,  "You can catch all the minnows in the ocean and stack them together and they still do not make a whale,"  This indicates the opposing view that the data warehouse should be designed from the top-down to include all corporate data. In this methodology, data marts are created only after the complete data warehouse has been created. 5
  • 6. IAS Inc What is a Data Warehouse:  The single organizational repository of enterprise wide data across many or all lines of business and subject areas.  Contains massive and integrated data  Represents the complete organizational view of information needed to run and understand the business
  • 7. IAS Inc What is a Data Mart?  The specific, subject oriented, or departmental view of information from the organization. Generally these are built to satisfy user requirements for information  Multiple data marts for one organization  A data mart is built using dimensional modeling  More focused  Generally smaller, selected facts and dimensions  Integrated
  • 8. IAS Inc Data Warehouses vs. Data Marts Data Warehouses: Data Marts: Scope Scope Applicationindependent Specificapplication Centralized or Enterprise Decentralized by group Planned Organic but may be planned Data Data Historical, detailed, summary Some history, detailed, summary Some denormalization High denormalization Subjects Subjects Multiple subjects Single central subject area Source Source Many internal and external sources Few internal and external sources Other Other Flexible Restrictive Data oriented Project oriented Long life Short life Single complex structure Multiple simple structures that may form a complex structure 8
  • 9. IAS Inc The Inmon Model  Consists of all databases and information systems in an organization…..  The CIF (Corporate Information Factory)  Defines overall database environment as:  Operational  Atomic data warehouse  Departmental  Individual  The Warehouse is part of the bigger whole (CIF)
  • 10. IAS Inc The Data Warehouse
      Operational (day-to-day operations): Customer Credit Rating; transactions
      Atomic Data Warehouse (data manipulated and moved): Customer Credit History; transactions
      Departmental (focused): Customer by Postal Code; source is ADW
      Individual (ad hoc): Delinquent Customers; source is ADW
  • 11. IAS Inc Inmon Modeling  Three levels of data modeling  ERD (Entity Relationship Diagram)  Refines entities, attributes and relationships  Mid-Level model (*DIS*)  Data Item Sets  Data sets by department  Four constructs:  Primary data groupings  Secondary data groupings  Connectors  “Type of” data  Physical data model  Optimize for performance (de-normalize)
  • 12. IAS Inc Relationship between Levels One and Two of Inmon's Data model (Inmon,2002)
  • 13. IAS Inc The Warehouse Architecture
  • 14. IAS Inc The Inmon Warehouse  [Diagram. Stages: Data Sources, Staging, The Data Warehouse, Data Access. Labels: Source DB 1, Source DB 2, File or External Data, Landing Area, Staging, Data Marts, Cubes]
  • 15. IAS Inc The Kimball Approach  The Dimensional Data Model  Starts with tables  Facts  Dimensions  Facts contain metrics  Dimensions contain attributes  May contain repeating groups  Does not adhere to normalization theory  User accessible
  • 16. IAS Inc The Kimball Data Lifecycle  [Diagram. Stages: Data Sources, Staging, The Data Warehouse, Data Access. Labels: Source DB 1, Source DB 2, File or External Data, Landing Area, Staging, Workstation Group, End Users, Cubes]
  • 18. IAS Inc The Kimball Data Bus  Data is moved to a staging area  Data is scrubbed and made consistent  Data marts are created from the staging area  Each data mart is based on a single business process  The sum of the data marts can constitute an Enterprise Data Warehouse  Conformed dimensions are the key to success
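To illustrate why conformed dimensions matter on the data bus, here is a minimal Python sketch. It assumes two hypothetical data marts (sales and shipments) that share one product dimension; all table names, keys, and values are invented for illustration and do not come from the presentation. Because both fact sets use the same dimension keys and attributes, their measures can be rolled up to the same attribute and compared side by side ("drill across").

```python
from collections import defaultdict

# Hypothetical conformed product dimension shared by both data marts.
dim_product = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# Hypothetical fact rows from two separate business processes.
sales_facts = [
    {"product_key": 1, "sales_amount": 20.0},
    {"product_key": 2, "sales_amount": 30.0},
    {"product_key": 3, "sales_amount": 15.0},
]
shipment_facts = [
    {"product_key": 1, "units_shipped": 5},
    {"product_key": 3, "units_shipped": 2},
]


def rollup(facts, measure, attribute):
    """Sum one measure by an attribute of the shared product dimension."""
    totals = defaultdict(float)
    for row in facts:
        totals[dim_product[row["product_key"]][attribute]] += row[measure]
    return dict(totals)


def drill_across(attribute):
    """Combine measures from both marts on the same dimension attribute."""
    sales = rollup(sales_facts, "sales_amount", attribute)
    shipped = rollup(shipment_facts, "units_shipped", attribute)
    return {
        value: {
            "sales_amount": sales.get(value, 0.0),
            "units_shipped": shipped.get(value, 0.0),
        }
        for value in sorted(set(sales) | set(shipped))
    }


by_category = drill_across("category")
```

If each mart kept its own, differently keyed product table, the two rollups would not line up and the combined view would need a separate reconciliation step.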
  • 19. IAS Inc The Kimball Design Approach  Select business process  Declare the grain  Choose dimensions  Identify facts (metrics)
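The four design steps can be sketched as a small star schema. The example below uses Python's standard `sqlite3` module with an in-memory database; the business process (retail sales), the grain (one row per product per day), and every table and column name are assumptions chosen for illustration, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: business process = retail sales (assumed).
# Step 2: grain = one fact row per product per day (assumed).
# Step 3: dimensions = date and product.
# Step 4: facts (metrics) = quantity and sales_amount.
conn.executescript(
    """
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        calendar_date TEXT,
        month TEXT
    );
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        category TEXT
    );
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        sales_amount REAL
    );
    """
)

conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20240101, "2024-01-01", "2024-01"), (20240102, "2024-01-02", "2024-01")],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [
        (20240101, 1, 2, 20.0),
        (20240101, 3, 1, 15.0),
        (20240102, 2, 1, 30.0),
        (20240102, 1, 1, 10.0),
    ],
)

# A typical dimensional query: join the fact table to a dimension,
# group by a dimension attribute, and aggregate a metric.
sales_by_category = conn.execute(
    """
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
```

Declaring the grain before choosing dimensions and facts keeps every row in `fact_sales` at the same level of detail, which is what makes the aggregate query safe to sum.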
  • 20. IAS Inc Kimball’s Philosophy  Make data easily accessible  Present the organization’s information consistently  Be adaptive and resilient to change  Protect information  Serve as the foundation for improved decision making
  • 21. IAS Inc Getting Started with Choices  Kimball  Will start with data marts  Focused on quick delivery to users  Inmon  Will focus on the enterprise  Organizational focus
  • 22. IAS Inc Kimball vs. Inmon  Inmon:  Subject-Oriented  Integrated  Non-Volatile  Time-Variant  Top-Down  Integration Achieved via an Assumed Enterprise Data Model  Characterizes Data Marts as Aggregates  Kimball:  Business-Process-Oriented  Bottom-Up and Evolutionary  Stresses Dimensional Model, Not E-R  Integration Achieved via Conformed Dimensions  Star Schemas Enforce Query Semantics
  • 23. IAS Inc The Comparison (Methodology and Architecture)
      Characteristic | Inmon | Kimball
      Overall approach | Top-down | Bottom-up
      Architectural structure | Enterprise-wide DW feeds departmental DBs | Data marts model a business process; enterprise is achieved with conformed dims
      Complexity of method | Quite complex | Fairly simple
      Reference: http://www.bi-bestpractices.com/view-articles/4768
  • 24. IAS Inc The Comparison (Data Modeling)
      Characteristic | Inmon | Kimball
      Data orientation | Subject or data driven | Process oriented
      Tools | Traditional (ERDs and DIS) | Dimensional modeling; departs from traditional relational modeling
      End user accessibility | Low | High
      Reference: http://www.bi-bestpractices.com/view-articles/4768
  • 25. IAS Inc The Comparison (Dimensions)
      Characteristic | Inmon | Kimball
      Timeframe | Continuous & Discrete | Slowly Changing
      Methods | Timestamps | Dimension keys
      Reference: http://www.bi-bestpractices.com/view-articles/4768
  • 26. IAS Inc Inmon Continuous & Discrete Dimension Management  Define data management via dates in your data  Continuous time  When is a record active  Start and end dates  Discrete time  A point in time  Snapshot
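A minimal Python sketch of the two time styles follows. It assumes a hypothetical credit-rating history stored with start and end dates (continuous time) and a hypothetical set of month-end balance snapshots (discrete time); the field names, the far-future end date used for the open-ended row, and the sample values are all invented for illustration.

```python
from datetime import date

# Continuous time: each row states when it was active (start and end dates).
# The end date 9999-12-31 is an assumed convention for "still active".
credit_history = [
    {"customer_id": 42, "rating": "B",
     "start_date": date(2023, 1, 1), "end_date": date(2023, 6, 30)},
    {"customer_id": 42, "rating": "A",
     "start_date": date(2023, 7, 1), "end_date": date(9999, 12, 31)},
]

# Discrete time: each row is a snapshot taken at a single point in time.
month_end_balances = [
    {"customer_id": 42, "snapshot_date": date(2023, 6, 30), "balance": 1200.0},
    {"customer_id": 42, "snapshot_date": date(2023, 7, 31), "balance": 900.0},
]


def rating_as_of(customer_id, as_of):
    """Continuous lookup: find the row whose date range covers as_of."""
    for row in credit_history:
        if (row["customer_id"] == customer_id
                and row["start_date"] <= as_of <= row["end_date"]):
            return row["rating"]
    return None


def balance_on(customer_id, snapshot_date):
    """Discrete lookup: a value exists only for dates that were snapshotted."""
    for row in month_end_balances:
        if (row["customer_id"] == customer_id
                and row["snapshot_date"] == snapshot_date):
            return row["balance"]
    return None
```

The continuous form can answer a question for any date inside a recorded range, while the discrete form answers only for the exact dates on which a snapshot was taken.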
  • 27. IAS Inc Kimball Slowly Changing Dimension Management  Define data management via versioning  Type I  Change record as required  No History  Type II  Manage all changes  History is recorded  Type III  Some history is parallel  Limit to defined history
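The three slowly changing dimension types can be sketched in Python as follows. This is an illustrative sketch only: the customer dimension, its column names (`customer_key`, `valid_from`, `valid_to`, `is_current`, `previous_city`), and the far-future `HIGH_DATE` marker are assumptions, and real implementations usually run as SQL or ETL-tool logic rather than in-memory lists.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed marker for "no end date yet"


def apply_type1(rows, customer_id, new_city):
    """Type I: overwrite the attribute in place; no history is kept."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city


def apply_type2(rows, customer_id, new_city, change_date):
    """Type II: expire the current row and add a new versioned row."""
    current = next(
        r for r in rows if r["customer_id"] == customer_id and r["is_current"]
    )
    current["is_current"] = False
    current["valid_to"] = change_date  # old row is valid up to the change date
    rows.append({
        # New surrogate key, so facts can point at a specific version.
        "customer_key": max(r["customer_key"] for r in rows) + 1,
        "customer_id": customer_id,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": HIGH_DATE,
        "is_current": True,
    })


def apply_type3(rows, customer_id, new_city):
    """Type III: keep limited history in a parallel 'previous' column."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city
```

A short usage example: starting from one current row for customer 7 in Toronto, `apply_type2(rows, 7, "Ottawa", date(2024, 5, 1))` leaves two rows, the expired Toronto version and a new current Ottawa version with the next surrogate key.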
  • 28. IAS Inc The Comparison (Philosophy)
      Characteristic | Inmon | Kimball
      Primary audience | IT | End users
      Place in the organization | Integral part of the Corporate Information Factory (CIF) | Transformer and retainer of operational data
      Objective | Deliver a sound technical solution based on proven methods | Deliver a solution that makes it easy for end users to directly query data and still have reasonable response rate
      Reference: http://www.bi-bestpractices.com/view-articles/4768
  • 29. IAS Inc How to Choose?
      Characteristic | Favors Kimball | Favors Inmon
      Nature of the organization's decision support requirements | Tactical | Strategic
      Data integration requirements | Individual business areas | Enterprise-wide integration
      Structure of data | Business metrics, performance measures, and scorecards | Non-metric data and data that will be applied to meet multiple and varied information needs
      Scalability | Need to adapt to highly volatile needs within a limited scope | Growing scope and changing requirements are critical
  • 30. IAS Inc How to Choose?
      Characteristic | Favors Kimball | Favors Inmon
      Persistency of data | Source systems are relatively stable | High rate of change from source systems
      Staffing and skills requirements | Small teams of generalists | Larger team(s) of specialists
      Time to delivery | Need for the first data warehouse application is urgent | Organization's requirements allow for longer start-up time
      Cost to deploy | Lower start-up costs, with each subsequent project costing about the same | Higher start-up costs, with lower subsequent project development costs
  • 31. IAS Inc References
      Breslin, Mary. Data Warehousing Battle of the Giants: Comparing the Basics of the Kimball and Inmon Models. http://www.bi-bestpractices.com/view-articles/4768
      Inmon CIF glossary: http://www.inmoncif.com/library/glossary/#D
      Inmon, W.H. Building the Data Warehouse (Third Edition), New York: John Wiley & Sons, 2002.
      Kimball, R. and M. Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition), New York: John Wiley & Sons, 2002.
  • 32. IAS Inc Thanks and Questions?  Ian Abramson IAS Inc 416-407-2448 ian@abramson.ca