101: Introduction to Data Warehousing Fundamentals
Definition of a Data Warehouse
• A data warehouse is an enterprise-wide, structured
  repository of subject-oriented, time-variant data
  used for information retrieval and decision
  support. The data warehouse stores both atomic
  and summary data.
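
As a rough illustration of the difference between atomic and summary data, the sketch below keeps individual sales records (atomic) and rolls them up to a total per product and month (summary). The record layout and the names `atomic_sales` and `summarize_by_month` are assumptions made for this example; they do not come from the slides.

```python
from collections import defaultdict

# Hypothetical atomic records: one row per individual sale.
atomic_sales = [
    {"product": "widget", "date": "2024-01-05", "amount": 10.0},
    {"product": "widget", "date": "2024-01-20", "amount": 15.0},
    {"product": "gadget", "date": "2024-01-11", "amount": 7.5},
    {"product": "widget", "date": "2024-02-02", "amount": 20.0},
]


def summarize_by_month(rows):
    """Roll atomic rows up to totals keyed by (product, "YYYY-MM")."""
    totals = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # keep the time dimension at month grain
        totals[(row["product"], month)] += row["amount"]
    return dict(totals)


# Summary data derived from the atomic data; both would be stored.
summary_sales = summarize_by_month(atomic_sales)
```

Keeping the atomic rows alongside the summary means a different roll-up (by quarter, by region) can be computed later without returning to the source systems.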
Typical Data Warehousing Process
Phase I: STRATEGY
  Identify business requirements.
  Define objectives and purpose of DW.
Phase II: DEFINITION
  Project scoping and planning, using a building-block approach.
Phase III: ANALYSIS
  Information requirements are defined.
Phase IV: DESIGN
  Database structures to hold base data and summaries are created.
  Translation mechanisms are designed.
Phase V: BUILD AND DOCUMENT
  The warehouse is built and documentation is developed.
Phase VI: POPULATE, TEST, AND TRAIN
  The warehouse is populated and tested.
  The users are trained on system and tools.
Phase VII: DISCOVERY AND EVOLUTION
  The warehouse is monitored and adjustments are applied, or future
  extensions are planned.
The process is iterative.
Data Warehouse Compared to OLTP
Property         OLTP                    Data Warehouse
Activities       Processes               Analysis
Response Time    Subseconds to seconds   Seconds to hours
Operations       DML                     Primarily read-only
Nature of Data   Current                 Snapshots over time
Data Organized   By application          By subject, time
Size             Small to large          Large to very large
Data Sources     Operational, internal   Operational, internal, external
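
The "Operations" and "Nature of Data" rows can be sketched with an in-memory SQLite database: the OLTP-style table is changed in place by DML and only knows the current value, while the warehouse-style table accumulates dated snapshots that are then only read. The table names `account` and `balance_snapshot` and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical OLTP table: holds only the current state of each account.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
# Hypothetical warehouse table: one row per account per snapshot date.
conn.execute(
    "CREATE TABLE balance_snapshot "
    "(snapshot_date TEXT, account_id INTEGER, balance REAL)"
)

# OLTP activity: DML that overwrites the current value.
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("INSERT INTO balance_snapshot VALUES ('2024-01-31', 1, 100.0)")
conn.execute("UPDATE account SET balance = balance - 30.0 WHERE id = 1")
conn.execute("INSERT INTO balance_snapshot VALUES ('2024-02-29', 1, 70.0)")

# OLTP question: what is the balance now?
current = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]

# Warehouse question (read-only): how did the balance develop over time?
history = conn.execute(
    "SELECT snapshot_date, balance FROM balance_snapshot "
    "WHERE account_id = 1 ORDER BY snapshot_date"
).fetchall()
```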
Data Warehouse Compared with Data Mart
Property              Data Warehouse    Data Mart
Scope                 Enterprise        Department
Subjects              Multiple          Single-subject, line of business (LOB)
Data Source           Many              Few
Size (typical)        See notes below   See notes below
Implementation Time   Months to years   Months
Independent Versus Dependent Marts
[Diagram: Independent: sources -> data marts. Dependent: sources -> warehouse -> data marts.]
Independent Data Mart
[Diagram: operational systems, flat files, and external data feed a sales or marketing data mart directly.]
Dependent Data Mart
[Diagram: operational systems, flat files, and external data feed the data warehouse (Marketing, Sales, Finance, Human Resources), which in turn feeds the Marketing, Sales, and Finance data marts.]
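
A minimal sketch of the dependent arrangement, assuming the warehouse is already loaded: each mart is derived from the warehouse by subject area rather than built from the source systems. The row layout and the helper `build_dependent_mart` are hypothetical.

```python
# Hypothetical warehouse content covering several subject areas.
warehouse = [
    {"subject": "sales", "region": "north", "amount": 120.0},
    {"subject": "marketing", "campaign": "spring", "spend": 40.0},
    {"subject": "sales", "region": "south", "amount": 80.0},
    {"subject": "finance", "account": "4000", "balance": 900.0},
]


def build_dependent_mart(warehouse_rows, subject):
    """Return copies of the warehouse rows that belong to one subject area."""
    return [dict(row) for row in warehouse_rows if row["subject"] == subject]


# Every mart reads from the same warehouse, never from the sources directly.
sales_mart = build_dependent_mart(warehouse, "sales")
finance_mart = build_dependent_mart(warehouse, "finance")
```

Because both marts are cut from the same warehouse rows, they share any cleaning and consolidation already done there; an independent mart would have to repeat that work against the sources.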
Purpose of an Enterprise Model
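[Diagram: data flow in four stages, supported by a metadata repository]
  Extract:         flat files, operational systems, RDBMS, external data,
                   and server log files are extracted into staging areas.
  Transform/Load:  transformations load the staged data into the
                   enterprise model (atomic data).
  Publish:         federated data warehouse and dependent data marts
                   (labels: B2C, B2B, Clickstream).
  Subscribe:       access layers and portal.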
Extract, Transform, Load (ETL) Processes
– Extract source data.
– Transform/clean data.
– Index and summarize.
– Load data into warehouse.
– Detect changes.
– Refresh data.
[Diagram: ETL moves data from operational systems to the warehouse by means of programs, gateways, and tools.]
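
The steps above can be sketched end to end with the Python standard library. This is a toy example under stated assumptions, not a production design: the inline CSV stands in for a source extract, and the table names `sales` and `sales_summary`, the column layout, and the cleaning rules are all invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; row 2 has no amount and names are untidy.
SOURCE_CSV = """customer_id,name,amount
1, Alice ,10.50
2,Bob,
1, Alice ,4.50
3,carol,7.00
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform/clean: drop rows without an amount, tidy names, cast types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumption: rows without an amount are rejected
        cleaned.append(
            (int(row["customer_id"]), row["name"].strip().title(), float(row["amount"]))
        )
    return cleaned


def load(conn, rows):
    """Load the cleaned rows, then index and summarize them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer_id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_customer ON sales (customer_id)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_summary AS "
        "SELECT customer_id, SUM(amount) AS total FROM sales GROUP BY customer_id"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
summary = conn.execute(
    "SELECT customer_id, total FROM sales_summary ORDER BY customer_id"
).fetchall()
```

Change detection and refresh are left out of the sketch; a real load would also need a rule for rows that already exist in the warehouse.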
ETL Processes
  – Must result in data that is relevant, useful, high-quality,
    accurate, and accessible
  – Require a large proportion of warehouse
    development time and resources
[Diagram: ETL cleans up, consolidates, and restructures data from operational systems on its way to the warehouse, so that it is relevant, useful, of good quality, accurate, and accessible.]
Possible Reasons for ETL Failure
– A missing source file
– A system failure
– Inadequate metadata
– Poor mapping information
– Inadequate storage planning
– A source structural change
– No contingency plan
– Inadequate data validation
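
Some of these failures can be caught before a load starts. The sketch below checks for three of them: a missing source file, a source structural change (unexpected column header), and inadequate data validation (a non-numeric amount). The function `preflight`, the expected column list, and the sample file are assumptions made for this example.

```python
import csv
import os
import tempfile

# Hypothetical layout the mappings were built against.
EXPECTED_COLUMNS = ["customer_id", "name", "amount"]


def preflight(path, expected_columns):
    """Return a list of problems found before loading (empty if none)."""
    if not os.path.exists(path):
        return [f"missing source file: {path}"]
    problems = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames != expected_columns:
            # Stop here: row checks are meaningless if the layout changed.
            return [f"source structure changed: {reader.fieldnames}"]
        for line_no, row in enumerate(reader, start=2):  # line 1 is the header
            try:
                float(row["amount"])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: invalid amount {row['amount']!r}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "sales.csv")
    with open(source, "w", newline="") as handle:
        handle.write("customer_id,name,amount\n1,Alice,10.5\n2,Bob,oops\n")
    report_present = preflight(source, EXPECTED_COLUMNS)
    report_missing = preflight(os.path.join(tmp, "absent.csv"), EXPECTED_COLUMNS)
```

Returning a list of problems rather than raising on the first one lets a contingency plan decide whether to abort, skip bad rows, or reuse the previous load.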
Typical Warehousing Development Tasks
Source to staging:
  Define source metadata
  Define staging area metadata
  Map source to staging area
  Deploy database structures
  Deploy mappings
  Extract data into staging tables
Staging to warehouse:
  Define enterprise model (warehouse) metadata
  Map staging area to enterprise model
  Deploy database structures
  Deploy mappings
  Extract data into the enterprise model
Warehouse to data marts:
  Define data mart metadata (cubes, dimensions)
  Map enterprise model to data marts
  Deploy database structures
  Deploy mappings
  Extract data into the data mart
Administration:
  Refresh warehouse and data mart
  Maintain warehouse and data mart
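
One way to picture the "map source to staging area" task is a mapping kept as data and applied row by row. The sketch below is only an illustration: the source column names (`CUST_NO`, `CUST_NM`, `ORD_AMT`), the staging column names, and the helper `apply_mapping` are hypothetical, and real warehouse tools store such mappings in their own metadata repository.

```python
# Hypothetical mapping: source column -> (staging column, converter).
SOURCE_TO_STAGING = {
    "CUST_NO": ("customer_id", int),
    "CUST_NM": ("customer_name", str.strip),
    "ORD_AMT": ("order_amount", float),
}


def apply_mapping(source_row, mapping):
    """Rename and convert one source row according to the mapping."""
    return {
        target: convert(source_row[source])
        for source, (target, convert) in mapping.items()
    }


# Hypothetical source extract with string-typed values.
source_rows = [{"CUST_NO": "7", "CUST_NM": " Acme ", "ORD_AMT": "19.90"}]

# "Extract data into staging tables": apply the deployed mapping to each row.
staging_rows = [apply_mapping(row, SOURCE_TO_STAGING) for row in source_rows]
```

The same pattern would repeat for staging to warehouse and warehouse to data marts, each with its own mapping.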
Visit more self help tutorials

• Pick a tutorial of your choice and browse
  through it at your own pace.
• The tutorials section is free, self-guiding and
  will not involve any additional support.
• Visit us at www.dataminingtools.net
