SlideShare ist ein Scribd-Unternehmen logo
1 von 46
An efficient way of data analysis.
Data warehousing
Data Warehouse (DW) is a Subject oriented,
 integrated, time variant, non-volatile collection of
 data in support of management’s system.
        -W.H.INMON

It is a collection of data designed to support
 management decision making by presenting a
 coherent picture of business conditions at a single
 point of time.
Features of Data Warehouse:-
    Subject-oriented

    Integrated

    Time-variant
                                    Data
    Nonvolatile
                                  Warehouse




                        Subject             Time-     Non
                        -       Integrated
                                           variant   volatile
                        oriente
                        d
Subject-oriented:
    Data are organized according to the subject instead of
     application.
     It mainly focuses on modeling and analysis of data for
     decision makers, not on daily operations or transaction
     processing.

Integrated:
     Constructed by integrating multiple, heterogeneous
      data sources like relational databases, flat files, on-line
      transaction records.
     Ensure consistency in naming conventions, encoding
      structures, attribute measures, etc. among different
      data sources.
Time-variant:
      The time horizon for the data warehouse is significantly
       longer than that of operational systems. i.e. provide
       information from a historical perspective (e.g., past 5-10
       years).


Non-volatile:
    No updates are allowed. Once the data entered into the
     data warehouse, they are never removed.
    The data in warehouse represent company’s history, the
     operational data representing near term history are
     always added to it..
Need for Decision Support
System(DSS) in business
In a typical business environment with an increasing
  competitive market, we cannot ponder over decisions for
  too long.
   Hence the business needs managers with quick decision
     making.
   Therefore, to succeed in business today, any company
     needs information systems that can support the diverse
     information and decision-making needs.
How many businesses appear and disappear
each year?
Decision Support Systems helps us to assess and
 resolve everyday business questions.
It works by compiling useful information from a
 combination of raw data, documents, personal
 knowledge, or business models.
Structured and Unstructured components
of DSS

Every task has a structured component as well as
 an unstructured component.
A Structured component is one which directly
 helps us to proceed towards decision.
An Unstructured component is that which is to
 be still processed and requires human interaction
 with the DSS.
For example, choosing a stock portfolio.
   Here the   Structured components are the amount
    of risk and the performance of stocks, but
   The choice of stocks for a portfolio requires human
    intuition. Hence it is Unstructured.


                                   risk
                                               Structured
          Stock
                                performance
        Portfolio
                                 Choice of
                                 Stock for     Unstructured
                                 portfolio
In order to implement DSS……
DSS architectural styles
 OLTP (Online Transaction Processing)
  -used by traditional operational systems (RDBMS).

 OLAP (Online Analytical Processing)
  -used by Data Warehouse.
The Operational Database is one which is
 accessed by an Operational System to carry out
 regular operations of an organization.
Operational Databases usually use an OLTP
 architecture which is optimized for faster
 transaction processing.
OLTP Databases access the data in the form of
 operations like- Inserting, Deleting, and Updating
 data.
OLTP
OLTP is a methodology to provide end users with
 access to large amounts of data
It works in an intuitive and rapid manner to assist
 with deductions based on investigative reasoning.
OLTP refers to a class of systems that facilitate
 and manage transaction-oriented applications,
 typically for data entry and retrieval transaction
 processing.
A typical OLTP Architecture


An Automatic Teller Machine  (ATM)
 for a bank is an example of a commercial
 transaction processing application.
Benefits of OLTP
1.   Simplicity & Efficiency:
             Reduced paper trails and the faster &
     more accurate forecasts for revenues and
     expenses are both examples of how OLTP
     makes things simpler for businesses.
2.   OLTP systems maintain data integrity and they
     also provide fast query processing in
     environments having multiple access.
Pitfalls of OLTP:-
 OLTP requires instant update.
 The data what we get from OLTP is not suitable
  for data analysis.
 To perform one simple transaction even with
  the normalized structure, we need to query
  multiple tables by using joins.
We usually refer to information as
 attributes related to entities, objects or
 classes, like product price, invoice amount
 or client name. Mapping can be with a
 simple, one argument function:


               product » price
               invoice » amount
               client » name

       each attribute is mapped to one
       column.
What is gross margin by product category
in Europe and Asia?
What's our current inventory by product
and warehouse?
Which was the evolution of return rate of
different products acquired by different
suppliers?
 functions of multiple arguments



   Product category × Region » Gross margin
   Product × Warehouse » Inventory
   Supplier × Time × Product » Return rate



  More than one attribute is mapped to one
  column.
Hence for data reporting and analysis we
 must have a new system.
       DATA WAREHOUSING
DATA WAREHOUSE
Data Warehouse is a database used for data
 reporting and analysis.
The data stored in the warehouse
 are uploaded from the operational systems (such
 as marketing, sales etc.)
The data store contains two main types of data.
    -Business Data
    -Business Data Model
                             Operational
The business data are          data
                                               Business
 extracted from                                  data

 operational database         External
                                data
  and from external
 sources which are relevant to the business (stock
 prices, market indicators, marketing information
 and competitor’s data).
DSS data vs Operational data
The DSS data which is extracted from
 multiple sources differ from operational data
 in three main areas:
         -time span,
             -granularity and
                - dimensionality.
Business data represent a
 snapshot of the company’s
 situation.
 A typical ETL (Extract
   Transform Load)-based
   data warehouse uses
  a) staging,
  b) integration and
  c) access layers (Data
     Marts)
     to house its key
      functions.
The data that arrived at data
 warehouse are first passed to
 Operational Data Store
 (ODS).
Data is integrated from
 multiple sources for
 additional operations on the
 data.
This integrated data is
 passed back to operational
 systems for decision-making.
DSS components
The DSS is usually composed of five main
 components:-
     Data store component
     Data extraction component

     Data filtering component

     End user query tool

     End user presentation tool
Data Sources   Data Storage   OLAP engine Front end tools
The data in the data warehouse is stored in the form
 of Data marts.
It allows the user to access the data in terms of a
 specific business line or team.
Data mart
The data mart is a subset of the data warehouse
 that is usually oriented to a specific business line
 or team.                        Multi-Tier Data
                                                Warehouse
               Distributed Data
               Marts


                                                     Enterprise
        Data              Data
                                                     Data
        Mart              Mart
                                                     Warehouse

           Model refinement       Model refinement

         Define a high-level corporate data model
OLAP(Online Analytical Processing)
OLAP is an approach to answer multi-dimensional
 analytical queries which also encompasses relational
 reporting and data mining.

An OLAP cube is an array of data that is understood
 in terms of its 0 or more dimensions which enables
 the users to gain insight into their data in a fast,
 interactive, easy-to-use manner.
For example, a company might
  wish to summarize financial
  data by product (Maruthi), by
  time-period (May 2006), by
  city (Mumbai) to compare
  actual budget expenses.

 Product, time, city and
  scenario (actual and budget)
  are the data's dimensions.

In this example, the cell
  contains the values of all      OLAP Cube
  Maruthi sales in Mumbai for
  May 2006.
The data in Data Warehouse is arranged in the form
 of hierarchical groups often called dimensions and
 into facts tables and aggregate facts.

OLAP data is typically stored in a Star Schema.
          -which is a combination of dimensions and
 fact tables.
STAR SCHEMA
time
Time_key                                             item
day                                                item_key
day_of_the_week               Sales Fact Table     item_name
month                                              brand
quarter                                time_key    type
year                                               supplier_type
                                       item_key
                                     branch_key
       branch                                      location
                                    location_key
       branch_key                                  location_key
       branch_name                    units_sold   street
       branch_type                                 city
                                    dollars_sold   state_or_province
                                                   country
                                      avg_sales
                  Measures
Fact table consists of the measurements metrics or
 facts of a business process. It is located at the center
 of a star schema surrounded by dimension tables.
In the above example, it holding sales by item, by
 time, by branch and by location.
 Dimension tables contain descriptive attributes (or
 fields) which are typically textual fields or discrete
 numbers that behave like text.
These attributes are designed to serve two critical
 purposes:
         -query constraining/filtering
         -query result set labeling.
OLAP Architecture
OLAP Server
OLAP Server receives the data from data warehouse
 by which it represents the data in a user
 understandable format which actually supply
 analytical functionality for the DSS system.
OLAP Server generally performs data analysis in two
 forms.
              -ROLAP(Relational OLAP)
              -MOLAP(Multi-dimensional OLAP)
ROLAP
 It is a form of OLAP that performs dynamic multi-
 dimensional analysis of data stored in a relational
 database rather than in a multi-dimensional
 database (which is usually considered the OLAP
 standard).
Data processing may take place within the database
 system, a mid-tier server, or the client.
In two-tier architecture, the user submits a
 Structured Query Language (SQL) query to the
 database and receives back the requested data.
Two-Tier Architecture of ROLAP
MOLAP (Multi-dimensional OLAP)

It is a form of OLAP that helps the user to “slice and
 dice” information, providing multi-dimensional
 analysis of data by putting data in a cube structure.

Most MOLAP products use a multi-cube approach in
 which a series of small, dense, pre-calculated cubes
 make a hypercube.
Slicing(above) and Dicing(below) a Cube
It is an ideal tool that
  helps to query data which
  involves time-line (day,
  week, month, year),
  geographical areas (city,
  state, country), various
  product lines or categories
  and channels (sales
  people, stores, etc).
Three kinds of data warehouse applications
 Information processing
     supports querying, basic statistical analysis, and reporting using
      crosstabs, tables, charts and graphs.
 Analytical processing
     multidimensional analysis of data warehouse data.
     supports basic OLAP operations, slice-dice, drilling, pivoting.
 Data mining
     knowledge discovery from hidden patterns .
     supports associations, constructing analytical models,
      performing classification and prediction, and presenting the
      mining results using visualization tools.
Our business in life is not to get ahead of others,
 but to get ahead of ourselves—to break our own
 records, to outstrip our yesterday by our today.
                -Stewart B.Johnson
Queries ??

Weitere ähnliche Inhalte

Was ist angesagt?

Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Basic Introduction of Data Warehousing from Adiva Consulting
Basic Introduction of  Data Warehousing from Adiva ConsultingBasic Introduction of  Data Warehousing from Adiva Consulting
Basic Introduction of Data Warehousing from Adiva Consultingadivasoft
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Edureka!
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecturepcherukumalla
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouseKrish_ver2
 
Data warehousing Demo PPTS | Over View | Introduction
Data warehousing Demo PPTS | Over View | Introduction Data warehousing Demo PPTS | Over View | Introduction
Data warehousing Demo PPTS | Over View | Introduction Kernel Training
 
Seminar datawarehousing
Seminar datawarehousingSeminar datawarehousing
Seminar datawarehousingKavisha Uniyal
 
data warehouse , data mart, etl
data warehouse , data mart, etldata warehouse , data mart, etl
data warehouse , data mart, etlAashish Rathod
 

Was ist angesagt? (20)

Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Basic Introduction of Data Warehousing from Adiva Consulting
Basic Introduction of  Data Warehousing from Adiva ConsultingBasic Introduction of  Data Warehousing from Adiva Consulting
Basic Introduction of Data Warehousing from Adiva Consulting
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Data warehouse
Data warehouse Data warehouse
Data warehouse
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data mart
Data martData mart
Data mart
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
Data warehousing Demo PPTS | Over View | Introduction
Data warehousing Demo PPTS | Over View | Introduction Data warehousing Demo PPTS | Over View | Introduction
Data warehousing Demo PPTS | Over View | Introduction
 
Seminar datawarehousing
Seminar datawarehousingSeminar datawarehousing
Seminar datawarehousing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Data warehousing ppt
Data warehousing pptData warehousing ppt
Data warehousing ppt
 
data warehouse , data mart, etl
data warehouse , data mart, etldata warehouse , data mart, etl
data warehouse , data mart, etl
 

Ähnlich wie Introduction to Data Warehouse

DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseSOMASUNDARAM T
 
UNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxUNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxDURGADEVIL
 
SAP BODS -quick guide.docx
SAP BODS -quick guide.docxSAP BODS -quick guide.docx
SAP BODS -quick guide.docxKen T
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3ambujm
 
Dataware housing
Dataware housingDataware housing
Dataware housingwork
 
Become BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPBecome BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPDhiren Gala
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4ambujm
 
dw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptdw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptDougSchoemaker
 
Data warehouse
Data warehouseData warehouse
Data warehouseMR Z
 
Data warehouse
Data warehouseData warehouse
Data warehouse_123_
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSSDeepali Raut
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptSubrata Kumer Paul
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl conceptsjeshocarme
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forAyushMeraki1
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingwork
 

Ähnlich wie Introduction to Data Warehouse (20)

DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Datawarehouse and OLAP
Datawarehouse and OLAPDatawarehouse and OLAP
Datawarehouse and OLAP
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
UNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxUNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docx
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
SAP BODS -quick guide.docx
SAP BODS -quick guide.docxSAP BODS -quick guide.docx
SAP BODS -quick guide.docx
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3
 
Dataware housing
Dataware housingDataware housing
Dataware housing
 
DW 101
DW 101DW 101
DW 101
 
Become BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPBecome BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAP
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4
 
dw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptdw_concepts_2_day_course.ppt
dw_concepts_2_day_course.ppt
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSS
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl concepts
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining for
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 

Kürzlich hochgeladen

Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptxAneriPatwari
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...Nguyen Thanh Tu Collection
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...DhatriParmar
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesVijayaLaxmi84
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfChristalin Nelson
 
4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptxmary850239
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxAnupam32727
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 

Kürzlich hochgeladen (20)

Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptx
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their uses
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdf
 
4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 

Introduction to Data Warehouse

  • 1. An efficient way of data analysis.
  • 2. Data warehousing Data Warehouse (DW) is a Subject oriented, integrated, time variant, non-volatile collection of data in support of management’s system. -W.H.INMON It is a collection of data designed to support management decision making by presenting a coherent picture of business conditions at a single point of time.
  • 3. Features of Data Warehouse:-  Subject-oriented  Integrated  Time-variant Data  Nonvolatile Warehouse Subject Time- Non - Integrated variant volatile oriente d
  • 4. Subject-oriented: Data are organized according to the subject instead of application.  It mainly focuses on modeling and analysis of data for decision makers, not on daily operations or transaction processing. Integrated:  Constructed by integrating multiple, heterogeneous data sources like relational databases, flat files, on-line transaction records.  Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources.
  • 5. Time-variant:  The time horizon for the data warehouse is significantly longer than that of operational systems. i.e. provide information from a historical perspective (e.g., past 5-10 years). Non-volatile:  No updates are allowed. Once the data entered into the data warehouse, they are never removed.  The data in warehouse represent company’s history, the operational data representing near term history are always added to it..
  • 6. Need for Decision Support System(DSS) in business In a typical business environment with an increasing competitive market, we cannot ponder over decisions for too long. Hence the business needs managers with quick decision making. Therefore, to succeed in business today, any company needs information systems that can support the diverse information and decision-making needs.
  • 7. How many businesses appear and disappear each year?
  • 8. Decision Support Systems helps us to assess and resolve everyday business questions. It works by compiling useful information from a combination of raw data, documents, personal knowledge, or business models.
  • 9. Structured and Unstructured components of DSS Every task has a structured component as well as an unstructured component. A Structured component is one which directly helps us to proceed towards decision. An Unstructured component is that which is to be still processed and requires human interaction with the DSS.
  • 10. For example, choosing a stock portfolio.  Here the Structured components are the amount of risk and the performance of stocks, but  The choice of stocks for a portfolio requires human intuition. Hence it is Unstructured. risk Structured Stock performance Portfolio Choice of Stock for Unstructured portfolio
  • 11. In order to implement DSS……
  • 12. DSS architectural styles  OLTP (Online Transaction Processing) -used by traditional operational systems (RDBMS).  OLAP (Online Analytical Processing) -used by Data Warehouse.
  • 13. The Operational Database is one which is accessed by an Operational System to carry out regular operations of an organization. Operational Databases usually use an OLTP architecture which is optimized for faster transaction processing. OLTP Databases access the data in the form of operations like- Inserting, Deleting, and Updating data.
  • 14. OLTP OLTP is a methodology to provide end users with access to large amounts of data It works in an intuitive and rapid manner to assist with deductions based on investigative reasoning. OLTP refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing.
  • 15. A typical OLTP Architecture An Automatic Teller Machine  (ATM) for a bank is an example of a commercial transaction processing application.
  • 16. Benefits of OLTP 1. Simplicity & Efficiency: Reduced paper trails and the faster & more accurate forecasts for revenues and expenses are both examples of how OLTP makes things simpler for businesses. 2. OLTP systems maintain data integrity and they also provide fast query processing in environments having multiple access.
  • 17. Pitfalls of OLTP:-  OLTP requires instant update.  The data what we get from OLTP is not suitable for data analysis.  To perform one simple transaction even with the normalized structure, we need to query multiple tables by using joins.
  • 18. We usually refer to information as attributes related to entities, objects or classes, like product price, invoice amount or client name. Mapping can be with a simple, one argument function: product » price invoice » amount client » name each attribute is mapped to one column.
  • 19. What is gross margin by product category in Europe and Asia? What's our current inventory by product and warehouse? Which was the evolution of return rate of different products acquired by different suppliers?
  • 20.  functions of multiple arguments Product category × Region » Gross margin Product × Warehouse » Inventory Supplier × Time × Product » Return rate More than one attribute is mapped to one column.
  • 21. Hence for data reporting and analysis we must have a new system. DATA WAREHOUSING
  • 22. DATA WAREHOUSE Data Warehouse is a database used for data reporting and analysis. The data stored in the warehouse are uploaded from the operational systems (such as marketing, sales etc.)
  • 23. The data store contains two main types of data. -Business Data -Business Data Model Operational The business data are data Business extracted from data operational database External data and from external sources which are relevant to the business (stock prices, market indicators, marketing information and competitor’s data).
  • 24. DSS data vs Operational data The DSS data which is extracted from multiple sources differ from operational data in three main areas: -time span, -granularity and - dimensionality.
  • 25. Business data represent a snapshot of the company’s situation.
  • 26.  A typical ETL (Extract Transform Load)-based data warehouse uses a) staging, b) integration and c) access layers (Data Marts) to house its key functions.
  • 27. The data that arrived at data warehouse are first passed to Operational Data Store (ODS). Data is integrated from multiple sources for additional operations on the data. This integrated data is passed back to operational systems for decision-making.
  • 28. DSS components The DSS is usually composed of five main components:-  Data store component  Data extraction component  Data filtering component  End user query tool  End user presentation tool
  • 29. Data Sources Data Storage OLAP engine Front end tools
  • 30. The data in the data warehouse is stored in the form of Data marts. It allows the user to access the data in terms of a specific business line or team.
  • 31. Data mart The data mart is a subset of the data warehouse that is usually oriented to a specific business line or team. Multi-Tier Data Warehouse Distributed Data Marts Enterprise Data Data Data Mart Mart Warehouse Model refinement Model refinement Define a high-level corporate data model
  • 32.
  • 33. OLAP(Online Analytical Processing) OLAP is an approach to answer multi-dimensional analytical queries which also encompasses relational reporting and data mining. An OLAP cube is an array of data that is understood in terms of its 0 or more dimensions which enables the users to gain insight into their data in a fast, interactive, easy-to-use manner.
  • 34. For example, a company might wish to summarize financial data by product (Maruthi), by time-period (May 2006), by city (Mumbai) to compare actual budget expenses.  Product, time, city and scenario (actual and budget) are the data's dimensions. In this example, the cell contains the values of all OLAP Cube Maruthi sales in Mumbai for May 2006.
  • 35. The data in Data Warehouse is arranged in the form of hierarchical groups often called dimensions and into facts tables and aggregate facts. OLAP data is typically stored in a Star Schema. -which is a combination of dimensions and fact tables.
  • 36. STAR SCHEMA time Time_key item day item_key day_of_the_week Sales Fact Table item_name month brand quarter time_key type year supplier_type item_key branch_key branch location location_key branch_key location_key branch_name units_sold street branch_type city dollars_sold state_or_province country avg_sales Measures
  • 37. Fact table consists of the measurements metrics or facts of a business process. It is located at the center of a star schema surrounded by dimension tables. In the above example, it holding sales by item, by time, by branch and by location.  Dimension tables contain descriptive attributes (or fields) which are typically textual fields or discrete numbers that behave like text. These attributes are designed to serve two critical purposes: -query constraining/filtering -query result set labeling.
  • 39. OLAP Server OLAP Server receives the data from data warehouse by which it represents the data in a user understandable format which actually supply analytical functionality for the DSS system. OLAP Server generally performs data analysis in two forms. -ROLAP(Relational OLAP) -MOLAP(Multi-dimensional OLAP)
  • 40. ROLAP  It is a form of OLAP that performs dynamic multi- dimensional analysis of data stored in a relational database rather than in a multi-dimensional database (which is usually considered the OLAP standard). Data processing may take place within the database system, a mid-tier server, or the client. In two-tier architecture, the user submits a Structured Query Language (SQL) query to the database and receives back the requested data.
  • 42. MOLAP (Multi-dimensional OLAP) It is a form of OLAP that helps the user to “slice and dice” information, providing multi-dimensional analysis of data by putting data in a cube structure. Most MOLAP products use a multi-cube approach in which a series of small, dense, pre-calculated cubes make a hypercube.
  • 43. Slicing(above) and Dicing(below) a Cube It is an ideal tool that helps to query data which involves time-line (day, week, month, year), geographical areas (city, state, country), various product lines or categories and channels (sales people, stores, etc).
  • 44. Three kinds of data warehouse applications  Information processing  supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs.  Analytical processing  multidimensional analysis of data warehouse data.  supports basic OLAP operations, slice-dice, drilling, pivoting.  Data mining  knowledge discovery from hidden patterns .  supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools.
  • 45. Our business in life is not to get ahead of others, but to get ahead of ourselves—to break our own records, to outstrip our yesterday by our today. -Stewart B.Johnson