ETL Process In Data Warehouse

  By: Komal Choudhary
Outline
• ETL
• Extraction
• Transformation
• Loading
ETL Overview
• ETL stands for Extraction, Transformation, Loading.
• Its purpose is to get data out of the source and load it into the data warehouse.
• Data is extracted from an OLTP database, transformed to match the data warehouse schema, and loaded into the data warehouse database.
Process
Why???
• As data sources change, the data warehouse must be updated periodically.
• As the business changes, the DW system needs to change too in order to maintain its value as a tool for decision makers. As a result, the ETL also changes and evolves. ETL processes must therefore be designed for ease of modification. A solid, well-designed, and documented ETL system is necessary for the success of a data warehouse project.
• An ETL system consists of three consecutive functional steps: extraction, transformation, and loading.
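The three consecutive steps can be sketched as three functions chained together. This is a minimal illustration, not a real ETL tool: the in-memory `SOURCE_ROWS`, the field names, and the list used as a "warehouse" are all assumptions made for the example.

```python
from typing import Iterable

# Hypothetical source rows standing in for an OLTP table.
SOURCE_ROWS = [
    {"order_id": 1, "amount_cents": 1250, "country": "de"},
    {"order_id": 2, "amount_cents": 300, "country": "us"},
]


def extract(source: Iterable[dict]) -> list[dict]:
    """Extraction: copy the required rows out of the source."""
    return [dict(row) for row in source]


def transform(rows: list[dict]) -> list[dict]:
    """Transformation: reshape rows to match the (assumed) warehouse schema."""
    return [
        {
            "order_id": row["order_id"],
            "amount": row["amount_cents"] / 100,  # cents -> currency units
            "country": row["country"].upper(),    # one spelling per country
        }
        for row in rows
    ]


def load(rows: list[dict], warehouse: list[dict]) -> int:
    """Loading: append transformed rows to the target and return the count."""
    warehouse.extend(rows)
    return len(rows)


def run_etl(source: Iterable[dict], warehouse: list[dict]) -> int:
    """Run the three steps strictly one after another."""
    return load(transform(extract(source)), warehouse)


warehouse: list[dict] = []
loaded = run_etl(SOURCE_ROWS, warehouse)
print(loaded, warehouse[0])
```

Keeping each step in its own function is one way to keep the process easy to modify: a change in the source touches only `extract`, and a change in the warehouse schema touches only `transform` and `load`.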
Extraction
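Extract Process
• The Extract step covers extracting the data from the source system and making it accessible for further processing. The main objective of the extract step is to retrieve all the required data from the source system using as few resources as possible.
• There are several ways to perform the extract:
1. Update notification: the source system notifies the ETL process when a record has changed.
2. Incremental extract: the source system can identify which records have changed, and only those are extracted.
3. Full extract: the complete data set is extracted on every run.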
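The difference between a full extract and an incremental extract can be sketched with Python's built-in `sqlite3` module. The `customers` table, its `updated_at` column, and the stored "last run" timestamp are assumptions made for illustration; an update-notification setup depends on the source system and is not shown.

```python
import sqlite3


def full_extract(conn: sqlite3.Connection) -> list[tuple]:
    """Full extract: read every row, regardless of when it changed."""
    return conn.execute(
        "SELECT id, name, updated_at FROM customers ORDER BY id"
    ).fetchall()


def incremental_extract(conn: sqlite3.Connection, last_run: str) -> list[tuple]:
    """Incremental extract: read only rows changed since the last run.

    Assumes the source keeps an ISO-formatted `updated_at` column, so that
    string comparison matches chronological order.
    """
    return conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY id",
        (last_run,),
    ).fetchall()


# Hypothetical source system, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Ada", "2024-01-01"),
        (2, "Grace", "2024-02-15"),
        (3, "Linus", "2024-03-10"),
    ],
)

all_rows = full_extract(conn)
changed = incremental_extract(conn, "2024-02-01")
print(len(all_rows), [row[0] for row in changed])
```

The incremental variant reads fewer rows, which is in line with the goal of using as few source-system resources as possible, but it only works when the source records when each row last changed.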
Clean
• The cleaning step is one of the most important, as it ensures the quality of the data in the data warehouse.
• Cleaning should apply basic data unification rules, such as:
1. Making identifiers unique
2. Converting null values into a standardized value
3. Converting phone numbers and ZIP codes to a standardized form
4. Validating address fields and converting them to consistent naming, e.g. Street/St/St./Str./Str
5. Validating address fields against each other (for example, city against ZIP code)
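A few of these unification rules can be sketched as small functions: standardizing null values, phone numbers, and street abbreviations. The placeholder text, the set of null markers, and the digits-only phone format are assumptions chosen for the example, not rules taken from the slides.

```python
import re
from typing import Optional

# Spellings treated as "no value" (an assumed, non-exhaustive set).
NULL_MARKERS = {"", "n/a", "na", "null", "none", "-"}

# Matches Street, St, St., Str, Str. as whole words, in any letter case.
STREET_VARIANTS = re.compile(r"\b(?:street|str|st)\b\.?", re.IGNORECASE)


def clean_null(value: Optional[str], placeholder: str = "Not Available") -> str:
    """Map every null-like spelling to one standardized placeholder."""
    if value is None or value.strip().lower() in NULL_MARKERS:
        return placeholder
    return value.strip()


def clean_phone(value: str) -> str:
    """Standardize a phone number by keeping only its digits."""
    return re.sub(r"\D", "", value)


def clean_street(value: str) -> str:
    """Rewrite street abbreviations to the single spelling 'Street'.

    Naive on purpose: it would also rewrite 'St.' in a name like
    'St. Louis', so real address cleaning needs more context.
    """
    return STREET_VARIANTS.sub("Street", value).strip()


print(clean_null(" N/A "))
print(clean_phone("+1 (555) 010-2030"))
print(clean_street("12 Main St."))
```

Each rule is a pure function of one field, so the rules can be tested on their own and applied per column. Validating fields against each other needs access to the whole record and is not covered here.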
Transformation
• The transformation step applies a set of rules to transform the data from the source to the target.
• This includes converting any measured data to the same dimension, using the same units, so that the data can later be joined.
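As a sketch of converting measured data to the same units, the example below brings weights from two sources to kilograms before combining them. The source records, field names, and supported units are assumptions made for illustration.

```python
# Conversion factors to kilograms (1 lb is defined as exactly 0.45359237 kg).
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}


def to_kilograms(value: float, unit: str) -> float:
    """Convert a weight in a known unit to kilograms."""
    try:
        factor = TO_KG[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown weight unit: {unit!r}") from None
    return value * factor


# Two hypothetical sources reporting the same product in different units.
source_a = [{"sku": "A1", "weight": 2.0, "unit": "kg"}]
source_b = [{"sku": "A1", "weight": 500.0, "unit": "g"}]

# After normalization every record carries the same unit and can be joined.
normalized = [
    {"sku": row["sku"], "weight_kg": to_kilograms(row["weight"], row["unit"])}
    for row in source_a + source_b
]
total_kg = sum(row["weight_kg"] for row in normalized)
print(normalized, total_kg)
```

Rejecting unknown units with an error, rather than passing the value through unchanged, keeps a mislabeled measurement from silently ending up in the warehouse.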
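Problems???
• Transformation has to deal with several classes of conflicts and problems. They can be distinguished at two levels, the schema level and the instance level, with instance-level problems arising for whole records or for single values:
1. Schema-level problems
2. Record-level problems
3. Value-level problems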
Solution…
• To deal with such issues, the integration and transformation tasks involve a wide variety of functions, such as normalizing, de-normalizing, reformatting, recalculating, summarizing, merging data from multiple sources, modifying key structures, adding an element of time, identifying default values, supplying decision commands to choose between …
Loading
• Loading data into the target multidimensional structure is the final ETL step. In this step, extracted and transformed data is written into the dimensional structures actually accessed by the end users and application systems. The loading step includes both loading dimension tables and …
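A loading step can be sketched with Python's built-in `sqlite3` module, assuming a small star schema. The `dim_customer` and `fact_sales` tables, their columns, and the use of a fact table at all are assumptions added for illustration; the slide itself only names dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_id  TEXT UNIQUE,                        -- key from the source
        name         TEXT
    );
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        amount       REAL
    );
    """
)


def load_dimension(conn: sqlite3.Connection, customers: list[tuple]) -> None:
    """Insert customers not seen before; existing rows are left untouched."""
    conn.executemany(
        "INSERT OR IGNORE INTO dim_customer (customer_id, name) VALUES (?, ?)",
        customers,
    )


def load_facts(conn: sqlite3.Connection, sales: list[tuple]) -> None:
    """Insert sales, resolving each source key to its surrogate key."""
    conn.executemany(
        "INSERT INTO fact_sales (customer_key, amount) "
        "SELECT customer_key, ? FROM dim_customer WHERE customer_id = ?",
        [(amount, customer_id) for customer_id, amount in sales],
    )


# Dimensions go first so that the facts can look up their surrogate keys.
with conn:
    load_dimension(conn, [("C1", "Ada"), ("C2", "Grace"), ("C1", "Ada")])
    load_facts(conn, [("C1", 10.0), ("C2", 5.5), ("C1", 2.5)])

dim_count = conn.execute("SELECT COUNT(*) FROM dim_customer").fetchone()[0]
totals = conn.execute(
    "SELECT d.customer_id, SUM(f.amount) "
    "FROM fact_sales f JOIN dim_customer d USING (customer_key) "
    "GROUP BY d.customer_id ORDER BY d.customer_id"
).fetchall()
print(dim_count, totals)
```

Running both loads inside one `with conn:` block makes them a single transaction, so a failure partway through leaves the target tables unchanged.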
Thanks!!!!!
