What is ETL?
Extraction, Transformation, Loading

Simple Example of ETL

Master Data

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

By Karthikeyan Selvaraj

Let’s say the master data table here is a flat file, i.e. an Excel file stored on your computer.
We need to bring this table into the SAP BI platform.

SAP BI Platform

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

The first step is to extract the master data table, i.e. the Excel file, into the BI data warehouse.
The components needed to extract the data into the BI data warehouse are:
1. DataSource
2. InfoPackage

1. DataSource

A DataSource describes the data: what type of data it is and where the data is located.
For example, once I finish this presentation, I will choose a location to save it in and the file version to save it as. Similarly, in the DataSource we define these details about the data.

2. InfoPackage

What is an InfoPackage?
Put simply, an InfoPackage is like a key that opens a room so that you can enter it.
It brings data in from a legacy system or an SAP system. In our scenario, it brings the data from our computer into the BI data warehouse.

Computer (Excel file)  --InfoPackage-->  BI data warehouse (DataSource)

We have now moved the master data table into the BI data warehouse by executing the InfoPackage.
Once the data arrives in BI, it is stored in a table called the PSA (Persistent Staging Area).
Data arriving from any source system is stored temporarily in the PSA.

Computer (Excel file)  --InfoPackage-->  BI data warehouse (DataSource, PSA)

PSA

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

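The extraction step can be pictured outside SAP with a small sketch. This is only an analogy in plain Python, not SAP code: in SAP BI the DataSource and InfoPackage are set up inside the system rather than written as a script. The names `DataSource` and `run_info_package`, and the in-memory CSV text standing in for the Excel file, are assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical stand-in: describes what the data is and where it lives."""
    fields: list            # what type of data: the column names to expect
    location: io.StringIO   # where the data is located (a file-like object here)


def run_info_package(source: DataSource) -> list:
    """Hypothetical stand-in for executing an InfoPackage:
    read every row from the source and return it as the staging (PSA) table."""
    reader = csv.DictReader(source.location)
    if reader.fieldnames != source.fields:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return [dict(row) for row in reader]


# An in-memory CSV stands in for the Excel flat file on the computer.
flat_file = io.StringIO(
    "Customer ID,Customer Name\n"
    "105,Sainsbury\n"
    "102,Tesco\n"
    "109,Waitrose\n"
    "101,Asda\n"
)

source = DataSource(fields=["Customer ID", "Customer Name"], location=flat_file)
psa = run_info_package(source)  # rows are now staged, unchanged, in the "PSA"
print(psa[0])
```

Note that the rows are staged exactly as they were read. No field is renamed or converted at this point, which mirrors the role of the PSA as a temporary holding area.
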
Transformation of Data
The first part of ETL, Extraction, is now complete. Next we need to transform the data so that it is better suited to reporting.
To do that, we define the fields of the table as InfoObjects. Our master data table has two fields, Customer ID and Customer Name, so in BI we define them as InfoObjects.
The main types of InfoObjects are:
1. Characteristics – sorting keys such as company code, product ID, etc.
2. Key figures – quantities, amounts or numbers of items, i.e. numeric values that can be calculated with.
3. Units – currencies and units of measure.
Customer ID and Customer Name are characteristic InfoObjects.

PSA field        Characteristic InfoObject
Customer ID      Customer ID InfoObject
Customer Name    Customer Name InfoObject

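As a rough illustration of the classification above, the three types can be modelled as an enumeration, with each field of the master data table declared as an InfoObject of one type. The class names `InfoObjectType` and `InfoObject` are hypothetical and exist only in this sketch. They are not SAP identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class InfoObjectType(Enum):
    """The three InfoObject types named on this slide."""
    CHARACTERISTIC = "characteristic"
    KEY_FIGURE = "key figure"
    UNIT = "unit"


@dataclass(frozen=True)
class InfoObject:
    """Hypothetical stand-in for an InfoObject definition."""
    name: str
    kind: InfoObjectType


# Both fields of the master data table are characteristics.
customer_id = InfoObject("Customer ID", InfoObjectType.CHARACTERISTIC)
customer_name = InfoObject("Customer Name", InfoObjectType.CHARACTERISTIC)

for obj in (customer_id, customer_name):
    print(f"{obj.name}: {obj.kind.value}")
```
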
Transformation of Data
Customer Name is the attribute of Customer ID.
Just as we define attributes for a primary key in a database, we need to define the attributes for the master data field, i.e. for Customer ID.
Once that is done, we do the mapping, i.e. the transformation: we map the fields of the DataSource to the fields of the InfoObjects.

DataSource       Transformation    InfoProvider
Customer ID      ->                Customer ID InfoObject
Customer Name    ->                Customer Name InfoObject

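The mapping can be sketched as a simple lookup from DataSource field names to InfoObject names. This is a plain-Python analogy under stated assumptions: the identifiers `CUSTOMER_ID` and `CUSTOMER_NAME` and the function `transform` are made up for illustration, and a real transformation in SAP BI is defined in the system itself.

```python
# Hypothetical transformation: DataSource field -> InfoObject it is mapped to.
TRANSFORMATION = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}


def transform(row: dict) -> dict:
    """Rename each DataSource field to the InfoObject it is mapped to."""
    return {TRANSFORMATION[field]: value for field, value in row.items()}


# One row as it sits in the PSA, before the mapping is applied.
psa_row = {"Customer ID": "105", "Customer Name": "Sainsbury"}
mapped_row = transform(psa_row)
print(mapped_row)
```

In this example the mapping is one-to-one, so only the field names change and the values pass through untouched.
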
Loading
Once the mapping is done, the data has to be transferred from the DataSource (PSA table) to the InfoProvider (InfoObjects).
This is done by a Data Transfer Process (DTP).
How? We create the DTP in the InfoProvider layer and activate it. After activation, we execute the DTP. The data from the PSA table is then transferred to the respective InfoObjects.

DataSource       Transformation + DTP    InfoProvider
Customer ID      ->                      Customer ID InfoObject
Customer Name    ->                      Customer Name InfoObject

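The loading step can be sketched in the same spirit. Again this is a plain-Python analogy, not an SAP API: the function `run_dtp`, the mapping names and the dictionary used as the target are assumptions for illustration. The sketch applies the mapping to every staged row and stores Customer Name as an attribute of the Customer ID key.

```python
# Hypothetical mapping from DataSource fields to InfoObject names.
TRANSFORMATION = {"Customer ID": "CUSTOMER_ID", "Customer Name": "CUSTOMER_NAME"}

# Rows as staged in the PSA by the extraction step.
psa = [
    {"Customer ID": "105", "Customer Name": "Sainsbury"},
    {"Customer ID": "102", "Customer Name": "Tesco"},
    {"Customer ID": "109", "Customer Name": "Waitrose"},
    {"Customer ID": "101", "Customer Name": "Asda"},
]


def run_dtp(psa_rows: list, mapping: dict) -> dict:
    """Hypothetical stand-in for executing a DTP: apply the mapping to every
    PSA row and store the result in the target, keyed by Customer ID."""
    info_provider = {}
    for row in psa_rows:
        mapped = {mapping[field]: value for field, value in row.items()}
        # Customer Name is stored as an attribute of the Customer ID key.
        info_provider[mapped["CUSTOMER_ID"]] = {
            "CUSTOMER_NAME": mapped["CUSTOMER_NAME"]
        }
    return info_provider


info_provider = run_dtp(psa, TRANSFORMATION)
print(info_provider["105"])
```
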
Loading
The data is moved to the respective InfoObjects according to the mapping and is now ready for reporting from the InfoProvider layer.

InfoProvider

Customer ID InfoObject    Customer Name InfoObject
105                       Sainsbury
102                       Tesco
109                       Waitrose
101                       Asda

Thank You

