ETL Process

What is ETL?
Extraction, Transformation, Loading

Simple Example of ETL

Master Data

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

By Karthikeyan Selvaraj
Let's say the master data table is a flat file (for example, an Excel file) stored on your computer.
We need to bring this table into the SAP BI platform.




SAP BI Platform

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda
The first step is to extract the master data table (the Excel file) into the BI data warehouse.
Two components are needed to extract data into the BI data warehouse:
1. DataSource
2. InfoPackage

1. DataSource

A DataSource describes the data. It answers two questions: what type of data is it, and where is the data located?
For example, when I finish this presentation, I choose a location to save the file and the format version to save it in. Similarly, in a DataSource we define these properties of the data.




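The idea that a DataSource describes data rather than holding it can be sketched in Python. This is only an analogy, not SAP code: the class `FlatFileDataSource`, its attribute names, and the file name `customers.csv` are hypothetical, and the sketch assumes the Excel file has been saved as CSV.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlatFileDataSource:
    """Hypothetical stand-in for a DataSource: metadata only, no rows."""
    name: str
    location: str            # where the data is located
    file_format: str         # what type of data it is
    fields: Tuple[str, ...]  # the columns the source provides


# Assumed description of the customer master data file.
customer_source = FlatFileDataSource(
    name="CUSTOMER_MASTER",
    location="customers.csv",
    file_format="csv",
    fields=("Customer ID", "Customer Name"),
)

print(customer_source.location, customer_source.fields)
```

Nothing is read from disk here. The object only records what a later load step would need to know.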
2. InfoPackage

What is an InfoPackage?
Put simply, an InfoPackage is like a key that opens a room and lets you in.
It brings data in from a legacy system or an SAP system. In our scenario, it brings the data from our computer into the BI data warehouse.

[Diagram: the Excel file (Customer ID, Customer Name) on the computer is loaded by the InfoPackage into the DataSource in the BI data warehouse.]
We have now moved the master data table into the BI data warehouse by executing the InfoPackage.
Once the data arrives in BI, it is stored in a table called the PSA (Persistent Staging Area).
Data arriving from any source system is first stored in the PSA, where it is staged before further processing.

[Diagram: Excel file (computer) -> InfoPackage -> DataSource -> PSA (BI data warehouse)]

PSA

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda
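The extraction step can be illustrated with a short Python sketch. It is an analogy under stated assumptions, not how SAP BI works internally: `run_infopackage` and `psa` are hypothetical names, and the Excel file is represented by CSV text so that the example is self-contained.

```python
import csv
import io

# Stand-in for the flat file on the computer (assumed to be exported as CSV).
FLAT_FILE = """Customer ID,Customer Name
105,Sainsbury
102,Tesco
109,Waitrose
101,Asda
"""


def run_infopackage(source_text):
    """Read every row from the source and return it unchanged.

    The returned list plays the role of the staging area: the rows keep
    the field names and values they had in the source file.
    """
    reader = csv.DictReader(io.StringIO(source_text))
    return [dict(row) for row in reader]


psa = run_infopackage(FLAT_FILE)
for row in psa:
    print(row["Customer ID"], row["Customer Name"])
```

The point of the sketch is that staging changes nothing: field names and values are exactly those of the source file.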
Transformation of Data
The first part of ETL, extraction, is now complete. Next we transform the data so that it is better suited to reporting.
To do that, we define the fields of the table as InfoObjects. Our master data table has two fields, Customer ID and Customer Name, so in BI we define each of them as an InfoObject.
Three types of InfoObject matter here:
1. Characteristics – sorting keys such as company code, product ID, etc.
2. Key Figures – quantities, amounts, or numbers of items; values that can be calculated with.
3. Units – currencies and units of measure.
Customer ID and Customer Name are characteristic InfoObjects.
PSA field        InfoObject                  Type
Customer ID      Customer ID InfoObject      Characteristic
Customer Name    Customer Name InfoObject    Characteristic
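The three InfoObject types listed above can be modelled as a small Python sketch. The names `InfoObjectType` and `InfoObject`, and the identifiers such as `CUSTOMER_ID`, are hypothetical and chosen only for illustration. The last two objects are invented examples of the other types and do not come from the customer table.

```python
from dataclasses import dataclass
from enum import Enum


class InfoObjectType(Enum):
    CHARACTERISTIC = "characteristic"  # sorting keys, e.g. company code
    KEY_FIGURE = "key figure"          # quantities and amounts
    UNIT = "unit"                      # currencies and units of measure


@dataclass(frozen=True)
class InfoObject:
    """Hypothetical model of an InfoObject: a named field with a type."""
    name: str
    kind: InfoObjectType


# The two fields of the customer master data table.
customer_id = InfoObject("CUSTOMER_ID", InfoObjectType.CHARACTERISTIC)
customer_name = InfoObject("CUSTOMER_NAME", InfoObjectType.CHARACTERISTIC)

# Illustrative examples of the other two types (not in the customer table).
order_quantity = InfoObject("ORDER_QUANTITY", InfoObjectType.KEY_FIGURE)
currency = InfoObject("CURRENCY", InfoObjectType.UNIT)

print(customer_id.kind.value, customer_name.kind.value)
```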
Transformation of Data
Customer Name is an attribute of Customer ID.
In a database we define attributes for a primary key; similarly, we need to define attributes for the master data field, here Customer ID.
Once that is done, we do the mapping, i.e. the transformation: we map the fields of the DataSource to the InfoObjects.


DataSource field    Transformation    InfoProvider
Customer ID         ->                Customer ID InfoObject
Customer Name       ->                Customer Name InfoObject
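The mapping of DataSource fields to InfoObjects amounts to a field-by-field rename, which can be sketched in Python. This is a simplified analogy: `FIELD_MAPPING`, `transform`, and the target names `CUSTOMER_ID` and `CUSTOMER_NAME` are hypothetical.

```python
# Source field name -> target InfoObject name (assumed names).
FIELD_MAPPING = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}


def transform(row, mapping=FIELD_MAPPING):
    """Rename each source field to its target; reject unmapped fields."""
    missing = set(row) - set(mapping)
    if missing:
        raise KeyError(f"No mapping for source fields: {sorted(missing)}")
    return {mapping[field]: value for field, value in row.items()}


staged_row = {"Customer ID": "105", "Customer Name": "Sainsbury"}
print(transform(staged_row))
# {'CUSTOMER_ID': '105', 'CUSTOMER_NAME': 'Sainsbury'}
```

Raising an error for an unmapped field is a design choice made for this sketch, so that a forgotten mapping is noticed instead of silently dropping data.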
Loading
Once the mapping is done, the data has to be transferred from the DataSource (PSA table) to the InfoProvider (InfoObjects).
This is done by a Data Transfer Process (DTP).
How? We create the DTP in the InfoProvider layer and activate it. After activation, we execute the DTP. The data in the PSA table is then transferred to the respective InfoObjects.

[Diagram: the DTP moves Customer ID and Customer Name from the DataSource, through the transformation, to the Customer ID and Customer Name InfoObjects in the InfoProvider.]
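The load step can be sketched as a function that applies the mapping to every staged row and writes the result to a target keyed by Customer ID, with Customer Name stored as its attribute. This is an analogy only: `run_dtp`, `PSA_ROWS`, `FIELD_MAPPING`, and `info_provider` are hypothetical names, and a plain dictionary stands in for the InfoProvider.

```python
# Staged rows, as they would sit in the staging area after extraction.
PSA_ROWS = [
    {"Customer ID": "105", "Customer Name": "Sainsbury"},
    {"Customer ID": "102", "Customer Name": "Tesco"},
    {"Customer ID": "109", "Customer Name": "Waitrose"},
    {"Customer ID": "101", "Customer Name": "Asda"},
]

# Source field name -> target InfoObject name (assumed names).
FIELD_MAPPING = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}


def run_dtp(psa_rows, mapping):
    """Map every staged row and load it into a target keyed by customer ID."""
    target = {}
    for row in psa_rows:
        mapped = {mapping[field]: value for field, value in row.items()}
        # Customer ID is the key; Customer Name is stored as its attribute.
        target[mapped["CUSTOMER_ID"]] = {"CUSTOMER_NAME": mapped["CUSTOMER_NAME"]}
    return target


info_provider = run_dtp(PSA_ROWS, FIELD_MAPPING)
print(info_provider["102"]["CUSTOMER_NAME"])  # Tesco
```

Keying the target by Customer ID mirrors the earlier statement that Customer Name is an attribute of Customer ID.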
Loading
The data is moved to the respective InfoObjects according to the mapping and is ready for reporting from the InfoProvider layer.

InfoProvider

Customer ID InfoObject    Customer Name InfoObject
105                       Sainsbury
102                       Tesco
109                       Waitrose
101                       Asda
Thank You

