SAP DATA SERVICES
A PRESENTATION BY GEETIKA
CONTENTS
1. Data Warehousing Overview
2. OLTP vs Data Warehouse
3. Data Mart
4. Data Warehousing Objects
5. Data Warehousing Schemas
6. Business Intelligence Overview
7. Operational Data Store
8. Fact Types
9. Slowly Changing Dimensions
10. ETL Overview
11. Datastores
12. Types of Datastores
13. Metadata Import
14. Data Services Object Hierarchy
15. Project
16. Jobs
17. Workflows
18. Dataflows
19. Embedded Dataflows
20. ABAP Dataflows
21. Log Files
22. Variables
23. Parameters
24. What is ETL ?
CONTENTS
25. SAP BODS Transforms Overview
26. Platform Transform
27. Data Integrator Transform
28. Query Transform
29. Case Transform
30. Map Operation Transform
31. Merge Transform
32. SQL Transform
33. Validation Transform
34. Data Integrator Transform
35. Table Comparison Transform
36. History Preserving Transform
37. Key Generation Transform
38. Date Generation Transform
39. Pivot Transform
40. Reverse Pivot Transform
NEED FOR DATA WAREHOUSING
• Difficulty in obtaining integrated information
• Information structure not able to provide ‘full and dynamic’ analysis of information available
• Inconsistent results obtained from queries and reports arising from heterogeneous data sources
• Increased difficulty in delivering consistent comprehensive information in a timely fashion
WHY DATA WAREHOUSING?
• Who are the potential customers?
• Which products are sold the most?
• What are the region-wise preferences?
• What are the competitor products?
• What are the projected sales?
• What if you sell more quantity of a particular product? What will be the impact on revenue?
• What are the results of the promotion schemes introduced?
Need for Intelligent Information in a Competitive Market
DATA WAREHOUSING OVERVIEW
• A data warehouse is a relational database that is designed for query and analysis rather than for
transaction processing. It usually contains historical data derived from transaction data.
• A data warehouse environment includes an extraction, transportation, transformation, and loading
(ETL) solution, online analytical processing (OLAP) and data mining capabilities, client analysis
tools, and other applications that manage the process of gathering data and delivering it to
business users.
• A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of
data. This data helps analysts make informed decisions in an organization.
• It is a series of processes, procedures and tools (hardware & software) that help the enterprise
understand more about itself, its products, its customers and the market it serves.
SUBJECT ORIENTED
• A data warehouse is subject
oriented because it provides
information around a subject
rather than the organization's
ongoing operations.
• These subjects can be product,
customers, suppliers, sales,
revenue, etc.
• A data warehouse does not focus
on the ongoing operations,
rather it focuses on modelling
and analysis of data for decision
making.
[Diagram: operational systems are organized by processes or tasks, while the data warehouse is organized by subject (Customer, Supplier, Product)]
INTEGRATED
• A data warehouse is constructed
by integrating data from
heterogeneous sources such as
relational databases, flat files,
etc.
• This integration enhances the
effective analysis of data.
• Data is stored once in a single
integrated location
• It is closely related to subject orientation.
• Data from disparate sources needs to be put into a consistent format.
• Problems such as naming conflicts and inconsistencies must be resolved.
[Diagram: customer data stored in several databases – legacy mainframe, RDBMS, flat files – is integrated around the subject Customer]
TIME VARIANT
• The data collected in a data
warehouse is identified with a
particular time period.
• The data in a data warehouse
provides information from the
historical point of view.
• Data is stored as a series of
snapshots or views which record
how it is collected across time.
• It helps in Business trend analysis
• In contrast to an OLTP environment, a data warehouse focuses on change over time; that is
what is meant by time variant.
[Diagram: in the data warehouse, time is part of the key of each data record]
NON-VOLATILE
• Non-volatile means the previous
data is not erased when new data
is added to it.
• A data warehouse is kept separate from the operational database; therefore frequent changes
in the operational database are not reflected in the data warehouse.
• This is logical because the
purpose of a data warehouse is
to enable you to analyze what
has occurred.
OLTP VS DATA WAREHOUSE
• OLTP systems are tuned for known transactions and workloads while workload is not known in a
data warehouse
• Special data organization, access methods and implementation methods are needed to support
data warehouse queries (typically multidimensional queries)
• OLTP
• Application Oriented
• Used to run business
• Detailed data
• Current up to date
• Isolated Data
• Repetitive access
• Clerical User
► Data warehouse
► Subject Oriented
► Used to analyze business
► Summarized and refined
► Snapshot data
► Integrated Data
► Ad-hoc access
► Knowledge User (Manager)
OLTP VS DATA WAREHOUSE (TO SUMMARIZE)
• OLTP Systems are
used to “run” a business
► The Data Warehouse helps to
“optimize” the business
DATA MART
• The data mart is a subset of the data warehouse and is usually oriented to a specific business line
or team.
• A data mart is a repository of data that is designed to serve a particular community of knowledge
workers.
• The goal of a data mart is to meet the particular demands of a specific group of users within the
organization, such as human resource management, sales etc.
• Data marts improve end-user response time by allowing users to have access to the specific type
of data they need to view most often by providing the data in a way that supports the collective
view of a group of users.
DATA WAREHOUSE END TO END
[Diagram: data sources (operational data, legacy data, external data sources) are extracted, transformed, and loaded into the organizationally structured enterprise data warehouse, which in turn feeds departmentally structured data marts such as Sales, Inventory, and Purchase; metadata spans the data sources, data management, and access layers]
DATA WAREHOUSING
DATA WAREHOUSING SCHEMAS
• A schema is a collection of database objects, including tables, views,
indexes, and synonyms.
• There is a variety of ways of arranging schema objects in the schema
models designed for data warehousing. They are:
Star Schema
Snowflake Schema
Galaxy Schema
STAR SCHEMA
• It consists of a fact table connected to a set of dimension tables
• Data in the dimension tables is de-normalized
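As a language-neutral illustration (a minimal Python sketch with made-up table and column names, not tied to any particular database), a typical star-schema query joins the fact table to its de-normalized dimension tables through foreign keys:

```python
# Star-schema sketch: one fact table joined to de-normalized dimensions via
# foreign keys. Table contents and names are illustrative only.
dim_product = {1: {"product_name": "Laptop", "category": "Electronics"},
               2: {"product_name": "Desk",   "category": "Furniture"}}
dim_store   = {10: {"store_name": "Delhi"}, 20: {"store_name": "Mumbai"}}

fact_sales = [  # grain: one row per product, per store, per day
    {"date": "2019-11-01", "product_key": 1, "store_key": 10, "sales_amount": 1200.0},
    {"date": "2019-11-01", "product_key": 2, "store_key": 20, "sales_amount": 300.0},
    {"date": "2019-11-02", "product_key": 1, "store_key": 20, "sales_amount": 800.0},
]

# A typical star-schema query: total sales per product category.
totals = {}
for row in fact_sales:
    category = dim_product[row["product_key"]]["category"]   # join via foreign key
    totals[category] = totals.get(category, 0.0) + row["sales_amount"]

print(totals)   # {'Electronics': 2000.0, 'Furniture': 300.0}
```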
SNOWFLAKE SCHEMA
 It is a refinement of the star schema in which some dimensional hierarchies are normalized into a set of
smaller dimension tables
GALAXY SCHEMA
 Multiple fact tables share dimension tables; viewed as a collection of stars, it is therefore called a galaxy
schema
BUSINESS INTELLIGENCE
• How intelligent can you make your business processes?
• What insight can you gain into your business?
• How integrated can your business processes be?
• How much more interactive can your business be with customers, partners, employees and
managers?
WHAT IS BUSINESS INTELLIGENCE (BI)?
• Business Intelligence is a generalized term applied to a broad category of applications and
technologies for gathering, storing, analyzing and providing access to data to help enterprise
users make better business decisions
• Business Intelligence applications include the activities of decision support systems, query and
reporting, online analytical processing (OLAP), statistical analysis, forecasting, and data mining
• An alternative way of describing BI is: the technology required to turn raw data into information
to support decision-making within corporations and business processes
OPERATIONAL DATA STORE (ODS)
An Operational Data Store (ODS) integrates data from multiple business operation sources to address
operational problems that span one or more business functions.
An ODS has the following features:
• Subject-oriented — Organized around major subjects of an organization (customer, product,
etc.), not specific applications (order entry, accounts receivable, etc.).
• Integrated — Presents an integrated image of subject-oriented data which is pulled from
fragmented operational source systems.
• Current — Contains a snapshot of the current content of legacy source systems. History is not
kept, and might be moved to the data warehouse for analysis.
• Volatile — Since ODS content is kept current, it changes frequently. Identical queries run at
different times may yield different results.
• Detailed — ODS data is generally more detailed than data warehouse data. Summary data is
usually not stored in an ODS; the exact granularity depends on the subject that is being
supported.
OPERATIONAL DATA STORE (ODS) CONTD..
The ODS provides an integrated view of data in operational systems.
As the figure below indicates, there is a clear separation between the ODS and the data warehouse.
BENEFITS OF ODS
• Supports operational reporting needs of the organization
• Operates as a store for detailed data, updated frequently and used for drill-downs from the data
warehouse which contains summary data.
• Reduces the burden placed on other operational or data warehouse platforms by providing an
additional data store for reporting.
• Provides data that is more current than the data warehouse and more integrated than an OLTP system
• Feeds other operational systems in addition to the data warehouse
DATA WAREHOUSING OBJECTS
Fact Tables:
• Represent a business process, i.e., models the business process as an artifact in the data model
• Contain the measurements or metrics or facts of business processes
• "monthly sales number" in the Sales business process
• most are additive (sales this month), some are semi-additive (balance as of), some are not
additive (unit price)
• The level of detail is called the “grain” of the table
• Contain foreign keys for the dimension tables
DATA WAREHOUSING OBJECTS (CONTD..)
Dimension Tables:
• Dimension tables
• Define business in terms already familiar to users
• Wide rows with lots of descriptive text
• Small tables (about a million rows)
• Joined to fact table by a foreign key
• heavily indexed
• typical dimensions
• time periods, geographic region (markets, cities), products, customers,
salesperson, etc.
FACT TYPES
• Additive facts:
Additive facts are facts that can be summed up through all of the dimensions in the fact table
• Semi-Additive facts:
Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table
• Non-additive facts:
Non-additive facts are facts that cannot be summed up for any of the dimensions present in the
fact table
EXAMPLE OF ADDITIVE FACT
Fact Table (columns: Date, Store, Product, Sales_Amount):
• The purpose of this table is to record the Sales_Amount for each product in each store
on a daily basis. Sales_Amount is the fact.
• In this case, Sales_Amount is an additive fact, because we can sum up this fact along
any of the 3 dimensions present in the fact table – date, store, and product.
EXAMPLE OF SEMI ADDITIVE & NON-ADDITIVE FACTS
Fact Table (columns: Date, Account, Current_Balance, Profit_Margin):
 The purpose of this table is to record the current balance for each account at the end of
each day, as well as the profit margin for each account for each day
 Current_Balance & Profit_Margin are the facts
 Current_Balance is a semi-additive fact, as it makes sense to add balances up across all
accounts (what’s the total current balance for all accounts in the bank?), but it does not
make sense to add them up through time
 Profit_Margin is a non-additive fact, for it does not make sense to add margins either
across accounts or through time
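The distinction can be made concrete with a small Python sketch over a made-up account snapshot table (illustrative values only):

```python
# Semi-additive vs. non-additive facts, using an invented account snapshot table.
fact_balance = [
    {"date": "2019-11-01", "account": "A1", "current_balance": 1000, "profit_margin": 0.10},
    {"date": "2019-11-01", "account": "A2", "current_balance": 2000, "profit_margin": 0.20},
    {"date": "2019-11-02", "account": "A1", "current_balance": 1100, "profit_margin": 0.12},
    {"date": "2019-11-02", "account": "A2", "current_balance": 1900, "profit_margin": 0.18},
]

# Meaningful: sum the semi-additive fact across accounts for a single day.
total_2019_11_02 = sum(r["current_balance"] for r in fact_balance
                       if r["date"] == "2019-11-02")
print(total_2019_11_02)   # 3000 -- total bank balance on that day

# Not meaningful: summing balances through time double-counts the same money,
# and summing profit margins (a ratio) is never meaningful; average them instead.
```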
SLOWLY CHANGING DIMENSIONS
• Various data elements in the dimension undergo changes (e.g. changes in attributes, hierarchical
structures) which need to be captured for analysis.
• In a nutshell, this applies to cases where the attribute for a record varies over time.
• Example :
• Christina is a customer who first lived in Chicago, Illinois.
At a later date, she moved to Los Angeles, California.
Now how to modify the table to reflect this change?
This is a “Slowly Changing Dimension problem”
Customer key Name State
1001 Christina Illinois
TYPES OF SCD
• There are 3 types of SCDs :-
• Type 1
• Type 2
• Type 3
Type 1: The new record replaces the original record. No trace of the old record exists.
Type 2: A new record is added to the dimension table; the original record is also kept.
Type 3: The original record is modified to reflect the change (extra columns hold the old and new values).
SCD TYPE 1
The new record replaces the original record. No trace of the old record exists.
Eg:
Before:
Customer key Name State
1001 Christina Illinois
After:
Customer key Name State
1001 Christina California
Advantages:
This is the easiest way to handle a Slowly Changing Dimension, since there is no need to
keep track of the old information.
Disadvantages:
All history is lost. By applying this methodology, it is not possible to track back in history. For
example, in the above case, the company would not be able to know that Christina lived in Illinois before.
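A minimal Python sketch of the Type 1 overwrite using the Christina example (illustrative only; in Data Services this is typically achieved with update or auto-correct load options on the target rather than hand-written code):

```python
# SCD Type 1 sketch: the incoming value simply overwrites the stored value,
# so no history is kept. Table and key values are illustrative only.
dim_customer = {1001: {"name": "Christina", "state": "Illinois"}}

def scd_type1_update(customer_key, new_state):
    # Overwrite in place; the old state ("Illinois") is lost.
    dim_customer[customer_key]["state"] = new_state

scd_type1_update(1001, "California")
print(dim_customer)   # {1001: {'name': 'Christina', 'state': 'California'}}
```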
SCD TYPE 2
In Type 2 SCD a new record is added to the table to represent the new
information. Therefore both the original and the new record will be present.
Eg:
Customer key Name State
1001 Christina Illinois
1005 Christina California
After Christina moved from Illinois to California, we add the new information as a new row into the
table.
Advantages:
This allows us to accurately keep all historical information.
Disadvantages:
This will cause the size of the table to grow fast. Where the number of rows for the
table is very high to start with, storage and performance can become a concern.
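A minimal Python sketch of the Type 2 pattern (illustrative only; in Data Services this pattern is usually built from the Table_Comparison, History_Preserving, and Key_Generation transforms described later):

```python
# SCD Type 2 sketch: a new row with a new surrogate key is inserted and the
# old row is kept, so the full history survives. Names and keys are illustrative.
dim_customer = [
    {"customer_key": 1001, "name": "Christina", "state": "Illinois"},
]
next_key = 1005   # next surrogate key (often produced by a Key_Generation step)

def scd_type2_insert(name, new_state):
    global next_key
    dim_customer.append({"customer_key": next_key, "name": name, "state": new_state})
    next_key += 1

scd_type2_insert("Christina", "California")
for row in dim_customer:
    print(row)
# {'customer_key': 1001, 'name': 'Christina', 'state': 'Illinois'}
# {'customer_key': 1005, 'name': 'Christina', 'state': 'California'}
```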
SCD TYPE 3
In Type 3 SCD there will be two columns to indicate the particular attribute of
interest, one indicating the original value, and one indicating the current value.
There will also be a column that indicates when the current value becomes active.
Eg:
Customer key Name Original State Current State Effective Date
1001 Christina Illinois California 15-Jan-03
After Christina moved from Illinois to California, the original information gets updated,
and we have the above table (assuming the effective date of change is January 15, 2003).
Advantages:
 This does not increase the size of the table, since the new information is updated in place.
 This allows us to keep some part of history.
Disadvantages:
Type 3 will not be able to keep all history where an attribute is changed more than
once. For example, if Christina later moves to Texas on December 15, 2003, the
California information is lost.
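A minimal Python sketch of the Type 3 pattern, following the example values above (column names are illustrative):

```python
# SCD Type 3 sketch: the row is widened with "original" and "current" columns,
# so only the most recent change is retained.
dim_customer = {
    1001: {"name": "Christina", "original_state": "Illinois",
           "current_state": "Illinois", "effective_date": None},
}

def scd_type3_update(customer_key, new_state, effective_date):
    row = dim_customer[customer_key]
    row["current_state"] = new_state          # overwrite the current value
    row["effective_date"] = effective_date    # when the current value became active
    # Note: original_state keeps only the first value, so a later move to Texas
    # would overwrite "California" in current_state and that period would be lost.

scd_type3_update(1001, "California", "2003-01-15")
print(dim_customer[1001])
```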
WHAT IS ETL ?
• ETL stands for extract, transform, and load.
• ETL is software that enables businesses to consolidate their disparate data while moving it from place to
place; it does not matter that the data is in different forms or formats
• There are many ETL tools, e.g. BODS, Informatica, IBM InfoSphere DataStage, Ab Initio, Oracle Warehouse
Builder (OWB)
• It can be used for the following purposes:
• As middleware
• In a data warehouse
• SAP data conversion/migration
ETL PROCESS
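Conceptually, the process shown in the original diagram reduces to three functions. The Python sketch below uses an in-memory CSV string and a plain list as stand-ins for a real source system and warehouse table (all names and values are invented for the illustration):

```python
import csv, io

# Minimal extract-transform-load sketch; the CSV text stands in for a real
# source system and the list for a warehouse target table.
SOURCE = io.StringIO("customer,amount\n alice ,100.5\nBob,99.999\n")

def extract(src):
    return list(csv.DictReader(src))                        # Extract: read raw rows

def transform(rows):
    return [{"customer": r["customer"].strip().upper(),     # Transform: cleanse
             "amount": round(float(r["amount"]), 2)}        # and standardize types
            for r in rows]

def load(rows, target):
    target.extend(rows)                                      # Load: write to the target

warehouse_sales = []
load(transform(extract(SOURCE)), warehouse_sales)
print(warehouse_sales)
# [{'customer': 'ALICE', 'amount': 100.5}, {'customer': 'BOB', 'amount': 100.0}]
```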
ETL TERMS
• Source System
A database, application, file, or other storage facility from which the data in a data warehouse is
derived.
• Mapping
The definition of the relationship and data flow between source and target objects.
• Metadata
Data that describes data and other structures, such as objects, business rules, and processes. For
example, the schema design of a data warehouse is typically stored in a repository as metadata,
which is used to generate scripts used to build and populate the data warehouse. A repository
contains metadata.
• Staging Area
A place where data is processed before entering the warehouse.
• Cleansing
The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of
the ETL process.
• Transformation
The process of manipulating data. Any manipulation beyond copying is a transformation. Examples
include cleansing, aggregating, and integrating data from multiple sources.
DATASTORES
• Datastores
• Are used to set up connections between an application and the database.
• Must be specified for every source and target database.
• Are used to import metadata for source and target databases and tables into the repository.
• Are used by Data Services to read data from source tables or load data to target tables.
• In Business Objects Data Services, you can connect to the following systems using Datastores:
• Mainframe systems and Database
• Applications and software with user written adapters
• SAP Applications, SAP BW, Oracle Apps, Siebel, etc.
TYPES OF DATASTORES
• Custom Datastores provide a simple way to import metadata directly from a broad variety of
relational database management systems (RDBMS)
• Application Datastores let users easily import metadata from most Enterprise Resource Planning
(ERP) systems
• Adapter Datastores allow users to import metadata from any source. Specific adapters may be
purchased from Business Objects, or can be developed by customers or third parties as
documented in Business Objects Adapter Development Kit (ADK)
DEFINING A DATASTORE
• To define a Datastore, you must have an
account with access privileges to the
database or application hosting the data
you need to access (user name and
password).
• Datastores are defined in the Datastores tab
of the object library using the Datastore
Editor.
• The Datastore options available depend on
which RDBMS or application is used for the
Datastore.
DEFINING A DATASTORE (CONT.)
• Datastore Editor
• Used to define/edit a Datastore
• Give the Datastore a meaningful name
• Choose the application type of your
Datastore
• You must enter the parameters of the
database to which you are connecting.
DATASTORE ADVANCED CONFIGURATION
• You can toggle the Advanced button to hide
and show the grid of additional Datastore
editor options.
• The grid displays Datastore configurations as
column headings and lists Datastore options
in the left column. Each row represents a
configuration option.
• Different options appear depending upon
Datastore type and (if applicable) database
type and version. Specific options appear
under group headings such as Connection,
General, and Locale
METADATA IMPORT
• Data Services stores the following table information :
• Table name, attributes, indexes
• Column names, descriptions, data types, primary keys
• Data Services updates imported table information only when you re-import it manually.
• Changes made to underlying table schemas or functions are not automatically imported into
Business Objects Data Services.
SELECTIVE IMPORT
• Import metadata by Browsing :
• 1. In the object library, Datastores tab,
right-click on Datastore you want to
import to and select Open
• 2. From the workspace, right-click the
required table and select Import
Note: Only metadata is imported
SELECTIVE IMPORT (CONT.)
• Import metadata by Name-
1. In the object library, Datastores tab,
right-click the Datastore you want to
import to and select Import By Name
2. Complete the information in
Import By Name dialog box
SELECTIVE IMPORT (CONT.)
• Import metadata by searching for data:
• Basic search of external or imported (internal) data
• Advanced search of imported (internal) data only
DATA SERVICES OBJECT HIERARCHY
PROJECT
• A project is the highest level of object offered by Data Services.
• Projects are listed in the object library under the Project tab.
• Are used to group and organize related objects
• May contain any number of: Jobs, Workflows, Data flows etc.
• Only one project can be open at a time.
• Can be shared among multiple users using ATL files or a Central Repository
• Steps to create a new project :
• Choose Project > New > Project.
• Enter the name of your new project. The name can include alphanumeric
characters and underscores (_). It cannot contain blank spaces.
JOBS
• Jobs are the only executable objects in SAP BODS.
• Are reusable objects and next level of organization below a project.
• Contain Workflows (optional) and/or Dataflows.
• Can call many Workflows.
• Can be assigned in any projects available in local repository by dragging it from local object library.
• Are the highest level at which logging happens.
JOBS (CONT.)
• Batch Job
A batch job extracts, transforms, and loads data. It is something that you start, it
does the processing like reading tables and loading the data warehouse, and then it
stops until it is started again, e.g. every night, twice a day, every 4 hours, or
manually started.
• Real Time Job
Like a batch job, a real-time job also extracts, transforms, and loads data. A Real
Time job is started once at the beginning and keeps running as long as the server is
active. Whenever a new message is sent through a SOAP request, it will get processed
and then the Real Time job sends a SOAP response and waits for the next request.
JOBS (CONT.)
Jobs are created in the Project area or in the Object Library.
• Create Job in Project area:
1. In the Project Area, select the Project Name.
2. Right-click and choose New Batch Job or New Real Time Job and then edit the
name.
3. SAP BODS opens a new workspace which is ready to define the job.
• Create Job in Object Library:
1. In Object Library, select Job tab.
2. Right-click Batch Jobs or Real Time Jobs and choose New.
3. A new job with a default name appears.
4. Right-click and select Properties to change the object's name and add a
description.
WORKFLOWS
• A Workflow defines the decision-making process for executing Dataflows.
• It is a reusable component used to group Dataflows and/or Workflows together.
• The Workflow helps to define the execution order of the Dataflows and supporting operations.
• Defined System Parameters can be used to pass values into the workflow.
• Variables can also be defined for use inside the workflow.
• Workflows may contain the following objects : Workflow, Dataflow, Script, Conditional, Try, Catch,
While
DATAFLOWS
• Data flows extract, transform, and load data; reading sources, transforming data, and loading
targets, occurs inside a data flow.
• A data flow can be added to a job or a work flow.
• Dataflows are reusable objects.
• Workflows and Jobs call Dataflows to perform data movement operations.
EMBEDDED DATAFLOWS
• An embedded Dataflow is a Dataflow that is called from inside another Dataflow. Data passes into
or out of the embedded Dataflow from the parent flow through a single source or target.
• The embedded Dataflow can contain any number of sources or targets, but only one input or one
output can pass data to or from the parent Dataflow.
• An embedded Dataflow is a design aid that has no effect on job execution.
• When SAP BODS executes the parent Dataflow, it expands any embedded Dataflows, optimizes the
parent Dataflow, then executes it.
EMBEDDED DATAFLOWS (EXAMPLE)
• The Example of when to use embedded Dataflows:
• In this example, a Dataflow uses a single source to load three different target systems. The Case
transform sends each row from the source to different transforms that process it to get a unique
target output.
• You can simplify the parent Dataflow by using embedded Dataflows for the three different cases.
EMBEDDED DATAFLOWS (CONT.)
There are two ways to create embedded Dataflows:
• Select objects within a Dataflow, right-click, and select Make Embedded
Dataflow.
• Drag a complete and fully validated Dataflow from the object library into an
open Dataflow in the workspace. Then open the Dataflow you just added,
right-click the object you want to use as an input or output port, and
select Make Port for that object.
ABAP DATAFLOWS
• An ABAP Dataflow extracts and transforms data from SAP application tables, files, and hierarchies.
• The ABAP Dataflow produces a data set that you can use as input to other transforms, save to a file
that resides on an SAP application server, or save to an SAP table.
• When SAP BODS executes ABAP Dataflows, it translates the extraction requirements into ABAP
programs and passes them to SAP to execute.
• ABAP Dataflows generate ABAP code. ABAP Dataflows and data transport objects also appear in the
tool palette.
• In the ABAP Dataflow, after specifying the source and transformations, specify the target, which is
the data transport object.
• The data transport object in ABAP Dataflows makes the data set available to the calling Dataflow; it
will be a file on the SAP application server.
• Once the ABAP Dataflow is defined, in the normal Dataflow, connect the ABAP Dataflow to the
downstream transforms and the Target object where the data has to be loaded.
LOG FILES
• As a Job executes, Data Integrator produces the three types of log files that can be viewed in the
Designer Project Area :
• Monitor Log
• Statistics Log
• Error Log
• The log files are, by default, also set to display automatically in the workspace when you execute a
Job.
LOG FILES (CONT.)
Monitor Log : Displays each step of each data flow in the job, the number of rows streamed through
each step, and the duration of each step.
LOG FILES (CONT.)
Statistics log: Itemizes the steps executed in the job and the time execution begins and ends.
LOG FILES (CONT.)
Error log : Displays the name of the object being executed when a Data Integrator error occurred. Also
displays the text of the resulting error message.
VARIABLES
• Variables are symbolic placeholders for values.
• Local variables are restricted to the object in which they are created (job or work flow). You must use
parameters to pass local variables to the work flows and data flows called by that object.
• Global variables are restricted to the job in which they are created. However, they do not require
parameters to be passed to work flows and data flows.
• The data type of a variable can be any supported by the software such as an integer, decimal, date,
or text string.
• You can increase the flexibility and reusability of work flows and data flows by using local and
global variables when you design your jobs.
VARIABLES (CONT.)
• If you define variables in a job or work flow, the software typically uses them in a script, catch, or
conditional process.
• You can use variables inside data flows. For example, use them in a custom function or in the
WHERE clause of a query transform.
PARAMETERS
• Parameters can be defined to:
• Pass their values into and out of work flows
• Pass their values into data flows
• Each parameter is assigned a type: input, output, or input/output. The value
passed by the parameter can be used by any object called by the work flow
or data flow.
VARIABLES & PARAMETERS
• Variables and parameters are used differently based on the object type and whether the variable is
local or global.
• The following table lists the types of variables and parameters you can create based on the object
type and how you use them.
DATA SERVICES TRANSFORMS
• Data Services offers a number of pre-defined transformations and functional objects for both Data
Integration and Data Quality, that allow modelling of the ETL flows.
• Each transform is a step in a Dataflow that acts on a data set.
• A transform enables you to control how data sets change in a Dataflow.
• Transforms operate on data sets by manipulating input sets and producing one or more output
sets.
• The software includes many built-in transforms. These transforms are available from the object
library on the Transforms tab.
• Transforms are divided into 3 categories : Platform, Data Integrator & Data Quality
QUERY TRANSFORM
• The Query transform retrieves a data set that satisfies conditions that you specify. A Query
transform is similar to a SQL SELECT statement.
• A query has data outputs, which are data sets based on the conditions that you specify using the
schema specified in the output schema area.
• Query transform has the following tabs : SELECT, FROM, WHERE, GROUP BY, ORDER BY, ADVANCED
Platform Transforms
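In spirit, the Query transform behaves like the following Python sketch (rows and column names are invented for the illustration), which projects, filters, and aggregates an input data set much as SELECT/WHERE/GROUP BY would:

```python
# A Query transform in spirit: filter rows (WHERE), group and aggregate (GROUP BY).
input_rows = [
    {"region": "North", "product": "A", "qty": 10},
    {"region": "North", "product": "B", "qty": 4},
    {"region": "South", "product": "A", "qty": 7},
]

# WHERE qty > 5, GROUP BY region, SUM(qty)
output = {}
for r in input_rows:
    if r["qty"] > 5:                                                  # WHERE clause
        output[r["region"]] = output.get(r["region"], 0) + r["qty"]  # GROUP BY + SUM

print(output)   # {'North': 10, 'South': 7}
```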
CASE TRANSFORM
• Specifies multiple paths in a single transform (different rows are processed in different ways).
• The Case transform simplifies branch logic in data flows by consolidating case or decision making
logic in one transform.
• Paths are defined in an expression table.
Platform Transforms
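A rough Python sketch of the routing behaviour (case labels, expressions, and rows are invented for the illustration; the real transform defines these in its expression table):

```python
# Case-transform sketch: each input row is routed to exactly one output branch
# according to an expression.
rows = [{"country": "IN", "amount": 10},
        {"country": "US", "amount": 20},
        {"country": "DE", "amount": 30}]

outputs = {"CASE_IN": [], "CASE_US": [], "DEFAULT": []}

for row in rows:
    if row["country"] == "IN":      # expression for the first case label
        outputs["CASE_IN"].append(row)
    elif row["country"] == "US":    # expression for the second case label
        outputs["CASE_US"].append(row)
    else:                           # default label catches everything else
        outputs["DEFAULT"].append(row)

print({k: len(v) for k, v in outputs.items()})  # {'CASE_IN': 1, 'CASE_US': 1, 'DEFAULT': 1}
```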
MAP_OPERATION TRANSFORM
• Modifies data based on mapping expressions and current operation codes. The operation codes can be
converted between data manipulation operations.
• This transform can also change operation codes on data sets to produce the desired output. For example,
if a row in the input data set has been updated in some previous operation in the data flow, you can use
this transform to map the UPDATE operation to an INSERT. The result of converting UPDATE rows into
INSERT rows is the preservation of the existing rows in the target.
• Data Services can push Map_Operation transforms to the source database.
• The Map Operation tab would have the following settings:
Platform Transforms
MERGE TRANSFORM
• Combines incoming data sets, producing a single output data set with the same schema as the
input data sets.
• All sources must have the same schema, including:
• The same number of columns
• The same column names
• The same data type of columns
• The output is a data set consisting of rows from all sources, with any operation codes. The output data has the
same schema as the source data, including nested schemas.
Platform Transforms
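Functionally this is a UNION ALL. A minimal Python sketch with invented rows:

```python
# Merge-transform sketch: concatenate rows from sources that share one schema
# (like SQL UNION ALL; duplicates are not removed). Rows are illustrative only.
source_a = [{"id": 1, "city": "Delhi"}, {"id": 2, "city": "Pune"}]
source_b = [{"id": 3, "city": "Mumbai"}, {"id": 1, "city": "Delhi"}]

assert set(source_a[0]) == set(source_b[0])   # same column names required

merged = source_a + source_b                  # single output data set
print(len(merged))                            # 4 rows, duplicates preserved
```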
SQL TRANSFORM
• Performs the indicated SQL query operation.
• Use this transform to perform standard SQL operations when other built-in transforms cannot
perform them.
• The options for the SQL transform include specifying a Datastore, join rank, cache, array fetch size,
and entering SQL text.
• There are two ways of defining the output schema for a SQL transform:
• Automatic: After you type the SQL statement, click Update schema to execute the SELECT statement against
the database; this obtains the column information returned by the SELECT statement and populates the
output schema.
• Manual: Output columns must be defined in the output portion of the SQL transform. The number of
columns defined in the output of the SQL transform must equal the number of columns returned by the
SQL query.
Platform Transforms
VALIDATION TRANSFORM
• The Validation transform qualifies a data set based on
rules for input schema columns.
• You can apply multiple rules per column or bind a
single reusable rule (in the form of a validation
function) to multiple columns.
• The Validation transform can identify the row, column,
or columns for each validation failure.
• You can also use the Validation transform to filter or
replace (substitute) data that fails your criteria.
• When you enable a validation rule for a column, a
check mark appears next to it in the input schema.
Platform Transforms
TABLE COMPARISON TRANSFORM
• Compares two data sets and produces the difference between them as a data set with rows flagged
as INSERT or UPDATE.
• The Table_Comparison transform allows you to detect and forward changes that have occurred
since the last time a target was updated.
• Allows you to identify changes to a target table for incremental updates
• Three possible outcomes from this transform:
• New row can be added
• Existing record can be updated
• Row can be ignored
Data Integrator Transforms
TABLE COMPARISON TRANSFORM (CONT.)
[Screenshot notes: the comparison table (usually the target) is specified in the transform editor; the presence of output columns indicates proper configuration]
Data Integrator Transforms
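A simplified Python sketch of the comparison logic (the primary key and column names are invented; the real transform also offers options such as detecting deleted rows and choosing the compare columns):

```python
# Table_Comparison sketch: compare incoming rows with the comparison (target)
# table on a primary key and flag each row as INSERT or UPDATE, or ignore it.
target = {1001: {"name": "Christina", "state": "Illinois"},
          1002: {"name": "Sid",       "state": "Texas"}}

incoming = [{"customer_id": 1001, "name": "Christina", "state": "California"},  # changed
            {"customer_id": 1002, "name": "Sid",       "state": "Texas"},       # unchanged
            {"customer_id": 1003, "name": "Dolly",     "state": "Ohio"}]        # new

flagged = []
for row in incoming:
    key = row["customer_id"]
    if key not in target:
        flagged.append(("INSERT", row))                          # new row
    elif {"name": row["name"], "state": row["state"]} != target[key]:
        flagged.append(("UPDATE", row))                          # compare columns changed
    # identical rows produce no output (ignored)

for opcode, row in flagged:
    print(opcode, row)   # UPDATE for 1001, INSERT for 1003
```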
HISTORY PRESERVING TRANSFORM
• The History_Preserving transform allows you to produce a new row in your target rather than
updating an existing row. You can indicate in which columns the transform identifies changes to be
preserved.
• The History_Preserving transform requires input rows flagged as inserts and updates.
• The History_Preserving transform is usually preceded by a Table_Comparison, which provides the
required input row types.
Data Integrator Transforms
KEY GENERATION TRANSFORM
• Generates sequential key values for new rows, starting
from the maximum existing key value in a specified table
• Allows you to build a new physical primary key, e.g. for
preserving history
• When it is necessary to generate artificial keys in a table,
the Key_Generation transform looks up the maximum
existing key value from a table and uses it as the starting
value to generate new keys.
• The transform expects the generated key column to be
part of the input schema.
Data Integrator Transforms
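A minimal Python sketch of the idea (table name, key column, and rows are invented for the illustration):

```python
# Key_Generation sketch: look up the maximum existing key in the target table
# and assign sequential keys to new rows.
target_keys = [1001, 1002, 1005]            # keys already present in the table

def generate_keys(new_rows, existing_keys, key_column="customer_key"):
    next_key = max(existing_keys, default=0) + 1
    for row in new_rows:
        row[key_column] = next_key          # fill the (empty) key column
        next_key += 1
    return new_rows

new_rows = [{"customer_key": None, "name": "Dolly"},
            {"customer_key": None, "name": "Joe"}]
print(generate_keys(new_rows, target_keys))
# keys 1006 and 1007 are assigned, continuing from the existing maximum 1005
```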
DATE GENERATION TRANSFORM
• Produces a series of dates incremented as you specify.
• Use this transform to produce the key values for a time dimension target.
• From this generated sequence you can populate other fields in the time dimension (such as
day_of_week) using functions in a query.
• To create a time dimension target with dates from the beginning of the year 1997 to the end of
the year 2000, place a Date_Generation transform, a query, and a target in a data flow. Inside
the Date_Generation transform, specify the following Options :
• Start date: 1997.01.01
• End date: 2000.12.31 (A variable can also be used.)
• Increment: Daily (A variable can also be used.)
Data Integrator Transforms
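A minimal Python sketch of the same sequence (standard library only), generating one row per day for 1997–2000 and deriving day_of_week as a downstream Query transform would:

```python
from datetime import date, timedelta

# Date_Generation sketch: emit one row per day between a start and end date,
# then derive extra time-dimension attributes.
def generate_dates(start, end, step_days=1):
    current = start
    while current <= end:
        yield current
        current += timedelta(days=step_days)

time_dim = [{"calendar_date": d,
             "day_of_week": d.strftime("%A"),   # derived column
             "year": d.year}
            for d in generate_dates(date(1997, 1, 1), date(2000, 12, 31))]

print(len(time_dim))    # 1461 rows (1997-2000, including the leap year 2000)
print(time_dim[0])
```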
PIVOT TRANSFORM (COLUMNS TO ROWS)
• Creates a new row for each value in a column that you identify as a pivot column.
• The Pivot transform allows you to change how the relationship between rows is displayed.
• For each value in each pivot column, Data Services produces a row in the output data set.
• You can create pivot sets to specify more than one pivot column.
Input (Non-pivot column: Name; Pivot columns: Jan, Feb, Mar):
Name   Jan   Feb   Mar
Joe    1100  500   900
Sid    500   1200  300
Dolly  900   1300  200

Output (Sequence name: Sequence; Pivot header name: Month; Pivot data field: Q1_Expenses):
Name   Sequence  Month  Q1_Expenses
Joe    1         Jan    1100
Joe    2         Feb    500
Joe    3         Mar    900
Sid    1         Jan    500
Sid    2         Feb    1200
Sid    3         Mar    300
Dolly  1         Jan    900
Dolly  2         Feb    1300
Dolly  3         Mar    200
Data Integrator Transforms
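The transformation above can be sketched in a few lines of Python, reproducing the sample input and output (column names follow the slide):

```python
# Pivot sketch (columns to rows): for each pivot column, one output row is produced.
input_rows = [{"Name": "Joe",   "Jan": 1100, "Feb": 500,  "Mar": 900},
              {"Name": "Sid",   "Jan": 500,  "Feb": 1200, "Mar": 300},
              {"Name": "Dolly", "Jan": 900,  "Feb": 1300, "Mar": 200}]

pivot_columns = ["Jan", "Feb", "Mar"]                     # pivot columns
output_rows = []
for row in input_rows:
    for seq, month in enumerate(pivot_columns, start=1):
        output_rows.append({"Name": row["Name"],          # non-pivot column
                            "Sequence": seq,              # sequence name
                            "Month": month,               # pivot header column
                            "Q1_Expenses": row[month]})   # pivot data field

for r in output_rows[:3]:
    print(r)   # Joe's three rows: Jan 1100, Feb 500, Mar 900
```

The Reverse Pivot transform described next performs the inverse operation, collapsing such rows back into one row per Name.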
REVERSE PIVOT TRANSFORM (ROWS TO COLUMNS)
• Creates one row of data from several existing rows.
• The Reverse Pivot transform allows you to combine data from several rows into one row by
creating new columns.
• For each unique value in a pivot axis column and each selected pivot column, Data Services
produces a column in the output data set.
Data Integrator Transforms
REVERSE PIVOT TRANSFORM (ROWS TO COLUMNS) (CONT.)
Data Integrator Transforms

Weitere ähnliche Inhalte

Was ist angesagt?

SAP HANA Migration Deck.pptx
SAP HANA Migration Deck.pptxSAP HANA Migration Deck.pptx
SAP HANA Migration Deck.pptx
SingbBablu
 
Take the Next Step to S/4HANA with "RISE with SAP"
Take the Next Step to S/4HANA with "RISE with SAP"Take the Next Step to S/4HANA with "RISE with SAP"
Take the Next Step to S/4HANA with "RISE with SAP"
panayaofficial
 

Was ist angesagt? (20)

S4HANA Migration Overview
S4HANA Migration OverviewS4HANA Migration Overview
S4HANA Migration Overview
 
1668146695188.pdf
1668146695188.pdf1668146695188.pdf
1668146695188.pdf
 
SAP BI/BW
SAP BI/BWSAP BI/BW
SAP BI/BW
 
SAP S_4HANA Migration Cockpit - Migrate your Data to SAP S_4HANA.pdf
SAP S_4HANA Migration Cockpit - Migrate your Data to SAP S_4HANA.pdfSAP S_4HANA Migration Cockpit - Migrate your Data to SAP S_4HANA.pdf
SAP S_4HANA Migration Cockpit - Migrate your Data to SAP S_4HANA.pdf
 
Sap S4 HANA Everything You Need To Know
Sap S4 HANA Everything You Need To Know Sap S4 HANA Everything You Need To Know
Sap S4 HANA Everything You Need To Know
 
Migration Cockpit (LTMC)
Migration Cockpit (LTMC)Migration Cockpit (LTMC)
Migration Cockpit (LTMC)
 
SAP CPI - DS
SAP CPI - DSSAP CPI - DS
SAP CPI - DS
 
SAP BW Introduction.
SAP BW Introduction.SAP BW Introduction.
SAP BW Introduction.
 
SAP S4HANA Migration Cockpit.pdf
SAP S4HANA Migration Cockpit.pdfSAP S4HANA Migration Cockpit.pdf
SAP S4HANA Migration Cockpit.pdf
 
Sap Business Objects solutioning Framework architecture
Sap Business Objects solutioning Framework architectureSap Business Objects solutioning Framework architecture
Sap Business Objects solutioning Framework architecture
 
Transition to SAP S/4HANA System Conversion: A step-by-step guide
Transition to SAP S/4HANA System Conversion: A step-by-step guide Transition to SAP S/4HANA System Conversion: A step-by-step guide
Transition to SAP S/4HANA System Conversion: A step-by-step guide
 
SAP HANA Migration Deck.pptx
SAP HANA Migration Deck.pptxSAP HANA Migration Deck.pptx
SAP HANA Migration Deck.pptx
 
Moving to SAP S/4HANA
Moving to SAP S/4HANAMoving to SAP S/4HANA
Moving to SAP S/4HANA
 
Sap Purchase Order Workflow
Sap Purchase Order WorkflowSap Purchase Order Workflow
Sap Purchase Order Workflow
 
SAP Archiving
SAP ArchivingSAP Archiving
SAP Archiving
 
10 Golden Rules for S/4 HANA Migrations
10 Golden Rules for S/4 HANA Migrations10 Golden Rules for S/4 HANA Migrations
10 Golden Rules for S/4 HANA Migrations
 
Take the Next Step to S/4HANA with "RISE with SAP"
Take the Next Step to S/4HANA with "RISE with SAP"Take the Next Step to S/4HANA with "RISE with SAP"
Take the Next Step to S/4HANA with "RISE with SAP"
 
Sap bw4 hana
Sap bw4 hanaSap bw4 hana
Sap bw4 hana
 
SAP Overview and Architecture
SAP Overview and ArchitectureSAP Overview and Architecture
SAP Overview and Architecture
 
S4 HANA presentation.pptx
S4 HANA presentation.pptxS4 HANA presentation.pptx
S4 HANA presentation.pptx
 

Ähnlich wie SAP Data Services

An Overview On Data Warehousing An Overview On Data Warehousing
An Overview On Data Warehousing An Overview On Data WarehousingAn Overview On Data Warehousing An Overview On Data Warehousing
An Overview On Data Warehousing An Overview On Data Warehousing
BRNSSPublicationHubI
 
Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.ppt
MutiaSari53
 
Manish tripathi-ea-dw-bi
Manish tripathi-ea-dw-biManish tripathi-ea-dw-bi
Manish tripathi-ea-dw-bi
A P
 

Ähnlich wie SAP Data Services (20)

Data warehousing
Data warehousingData warehousing
Data warehousing
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
DATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptxDATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptx
 
An Overview On Data Warehousing An Overview On Data Warehousing
An Overview On Data Warehousing An Overview On Data WarehousingAn Overview On Data Warehousing An Overview On Data Warehousing
An Overview On Data Warehousing An Overview On Data Warehousing
 
DWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxDWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptx
 
Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.ppt
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
Chpt2.ppt
Chpt2.pptChpt2.ppt
Chpt2.ppt
 
Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Issue in Data warehousing and OLAP in E-business
Issue in Data warehousing and OLAP in E-businessIssue in Data warehousing and OLAP in E-business
Issue in Data warehousing and OLAP in E-business
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
Introduction to Data warehouse
Introduction to Data warehouseIntroduction to Data warehouse
Introduction to Data warehouse
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processing
 
Manish tripathi-ea-dw-bi
Manish tripathi-ea-dw-biManish tripathi-ea-dw-bi
Manish tripathi-ea-dw-bi
 
Oracle sql plsql & dw
Oracle sql plsql & dwOracle sql plsql & dw
Oracle sql plsql & dw
 
TOPIC 9 data warehousing and data mining.pdf
TOPIC 9 data warehousing and data mining.pdfTOPIC 9 data warehousing and data mining.pdf
TOPIC 9 data warehousing and data mining.pdf
 
data warehousing and data mining (1).pdf
data warehousing and data mining (1).pdfdata warehousing and data mining (1).pdf
data warehousing and data mining (1).pdf
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A Comparison
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptx
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

SAP Data Services

  • 1. SAP DATA SERVICES A PRESENTATION BY GEETIKA
  • 2. 13 November 2019 Presentation titlePage 2 CONTENTS 1. Data Warehousing Overview 2. OLTP vs Data Warehouse 3. Data Mart 4. Data Warehousing Objects 5. Data Warehousing Schemas 6. Business Intelligence Overview 7. Operational Data Store 8. Fact Types 9. Slowly Changing Dimensions 10. ETL Overview 11. Datastores 12. Types of Datastores 13. Metadata Import 14. Data Services Object Hierarchy 15. Project 16. Jobs 17. Workflows 18. Dataflows 19. Embedded Dataflows 20. ABAP Dataflows 21. Log Files 22. Variables 23. Parameters 24. What is ETL ?
  • 3. 13 November 2019 Presentation titlePage 3 CONTENTS 25. SAP BODS Transforms Overview 26. Platform Transform 27. Data Integrator Transform 28. Query Transform 29. Case Transform 30. Map Operation Transform 31. Merge Transform 32. SQL Transform 33. Validation Transform 34. Data Integrator Transform Geetika SAP BI Consultant 35. Table Comparison Transform 36. History Preserving Transform 37. Key Generation Transform 38. Date Generation Transform 39. Pivot Transform 40. Reserve Pivot Transform
  • 4. 13 November 2019 Presentation titlePage 4 NEED FOR DATA WAREHOUSING • Difficulty in obtaining integrated information • Information structure not able to provide ‘full and dynamic’ analysis of information available • Inconsistent results obtained from queries and reports arising from heterogeneous data sources • Increased difficulty in delivering consistent comprehensive information in a timely fashion
  • 5. 13 November 2019 Presentation titlePage 5 WHY DATA WAREHOUSING? Who are the potential Customers ? Which Products are sold the most ? What are the region-wise preferences ? What are the competitor products ? What are the projected sales ? What if you sale more quantity of a particular product ? What will be the impact on revenue ? Results of promotion schemes introduced ? Need of Intelligent Information in Competitive Market
  • 6. 13 November 2019 Presentation titlePage 6 DATA WAREHOUSING OVERVIEW • A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data. • A data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, online analytical processing (OLAP) and data mining capabilities, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users. • A data warehouse is a subject oriented, integrated, time-variant, and non-volatile collection of data. This data helps analysts to take informed decisions in an organization. • It is a series of processes, procedures and tools (h/w & s/w) that help the enterprise understand more about itself, its products, its customers and the market it services
  • 7. 13 November 2019 Presentation titlePage 7 SUBJECT ORIENTED • A data warehouse is subject oriented because it provides information around a subject rather than the organization's ongoing operations. • These subjects can be product, customers, suppliers, sales, revenue, etc. • A data warehouse does not focus on the ongoing operations, rather it focuses on modelling and analysis of data for decision making. Operational Systems Data Warehouse Customer Supplier Product Organized by processes or tasks Organized by subject
  • 8. 13 November 2019 Presentation titlePage 8 INTEGRATED • A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. • This integration enhances the effective analysis of data. • Data is stored once in a single integrated location • It is closely related with subject orientation. • Data from disparate sources need to be put in a consistent format. • Resolving of problems such as naming conflicts and inconsistencies Subject = Customer Legacy Mainframe Customer data stored in several databases RDBMS Flat Files
  • 9. 13 November 2019 Presentation titlePage 9 TIME VARIANT • The data collected in a data warehouse is identified with a particular time period. • The data in a data warehouse provides information from the historical point of view. • Data is stored as a series of snapshots or views which record how it is collected across time. • It helps in Business trend analysis • In contrast to OLTP environment, data warehouse’s focus on change over time that is what we mean by time variant. Data Warehouse Time Data { Key
  • 10. 13 November 2019 Presentation titlePage 10 NON-VOLATILE • Non-volatile means the previous data is not erased when new data is added to it. • A data warehouse is kept separate from the operational database and therefore frequent changes in operational database is not reflected in the data warehouse. • This is logical because the purpose of a data warehouse is to enable you to analyze what has occurred.
  • 11. 13 November 2019 Presentation titlePage 11 OLTP VS DATA WAREHOUSE • OLTP systems are tuned for known transactions and workloads while workload is not known in a data warehouse • Special data organization, access methods and implementation methods are needed to support data warehouse queries (typically multidimensional queries) • OLTP • Application Oriented • Used to run business • Detailed data • Current up to date • Isolated Data • Repetitive access • Clerical User ► Data warehouse ► Subject Oriented ► Used to analyze business ► Summarized and refined ► Snapshot data ► Integrated Data ► Ad-hoc access ► Knowledge User (Manager)
  • 12. 13 November 2019 Presentation titlePage 12 OLTP VS DATA WAREHOUSE (TO SUMMARIZE) • OLTP Systems are used to “run” a business ► The Data Warehouse helps to “optimize” the business
  • 13. 13 November 2019 Presentation titlePage 13 DATA MART • The data mart is a subset of the data warehouse and is usually oriented to a specific business line or team. • A data mart is a repository of data that is designed to serve a particular community of knowledge workers. • The goal of a data mart is to meet the particular demands of a specific group of users within the organization, such as human resource management, sales etc. • Data marts improve end-user response time by allowing users to have access to the specific type of data they need to view most often by providing the data in a way that supports the collective view of a group of users.
  • 14. 13 November 2019 Presentation titlePage 14 DATA WAREHOUSE END TO END Metadata Data Sources Data Management Access Operational Data Legacy Data The Post External Data Sources Enterprise Data Warehouse Organizationally structured Extract Transform Load Data Mart Data Mart Departmentally structured Data Mart Sales Inventory Purchase
  • 15. 13 November 2019 Presentation titlePage 15 DATA WAREHOUSING
  • 16. 13 November 2019 Presentation titlePage 16 DATA WAREHOUSING SCHEMAS • A schema is a collection of database objects, including tables, views, indexes, and synonyms. • There is a variety of ways of arranging schema objects in the schema models designed for data warehousing. The are: Star Schema Snowflake Schema Galaxy Schema
• 17. 13 November 2019 Presentation titlePage 17 STAR SCHEMA • It consists of a fact table connected to a set of dimension tables • Data in the dimension tables is de-normalized
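A star-schema query typically joins the central fact table to its de-normalized dimension tables and then aggregates the measures. The following is a minimal pandas sketch of that pattern; the table and column names (sales_fact, product_dim, store_dim) are invented for illustration and are not taken from the slides.

```python
import pandas as pd

# Hypothetical de-normalized dimension tables and a fact table
product_dim = pd.DataFrame({"product_key": [1, 2],
                            "product_name": ["Pen", "Notebook"],
                            "category": ["Stationery", "Stationery"]})
store_dim = pd.DataFrame({"store_key": [10, 20],
                          "city": ["Mumbai", "Delhi"],
                          "region": ["West", "North"]})
sales_fact = pd.DataFrame({"product_key": [1, 2, 1],
                           "store_key": [10, 10, 20],
                           "sales_amount": [100, 250, 80]})

# A typical star-schema query: join the fact table to its dimensions, then aggregate
report = (sales_fact
          .merge(product_dim, on="product_key")
          .merge(store_dim, on="store_key")
          .groupby(["region", "category"], as_index=False)["sales_amount"].sum())
print(report)
```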
• 18. 13 November 2019 Presentation titlePage 18 SNOWFLAKE SCHEMA  It is a refinement of the star schema in which some dimensional hierarchies are normalized into a set of smaller dimension tables
• 19. 13 November 2019 Presentation titlePage 19 GALAXY SCHEMA  Multiple fact tables share dimension tables; because it can be viewed as a collection of stars, it is called a galaxy schema
  • 20. 13 November 2019 Presentation titlePage 20 BUSINESS INTELLIGENCE • How intelligent can you make your business processes? • What insight can you gain into your business? • How integrated can your business processes be? • How much more interactive can your business be with customers, partners, employees and managers?
  • 21. 13 November 2019 Presentation titlePage 21 WHAT IS BUSINESS INTELLIGENCE (BI)? • Business Intelligence is a generalized term applied to a broad category of applications and technologies for gathering, storing, analyzing and providing access to data to help enterprise users make better business decisions • Business Intelligence applications include the activities of decision support systems, query and reporting, online analytical processing (OLAP), statistical analysis, forecasting, and data mining • An alternative way of describing BI is: the technology required to turn raw data into information to support decision-making within corporations and business processes
• 22. 13 November 2019 Presentation titlePage 22 OPERATIONAL DATA STORE (ODS) An Operational Data Store (ODS) integrates data from multiple business operation sources to address operational problems that span one or more business functions. An ODS has the following features: • Subject-oriented — Organized around major subjects of an organization (customer, product, etc.), not specific applications (order entry, accounts receivable, etc.). • Integrated — Presents an integrated image of subject-oriented data which is pulled from fragmented operational source systems. • Current — Contains a snapshot of the current content of legacy source systems. History is not kept in the ODS; historical data may be moved to the data warehouse for analysis. • Volatile — Since ODS content is kept current, it changes frequently. Identical queries run at different times may yield different results. • Detailed — ODS data is generally more detailed than data warehouse data. Summary data is usually not stored in an ODS; the exact granularity depends on the subject that is being supported.
  • 23. 13 November 2019 Presentation titlePage 23 OPERATIONAL DATA STORE (ODS) CONTD.. The ODS provides an integrated view of data in operational systems. As the figure below indicates, there is a clear separation between the ODS and the data warehouse.
• 24. 13 November 2019 Presentation titlePage 24 BENEFITS OF ODS • Supports operational reporting needs of the organization • Operates as a store for detailed data, updated frequently and used for drill-downs from the data warehouse, which contains summary data. • Reduces the burden placed on other operational or data warehouse platforms by providing an additional data store for reporting. • Provides data that is more current than a data warehouse and more integrated than an OLTP system • Feeds other operational systems in addition to the data warehouse
  • 25. 13 November 2019 Presentation titlePage 25 DATA WAREHOUSING OBJECTS Fact Tables: • Represent a business process, i.e., models the business process as an artifact in the data model • Contain the measurements or metrics or facts of business processes • "monthly sales number" in the Sales business process • most are additive (sales this month), some are semi-additive (balance as of), some are not additive (unit price) • The level of detail is called the “grain” of the table • Contain foreign keys for the dimension tables
• 26. 13 November 2019 Presentation titlePage 26 DATA WAREHOUSING OBJECTS (CONTD..) Dimension Tables: • Define the business in terms already familiar to users • Wide rows with lots of descriptive text • Small tables (up to about a million rows) • Joined to the fact table by a foreign key • Heavily indexed • Typical dimensions: time periods, geographic regions (markets, cities), products, customers, salespersons, etc.
• 27. 13 November 2019 Presentation titlePage 27 FACT TYPES • Additive facts: Additive facts are facts that can be summed up through all of the dimensions in the fact table • Semi-additive facts: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table • Non-additive facts: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table
• 28. 13 November 2019 Presentation titlePage 28 EXAMPLE OF ADDITIVE FACT Fact Table: • The purpose of this table is to record the Sales_Amount for each product in each store on a daily basis. Sales_Amount is the fact. • In this case, Sales_Amount is an additive fact, because we can sum up this fact along any of the 3 dimensions present in the fact table: date, store, and product. (Fact table columns: Date, Store, Product, Sales_Amount)
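To make the additivity concrete, here is a minimal pandas sketch (the sample rows are invented, not taken from the slide) showing that Sales_Amount can be summed along any of the three dimensions:

```python
import pandas as pd

# A few hypothetical rows of the fact table described above
fact = pd.DataFrame({
    "Date": ["2019-11-01", "2019-11-01", "2019-11-02"],
    "Store": ["S1", "S2", "S1"],
    "Product": ["P1", "P1", "P2"],
    "Sales_Amount": [100, 150, 200],
})

# Because Sales_Amount is additive, it can be summed along any dimension
print(fact.groupby("Store")["Sales_Amount"].sum())    # total per store
print(fact.groupby("Product")["Sales_Amount"].sum())  # total per product
print(fact["Sales_Amount"].sum())                     # grand total across all dimensions
```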
• 29. 13 November 2019 Presentation titlePage 29 EXAMPLE OF SEMI ADDITIVE & NON-ADDITIVE FACTS Fact Table:  The purpose of this table is to record the current balance for each account at the end of each day, as well as the profit margin for each account for each day  Current_Balance & Profit_Margin are the facts  Current_Balance is a semi-additive fact, as it makes sense to add balances across all accounts (what is the total current balance for all accounts in the bank?), but it does not make sense to add them up through time  Profit_Margin is a non-additive fact, because it does not make sense to add margins along any of the dimensions. (Fact table columns: Date, Account, Current_Balance, Profit_Margin)
• 30. 13 November 2019 Presentation titlePage 30 SLOWLY CHANGING DIMENSIONS • Various data elements in the dimension undergo changes (e.g. changes in attributes, hierarchical structures) which need to be captured for analysis. • In a nutshell, this applies to cases where an attribute of a record varies over time. • Example: • Christina is a customer who first lived in Chicago, Illinois. At a later date, she moved to Los Angeles, California. How should the table be modified to reflect this change? This is the “Slowly Changing Dimension” problem. (Current dimension row: Customer key 1001, Name Christina, State Illinois)
• 31. 13 November 2019 Presentation titlePage 31 TYPES OF SCD • There are three types of SCDs: • Type 1 • Type 2 • Type 3 Type 1: The new record replaces the original record; no trace of the old record exists. Type 2: A new record is added to the dimension table. Type 3: The original record is modified to reflect the change.
• 32. 13 November 2019 Presentation titlePage 32 SCD TYPE 1 The new record replaces the original record; no trace of the old record exists. E.g.: Before: Customer key 1001, Name Christina, State Illinois. After: Customer key 1001, Name Christina, State California. Advantages: This is the easiest way to handle a slowly changing dimension, since there is no need to keep track of the old information. Disadvantages: All history is lost. By applying this methodology, it is not possible to track back in history. For example, in the above case the company would not be able to know that Christina lived in Illinois before.
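The overwrite behaviour of Type 1 can be sketched in a few lines of pandas; this is only an illustration of the concept, not how Data Services implements it:

```python
import pandas as pd

customer_dim = pd.DataFrame({"customer_key": [1001],
                             "name": ["Christina"],
                             "state": ["Illinois"]})

# Type 1: overwrite the attribute in place; no history is kept
customer_dim.loc[customer_dim["customer_key"] == 1001, "state"] = "California"
print(customer_dim)  # 1001, Christina, California -- the Illinois value is gone
```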
• 33. 13 November 2019 Presentation titlePage 33 SCD TYPE 2 In a Type 2 SCD a new record is added to the table to represent the new information, so both the original and the new record are present. E.g.: Customer key 1001, Name Christina, State Illinois and Customer key 1005, Name Christina, State California. After Christina moved from Illinois to California, we add the new information as a new row in the table. Advantages: This allows us to keep all historical information accurately. Disadvantages: It causes the size of the table to grow quickly; where the number of rows is very high to start with, storage and performance can become a concern.
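A minimal pandas sketch of the Type 2 behaviour (illustrative only; surrogate key 1005 is taken from the example above):

```python
import pandas as pd

customer_dim = pd.DataFrame({"customer_key": [1001],
                             "name": ["Christina"],
                             "state": ["Illinois"]})

# Type 2: keep the old row and add a new row with a new surrogate key
new_row = pd.DataFrame({"customer_key": [1005],
                        "name": ["Christina"],
                        "state": ["California"]})
customer_dim = pd.concat([customer_dim, new_row], ignore_index=True)
print(customer_dim)  # both the Illinois row and the California row are preserved
```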
• 34. 13 November 2019 Presentation titlePage 34 SCD TYPE 3 In a Type 3 SCD there are two columns for the attribute of interest, one indicating the original value and one indicating the current value, plus a column that indicates when the current value became active. E.g.: Customer key 1001, Name Christina, Original State Illinois, Current State California, Effective Date 15-Jan-03. After Christina moved from Illinois to California, the original row is updated in place, and we have the table above (assuming the effective date of the change is January 15, 2003). Advantages:  This does not increase the size of the table, since the new information is stored by updating the existing row.  This allows us to keep some part of history. Disadvantages: Type 3 cannot keep all history where an attribute is changed more than once. For example, if Christina later moves to Texas on December 15, 2003, the California information is lost.
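A minimal pandas sketch of the Type 3 behaviour (illustrative only; the column names original_state, current_state and effective_date mirror the example above):

```python
import pandas as pd

customer_dim = pd.DataFrame({
    "customer_key": [1001], "name": ["Christina"],
    "original_state": ["Illinois"], "current_state": ["Illinois"], "effective_date": [None],
})

# Type 3: keep the original value in its own column and overwrite only the current value
mask = customer_dim["customer_key"] == 1001
customer_dim.loc[mask, ["current_state", "effective_date"]] = ["California", "2003-01-15"]
print(customer_dim)  # original_state still holds Illinois; only one previous value is kept
```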
• 35. 13 November 2019 Presentation titlePage 35 WHAT IS ETL ? • ETL stands for extract, transform, and load. • ETL is software that enables businesses to consolidate their disparate data while moving it from place to place, regardless of the form or format that data is in. • There are many ETL tools, e.g. BODS, Informatica, IBM InfoSphere DataStage, Ab Initio, Oracle Warehouse Builder (OWB). • It can be used for the following purposes: • As middleware • In a data warehouse • For SAP data conversion/migration
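For orientation, the extract / transform / load steps can be sketched with nothing more than the Python standard library. This is a conceptual illustration, not BODS code; the file name customers_source.csv, the table dim_customer, and the cleansing rules are all invented.

```python
import csv
import sqlite3

# Extract: read rows from a hypothetical CSV source file
with open("customers_source.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Transform: cleanse and standardize (trim names, upper-case country codes, drop rows without an id)
clean = [
    {"id": int(r["id"]), "name": r["name"].strip().title(), "country": r["country"].strip().upper()}
    for r in rows if r["id"]
]

# Load: write the cleansed rows into a hypothetical warehouse table
con = sqlite3.connect("warehouse.db")
con.execute("CREATE TABLE IF NOT EXISTS dim_customer (id INTEGER, name TEXT, country TEXT)")
con.executemany("INSERT INTO dim_customer VALUES (:id, :name, :country)", clean)
con.commit()
con.close()
```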
  • 36. 13 November 2019 Presentation titlePage 36 ETL PROCESS
  • 37. 13 November 2019 Presentation titlePage 37 ETL TERMS • Source System A database, application, file, or other storage facility from which the data in a data warehouse is derived. • Mapping The definition of the relationship and data flow between source and target objects. • Metadata Data that describes data and other structures, such as objects, business rules, and processes. For example, the schema design of a data warehouse is typically stored in a repository as metadata, which is used to generate scripts used to build and populate the data warehouse. A repository contains metadata. • Staging Area A place where data is processed before entering the warehouse. • Cleansing The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of the ETL process. • Transformation The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources.
• 38. 13 November 2019 Presentation titlePage 38 DATASTORES • Datastores • Are used to set up connections between an application and the database. • Must be specified for every source and target database. • Are used to import metadata for source and target databases and tables into the repository. • Are used by Data Services to read data from source tables or load data to target tables. • In Business Objects Data Services, you can connect to the following systems using Datastores: • Mainframe systems and databases • Applications and software with user-written adapters • SAP Applications, SAP BW, Oracle Apps, Siebel, etc.
• 39. 13 November 2019 Presentation titlePage 39 TYPES OF DATASTORES • Database Datastores provide a simple way to import metadata directly from a broad variety of relational database management systems (RDBMS) • Application Datastores let users easily import metadata from most Enterprise Resource Planning (ERP) systems • Adapter Datastores allow users to import metadata from any source. Specific adapters may be purchased from Business Objects, or can be developed by customers or third parties as documented in the Business Objects Adapter Development Kit (ADK)
  • 40. 13 November 2019 Presentation titlePage 40 DEFINING A DATASTORE • To define a Datastore, you must have an account with access privileges to the database or application hosting the data you need to access (user name and password). • Datastores are defined in the Datastores tab of the object library using the Datastore Editor. • The Datastore options available depend on which RDBMS or application is used for the Datastore.
  • 41. 13 November 2019 Presentation titlePage 41 DEFINING A DATASTORE (CONT.) • Datastore Editor • Used to define/edit a Datastore • Give the Datastore a meaningful name • Choose the application type of your Datastore • You must enter the parameters of the database to which you are connecting.
  • 42. 13 November 2019 Presentation titlePage 42 DATASTORE ADVANCED CONFIGURATION • You can toggle the Advanced button to hide and show the grid of additional Datastore editor options. • The grid displays Datastore configurations as column headings and lists Datastore options in the left column. Each row represents a configuration option. • Different options appear depending upon Datastore type and (if applicable) database type and version. Specific options appear under group headings such as Connection, General, and Locale
• 43. 13 November 2019 Presentation titlePage 43 DATASTORE ADVANCED CONFIGURATION (CONT.)
• 44. 13 November 2019 Presentation titlePage 44 METADATA IMPORT • Data Services stores the following table information: • Table name, attributes, indexes • Column names, descriptions, data types, primary keys • Data Services updates imported table metadata only when you re-import it manually. • Changes made to underlying table schemas or functions are not automatically imported into Business Objects Data Services.
  • 45. 13 November 2019 Presentation titlePage 45 SELECTIVE IMPORT • Import metadata by Browsing : • 1. In the object library, Datastores tab, right-click on Datastore you want to import to and select Open • 2. From the workspace, right-click the required table and select Import Note: Only metadata is imported
  • 46. 13 November 2019 Presentation titlePage 46 SELECTIVE IMPORT (CONT.) • Import metadata by Name- 1. In the object library, Datastores tab, right-click the Datastore you want to import to and select Import By Name 2. Complete the information in Import By Name dialog box
  • 47. 13 November 2019 Presentation titlePage 47 SELECTIVE IMPORT (CONT.) • Import Search for data • Basic search of external or imported (internal) data • Advanced search of imported (internal) data only
  • 48. 13 November 2019 Presentation titlePage 48 DATA SERVICES OBJECT HIERARCHY
  • 49. 13 November 2019 Presentation titlePage 49 PROJECT • Project is the highest level of object offered by Data Services. • They are listed in the object library under Project tab. • Are used to group and organize related objects • May contain any number of: Jobs, Workflows, Data flows etc. • Only one project can be open at a time. • Can be shared among multiple users using ATL files or a Central Repository • Steps to create a new project : • Choose Project > New > Project. • Enter the name of your new project. The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
• 50. 13 November 2019 Presentation titlePage 50 JOBS • Jobs are the only executable objects in SAP BODS. • Are reusable objects and the next level of organization below a project. • Contain Workflows (optional) and/or Dataflows. • Can call many Workflows. • Can be assigned to any project available in the local repository by dragging them from the local object library. • Are the highest level at which logging happens.
  • 51. 13 November 2019 Presentation titlePage 51 JOBS (CONT.) • Batch Job A batch job extracts, transforms, and loads data. It is something that you start, it does the processing like reading tables and loading the data warehouse, and then it stops until it is started again, e.g. every night, twice a day, every 4 hours, or manually started. • Real Time Job Like a batch job, a real-time job also extracts, transforms, and loads data. A Real Time job is started once at the beginning and keeps running as long as the server is active. Whenever a new message is sent through a SOAP request, it will get processed and then the Real Time job sends a SOAP response and waits for the next request.
  • 52. 13 November 2019 Presentation titlePage 52 JOBS (CONT.) Jobs are created in the Project area or in the Object Library. • Create Job in Project area: 1. In the Project Area, select the Project Name. 2. Right-click and choose New Batch Job or New Real Time Job and then edit the name. 3. SAP BODS opens a new workspace which is ready to define the job. • Create Job in Object Library: 1. In Object Library, select Job tab. 2. Right-click Batch Jobs or Real Time Jobs and choose New. 3. A new job with a default name appears. 4. Right-click and select Properties to change the object's name and add a description.
• 53. 13 November 2019 Presentation titlePage 53 WORKFLOWS • A Workflow defines the decision-making process for executing Dataflows. • It is a reusable component used to group Dataflows and/or Workflows together. • The Workflow helps to define the execution order of the Dataflows and supporting operations. • Defined system parameters can be used to pass values into the workflow. • Variables can also be defined for use inside the workflow. • Workflows may contain the following objects: Workflow, Dataflow, Script, Conditional, Try, Catch, While
  • 54. 13 November 2019 Presentation titlePage 54 DATAFLOWS • Data flows extract, transform, and load data; reading sources, transforming data, and loading targets, occurs inside a data flow. • A data flow can be added to a job or a work flow. • Dataflows are reusable objects. • Workflows and Jobs call Dataflows to perform data movement operations.
  • 55. 13 November 2019 Presentation titlePage 55 EMBEDDED DATAFLOWS • An embedded Dataflow is a Dataflow that is called from inside another Dataflow. Data passes into or out of the embedded Dataflow from the parent flow through a single source or target. • The embedded Dataflow can contain any number of sources or targets, but only one input or one output can pass data to or from the parent Dataflow. • An embedded Dataflow is a design aid that has no effect on job execution. • When SAP BODS executes the parent Dataflow, it expands any embedded Dataflows, optimizes the parent Dataflow, then executes it.
  • 56. 13 November 2019 Presentation titlePage 56 EMBEDDED DATAFLOWS (EXAMPLE) • The Example of when to use embedded Dataflows: • In this example, a Dataflow uses a single source to load three different target systems. The Case transform sends each row from the source to different transforms that process it to get a unique target output. • You can simplify the parent Dataflow by using embedded Dataflows for the three different cases.
• 57. 13 November 2019 Presentation titlePage 57 EMBEDDED DATAFLOWS (CONT.) There are two ways to create embedded Dataflows: • Select objects within a Dataflow, right-click, and select Make Embedded Dataflow. • Drag a complete and fully validated Dataflow from the object library into an open Dataflow in the workspace. Then open the Dataflow you just added, right-click the object you want to use as an input or output port, and select Make Port for that object.
• 58. 13 November 2019 Presentation titlePage 58 ABAP DATAFLOWS • An ABAP Dataflow extracts and transforms data from SAP application tables, files, and hierarchies. • The ABAP Dataflow produces a data set that you can use as input to other transforms, save to a file that resides on an SAP application server, or save to an SAP table. • When SAP BODS executes ABAP Dataflows, it translates the extraction requirements into ABAP programs and passes them to SAP to execute. • ABAP Dataflows generate ABAP code. ABAP Dataflows and data transport objects also appear in the tool palette. • In the ABAP Dataflow, after specifying the source and transformations, specify the target, which is the data transport object. • The data transport object in an ABAP Dataflow makes the data set available to the calling Dataflow; it is a file on the SAP application server. • Once the ABAP Dataflow is defined, in the normal Dataflow, connect the ABAP Dataflow to the downstream transforms and the target object where the data has to be loaded.
  • 59. 13 November 2019 Presentation titlePage 59 LOG FILES • As a Job executes, Data Integrator produces the three types of log files that can be viewed in the Designer Project Area : • Monitor Log • Statistics Log • Error Log • The log files are, by default, also set to display automatically in the workspace when you execute a Job.
  • 60. 13 November 2019 Presentation titlePage 60 LOG FILES (CONT.) Monitor Log : Displays each step of each data flow in the job, the number of rows streamed through each step, and the duration of each step.
  • 61. 13 November 2019 Presentation titlePage 61 LOG FILES (CONT.) Statistics log: Itemizes the steps executed in the job and the time execution begins and ends.
  • 62. 13 November 2019 Presentation titlePage 62 LOG FILES (CONT.) Error log : Displays the name of the object being executed when a Data Integrator error occurred. Also displays the text of the resulting error message.
• 63. 13 November 2019 Presentation titlePage 63 VARIABLES • Variables are symbolic placeholders for values. • Local variables are restricted to the object in which they are created (job or work flow). You must use parameters to pass local variables to other objects. • Global variables are restricted to the job in which they are created. However, they do not require parameters to be passed to work flows and data flows. • The data type of a variable can be any supported by the software, such as an integer, decimal, date, or text string. • You can increase the flexibility and reusability of work flows and data flows by using local and global variables when you design your jobs.
  • 64. 13 November 2019 Presentation titlePage 64 VARIABLES (CONT.) • If you define variables in a job or work flow, the software typically uses them in a script, catch, or conditional process. • You can use variables inside data flows. For example, use them in a custom function or in the WHERE clause of a query transform.
  • 65. 13 November 2019 Presentation titlePage 65 PARAMETERS • Parameters can be defined to: • Pass their values into and out of work flows • Pass their values into data flows • Each parameter is assigned a type: input, output, or input/output. The value passed by the parameter can be used by any object called by the work flow or data flow.
  • 66. 13 November 2019 Presentation titlePage 66 VARIABLES & PARAMETERS • Variables and parameters are used differently based on the object type and whether the variable is local or global. • The following table lists the types of variables and parameters you can create based on the object type and how you use them.
  • 67. 13 November 2019 Presentation titlePage 67 DATA SERVICES TRANSFORMS • Data Services offers a number of pre-defined transformations and functional objects for both Data Integration and Data Quality, that allow modelling of the ETL flows. • Each transform is a step in a Dataflow that acts on a data set. • A transform enables you to control how data sets change in a Dataflow. • Transforms operate on data sets by manipulating input sets and producing one or more output sets. • The software includes many built-in transforms. These transforms are available from the object library on the Transforms tab. • Transforms are divided into 3 categories : Platform, Data Integrator & Data Quality
  • 68. 13 November 2019 Presentation titlePage 68 QUERY TRANSFORM • The Query transform retrieves a data set that satisfies conditions that you specify. A Query transform is similar to a SQL SELECT statement. • A query has data outputs, which are data sets based on the conditions that you specify using the schema specified in the output schema area. • Query transform has the following tabs : SELECT, FROM, WHERE, GROUP BY, ORDER BY, ADVANCED Platform Transforms
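The tabs of the Query transform map naturally onto the clauses of a SELECT statement. The pandas sketch below shows a rough equivalent of WHERE, GROUP BY, and ORDER BY on an invented orders data set; it only illustrates the idea, not the transform itself.

```python
import pandas as pd

orders = pd.DataFrame({
    "region": ["West", "West", "North"],
    "status": ["OPEN", "CLOSED", "OPEN"],
    "amount": [100, 250, 80],
})

# Rough equivalent of the Query transform's WHERE, SELECT/GROUP BY, and ORDER BY tabs
result = (orders[orders["status"] == "OPEN"]                          # WHERE
          .groupby("region", as_index=False)["amount"].sum()          # SELECT + GROUP BY
          .sort_values("amount", ascending=False))                    # ORDER BY
print(result)
```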
  • 69. 13 November 2019 Presentation titlePage 69 CASE TRANSFORM • Specifies multiple paths in a single transform (different rows are processed in different ways). • The Case transform simplifies branch logic in data flows by consolidating case or decision making logic in one transform. • Paths are defined in an expression table. Platform Transforms
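Conceptually, the Case transform routes each input row down exactly one output path based on the expressions you define. A plain-Python sketch of that routing (the regions used as case expressions are invented):

```python
# Route each input row down exactly one branch, the way a Case transform does
rows = [{"id": 1, "region": "West"}, {"id": 2, "region": "North"}, {"id": 3, "region": "South"}]

target_west, target_north, target_default = [], [], []
for row in rows:
    if row["region"] == "West":       # case expression 1
        target_west.append(row)
    elif row["region"] == "North":    # case expression 2
        target_north.append(row)
    else:                             # default path for rows matching no expression
        target_default.append(row)

print(len(target_west), len(target_north), len(target_default))
```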
• 70. 13 November 2019 Presentation titlePage 70 MAP_OPERATION TRANSFORM • Modifies data based on mapping expressions and current operation codes. The operation codes can be converted between data manipulation operations. • This transform can also change operation codes on data sets to produce the desired output. For example, if a row in the input data set has been updated in some previous operation in the data flow, you can use this transform to map the UPDATE operation to an INSERT. The result of converting UPDATE rows into INSERT rows is the preservation of the existing rows in the target. • Data Services can push Map_Operation transforms to the source database. • The Map Operation tab would have the following settings: Platform Transforms
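The idea of rewriting operation codes can be sketched as a simple lookup over flagged rows. The mapping below (UPDATE to INSERT, DELETE to DISCARD) is just one hypothetical configuration, not a default of the transform:

```python
# Each row carries an operation code; the Map_Operation transform rewrites those codes
rows = [
    {"op": "UPDATE", "id": 1, "name": "Christina"},
    {"op": "INSERT", "id": 2, "name": "Sid"},
]

# Hypothetical mapping: convert UPDATE rows to INSERT so existing target rows are preserved
op_map = {"UPDATE": "INSERT", "INSERT": "INSERT", "DELETE": "DISCARD", "NORMAL": "NORMAL"}
mapped = [{**row, "op": op_map[row["op"]]} for row in rows]
print(mapped)
```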
  • 71. 13 November 2019 Presentation titlePage 71 MERGE TRANSFORM • Combines incoming data sets, producing a single output data set with the same schema as the input data sets. • All sources must have the same schema, including: • The same number of columns • The same column names • The same data type of columns • A data set consisting of rows from all sources, with any operation codes. The output data has the same schema as the source data, including nested schemas. Platform Transforms
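Because all inputs share the same schema, the Merge transform behaves like a UNION ALL of its sources. A minimal pandas sketch of that union (invented sample data):

```python
import pandas as pd

# Two sources with the same schema (same columns, names, and data types)
source_a = pd.DataFrame({"id": [1, 2], "name": ["Joe", "Sid"]})
source_b = pd.DataFrame({"id": [3], "name": ["Dolly"]})

# Merge behaviour: all rows from all sources flow into one output with the same schema
merged = pd.concat([source_a, source_b], ignore_index=True)
print(merged)
```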
  • 72. 13 November 2019 Presentation titlePage 72 SQL TRANSFORM • Performs the indicated SQL query operation. • Use this transform to perform standard SQL operations when other built-in transforms cannot perform them. • The options for the SQL transform include specifying a Datastore, join rank, cache, array fetch size, and entering SQL text. • There are two ways of defining the output schema for a SQL transform: • Automatic: After you type the SQL statement, click Update schema to execute a described select statement against the database which obtains column information returned by the select statement and populates the output schema.● • Manual: Output columns must be defined in the output portion of the SQL transform. The number of columns defined in the output of the SQL transform must equal the number of columns returned by the SQL query. Platform Transforms
  • 73. 13 November 2019 Presentation titlePage 73 VALIDATION TRANSFORM • The Validation transform qualifies a data set based on rules for input schema columns. • You can apply multiple rules per column or bind a single reusable rule (in the form of a validation function) to multiple columns. • The Validation transform can identify the row, column, or columns for each validation failure. • You can also use the Validation transform to filter or replace (substitute) data that fails your criteria. • When you enable a validation rule for a column, a check mark appears next to it in the input schema. Platform Transforms
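The pass/fail routing of validation rules can be sketched with a boolean rule over the input columns; the rules below (non-empty email, plausible age) are invented examples:

```python
import pandas as pd

customers = pd.DataFrame({"id": [1, 2, 3],
                          "email": ["a@x.com", "", "c@y.com"],
                          "age": [34, 200, 28]})

# Hypothetical column rules: email must not be empty and age must be plausible
rule = (customers["email"] != "") & customers["age"].between(0, 120)

passed = customers[rule]    # rows routed to the Pass output
failed = customers[~rule]   # rows routed to the Fail output (they could also be substituted)
print(passed)
print(failed)
```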
• 74. 13 November 2019 Presentation titlePage 74 TABLE COMPARISON TRANSFORM • Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT or UPDATE. • The Table_Comparison transform allows you to detect and forward changes that have occurred since the last time a target was updated. • Allows you to identify changes to a target table for incremental updates • Three possible outcomes from this transform: • A new row can be added • An existing record can be updated • A row can be ignored Data Integrator Transforms
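The compare-and-flag logic can be sketched by looking up each incoming row in the comparison (target) table and tagging it with an operation code; the data and the compare column (state) are invented:

```python
import pandas as pd

target = pd.DataFrame({"customer_key": [1, 2], "state": ["Illinois", "Texas"]})
incoming = pd.DataFrame({"customer_key": [1, 2, 3], "state": ["California", "Texas", "Ohio"]})

def compare(row):
    match = target[target["customer_key"] == row["customer_key"]]
    if match.empty:
        return "INSERT"                            # new row
    if match.iloc[0]["state"] != row["state"]:
        return "UPDATE"                            # existing row has changed
    return "IGNORE"                                # unchanged row

incoming["op"] = incoming.apply(compare, axis=1)
print(incoming[incoming["op"] != "IGNORE"])        # only INSERT/UPDATE rows flow downstream
```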
  • 75. 13 November 2019 Presentation titlePage 75 TABLE COMPARISON TRANSFORM (CONT.) Presence of output columns indicates proper configuration Comparison table (usually the target) Data Integrator Transforms
  • 76. 13 November 2019 Presentation titlePage 76 HISTORY PRESERVING TRANSFORM • The History_Preserving transform allows you to produce a new row in your target rather than updating an existing row. You can indicate in which columns the transform identifies changes to be preserved. • The History_Preserving transform requires input rows flagged as inserts and updates. • The History_Preserving transform is usually preceded by a Table_Comparison, which provides the required input row types. Data Integrator Transforms
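Building on the flagged output of a Table_Comparison, history preservation means closing off the old dimension row and inserting a new one instead of overwriting. A minimal sketch (the current_flag column is an invented illustration of how the old row might be marked):

```python
import pandas as pd

dim = pd.DataFrame({"customer_key": [1], "state": ["Illinois"], "current_flag": ["Y"]})

# Row flagged by an upstream Table_Comparison as an UPDATE on the "state" column
change = {"customer_key": 1, "state": "California", "op": "UPDATE"}

if change["op"] == "UPDATE":
    # Close off the old row instead of overwriting it ...
    dim.loc[dim["customer_key"] == change["customer_key"], "current_flag"] = "N"
    # ... and insert a new row so that history is preserved
    new_row = pd.DataFrame({"customer_key": [change["customer_key"]],
                            "state": [change["state"]], "current_flag": ["Y"]})
    dim = pd.concat([dim, new_row], ignore_index=True)
print(dim)
```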
  • 77. 13 November 2019 Presentation titlePage 77 KEY GENERATION TRANSFORM • Generates sequential key values for new rows, starting from the maximum existing key value in a specified table • Allows you to build a new physical primary key, e.g. for preserving history • When it is necessary to generate artificial keys in a table, the Key_Generation transform looks up the maximum existing key value from a table and uses it as the starting value to generate new keys. • The transform expects the generated key column to be part of the input schema. Data Integrator Transforms
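The key-generation idea is simply "read the current maximum key, then hand out sequential values to new rows". A minimal pandas sketch with invented data:

```python
import pandas as pd

dim = pd.DataFrame({"customer_key": [1001, 1002], "name": ["Christina", "Sid"]})
new_rows = pd.DataFrame({"customer_key": [None, None], "name": ["Dolly", "Joe"]})

# Start from the maximum existing key and assign sequential values to the new rows
start = int(dim["customer_key"].max())
new_rows["customer_key"] = range(start + 1, start + 1 + len(new_rows))
print(new_rows)  # keys 1003 and 1004
```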
  • 78. 13 November 2019 Presentation titlePage 78 DATE GENERATION TRANSFORM • Produces a series of dates incremented as you specify. • Use this transform to produce the key values for a time dimension target. • From this generated sequence you can populate other fields in the time dimension (such as day_of_week) using functions in a query. • To create a time dimension target with dates from the beginning of the year 1997 to the end of the year 2000, place a Date_Generation transform, a query, and a target in a data flow. Inside the Date_Generation transform, specify the following Options : • Start date: 1997.01.01 • End date: 2000.12.31 (A variable can also be used.) • Increment: Daily (A variable can also be used.) Data Integrator Transforms
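Generating the date series itself is straightforward; the sketch below produces one row per day between the start and end dates from the example and derives a day-of-week attribute the way a downstream query would:

```python
from datetime import date, timedelta

# One row per day between the start and end dates, incremented daily
start, end = date(1997, 1, 1), date(2000, 12, 31)
dates = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# A downstream query would derive further time-dimension attributes from each generated date
time_dim = [{"date_key": d.isoformat(), "day_of_week": d.strftime("%A")} for d in dates]
print(len(time_dim), time_dim[0])
```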
• 79. 13 November 2019 Presentation titlePage 79 PIVOT TRANSFORM (COLUMNS TO ROWS) • Creates a new row for each value in a column that you identify as a pivot column. • The Pivot transform allows you to change how the relationship between rows is displayed. • For each value in each pivot column, Data Services produces a row in the output data set. • You can create pivot sets to specify more than one pivot column. Input (Name, Jan, Feb, Mar): Joe 1100 500 900; Sid 500 1200 300; Dolly 900 1300 200. Output (Name, Sequence, Month, Q1_Expenses): Joe 1 Jan 1100; Joe 2 Feb 500; Joe 3 Mar 900; Sid 1 Jan 500; Sid 2 Feb 1200; Sid 3 Mar 300; Dolly 1 Jan 900; Dolly 2 Feb 1300; Dolly 3 Mar 200. Settings: Non-pivot columns: Name; Pivot columns: Jan, Feb, Mar; Sequence name: Sequence; Pivot data field: Q1_Expenses; Pivot header name: Month. Data Integrator Transforms
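The column-to-row pivot shown above corresponds to a melt operation; the pandas sketch below reproduces the slide's input and output (only as an illustration of the data reshaping, not of the transform itself):

```python
import pandas as pd

expenses = pd.DataFrame({
    "Name": ["Joe", "Sid", "Dolly"],
    "Jan": [1100, 500, 900],
    "Feb": [500, 1200, 1300],
    "Mar": [900, 300, 200],
})

# One output row per value of each pivot column (Jan, Feb, Mar)
pivoted = expenses.melt(id_vars=["Name"], value_vars=["Jan", "Feb", "Mar"],
                        var_name="Month", value_name="Q1_Expenses")
print(pivoted.sort_values("Name"))
```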
  • 80. 13 November 2019 Presentation titlePage 80 REVERSE PIVOT TRANSFORM (ROWS TO COLUMNS) • Creates one row of data from several existing rows. • The Reverse Pivot transform allows you to combine data from several rows into one row by creating new columns. • For each unique value in a pivot axis column and each selected pivot column, Data Services produces a column in the output data set. Data Integrator Transforms
  • 81. 13 November 2019 Presentation titlePage 81 REVERSE PIVOT TRANSFORM (ROWS TO COLUMNS) (CONT.) Input: Output: Data Integrator Transforms
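The reverse pivot corresponds to the opposite reshaping, turning the unique values of the pivot axis column (Month) back into columns. A pandas sketch with the same invented data:

```python
import pandas as pd

pivoted = pd.DataFrame({
    "Name": ["Joe", "Joe", "Joe", "Sid", "Sid", "Sid"],
    "Month": ["Jan", "Feb", "Mar"] * 2,
    "Q1_Expenses": [1100, 500, 900, 500, 1200, 300],
})

# One output column per unique value of the pivot axis column (Month)
wide = pivoted.pivot(index="Name", columns="Month", values="Q1_Expenses").reset_index()
print(wide[["Name", "Jan", "Feb", "Mar"]])
```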