This document provides an overview and demonstrations of data flow transformations in SQL Server Integration Services (SSIS). It begins with a question and answer section and an overview of split and join transformations such as the conditional split, multicast, union all, merge, and lookup transformations. Business intelligence transformations like the slowly changing dimension and term extraction transformations are also covered. The document concludes with demonstrations of the transformations in SSIS packages.
3. Recap and Q&A
Data Flow Transformations
Synchronous vs Asynchronous Transformations
Row Transformations
Demo: Character Map
Demo: Copy Column
Demo: Data Conversion
Demo: Derived Column
Demo: Export Column
Demo: OLE DB Command
Rowset Transformations
Demo: Aggregate
Demo: Sort
Demo: Pivot
Demo: Unpivot
Demo: Percentage Sampling
Demo: Row Sampling
@copyright 2014 (pramod_singla@yahoo.co.in)
4. Split and Join Transformations
These transformations distribute rows to different outputs, create copies of
the inputs, join multiple inputs into one output, and perform lookup
operations.
Transformation                      Description
Conditional Split Transformation    Routes data rows to different outputs.
Multicast Transformation            Distributes data sets to multiple outputs.
Union All Transformation            Merges multiple data sets.
Merge Transformation                Merges two sorted data sets.
Merge Join Transformation           Joins two data sets using a FULL, LEFT, or INNER join.
Lookup Transformation               Looks up values in a reference table using an exact match.
Cache Transform                     Writes data from a connected data source in the data flow to a Cache connection manager that saves the data to a cache file. The Lookup transformation performs lookups on the data in the cache file.
5. Conditional Split
The transformation that routes data rows to
different outputs.
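Similar to a CASE decision structure: conditions are evaluated in order,
and each row is sent to the first output whose condition is true.
You must specify the default output for the transformation; rows that
match no condition go there.
It has one input, one or more outputs, and one error output.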
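The routing rule can be sketched outside SSIS. The Python below is only an illustration of the CASE-like behaviour, not SSIS code; the function name conditional_split, the output names, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def conditional_split(
    rows: List[Row],
    conditions: List[Tuple[str, Callable[[Row], bool]]],
    default_output: str = "Default",
) -> Dict[str, List[Row]]:
    """Send each row to the first output whose condition is true."""
    outputs: Dict[str, List[Row]] = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break  # first match wins, like a CASE statement
        else:
            # No condition matched: the row goes to the default output.
            outputs[default_output].append(row)
    return outputs


# Hypothetical sample data: orders routed by amount.
orders = [{"id": 1, "amount": 50}, {"id": 2, "amount": 500}, {"id": 3, "amount": 5000}]
split = conditional_split(
    orders,
    [
        ("Large", lambda r: r["amount"] >= 1000),
        ("Medium", lambda r: r["amount"] >= 100),
    ],
)
```

Order matters here: row 3 also satisfies the "Medium" condition, but it is claimed by "Large" first.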
6. Multicast
The transformation that distributes data sets to
multiple outputs.
This capability is useful when the package needs
to apply multiple sets of transformations to the
same data
Multicast transformation directs every row to
every output
It has one input, multiple outputs, and no error
output.
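As a rough illustration of "every row to every output", the Python sketch below hands each output its own copy of the full input. The function name multicast and the sample rows are hypothetical, and the copying is only a modelling choice, not a statement about how SSIS manages buffers.

```python
import copy
from typing import Dict, List

Row = Dict[str, object]


def multicast(rows: List[Row], output_count: int) -> List[List[Row]]:
    """Return output_count outputs, each containing every input row."""
    # Each output gets an independent copy, so downstream steps
    # can modify their rows without affecting the other outputs.
    return [copy.deepcopy(rows) for _ in range(output_count)]


# Hypothetical usage: one branch aggregates, the other archives.
source = [{"id": 1, "qty": 2}, {"id": 2, "qty": 7}]
to_aggregate, to_archive = multicast(source, 2)
```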
7. Union All
The transformation that merges multiple data sets.
Inputs are added to output one after the other.
No reordering of rows occurs.
It has multiple inputs, one output, and no error
output.
8. Merge
The transformation that merges two sorted data
sets. The rows from each dataset are inserted into the
output based on values in their key columns.
It has two inputs, one output and no error output.
Use the Union All transformation instead of the Merge
transformation in the following situations:
The transformation inputs are not sorted.
The combined output does not need to be sorted.
The transformation has more than two inputs.
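The difference between the two transformations can be sketched in Python. This is an analogy only: union_all models "inputs are added one after the other", and merge_sorted models an interleave of two inputs that are already sorted on a key. Both function names and the sample rows are hypothetical.

```python
import heapq
from itertools import chain
from operator import itemgetter
from typing import Dict, List

Row = Dict[str, object]


def union_all(*inputs: List[Row]) -> List[Row]:
    """Append the inputs one after the other; no reordering, any number of inputs."""
    return list(chain(*inputs))


def merge_sorted(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Interleave two inputs that are ALREADY sorted on key; output stays sorted."""
    return list(heapq.merge(left, right, key=itemgetter(key)))


# Hypothetical sample inputs, each sorted on "id".
a = [{"id": 1}, {"id": 4}]
b = [{"id": 2}, {"id": 3}]

appended = union_all(a, b)            # ids in arrival order: 1, 4, 2, 3
interleaved = merge_sorted(a, b, "id")  # ids in key order: 1, 2, 3, 4
```

If the inputs were not sorted, merge_sorted would silently produce unsorted output, which mirrors why Merge requires sorted inputs and Union All does not.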
9. Merge Join
The transformation that joins two data sets using a
FULL, LEFT, or INNER join.
Requires sorted data for its inputs.
Specify whether the join is a FULL, LEFT, or INNER join.
Specify the columns the join uses.
Specify whether the transformation handles null
values as equal to other nulls.
It has two inputs, one output and no error output.
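The options above (join type, join column, null handling) can be sketched as a sort-merge join in Python. This is a simplified model under stated assumptions, not the SSIS implementation: it joins on a single column, assumes both inputs are sorted on that column with nulls (None) first, and returns pairs of (left row, right row). The names merge_join and nulls_equal are hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
Pair = Tuple[Optional[Row], Optional[Row]]


def _order(value):
    # Sort position of a key: None comes before any real value.
    return (value is not None, value)


def _groups(rows: List[Row], key: str):
    # Consecutive rows with the same key form one group (input must be sorted).
    return [(k, list(g)) for k, g in groupby(rows, key=lambda r: r[key])]


def merge_join(left: List[Row], right: List[Row], key: str,
               join_type: str = "INNER", nulls_equal: bool = False) -> List[Pair]:
    keep_left = join_type in ("LEFT", "FULL")
    keep_right = join_type == "FULL"
    lg, rg = _groups(left, key), _groups(right, key)
    out: List[Pair] = []
    i = j = 0
    while i < len(lg) and j < len(rg):
        (lk, lrows), (rk, rrows) = lg[i], rg[j]
        if _order(lk) < _order(rk):        # left key has no partner
            if keep_left:
                out.extend((l, None) for l in lrows)
            i += 1
        elif _order(lk) > _order(rk):      # right key has no partner
            if keep_right:
                out.extend((None, r) for r in rrows)
            j += 1
        else:
            if lk is not None or nulls_equal:
                out.extend((l, r) for l in lrows for r in rrows)
            else:                          # null keys, not treated as equal
                if keep_left:
                    out.extend((l, None) for l in lrows)
                if keep_right:
                    out.extend((None, r) for r in rrows)
            i += 1
            j += 1
    if keep_left:
        out.extend((l, None) for _, rows in lg[i:] for l in rows)
    if keep_right:
        out.extend((None, r) for _, rows in rg[j:] for r in rows)
    return out
```

Because both inputs are walked once in key order, the join never needs to hold a whole input in memory, which is the reason sorted inputs are required.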
10. Lookup
The transformation that looks up values in a reference
table using an exact match.
Uses either an OLE DB connection manager or a Cache
connection manager.
If there are multiple matches, returns only the first
match.
Lookup matching is case sensitive.
It has one input, a match output, a no match output, and an
error output.
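The matching rules above (exact match, first match wins, case sensitive, separate match and no match outputs) can be sketched in Python. The function name lookup and the sample reference data are hypothetical; this models the behaviour described on the slide rather than any SSIS API.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def lookup(rows: List[Row], reference: List[Row], key: str) -> Tuple[List[Row], List[Row]]:
    """Return (match output, no match output) for an exact-match lookup on key."""
    index: Dict[object, Row] = {}
    for ref in reference:
        # setdefault keeps the FIRST reference row seen for each key.
        index.setdefault(ref[key], ref)

    matched: List[Row] = []
    unmatched: List[Row] = []
    for row in rows:
        ref = index.get(row[key])  # exact, case-sensitive comparison
        if ref is None:
            unmatched.append(row)
        else:
            # Add the reference columns to the input row.
            matched.append({**row, **{k: v for k, v in ref.items() if k != key}})
    return matched, unmatched


# Hypothetical reference table with a duplicate key.
countries = [
    {"code": "US", "name": "United States"},
    {"code": "US", "name": "Duplicate entry"},
    {"code": "DE", "name": "Germany"},
]
found, missing = lookup([{"code": "US"}, {"code": "us"}, {"code": "FR"}], countries, "code")
```

In this sketch "us" lands in the no match output because the comparison is case sensitive.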
11. Cache
The Cache transformation generates a reference dataset for
the Lookup Transformation by writing data from a connected
data source in the data flow to a Cache connection
manager that saves the data to a cache file.
Writes only unique rows to the Cache connection manager.
In a single package, only one Cache Transform can write data
to the same Cache connection manager.
If the package contains multiple Cache Transforms, the first
Cache Transform that is called when the package runs writes
the data to the connection manager. The write operations of
subsequent Cache Transforms fail.
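The write-once rule can be modelled with a small Python class. Everything here is hypothetical (the class name CacheConnection, its write method, and the choice to define "unique" by a set of index columns); it only mirrors the behaviour described above and is not the SSIS object model.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, object]


class CacheConnection:
    """Toy stand-in for a Cache connection manager that accepts a single write."""

    def __init__(self) -> None:
        self.rows: Optional[List[Row]] = None

    def write(self, rows: List[Row], index_columns: Sequence[str]) -> None:
        if self.rows is not None:
            # A later Cache Transform targeting the same connection fails.
            raise RuntimeError("cache has already been written by another transform")
        seen = set()
        unique: List[Row] = []
        for row in rows:
            # Assumption: uniqueness is judged on the index columns.
            key = tuple(row[c] for c in index_columns)
            if key not in seen:
                seen.add(key)
                unique.append(row)
        self.rows = unique


cache = CacheConnection()
cache.write([{"code": "US"}, {"code": "US"}, {"code": "DE"}], ["code"])
```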
12. Business Intelligence Transformations
These transformations perform BI operations such as cleaning data, mining
text, and running data mining prediction queries.
Transformation                              Description
Slowly Changing Dimension Transformation    Configures the updating of a slowly changing dimension.
Fuzzy Grouping Transformation               Standardizes values in column data.
Fuzzy Lookup Transformation                 Looks up values in a reference table using a fuzzy match.
Term Extraction Transformation              Extracts terms from text.
Term Lookup Transformation                  Looks up terms in a reference table and counts terms extracted from text.
13. SCD
The transformation that configures the updating of a
slowly changing dimension.
Supports four types of changes: changing attribute,
historical attribute, fixed attribute, and inferred
member.
Only supports connections to SQL Server.
It has one input, up to six outputs, and no error output.
It requires at least one business key column, which must not contain nulls.
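How an incoming row might be sorted into one of the six outputs can be sketched in Python. The labels returned below are illustrative names, not the SSIS output names, and the order in which the checks run is an assumption made for this sketch. The function name classify_row and the "inferred" flag on dimension rows are also hypothetical.

```python
from typing import Dict, Sequence

Row = Dict[str, object]


def classify_row(row: Row, dimension: Dict[object, Row], business_key: str,
                 changing: Sequence[str] = (), historical: Sequence[str] = (),
                 fixed: Sequence[str] = ()) -> str:
    """Decide which kind of change an incoming row represents."""
    key = row[business_key]
    if key is None:
        raise ValueError("the business key must not be null")

    current = dimension.get(key)  # dimension maps business key -> current row
    if current is None:
        return "new"
    if current.get("inferred"):
        return "inferred_member_update"

    def differs(columns: Sequence[str]) -> bool:
        return any(row.get(c) != current.get(c) for c in columns)

    if differs(fixed):
        return "fixed_attribute"              # change is not allowed
    if differs(historical):
        return "historical_attribute_insert"  # keep history: insert a new version
    if differs(changing):
        return "changing_attribute_update"    # overwrite in place
    return "unchanged"


# Hypothetical dimension keyed on customer id.
customers = {
    1: {"id": 1, "email": "a@example.com", "city": "Oslo", "birth_year": 1980},
    2: {"id": 2, "inferred": True},
}
kind = classify_row(
    {"id": 1, "email": "a@example.com", "city": "Bergen", "birth_year": 1980},
    customers, "id",
    changing=["email"], historical=["city"], fixed=["birth_year"],
)
```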
15. Fuzzy Grouping
The transformation that standardizes values in column data.
Requires a connection to an instance of SQL Server.
Select the input columns to use when identifying duplicates,
and select the type of match—fuzzy or exact.
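Unlike the Lookup transformation, which uses an equi-join to locate matching
records, Fuzzy Grouping identifies rows that are likely to be duplicates and
selects a canonical row to use in standardizing the data.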
It has one input and one output. It does not support an error
output.
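The idea of grouping near-duplicates under one canonical value can be sketched in Python. This does not reproduce the SSIS matching algorithm: it swaps in difflib.SequenceMatcher from the Python standard library as a stand-in similarity measure, and the function name fuzzy_group, the 0.8 threshold, and the rule "first value seen becomes the canonical one" are all assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_group(values: List[str], threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Pair each value with the canonical value of its group."""
    canonical: List[str] = []          # one representative per group
    result: List[Tuple[str, str]] = []
    for value in values:
        for rep in canonical:
            # Compare case-insensitively using a standard-library similarity ratio.
            score = SequenceMatcher(None, value.lower(), rep.lower()).ratio()
            if score >= threshold:
                result.append((value, rep))
                break
        else:
            # No close match: this value starts a new group.
            canonical.append(value)
            result.append((value, value))
    return result


# Hypothetical company names with near-duplicates.
names = ["Microsoft Corp", "Microsoft Corp.", "Contoso Ltd", "microsoft corp"]
grouped = fuzzy_group(names)
```

Replacing each value with its canonical partner is what "standardizing" the column amounts to in this sketch.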
16. Term Extraction
The transformation that extracts terms from text.
Works only with English text.
Can extract nouns only, noun phrases only, or both; articles and pronouns
are not extracted.
You can use the Term Extraction transformation to discover the content of a
data set. For example, text that contains e-mail messages may provide useful
feedback about products, so that you could use the Term Extraction
transformation to extract the topics of discussion in the messages, as a way of
analyzing the feedback.
It has two output columns: Term contains the extracted terms, and Score
contains the score for each term.
Because articles and pronouns are dropped, the Term Extraction
transformation extracts the term "bicycle" from the text "the bicycle",
"my bicycle", and "that bicycle".