The ETL process contains three main steps: extraction, transformation, and loading. Data is extracted from source databases, transformed by applying business rules, and loaded into the target database. A full load populates data warehouse tables for the first time by loading all records, while an incremental load applies dynamic changes over time. A three-tier data warehouse has a source layer to land data, an integration layer to store transformed data, and a dimension layer as the presentation layer. Snapshots are read-only copies of master tables refreshed periodically, while materialized views are pre-computed aggregate tables created from fact and dimension tables with associated materialized view logs. PowerCenter processes large volumes of data, including data from ERP sources, and allows session partitioning.
1. Explain the ETL process. How many steps does ETL contain? Explain with an example.
ETL stands for Extraction, Transformation, and Loading.
Data is extracted from the source (database servers), business rules are applied to it, and the result is loaded into the target database.
The following steps are involved:
Define the source [ define the ODBC connection to the source database ]
Define the target [ create the ODBC connection to the target database ]
Create the mapping [ apply the business rules here by adding transformations, and define the data flow from source to target ]
Create the session [ instructions for running the mapping ]
Create the workflow [ instructions for running the sessions ]
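The extract, transform, and load phases above can be sketched in a few lines of Python. This is only an illustration, not how an ETL tool implements mappings or sessions: it assumes two in-memory SQLite databases standing in for the source and target, and the table names (src_orders, tgt_orders) and the business rule (clean up names, drop non-positive amounts) are hypothetical.

```python
import sqlite3

def extract(source):
    """Extract: read the raw rows from the (hypothetical) source table."""
    return source.execute("SELECT id, name, amount FROM src_orders").fetchall()

def transform(rows):
    """Transform: apply a sample business rule.
    Names are trimmed and upper-cased; rows with non-positive amounts are dropped."""
    return [(i, name.strip().upper(), amount)
            for i, name, amount in rows if amount > 0]

def load(target, rows):
    """Load: write the transformed rows into the (hypothetical) target table."""
    target.executemany(
        "INSERT INTO tgt_orders (id, name, amount) VALUES (?, ?, ?)", rows)
    target.commit()

def run_etl():
    # Two in-memory databases stand in for the source and target connections.
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE src_orders (id INTEGER, name TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?)",
        [(1, " alice ", 10.0), (2, "bob", -5.0), (3, "carol", 7.5)])
    target.execute(
        "CREATE TABLE tgt_orders (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT id, name, amount FROM tgt_orders ORDER BY id").fetchall()

if __name__ == "__main__":
    print(run_etl())
```

Here the three functions play the role of the mapping, and run_etl loosely plays the role of a session that executes it once.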
2. Explain the full load and the incremental (or refresh) load.
Initial Load: The process of populating all the data warehouse tables for the very first time.
Full Load: When the data is loaded for the first time, all the records in the set are loaded at a stretch, depending on the volume. A full load erases the entire contents of the tables and reloads them with fresh data.
Incremental Load: Applies the changes that occurred in the source during a specific period, as and when necessary. The schedule is predefined for each period.
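The difference between the two load types can be sketched as follows. This is a minimal illustration under stated assumptions: the customers table, its updated_at column, and the last_run watermark are hypothetical, and changes are detected by comparing ISO date strings, which is only one of several possible change-detection strategies.

```python
import sqlite3

def full_load(target, source_rows):
    """Full load: erase the table contents and reload everything."""
    target.execute("DELETE FROM customers")
    target.executemany(
        "INSERT INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        source_rows)
    target.commit()

def incremental_load(target, source_rows, last_run):
    """Incremental load: apply only the rows changed after the last run.
    ISO-8601 date strings compare correctly as plain strings."""
    changed = [row for row in source_rows if row[2] > last_run]
    # INSERT OR REPLACE inserts new ids and overwrites existing ones.
    target.executemany(
        "INSERT OR REPLACE INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        changed)
    target.commit()
    return len(changed)

def demo():
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)")
    day1 = [(1, "Alice", "2024-01-01"), (2, "Bob", "2024-01-01")]
    full_load(target, day1)
    # On day 2, one row was updated and one row is new.
    day2 = [(1, "Alice", "2024-01-01"),
            (2, "Robert", "2024-01-02"),
            (3, "Carol", "2024-01-02")]
    applied = incremental_load(target, day2, last_run="2024-01-01")
    rows = target.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return applied, rows

if __name__ == "__main__":
    print(demo())
```

The incremental run touches only the two changed rows, whereas repeating full_load would rewrite the whole table.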
3. What is a three-tier data warehouse?
A data warehouse can be thought of as a three-tier system.
The middle layer provides data to the end users in a usable and secure way.
The other two layers sit on either side of the middle tier: one faces the end users and the other faces the back-end data storage.
The 1st layer is known as the source layer, where the data lands.
The 2nd layer is known as the integration layer, where the data is stored after transformation.
The 3rd layer is known as the dimension layer, which serves as the actual presentation layer.
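As a rough sketch of how data moves through the three layers, the example below uses one in-memory SQLite database with one table per layer. The table names (src_sales, int_sales, rpt_product_sales) and the cleaning rules are assumptions made for illustration; a real warehouse would typically keep the layers in separate schemas or databases.

```python
import sqlite3

def build_layers():
    db = sqlite3.connect(":memory:")

    # Layer 1 (source): raw data lands as delivered, untyped and uncleaned.
    db.execute("CREATE TABLE src_sales (product TEXT, amount TEXT)")
    db.executemany(
        "INSERT INTO src_sales VALUES (?, ?)",
        [(" widget ", "10.50"), ("Widget", "4.50"), ("gadget", "20")])

    # Layer 2 (integration): data is stored after transformation
    # (names standardized, amounts cast to numbers).
    db.execute(
        "CREATE TABLE int_sales AS "
        "SELECT UPPER(TRIM(product)) AS product, CAST(amount AS REAL) AS amount "
        "FROM src_sales")

    # Layer 3 (presentation): data shaped for end-user queries.
    db.execute(
        "CREATE TABLE rpt_product_sales AS "
        "SELECT product, SUM(amount) AS total_amount "
        "FROM int_sales GROUP BY product")

    return db.execute(
        "SELECT product, total_amount FROM rpt_product_sales ORDER BY product"
    ).fetchall()

if __name__ == "__main__":
    print(build_layers())
```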
4. What are snapshots? What are materialized views and where do we use them? What is a materialized view log?
Snapshots
Snapshots are read-only copies of the data in a master table.
They are located on a remote node and are refreshed periodically to reflect the changes made to the master table.
They are replicas of tables.
Views
Views are built by using attributes of one or more tables.
A view based on a single table can generally be updated, whereas a view based on multiple tables generally cannot.
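Materialized Views and the Materialized View Log
A materialized view is a pre-computed table that holds aggregated or joined data from fact tables and dimension tables.
To put it simply, a materialized view is an aggregate table.
In Oracle, a materialized view log is a table associated with the master table that records the changes made to it, so that the materialized view can be refreshed incrementally instead of being rebuilt in full.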
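The contrast between a plain view and a materialized view can be sketched as below. SQLite has no materialized views, so this example emulates one with an ordinary table (mv_sales) that is rebuilt by a hypothetical refresh_mv helper; the table and view names are assumptions made for illustration. The plain view is recomputed on every query, while the emulated materialized view stays stale until it is refreshed.

```python
import sqlite3

def setup():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("CREATE TABLE fact_sales (product_id INTEGER, amount REAL)")
    db.executemany("INSERT INTO dim_product VALUES (?, ?)",
                   [(1, "widget"), (2, "gadget")])
    db.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                   [(1, 10.0), (1, 5.0), (2, 20.0)])
    # Plain view: only the query is stored; results are computed on demand.
    db.execute(
        "CREATE VIEW v_sales AS "
        "SELECT d.name AS name, SUM(f.amount) AS total "
        "FROM fact_sales f JOIN dim_product d ON d.product_id = f.product_id "
        "GROUP BY d.name")
    return db

def refresh_mv(db):
    """Emulated complete refresh: rebuild the pre-computed aggregate table."""
    db.execute("DROP TABLE IF EXISTS mv_sales")
    db.execute("CREATE TABLE mv_sales AS SELECT name, total FROM v_sales")
    db.commit()

def totals(db, relation):
    """Return {name: total} from the given view or table."""
    return dict(db.execute(f"SELECT name, total FROM {relation}").fetchall())

def demo():
    db = setup()
    refresh_mv(db)
    db.execute("INSERT INTO fact_sales VALUES (2, 1.0)")  # new fact row arrives
    view_now = totals(db, "v_sales")["gadget"]    # view sees the new row
    mv_stale = totals(db, "mv_sales")["gadget"]   # stored copy does not, yet
    refresh_mv(db)
    mv_fresh = totals(db, "mv_sales")["gadget"]
    return view_now, mv_stale, mv_fresh

if __name__ == "__main__":
    print(demo())
```

A database with native materialized views handles the refresh itself; the rebuild here corresponds to a complete refresh, not a log-based incremental one.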
5. What is the difference between PowerCenter and PowerMart?
PowerCenter
Processes large volumes of data
Can connect to ERP sources such as SAP, PeopleSoft, and Oracle Apps
Allows session partitioning to improve the performance of an ETL transaction
PowerMart
Processes low volumes of data
Does not provide connections to ERP sources
Does not allow session partitioning
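The general idea behind session partitioning, splitting the data into partitions that are processed in parallel, can be sketched generically in Python. This is not PowerCenter's implementation or API: the round-robin split, the thread pool, and the sample transformation (doubling each value) are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n):
    """Split rows into n partitions in round-robin order."""
    return [rows[i::n] for i in range(n)]

def transform_partition(rows):
    """Sample transformation applied independently to one partition."""
    return [value * 2 for value in rows]

def run_partitioned(rows, n=3):
    """Transform each partition in its own worker, then merge the results."""
    parts = partition(rows, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(transform_partition, parts))
    # Sorting makes the merged output independent of the partition layout.
    return sorted(value for part in results for value in part)

if __name__ == "__main__":
    print(run_partitioned([1, 2, 3, 4, 5, 6, 7]))
```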