Business Intelligence,
Data Warehousing &
ETL Concepts
Business Intelligence
3
Business Intelligence
How intelligent can you make your business processes?
What insight can you gain into your business?
How integrated can your business processes be?
How much more interactive can your business be with customers, partners,
employees and managers?
4
What is Business Intelligence (BI)?
Business Intelligence is a generalized term applied to a broad category of
applications and technologies for gathering, storing, analyzing and providing
access to data to help enterprise users make better business decisions
Business Intelligence applications include the activities of decision support
systems, query and reporting, online analytical processing (OLAP), statistical
analysis, forecasting, and data mining
An alternative way of describing BI is: the technology required to turn raw data
into information to support decision-making within corporations and business
processes
5
Why BI?
BI technologies help bring decision-makers the data in a form they can quickly
digest and apply to their decision making.
BI turns data into information for managers and executives and in general, people
making decisions in a company.
Companies want to use technology tactically to make their operations more
effective and more efficient - Business intelligence can be the catalyst for that
efficiency and effectiveness.
6
Benefits
The benefits of a well-planned BI implementation are going to be closely tied to
the business objectives driving the project.
Identify trends and anomalies in business operations more quickly, allowing
for more accurate and timelier decisions.
Deliver actionable insight and information to the right place with less effort.
Identify and operate based on a single version of the truth, allowing all
analysis to be completed on a core foundation with confidence.
7
Business Intelligence Components
[Diagram: Operational Data → EXTRACT, TRANSFORM, LOAD → Data Warehouse → OLAP, DATA MINING]
8
Business Intelligence Architecture
9
Business Intelligence Technologies
[Pyramid diagram, listed from bottom to top, with increasing potential to support business decisions]
Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Data Warehouses / Data Marts
Data Exploration: OLAP, DSS, EIS, Querying and Reporting
Data Mining: Information discovery
Data Presentation: Visualization Techniques
Decision Making
Roles, from the bottom of the pyramid to the top: DB Admin, Data Analyst, Business Analyst, End User
Data Warehousing
11
What is a Data Warehouse?
A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing. It usually contains historical data derived
from transaction data.
A data warehouse environment includes an extraction, transportation,
transformation, and loading (ETL) solution, online analytical processing (OLAP)
and data mining capabilities, client analysis tools, and other applications that
manage the process of gathering data and delivering it to business users.
It is a series of processes, procedures and tools (hardware and software) that help the
enterprise understand more about itself, its products, its customers and the
market it serves.
12
Why Data Warehousing?
The need for intelligent information in a competitive market
• Who are the potential customers?
• Which products are sold the most?
• What are the region-wise preferences?
• What are the competitor products?
• What are the projected sales?
• What if you sell more quantity of a particular product?
• What will be the impact on revenue?
• What are the results of the promotion schemes introduced?
13
OLTP vs. Data Warehouse
OLTP systems are tuned for known transactions and workloads while workload is
not known in a data warehouse
Special data organization, access methods and implementation methods are
needed to support data warehouse queries (typically multidimensional queries)
e.g., average amount spent on phone calls between 9AM-5PM in Pune during
the month of December
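The phone-call query above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the call_facts table, its columns and the sample rows are assumptions made for the example and do not come from the slides.

```python
import sqlite3

# Hypothetical call records; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_facts (city TEXT, call_date TEXT, call_hour INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO call_facts VALUES (?, ?, ?, ?)",
    [
        ("Pune", "2023-12-04", 10, 12.0),
        ("Pune", "2023-12-15", 16, 8.0),
        ("Pune", "2023-12-20", 20, 30.0),    # outside 9AM-5PM
        ("Mumbai", "2023-12-05", 11, 50.0),  # different city
        ("Pune", "2023-11-30", 12, 99.0),    # different month
    ],
)


def average_amount(conn, city, month, start_hour, end_hour):
    """Average amount, constrained on city, month and hour-of-day dimensions."""
    cur = conn.execute(
        """
        SELECT AVG(amount)
        FROM call_facts
        WHERE city = ?
          AND strftime('%m', call_date) = ?
          AND call_hour >= ? AND call_hour < ?
        """,
        (city, month, start_hour, end_hour),
    )
    return cur.fetchone()[0]


pune_december_average = average_amount(conn, "Pune", "12", 9, 17)
print(pune_december_average)
```

The query constrains three dimensions at once (place, month, time of day) and then aggregates, which is the access pattern an OLTP schema is usually not tuned for.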
14
OLTP vs. Data Warehouse
OLTP                      WAREHOUSE (DSS)
Application Oriented      Subject Oriented
Used to run business      Used to analyze business
Detailed data             Summarized and refined
Current up to date        Snapshot data
Isolated Data             Integrated Data
Repetitive access         Ad-hoc access
Clerical User             Knowledge User (Manager)
15
OLTP vs Data Warehouse
OLTP                                    DATA WAREHOUSE
Performance Sensitive                   Performance relaxed
Few Records accessed at a time (tens)   Large volumes accessed at a time (millions)
Read/Update Access                      Mostly Read (Batch Update)
No data redundancy                      Redundancy present
Database Size 100 MB - 100 GB           Database Size 100 GB - few terabytes
16
OLTP vs Data Warehouse
OLTP                                               Data Warehouse
Transaction throughput is the performance metric   Query throughput is the performance metric
Thousands of users                                 Hundreds of users
Managed in entirety                                Managed by subsets
17
Data Warehouse Architectures
Centralized:
In a centralized architecture, there is only one data warehouse, which stores
all data necessary for business analysis. As already shown in the previous section,
the disadvantage is lower performance compared with distributed approaches.
[Figure: Central Architecture]
18
Data Warehouse Architectures Contd…
Tiered:
A tiered architecture is a distributed data approach. This process
cannot be done in one step because many sources have to be
integrated into the warehouse.
On the first level, the data of all branches in one region is collected; on
the second level, the data from the regions is integrated into one
data warehouse.
Advantages:
• Faster response time because the data is located closer to the client applications
• Reduced volume of data to be searched
[Figure: Tiered Architecture]
19
Complete Warehouse Solution Architecture
[Diagram: Data Sources (Operational Data, Legacy Data, External Data Sources such as The Post and VISA) → Extract, Transform, Load → Enterprise Data Warehouse (organizationally structured) → Data Marts (departmentally structured: Sales, Inventory, Purchase). Metadata spans the Data Sources, Data Management and Access stages. Data → Information → Knowledge; Asset Assembly (and Management) → Asset Exploitation.]
20
Introduction To Data Marts
What is a Data Mart
From the data warehouse, atomic data flows to various departments for their
customized needs. If this data is periodically extracted from the data warehouse
and loaded into a local database, it becomes a data mart. The data in a data mart
has a different level of granularity than that of the data warehouse. Since the data
in data marts is highly customized and lightly summarized, the departments can
do whatever they want without worrying about resource utilization. The
departments can also use the analytical software they find convenient, and the cost of
processing becomes very low.
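A minimal sketch of the periodic extract described above, using two in-memory SQLite databases as stand-ins for the warehouse and the departmental data mart. The sales and monthly_sales tables, their columns and the refresh_data_mart helper are hypothetical names chosen for this illustration.

```python
import sqlite3

# Stand-in for the data warehouse: atomic (detail-level) rows.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (dept TEXT, sale_date TEXT, product TEXT, amount REAL)"
)
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Sales", "2024-01-03", "A", 100.0),
        ("Sales", "2024-01-17", "A", 50.0),
        ("Sales", "2024-02-02", "B", 70.0),
        ("HR", "2024-01-05", "Z", 999.0),
    ],
)


def refresh_data_mart(warehouse, mart, dept):
    """Rebuild one department's mart at a coarser (monthly) granularity."""
    rows = warehouse.execute(
        """
        SELECT substr(sale_date, 1, 7) AS sale_month, product, SUM(amount)
        FROM sales
        WHERE dept = ?
        GROUP BY sale_month, product
        ORDER BY sale_month, product
        """,
        (dept,),
    ).fetchall()
    with mart:  # commit the refresh as one unit
        mart.execute("DROP TABLE IF EXISTS monthly_sales")
        mart.execute(
            "CREATE TABLE monthly_sales (sale_month TEXT, product TEXT, total REAL)"
        )
        mart.executemany("INSERT INTO monthly_sales VALUES (?, ?, ?)", rows)
    return len(rows)


mart = sqlite3.connect(":memory:")  # stand-in for the local departmental database
loaded = refresh_data_mart(warehouse, mart, "Sales")
mart_rows = mart.execute(
    "SELECT sale_month, product, total FROM monthly_sales ORDER BY sale_month, product"
).fetchall()
print(loaded, mart_rows)
```

Running the refresh on a schedule (for example nightly) is left to whatever scheduler the department already uses.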
21
Data Mart Overview
Data marts satisfy 80% of the local end-users' requests.
[Diagram: the Data Warehouse feeds the data marts DM Marketing, DM Finance, DM Sales and DM HR, which serve Sales Representatives and Analysts; Human Resources; and Financial Analysts, Strategic Planners, and Executives]
22
From The Data Warehouse To Data Marts
[Diagram: Data Warehouse (organizationally structured) → data marts (departmentally structured) → individually structured data. Scale labels: Less, More; History, Normalized, Detailed; Data, Information]
23
Modeling Fundamentals:
What is a Data Model?
• A data model is a conceptual representation of the data structures
(tables) required for a database and is very powerful in
expressing and communicating the business requirements. A
data model is an abstract model that describes how data is
represented and used.
The term data model has two generally accepted meanings:
• A data model theory, i.e. a formal description of how data may
be structured and used.
• A data model instance, i.e. the result of applying a data model theory to
create a practical data model for some particular
application.
24
Modeling Fundamentals:
Types of Data Modeling
Logical Data Model (LDM) - A logical design is conceptual and abstract.
The process of logical design involves arranging data into a series of logical
relationships called entities and attributes.
• A logical data model includes all required entities, attributes, key groups, and
relationships that represent business information and define business rules.
[Figure: Logical Data Model]
25
Modeling Fundamentals:
Types of Data Modeling
Physical Data Model (PDM) - A physical data model is a representation
of a data design which takes into account the facilities and constraints of a
given database management system.
• A complete physical data model will include all the database artifacts
required to create relationships between tables or achieve performance
goals, such as indexes, constraint definitions, linking tables, partitioned
tables or clusters.
[Figure: Physical Data Model]
26
Modeling Fundamentals:
Types of Data Modeling
Entity relationship diagram (ERD) – A data model utilizing several
notations to depict data in terms of the entities and relationships described by
that data.
• Databases are used to store structured data. The structure of this data, together
with other constraints, can be designed using a variety of techniques, one of
which is called entity-relationship modeling or ERM.
[Figure: ERD]
27
Modeling Fundamentals:
Types of Data Modeling
Important Terminologies –
• Entity – The principal data object about which information is to be collected:
a class of persons, places, objects, events, or concepts about which we need to
capture and store data.
• Persons: agency, contractor, customer,
department, division, employee, instructor,
student, supplier.
• Places: sales region, building, room,
branch office, campus.
• Objects: book, machine, part, product, raw material,
software license, software package, tool, vehicle model,
vehicle.
• Events: application, award, cancellation, class, flight,
invoice, order, registration, renewal, requisition,
reservation, sale, trip.
• Concepts: account, block of time, bond, course, fund,
qualification, stock.
28
Modeling Fundamentals:
Types of Data Modeling
• Relationship – A natural business association that exists between
one or more entities. The relationship may represent an event that
links the entities or merely a logical affinity that exists between the
entities.
• Examples of relationships:
  • Employees are assigned to projects
  • Students enroll in a curriculum
  • Projects have subtasks
  • Departments manage one or more projects
[Diagram: STUDENT "is enrolled in" CURRICULUM; CURRICULUM "is being studied by" STUDENT]
29
Modeling Fundamentals:
Types of Data Modeling
Dimensional Data Modeling (DDM) - Dimensional modeling is the
design concept used by many data warehouse designers to build their data
warehouse.
• It is a logical design technique that seeks to present the data in a standard, intuitive
framework that allows for high-performance access. It adheres to a discipline that
uses the relational model with some important restrictions.
• Every dimensional model is composed of one table with a multi-part key, called
the fact table, and a set of smaller tables called dimension tables.
Components of a DM:
• Fact Table
• Dimension table
• Attributes
• Good examples of dimensions are location, product, time, promotion,
organization etc. Dimension tables store records related to that particular
dimension and no facts (measures) are stored in these tables.
• A fact (measure) table contains measures (sales gross value, total units sold) and
dimension columns. These dimension columns are actually foreign keys from the
respective dimension tables.
30
Types of Data Modeling
Why Dimensional Modeling?
• End users cannot understand or navigate ER models
• Software cannot usefully query an ER model
• Use of ER modeling techniques defeats intuitive and high-performance retrieval of
data
When the designer places understandability and
performance as the highest goals . . .
Dimensional Modeling is the natural approach
31
What is a Star Schema ?
Each dimension table has
a single-part primary key
that corresponds exactly to
one of the components of
the multi-part key in the
fact table. This
characteristic "star-like"
structure is often called a
star-schema.
32
What is a Star Schema?
• The star schema model is essentially a method of storing data that is multidimensional
in nature in a relational database. It consists of a single "fact table"
with a compound primary key, with one segment for each "dimension", and with
additional columns of additive, numeric facts.
[Diagram: SALES fact table in the center, linked to the Customer, Organization, Time, Product and Channel dimensions]
The star schema makes multi-dimensional database (MDDB)
functionality possible using a traditional relational database.
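A small star schema can be sketched in SQLite as follows. Only three of the five dimensions from the diagram are modeled to keep the example short, and every table name, column name and data row here is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one single-part primary key each.
    CREATE TABLE product_dim  (product_id  INTEGER PRIMARY KEY, product_name  TEXT);
    CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE time_dim     (time_id     INTEGER PRIMARY KEY, month         TEXT);

    -- Fact table: compound primary key with one segment per dimension,
    -- plus an additive numeric fact.
    CREATE TABLE sales_fact (
        product_id   INTEGER REFERENCES product_dim(product_id),
        customer_id  INTEGER REFERENCES customer_dim(customer_id),
        time_id      INTEGER REFERENCES time_dim(time_id),
        sales_amount REAL,
        PRIMARY KEY (product_id, customer_id, time_id)
    );

    INSERT INTO product_dim  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO customer_dim VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO time_dim     VALUES (1, '2024-01'), (2, '2024-02');
    INSERT INTO sales_fact   VALUES
        (1, 1, 1, 100.0), (1, 2, 1, 40.0), (2, 1, 2, 75.0), (1, 1, 2, 10.0);
    """
)

# Join the fact table to two dimensions and add up the facts.
sales_by_product_month = conn.execute(
    """
    SELECT p.product_name, t.month, SUM(f.sales_amount)
    FROM sales_fact f
    JOIN product_dim p ON p.product_id = f.product_id
    JOIN time_dim    t ON t.time_id    = f.time_id
    GROUP BY p.product_name, t.month
    ORDER BY p.product_name, t.month
    """
).fetchall()
print(sales_by_product_month)
```

The dimension attributes (product_name, month) become the row headers of the answer set, while the fact column is summed.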
33
Fact Tables
• A fact table, because it has a multi-part primary key made up of
two or more foreign keys, always expresses a many-to-many
relationship.
• The most useful fact tables also contain one or more numerical
measures, or "facts," that occur for the combination of keys
that define each record.
• The most useful facts in a fact table are numeric. Numeric
addition is crucial because data warehouse applications rarely
retrieve a single fact table record. Rather, they retrieve
hundreds, thousands, or even millions of these records at a
time, and the only useful thing to do with so many records is to
add them up.
34
Defining Fact Table Structure
Fact table: Fact Item Day Store
Key columns (foreign keys to the Item, Week and Store dimensions): ITEM_ID, WEEK_ID, STORE_ID
Fact columns: SALES_DOLLARS, SALES_UNITS
[Figure: Fact Table Structure]
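Because the fact columns are additive, rows of this fact table can be rolled up along any key. The sketch below uses the column names from the slide; the sample rows and the roll_up helper are made up for illustration.

```python
from collections import defaultdict

# Sample fact rows using the slide's column names; the values are invented.
fact_rows = [
    {"ITEM_ID": 1, "WEEK_ID": 1, "STORE_ID": 10, "SALES_DOLLARS": 20.0, "SALES_UNITS": 4},
    {"ITEM_ID": 2, "WEEK_ID": 1, "STORE_ID": 10, "SALES_DOLLARS": 30.0, "SALES_UNITS": 3},
    {"ITEM_ID": 1, "WEEK_ID": 2, "STORE_ID": 10, "SALES_DOLLARS": 10.0, "SALES_UNITS": 2},
    {"ITEM_ID": 1, "WEEK_ID": 1, "STORE_ID": 20, "SALES_DOLLARS": 5.0, "SALES_UNITS": 1},
]


def roll_up(rows, key):
    """Sum the additive facts for each distinct value of one key column."""
    totals = defaultdict(lambda: {"SALES_DOLLARS": 0.0, "SALES_UNITS": 0})
    for row in rows:
        bucket = totals[row[key]]
        bucket["SALES_DOLLARS"] += row["SALES_DOLLARS"]
        bucket["SALES_UNITS"] += row["SALES_UNITS"]
    return dict(totals)


by_store = roll_up(fact_rows, "STORE_ID")
by_week = roll_up(fact_rows, "WEEK_ID")
print(by_store)
print(by_week)
```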
35
What is a Dimension?
A Data Warehouse is
• Subject-Oriented
• Integrated
• Time-Variant
• Non-volatile
Subject Dimension
In a dimensional model, the context of the measurements is represented in
dimension tables.
The dimension attributes are the various columns in a dimension table.
36
What are Slowly Changing Dimensions?
Slowly changing dimensions are dimensions where a "constant"
actually evolves slowly and asynchronously.
"Dimensions have been assumed to be independent of time"
In the real world this is not
strictly true.
Examples: people change their name, get married or divorced.
37
Three Methods…
The three choices for dealing with slowly changing
dimensions are:
Type 1:
  Approach: Overwriting the old values in the dimension record
  Results: Losing the ability to track the old history
Type 2:
  Approach: Creating an additional dimension record at the time of the change with the new attribute values
  Results: Segmenting history very accurately between the old description and the new description
Type 3:
  Approach: Creating new "current" fields
  Results: Describe history both
38
Type one
Implementing Type 1:
• Overwrite the field with the new value
• No effect anywhere else in the database
Scenarios where applicable:
• When the original data was in error
• When there is no value in keeping the old description/attribute
Advantages
• Easy to implement
• No key affected
Disadvantages
• History is lost
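A Type 1 change can be sketched as a plain in-place update. The customer_dim table, its columns and the overwrite_name helper are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute("INSERT INTO customer_dim VALUES (1, 'Jane Smith')")


def overwrite_name(conn, customer_key, new_name):
    """Type 1: overwrite the attribute in place; the key is unchanged."""
    with conn:
        conn.execute(
            "UPDATE customer_dim SET name = ? WHERE customer_key = ?",
            (new_name, customer_key),
        )


overwrite_name(conn, 1, "Jane Jones")
type1_rows = conn.execute("SELECT customer_key, name FROM customer_dim").fetchall()
print(type1_rows)
```

After the update there is still a single row with the same key, and the old name can no longer be recovered from the dimension.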
39
Type two
Implementing Type 2:
• Create a new record with a unique key
• Generalize the dimension key by adding 2 or 3 version digits to the end of the
key
Scenarios where applicable:
• Most commonly used where history is of importance
Advantages
• Automatically partitions history
• No time constraints required
Disadvantages
• Abrupt point-in-time constraints are not effective
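A Type 2 change can be sketched as inserting a new dimension row under a new key. This sketch uses an auto-incremented surrogate key rather than appended version digits, and adds an is_current flag; both choices, along with the table and column names, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_dim (
        customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- unique key per version
        customer_id  TEXT,                               -- key from the source system
        name         TEXT,
        is_current   INTEGER                             -- assumed helper flag
    )
    """
)
conn.execute(
    "INSERT INTO customer_dim (customer_id, name, is_current) VALUES ('C001', 'Jane Smith', 1)"
)


def apply_type2_change(conn, customer_id, new_name):
    """Type 2: keep the old row and add a new row with its own key."""
    with conn:
        conn.execute(
            "UPDATE customer_dim SET is_current = 0 WHERE customer_id = ? AND is_current = 1",
            (customer_id,),
        )
        cur = conn.execute(
            "INSERT INTO customer_dim (customer_id, name, is_current) VALUES (?, ?, 1)",
            (customer_id, new_name),
        )
    return cur.lastrowid


new_key = apply_type2_change(conn, "C001", "Jane Jones")
type2_rows = conn.execute(
    "SELECT customer_key, customer_id, name, is_current FROM customer_dim ORDER BY customer_key"
).fetchall()
print(new_key, type2_rows)
```

Fact rows loaded before the change keep pointing at the old key and fact rows loaded afterwards use the new key, which is how history ends up partitioned automatically.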
40
Dimension Tables
• Dimension tables most often contain descriptive textual information.
• Dimension attributes are used as the source of most of the interesting constraints
in data warehouse queries, and they are virtually always the source of the row
headers in the SQL answer set.
• The power of the data warehouse is proportional to the
quality and depth of the dimension tables.
41
Attributes in a Dimension Table
• Allows users to constrain data by one or more attributes.
• Allows users to define aggregation levels for data.
DEPT: Dept 1, Dept 2
CLASS      SALES
Class 101  1000
Class 120  1100
Class 133  1900
Class 127  2100
Class 141  1500
Class 145  1800
• Present Classes by Departments
• Aggregate by Class
• Qualify by Department
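The two uses of attributes, constraining and aggregating, can be sketched with the sales figures from the slide. The assignment of classes to departments is not recoverable from the slide text, so the split below (first three classes in Dept 1, last three in Dept 2) is purely an assumption, as are the function names.

```python
from collections import defaultdict

# (dept, class, sales); the class-to-department assignment is ASSUMED.
rows = [
    ("Dept 1", "Class 101", 1000),
    ("Dept 1", "Class 120", 1100),
    ("Dept 1", "Class 133", 1900),
    ("Dept 2", "Class 127", 2100),
    ("Dept 2", "Class 141", 1500),
    ("Dept 2", "Class 145", 1800),
]


def sales_by_department(rows):
    """Aggregate class-level sales up to the department level."""
    totals = defaultdict(int)
    for dept, _class_name, sales in rows:
        totals[dept] += sales
    return dict(totals)


def classes_in_department(rows, dept):
    """Qualify (constrain) the rows by one department."""
    return [(class_name, sales) for d, class_name, sales in rows if d == dept]


dept_totals = sales_by_department(rows)
dept2_classes = classes_in_department(rows, "Dept 2")
print(dept_totals)
print(dept2_classes)
```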
42
Basic Dimensional Model
ETL Concepts
44
ETL (Extract, Transform, Load)
ETL refers to the methods involved in accessing and manipulating source
data and loading it into the target database. During the ETL process, most often,
data is extracted from an OLTP database, transformed to match the data
warehouse schema, and loaded into the data warehouse database.
45
WHAT IS ETL?
E → EXTRACT: extract data from disparate sources
T → TRANSFORM: transform data
L → LOAD: load data where we want it
46
EXTRACTION (Data Capturing)
The ETL extraction element is responsible for extracting data from the source system.
During extraction, data may be removed from the source system or a copy made and the
original data retained in the source system.
47
EXTRACTION (Data Transmission)
Legacy systems may require too much effort to implement such offload processes, so
legacy data is often copied into the data warehouse, leaving the original data in place.
Extracted data is loaded into the data warehouse staging area (a relational database
usually separate from the data warehouse database), for manipulation by the remaining
ETL processes.
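Copying source data into a separate staging area, while leaving the original in place, might look like the sketch below. Two in-memory SQLite databases stand in for the source system and the staging database; the orders and stg_orders tables and the extract_to_staging helper are hypothetical.

```python
import sqlite3

# Stand-in for the source (OLTP or legacy) system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, status TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", "on"), (2, "Globex", "0")],
)

# Stand-in for the staging area, kept separate from the warehouse itself.
staging = sqlite3.connect(":memory:")
staging.execute("CREATE TABLE stg_orders (order_id INTEGER, customer TEXT, status TEXT)")


def extract_to_staging(source, staging):
    """Copy rows as-is into staging; the source rows are not modified."""
    rows = source.execute("SELECT order_id, customer, status FROM orders").fetchall()
    with staging:
        staging.execute("DELETE FROM stg_orders")  # start each run from an empty table
        staging.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)
    return len(rows)


copied = extract_to_staging(source, staging)
source_count = source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
staged_count = staging.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
print(copied, source_count, staged_count)
```

No transformation happens at this step; the raw values (including the inconsistent status codes) are cleaned up later in the staging area.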
48
EXTRACTION (Cleansing Process)
Data extraction is generally performed within the source system itself.
Data extraction processes can be implemented using Transact-SQL stored procedures,
Data Transformation Services (DTS) tasks, or custom applications developed in
programming or scripting languages.
49
TRANSFORMATION
The ETL transformation element is responsible for data validation, data accuracy, data
type conversion, and business rule application. An ETL system that uses inline
transformations during extraction is less robust and flexible than one that confines
transformations to the transformation element. Transformations performed in the OLTP
system impose a performance burden on the OLTP database.
50
TRANSFORMATION (contd.)
Data Validation
Check that all rows in the fact table match rows in dimension tables to enforce data
integrity.
Data Accuracy
Ensure that fields contain appropriate values, such as only "off" or "on" in a status field.
Data Type Conversion
Ensure that all values for a specified field are stored the same way in the data
warehouse regardless of how they were stored in the source system. For example, if
one source system stores "off" or "on" in its status field and another source system
stores "0" or "1" in its status field, then a data type conversion transformation converts
the content of one or both of the fields to a specified common value such as "off" or
"on".
Business Rule Application
Ensure that the rules of the business are enforced on the data stored in the
warehouse. For example, check that all customer records contain values for both
FirstName and LastName fields.
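The checks above can be sketched as small Python functions. The function names, the STATUS_MAP table and the sample key column are illustrative assumptions; the "off"/"on" values and the FirstName/LastName fields come from the slide.

```python
# Data type conversion: map every source encoding to one common value.
STATUS_MAP = {"on": "on", "off": "off", "1": "on", "0": "off"}


def convert_status(raw):
    """Return 'on' or 'off'; reject anything else (data accuracy check)."""
    key = str(raw).strip().lower()
    if key not in STATUS_MAP:
        raise ValueError(f"unexpected status value: {raw!r}")
    return STATUS_MAP[key]


def passes_business_rule(customer):
    """Business rule: both FirstName and LastName must have a value."""
    return bool(customer.get("FirstName")) and bool(customer.get("LastName"))


def orphan_fact_rows(fact_rows, dimension_keys, key_column):
    """Data validation: fact rows whose key has no matching dimension row."""
    return [row for row in fact_rows if row[key_column] not in dimension_keys]


print(convert_status("1"), convert_status("OFF"))
print(passes_business_rule({"FirstName": "Jane", "LastName": ""}))
print(orphan_fact_rows([{"STORE_ID": 10}, {"STORE_ID": 99}], {10, 20}, "STORE_ID"))
```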
51
LOADING
The ETL loading element is responsible for loading transformed data into the data
warehouse database.
Data warehouses are usually updated periodically rather than continuously, and large
numbers of records are often loaded to multiple tables in a single data load.
The data warehouse is often taken offline during update operations so that data can be
loaded faster and SQL Server 2000 Analysis Services can update OLAP cubes to
incorporate the new data. BULK INSERT, bcp, and the Bulk Copy API are the best tools
for data loading operations.
The design of the loading element should focus on efficiency and performance to
minimize the data warehouse offline time.
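As a rough, database-neutral illustration of loading many records into several tables in a single load, the sketch below uses sqlite3's executemany inside one transaction. It is a stand-in for the idea, not for the SQL Server bulk tools named above; the table names and the load_batch helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_dim (store_id INTEGER PRIMARY KEY, region TEXT)")
conn.execute(
    """
    CREATE TABLE sales_fact (
        store_id INTEGER, week_id INTEGER, sales_dollars REAL,
        PRIMARY KEY (store_id, week_id)
    )
    """
)


def load_batch(conn, dimension_rows, fact_rows):
    """Load both tables in one transaction: all rows commit or none do."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO store_dim VALUES (?, ?)", dimension_rows)
        conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", fact_rows)
    return len(dimension_rows) + len(fact_rows)


loaded = load_batch(
    conn,
    [(1, "North"), (2, "South")],
    [(1, 1, 100.0), (2, 1, 80.0)],
)

# A batch containing a duplicate fact key fails and is rolled back as a whole.
try:
    load_batch(conn, [(3, "West")], [(1, 1, 5.0)])
except sqlite3.IntegrityError:
    pass

store_count = conn.execute("SELECT COUNT(*) FROM store_dim").fetchone()[0]
print(loaded, store_count)
```

Treating the load as one unit keeps the warehouse consistent if a batch fails partway through.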
52
ETL Tools
What are ETL Tools?
ETL tools are meant to extract, transform and load data into the data warehouse for
decision making. Before the evolution of ETL tools, the ETL process described above
was done manually, using SQL code written by programmers. This task was tedious
and cumbersome in many cases, since it involved many resources, complex coding and
more work hours. On top of that, maintaining the code posed a great challenge for the
programmers.
Selecting an appropriate ETL tool is the most important decision that has to be made
when choosing the components of a data warehousing application. The ETL tool
operates at the heart of the data warehouse, extracting data from multiple data sources,
transforming the data to make it accessible to business analysis, and loading multiple
target databases.
53
Features of ETL Tools
ETL tools can extract data from various sources, such as RDBMSs,
DB2, COBOL data files and flat files, at scheduled intervals, perform the required
transformations and load the data into a data warehouse that resides on an RDBMS.
ETL tools can connect to an RDBMS and get the list of tables and their
attributes. The general steps for designing an ETL process are:
• Define the structure of the source data
• Define the structure of the destination data
• Map elements of the source data to elements of the destination data
• Define the transformations required, such as changing values or summing
• Schedule the execution of the process
Once executed, the process generates a log showing the status of the process,
the number of records inserted, etc. Various reports about the processes are available,
which can form the metadata.
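The design steps above can be sketched in a few lines of Python. The field names, FIELD_MAP, TRANSFORMS and run_etl are hypothetical, and scheduling is left to an external scheduler; this is an illustration of the steps, not how any particular ETL tool works.

```python
# Map elements of the (assumed) source structure to the destination structure.
FIELD_MAP = {"cust_name": "customer_name", "amt": "amount"}

# Transformations required, keyed by destination field.
TRANSFORMS = {"customer_name": str.strip, "amount": float}


def run_etl(source_rows, field_map, transforms):
    """Apply the mapping and transformations; return loaded rows and a run log."""
    loaded, rejected = [], 0
    for row in source_rows:
        try:
            target = {
                dst: transforms.get(dst, lambda value: value)(row[src])
                for src, dst in field_map.items()
            }
        except (KeyError, ValueError):
            rejected += 1  # missing field or value that cannot be converted
            continue
        loaded.append(target)
    log = {
        "status": "completed",
        "records_inserted": len(loaded),
        "records_rejected": rejected,
    }
    return loaded, log


source_rows = [
    {"cust_name": " Acme ", "amt": "10.5"},
    {"cust_name": "Globex", "amt": "n/a"},  # cannot be converted to a number
]
loaded_rows, run_log = run_etl(source_rows, FIELD_MAP, TRANSFORMS)
print(loaded_rows)
print(run_log)
```

The returned log plays the role of the process log described above: it records the status and the number of records inserted for each run.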
Bur Dubai Call Girl Service #$# O56521286O Call Girls In Bur Dubaiparisharma5056
 
Dubai Call Girls Starlet O525547819 Call Girls Dubai Showen Dating
Dubai Call Girls Starlet O525547819 Call Girls Dubai Showen DatingDubai Call Girls Starlet O525547819 Call Girls Dubai Showen Dating
Dubai Call Girls Starlet O525547819 Call Girls Dubai Showen Datingkojalkojal131
 
Résumé (2 pager - 12 ft standard syntax)
Résumé (2 pager -  12 ft standard syntax)Résumé (2 pager -  12 ft standard syntax)
Résumé (2 pager - 12 ft standard syntax)Soham Mondal
 
Call Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)
Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)
Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)sonalinghatmal
 
Production Day 1.pptxjvjbvbcbcb bj bvcbj
Production Day 1.pptxjvjbvbcbcb bj bvcbjProduction Day 1.pptxjvjbvbcbcb bj bvcbj
Production Day 1.pptxjvjbvbcbcb bj bvcbjLewisJB
 
Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...
Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...
Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...poojakaurpk09
 
Joshua Minker Brand Exploration Sports Broadcaster .pptx
Joshua Minker Brand Exploration Sports Broadcaster .pptxJoshua Minker Brand Exploration Sports Broadcaster .pptx
Joshua Minker Brand Exploration Sports Broadcaster .pptxsportsworldproductio
 
CFO_SB_Career History_Multi Sector Experience
CFO_SB_Career History_Multi Sector ExperienceCFO_SB_Career History_Multi Sector Experience
CFO_SB_Career History_Multi Sector ExperienceSanjay Bokadia
 
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...rightmanforbloodline
 
Delhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Escorts Service Cambridge Layout ☎ 7737669865☎ Book Your One night Stand (Ba...
Escorts Service Cambridge Layout  ☎ 7737669865☎ Book Your One night Stand (Ba...Escorts Service Cambridge Layout  ☎ 7737669865☎ Book Your One night Stand (Ba...
Escorts Service Cambridge Layout ☎ 7737669865☎ Book Your One night Stand (Ba...amitlee9823
 
Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)amitlee9823
 
Top Rated Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Call Girls in Nagpur High Profile
 
Call Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 

Kürzlich hochgeladen (20)

Book Paid Saswad Call Girls Pune 8250192130Low Budget Full Independent High P...
Book Paid Saswad Call Girls Pune 8250192130Low Budget Full Independent High P...Book Paid Saswad Call Girls Pune 8250192130Low Budget Full Independent High P...
Book Paid Saswad Call Girls Pune 8250192130Low Budget Full Independent High P...
 
Dark Dubai Call Girls O525547819 Skin Call Girls Dubai
Dark Dubai Call Girls O525547819 Skin Call Girls DubaiDark Dubai Call Girls O525547819 Skin Call Girls Dubai
Dark Dubai Call Girls O525547819 Skin Call Girls Dubai
 
Brand Analysis for reggaeton artist Jahzel.
Brand Analysis for reggaeton artist Jahzel.Brand Analysis for reggaeton artist Jahzel.
Brand Analysis for reggaeton artist Jahzel.
 
Call Girls Devanahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Devanahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Devanahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Devanahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
CALL ON ➥8923113531 🔝Call Girls Nishatganj Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Nishatganj Lucknow best sexual serviceCALL ON ➥8923113531 🔝Call Girls Nishatganj Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Nishatganj Lucknow best sexual service
 
Bur Dubai Call Girl Service #$# O56521286O Call Girls In Bur Dubai
Bur Dubai Call Girl Service #$# O56521286O Call Girls In Bur DubaiBur Dubai Call Girl Service #$# O56521286O Call Girls In Bur Dubai
Bur Dubai Call Girl Service #$# O56521286O Call Girls In Bur Dubai
 
Dubai Call Girls Starlet O525547819 Call Girls Dubai Showen Dating
Dubai Call Girls Starlet O525547819 Call Girls Dubai Showen DatingDubai Call Girls Starlet O525547819 Call Girls Dubai Showen Dating
Dubai Call Girls Starlet O525547819 Call Girls Dubai Showen Dating
 
Résumé (2 pager - 12 ft standard syntax)
Résumé (2 pager -  12 ft standard syntax)Résumé (2 pager -  12 ft standard syntax)
Résumé (2 pager - 12 ft standard syntax)
 
Call Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Bidadi Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)
Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)
Toxicokinetics studies.. (toxicokinetics evaluation in preclinical studies)
 
Production Day 1.pptxjvjbvbcbcb bj bvcbj
Production Day 1.pptxjvjbvbcbcb bj bvcbjProduction Day 1.pptxjvjbvbcbcb bj bvcbj
Production Day 1.pptxjvjbvbcbcb bj bvcbj
 
Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...
Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...
Virgin Call Girls Delhi Service-oriented sexy call girls ☞ 9899900591 ☜ Rita ...
 
Joshua Minker Brand Exploration Sports Broadcaster .pptx
Joshua Minker Brand Exploration Sports Broadcaster .pptxJoshua Minker Brand Exploration Sports Broadcaster .pptx
Joshua Minker Brand Exploration Sports Broadcaster .pptx
 
CFO_SB_Career History_Multi Sector Experience
CFO_SB_Career History_Multi Sector ExperienceCFO_SB_Career History_Multi Sector Experience
CFO_SB_Career History_Multi Sector Experience
 
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
 
Delhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls South Delhi 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Escorts Service Cambridge Layout ☎ 7737669865☎ Book Your One night Stand (Ba...
Escorts Service Cambridge Layout  ☎ 7737669865☎ Book Your One night Stand (Ba...Escorts Service Cambridge Layout  ☎ 7737669865☎ Book Your One night Stand (Ba...
Escorts Service Cambridge Layout ☎ 7737669865☎ Book Your One night Stand (Ba...
 
Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Call Girls Bidadi ☎ 7737669865☎ Book Your One night Stand (Bangalore)
 
Top Rated Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Deccan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
 
Call Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hosur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 

DW & ETL Concepts

  • 3. Business Intelligence: How intelligent can you make your business processes? What insight can you gain into your business? How integrated can your business processes be? How much more interactive can your business be with customers, partners, employees and managers?
  • 4. What is Business Intelligence (BI)? Business Intelligence is a generalized term applied to a broad category of applications and technologies for gathering, storing, analyzing and providing access to data to help enterprise users make better business decisions. Business Intelligence applications include the activities of decision support systems, query and reporting, online analytical processing (OLAP), statistical analysis, forecasting, and data mining. An alternative way of describing BI is: the technology required to turn raw data into information to support decision-making within corporations and business processes.
  • 5. Why BI? BI technologies bring decision-makers the data in a form they can quickly digest and apply to their decision making. BI turns data into information for managers, executives and, in general, the people making decisions in a company. Companies want to use technology tactically to make their operations more effective and more efficient, and business intelligence can be the catalyst for that efficiency and effectiveness.
  • 6. Benefits: The benefits of a well-planned BI implementation are closely tied to the business objectives driving the project. Identify trends and anomalies in business operations more quickly, allowing for more accurate and timelier decisions. Deliver actionable insight and information to the right place with less effort. Identify and operate based on a single version of the truth, allowing all analysis to be completed on a core foundation with confidence.
  • 9. Business Intelligence Technologies (layered diagram, listed from data sources up to decision making, with increasing potential to support business decisions): Data Sources (paper, files, information providers, database systems, OLTP); Data Warehouses / Data Marts; Data Exploration (OLAP, DSS, EIS, querying and reporting); Data Mining (information discovery); Data Presentation (visualization techniques); Decision Making. Roles shown alongside the layers: End User, Business Analyst, Data Analyst, DB Admin.
  • 11. What is a Data Warehouse? A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data. A data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, online analytical processing (OLAP) and data mining capabilities, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users. It is a series of processes, procedures and tools (hardware and software) that help the enterprise understand more about itself, its products, its customers and the market it serves.
  • 12. Why Data Warehousing? The need for intelligent information in a competitive market: Who are the potential customers? Which products are sold the most? What are the region-wise preferences? What are the competitor products? What are the projected sales? What if you sell more of a particular product? What will be the impact on revenue? What were the results of the promotion schemes introduced?
  • 13. OLTP vs. Data Warehouse: OLTP systems are tuned for known transactions and workloads, while the workload is not known in advance in a data warehouse. Special data organization, access methods and implementation methods are needed to support data warehouse queries (typically multidimensional queries), e.g., the average amount spent on phone calls between 9 AM and 5 PM in Pune during the month of December.
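The phone-call example on this slide can be sketched as a query that constrains three dimensions at once (place, time of day, month). This is only an illustrative sketch: the `call_fact` table, its columns and the sample rows are assumptions, and SQLite stands in for whatever warehouse database is actually used.

```python
import sqlite3

# Hypothetical call records; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE call_fact (city TEXT, call_start TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO call_fact VALUES (?, ?, ?)",
    [
        ("Pune", "2023-12-04 09:30:00", 12.0),
        ("Pune", "2023-12-11 16:45:00", 8.0),
        ("Pune", "2023-12-11 20:15:00", 30.0),    # outside 9 AM to 5 PM
        ("Pune", "2023-11-20 10:00:00", 50.0),    # not December
        ("Mumbai", "2023-12-05 11:00:00", 40.0),  # different city
    ],
)

# Constrain by city, month and time of day, then aggregate.
avg_amount = conn.execute(
    """
    SELECT AVG(amount)
    FROM call_fact
    WHERE city = 'Pune'
      AND strftime('%m', call_start) = '12'
      AND time(call_start) >= '09:00:00'
      AND time(call_start) <  '17:00:00'
    """
).fetchone()[0]

print(avg_amount)
```

Only the first two rows satisfy all three constraints, so the query averages just those two amounts.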
  • 14. OLTP vs. Data Warehouse. OLTP: application oriented; used to run the business; detailed data; current, up to date; isolated data; repetitive access; clerical user. Warehouse (DSS): subject oriented; used to analyze the business; summarized and refined; snapshot data; integrated data; ad-hoc access; knowledge user (manager).
  • 15. OLTP vs. Data Warehouse. OLTP: performance sensitive; few records accessed at a time (tens); read/update access; no data redundancy; database size 100 MB to 100 GB. Data warehouse: performance relaxed; large volumes accessed at a time (millions); mostly read (batch update); redundancy present; database size 100 GB to a few terabytes.
  • 16. OLTP vs. Data Warehouse. OLTP: transaction throughput is the performance metric; thousands of users; managed in its entirety. Data warehouse: query throughput is the performance metric; hundreds of users; managed by subsets.
  • 17. Data Warehouse Architectures: Centralized. In a centralized architecture, there exists only one data warehouse, which stores all data necessary for business analysis. As already shown in the previous section, the disadvantage is the loss of performance compared with distributed approaches. (Figure: Central Architecture)
  • 18. Data Warehouse Architectures (contd.): Tiered. A tiered architecture is a distributed data approach. Integration cannot be done in one step because many sources have to be integrated into the warehouse. On the first level, the data of all branches in one region is collected; on the second level, the data from the regions is integrated into one data warehouse. Advantages: faster response time, because the data is located closer to the client applications, and a reduced volume of data to be searched. (Figure: Tiered Architecture)
  • 19. Complete Warehouse Solution Architecture (diagram). Stages: Data Sources, Data Management, Access, with Metadata alongside. Sources shown: Operational Data, Legacy Data, External Data Sources (The Post, VISA). Flow: Extract, Transform, Load into the organizationally structured Enterprise Data Warehouse, then into departmentally structured Data Marts (Sales, Inventory, Purchase). Other labels: Data, Information, Knowledge; Asset Assembly (and Management); Asset Exploitation.
  • 20. Introduction to Data Marts: What is a Data Mart? From the data warehouse, atomic data flows to various departments for their customized needs. If this data is periodically extracted from the data warehouse and loaded into a local database, it becomes a data mart. The data in a data mart has a different level of granularity than that of the data warehouse. Since the data in data marts is highly customized and lightly summarized, the departments can do whatever they want without worrying about resource utilization. The departments can also use the analytical software they find convenient. The cost of processing becomes very low.
  • 21. Data Mart Overview: Data marts satisfy 80% of the local end-users' requests. (Diagram: the Data Warehouse feeds DM Marketing, DM Finance, DM Sales and DM HR; user groups shown are Sales Representatives and Analysts; Human Resources; and Financial Analysts, Strategic Planners, and Executives.)
  • 22. From the Data Warehouse to Data Marts (diagram): Data Warehouse (organizationally structured), then departmentally structured, then individually structured. Axis labels: Less / More; History, Normalized, Detailed; Data / Information.
  • 23. Modeling Fundamentals: What is a Data Model? A data model is a conceptual representation of the data structures (tables) required for a database and is very powerful in expressing and communicating the business requirements. A data model is an abstract model that describes how data is represented and used. The term data model has two generally accepted meanings: (1) a data model theory, i.e. a formal description of how data may be structured and used; (2) a data model instance, i.e. applying a data model theory to create a practical data model instance for some particular application.
  • 24. Modeling Fundamentals: Types of Data Modeling. Logical Data Model (LDM): A logical design is conceptual and abstract. The process of logical design involves arranging data into a series of logical relationships called entities and attributes. A logical data model includes all required entities, attributes, key groups, and relationships that represent business information and define business rules. (Figure: Logical Data Model)
  • 25. Modeling Fundamentals: Types of Data Modeling. Physical Data Model (PDM): A physical data model is a representation of a data design which takes into account the facilities and constraints of a given database management system. A complete physical data model will include all the database artifacts required to create relationships between tables or achieve performance goals, such as indexes, constraint definitions, linking tables, partitioned tables or clusters. (Figure: Physical Data Model)
  • 26. Modeling Fundamentals: Types of Data Modeling. Entity relationship diagram (ERD): A data model utilizing several notations to depict data in terms of the entities and relationships described by that data. Databases are used to store structured data. The structure of this data, together with other constraints, can be designed using a variety of techniques, one of which is called entity-relationship modeling or ERM. (Figure: ERD Diagram)
  • 27. Modeling Fundamentals: Types of Data Modeling. Important terminology: Entity. Entities are the principal data objects about which information is to be collected: a class of persons, places, objects, events, or concepts about which we need to capture and store data. Persons: agency, contractor, customer, department, division, employee, instructor, student, supplier. Places: sales region, building, room, branch office, campus. Objects: book, machine, part, product, raw material, software license, software package, tool, vehicle model, vehicle. Events: application, award, cancellation, class, flight, invoice, order, registration, renewal, requisition, reservation, sale, trip. Concepts: account, block of time, bond, course, fund, qualification, stock.
  • 28. Modeling Fundamentals: Types of Data Modeling. Relationship: A natural business association that exists between one or more entities. The relationship may represent an event that links the entities or merely a logical affinity that exists between the entities. Examples of relationships: employees are assigned to projects; a student enrolls in a curriculum; projects have subtasks; departments manage one or more projects. (Diagram: STUDENT "is enrolled in" CURRICULUM; CURRICULUM "is being studied by" STUDENT.)
  • 29. Modeling Fundamentals: Types of Data Modeling. Dimensional Data Modeling (DDM): Dimensional modeling is the design concept used by many data warehouse designers to build their data warehouse. It is a logical design technique that seeks to present the data in a standard, intuitive framework that allows for high-performance access. It adheres to a discipline that uses the relational model with some important restrictions. Every dimensional model is composed of one table with a multi-part key, called the fact table, and a set of smaller tables called dimension tables. Components of a dimensional model: fact table, dimension table, attributes. Good examples of dimensions are location, product, time, promotion, organization, etc. Dimension tables store records related to that particular dimension, and no facts (measures) are stored in these tables. A fact (measure) table contains measures (sales gross value, total units sold) and dimension columns. These dimension columns are actually foreign keys from the respective dimension tables.
  • 30. Types of Data Modeling: Why Dimensional Modeling? End users cannot understand or navigate ER models. Software cannot usefully query an ER model. Use of ER modeling techniques defeats intuitive and high-performance retrieval of data. When the designer places understandability and performance as the highest goals, dimensional modeling is the natural approach.
  • 31. What is a Star Schema? Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multi-part key in the fact table. This characteristic "star-like" structure is often called a star schema.
  • 32. What is a Star Schema? The star schema model is essentially a method to store data which are multi-dimensional in nature in a relational database. It consists of a single "fact table" with a compound primary key, with one segment for each "dimension", and with additional columns of additive, numeric facts. The star schema makes multi-dimensional database (MDDB) functionality possible using a traditional relational database. (Diagram: SALES fact table surrounded by the Customer, Organization, Time, Product and Channel dimensions.)
  • 33. Fact Tables: A fact table, because it has a multi-part primary key made up of two or more foreign keys, always expresses a many-to-many relationship. The most useful fact tables also contain one or more numerical measures, or "facts," that occur for the combination of keys that define each record. The most useful facts in a fact table are numeric. Numeric addition is crucial because data warehouse applications rarely retrieve a single fact table record. Rather, they retrieve hundreds, thousands, or even millions of these records at a time, and the only useful thing to do with so many records is to add them up.
  • 34. Defining Fact Table Structure: fact table columns ITEM_ID, WEEK_ID, STORE_ID, SALES_DOLLARS, SALES_UNITS. (Diagram labels: Fact; Item, Day, Store; Item, Store, Week; Fact Columns; Fact Table Structure.)
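A star schema built around the fact columns named on this slide can be sketched as follows. Only `ITEM_ID`, `WEEK_ID`, `STORE_ID`, `SALES_DOLLARS` and `SALES_UNITS` come from the slide; the dimension tables, their attributes and the sample rows are assumptions for illustration, and SQLite is used only because it runs without setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: single-part primary key plus descriptive attributes
    -- (attribute names here are hypothetical).
    CREATE TABLE item_dim  (item_id  INTEGER PRIMARY KEY, item_name  TEXT);
    CREATE TABLE week_dim  (week_id  INTEGER PRIMARY KEY, week_label TEXT);
    CREATE TABLE store_dim (store_id INTEGER PRIMARY KEY, city       TEXT);

    -- Fact table: multi-part key made of foreign keys, plus additive facts.
    CREATE TABLE sales_fact (
        item_id       INTEGER REFERENCES item_dim(item_id),
        week_id       INTEGER REFERENCES week_dim(week_id),
        store_id      INTEGER REFERENCES store_dim(store_id),
        sales_dollars REAL,
        sales_units   INTEGER,
        PRIMARY KEY (item_id, week_id, store_id)
    );

    INSERT INTO item_dim  VALUES (1, 'Pen'), (2, 'Notebook');
    INSERT INTO week_dim  VALUES (1, '2024-W01'), (2, '2024-W02');
    INSERT INTO store_dim VALUES (1, 'Pune'), (2, 'Mumbai');

    INSERT INTO sales_fact VALUES
        (1, 1, 1, 100.0, 10),
        (1, 2, 1, 150.0, 15),
        (2, 1, 1,  80.0,  4),
        (1, 1, 2, 200.0, 20);
    """
)

# Dimension attributes supply the row headers; the facts are summed.
rows = conn.execute(
    """
    SELECT s.city, i.item_name, SUM(f.sales_dollars), SUM(f.sales_units)
    FROM sales_fact f
    JOIN item_dim  i ON i.item_id  = f.item_id
    JOIN store_dim s ON s.store_id = f.store_id
    GROUP BY s.city, i.item_name
    ORDER BY s.city, i.item_name
    """
).fetchall()

for row in rows:
    print(row)
```

The query never reads a single fact row in isolation: it joins to the dimensions for context and adds the facts up, which is the access pattern the fact-table slides describe.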
  • 35. What is a Dimension? A data warehouse is subject-oriented, integrated, time-variant and non-volatile. (Diagram labels: Subject, Dimension.) In a dimensional model, the context of the measurements is represented in dimension tables. The dimension attributes are the various columns in a dimension table.
  • 36. What are Slowly Changing Dimensions? Slowly changing dimensions are dimensions where a "constant" actually evolves slowly and asynchronously. "Dimensions have been assumed to be independent of time." In the real world this is not strictly true. Examples: people change their names, get married or get divorced.
  • 37. Three Methods: The three choices for dealing with slowly changing dimensions are (approach, then result): Type 1: Overwriting the old values in the dimension record. Result: losing the ability to track the old history. Type 2: Creating an additional dimension record at the time of the change with the new attribute values. Result: segmenting history very accurately between the old description and the new description. Type 3: Creating new "current" fields. Result: describe history both
  • 38. Type One. Implementing Type 1: overwrite the field with the new value; there is no effect anywhere else in the database. Scenarios where applicable: when the original data was in error; when no value is seen in keeping the old description/attribute. Advantages: easy to implement; no key affected. Disadvantages: history is lost.
  • 39. Type Two. Implementing Type 2: create a new record with a unique key; generalize the dimension key by adding 2 or 3 version digits to the end of the key. Scenarios where applicable: most commonly used where history is of importance. Advantages: automatically partitions history; no time constraints required. Disadvantages: abrupt point-of-time constraints not effective.
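The difference between Type 1 and Type 2 can be sketched with a small in-memory customer dimension. The `CustomerRow` class, its field names and the `is_current` flag are assumptions made for this illustration (the slides only require a new record with a unique key for Type 2); a real warehouse would apply the same idea with SQL updates and inserts.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class CustomerRow:
    surrogate_key: int       # warehouse key, unique per row version
    customer_id: str         # natural key from the source system
    last_name: str
    is_current: bool = True  # hypothetical flag marking the latest version


def apply_type1(rows: List[CustomerRow], customer_id: str, new_last_name: str) -> List[CustomerRow]:
    """Type 1: overwrite the attribute in place; the old value is lost."""
    return [
        replace(r, last_name=new_last_name) if r.customer_id == customer_id else r
        for r in rows
    ]


def apply_type2(rows: List[CustomerRow], customer_id: str, new_last_name: str) -> List[CustomerRow]:
    """Type 2: keep the old row and add a new row with a new unique key."""
    next_key = max(r.surrogate_key for r in rows) + 1
    updated = [
        replace(r, is_current=False)
        if r.customer_id == customer_id and r.is_current
        else r
        for r in rows
    ]
    updated.append(CustomerRow(next_key, customer_id, new_last_name))
    return updated


dim = [CustomerRow(1, "C001", "Sharma")]
type1_result = apply_type1(dim, "C001", "Kulkarni")
type2_result = apply_type2(dim, "C001", "Kulkarni")

print(type1_result)  # one row, old name gone
print(type2_result)  # two rows, history preserved
```

With Type 1 the dimension still has one row and no trace of the old name. With Type 2 fact rows loaded before the change keep pointing at surrogate key 1, and later ones point at key 2, which is how history gets partitioned.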
  • 40. Dimension Tables: Dimension tables most often contain descriptive textual information. Dimension attributes are used as the source of most of the interesting constraints in data warehouse queries, and they are virtually always the source of the row headers in the SQL answer set. It should be obvious that the power of the data warehouse is proportional to the quality and depth of the dimension tables.
  • 41. Attributes in a Dimension Table: Attributes allow users to constrain data by one or more attributes and to define aggregation levels for data. Example table (columns DEPT, CLASS, SALES): departments Dept 1, Dept 2; classes Class 101, Class 120, Class 133, Class 127, Class 141, Class 145; sales 1000, 1100, 1900, 2100, 1500, 1800. Uses: present classes by department; aggregate by class; qualify by department.
  • 44. ETL (Extract, Transform, Load): ETL refers to the methods involved in accessing and manipulating source data and loading it into the target database. During the ETL process, data is most often extracted from an OLTP database, transformed to match the data warehouse schema, and loaded into the data warehouse database.
  • 45. What is ETL? E: Extract data from disparate sources. T: Transform data. L: Load data where we want it.
  • 46. Extraction (Data Capturing): The ETL extraction element is responsible for extracting data from the source system. During extraction, data may be removed from the source system, or a copy may be made and the original data retained in the source system.
  • 47. Extraction (Data Transmission): Legacy systems may require too much effort to implement such offload processes, so legacy data is often copied into the data warehouse, leaving the original data in place. Extracted data is loaded into the data warehouse staging area (a relational database usually separate from the data warehouse database) for manipulation by the remaining ETL processes.
  • 48. Extraction (Cleansing Process): Data extraction is generally performed within the source system itself. Data extraction processes can be implemented using Transact-SQL stored procedures, Data Transformation Services (DTS) tasks, or custom applications developed in programming or scripting languages.
  • 49. Transformation: The ETL transformation element is responsible for data validation, data accuracy, data type conversion, and business rule application. An ETL system that uses inline transformations during extraction is less robust and flexible than one that confines transformations to the reformatting element. Transformations performed in the OLTP system impose a performance burden on the OLTP database.
  • 50. Transformation (contd.): Data Validation: check that all rows in the fact table match rows in dimension tables to enforce data integrity. Data Accuracy: ensure that fields contain appropriate values, such as only "off" or "on" in a status field. Data Type Conversion: ensure that all values for a specified field are stored the same way in the data warehouse regardless of how they were stored in the source system. For example, if one source system stores "off" or "on" in its status field and another source system stores "0" or "1" in its status field, then a data type conversion transformation converts the content of one or both of the fields to a specified common value such as "off" or "on". Business Rule Application: ensure that the rules of the business are enforced on the data stored in the warehouse. For example, check that all customer records contain values for both FirstName and LastName fields.
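The four transformation checks on this slide can be sketched as one function that either cleans a row or rejects it with a reason. The `status`, `FirstName` and `LastName` fields and the "0"/"1" to "off"/"on" mapping follow the slide's examples; the `customer_id` field, the function name and the sample rows are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple

# Data type conversion: map every source encoding to one common value.
STATUS_MAP = {"0": "off", "1": "on", "off": "off", "on": "on"}


def transform(rows: List[Dict], known_customer_ids: Set[int]) -> Tuple[List[Dict], List[Tuple[Dict, str]]]:
    """Return (clean_rows, rejected_rows_with_reason)."""
    clean, rejected = [], []
    for row in rows:
        status = STATUS_MAP.get(str(row.get("status", "")).strip().lower())
        # Data accuracy: only "off" or "on" may end up in the status field.
        if status is None:
            rejected.append((row, "invalid status"))
            continue
        # Data validation: the row must match a row in the dimension table.
        if row.get("customer_id") not in known_customer_ids:
            rejected.append((row, "unknown customer_id"))
            continue
        # Business rule: both FirstName and LastName must have values.
        if not row.get("FirstName") or not row.get("LastName"):
            rejected.append((row, "missing name"))
            continue
        clean.append({**row, "status": status})
    return clean, rejected


source_rows = [
    {"customer_id": 1, "FirstName": "Asha", "LastName": "Rao", "status": "1"},
    {"customer_id": 2, "FirstName": "Ravi", "LastName": "", "status": "on"},
    {"customer_id": 9, "FirstName": "Mina", "LastName": "Shah", "status": "off"},
    {"customer_id": 1, "FirstName": "Asha", "LastName": "Rao", "status": "maybe"},
]
clean_rows, rejected_rows = transform(source_rows, known_customer_ids={1, 2})

print(clean_rows)
print([reason for _, reason in rejected_rows])
```

Keeping the rejected rows together with a reason, rather than silently dropping them, makes it possible to report data quality problems back to the source system owners.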
  • 51. Loading: The ETL loading element is responsible for loading transformed data into the data warehouse database. Data warehouses are usually updated periodically rather than continuously, and large numbers of records are often loaded to multiple tables in a single data load. The data warehouse is often taken offline during update operations so that data can be loaded faster and SQL Server 2000 Analysis Services can update OLAP cubes to incorporate the new data. BULK INSERT, bcp, and the Bulk Copy API are the best tools for data loading operations. The design of the loading element should focus on efficiency and performance to minimize the data warehouse offline time.
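The idea of loading many records in a single periodic batch can be sketched as below. This is not the SQL Server BULK INSERT or bcp path the slide names: SQLite's `executemany` inside one transaction is swapped in as a stand-in that runs anywhere, and the table, function name and rows are assumptions for illustration.

```python
import sqlite3


def load_batch(conn: sqlite3.Connection, rows) -> int:
    """Load all rows in one transaction; roll back everything if any row fails."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(
            "INSERT INTO sales_fact (item_id, week_id, store_id, sales_units) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact ("
    "item_id INTEGER, week_id INTEGER, store_id INTEGER, sales_units INTEGER, "
    "PRIMARY KEY (item_id, week_id, store_id))"
)

loaded = load_batch(conn, [(1, 1, 1, 10), (1, 2, 1, 15), (2, 1, 1, 4)])

# A batch containing a duplicate key fails as a whole and leaves the table unchanged.
try:
    load_batch(conn, [(3, 1, 1, 7), (1, 1, 1, 99)])
except sqlite3.IntegrityError:
    pass
count_after_failure = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]

print(loaded, count_after_failure)
```

Treating the batch as all-or-nothing keeps the warehouse in a consistent state if a load is interrupted, at the cost of having to rerun the whole batch.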
  • 52. ETL Tools: What are ETL tools? ETL tools are meant to extract, transform and load data into the data warehouse for decision making. Before the evolution of ETL tools, the ETL process described above was done manually, using SQL code written by programmers. This task was tedious and cumbersome in many cases, since it involved many resources, complex coding and more work hours. On top of that, maintaining the code posed a great challenge for the programmers. Selecting an appropriate ETL tool is the most important decision that has to be made when choosing the components of a data warehousing application. The ETL tool operates at the heart of the data warehouse, extracting data from multiple data sources, transforming the data to make it accessible to business analysis, and loading multiple target databases.
  • 53. Features of ETL Tools: ETL tools have the ability to extract data from various sources such as RDBMS, DB2, COBOL data files and flat files at scheduled intervals, do the required transformation, and load the data into a data warehouse which resides on an RDBMS. ETL tools can connect to an RDBMS and get the list of tables and their attributes. The general steps for designing an ETL process are: (1) define the structure of the source data; (2) define the structure of the destination data; (3) map elements of the source data to elements of the destination data; (4) define the transformations required, such as changing values or summing; (5) schedule the execution of the process. Once executed, the process generates a log showing the status of the process, the number of records inserted, etc. Various reports about the processes are available, which can form the metadata.
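The design steps listed on this slide can be sketched as a very small hand-written ETL run. Everything specific here is an assumption for illustration: the CSV layout, the `COLUMN_MAP`, the `customer_sales` table and the `run_etl` function. Scheduling (step 5) is left out, since it would normally be handled by the ETL tool or an external scheduler.

```python
import csv
import io
import sqlite3

# Step 1: structure of the source data (a flat file with a header row).
SOURCE_CSV = """cust_no,full_name,amt
1,asha rao,100.50
2,ravi shah,200.25
3,,50.00
"""

# Step 3: map source elements to destination elements.
COLUMN_MAP = {"cust_no": "customer_id", "full_name": "customer_name", "amt": "amount"}


def run_etl(source_text: str, conn: sqlite3.Connection) -> dict:
    """Extract, transform and load the rows; return a log of record counts."""
    log = {"read": 0, "inserted": 0, "rejected": 0}
    for src in csv.DictReader(io.StringIO(source_text)):  # extract
        log["read"] += 1
        row = {COLUMN_MAP[k]: v for k, v in src.items()}
        if not row["customer_name"]:
            log["rejected"] += 1
            continue
        # Step 4: transformations (type conversion and changing values).
        row["customer_id"] = int(row["customer_id"])
        row["customer_name"] = row["customer_name"].title()
        row["amount"] = float(row["amount"])
        conn.execute(
            "INSERT INTO customer_sales VALUES (:customer_id, :customer_name, :amount)",
            row,
        )  # load
        log["inserted"] += 1
    conn.commit()
    return log


conn = sqlite3.connect(":memory:")
# Step 2: structure of the destination data.
conn.execute(
    "CREATE TABLE customer_sales (customer_id INTEGER, customer_name TEXT, amount REAL)"
)

etl_log = run_etl(SOURCE_CSV, conn)
total = conn.execute("SELECT SUM(amount) FROM customer_sales").fetchone()[0]

print(etl_log, total)
```

The returned log plays the role of the process log the slide mentions: it records how many records were read, inserted and rejected in the run.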
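The design steps listed on slide 53, together with the batch loading described on slide 51, can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular ETL tool works: the source file contents, the `sales_by_customer` table, the `COLUMN_MAP` mapping, and all function names are hypothetical, and SQLite (from the Python standard library) stands in for the warehouse RDBMS. Scheduling (step 5) is assumed to be handled by an external scheduler that calls `run_etl`.

```python
import csv
import io
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_sketch")

# Step 1: structure of the source data (a hypothetical flat file).
SOURCE_CSV = """order_id,customer,amount
1,acme,100.50
2,globex,20.00
3,acme,9.50
"""

# Step 2: structure of the destination data (a hypothetical warehouse table).
DEST_DDL = """
CREATE TABLE sales_by_customer (
    customer_name TEXT PRIMARY KEY,
    total_amount  REAL NOT NULL,
    order_count   INTEGER NOT NULL
)
"""

# Step 3: map elements of the source to elements of the destination.
COLUMN_MAP = {"customer": "customer_name", "amount": "total_amount"}


def extract(text):
    """Read the flat file into a list of dicts keyed by source column."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Step 4: change values (upper-case names) and sum amounts per customer."""
    totals = {}
    for row in rows:
        mapped = {dest: row[src] for src, dest in COLUMN_MAP.items()}
        name = mapped["customer_name"].strip().upper()
        amount = float(mapped["total_amount"])
        total, count = totals.get(name, (0.0, 0))
        totals[name] = (total + amount, count + 1)
    return [(name, total, count) for name, (total, count) in sorted(totals.items())]


def load(conn, records):
    """Load all records as one batch inside a single transaction."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.executemany(
            "INSERT INTO sales_by_customer VALUES (?, ?, ?)", records
        )
    return len(records)


def run_etl(conn, source_text):
    """Run extract, transform, and load, then log the outcome of the run."""
    rows = extract(source_text)
    records = transform(rows)
    inserted = load(conn, records)
    status = {"status": "success", "rows_read": len(rows), "rows_inserted": inserted}
    log.info("ETL run finished: %s", status)
    return status


conn = sqlite3.connect(":memory:")
conn.execute(DEST_DDL)
result = run_etl(conn, SOURCE_CSV)
```

Loading the whole batch in one transaction mirrors the slide's point about keeping the load window short. On SQL Server, the tools the slides name for this (BULK INSERT, bcp) would take the place of `executemany`.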

Editor's Notes

  1. Hence, Business Intelligence delivers actionable information at the point of business. It can answer questions such as: How much more intelligent can you make your business processes? How much more insight can you gain into your business? How much more integrated can your business processes be? How much more interactive can your business be with customers, partners, employees and managers?
  2. Business intelligence has become a critical element of information technology. It's an old term with a general or even ambiguous meaning. It has been used synonymously with decision support, analysis, and data warehousing, but today business intelligence has a more specific definition and a better understood application. Taken literally, business intelligence is just that: intelligence about, or understanding of, your business. You get that understanding by analyzing your business operations. This business intelligence process can deliver significant, bottom-line results. Implementing its technologies and applying its process can help make your business more effective and more efficient, increasing revenue, decreasing costs, and improving your relationships with customers and suppliers.
  3. "Why Business Intelligence?" By definition, the moment any given business is operating, it begins generating data. Some obvious examples are sales, bookkeeping, production data, warehouse information, transportation and logistics, personnel, etc. In addition, there also exist large volumes of data that are important to the business but not directly generated by business operations. Examples are market data, competitive data, tenders and proposals, legal information, raw material prices, etc. None of this information can be used in its raw form by corporate management to make decisions, although it is critical in helping make those business decisions. Therein lies the necessity for Business Intelligence. BI technologies help bring decision-makers the data in a form they can quickly digest and apply to their decision making. BI turns data into information for managers, executives and, in general, the people making decisions in a company.
  4. This figure illustrates the logical arrangement of the different technologies used in business intelligence. The arrangement is based on the potential value of each technology as a basis for strategic and business decisions. In general, the value of information in supporting decision-making grows from the bottom upward. A decision based on data in the lower layers, where there are typically millions of records, will affect a single customer transaction, whereas a decision based on heavily summarized data, such as that in the upper layers, will probably concern an entire department or even the whole company. For this reason, different types of users are generally found at the different layers. A database administrator works mainly with the databases at the data source and data warehouse layers, while the business analyst or management works mainly at the higher layers of the pyramid. Note that this is a logical arrangement, not a physical interdependence among the technology layers. For example, visualization techniques can be used independently of data mining, and data mining can be based on data warehouses or simply on files. Data warehouses are a supporting technology for data mining, not a prerequisite. In fact, many data mining applications today are still run on files extracted directly from operational data sources. The connection between data warehousing and data mining is nonetheless quite strong, which is why it is explored further here.
  5. Data warehousing is commonly used by companies to analyze trends over time. In other words, companies may very well use data warehousing to view day-to-day operations, but its primary function is facilitating strategic planning resulting from long-term data overviews. From such overviews, business models, forecasts, and other reports and projections can be made. Routinely, because the data stored in data warehouses is intended to provide more overview-like reporting, the data is read-only. If you want to update the data stored via data warehousing, you'll need to build a new query when you're done. This is not to say that data warehousing involves data that is never updated. On the contrary, the data stored in data warehouses is updated all the time. It's the reporting and the analysis that take more of a long-term view. Data warehousing is not the be-all and end-all for storing all of a company's data. Rather, data warehousing is used to house the necessary data for specific analysis. More comprehensive data storage requires different capacities that are more static and less easily manipulated than those used for data warehousing. Data warehousing is typically used by larger companies analyzing larger sets of data for enterprise purposes. Smaller companies wishing to analyze just one subject, for example, usually access data marts, which are much more specific and targeted in their storage and reporting. Data warehousing often includes smaller amounts of data grouped into data marts. In this way, a larger company might have at its disposal both data warehousing and data marts, allowing users to choose the source and functionality depending on current needs.