Er. Nawaraj Bhandari
Data Warehouse/Data Mining
Chapter 3:
Data Warehouse Physical Design
Physical Design
Physical design is the phase of a database design following the logical design that
identifies the actual database tables and index structures used to implement the
logical design.
In the physical design, you look at the most effective way of storing and retrieving
the objects as well as handling them from a transportation and backup/recovery
perspective.
Physical design decisions are mainly driven by query performance and
database maintenance aspects.
During the logical design phase, you defined a model for your data warehouse
consisting of entities, attributes, and relationships. The entities are linked
together using relationships. Attributes are used to describe the entities. The
unique identifier (UID) distinguishes between one instance of an entity and
another.
Figure: Logical Design Compared with Physical Design
During the physical design process, you translate the expected schemas
into actual database structures.
At this time, you have to map:
■ Entities to tables
■ Relationships to foreign key constraints
■ Attributes to columns
■ Primary unique identifiers to primary key constraints
■ Unique identifiers to unique key constraints
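The mapping above can be sketched with Python's built-in sqlite3 module. The Customer and SalesOrder entities, their columns, and the sample rows are hypothetical and serve only to illustrate each mapping rule; a production warehouse would use its own DBMS and DDL dialect.

```python
import sqlite3

# Hypothetical entities: Customer and SalesOrder (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (                      -- entity -> table
    customer_id INTEGER PRIMARY KEY,         -- primary UID -> primary key constraint
    email       TEXT NOT NULL UNIQUE,        -- other UID -> unique key constraint
    name        TEXT NOT NULL                -- attribute -> column
);
CREATE TABLE sales_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id),  -- relationship -> foreign key
    amount      REAL NOT NULL
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'a@example.com', 'Asha')")
conn.execute("INSERT INTO sales_order VALUES (10, 1, 99.5)")

def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# An order for a customer that does not exist violates the foreign key.
orphan_ok = try_insert("INSERT INTO sales_order VALUES (11, 999, 5.0)")
# A second customer with the same email violates the unique key.
dup_ok = try_insert("INSERT INTO customer VALUES (2, 'a@example.com', 'B')")
order_count = conn.execute("SELECT COUNT(*) FROM sales_order").fetchone()[0]
```

Both rejected inserts show the constraints doing the work that the relationships and unique identifiers expressed in the logical model.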
Physical Data Model
Features of a physical data model include:
■ Specification of all tables and columns.
■ Specification of foreign keys.
■ De-normalization, which may be performed if necessary.
At this level, the specification of the logical data model is realized in the database.
The steps for physical data model design involve:
■ Conversion of entities into tables
■ Conversion of relationships into foreign keys
■ Conversion of attributes into columns
■ Changes to the physical data model based on the physical constraints
Figure: Logical model and physical model
Physical Design Objectives
Physical design involves tradeoffs among:
■ Performance
■ Flexibility
■ Scalability
■ Ease of Administration
■ Data Integrity
■ Data Consistency
■ Data Availability
■ User Satisfaction
Physical Design Structures
Once you have converted your logical design to a physical one,
you will need to create some or all of the following structures:
■ Tablespaces
■ Tables and Partitioned Tables
■ Views
■ Integrity Constraints
■ Dimensions
Some of these structures require disk space. Others exist only in
the data dictionary. Additionally, the following structures may be
created for performance improvement:
■ Indexes and Partitioned Indexes
■ Materialized Views
Tablespaces
■ A tablespace consists of one or more datafiles, which are physical structures within the operating system you are using.
■ A datafile is associated with only one tablespace.
■ From a design perspective, tablespaces are containers for physical design structures.
Tables and Partitioned Tables
■ Tables are the basic unit of data storage. They are the container for the expected amount of raw data in your data warehouse.
■ Using partitioned tables instead of non-partitioned ones addresses the key problem of supporting very large data volumes by allowing you to divide them into smaller and more manageable pieces.
■ Partitioning large tables can also improve performance, because a query that needs only some partitions can skip the rest, and each piece can be loaded, backed up, or rebuilt independently.
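As a rough illustration of why smaller pieces help, the following sketch partitions hypothetical sales rows by month in plain Python and answers a monthly query by scanning a single partition. It only mimics the idea; a real DBMS implements partitioning inside its storage engine, and the data and function names here are assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales rows: (sale_date, amount).
rows = [
    (date(2023, 1, 5), 100.0),
    (date(2023, 1, 20), 50.0),
    (date(2023, 2, 3), 75.0),
    (date(2023, 3, 9), 20.0),
]

# Range partitioning by month: each partition holds one month of rows.
partitions = defaultdict(list)
for sale_date, amount in rows:
    partitions[(sale_date.year, sale_date.month)].append((sale_date, amount))

def monthly_total(year, month):
    """Scan only the partition for the requested month and skip all others."""
    scanned = partitions.get((year, month), [])
    return sum(amount for _, amount in scanned), len(scanned)

total, rows_scanned = monthly_total(2023, 1)
```

Here the January query touches two rows instead of all four; with years of data the skipped share grows accordingly.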
Views
■ A view is a tailored presentation of the data contained in one or more tables or other views.
■ A view takes the output of a query and treats it as a table.
■ Views require no storage for data; only the view definition is kept in the data dictionary.
Improving Performance with the Use of Views
Figure: A query accesses a view that presents selected rows or columns of Table 1, Table 2, and Table 3.
■ A view is a virtual table that can be queried like a real table.
■ Views can be used to combine tables, so that instead of joining tables in each query, the query just accesses the view.
■ This makes queries simpler to write. An ordinary view still runs its underlying join at query time, so a speed-up requires the result to be stored in advance, as a materialized view does.
View
■ We can run SQL queries against a view just as we would against a table.
■ For example, to list its columns: DESC department_worker_view;
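A minimal sketch of a view such as department_worker_view, using Python's sqlite3 module. The department and worker tables, their columns, and the sample rows are assumptions made for illustration; only the view name comes from the slide. SQLite has no DESC command, so the sketch uses PRAGMA table_info to list the view's columns instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE worker (worker_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
INSERT INTO worker VALUES (1, 'Asha', 1), (2, 'Bikram', 1), (3, 'Chandra', 2);

-- The view stores only this query definition, not the joined rows.
CREATE VIEW department_worker_view AS
SELECT w.name AS worker_name, d.dept_name
FROM worker AS w JOIN department AS d ON d.dept_id = w.dept_id;
""")

# Query the view as if it were a table; the join runs at query time.
sales_workers = [r[0] for r in conn.execute(
    "SELECT worker_name FROM department_worker_view "
    "WHERE dept_name = 'Sales' ORDER BY worker_name")]

# List the view's columns (the SQLite counterpart of DESC).
view_columns = [r[1] for r in conn.execute(
    "PRAGMA table_info(department_worker_view)")]
```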
Integrity Constraints
■ Integrity constraints are used to enforce business rules associated with your database and to prevent having invalid information in the tables.
■ In data warehousing environments, constraints are used mainly for query rewrite, because data is usually validated before it is loaded.
■ NOT NULL constraints are particularly common in data warehouses.
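To show how constraints keep invalid rows out of a table, here is a small sqlite3 sketch. The sales_fact table, its columns, and the positive-quantity rule are hypothetical examples of a business rule, not part of the original material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE sales_fact (
    sale_id  INTEGER PRIMARY KEY,
    product  TEXT NOT NULL,                         -- every sale must name a product
    quantity INTEGER NOT NULL CHECK (quantity > 0)  -- assumed rule: positive quantity
)""")

# One valid row, one with a missing product, one with a zero quantity.
candidate_rows = [(1, "pen", 3), (2, None, 1), (3, "ink", 0)]
rejected = []
for row in candidate_rows:
    try:
        conn.execute("INSERT INTO sales_fact VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append((row[0], str(exc)))  # keep the id and the DBMS message

accepted = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```

Collecting the rejected rows rather than stopping at the first error mirrors how a load job might route bad records to an exceptions list.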
Indexes and Partitioned Indexes
■ Indexes are optional structures associated with tables.
■ Indexes are just like tables in that you can partition them, although the partitioning strategy is not dependent upon the table structure.
■ Partitioning indexes makes it easier to manage the data warehouse during refresh and improves query performance.
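The effect of an index on data access can be observed with SQLite's EXPLAIN QUERY PLAN, as in the sketch below. The sales table, the index name idx_sales_product, and the generated data are assumptions; SQLite does not support partitioned indexes, so only a plain index is shown, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales (product_id, amount) VALUES (?, ?)",
    [(i % 50, float(i)) for i in range(1000)],
)

query = "SELECT SUM(amount) FROM sales WHERE product_id = 7"

def plan(sql):
    """Return SQLite's query-plan description lines for a statement."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)   # no index exists yet, so the table is scanned
conn.execute("CREATE INDEX idx_sales_product ON sales (product_id)")
plan_after = plan(query)    # the plan should now name idx_sales_product
```

The index is optional in the sense the slide describes: the query returns the same result either way, and only the access path changes.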
Materialized Views
■ Materialized views are query results that have been stored in advance so long-running calculations are not necessary when you actually execute your SQL statements.
■ From a physical design point of view, materialized views resemble tables or partitioned tables and behave like indexes in that they are used transparently and improve performance.
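SQLite has no materialized views, so the sketch below emulates one with a summary table that is rebuilt on demand. The table names (sales, sales_by_region_mv) and the refresh function are hypothetical. Unlike a real materialized view, nothing here is used transparently: queries must name the summary table themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('East', 10), ('East', 15), ('West', 7);
""")

def refresh_sales_summary():
    """Recompute and store the aggregate (a complete refresh)."""
    conn.executescript("""
    DROP TABLE IF EXISTS sales_by_region_mv;
    CREATE TABLE sales_by_region_mv AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
    """)

def read_summary():
    """Read the stored result without touching the detail rows."""
    return dict(conn.execute("SELECT region, total FROM sales_by_region_mv"))

refresh_sales_summary()
summary_first = read_summary()

conn.execute("INSERT INTO sales VALUES ('West', 3)")
summary_stale = read_summary()   # stored result does not change until refreshed
refresh_sales_summary()
summary_fresh = read_summary()
```

The stale read between the insert and the second refresh is the price of storing results in advance, which is why refresh scheduling is part of physical design.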
Data Warehouse: A Multi-Tiered Architecture
Figure: Multi-tiered data warehouse architecture.
(1) Data Storage: data from operational DBs and other sources is extracted, transformed, loaded, and refreshed into the data warehouse and data marts, supported by a monitor and integrator and by metadata.
(2) OLAP Engine: an OLAP server (ROLAP server or MOLAP server).
(3) Front-End Tools: analysis, query/reports, and data mining.
ETL (Extract-Transform-Load)
■ ETL comes from data warehousing and stands for Extract-Transform-Load. ETL covers the process of loading data from the source system into the data warehouse.
■ ETL now commonly includes cleaning as a separate step. The sequence is then Extract-Clean-Transform-Load.
Extract
■ The Extract step covers the data extraction from the source system and makes it accessible for further processing.
■ The main objective of the extract step is to retrieve all the required data from the source system using as few resources as possible.
■ The extract step should be designed so that it does not negatively affect the source system in terms of performance, response time, or any kind of locking.
Extract
There are several ways to perform the extract:
■ Update notification: if the source system is able to provide a notification that a record has been changed and describe the change, this is the easiest way to get the data.
■ Incremental extract: some systems may not be able to provide notification that an update has occurred, but they are able to identify which records have been modified and provide an extract of such records. During further ETL steps, the system needs to identify the changes and propagate them down. Note that with a daily incremental extract, we may not be able to handle deleted records properly.
■ Full extract: some systems are not able to identify which data has been changed at all, so a full extract is the only way to get the data out of the system. The full extract requires keeping a copy of the last extract in the same format in order to identify changes. A full extract handles deletions as well.
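The full-extract approach can be sketched as a comparison between the previous and the current extract. The function name, the record layout, and the sample customer data below are hypothetical; the point is only that keeping the last extract makes inserts, updates, and deletions all detectable.

```python
def diff_full_extracts(previous, current):
    """Compare two full extracts keyed by record id.

    Both arguments are dicts mapping a key to the record's values.
    Returns (inserted, updated, deleted) as sorted lists of keys.
    """
    inserted = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k for k in current if k in previous and current[k] != previous[k])
    return inserted, updated, deleted

# Hypothetical extracts from a customer table on two consecutive days.
yesterday = {
    1: ("Asha", "Kathmandu"),
    2: ("Bikram", "Pokhara"),
    3: ("Chandra", "Lalitpur"),
}
today = {
    1: ("Asha", "Kathmandu"),
    2: ("Bikram", "Butwal"),
    4: ("Deepa", "Biratnagar"),
}

inserted, updated, deleted = diff_full_extracts(yesterday, today)
```

Record 3 shows up as deleted only because yesterday's copy was kept; an incremental extract of modified records would not report it.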
Clean
The cleaning step is one of the most important, as it ensures the quality of the data in the data warehouse. Cleaning should apply basic data unification rules, such as:
■ Making identifiers unique (sex categories Male/Female/Unknown, M/F/null, Man/Woman/Not Available are translated to the standard Male/Female/Unknown)
■ Converting null values into a standardized Not Available/Not Provided value
■ Converting phone numbers and ZIP codes to a standardized form
■ Validating address fields and converting them into proper naming, e.g. Street/St/St./Str./Str
■ Validating address fields against each other (State/Country, City/State, City/ZIP code, City/Street)
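A few of these unification rules can be sketched as small Python functions. The function names, the mapping table, and the chosen standard forms (digits only for phone numbers, "Street" for street abbreviations) are assumptions for illustration; real cleaning rules depend on the source data and local conventions.

```python
import re

# Assumed mapping of source codes to the standard categories named in the text.
GENDER_MAP = {
    "male": "Male", "m": "Male", "man": "Male",
    "female": "Female", "f": "Female", "woman": "Female",
}

def clean_gender(value):
    """Translate any source variant to Male/Female/Unknown."""
    if value is None:
        return "Unknown"
    return GENDER_MAP.get(str(value).strip().lower(), "Unknown")

def clean_null(value, default="Not Available"):
    """Replace missing or blank values with a standard placeholder."""
    if value is None or str(value).strip() == "":
        return default
    return value

def clean_phone(value):
    """Keep digits only; one possible standard form, not a universal one."""
    return re.sub(r"\D", "", value)

def clean_street(value):
    """Expand a trailing St, St., Str, or Str. to the full word Street."""
    return re.sub(r"\b(St|Str)\.?$", "Street", value.strip(), flags=re.IGNORECASE)
```

For example, clean_gender("M"), clean_gender("Man"), and clean_gender("male") all yield "Male", while null and "Not Available" both map to "Unknown".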
Transform
The transform step applies a set of rules to transform the data from the source to the target.
This includes converting any measured data to the same dimension (i.e. a conformed dimension) using the same units so that they can later be joined.
The transformation step also requires joining data from several sources, generating aggregates, generating surrogate keys (system-generated keys that stand in for the source systems' natural keys), sorting, deriving new calculated values, and applying advanced validation rules.
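The sketch below walks through three of these transformations on hypothetical data: converting weights from two sources to a common unit, assigning surrogate keys to natural product codes, and generating an aggregate. All names, sample rows, and conversion factors are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import count

# Hypothetical rows from two sources that record weight in different units.
source_a = [{"product_code": "P-1", "weight": 2.0, "unit": "kg"}]
source_b = [
    {"product_code": "P-2", "weight": 500.0, "unit": "g"},
    {"product_code": "P-1", "weight": 1500.0, "unit": "g"},
]

TO_KG = {"kg": 1.0, "g": 0.001}   # conversion factors to the common unit

_next_key = count(1)
surrogate_keys = {}                # natural key -> warehouse surrogate key

def surrogate_key_for(natural_key):
    """Assign a new integer key the first time a natural key is seen."""
    if natural_key not in surrogate_keys:
        surrogate_keys[natural_key] = next(_next_key)
    return surrogate_keys[natural_key]

def transform(rows):
    """Convert weights to kilograms and attach surrogate keys."""
    return [
        {"product_key": surrogate_key_for(r["product_code"]),
         "weight_kg": r["weight"] * TO_KG[r["unit"]]}
        for r in rows
    ]

fact_rows = transform(source_a + source_b)   # combine the two sources

# Aggregate: total weight per product key.
totals = defaultdict(float)
for row in fact_rows:
    totals[row["product_key"]] += row["weight_kg"]
```

Because both sources end up in kilograms under the same product key, their rows for P-1 can be summed directly.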
OLAP Server Architectures
Types of OLAP Servers
■ Relational OLAP (ROLAP)
■ Multidimensional OLAP (MOLAP)
■ Hybrid OLAP (HOLAP)
Relational OLAP (ROLAP)
■ Relational OLAP servers are placed between the relational back-end server and client front-end tools. To store and manage the warehouse data, relational OLAP uses a relational or extended-relational DBMS.
■ ROLAP servers can be easily used with an existing RDBMS.
■ ROLAP tools do not use pre-calculated data cubes.
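The ROLAP idea of answering analytical queries from relational tables, without a pre-calculated cube, can be sketched with sqlite3. The sales_fact table, its rows, and the roll_up helper are hypothetical; the aggregation runs as a GROUP BY at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (product TEXT, region TEXT, amount REAL);
INSERT INTO sales_fact VALUES
    ('pen', 'East', 10), ('pen', 'West', 4), ('pad', 'East', 6), ('pen', 'East', 5);
""")

def roll_up(dimension):
    """Aggregate the fact table along one dimension at query time."""
    if dimension not in ("product", "region"):  # guard: name is interpolated into SQL
        raise ValueError(dimension)
    sql = f"SELECT {dimension}, SUM(amount) FROM sales_fact GROUP BY {dimension}"
    return dict(conn.execute(sql))

by_region = roll_up("region")
by_product = roll_up("product")
```

Each call rescans the detail rows, which is one reason retrieval in this style tends to be slower than reading stored summaries.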
Multidimensional OLAP (MOLAP)
■ Multidimensional OLAP (MOLAP) uses array-based multidimensional storage engines for multidimensional views of data. With multidimensional data stores, the storage utilization may be low if the data set is sparse. Therefore, many MOLAP servers use two levels of data storage representation to handle dense and sparse data sets.
■ MOLAP allows fast indexing to the pre-computed summarized data.
■ MOLAP is easier to use, and is therefore suitable for inexperienced users.
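The storage-utilization point can be made concrete with a toy cube in plain Python. The dimension values and facts are hypothetical, and the nested lists and dicts only stand in for the array and sparse structures a real MOLAP engine would use.

```python
from itertools import product

products = ["pen", "ink", "pad"]
regions = ["East", "West"]
months = ["Jan", "Feb"]

# Hypothetical facts: only 3 of the 12 possible cells hold data.
facts = {
    ("pen", "East", "Jan"): 10,
    ("pen", "West", "Feb"): 4,
    ("pad", "East", "Jan"): 6,
}

# Dense storage: one array slot per (product, region, month) combination.
dense = [[[facts.get((p, r, m), 0) for m in months] for r in regions]
         for p in products]
dense_cells = len(products) * len(regions) * len(months)
utilization = len(facts) / dense_cells   # fraction of slots that hold data

# Sparse storage keeps only the non-empty cells (the facts dict itself).

# Pre-computed summary: totals per (product, region), rolled up over months.
by_product_region = {
    (p, r): sum(dense[i][j])
    for (i, p), (j, r) in product(enumerate(products), enumerate(regions))
}
```

With a quarter of the slots used, the dense array wastes most of its space, while a lookup in by_product_region is a single indexed read of a stored summary.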
MOLAP vs. ROLAP
MOLAP | ROLAP
Information retrieval is fast. | Information retrieval is comparatively slow.
Uses sparse arrays to store data sets. | Uses relational tables.
Best suited for inexperienced users, since it is very easy to use. | Best suited for experienced users.
Maintains a separate database for data cubes. | May not require space other than that available in the data warehouse.
Hybrid OLAP (HOLAP)
■ Hybrid OLAP is a combination of both ROLAP and MOLAP. It offers the higher scalability of ROLAP and the faster computation of MOLAP.
■ HOLAP servers can store large volumes of detailed information, while the aggregations are stored separately in a MOLAP store.
Distributed Data Warehouse
■ A distributed data warehouse (DDW) is data shared across multiple data repositories for the purpose of OLAP. Each data warehouse may belong to one or many organizations. The sharing implies a common format or definition of data elements (e.g. using XML).
■ Distributed data warehousing encompasses a complete enterprise DW but has smaller data stores that are built separately and joined physically over a network, providing users with access to relevant reports without impacting performance.
■ A distributed DW, the nucleus of all enterprise data, sends relevant data to individual data marts from which users can access information for order management, customer billing, sales analysis, and other reporting and analytic functions.
Data Warehouse Manager
■ Collects data inputs from a variety of sources, including legacy operational systems, third-party data suppliers, and informal sources.
■ Assures the quality of these data inputs by correcting spelling, removing mistakes, eliminating null data, and combining multiple sources.
■ Releases the data from the data staging area to the individual data marts on a regular schedule.
■ Estimates and measures the costs and benefits.
Virtual Warehouse
■ The data warehouse is a great idea, but it is complex to build and requires investment. Why not use a cheap and fast approach that eliminates the transformation steps, the metadata repositories, and the additional database?
■ This approach is termed the 'virtual data warehouse'. To accomplish it, four kinds of information need to be defined:
  ■ A data dictionary containing the definitions of the various databases.
  ■ A description of the relationships among the data elements.
  ■ A description of the way users will interface with the system.
  ■ The algorithms and business rules that define what to do and how to do it.

Data warehouse physical design

  • 1. Er. Nawaraj Bhandari Data Warehouse/Data Mining Chapter 3: Data Warehouse Physical Design
  • 2. Physical Design Physical design is the phase of a database design following the logical design that identifies the actual database tables and index structures used to implement the logical design. In the physical design, you look at the most effective way of storing and retrieving the objects as well as handling them from a transportation and backup/recovery perspective.
  • 3. Physical design decisions are mainly driven by query performance and database maintenance aspects. During the logical design phase, you defined a model for your data warehouse consisting of entities, attributes, and relationships. The entities are linked together using relationships. Attributes are used to describe the entities. The unique identifier (UID) distinguishes between one instance of an entity and another.
  • 4. Figure: Logical Design Compared with Physical Design
  • 5. During the physical design process, you translate the expected schemas into actual database structures. At this time, you have to map: ■ Entities to tables ■ Relationships to foreign key constraints ■ Attributes to columns ■ Primary unique identifiers to primary key constraints ■ Unique identifiers to unique key constraints
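The mapping above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `customer` and `sales` entities and all column names are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customer (                -- entity -> table
    customer_id INTEGER PRIMARY KEY,   -- primary UID -> primary key constraint
    email       TEXT NOT NULL UNIQUE,  -- unique identifier -> unique key constraint
    name        TEXT NOT NULL          -- attribute -> column
);
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id),  -- relationship -> foreign key
    amount      REAL NOT NULL
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'a@example.com', 'Asha')")
conn.execute("INSERT INTO sales VALUES (10, 1, 250.0)")


def insert_orphan_sale():
    """Try to insert a sale for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO sales VALUES (11, 99, 75.0)")
        return True
    except sqlite3.IntegrityError:
        # The foreign key constraint rejects the row.
        return False
```

Calling `insert_orphan_sale()` shows the foreign key constraint doing the work that the relationship expressed in the logical model.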
  • 6. Physical Data Model Features of the physical data model include: Specification of all tables and columns. Specification of foreign keys. De-normalization may be performed if necessary. At this level, the specification of the logical data model is realized in the database.
  • 7. The steps for physical data model design involve: Conversion of entities into tables. Conversion of relationships into foreign keys. Conversion of attributes into columns. Changes to the physical data model based on the physical constraints.
  • 8. Figure: Logical model and physical model
  • 9. Physical Design Objectives Involves tradeoffs among: ■ Performance ■ Flexibility ■ Scalability ■ Ease of Administration ■ Data Integrity ■ Data Consistency ■ Data Availability ■ User Satisfaction
  • 10. Physical Design Structures Once you have converted your logical design to a physical one, you will need to create some or all of the following structures: ■ Tablespaces ■ Tables and Partitioned Tables ■ Views ■ Integrity Constraints ■ Dimensions Some of these structures require disk space. Others exist only in the data dictionary. Additionally, the following structures may be created for performance improvement: ■ Indexes and Partitioned Indexes ■ Materialized Views
  • 11. Tablespaces ■ A tablespace consists of one or more datafiles, which are physical structures within the operating system you are using. ■ A datafile is associated with only one tablespace. ■ From a design perspective, tablespaces are containers for physical design structures.
  • 12. Tables and Partitioned Tables ■ Tables are the basic unit of data storage. They are the container for the expected amount of raw data in your data warehouse. ■ Using partitioned tables instead of non-partitioned ones addresses the key problem of supporting very large data volumes by allowing you to divide them into smaller and more manageable pieces. ■ Partitioning large tables improves performance because a query that needs only some partitions can skip the rest, and each piece can be loaded, backed up, or dropped on its own.
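To illustrate why smaller pieces help, here is a minimal in-memory sketch of range partitioning by month. The `PartitionedSales` class and its methods are hypothetical; a real database does this inside its storage engine.

```python
from collections import defaultdict
from datetime import date


def partition_key(sale_date):
    """Range-partition by calendar month, e.g. '2024-03'."""
    return f"{sale_date.year:04d}-{sale_date.month:02d}"


class PartitionedSales:
    """Toy fact table split into one list of rows per month."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, sale_date, amount):
        self.partitions[partition_key(sale_date)].append((sale_date, amount))

    def total_for_month(self, year, month):
        # Only the matching partition is read; all others are skipped.
        rows = self.partitions.get(f"{year:04d}-{month:02d}", [])
        return sum(amount for _, amount in rows), len(rows)

    def drop_partition(self, year, month):
        # Removing old data is a single operation on one partition.
        return self.partitions.pop(f"{year:04d}-{month:02d}", None) is not None


store = PartitionedSales()
store.insert(date(2024, 1, 5), 100.0)
store.insert(date(2024, 1, 20), 50.0)
store.insert(date(2024, 2, 3), 70.0)
```

`store.total_for_month(2024, 1)` returns the total together with the number of rows it had to read, which is the size of one partition rather than of the whole table.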
  • 13. Views ■ A view is a tailored presentation of the data contained in one or more tables or other views. ■ A view takes the output of a query and treats it as a table. ■ Views store no data of their own; only the view definition is kept in the data dictionary.
  • 14. Improving Performance with the Use of Views Figure: a query runs against a view of selected rows or columns of Table 1, Table 2, and Table 3.
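  • 15. View ■ A view is a virtual table that can be queried like a real table. ■ Views are sometimes used as a way to improve performance. ■ Views can be used to combine tables, so that instead of joining tables in every query, the query just accesses the view. This makes queries quicker to write; the join itself still runs each time the view is queried unless the result is stored, as with the materialized views on slide 19.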
  • 16. View ■ We can run SQL statements against a view just as against a table, for example: DESC department_worker_view;
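A minimal sketch of such a view, using Python's `sqlite3` module. The view name `department_worker_view` is taken from the slide; the `department` and `worker` tables, their columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE worker (
    worker_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    dept_id   INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
INSERT INTO worker VALUES (1, 'Asha', 1), (2, 'Bikash', 1), (3, 'Chandra', 2);

-- The view hides the join; it stores only this definition, not the rows.
CREATE VIEW department_worker_view AS
SELECT w.worker_id, w.name, d.dept_name
FROM worker AS w
JOIN department AS d ON d.dept_id = w.dept_id;
""")

# Query the view exactly as if it were a table.
rows = conn.execute(
    "SELECT name FROM department_worker_view "
    "WHERE dept_name = 'Sales' ORDER BY name"
).fetchall()

# SQLite has no DESC command; PRAGMA table_info lists the view's columns instead.
columns = [row[1] for row in conn.execute("PRAGMA table_info(department_worker_view)")]
```

In Oracle or MySQL the last step would be the `DESC department_worker_view;` statement shown on the slide.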
  • 17. Integrity Constraints ■ Integrity constraints are used to enforce business rules associated with your database and to prevent invalid information from entering the tables. ■ In data warehousing environments, where data is usually validated during loading, constraints are used mainly for query rewrite. ■ NOT NULL constraints are particularly common in data warehouses.
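As a small illustration of constraints rejecting invalid rows, the sketch below declares a NOT NULL and a CHECK constraint in SQLite. The `sales_fact` table and its business rule (quantity must be positive) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE sales_fact (
    sale_id  INTEGER PRIMARY KEY,
    product  TEXT NOT NULL,                         -- value is mandatory
    quantity INTEGER NOT NULL CHECK (quantity > 0)  -- assumed business rule
)
""")


def try_insert(row):
    """Insert one row and report whether the constraints accepted it."""
    try:
        conn.execute("INSERT INTO sales_fact VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert((1, "widget", 3)),  # valid row
    try_insert((2, None, 1)),      # violates NOT NULL
    try_insert((3, "gadget", 0)),  # violates CHECK
]
```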
  • 18. Indexes and Partitioned Indexes ■ Indexes are optional structures associated with tables. ■ Indexes are just like tables in that you can partition them (but the partitioning strategy is not dependent upon the table structure). ■ Partitioning indexes makes it easier to manage the data warehouse during refresh and improves query performance.
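The effect of an index on query execution can be observed with SQLite's `EXPLAIN QUERY PLAN`. This sketch covers a plain (non-partitioned) index only, since SQLite does not support partitioned indexes; the `sales` table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (product_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)


def query_plan(sql):
    """Return the human-readable plan text (last column of each plan row)."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT amount FROM sales WHERE product_id = 42"

plan_before = query_plan(query)  # full table scan: no usable index yet
conn.execute("CREATE INDEX idx_sales_product ON sales (product_id)")
plan_after = query_plan(query)   # the plan now names idx_sales_product
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the index name than to compare whole strings.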
  • 19. Materialized Views ■ Materialized views are query results that have been stored in advance so long-running calculations are not necessary when you actually execute your SQL statements. ■ From a physical design point of view, materialized views resemble tables or partitioned tables and behave like indexes in that they are used transparently and improve performance.
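SQLite has no materialized views, so the sketch below only emulates the idea with a summary table that is rebuilt on demand. The table names `sales` and `sales_summary_mv` and the refresh function are hypothetical. Unlike a true materialized view, the optimizer will not use this table transparently; queries must name it explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (product TEXT NOT NULL, amount REAL NOT NULL);
INSERT INTO sales VALUES ('widget', 10.0), ('widget', 15.0), ('gadget', 7.5);
""")


def refresh_sales_summary():
    """Rebuild the stored aggregate from scratch (a complete refresh)."""
    conn.executescript("""
    DROP TABLE IF EXISTS sales_summary_mv;
    CREATE TABLE sales_summary_mv AS
    SELECT product, SUM(amount) AS total_amount, COUNT(*) AS sale_count
    FROM sales
    GROUP BY product;
    """)


def read_summary():
    """Read the precomputed totals without touching the detail rows."""
    return dict(conn.execute("SELECT product, total_amount FROM sales_summary_mv"))


refresh_sales_summary()
summary = read_summary()
```

Rows added to `sales` after a refresh do not appear in the summary until `refresh_sales_summary()` runs again, which is the staleness trade-off that any precomputed result carries.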
  • 20. Data Warehouse: A Multi-Tiered Architecture Figure: (1) Data Storage: data from operational DBs and other sources is extracted, transformed, loaded, and refreshed into the data warehouse server and data marts, supported by a monitor & integrator and metadata. (2) OLAP Engine: OLAP server (ROLAP server, MOLAP server). (3) Front-End Tools: analysis, query/reports, data mining.
  • 21. ETL (Extract-Transform-Load) ■ ETL comes from data warehousing and stands for Extract-Transform-Load. ETL covers the process of loading data from the source system into the data warehouse. ■ ETL now commonly includes cleaning as a separate step. The sequence is then Extract-Clean-Transform-Load.
  • 22. Extract ■ The Extract step covers the data extraction from the source system and makes it accessible for further processing. ■ The main objective of the extract step is to retrieve all the required data from the source system with as few resources as possible. ■ The extract step should be designed so that it does not negatively affect the source system in terms of performance, response time, or any kind of locking.
  • 23. Extract There are several ways to perform the extract: ■ Update notification - if the source system is able to provide a notification that a record has been changed and describe the change, this is the easiest way to get the data. ■ Incremental extract - some systems may not be able to provide notification that an update has occurred, but they are able to identify which records have been modified and provide an extract of such records. During further ETL steps, the system needs to identify the changes and propagate them downstream. Note that with a daily incremental extract, we may not be able to handle deleted records properly. ■ Full extract - some systems are not able to identify which data has been changed at all, so a full extract is the only way to get the data out of the system. The full extract requires keeping a copy of the last extract in the same format in order to be able to identify changes. A full extract handles deletions as well.
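The full-extract approach can be sketched as a comparison between the previous and the current snapshot. The function name and the record layout (a dict keyed by primary key) are assumptions made for illustration.

```python
def diff_snapshots(previous, current):
    """Compare two full extracts, each a dict mapping primary key -> record.

    Returns three dicts: inserted, updated, and deleted records.
    """
    inserted = {k: current[k] for k in current.keys() - previous.keys()}
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    updated = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return inserted, updated, deleted


yesterday = {1: ("Asha", "Kathmandu"), 2: ("Bikash", "Pokhara"), 3: ("Chandra", "Butwal")}
today = {1: ("Asha", "Lalitpur"), 2: ("Bikash", "Pokhara"), 4: ("Dipa", "Dharan")}

inserted, updated, deleted = diff_snapshots(yesterday, today)
```

Record 3 is missing from today's extract, so it is reported as deleted. An incremental extract that lists only modified records would not reveal that deletion.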
  • 24. Clean The cleaning step is one of the most important, as it ensures the quality of the data in the data warehouse. Cleaning should apply basic data unification rules, such as: ■ Making identifiers unique (sex categories Male/Female/Unknown, M/F/null, Man/Woman/Not Available are translated to standard Male/Female/Unknown). ■ Converting null values into a standardized Not Available/Not Provided value. ■ Converting phone numbers and ZIP codes to a standardized form. ■ Validating address fields and converting them into proper naming, e.g. Street/St/St./Str./Str. ■ Validating address fields against each other (State/Country, City/State, City/ZIP code, City/Street).
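A few of these unification rules can be sketched as small Python functions. The function names, the target values, and the digits-only phone format are assumptions; real cleaning rules depend on the source systems.

```python
import re

# Map every known spelling of the sex category to the standard value.
GENDER_MAP = {
    "male": "Male", "m": "Male", "man": "Male",
    "female": "Female", "f": "Female", "woman": "Female",
}

# Matches the abbreviations St, St., Str, Str. when they end a word.
STREET_SUFFIX = re.compile(r"\b(St|Str)\.?(?=\s|$)", re.IGNORECASE)


def clean_gender(value):
    """Translate M/F/null, Man/Woman/Not Available, etc. to Male/Female/Unknown."""
    if value is None:
        return "Unknown"
    return GENDER_MAP.get(str(value).strip().lower(), "Unknown")


def clean_text(value):
    """Replace null or blank values with a standardized placeholder."""
    if value is None or not str(value).strip():
        return "Not Available"
    return str(value).strip()


def clean_phone(value):
    """Keep digits only (an assumed standard form)."""
    if value is None:
        return "Not Available"
    digits = re.sub(r"\D", "", str(value))
    return digits or "Not Available"


def clean_street(value):
    """Expand street abbreviations to the full word 'Street'."""
    return STREET_SUFFIX.sub("Street", clean_text(value))
```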
  • 25. Transform The transform step applies a set of rules to transform the data from the source to the target. This includes converting any measured data to the same dimension (i.e. conformed dimension) using the same units so that they can later be joined. The transformation step also requires joining data from several sources, generating aggregates, generating surrogate keys (system-generated keys that stand in for the source systems' own keys), sorting, deriving new calculated values, and applying advanced validation rules.
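Three of these transformations (unit conversion, surrogate key generation, and aggregation) can be sketched together. The row layout, the source system names, and the `SurrogateKeys` class are hypothetical.

```python
from collections import defaultdict
from itertools import count

# Conversion factors to the common unit (kilograms).
UNIT_TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}


class SurrogateKeys:
    """Assign one warehouse-generated integer key per (source, natural key) pair."""

    def __init__(self):
        self._next = count(1)
        self._keys = {}

    def key_for(self, source_system, natural_key):
        pair = (source_system, natural_key)
        if pair not in self._keys:
            self._keys[pair] = next(self._next)
        return self._keys[pair]


def transform(rows, keys):
    """Convert weights to kg and aggregate them per surrogate key.

    Each row is (source_system, product_code, weight, unit).
    """
    totals = defaultdict(float)
    for source, product_code, weight, unit in rows:
        surrogate = keys.key_for(source, product_code)
        totals[surrogate] += weight * UNIT_TO_KG[unit]
    return dict(totals)


keys = SurrogateKeys()
rows = [
    ("erp", "P1", 2.0, "kg"),
    ("shop", "P1", 500.0, "g"),  # same code, different source system
    ("erp", "P1", 1.0, "kg"),
]
totals_kg = transform(rows, keys)
```

The two sources both use the code `P1`, but they receive different surrogate keys, because a natural key is only unique within its own source system.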
  • 26. OLAP Server Architectures Types of OLAP Servers ■ Relational OLAP (ROLAP) ■ Multidimensional OLAP (MOLAP) ■ Hybrid OLAP (HOLAP)
  • 27. Relational OLAP (ROLAP) ■ Relational OLAP servers are placed between the relational back-end server and the client front-end tools. To store and manage the warehouse data, relational OLAP uses a relational or extended-relational DBMS. ■ ROLAP servers can be easily used with an existing RDBMS. ■ ROLAP tools do not use pre-calculated data cubes.
  • 28. Multidimensional OLAP (MOLAP) ■ Multidimensional OLAP (MOLAP) uses array-based multidimensional storage engines for multidimensional views of data. With multidimensional data stores, the storage utilization may be low if the data set is sparse. Therefore, many MOLAP servers use two levels of data storage representation to handle dense and sparse data sets. ■ MOLAP allows fast indexing to the pre-computed summarized data. ■ MOLAP is easier to use and is therefore suitable for inexperienced users.
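The storage problem with sparse data can be illustrated with a toy cube that stores only the non-empty cells. This sketches the sparse level alone and is not how any particular MOLAP server is implemented; the `SparseCube` class and the dimension sizes are hypothetical.

```python
class SparseCube:
    """Multidimensional cube that keeps only cells holding a value."""

    def __init__(self, shape):
        self.shape = shape  # size of each dimension, e.g. (products, stores, days)
        self.cells = {}     # coordinate tuple -> measure value

    def set(self, coords, value):
        self.cells[coords] = value

    def get(self, coords):
        # Empty cells are read as zero instead of being stored.
        return self.cells.get(coords, 0.0)

    def density(self):
        """Fraction of possible cells that actually hold data."""
        total = 1
        for size in self.shape:
            total *= size
        return len(self.cells) / total

    def rollup(self, axis):
        """Aggregate (sum) over one dimension, returning a smaller cube as a dict."""
        result = {}
        for coords, value in self.cells.items():
            key = coords[:axis] + coords[axis + 1:]
            result[key] = result.get(key, 0.0) + value
        return result


cube = SparseCube(shape=(1000, 50, 365))
cube.set((0, 0, 0), 12.0)
cube.set((0, 0, 1), 8.0)
cube.set((7, 3, 0), 5.0)
```

A dense array for this shape would reserve 18,250,000 cells to hold three values, which is the low storage utilization the slide refers to.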
  • 29. MOLAP vs. ROLAP ■ Retrieval speed: MOLAP information retrieval is fast; ROLAP information retrieval is comparatively slow. ■ Storage: MOLAP uses sparse arrays to store data sets; ROLAP uses relational tables. ■ Users: MOLAP is best suited for inexperienced users, since it is very easy to use; ROLAP is best suited for experienced users. ■ Space: MOLAP maintains a separate database for data cubes; ROLAP may not require space other than that available in the data warehouse.
  • 30. Hybrid OLAP (HOLAP) ■ Hybrid OLAP is a combination of both ROLAP and MOLAP. It offers the higher scalability of ROLAP and the faster computation of MOLAP. ■ HOLAP servers can store large volumes of detailed information. The aggregations are stored separately in a MOLAP store.
  • 31. Distributed Data Warehouse ■ In a distributed data warehouse (DDW), data is shared across multiple data repositories for the purpose of OLAP. Each data warehouse may belong to one or many organizations. The sharing implies a common format or definition of data elements (e.g. using XML). ■ Distributed data warehousing encompasses a complete enterprise DW but has smaller data stores that are built separately and joined physically over a network, providing users with access to relevant reports without impacting performance. ■ A distributed DW, the nucleus of all enterprise data, sends relevant data to individual data marts from which users can access information for order management, customer billing, sales analysis, and other reporting and analytic functions.
  • 32. Data Warehouse Manager ■ Collects data inputs from a variety of sources, including legacy operational systems, third-party data suppliers, and informal sources. ■ Assures the quality of these data inputs by correcting spelling, removing mistakes, eliminating null data, and combining multiple sources. ■ Releases the data from the data staging area to the individual data marts on a regular schedule. ■ Estimates and measures the costs and benefits.
  • 33. Virtual Warehouse ■ The data warehouse is a great idea, but it is complex to build and requires investment. Why not use a cheap and fast approach by eliminating the transformation steps and the separate repositories for metadata and for another database? ■ This approach is termed the 'virtual data warehouse'. To accomplish this, four kinds of information need to be defined: ■ A data dictionary containing the definitions of the various databases. ■ A description of the relationships among the data elements. ■ A description of the way users will interface with the system. ■ The algorithms and business rules that define what to do and how to do it.