SlideShare ist ein Scribd-Unternehmen logo
1 von 38
Shikha Gautam
Asst.Professor
 The Data warehouse is an environment, not a product.
 It is an architectural construct of an information system that
provides users with current and historical decision support
information that is hard to access or present in traditional
operational data store.
 Data warehousing is a blend of technologies and components
aimed at effective integration of operation database into an
environment that enables strategic use of data.
 These technologies include relational and multi-dimensional
database management system, client/ server architecture,
meta-data modeling and repositories, graphical user interface
etc.
“A data warehouse is a subject-oriented, integrated, time-variant,
and nonvolatile collection of data in support of management’s
decision-making process.”—W. H. Inmon
 Data warehousing is the process of constructing and using
data warehouses.
 Focusing on the modeling and analysis of data for decision makers,
not on daily operations or transaction processing.
 Organized around major subjects, such as customer, product,
sales.
 Provide a simple and concise view around particular subject.
 The time horizon for the data warehouse is significantly
longer than that of operational systems:
 Operational database: current value data
 Data warehouse data: provide information from a
historical perspective (e.g., past 5-10 years)
 Every key structure in the data warehouse contains an
element of time, explicitly or implicitly.
 A physically separate store of data transformed from the
operational environment.
 Does not require transaction processing, recovery, and
concurrency control mechanisms
 Requires only two operations in data accessing:
▪ initial loading of data and access of data
DBMS— tuned for OLTP: access methods, indexing,
concurrency control, recovery.
Warehouse—tuned for OLAP: complex OLAP queries,
multidimensional view, consolidation.
OLTP OLAP
users clerk, IT professional knowledge worker
function day to day operations decision support
DB design application-oriented subject-oriented
data current, up-to-date
detailed, flat relational
isolated
historical,
summarized, multidimensional
integrated, consolidated
usage repetitive ad-hoc
access read/write
index/hash on prim. key
lots of scans
unit of work short, simple transaction complex query
# records accessed tens millions
#users thousands hundreds
DB size 100MB-GB 100GB-TB
metric transaction throughput query throughput, response
Data Warehouse: A Multi-Tiered Architecture
Data
Warehouse
Extract
Transform
Load
Refresh
OLAP Engine
Analysis
Query
Reports
Data mining
Monitor
&
Integrator
Metadata
Data Sources Front-End Tools
Serve
Data Marts
Operational
DBs
Other
sources
Data Storage
OLAP Server
 A data warehouse is based on a multidimensional data model
which views data in the form of a data cube.
 A data cube, such as sales, allows data to be modeled and
viewed in multiple dimensions
 Dimension tables, such as item (item_name, brand, type),
or time(day, week, month, quarter, year)
 Fact table contains measures (such as dollars_sold) and
keys to each of the related dimension tables
time,item
time,item,location
time, item, location, supplier
all
time item location supplier
time,location
time,supplier
item,location
item,supplier
location,supplier
time,item,supplier
time,location,supplier
item,location,supplier
0-D(apex) cuboid
1-D cuboids
2-D cuboids
3-D cuboids
4-D(base) cuboid
 An n-D base cube is called a base cuboid.
 The top most 0-D cuboid, which holds the
highest-level of summarization, is called the
apex cuboid.
 The lattice of cuboids forms a data cube.
 Data warehouse have different schemas as:
1. Star schema: A fact table in the middle connected
to a set of dimension tables.
time_key
day
day_of_the_week
month
quarter
year
time
location_key
street
city
state_or_province
country
location
Sales Fact Table
time_key
item_key
branch_key
location_key
units_sold
dollars_sold
avg_sales
Measures
item_key
item_name
brand
type
supplier_type
item
branch_key
branch_name
branch_type
branch
2. Snowflake schema: A refinement of star schema where
some dimensional hierarchy is normalized into a set of
smaller dimension tables, forming a shape similar to
snowflake.
time_key
day
day_of_the_week
month
quarter
year
time
location_key
street
city_key
location
Sales Fact Table
time_key
item_key
branch_key
location_key
units_sold
dollars_sold
avg_sales
Measures
item_key
item_name
brand
type
supplier_key
item
branch_key
branch_name
branch_type
branch
supplier_key
supplier_type
supplier
city_key
city
state_or_province
country
city
3. Fact constellations: Multiple fact tables share dimension
tables, viewed as a collection of stars, therefore called
galaxy schema.
31
time_key
day
day_of_the_week
month
quarter
year
time
location_key
street
city
province_or_state
country
location
Sales Fact Table
time_key
item_key
branch_key
location_key
units_sold
dollars_sold
avg_sales
Measures
item_key
item_name
brand
type
supplier_type
item
branch_key
branch_name
branch_type
branch
Shipping Fact Table
time_key
item_key
shipper_key
from_location
to_location
dollars_cost
units_shipped
shipper_key
shipper_name
location_key
shipper_type
shipper
 Distributive: if the result derived by applying the function to n
aggregate values is the same as that derived by applying the
function on all the data without partitioning.
▪ E.g., count(), sum(), min(), max()
 Algebraic: if it can be computed by an algebraic function with
M arguments (where M is a bounded integer).
 E.g., avg(), min N(), standard_deviation()
 Holistic: if there is no constant bound on the storage size
needed to describe a sub-aggregate.
▪ E.g., median(), mode(), rank()
There are following seven components of a Data
Warehouse:
 Data Warehouse Database
 Sourcing, Acquisition, Cleanup andTransformation
Tools
 Meta Data
 Access (Query)Tools :The query tool allows executives
and other users real-time access to the DataWarehouse
database for query generation, result displays, reports
and data exports.
 Data Marts
 Data Warehouse Administration and Management
 Information Delivery System
1. Business Considerations- Identifying and establishing
information requirements.
2. Design Considerations- Data Content, Metadata, Data
Distribution, Tools, Performance consideration.
3. Technical Considerations- Hardware Platform, DBMS that
supports the warehouse, Communication infrastructure,
hardware platform and software to support the metadata
repository, Systems management framework.
4. Implementation Considerations- Access Tools, Data
Extraction, Cleanup, Transformation, and Migration etc.
The size of a data warehouse rapidly approaches to the
search for better performance. This search aims to pursue
two goals:
 Speed-up: the ability to execute the same request on
the same amount data in less time.
 Scale-up: the ability to obtain the same performance
on the same request as the database size increases.
Doubling the number of processors cuts the response
time in half (linear speed-up) or provides the same
performance on twice as much data (linear scale-up).
 The goals of linear performance and scalability can be
satisfied by parallel hardware architectures, parallel
operating systems, and parallel DBMSs.
 Parallel hardware architectures are based on Multi-
processor systems designed as a Shared-memory model
(symmetric multiprocessors), Shared-disk model or
distributed-memory model (MPP and Clusters of SMPs).
Parallelism can be achieved in two different ways:
1. Horizontal Parallelism (Database is partitioned across
different disks)
2. Vertical Parallelism (occurs among different tasks – all
components query operations i.e. scans, join, sort)
3. Data Partitioning.
 Shared memory -- processors share a
common memory.
 Shared disk -- processors share a common
disk.
 Shared nothing -- processors share neither a
common memory nor common disk.
 Hierarchical -- hybrid of the above
architectures.Also known as Combined
Architecture .
 ORACLE – Oracle supports Parallel Database processing with
its add-on.
 Informix –Support Shared-Memory, Shared-Disk, and
Shared-Nothing Models.
 IBM – DB2 Parallel Edition (DB2 PE), DB2 Universal
Database.
 Sybase – Sybase implemented its parallel DBMS functionality in
a product called SYBASE MPP.
 Other RDBMS Products : i. NCR Teradata ii. Tandem NonStop
SQL/MP.

Weitere ähnliche Inhalte

Was ist angesagt?

Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data modelmoni sindhu
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schemaSayed Ahmed
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process Shuvra Ghosh
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
Summary introduction to data engineering
Summary introduction to data engineeringSummary introduction to data engineering
Summary introduction to data engineeringNovita Sari
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecturepcherukumalla
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseShanthi Mukkavilli
 

Was ist angesagt? (20)

Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
ETL Process
ETL ProcessETL Process
ETL Process
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schema
 
Data warehouse
Data warehouse Data warehouse
Data warehouse
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process
 
Introduction to ETL and Data Integration
Introduction to ETL and Data IntegrationIntroduction to ETL and Data Integration
Introduction to ETL and Data Integration
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
Parallel Database
Parallel DatabaseParallel Database
Parallel Database
 
Summary introduction to data engineering
Summary introduction to data engineeringSummary introduction to data engineering
Summary introduction to data engineering
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Data integration
Data integrationData integration
Data integration
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
Big Data Ecosystem
Big Data EcosystemBig Data Ecosystem
Big Data Ecosystem
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 

Ähnlich wie Data Warehousing

Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouseKrish_ver2
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptSubrata Kumer Paul
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4ambujm
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3ambujm
 
Dataware house multidimensionalmodelling
Dataware house multidimensionalmodellingDataware house multidimensionalmodelling
Dataware house multidimensionalmodellingmeghu123
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olapSalah Amean
 
Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptMutiaSari53
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingPrithwis Mukerjee
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptxjainyshah20
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forAyushMeraki1
 
Data Warehouse and Architecture, OLAP Operation
Data Warehouse and Architecture, OLAP OperationData Warehouse and Architecture, OLAP Operation
Data Warehouse and Architecture, OLAP OperationShivarkarSandip
 

Ähnlich wie Data Warehousing (20)

Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
Dataware house multidimensionalmodelling
Dataware house multidimensionalmodellingDataware house multidimensionalmodelling
Dataware house multidimensionalmodelling
 
Datawarehouse and OLAP
Datawarehouse and OLAPDatawarehouse and OLAP
Datawarehouse and OLAP
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
 
Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.ppt
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in Datawarehousing
 
Cs1011 dw-dm-1
Cs1011 dw-dm-1Cs1011 dw-dm-1
Cs1011 dw-dm-1
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptx
 
Data warehouse logical design
Data warehouse logical designData warehouse logical design
Data warehouse logical design
 
Date Analysis .pdf
Date Analysis .pdfDate Analysis .pdf
Date Analysis .pdf
 
Chpt2.ppt
Chpt2.pptChpt2.ppt
Chpt2.ppt
 
My2dw
My2dwMy2dw
My2dw
 
2. olap warehouse
2. olap warehouse2. olap warehouse
2. olap warehouse
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining for
 
Data Warehouse and Architecture, OLAP Operation
Data Warehouse and Architecture, OLAP OperationData Warehouse and Architecture, OLAP Operation
Data Warehouse and Architecture, OLAP Operation
 

Mehr von SHIKHA GAUTAM

Agreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemoryAgreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemorySHIKHA GAUTAM
 
Distributed Mutual Exclusion and Distributed Deadlock Detection
Distributed Mutual Exclusion and Distributed Deadlock DetectionDistributed Mutual Exclusion and Distributed Deadlock Detection
Distributed Mutual Exclusion and Distributed Deadlock DetectionSHIKHA GAUTAM
 
Distributed Systems Introduction and Importance
Distributed Systems Introduction and Importance Distributed Systems Introduction and Importance
Distributed Systems Introduction and Importance SHIKHA GAUTAM
 
Type conversion in c
Type conversion in cType conversion in c
Type conversion in cSHIKHA GAUTAM
 
3. basic organization of a computer
3. basic organization of a computer3. basic organization of a computer
3. basic organization of a computerSHIKHA GAUTAM
 
Generations of computer
Generations of computerGenerations of computer
Generations of computerSHIKHA GAUTAM
 
Dbms Introduction and Basics
Dbms Introduction and BasicsDbms Introduction and Basics
Dbms Introduction and BasicsSHIKHA GAUTAM
 

Mehr von SHIKHA GAUTAM (16)

Agreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemoryAgreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared Memory
 
Distributed Mutual Exclusion and Distributed Deadlock Detection
Distributed Mutual Exclusion and Distributed Deadlock DetectionDistributed Mutual Exclusion and Distributed Deadlock Detection
Distributed Mutual Exclusion and Distributed Deadlock Detection
 
Distributed Systems Introduction and Importance
Distributed Systems Introduction and Importance Distributed Systems Introduction and Importance
Distributed Systems Introduction and Importance
 
Unit 4
Unit 4Unit 4
Unit 4
 
Unit v
Unit vUnit v
Unit v
 
Unit iii
Unit iiiUnit iii
Unit iii
 
Unit ii_KCS201
Unit ii_KCS201Unit ii_KCS201
Unit ii_KCS201
 
Type conversion in c
Type conversion in cType conversion in c
Type conversion in c
 
C intro
C introC intro
C intro
 
4. algorithm
4. algorithm4. algorithm
4. algorithm
 
3. basic organization of a computer
3. basic organization of a computer3. basic organization of a computer
3. basic organization of a computer
 
Generations of computer
Generations of computerGenerations of computer
Generations of computer
 
c_programming
c_programmingc_programming
c_programming
 
Data Mining
Data MiningData Mining
Data Mining
 
Dbms Introduction and Basics
Dbms Introduction and BasicsDbms Introduction and Basics
Dbms Introduction and Basics
 
DBMS
DBMSDBMS
DBMS
 

Kürzlich hochgeladen

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 

Kürzlich hochgeladen (20)

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 

Data Warehousing

  • 2.  The Data warehouse is an environment, not a product.  It is an architectural construct of an information system that provides users with current and historical decision support information that is hard to access or present in traditional operational data store.  Data warehousing is a blend of technologies and components aimed at effective integration of operation database into an environment that enables strategic use of data.  These technologies include relational and multi-dimensional database management system, client/ server architecture, meta-data modeling and repositories, graphical user interface etc.
  • 3. “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.”—W. H. Inmon  Data warehousing is the process of constructing and using data warehouses.  Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.
  • 4.  Organized around major subjects, such as customer, product, sales.  Provide a simple and concise view around particular subject.
  • 5.  The time horizon for the data warehouse is significantly longer than that of operational systems:  Operational database: current value data  Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)  Every key structure in the data warehouse contains an element of time, explicitly or implicitly.
  • 6.  A physically separate store of data transformed from the operational environment.  Does not require transaction processing, recovery, and concurrency control mechanisms  Requires only two operations in data accessing: ▪ initial loading of data and access of data
  • 7. DBMS— tuned for OLTP: access methods, indexing, concurrency control, recovery. Warehouse—tuned for OLAP: complex OLAP queries, multidimensional view, consolidation.
  • 8. OLTP OLAP users clerk, IT professional knowledge worker function day to day operations decision support DB design application-oriented subject-oriented data current, up-to-date detailed, flat relational isolated historical, summarized, multidimensional integrated, consolidated usage repetitive ad-hoc access read/write index/hash on prim. key lots of scans unit of work short, simple transaction complex query # records accessed tens millions #users thousands hundreds DB size 100MB-GB 100GB-TB metric transaction throughput query throughput, response
  • 9. Data Warehouse: A Multi-Tiered Architecture Data Warehouse Extract Transform Load Refresh OLAP Engine Analysis Query Reports Data mining Monitor & Integrator Metadata Data Sources Front-End Tools Serve Data Marts Operational DBs Other sources Data Storage OLAP Server
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.  A data warehouse is based on a multidimensional data model which views data in the form of a data cube.  A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions  Dimension tables, such as item (item_name, brand, type), or time(day, week, month, quarter, year)  Fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables
  • 24. time,item time,item,location time, item, location, supplier all time item location supplier time,location time,supplier item,location item,supplier location,supplier time,item,supplier time,location,supplier item,location,supplier 0-D(apex) cuboid 1-D cuboids 2-D cuboids 3-D cuboids 4-D(base) cuboid
  • 25.  An n-D base cube is called a base cuboid.  The top most 0-D cuboid, which holds the highest-level of summarization, is called the apex cuboid.  The lattice of cuboids forms a data cube.
  • 26.  Data warehouse have different schemas as: 1. Star schema: A fact table in the middle connected to a set of dimension tables.
  • 28. 2. Snowflake schema: A refinement of star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to snowflake.
  • 30. 3. Fact constellations: Multiple fact tables share dimension tables, viewed as a collection of stars, therefore called galaxy schema.
  • 32.  Distributive: if the result derived by applying the function to n aggregate values is the same as that derived by applying the function on all the data without partitioning. ▪ E.g., count(), sum(), min(), max()  Algebraic: if it can be computed by an algebraic function with M arguments (where M is a bounded integer).  E.g., avg(), min N(), standard_deviation()  Holistic: if there is no constant bound on the storage size needed to describe a sub-aggregate. ▪ E.g., median(), mode(), rank()
  • 33. There are following seven components of a Data Warehouse:  Data Warehouse Database  Sourcing, Acquisition, Cleanup andTransformation Tools  Meta Data  Access (Query)Tools :The query tool allows executives and other users real-time access to the DataWarehouse database for query generation, result displays, reports and data exports.  Data Marts  Data Warehouse Administration and Management  Information Delivery System
  • 34. 1. Business Considerations- Identifying and establishing information requirements. 2. Design Considerations- Data Content, Metadata, Data Distribution, Tools, Performance consideration. 3. Technical Considerations- Hardware Platform, DBMS that supports the warehouse, Communication infrastructure, hardware platform and software to support the metadata repository, Systems management framework. 4. Implementation Considerations- Access Tools, Data Extraction, Cleanup, Transformation, and Migration etc.
  • 35. The size of a data warehouse rapidly approaches to the search for better performance. This search aims to pursue two goals:  Speed-up: the ability to execute the same request on the same amount data in less time.  Scale-up: the ability to obtain the same performance on the same request as the database size increases. Doubling the number of processors cuts the response time in half (linear speed-up) or provides the same performance on twice as much data (linear scale-up).
  • 36.  The goals of linear performance and scalability can be satisfied by parallel hardware architectures, parallel operating systems, and parallel DBMSs.  Parallel hardware architectures are based on Multi- processor systems designed as a Shared-memory model (symmetric multiprocessors), Shared-disk model or distributed-memory model (MPP and Clusters of SMPs). Parallelism can be achieved in two different ways: 1. Horizontal Parallelism (Database is partitioned across different disks) 2. Vertical Parallelism (occurs among different tasks – all components query operations i.e. scans, join, sort) 3. Data Partitioning.
  • 37.  Shared memory -- processors share a common memory.  Shared disk -- processors share a common disk.  Shared nothing -- processors share neither a common memory nor common disk.  Hierarchical -- hybrid of the above architectures.Also known as Combined Architecture .
  • 38.  ORACLE – Oracle supports Parallel Database processing with its add-on.  Informix –Support Shared-Memory, Shared-Disk, and Shared-Nothing Models.  IBM – DB2 Parallel Edition (DB2 PE), DB2 Universal Database.  Sybase – Sybase implemented its parallel DBMS functionality in a product called SYBASE MPP.  Other RDBMS Products : i. NCR Teradata ii. Tandem NonStop SQL/MP.