2. What is a Data Warehouse?
It is a dimensional database designed for Business
Intelligence activities and other large-data-set
processing systems
Relational database: no historical data
Data warehouse: historical data, multidimensional
3. Characteristics of a Data Warehouse
Primary characteristics
• Subject oriented
• Integrated
• Non-volatile
• Time variant
4. Secondary Characteristics of DW
• Fast query performance with high data
throughput
• Should support predefined and ad hoc queries
• Historical data, integrated from multiple
sources
5. Why DW?
Common business questions that are hard to answer on a relational OLTP system
Operational efficiency
– ERP reporting
– KPI tracking
– Risk management and balanced scorecard
Customer interaction
– Sales Analysis and forecasting
– CRM analysis and campaign planning
– Customer profitability
Historical data analysis and data mining
7. DW Architecture Components
• Data Sources (operational systems and flat files, which
have no structured relationships)
• Staging Area (where source data lands before it is loaded into the
warehouse)
• Warehouse (metadata, summary data, and raw data)
• Data Marts (purchasing, sales, and inventory)
• Users (analysis, reporting, and mining)
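The flow through these components can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration under stated assumptions: the table names (staging_sales, dw_sales, dw_sales_summary, mart_sales) and the sample rows are hypothetical, and a real warehouse would use a dedicated ETL tool and database.

```python
import sqlite3

# Hypothetical source rows, standing in for an operational system or flat file.
source_rows = [
    ("2014-10-30", "Bag", "Pokhara", 2, 1500.0),
    ("2014-10-31", "Bag", "Pokhara", 1, 750.0),
    ("2014-10-31", "Shoe", "Kathmandu", 3, 6000.0),
]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging area: a raw copy of the source data, loaded as-is.
cur.execute(
    "CREATE TABLE staging_sales "
    "(sale_date TEXT, item TEXT, location TEXT, qty INTEGER, amount REAL)"
)
cur.executemany("INSERT INTO staging_sales VALUES (?, ?, ?, ?, ?)", source_rows)

# Warehouse: raw (detail) data plus summary data.
cur.execute("CREATE TABLE dw_sales AS SELECT * FROM staging_sales")
cur.execute(
    "CREATE TABLE dw_sales_summary AS "
    "SELECT item, SUM(qty) AS total_qty, SUM(amount) AS total_amount "
    "FROM dw_sales GROUP BY item"
)

# Data mart: a subject-specific view (here, sales revenue per location).
cur.execute(
    "CREATE VIEW mart_sales AS "
    "SELECT location, SUM(amount) AS revenue FROM dw_sales GROUP BY location"
)

# Users: analysis and reporting queries run against the warehouse and marts.
summary = cur.execute(
    "SELECT item, total_qty, total_amount FROM dw_sales_summary ORDER BY item"
).fetchall()
mart = cur.execute(
    "SELECT location, revenue FROM mart_sales ORDER BY location"
).fetchall()
print(summary)
print(mart)
```

The mart is a view here only to keep the sketch short; in practice it is often a separate physical store.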
8. OLTP vs. OLAP
Comparison measure     OLTP                        OLAP
Source of data         Operational system          OLTP, operational
Purpose of data        Daily business tasks        Planning, analysis, DSS
What the data shows    Ongoing business snapshot   Multidimensional view of business
Insert and update      Fast                        Batch processing
Queries                Simple queries              Complex queries
Processing speed       Very fast                   Relatively slow
Space requirement      Small                       Very large
Database design        Highly normalized           Highly denormalized; star and snowflake schemas
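The difference in query style can be illustrated with a small sqlite3 sketch. All table and column names (orders, fact_sales, dim_item, dim_location) and the sample data are hypothetical; the point is only to contrast a simple keyed lookup with an aggregate over a star schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP style: one row per order, read and updated by primary key.
cur.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT)")
cur.execute("INSERT INTO orders VALUES (1, 'new')")
cur.execute("UPDATE orders SET status = 'shipped' WHERE order_id = 1")
oltp_result = cur.execute(
    "SELECT status FROM orders WHERE order_id = 1"
).fetchone()[0]

# OLAP style: a star schema with one fact table and two dimension tables.
cur.execute("CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_name TEXT)")
cur.execute("CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT)")
cur.execute(
    "CREATE TABLE fact_sales "
    "(item_id INTEGER, location_id INTEGER, sale_year INTEGER, amount REAL)"
)
cur.executemany("INSERT INTO dim_item VALUES (?, ?)", [(1, "Bag"), (2, "Shoe")])
cur.executemany(
    "INSERT INTO dim_location VALUES (?, ?)", [(1, "Pokhara"), (2, "Kathmandu")]
)
cur.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 1, 2013, 500.0), (1, 1, 2014, 750.0), (2, 2, 2014, 2000.0), (1, 2, 2014, 250.0)],
)

# Complex query: join the fact table to its dimensions and aggregate.
olap_result = cur.execute(
    """
    SELECT l.city, f.sale_year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_item i ON i.item_id = f.item_id
    JOIN dim_location l ON l.location_id = f.location_id
    WHERE i.item_name = 'Bag'
    GROUP BY l.city, f.sale_year
    ORDER BY l.city, f.sale_year
    """
).fetchall()
print(oltp_result)
print(olap_result)
```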
9. Operations on OLAP
• Drill down
• Roll up
• Slicing
• Dicing
[Figure: date hierarchy Date(Year), Date(Month), Date(Day) for roll up and drill down; dimensions Item, Location, Day; example cell: Bag, Pokhara, 10/31/2014]
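The four operations can be sketched in plain Python over a tiny fact list. The rows and the aggregate helper below are hypothetical and chosen only for illustration; an OLAP engine would perform the same steps over a cube or star schema.

```python
from collections import defaultdict

# Hypothetical fact rows: (item, location, date as YYYY-MM-DD, amount).
facts = [
    ("Bag", "Pokhara", "2014-10-30", 100),
    ("Bag", "Pokhara", "2014-10-31", 150),
    ("Bag", "Kathmandu", "2014-10-31", 200),
    ("Shoe", "Pokhara", "2014-11-01", 300),
]

def aggregate(rows, key):
    """Sum the amount column, grouped by whatever key(row) returns."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

# Roll up: summarize from day level up to month level.
by_month = aggregate(facts, lambda r: r[2][:7])

# Drill down: go back from month level to day level.
by_day = aggregate(facts, lambda r: r[2])

# Slice: fix a single dimension (location = Pokhara).
slice_pokhara = [r for r in facts if r[1] == "Pokhara"]

# Dice: restrict several dimensions at once (item, location, and day).
dice = [
    r for r in facts
    if r[0] == "Bag" and r[1] == "Pokhara" and r[2] == "2014-10-31"
]

print(by_month)
print(by_day)
print(len(slice_pokhara), dice)
```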
10. Common DW Tasks
• Single database, single query
• Maintain data history, even if the source does not
• Integrate multiple sources into a central view
• Present the organization's information consistently
• Single data model for multiple data sources
• Restructure data so that it makes business sense
• Make decision-support queries easier to write
• Predictive analysis
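As one illustration of keeping history that the source does not keep, the sketch below appends a dated snapshot on every load instead of overwriting. The names source, history, load_snapshot, and price_as_of are hypothetical, and real warehouses typically use more elaborate techniques such as slowly changing dimensions.

```python
# Operational source: holds only the current price per item.
source = {"Bag": 750.0}

# Warehouse table: append-only, one row per item per load date.
history = []

def load_snapshot(source, history, load_date):
    """Append the current source state to the warehouse, stamped with a date."""
    for item, price in sorted(source.items()):
        history.append({"item": item, "price": price, "load_date": load_date})

def price_as_of(history, item, date):
    """Return the most recent loaded price on or before the given ISO date."""
    candidates = [
        row for row in history
        if row["item"] == item and row["load_date"] <= date
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row["load_date"])["price"]

load_snapshot(source, history, "2014-10-30")
source["Bag"] = 800.0  # the source overwrites its old value
load_snapshot(source, history, "2014-10-31")

print(price_as_of(history, "Bag", "2014-10-30"))
print(price_as_of(history, "Bag", "2014-10-31"))
```

The ISO date format is used so that plain string comparison orders the dates correctly.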
11. DW applications
Netezza, Teradata, Oracle Exadata
• Most of these applications use massively
parallel processing (MPP) to achieve high scalability
and efficient query processing
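The general idea behind massively parallel processing can be sketched as partition, aggregate locally, then combine. This is a conceptual toy using Python threads, not a description of how any of the products above work internally; the rows and the function names (partition, local_aggregate, parallel_totals) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical (item, amount) rows to be aggregated.
rows = [
    ("Bag", 100), ("Shoe", 300), ("Bag", 150),
    ("Hat", 50), ("Shoe", 200), ("Bag", 200),
]

def partition(rows, n):
    """Distribute rows round-robin across n partitions (one per worker)."""
    return [rows[i::n] for i in range(n)]

def local_aggregate(part):
    """Each worker sums amounts per item over its own partition only."""
    totals = Counter()
    for item, amount in part:
        totals[item] += amount
    return totals

def parallel_totals(rows, workers=3):
    """Aggregate partitions in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(local_aggregate, partition(rows, workers)))
    combined = Counter()
    for partial in partials:
        combined.update(partial)  # Counter.update adds counts together
    return dict(combined)

totals = parallel_totals(rows)
print(totals)
```

In an MPP system each partition would live on a separate node with its own CPU and storage, which is what lets the approach scale with data volume.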