1. SUBMITTED TO: MR. ASHOK WAHI
SUBMITTED BY: NIKISHA GUPTA, CHANDNI RASTOGI, SAKSHI JAIN
2/29/2012 STRUCTURE OF DATA WAREHOUSE & DATA MARTS 1
2. DATA WAREHOUSE
A subject-oriented, integrated, time-variant, non-updatable collection
of data used in support of management decision-making processes.
Subject-oriented: Customers, patients, students, products.
Integrated: Consistent naming conventions, formats, encoding
structures; from multiple data sources.
Time-variant: Can study trends and changes.
Non-updatable: Read-only, periodically refreshed; never deleted.
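As a rough illustration of the time-variant and non-updatable properties above, the following minimal sketch keeps dated snapshots in an append-only store. The class name SnapshotStore, its methods, and the sample figures are hypothetical and chosen only for this example.

```python
from datetime import date


class SnapshotStore:
    """Append-only store: each refresh adds a dated snapshot; nothing is updated or deleted."""

    def __init__(self):
        self._snapshots = []  # list of (as_of, {measure: value}) in date order

    def refresh(self, as_of, values):
        # Periodic refresh: only newer snapshots may be appended.
        if self._snapshots and as_of <= self._snapshots[-1][0]:
            raise ValueError("snapshots must be loaded in increasing date order")
        self._snapshots.append((as_of, dict(values)))

    def trend(self, measure):
        # Time-variant: the full history of a measure stays available for study.
        return [(as_of, values[measure])
                for as_of, values in self._snapshots if measure in values]


store = SnapshotStore()
store.refresh(date(2012, 1, 31), {"active_customers": 1200})
store.refresh(date(2012, 2, 29), {"active_customers": 1260})
history = store.trend("active_customers")
```

Because earlier snapshots are never overwritten, a trend query can compare any two points in time.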
A data warehouse is a home for your high-value data, or data
assets, that originates in other corporate applications, such as the one
your company uses to fill customer orders for its products, or some
data source external to your company, such as a public database that
contains sales information gathered from all your competitors.
3. CLASSIFICATION OF DATA WAREHOUSES
Data warehouses fall into three classifications, each of which implements various
aspects of an overall data warehousing architecture:
Data warehouse lite: A relatively straightforward implementation of a modest
scope (often, for a small user group or team) in which you don’t go out on any
technological limbs; almost a low-tech implementation.
Data warehouse deluxe: A standard data warehouse implementation that uses
advanced technologies to solve complex business information and analytical
needs across a broader user population.
Data warehouse supreme: A data warehouse that has large-scale data
distribution and advanced technologies that can integrate various “run the
business” systems, improving the overall quality of the data assets across
business information analytical needs and transactional needs.
5. This architecture ensures that your data warehouse meets your users’ information
requirements. It focuses on the following business-organization and
technical-architecture presentation components:
Subject area and data content: A subject area is a high-level grouping of data
content that relates to a major area of business interests, such as customers, products,
sales orders, and contracts.
Data source: Data sources are very similar to raw materials that support the
creation of finished goods in manufacturing.
Business intelligence tools: The user’s requirements for information access
dictate the type of business intelligence tool deployed for your data warehouse.
Some users require only simple querying or reporting on the data content within a
subject area; others might require sophisticated analytics. These data access
requirements assist in classifying your data warehouse.
Database: The database refers to the technology of choice leveraged to manage
the data content within a set of target data structures.
Data integration: Data integration is a broad classification for the extraction,
movement, transformation, and loading of data from the data’s source into the target
database.
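The data integration component can be sketched as three small steps: extract, transform, load. This is a minimal sketch only; the source rows, field names, and the in-memory list standing in for the target database are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical source rows, as an order-entry application might export them.
source_rows = [
    {"cust": " alice ", "amount": "120.50", "order_date": "2012-02-27"},
    {"cust": "BOB", "amount": "80", "order_date": "2012-02-28"},
]


def extract(rows):
    """Extraction: read rows from the source (here, an in-memory list)."""
    return list(rows)


def transform(rows):
    """Transformation: apply consistent names, formats, and types."""
    return [
        {
            "customer_name": r["cust"].strip().title(),
            "amount": float(r["amount"]),
            "order_date": date.fromisoformat(r["order_date"]),
        }
        for r in rows
    ]


def load(rows, target):
    """Loading: append to the target structure; existing rows are left alone."""
    target.extend(rows)
    return target


warehouse_orders = []  # stands in for a target table
load(transform(extract(source_rows)), warehouse_orders)
```

Keeping the three steps separate makes it easier to swap the source or the target without touching the transformation rules.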
6. DATA WAREHOUSE LITE
A data warehouse lite is a no-frills, bare-bones, low-tech approach to providing
data that can help with some of your business decision-making. No-frills
means that you put together, wherever possible, proven capabilities and
tools already within your organization to build your system.
Figure: A data warehouse lite has a narrow subject area focus.
7. Denormalizing data from a single application restructures that data to make it more
conducive to reporting needs.
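A minimal sketch of denormalization, assuming a hypothetical order application with separate customer and order records: the customer attributes are copied onto every order row so that a report needs no lookups.

```python
# Hypothetical normalized data from a single application.
customers = {
    1: {"name": "Acme Corp", "region": "North"},
    2: {"name": "Globex", "region": "South"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 250.0},
    {"order_id": 101, "customer_id": 2, "amount": 100.0},
    {"order_id": 102, "customer_id": 1, "amount": 50.0},
]


def denormalize(orders, customers):
    """Copy customer attributes onto each order to form flat reporting rows."""
    flat = []
    for order in orders:
        customer = customers[order["customer_id"]]
        flat.append({
            "order_id": order["order_id"],
            "customer_name": customer["name"],
            "region": customer["region"],
            "amount": order["amount"],
        })
    return flat


def total_by_region(rows):
    """A typical report: total order amount per region, read from flat rows."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


report_rows = denormalize(orders, customers)
totals = total_by_region(report_rows)
```

The flat rows repeat customer data, which costs storage but keeps reporting queries simple.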
8. The low-tech approach to moving data into a data warehouse lite database:
backup tapes.
9. The architecture of a data warehouse lite is built around straight-line
movement of data.
10. DATA WAREHOUSE DELUXE
A data warehouse deluxe has a broader subject area focus than a data warehouse
lite.
11. A data warehouse deluxe often has a complicated architecture with many different
collection points for data.
12. DATA WAREHOUSE SUPREME
Intelligent agents are an important part of the push technology architecture of a
data warehouse supreme.
13. Sample architecture from a data warehouse supreme (although it can look like
just about anything).
14. A data warehouse might consist of more than one database, under the control of
the overall warehousing environment.
15. DATA MART
A data mart is simply a scaled-down data warehouse. The idea of a
data mart is hardly revolutionary, despite what you might read on
blogs and in the computer trade press, and what you might hear at
conferences or seminars.
There are three main approaches to creating a data mart:
✓ Sourced by a data warehouse (most or all of the data mart’s
contents come from a data warehouse)
✓ Quickly developed and created from scratch
✓ Developed from scratch with an eye toward eventual integration
16. Data marts sourced by a data warehouse
Many data warehousing experts would argue (and I’m one of them, in this
case) that a true data mart is a “retail outlet,” and a data warehouse
provides its contents.
The data sources, data warehouse, data mart, and user interact in this way:
The data sources, acting as suppliers of raw materials, send data into the
data warehouse.
The data warehouse serves as a consolidation and distribution center,
collecting the raw materials in much the same way that any data
warehouse does.
Instead of the user (the consumer) going straight to the data warehouse,
though, the data warehouse serves as a wholesaler with the premise of “we
sell only to retailers, not directly to the public.” In this case, the retailers
are the data marts.
The data marts order data from the warehouse and, after stocking the
newly acquired information, make it available to consumers (users).
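The wholesaler and retailer interaction described above can be sketched as follows. The class names DataWarehouse and DataMart, their methods, and the sample rows are hypothetical; the sketch shows only the direction of data flow, not a real product's interface.

```python
class DataWarehouse:
    """Wholesaler: consolidates rows from sources and supplies data marts."""

    def __init__(self):
        self._rows = []

    def receive(self, source_name, rows):
        # Tag each row with its source so consolidated data stays traceable.
        for row in rows:
            self._rows.append({**row, "source": source_name})

    def supply(self, wanted):
        # Hand out copies so a mart cannot alter warehouse contents.
        return [dict(row) for row in self._rows if wanted(row)]


class DataMart:
    """Retailer: stocks a selected subset and serves it to users."""

    def __init__(self, wanted):
        self._wanted = wanted
        self._rows = []

    def restock(self, warehouse):
        # The mart "orders" only the rows it needs from the warehouse.
        self._rows = warehouse.supply(self._wanted)

    def query(self):
        return list(self._rows)


warehouse = DataWarehouse()
warehouse.receive("orders_app", [{"region": "North", "sales": 100},
                                 {"region": "South", "sales": 70}])
warehouse.receive("external_feed", [{"region": "North", "sales": 40}])

north_mart = DataMart(lambda row: row["region"] == "North")
north_mart.restock(warehouse)
north_sales = sum(row["sales"] for row in north_mart.query())
```

Users call only `north_mart.query()`; nothing in the sketch lets them read the warehouse directly, which mirrors the "we sell only to retailers" premise.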
17. The retail-outlet approach to data marts: All the data comes
from a data warehouse.
18. In a variation of the sourced-from-the-warehouse model, the data
warehouse that serves as the source for the data mart doesn’t have
all the information the data mart’s users need. You can solve this
problem in one of two ways:
Supplement the missing information directly into the data
warehouse before sending the selected contents to the data mart.
Don’t touch the data warehouse; instead, add the supplemental
information to the data mart in addition to what it receives from the
data warehouse.
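The two options above differ only in where the supplemental rows end up. A minimal sketch, with hypothetical function names and sample rows, and plain lists standing in for the warehouse and the mart:

```python
def supplement_via_warehouse(warehouse, extra, selector):
    """Option 1: add the missing rows to the warehouse, then select for the mart."""
    warehouse.extend(extra)
    return [row for row in warehouse if selector(row)]


def supplement_in_mart(warehouse, extra, selector):
    """Option 2: leave the warehouse untouched; add the extra rows to the mart."""
    return [row for row in warehouse if selector(row)] + list(extra)


base = [{"product": "A", "sales": 10}, {"product": "B", "sales": 5}]
extra = [{"product": "A", "competitor_sales": 7}]


def only_product_a(row):
    return row["product"] == "A"


warehouse_1 = list(base)
mart_1 = supplement_via_warehouse(warehouse_1, extra, only_product_a)

warehouse_2 = list(base)
mart_2 = supplement_in_mart(warehouse_2, extra, only_product_a)
```

In this example both marts hold the same rows, but only the first option leaves the supplemental data in the warehouse where other data marts could reuse it.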
19. Top-down, quick-strike data marts
There are three reasons to go the data-mart route:
Speed: A quick-strike data mart is typically completed in 90
to 120 days, rather than the much longer time required for a
full-scale data warehouse.
Cost: Doing the job faster means that you spend less money;
it’s that simple.
Complexity and risk: When you work with less data and
fewer sources over a shorter period, you’re likely to create a
significantly less complex environment — and have fewer
associated risks.
20. A top-down, quick-strike data mart is a subset of what can be built
if you pursue full-scale data warehousing instead.
21. Bottom-up, integration-oriented data marts
Theoretically, you can design data marts so that they’re
eventually integrated in a bottom-up manner by building a data
warehousing environment (in contrast to a single, monolithic
data warehouse).
Bottom-up integration of data marts isn’t for the
fainthearted. You can do it, but it’s more difficult than creating
a top-down, quick-strike data mart that will always remain
stand-alone. You might be able to successfully use this
approach . . . but you might not.
22. SUBSETS OF INFORMATION FOR A DATA MART
Geography-bounded data: A data mart might contain only the information
relevant to a certain geographical area, such as a region or territory within your
company.
Organization-bounded data: When deciding what you want to put in your data
mart, you can base decisions on what information a specific organization needs
when it’s the sole (or, at least, primary) user of the data mart. This approach
works well when the overwhelming majority of inquiries and reports are
organization-oriented. For example, the commercial checking group has no need
whatsoever to analyze consumer checking accounts and vice versa.
Function-bounded data: Using an approach that crosses organizational
boundaries, you can establish a data mart’s contents based on a specific function
(or set of related functions) within the company. A multinational chemical
company, for example, might create a data mart exclusively for the sales and
marketing functions across all organizations and across all product lines.
23. Market-bounded data: A company might occasionally be so
focused on a specific market and the associated competitors that it
makes sense to create a data mart oriented with that particular focus.
This type of environment might include competitive sales, all
available public information about the market and competitors
(particularly if you can find this information on the Internet), and
industry analysts’ reports, for example.
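Each kind of bounded subset amounts to selecting rows by one or more attributes. The sketch below shows geography-, organization-, and function-bounded selections; the helper bounded_by, the field names, and the sample rows are hypothetical. A market-bounded mart would additionally draw on external sources, which this sketch does not model.

```python
# Hypothetical warehouse rows carrying the attributes used for bounding.
rows = [
    {"region": "West", "organization": "commercial_checking",
     "function": "sales", "amount": 500},
    {"region": "West", "organization": "consumer_checking",
     "function": "marketing", "amount": 200},
    {"region": "East", "organization": "commercial_checking",
     "function": "sales", "amount": 300},
]


def bounded_by(rows, **criteria):
    """Return the rows whose fields match every given criterion."""
    return [
        row for row in rows
        if all(row.get(field) == value for field, value in criteria.items())
    ]


west_mart = bounded_by(rows, region="West")                         # geography-bounded
commercial_mart = bounded_by(rows, organization="commercial_checking")  # organization-bounded
sales_mart = bounded_by(rows, function="sales")                     # function-bounded
```

Criteria can be combined, for example `bounded_by(rows, region="West", function="sales")`, when a data mart is bounded in more than one way.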
24. Data mart or data warehouse?
If you start a project from the outset with either of the following
premises, you already have two strikes against you:
“We’re building a real data warehouse, not a puny little data mart.”
“We’re building a data mart, not a data warehouse.”
Until you understand the following three issues, you have no
foundation on which to classify your impending project as either a
data mart or a data warehouse:
The volumes and characteristics of data you need
The business problems you’re trying to solve and the questions
you’re trying to answer
The business value you expect to gain when your system is
successfully built
25. IMPLEMENTING A DATA MART
There are three keys to speedy implementation:
Follow an iterative, phased methodology.
Hold to a fixed time for each phase.
Avoid scope creep at all costs.