• In some data warehouse implementations, a
data mart is a miniature data warehouse;
• In others, it is just one segment of the data
warehouse.
• Data marts are often used to provide
information to functional segments of the
organization.
When a Data Mart Is Appropriate
• Data marts are sometimes designed as complete individual data warehouses and
contribute to the overall organization as a member of a distributed data
warehouse.
• In other designs, data marts receive data from a master data warehouse through
periodic updates, in which case the data mart functionality is often limited to
presentation services for clients.
• Data marts are created for the following reasons:
– To speed up work by reducing the volume of data scanned
– To structure data for a user access tool
– To partition data in order to impose access control strategies
– To segment data across different hardware platforms
DESIGN OF DATA MART
• Regardless of the functionality provided by data marts, they must be designed as
components of the master data warehouse so that data organization, format, and schemas
are consistent throughout the data warehouse.
• Inconsistent table designs, update mechanisms, or dimension hierarchies can prevent data
from being reused throughout the data warehouse, and they can result in inconsistent
reports from the same data.
– For example, it is unlikely that summary reports produced from a finance department data mart that
organizes the sales force by management reporting structure will agree with summary
reports produced from a sales department data mart that organizes the same sales force
by geographical region.
Before designing a data mart, we must confirm that a data mart solution is appropriate for the organization:
– Identify whether there is a natural functional split within the organization
– Identify whether there is a natural split of data
– Data marts should be designed from the perspective that they are components of the
data warehouse regardless of their individual functionality or construction
– This provides consistency and usability of information throughout the organization.
IDENTIFY FUNCTIONAL SPLIT
• We must check whether the split will benefit the organisation.
– Example: retail sales in an organisation in which a merchant is responsible for sales. Their brief
could be to maximize sales by ensuring adequate sales.
• In practice, the following information would be of value:
– Sales transactions at a daily level, to monitor actual sales
– Sales forecasts on a weekly basis
– Stock position on a daily basis
– Stock movement on a daily basis
Importance of Data Mart
• Easy access to frequently needed data
• Creates collective view by a group of users
• Improves end-user response time
• Ease of creation
• Lower cost than implementing a full Data
warehouse
• Potential users are more clearly defined than
in a full Data warehouse
• Metadata is loosely defined as data about data.
• Metadata is a concept that applies mainly to electronically archived or presented
data and is used to describe the
– a) definition,
– b) structure and
– c) administration of data files with all contents in context to ease the use of
the captured and archived data for further use.
– example: a web page may include metadata specifying what language it's
written in, what tools were used to create it, where to go for more on the
subject and so on
What is Metadata
• Metadata (meta data, or sometimes metainformation) is "data about other data", of any
sort in any media. An item of metadata may describe an individual datum, or content item, or
a collection of data including multiple content items and hierarchical levels, such as a
database schema. In data processing, metadata provides information about, or
documentation of, other data managed within an application or environment. This commonly
defines the structure or schema of the primary data.
– For example, metadata would document data about data elements or attributes (name, size, data type, etc.), data about
records or data structures (length, fields, columns, etc.), and data about data (where it is located, how it is associated,
ownership, etc.). Metadata may include descriptive information about the context, quality and condition, or
characteristics of the data. It may be recorded with high or low granularity.
• Metadata contains information about that data or other data.
• Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in
the identification, discovery, assessment, and management of the described entities.
Why Metadata Is Important
• Assume that the project team has successfully completed the development of the first
data mart. The user may still have several questions in mind:
– Are there predefined queries I can look at?
– What are the various elements in the data warehouse?
– Is there information about unit sales and unit costs by product?
– How can I browse and see what is available?
– Where did the data for the data warehouse come from? From which source?
– How old is the data in the data warehouse?
– When was the last time fresh data was brought in?
– Are there summaries by month and product?
• In data warehousing terms, metadata can be
defined as any of the following:
– Data about data
– Table of content for data
– Catalog for data
– Data warehouse roadmap
– Data warehouse directory
Applications of Metadata
• Metadata has been used in various forms as a means of cataloging archived
information.
• Metadata may be written into a digital photo file that will identify who owns it,
copyright & contact information, what camera created the file, along with
exposure information and descriptive information such as keywords about the
photo, making the file searchable on the computer and/or the Internet
• Web pages
• Web pages often include metadata in the form of meta tags. Description and
keywords meta tags are commonly used to describe the Web page's content. Some
search engines use this data when adding pages to their search index.
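A minimal sketch of reading the description and keywords meta tags from a page, using Python's standard html.parser module. The sample page and the names MetaTagParser and extract_meta are assumptions made up for illustration; they do not come from any particular site or tool.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Data Marts</title>
<meta name="description" content="Lecture notes on data marts and metadata">
<meta name="keywords" content="data warehouse, data mart, metadata">
</head><body></body></html>
"""


def extract_meta(html_text):
    """Return a dict of meta tag names to their content strings."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return parser.meta


page_meta = extract_meta(SAMPLE_PAGE)
print(page_meta["description"])
print(page_meta["keywords"].split(", "))
```

The page body is never inspected here: the description and keywords are data about the page, not the page content itself.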
Critical Need for Metadata in the Data Warehouse
• Metadata is an absolute necessity in a data warehouse, i.e.
– For using the data warehouse:
• To run ad hoc queries and format reports, users need to know
about the data in the data warehouse.
• Users should gain the maximum from the data warehouse, and
ignorance of the data should not lead them to wrong conclusions.
– For building the data warehouse:
• For data extraction we must know the source systems.
• Structures and content help in determining mappings.
• In the DBA role, one needs to know about metadata for physical loading and
– Data administration
• Data administration is not possible without knowing the metadata.
• Metadata is absolutely necessary for building the data warehouse.
Data warehouse Metadata
• Metadata systems in a data warehouse are
sometimes separated into two sections:
1. back room metadata, used for the extract,
transform, load (ETL) functions that get OLTP data into a
data warehouse
2. front room metadata, used to label
screens and create reports
Business Intelligence metadata
• Business Intelligence is the process of analyzing large amounts of corporate data,
usually stored in large databases such as a Data Warehouse, tracking business
performance, detecting patterns and trends, and helping enterprise business users
make better decisions. Business Intelligence metadata describes how data is
queried, filtered, analyzed, and displayed in Business Intelligence software tools,
such as Reporting tools, OLAP tools, Data Mining tools.
• Data Mining metadata: The descriptions and structures of Data Sets, Algorithms, Queries
• OLAP metadata: The descriptions and structures of Dimensions, Cubes, Measures (Metrics),
Hierarchies, Levels, Drill Paths
• Reporting metadata: The descriptions and structures of Reports, Charts, Queries, Data Sets,
Filters, Variables, Expressions
Building the data warehouse
• To build the data warehouse, when data is being
extracted for it, the programmer needs to know:
– The source system and its data structures
– The data content
– How to handle the data
• For the DBA:
– Incremental loading
– Last Compared data
– Populating tables
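A minimal sketch of how load metadata can drive incremental loading, using Python's built-in sqlite3 module. The tables source_sales, warehouse_sales, and load_metadata, and the idea of storing a single last-loaded timestamp per target table, are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_sales (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT);
CREATE TABLE warehouse_sales (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT);
-- Load metadata: one row per target table holding the time of the last load.
CREATE TABLE load_metadata (table_name TEXT PRIMARY KEY, last_loaded_at TEXT);
INSERT INTO load_metadata VALUES ('warehouse_sales', '2024-01-01T00:00:00');
INSERT INTO source_sales VALUES
  (1, 10.0, '2023-12-31T09:00:00'),
  (2, 20.0, '2024-01-02T09:00:00'),
  (3, 30.0, '2024-01-03T09:00:00');
""")


def incremental_load(conn):
    """Copy only source rows changed since the last recorded load."""
    (last_loaded,) = conn.execute(
        "SELECT last_loaded_at FROM load_metadata WHERE table_name = 'warehouse_sales'"
    ).fetchone()
    # ISO 8601 timestamps compare correctly as plain strings.
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM source_sales WHERE updated_at > ?",
        (last_loaded,),
    ).fetchall()
    conn.executemany("INSERT OR REPLACE INTO warehouse_sales VALUES (?, ?, ?)", rows)
    if rows:
        # Advance the stored timestamp so the next run skips these rows.
        newest = max(row[2] for row in rows)
        conn.execute(
            "UPDATE load_metadata SET last_loaded_at = ? WHERE table_name = 'warehouse_sales'",
            (newest,),
        )
    conn.commit()
    return len(rows)


loaded_first = incremental_load(conn)
loaded_second = incremental_load(conn)
print(loaded_first, loaded_second)
```

Without the load_metadata row, the load would have to rescan and recopy the whole source table on every run.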
Administering the Data Warehouse
• How to add new summary tables
• How to expand storage
• How to add information delivery to the users
• When to schedule backups
• How to maintain the security system
• How to keep data definitions up to date
• How to verify external data on an ongoing basis
Metadata used for Transformation
• Metadata may be used during data transformation and load to describe the data and any
changes made to it.
• The greater the differences between sources, the greater the requirement for metadata.
• The advantage of storing metadata is that any transformation that takes place as source
data changes can be captured by the metadata.
• For source data, the following information is required:
– Source field (needs to be uniquely identified)
• Unique identifier
• Metadata is required to describe the data as it resides in the data warehouse.
• This is needed for the warehouse manager to track and control all data movement.
• Metadata is needed for all these things.
• For each table, the information stored is:
– Table name (should be the name in the data dictionary)
• Column name
• Reference identifier
• Aggregations need to be stored in the same way as tables, with the aggregation name
and columns.
• Similarly, partitions also need information such as the partition key and the data range
inside the table.
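The table, aggregation, and partition information listed above can be sketched as a small set of Python dataclasses. All class names, field names, and the sample sales_fact table are hypothetical and only illustrate one possible shape for this metadata.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMeta:
    name: str
    reference_id: str  # unique identifier for the column


@dataclass
class PartitionMeta:
    partition_key: str
    range_start: str  # inclusive lower bound of the data range
    range_end: str    # inclusive upper bound of the data range


@dataclass
class AggregationMeta:
    name: str
    columns: list  # names of the columns the aggregation is built on


@dataclass
class TableMeta:
    name: str  # should match the name in the data dictionary
    columns: list = field(default_factory=list)
    partitions: list = field(default_factory=list)
    aggregations: list = field(default_factory=list)

    def column_names(self):
        return [column.name for column in self.columns]

    def partition_for(self, value):
        """Return the partition whose data range contains value, or None."""
        for partition in self.partitions:
            if partition.range_start <= value <= partition.range_end:
                return partition
        return None


# Hypothetical warehouse table used only for illustration.
sales_fact = TableMeta(
    name="sales_fact",
    columns=[
        ColumnMeta("sale_date", "COL-001"),
        ColumnMeta("store_id", "COL-002"),
        ColumnMeta("amount", "COL-003"),
    ],
    partitions=[
        PartitionMeta("sale_date", "2024-01-01", "2024-01-31"),
        PartitionMeta("sale_date", "2024-02-01", "2024-02-29"),
    ],
    aggregations=[AggregationMeta("monthly_sales", ["sale_date", "amount"])],
)

print(sales_fact.column_names())
print(sales_fact.partition_for("2024-02-15"))
```

With this record, the warehouse manager can tell which partition a given date belongs to without scanning the table itself.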
Data ETL
• How to handle data changes
• How to include new sources
• Where to cleanse the data
• How to change data cleansing
• How to switch to new data
• How to add new external data
• How to drop external data source
• How mergers and acquisitions take place
• How to add new summary table
• How to expand storage
• How to add new information tools for
• How to continue ongoing training
• How to improve adhoc queries
• When to schedule back ups
• How to maintain security systems
• How to monitor load distribution
Why Metadata Is Vital for End Users
• Metadata helps users understand the complexity of the data and how it should be
transformed into information.
• In a company, when a business analyst analyses the reasons for a loss or profit, he
asks the following questions:
• Are the sales stored as individual transactions or summary totals?
• Can sales be analyzed by product, promotion, store, and month?
• Can the current month's sales be compared to the previous month's sales?
• Where do the sales come from? What is the source system?
• How old is the sales data and how does it get updated?
– If the analyst is not sure of the data, he cannot analyze it properly.
– It would be ideal for an analyst to have a complete road map of
Metadata Vital for End users
• Data Content
• Summary Data
• Business Dimensions
• Business metrics
• Navigation paths
• Source systems
• External data
• Last update data
• Report formats
• OLAP data
Who needs Metadata
IT Professionals POWER USERS CASUAL USERS
Meaning of Data Data structures
Information Access SQL,3GL,4GL, Query tools Authorization
• Metadata is required by the query manager
to enable it to generate queries.
• The query manager generates metadata about
the queries it has run.
• This metadata can be used to build a history of all
queries run and to generate a query profile.
• The metadata that is required for each query is:
• Tables accessed
– Columns accessed
» Reference identifier
• Restrictions applied
– Column name
– Table name
– Reference identifier
• Join criteria applied
– Column name
– Table name
– Reference identifier
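A minimal sketch of the query history and query profile idea, in Python. The QueryRecord and QueryHistory classes and the sample queries are hypothetical; a real query manager would capture this metadata automatically as it runs each query.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class QueryRecord:
    tables: list        # tables accessed
    columns: list       # (table, column) pairs accessed
    restrictions: list  # (table, column) pairs with a restriction applied
    joins: list         # ((table, column), (table, column)) join criteria


class QueryHistory:
    def __init__(self):
        self.records = []

    def log(self, record):
        self.records.append(record)

    def profile(self):
        """Count how often each table and each restricted column is used."""
        table_hits = Counter()
        restriction_hits = Counter()
        for record in self.records:
            table_hits.update(record.tables)
            restriction_hits.update(record.restrictions)
        return {"tables": table_hits, "restrictions": restriction_hits}


# Two hypothetical queries, logged only for illustration.
history = QueryHistory()
history.log(QueryRecord(
    tables=["sales", "product"],
    columns=[("sales", "amount"), ("product", "name")],
    restrictions=[("sales", "sale_date")],
    joins=[(("sales", "product_id"), ("product", "id"))],
))
history.log(QueryRecord(
    tables=["sales"],
    columns=[("sales", "amount")],
    restrictions=[("sales", "sale_date"), ("sales", "store_id")],
    joins=[],
))

query_profile = history.profile()
print(query_profile["tables"].most_common(1))
print(query_profile["restrictions"].most_common(1))
```

A profile like this shows which tables and restriction columns are used most often, which is the kind of evidence used when deciding on new aggregations or indexes.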
Why Metadata is essential for IT
• From data extraction through to information delivery, metadata is crucial.
• To process data, IT needs metadata about the following:
– Source of data structures
– Source platforms
– Data extraction methods
– External data
– Data transformation rules
– Data cleansing rules
– Staging area structures
– Dimensional models
– OLAP systems
– Query/report Design
Automation of the Data Warehouse
• Tools perform the major functions of the data warehouse.
• Tools enable data movement, transformation, and so on.
• While designing data warehouse we must at the beginning see to create tool for
• In back-end processes, each tool records its own metadata:
– Source data structure definition
– Data extraction
– Initial Reformatting/merging
– Preliminary data cleansing
– Data transformation
– Data warehouse structure definition
– Load Merge creation
Classification of Metadata types
• Classification of metadata types by functional
area:
– Data acquisition
– Data storage
– Information delivery
• Acquisition process:
– Data Extraction
– Data transformation
– Data cleansing
– Data Integration
– Data staging
• Metadata Types:
– Source system platforms
– Source structure definition
– Data extraction method
– Data transformation rules
– Data cleansing rules
– External data structures
– External data definitions
– Summarization rules
– Target physical and logical
• The metadata recorded by the processes in
the data storage area is used for development,
administration, and by users.
• A user would like to see when the
data was last loaded.
• The DBA will use the metadata for processes such as
backups and incremental loads.
• Information delivery
– Report generation
– Query processing
– Complex Analysis
• Metadata types:
– Source systems
– Source data definitions
– Data extraction tools
– Query templates
– Preformatted reports
– OLAP content
• Technical Metadata:
– data about the processes, the tool sets, the
repositories, the physical layers of data under the
covers. Data about run-times, performance
averages, table structures, indexes, constraints;
data about relationships, sources and targets,
uptime, system failure ratios, system resource
utilization ratios, performance numbers
• List of questions technical metadata can answer:
– What databases and tables exist?
– What are the columns for each table?
– What are the keys and indexes?
– What are the physical files?
– What are the load and refresh schedules?
– What types of aggregations are available?
– What is the source-to-target mapping in the data warehouse?
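Several of these questions can be answered directly from a database's own catalog. The sketch below uses Python's built-in sqlite3 module with SQLite's sqlite_master table and the table_info and index_list pragmas; the product and sales tables are hypothetical, and other database systems expose the same kind of technical metadata through their own catalogs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES product(product_id),
    amount REAL
);
CREATE INDEX idx_sales_product ON sales(product_id);
""")


def list_tables(conn):
    """What tables exist?"""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def list_columns(conn, table):
    """What are the columns for a table? Returns (name, type, is_primary_key)."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated, so pass only names taken from list_tables.
    rows = conn.execute(f"PRAGMA table_info({table})")
    return [(row[1], row[2], bool(row[5])) for row in rows]


def list_indexes(conn, table):
    """What indexes exist on a table?"""
    # PRAGMA index_list rows start with (seq, name, unique, ...).
    return [row[1] for row in conn.execute(f"PRAGMA index_list({table})")]


for table in list_tables(conn):
    print(table, list_columns(conn, table), list_indexes(conn, table))
```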
• Metadata can be better understood by looking at a list of examples:
– Source systems
– Source to target mapping
– Data transformation business rules
– Data transformation
– Attributes and business definition
– Query reporting tools
– Predefined tools
– Predefined reports
– Report distribution information
– Currency of OLAP reports
– Rules for analysis using OLAP reports
Behaviour of Business Metadata
• How can I sign on to the metadata?
• Which parts of the data warehouse can I access?
• Which parts of the definitions do I need for my query?
• What types of aggregation are available for my metrics?
• How old is the OLAP data? Should I wait for the next update?
– Business analyst
– Regular users
In IT, business metadata means adding text or statements
around a particular term so that it adds value to the data. Business
metadata is about creating definitions and business rules. For
example, when tables and columns are created, the following
business metadata is useful for generating reports for the
functional and technical teams. The advantage of business
metadata is that everybody, technical or non-technical,
can understand what is going on within the
Table's Metadata: When creating a table, metadata covering the definition
of the table, source system name, source entity names, business
rules to transform the source table, and the usage of the table in
reports should be added so that it is available for
taking metadata reports.
Column's Metadata: Similarly for columns, the source column name
(mapping), business rules to transform the source column,
and the usage of the column in reports should be added for taking
metadata reports.
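A minimal sketch of the table and column business metadata described above, and of a plain-text metadata report produced from it. The sales_fact table, the POS source system, the report names, and the metadata_report function are all hypothetical and serve only to show one possible layout.

```python
# Hypothetical business metadata for one warehouse table; every name is illustrative.
SALES_FACT_METADATA = {
    "table": "sales_fact",
    "definition": "Daily sales totals per product and store",
    "source_system": "POS",
    "source_entities": ["pos_transaction"],
    "business_rule": "Sum transaction amounts per product, store, and day",
    "used_in_reports": ["Daily Sales Summary"],
    "columns": [
        {
            "name": "net_amount",
            "source_column": "pos_transaction.amt",
            "business_rule": "Gross amount minus discounts",
            "used_in_reports": ["Daily Sales Summary"],
        },
        {
            "name": "store_id",
            "source_column": "pos_transaction.store_no",
            "business_rule": "Copied unchanged",
            "used_in_reports": ["Daily Sales Summary", "Store Ranking"],
        },
    ],
}


def metadata_report(meta):
    """Render table and column business metadata as plain text lines."""
    lines = [
        f"Table: {meta['table']}",
        f"  Definition: {meta['definition']}",
        f"  Source: {meta['source_system']} ({', '.join(meta['source_entities'])})",
        f"  Rule: {meta['business_rule']}",
        f"  Reports: {', '.join(meta['used_in_reports'])}",
    ]
    for col in meta["columns"]:
        lines.append(
            f"  Column {col['name']} <- {col['source_column']}: "
            f"{col['business_rule']} (reports: {', '.join(col['used_in_reports'])})"
        )
    return "\n".join(lines)


report = metadata_report(SALES_FACT_METADATA)
print(report)
```

Because the report uses business wording rather than tool-specific terms, both functional and technical readers can follow where each column comes from and where it is used.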
Business Rules in the Data Warehouse
• In the course of designing and populating a data warehouse, some key questions must be
answered about the data being incorporated in the warehouse. More often than not, many
of these answers are not known at the outset of the project, but must be established if the
data warehouse is to succeed. Interestingly, these for the most part represent the same
contextual information about the data that business users of the warehouse will need to
know to be able to fully understand the information provided, and to trust in its reliability.
The questions include:
• What are the valid values for the attributes of the data warehouse?
• What are the valid data sources for the data warehouse?
• When in the data’s life cycle, in the operational world, should it be captured and sent to the
data warehouse?
• What are the “cleansing rules” for the source data?
• What are the transformation rules to move the source data to the target database?
• How was the data calculated in the operational database?
Difference between Technical
metadata and business metadata
• Metadata can be divided into technical metadata (the tool-specific metadata used by IT and vendors)
and business metadata (what a businessperson needs to know about what
• The technology person thinks about a data column - how it's defined in a
database, represented in a data model, mapped and transformed in the ETL
tool and defined in the BI report. All of this, however, is very much related to
how the tools store and process the data. The primary challenge is
gathering and integrating the metadata across tools.
• The businessperson thinks about where the data came from, its associated
data quality level, how it was filtered from its source and what types of
business rules and algorithms were applied to it. Most of this metadata is
either not stored in the tools or needs some serious translation from
technical terms to business language.
• The requirements for metadata management are:
– Capturing and storing business
• Changes in algorithms and methodology occur when data is stored for several years.
• Versioning must be maintained.
– Variety of metadata sources
• Metadata is available from different sources.
– Metadata integration
• Metadata must be unified and merged to be meaningful to the end user.
– Metadata standardization
• All metadata should be stored in the same manner.
– Rippling through revisions
• Revisions will occur as business rules change.
– Metadata exchange
• End users should be able to exchange one set of metadata with another.
– Support for end users
• Metadata must provide simple graphical and tabular representations to make it
easy to browse through.
• Major challenges for metadata management are:
– Each software tool has its own proprietary metadata. If we
are using several tools, how can we reconcile them?
– No industry-wide accepted standards exist for metadata.
– Preserving uniform metadata version control across the data
warehouse is very difficult.
– Unifying data sources is very difficult, since we
have to deal with conflicting standards, formats, data
naming conventions, units, and measures.
META DATA REPOSITORY
• The metadata repository may be thought of as holding two
distinct categories of information:
– Technical Metadata
– Business Metadata