Metadata management is critical for organizations looking to understand the context, definition and lineage of key data assets. Data models play a key role in metadata management, as many of the key structural and business definitions are stored within the models themselves. Can data models replace traditional metadata solutions? Or should they integrate with larger metadata management tools & initiatives? Join this webinar to discuss opportunities and challenges around:
- How data modeling fits within a larger metadata management landscape
- When can data modeling provide “just enough” metadata management
- Key data modeling artifacts for metadata
- Organization, Roles & Implementation Considerations
1. Data Modeling & Metadata Management
Donna Burbank
Global Data Strategy Ltd.
Lessons in Data Modeling DATAVERSITY Series
October 27th, 2016
2. Global Data Strategy, Ltd. 2016
Donna is a recognized industry expert in
information management with over 20
years of experience in data strategy,
information management, data modeling,
metadata management, and enterprise
architecture.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting
company that specialises in the alignment
of business drivers with data-centric
technology. In past roles, she has served in
a number of roles related to data modeling
& metadata:
• Metadata consultant (US, Europe, Asia,
Africa)
• Product Manager PLATINUM Metadata
Repository
• Director of Product Management,
ER/Studio
• VP of Product Marketing, Erwin
• Data modeling & data strategy
implementation & consulting
• Author of 2 books of data modeling &
contributor to 1 book on metadata
management, plus numerous articles
• OMG committee member of the
Information Management Metamodel
(IMM)
As an active contributor to the data
management community, she is a long
time DAMA International member and is
the President of the DAMA Rocky
Mountain chapter. She has worked with
dozens of Fortune 500 companies
worldwide in the Americas, Europe, Asia,
and Africa and speaks regularly at industry
conferences. She has co-authored two
books: Data Modeling for the
Business and Data Modeling Made Simple
with ERwin Data Modeler and is a regular
contributor to industry publications such
as DATAVERSITY, EM360, & TDAN. She can
be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
Donna Burbank
2
Follow on Twitter @donnaburbank
Today’s hashtag: #LessonsDM
3. Global Data Strategy, Ltd. 2016
Lessons in Data Modeling Series
• July 28th Why a Data Model is an Important Part of your Data Strategy
• August 25th Data Modeling for Big Data
• September 22nd UML for Data Modeling – When Does it Make Sense?
• October 27th Data Modeling & Metadata Management
• December 6th Data Modeling for XML and JSON
3
This Year’s Line Up
4. Global Data Strategy, Ltd. 2016
Agenda
• How data modeling fits within a larger metadata management landscape
• When can data modeling provide “just enough” metadata management
• Key data modeling artifacts for metadata
• Organization, roles & implementation considerations
• Summary & questions
4
What we’ll cover today
5. Global Data Strategy, Ltd. 2016
Metadata is Hotter than ever
5
A Growing Trend
In a recent DATAVERSITY survey, over 80% of
respondents stated that:
Metadata is as important, if not more
important, than in the past.
7. Global Data Strategy, Ltd. 2016
Metadata is the “Who, What, Where, Why, When & How” of Data
7
Who What Where Why When How
Who created this
data?
What is the business
definition of this data
element?
Where is this data
stored?
Why are we storing
this data?
When was this data
created?
How is this data
formatted?
(character, numeric,
etc.)
Who is the Steward of
this data?
What are the business
rules for this data?
Where did this data
come from?
What is its usage &
purpose?
When was this data
last updated?
How many databases
or data sources store
this data?
Who is using this
data?
What is the security
level or privacy level
of this data?
Where is this data
used & shared?
What are the business
drivers for using this
data?
How long should it be
stored?
Who “owns” this
data?
What is the
abbreviation or
acronym for this data
element?
Where is the backup
for this data?
When does it need to
be purged/deleted?
Who is regulating or
auditing this data?
What are the technical
naming standards for
database
implementation?
Are there regional
privacy or security
policies that regulate
this data?
8. Global Data Strategy, Ltd. 2016
Metadata is Part of a Larger Enterprise Landscape
8
A Successful Data Strategy Requires Many Inter-related Disciplines
“Top-Down” alignment with
business priorities
“Bottom-Up” management &
inventory of data sources
Managing the people, process,
policies & culture around data
Coordinating & integrating
disparate data sources
Leveraging & managing data for
strategic advantage
9. Global Data Strategy, Ltd. 2016
Metadata Across & Beyond the Organization
• Metadata exists in many sources across & beyond the organization.
9
COBOL
Legacy Systems
JCL
Spreadsheets
Media
Social
Media
IoTOpen Data
Databases
Data Models
Documents
Data
In Motion
10. Global Data Strategy, Ltd. 2016
Types of Metadata
• The DATAVERSITY Emerging Trends in Metadata survey revealed some interesting findings about
what types of metadata organizations will be managing now and in the future.
10
= Supported by most data modeling tools
Now Future
11. Global Data Strategy, Ltd. 2016
Data Models are a Good Source of Metadata
• Data Models are another good source of both business & technical metadata for relational
databases.
• They store structural metadata as well as business rules & definitions.
• Key relationships are also stored to provide lineage & impact analysis.
11
Customer
Customer_ID CHAR(18) NOT NULL
First Name
Last Name
City
Date Purchased
CHAR(18)
CHAR(18)
CHAR(18)
CHAR(18)
NOT NULL
NOT NULL
NULL
NULL
Technical Metadata Business Metadata
12. Global Data Strategy, Ltd. 2016
Data vs. Metadata
12
First Name Last Name Company City
Year
Purchased
Joe Smith Komputers R Us New York 1970
Mary Jones The Lord’s Store London 1999
Proful Bishwal The Lady’s Store Mumbai 1998
Ming Lee My Favorite Store Beijing 2001
Metadata
Data
Customer
13. Global Data Strategy, Ltd. 2016
Data vs. Metadata
13
STR01 STR02 TXT123 TXT127 DT01
Joe Smith Komputers R Us New York 1970
Mary Jones The Lord’s Store London 1999
Proful Bishwal The Lady’s Store Mumbai 1998
Ming Lee My Favorite Store Beijing 2001
Metadata?
Data
Customer
14. Global Data Strategy, Ltd. 2016
Metadata adds Context & Definition
14
First Name Last Name Company City
Year
Purchased
Joe Smith Komputers R Us New York 1970
Mary Jones The Lord’s Store London 1999
Proful Bishwal The Lady’s Store Mumbai 1998
Ming Lee My Favorite Store Beijing 2001
Customer Definition
Last Name represents the surname or family name of
an individual.
Business Rules
In the Chinese market, family name is listed first in
salutations.
Format VARCHAR(30)
Abbreviation LNAME
Required YES
Etc.
Numerous technical & business metadata including
security, privacy, nullability, primary key, etc.Is this the city where the customer lives
or where the store is located?
15. Global Data Strategy, Ltd. 2016
Technical & Business Metadata
• Technical Metadata describes the structure, format, and rules for storing data
• Business Metadata describes the business definitions, rules, and context for data.
• Data represents actual instances (e.g. John Smith)
15
CREATE TABLE EMPLOYEE (
employee_id INTEGER NOT NULL,
department_id INTEGER NOT NULL,
employee_fname VARCHAR(50) NULL,
employee_lname VARCHAR(50) NULL,
employee_ssn CHAR(9) NULL);
CREATE TABLE CUSTOMER (
customer_id INTEGER NOT NULL,
customer_name VARCHAR(50) NULL,
customer_address VARCHAR(150) NULL,
customer_city VARCHAR(50) NULL,
customer_state CHAR(2) NULL,
customer_zip CHAR(9) NULL);
Technical Metadata
John Smith
Business Metadata
Data
Term Definition
Employee
An employee is an individual who currently
works for the organization or who has been
recently employed within the past 6 months.
Customer
A customer is a person or organization who
has purchased from the organization within
the past 2 years and has an active loyalty card
or maintenance contract.
16. Global Data Strategy, Ltd. 2016
Business vs. Technical Metadata
• The following are examples of types of business & technical metadata.
16
Business Metadata Technical Metadata
• Definitions & Glossary
• Data Steward
• Organization
• Privacy Level
• Security Level
• Acronyms & Abbreviations
• Business Rules
• Etc.
• Column structure of a database table
• Data Type & Length (e.g. VARCHAR(20))
• Domains
• Standard abbreviations (e.g. CUSTOMER ->
CUST)
• Nullability
• Keys (primary, foreign, alternate, etc.)
• Validation Rules
• Data Movement Rules
• Permissions
• Etc.
17. Global Data Strategy, Ltd. 2016
Levels of Data Modeling
17
Conceptual
Logical
Physical
Business Concepts
Data Entities
Physical Tables
Business
Metadata
Technical
Metadata
18. Global Data Strategy, Ltd. 2016
Business Definitions
From Data Modeling for the Business by
Hoberman, Burbank, Bradley, Technics
Publications, 2009
20. Global Data Strategy, Ltd. 2016
Human Metadata
• Much business metadata and the history of the business exists in employee’s heads.
• It is important to capture this metadata in an electronic format for sharing with others.
• Avoid the dreaded “I just know”
20
Avoid the dreaded “I just know”
Part Number is what used to
be called Component
Number before the
acquisition.
Business Glossary
Metadata Repository
Data Models
Etc.
21. Global Data Strategy, Ltd. 2016
Data Modeling in the Big Data Ecosystem
Hive HBase
Structured Data Unstructured Data
MapReduce / AnalyticsHadoop Framework
HDFS File System
JSON / XML
HQL
Semi-structured Data
JSON
XML JSON
Data Sources
22. Global Data Strategy, Ltd. 2016
Cobol Copybook Metadata
• What is a COBOL Copybook? – In COBOL, a copybook file is used to define data elements that can
be referenced by many programs
• What is COBOL Copybook Metadata? – structure, definition
22
Metadata
Describes structure & format of
data
23. Global Data Strategy, Ltd. 2016
ERP/CRM and Packaged Application Metadata
• Packaged applications such as CRM and ERP systems (e.g. Salesforce, Peoplesoft, etc.) are
typically based on a relational database system.
• Therefore, there is important metadata about both the physical table structures as well as the
business names & definitions.
23
Technical Metadata Business Metadata
25. Global Data Strategy, Ltd. 2016
Data Lineage - Data Warehousing Example
• In the data warehouse example below, metadata for CUSTOMER exists in a
number tools & data stores.
• This lineage can be tracked in most data modeling tools.
25
Sales Report
CUSTOMER
Database Table
CUST
Database Table
CUSTOMER
Database Table
CUSTOMER
Database Table
TBL_C1
Database Table
Business Glossary
ETL Tool ETL Tool
Physical Data Model
Physical Data Model
Logical Data Model
Dimensional
Data Model
BI Tool
26. Global Data Strategy, Ltd. 2016
Metadata Discovery Tools
• Metadata Discovery Tools extract metadata from source systems, and rationalize
them to a common metamodel and storage facility.
26
Metadata Discovery
Tools
Metamodel(s)
Metadata Storage
(Database)
Metadata Storage
(Repository)
Metadata
Population
27. Global Data Strategy, Ltd. 2016
Impact Analysis & Where Used
• Impact Analysis shows the relationship between a piece of metadata and other sources that rely
on that metadata to assess the impact of a potential change.
• For example, if I change the length & name of a field, what other systems that are referencing
that field will be affected?
27
What happens if I change the name &
length of the “Brand” field?
Brand CHAR(10)
MyBrand VARCHAR(30)
Sales Application
Sales Database
DB2
Staging Area
ETL
Customer
Database
Oracle
28. Global Data Strategy, Ltd. 2016
Design Layer Relationships
• In a data model there are several design layers that describe a given data
concept.
28
30. Global Data Strategy, Ltd. 2016
Who Uses Metadata?
• In addition to sharing metadata between tools and via export,
many users across both IT & the business want to view the metadata through
reports, portals, etc.
30
Developer
If I change this field, what
else will be affected?
Business Person
(e.g. Finance)
What’s the definition of
“Regional Sales”
Auditor
How was “Total Sales”
calculated? Show me the
lineage.
Data
Architect
What is the approved data
structure for storing
customer data?
Data
Warehouse
Architect
What are the source-to-
target mappings for the
DW?
Business Person
(e.g. HR)
How can I get new staff up-
to-speed on our company’s
business terminology?
31. Global Data Strategy, Ltd. 2016
Metadata is Needed by Business Stakeholders
31
Making business decisions on accurate and well-understood data
80% of users of metadata are from
the business, according to the
recent DATAVERSITY survey.
32. Global Data Strategy, Ltd. 2016
Metadata Publication & Reporting – Business Glossary
• A Business Glossary is a common way to publish business terms & their definitions.
• When sourced from a common repository, these terms are integrated with the wider data
landscape.
• Most data modeling tools can take the definitions from Logical and/or Conceptual data models
and publish them to a Glossary-style format, via web portals or reports.
32
Business Term Abbreviation Definition Data Steward Security Level
BFPO Number BFPO Num
BFPO Number is for British Forces Postal Office. It can be
used in UK and overseas addresses. Accounting Unclassified
Interest Int The growth in capital of a monetary investment Finance Unclassified
PO Box POB
A numbered box in a post office assigned to a person or
organization, where mail for them is kept until collected Accounting Unclassified
A feedback mechanism is important to gather valuable input & updates
from users.
33. Global Data Strategy, Ltd. 2016
Metadata Publication & Reporting – Lineage
• Data Lineage can be visualized through a web portal or reports.
• With web-based reporting, users can drill-down into each data source and investigate further
lineage.
33
34. Global Data Strategy, Ltd. 2016
Metadata Publication & Reporting – Data Structures
• Having a common view of standard data structures is helpful for data architects, developers, etc.
• This can all be sourced from a data model.
34
Table Name Column Name Attribute Name Data Type Nullability Primary Key Definition
CUSTOMER CUST_ID Customer Identifier VARCHAR(20) NOT NULL Yes
Customer ID is the unique identifier that
locates a customer
F_NAME First Name VARCHAR(30) NOT NULL No The given name of an individual
L_NAME Last Name VARCHAR(40) NOT NULL No The family name of an individual
ORDER ORDER_ID Order Identifier VARCHAR(10) NOT NULL Yes
The number assigned to an order from
the FIX10 system that locates a unique
order.
Etc.
35. Global Data Strategy, Ltd. 2016
Data Models can provide “Just Enough” Metadata Management
35
Metadata
Storage
Metadata
Lifecycle &
Versioning
Data Lineage
Visualization
Business Glossary Data Modeling
Metadata
Discovery &
Integration w/
Other Tools
Customizable
Metamodel
Data Modeling Tools
(e.g. Erwin, SAP
PowerDesigner, Idera
ER/Studio)
x X x X X x
Metadata Repositories (e.g.
ASG, Adaptive) X X X X X X
Data Governance Tools (e.g.
Collibra, Diaku) x x X x x
Spreadsheets x x x
• While data modeling tools are not metadata repositories, nor designed to be, they offer many features shared with these
repository solutions:
• Metadata storage, Data lineage visualization, Business Glossary, Integration with BI tools, ETL tools, etc.
• Metadata repositories have a broader range metadata sources & dedicated metadata management support.
• And Data Modeling tools, of course, have the added benefit of doing data modeling!
• And the benefit is that much of the needed metadata is in these data models.
36. Global Data Strategy, Ltd. 2016
Key Components of Metadata Management
36
Metadata Strategy Metadata Capture &
Storage
Metadata Integration &
Publication
Metadata Management &
Governance
Alignment with business goals
& strategy
Identification of all internal &
external metadata sources
Identification of all technical
metadata sources
Metadata roles &
responsibilities defined
Identification of & feedback
from key stakeholders
Population/import mechanism
for all identified sources
Identification of key
stakeholders & audiences
(internal & external)
Metadata standards created
Prioritization of key activities
aligned with business needs &
technical capabilities
Identification of existing
metadata storage
Integration mechanism for key
technologies (direct
integration, export, etc.)
Metadata lifecycle
management defined &
implemented
Prioritization of key data
elements/subject areas
Definition of enterprise
metadata storage strategy
Publication mechanism for
each audience
Metadata quality statistics
defined & monitored
Communication Plan
developed
Feedback mechanism for each
audience
Metadata integrated into
operational activities & related
data management projects
37. Global Data Strategy, Ltd. 2016
Implementing a Metadata Strategy
• A successful metadata strategy requires input from multiple factors.
37
Business Drivers & Motivation
Metadata Sources & Technology
Metadata Management MaturityStakeholders & Audience Metadata
Strategy
38. Global Data Strategy, Ltd. 2016
Stakeholder Feedback
• Determine key business issues & drivers through direct feedback.
38
I didn’t know we had any
documented data
standards
Where do I go to get the
definition of “default
banking standard”?
$12m has been spent on
projects to clean up the data
over the past 2-3 years
What are the data structures
used in the application?
We have 15 customer
databases – with many
duplications.
There is limited ownership or
enforcement of common
practices and standards
across the projects
Key subject matter experts
are relied upon to review
detailed data from various
systems to ensure accuracy.
I just joined the company and
don’t understand all of the
acronyms!
There was an error in reporting
products by customer & region
that was noticed by upper
management.
I need a central, accurate
view of all my customers
worldwide.
39. Global Data Strategy, Ltd. 2016
Mapping Business Drivers to Metadata Management
Capabilities
39
Business Drivers
Digital
Self Service
Increasing Regulatory
Pressures
Online Community &
Social Media
Community Building
External Drivers
Internal Drivers
Targeted Marketing
360 View of Customer
Brand Reputation
Efficient IT
Stakeholder Challenges
Lack of Business Alignment
• Data spend not aligned to Business Plans
• Business users not involved with data
Integrating Data
• Siloed systems
• No common view of key information
3 Data Quality Issues
• Bad customer info causing Brand damage
• Completeness & Accuracy Needed
4
Cost of Data Management
• Manual entry increases costs
• System redundancy
• No reuse or standards
5 No Audit Trails
• No lineage of changes
• Fines had been levied in past for lack of
compliance
6 Big Data Exploitation
• Exploiting Unstructured Data
• Access to External & Social Data
1
Shows “Heat
Map” of Priorities
2
3
4
5
6
Metadata Capability
Metadata Strategy
Metadata Capture &
Storage
Metadata Integration &
Publication
Metadata Management &
Governance
1 2 3 4 5 6
2 3 4 5 6
2 3 4
1 2 3 4 5 6
40. Global Data Strategy, Ltd. 2016
Inventory & Usage Mapping
• It’s also important to determine which teams are using these technologies to
create a “heat map” of usage & priority.
40
Metadata Sources Leadership Sales Finance Marketing Support R&D HR Legal Compliance
Relational Databases
MySQL X
Oracle X X X X X X X X
SQL Server X X
Sybase X
Etc.
BI Tools
Tableau X X X X X X
Qlik X X X
Etc.
Open Data
Data.gov – agricultural data X X X
Etc.
41. Global Data Strategy, Ltd. 2016
Metadata Roles & Responsibilities
• It’s important to establish formal roles & responsibilities for your metadata effort.
• Some may be part-time, and some full-time, but they should be clearly defined and
communicated so that staff has understanding of and accountability for their roles.
• Executive Sponsor/Champion: Understands & communicates the importance of metadata
management across the organization.
• Steering Group: As part of a metadata management effort, or part of a larger data governance effort,
the steering group prioritizes & sets direction for key activities.
• Data Stewards: Responsible for business definitions & rules for key data elements.
• Metadata Repository Administrator: Manages the administration, population, and interfaces of a
metadata repository.
• Metadata Publicist: Establishes reports & publication methods to end users.
• Metadata Consumers: Actively use metadata as part of their daily jobs, and are held accountable for
using published standards.
• Data Modelers
• Developers
• Business Users
• Report Developers
• Etc.
41
42. Global Data Strategy, Ltd. 2016
Monitoring Metadata Quality & Metrics
• Metadata is a key driver of data quality, and to support this, the metadata itself must be of high
quality.
• In order to ensure that quality metadata is maintained, it must be actively managed and
monitored. Dashboards & Reports can be used to monitor key quality indicators.
• Key metadata quality indicators include:
• Completeness: e.g. Do definitions exist for all key data elements?
• Accuracy: e.g. Are current definitions correct? Do data types accurately represent currently
implemented standards?
• Currency/ Timeliness: e.g. Are metadata definitions current or outdated?
• Consistency: e.g. Are metadata standards defined, published & implemented consistently across the
organization?
• Accountability: e.g. Are data stewards or owners defined?
• Integrity: e.g. Are linkages and relationships established between critical metadata items?
• Privacy: e.g. Is any metadata subject to privacy restrictions?
• Usability: e.g. Are people actually using this metadata?
42
43. Global Data Strategy, Ltd. 2016
Summary
• Metadata is more important than ever
• Data models are a rich source of metadata
• While metadata repositories are valuable, data models & associated functionality can often
provide “just enough” metadata management
• Business definitions
• Technical data structures
• Data lineage & impact analysis
• Visual models
• Organizational considerations are critical to achieve success
• Understanding business drivers
• Defining roles & responsibilities
• Monitoring metadata quality & metrics
• Have fun! Metadata is for the cool kids.
44. Global Data Strategy, Ltd. 2016
About Global Data Strategy, Ltd
• Global Data Strategy is an international information management consulting company that specializes
in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of
technical expertise in the industry.
44
Data-Driven Business Transformation
Business Strategy
Aligned With
Data Strategy
Visit www.globaldatastrategy.com for more information
45. Global Data Strategy, Ltd. 2016
Contact Info
• Email: donna.burbank@globaldatastrategy.com
• Twitter: @donnaburbank
@GlobalDataStrat
• Website: www.globaldatastrategy.com
• Company Linkedin: https://www.linkedin.com/company/global-data-strategy-ltd
• Personal Linkedin: https://www.linkedin.com/in/donnaburbank
45
46. Global Data Strategy, Ltd. 2016
White Paper: Emerging Trends in Metadata Management
• Download from www.dataversity.net
• Under ‘Whitepapers’
46
Free Download
47. Global Data Strategy, Ltd. 2016
DATAVERSITY Training Center
• Learn the basics of Metadata Management and practical tips on how to apply metadata
management in the real world. This online course hosted by DATAVERSITY provides a series of six
courses including:
• What is Metadata
• The Business Value of Metadata
• Sources of Metadata
• Metamodels and Metadata Standards
• Metadata Architecture, Integration, and Storage
• Metadata Strategy and Implementation
• Purchase all six courses for $399 or individually at $79 each.
Register here
• Other courses available on Data Governance & Data Quality
47
Online Training Courses
New Metadata Management Course
Visit: http://training.dataversity.net/lms/
48. Global Data Strategy, Ltd. 2016
Lessons in Data Modeling Series
• July 28th Why a Data Model is an Important Part of your Data Strategy
• August 25th Data Modeling for Big Data
• September 22nd UML for Data Modeling – When Does it Make Sense?
• October 27th Data Modeling & Metadata Management
• December 6th Data Modeling for XML and JSON
48
Join us next time