Data-Ed Engineering Solutions to Data Quality Challenges
DATAVERSITY
1.
Data Quality Engineering
This presentation provides guidance to organizations that are considering or preparing for data quality initiatives. It illustrates how organizations with chronic business challenges can often trace the root of the problem to poor data quality. Showing how data quality can be engineered provides a useful framework in which to develop an organizational approach. This in turn allows organizations to more quickly distinguish data problems caused by structural issues from practice-oriented defects. Participants will also learn the importance of practicing data quality engineering quantification.
Date: October 9, 2012
Time: 2:00 PM ET
Presented by: Dr. Peter Aiken
[Figure: data life cycle diagram. Activities and their tasks:]
• Metadata Creation: Define Data Architecture; Define Data Model Structures
• Metadata Refinement: Correct Structural Defects; Update Implementation
• Metadata Structuring: Implement Data Model Views; Populate Data Model Views
• Data Refinement: Correct Data Value Defects; Re-store Data Values
• Data Creation: Create Data; Verify Data Values
• Metadata & Data Storage
• Data Assessment: Assess Data Values; Assess Metadata
• Data Utilization: Inspect Data; Present Data
• Data Manipulation: Manipulate Data; Update Data
[Figure labels: starting point for new system development; starting point for existing systems; architecture refinements; data architecture; corrected data; data architecture and data models; facts & meanings; data performance metadata; shared data; updated data]
Produced by: DATA BLUEPRINT, 10124-C W. BROAD ST, GLEN ALLEN, VA 23060. Classification: EDUCATION. Date: 10/09/12.
10/04/12 © Copyright this and previous years by Data Blueprint - all rights reserved!
2.
Get Social With Us!
• Live Twitter Feed: Join the conversation! Follow us: @datablueprint, @paiken. Ask questions and submit your comments: #dataed
• Like Us on Facebook: www.facebook.com/datablueprint. Post questions and comments. Find industry news, insightful content and event updates.
• Join the Group: Data Management & Business Intelligence. Ask questions, gain insights and collaborate with fellow data management professionals.
3.
Meet Your Presenter: Dr. Peter Aiken
• Internationally recognized thought leader in the data management field with 30 years of experience
– Recipient of multiple international awards
– Founder, Data Blueprint (http://datablueprint.com)
• 7 books and dozens of articles
• Experienced with 500+ data management practices in 20 countries
• Multi-year immersions with organizations as diverse as the US DoD, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia and Walmart
4.
Data Quality Engineering
5.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
Tweeting now: #dataed
6.
The DAMA Guide to the Data Management Body of Knowledge
Published by DAMA International
• The professional association for Data Managers (40 chapters worldwide)
DMBoK organized around
• Primary data management functions focused around data delivery to the organization
• Several environmental elements
[Figure: Data Management Functions]
7.
The DAMA Guide to the Data Management Body of Knowledge
Amazon: http://www.amazon.com/DAMA-Guide-Management-Knowledge-DAMA-DMBOK/dp/0977140083
Or enter the terms "dama dm bok" at the Amazon search engine
[Figure: Environmental Elements]
8.
What is the CDMP?
• Certified Data Management Professional
• DAMA International and ICCP
• Membership in a distinct group made up of your fellow professionals
• Recognition for your specialized knowledge in a choice of 17 specialty areas
• Series of 3 exams
• For more information, please visit:
– http://www.dama.org/i4a/pages/index.cfm?pageid=3399
– http://iccp.org/certification/designations/cdmp
9.
Data Management
10.
Data Management
• Data Program Coordination: Manage data coherently.
• Organizational Data Integration: Share data across boundaries.
• Data Stewardship: Assign responsibilities for data.
• Data Development: Engineer data delivery systems.
• Data Support Operations: Maintain data availability.
11.
Data Management
12.
Overview: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
13.
Overview: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
14.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
Tweeting now: #dataed
15.
Definitions
Data Quality Management
• Planning, implementation and control activities that apply quality management techniques to measure, assess, improve, and ensure the fitness of data for use
• Entails the establishment and deployment of roles and responsibilities concerning the acquisition, maintenance, dissemination, and disposition of data (http://www2.sas.com/proceedings/sugi29/098-29.pdf)
• Critical support process in organizational change management
• Continuous process for defining the parameters for specifying acceptable levels of data quality to meet business needs and for ensuring that data quality meets these levels
Data Quality
• Synonymous with information quality, since poor data quality results in inaccurate information and poor business performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
16.
Overview: DQM Concepts and Activities
1) Data Quality Management Approach
2) Develop and promote data quality awareness
3) Define data quality requirements
4) Profile, analyze and assess data quality
5) Define data quality metrics
6) Define data quality business rules
7) Test and validate data quality requirements
8) Set and evaluate data quality service levels
9) Measure and monitor data quality
10) Manage data quality issues
11) Clean and correct data quality defects
12) Design and implement operational DQM procedures
13) Monitor operational DQM procedures and performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
17.
Concepts and Activities
• Data quality expectations provide the inputs necessary to define the data quality framework:
– Requirements
– Inspection policies
– Measures and monitors that reflect changes in data quality and performance
• The data quality framework requirements reflect 3 aspects of business data expectations:
1) A manner to record the expectation in business rules
2) A way to measure the quality of data within that dimension
3) An acceptability threshold
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
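The three aspects listed on this slide (a recorded business rule, a way to measure against it, and an acceptability threshold) can be sketched as a small data structure. This is only an illustrative sketch, not something taken from the deck or the DMBoK: the `Expectation` class, the `email_present` rule, the sample records, and the 0.9 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


@dataclass(frozen=True)
class Expectation:
    """Hypothetical container for one business data expectation."""
    name: str
    rule: Callable[[Record], bool]  # 1) the expectation recorded as a business rule
    threshold: float                # 3) minimum acceptable fraction of passing records

    def measure(self, records: Iterable[Record]) -> float:
        """2) Measure quality as the fraction of records that satisfy the rule."""
        records = list(records)
        if not records:
            return 1.0  # assumption: an empty data set passes vacuously
        passed = sum(1 for record in records if self.rule(record))
        return passed / len(records)

    def is_acceptable(self, records: Iterable[Record]) -> bool:
        return self.measure(records) >= self.threshold


# Invented sample data: one of four customers has no e-mail address.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]
email_present = Expectation("email_present", lambda r: bool(r.get("email")), 0.9)
score = email_present.measure(customers)
acceptable = email_present.is_acceptable(customers)
```

Keeping the rule, the measurement, and the threshold together in one object means a single definition can be reused for assessment and for later monitoring.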
18.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
Tweeting now: #dataed
19.
The DQM Cycle
• The general approach to DQM is a version of the Deming cycle.
• Deming proposes a problem-solving model known as "plan-do-study-act" or "plan-do-check-act".
• The cycle begins by:
1) Identifying data issues that are critical to the achievement of business objectives
2) Defining business requirements for data quality
3) Identifying key data quality dimensions
4) Defining business rules critical to ensuring high quality data
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
20.
The DQM Cycle: (1) Plan
Plan for the assessment of the current state and identification of key metrics for measuring quality
• The data quality team assesses the scope of known issues
• This involves:
  – Determining cost and impact
  – Evaluating alternatives for addressing them
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
21.
The DQM Cycle: (2) Deploy
Deploy processes for measuring and improving the quality of data:
• Data profiling
• Institute inspections and monitors to identify data issues when they occur
• Fix flawed processes that are the root cause of data errors or correct errors downstream
• When it is not possible to correct errors at their source, correct them at their earliest point in the data flow
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
22.
The DQM Cycle: (3) Monitor
Monitor the quality of data as measured against the defined business rules
• If data quality meets defined thresholds for acceptability, the processes are in control and the level of data quality meets the business requirements
• If data quality falls below acceptability thresholds, notify data stewards so they can take action during the next stage
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
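The monitoring decision described here reduces to comparing each measured score with its threshold and collecting the cases that need a steward's attention. The sketch below assumes scores and thresholds are fractions between 0 and 1; the metric names and numbers are invented for illustration, and the returned list stands in for whatever notification channel an organization actually uses.

```python
def monitor(measurements, thresholds):
    """Return (metric, score, threshold) for every metric below its threshold."""
    alerts = []
    for metric, score in measurements.items():
        threshold = thresholds[metric]
        if score < threshold:
            alerts.append((metric, score, threshold))
    return alerts


# Hypothetical thresholds and one round of measurements.
thresholds = {"completeness": 0.98, "validity": 0.95}
measurements = {"completeness": 0.99, "validity": 0.91}

alerts = monitor(measurements, thresholds)
in_control = not alerts  # no alerts means the process is in control

for metric, score, threshold in alerts:
    # Stand-in for notifying the responsible data steward.
    print(f"notify steward: {metric} at {score:.2f}, below threshold {threshold:.2f}")
```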
23.
The DQM Cycle: (4) Act
Act to resolve any identified issues to improve data quality and better meet business expectations
• New cycles begin as new data sets come under investigation or as new data quality requirements are identified for existing data sets
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
25.
Develop and Promote DQ Awareness
• Promoting data quality awareness is essential to ensure buy-in of necessary stakeholders in the organization
• Ensure that the right people in the organization are aware of the existence of data quality issues
• Awareness increases the chance of success of any DQM program
• Awareness includes:
  – Relating material impacts to data issues
  – Ensuring systematic approaches to regulators
  – Oversight of the quality of organizational data
  – Socializing the concept that data quality problems cannot be solely addressed by technology solutions
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
26.
Polling Question #1
Which is not a step to promote data quality awareness?
a) Training on the core concepts of data quality
b) Establish data governance framework for data quality
c) Create a data architecture map
27.
Develop and Promote DQ Awareness: Steps
1) Training on the core concepts of data quality
2) Establish data governance framework for data quality
3) Create a data quality oversight board that has a reporting hierarchy associated with the different data governance roles
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
28.
Define DQ Requirements
• Data quality must be understood within the context of ‘fitness for use’
• Data quality requirements are often hidden within defined business policies
• Incremental detailed review and iterative refinement of business policies helps to identify those information requirements which become data quality rules
• Steps for incremental detailed review:
  – Identify key data components associated with business policies
  – Determine how identified data assertions affect the business
  – Evaluate how data errors are categorized within a set of data quality dimensions
  – Specify the business rules that measure the occurrence of data errors
  – Provide a means for implementing measurement processes that assess conformance to those business rules
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
29.
Data Quality Dimensions
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
30.
Profile, Analyze and Assess DQ
Data assessment using 2 different approaches:
1) Bottom-up
2) Top-down
Bottom-up assessment:
• Inspection and evaluation of the data sets
• Highlight potential issues based on the results of automated processes
Top-down assessment:
• Engage business users to document their business processes and the corresponding critical data dependencies
• Understand how their processes consume data and which data elements are critical to the success of the business application
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
31.
Define DQ Metrics
• Metrics development occurs as part of the strategy/design/plan step
• Process for defining data quality metrics:
  1) Select one of the identified critical business impacts
  2) Evaluate the dependent data elements and the create and update processes associated with that business impact
  3) List any associated data requirements
  4) Specify the associated dimension of data quality and one or more business rules to use to determine conformance of the data to expectations
  5) Describe the process for measuring conformance
  6) Specify an acceptability threshold
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
32.
Test and Validate DQ Requirements
• Data profiling tools analyze data to find potential anomalies
• Use the same tools for rule validation
• Rules discovered or defined during the data quality assessment phase are referenced in measuring conformance as part of the operational process
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
33.
Set and Evaluate DQ Service Levels
• Data quality inspection and monitoring are used to measure and monitor compliance with defined data quality rules
• Data quality SLAs specify the organization’s expectations for response and remediation
• Operational data quality control defined in data quality SLAs includes:
  – Data elements covered by the agreement
  – Business impacts associated with data flaws
  – Data quality dimensions associated with each data element
  – Quality expectations for each data element of the identified dimensions in each application or system in the value chain
  – Methods for measuring against those expectations
  – (…)
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
34.
Measure and Monitor DQ
• DQM procedures depend on available data quality measuring and monitoring services
• 2 contexts for control/measurement of conformance to data quality business rules exist:
  – In-stream: collect in-stream measurements while creating data
  – In batch: perform batch activities on collections of data instances assembled in a data set
• Apply measurements at 3 levels of granularity:
  – Data element value
  – Data instance or record
  – Data set
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
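To make the two contexts and three granularity levels concrete, here is a rough Python sketch. The order records, the five-digit ZIP rule, the ship-date rule, and the 0.95 threshold are all hypothetical; the point is only to show a value-level check, a record-level check built on it, a data-set-level check applied in batch, and the same record check applied in-stream at creation time.

```python
import re


def check_value(zip_code):
    """Data element value: a single field against a format rule."""
    return re.fullmatch(r"\d{5}", zip_code) is not None


def check_record(record):
    """Data instance or record: value rules plus a cross-field rule."""
    # ISO 8601 date strings compare correctly as plain strings.
    return check_value(record["zip"]) and record["ship_date"] >= record["order_date"]


def check_data_set(records, threshold=0.95):
    """Data set (batch): share of conforming records against a threshold."""
    valid = sum(1 for record in records if check_record(record))
    return valid / len(records) >= threshold


def ingest(record, store):
    """In-stream: measure while the record is being created and keep the result."""
    ok = check_record(record)
    store.append({**record, "dq_ok": ok})
    return ok


orders = [
    {"zip": "23060", "order_date": "2012-10-01", "ship_date": "2012-10-03"},
    {"zip": "2306", "order_date": "2012-10-02", "ship_date": "2012-10-04"},
]

store = []
stream_results = [ingest(order, store) for order in orders]  # in-stream
batch_ok = check_data_set(orders)                            # in batch
```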
35.
Manage DQ Issues
• Supporting the enforcement of the data quality SLA requires a mechanism for reporting and tracking data quality incidents and activities for researching and resolving those incidents
• A data quality incident reporting system can provide this capability
• It can log the evaluation, initial diagnosis, and actions associated with data quality events
Clean & Correct DQ Defects
Perform data correction in 3 ways:
1) Automated correction
2) Manual directed correction
3) Manual correction
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
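An incident reporting system of the kind described here needs little more than a record per incident and a time-stamped history of the evaluation, diagnosis, and actions taken. The following Python sketch is one possible shape for such a record; the `Incident` class, its field names, and the sample entries are assumptions for illustration, not a description of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Incident:
    """Hypothetical data quality incident with a time-stamped history."""
    data_element: str
    description: str
    status: str = "open"
    history: list = field(default_factory=list)

    def log(self, stage, note):
        """Record one step (evaluation, diagnosis, action) with a UTC timestamp."""
        self.history.append((datetime.now(timezone.utc), stage, note))

    def resolve(self, note):
        """Log the corrective action and close the incident."""
        self.log("action", note)
        self.status = "resolved"


incident = Incident("customer.zip", "ZIP codes with only four digits")
incident.log("evaluation", "Affects 2% of records loaded this week")
incident.log("diagnosis", "Leading zero dropped by a spreadsheet export")
incident.resolve("Manual directed correction: re-padded values after steward review")

stages = [stage for _, stage, _ in incident.history]
```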
36.
Manage DQ Issues: Example
Data quality incident tracking focuses on training staff to recognize when data issues appear and how they are to be classified, logged and tracked according to the data quality SLA
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
37.
Design and Implement Operational DQM Procedures
1) Inspection and monitoring
2) Diagnosis and evaluation of remediation alternatives
3) Resolve issues
4) Reporting
Monitor Operational DQM Procedures and Performances
1) Accountability is critical to governance protocols overseeing data quality control
2) All issues must be assigned
3) The tracking process should specify and document the ultimate issue accountability
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
39.
Example: Data Quality Interview Session Summary
• During mid-February, the Data Governance Team and Data Blueprint conducted ten qualitative interview sessions with groups of individuals who interact with data on a regular basis
• A series of patterns emerged as participants shared stories about the impact of poor data quality on the client, its products, and its customers
• These patterns highlight gaps in best practices for ensuring data quality, i.e. the extent to which data is “fit for use”
• Our preliminary analysis evaluated these stories against attributes of four data quality dimensions
• At this early stage of the post-interview process, we are seeking confirmation of our assumptions and method
40.
Which Activities Support Quality Data?
• Data quality best practices depend on both
  – Practice-oriented activities
  – Structure-oriented activities
[Diagram: practice-oriented activities, which focus on the capture and manipulation of data, and structure-oriented activities, which focus on the data implementation, both support Quality Data]
41.
Quality Dimensions
Practice-oriented causes
• Stem from a failure to apply rigor when capturing and manipulating data. Examples of such rigor:
  – Edit masking
  – Range checking of input data
  – CRC-checking of transmitted data
Structure-oriented causes
• Occur because of data and metadata that have been arranged imperfectly. For example:
  – When the data is in the system but we just can't access it;
  – When a correct data value is provided as the wrong response to a query; or
  – When data is not provided because it is unavailable or inaccessible to the customer
• Developer focus within system boundaries instead of within organization boundaries
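The three practice-oriented safeguards named on this slide are easy to illustrate with the Python standard library. In the sketch below the mask, the age range, and the payload are made-up examples; `re.fullmatch` stands in for an edit mask, a simple comparison for range checking, and `zlib.crc32` for CRC-checking of transmitted data.

```python
import re
import zlib


def matches_edit_mask(value, mask=r"\d{3}-\d{2}-\d{4}"):
    """Edit masking: accept input only if it fits the expected shape."""
    return re.fullmatch(mask, value) is not None


def in_range(value, low, high):
    """Range checking: accept input only inside the allowed interval."""
    return low <= value <= high


def with_crc(payload: bytes):
    """Sender side: attach a CRC-32 checksum to the payload."""
    return payload, zlib.crc32(payload)


def crc_ok(payload: bytes, checksum: int):
    """Receiver side: recompute the checksum and compare."""
    return zlib.crc32(payload) == checksum


mask_ok = matches_edit_mask("123-45-6789")
mask_bad = matches_edit_mask("123456789")
age_ok = in_range(130, 0, 120)

payload, checksum = with_crc(b"order=42")
intact = crc_ok(payload, checksum)
corrupted = crc_ok(b"order=43", checksum)  # one character changed in transit
```

All three checks reject bad data at the point of capture or transfer, which is where practice-oriented defects originate.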
42.
Practice-Oriented Activities
• Affect the Data Value Quality and Data Representation Quality
• Examples of improper practice-oriented activities:
  – Allowing imprecise or incorrect data to be collected when requirements specify otherwise
  – Presenting data out of sequence
• Typically diagnosed in bottom-up manner: find and fix the resulting problem
• Addressed by imposing more rigorous data-handling governance
[Diagram: practice-oriented activities affect Quality of Data Values and Quality of Data Representation]
43.
Structure-Oriented Activities
• Affect the Data Model Quality and Data Architecture Quality
• Examples of improper structure-oriented activities:
  – Providing a correct response but incomplete data to a query because the user did not comprehend the system data structure
  – Costly maintenance of inconsistent data used by redundant systems
• Typically diagnosed in top-down manner: root cause fixes
• Addressed through fundamental data structure governance
[Diagram: structure-oriented activities affect Quality of Data Models and Quality of Data Architecture]
44.
4 Dimensions of Data Quality
An organization’s overall data quality is a function of four distinct components, each with its own attributes:
Practice-oriented:
• Data Value: the quality of data as stored & maintained in the system
• Data Representation: the quality of representation for stored values; perfect data values stored in a system that are inappropriately represented can be harmful
Structure-oriented:
• Data Model: the quality of data logically representing user requirements related to data entities, associated attributes, and their relationships; essential for effective communication among data suppliers and consumers
• Data Architecture: the coordination of data management activities in cross-functional system development and operations
45.
Effective Data Quality Engineering
• Data quality engineering has been focused on operational problem correction
  – Directing attention to practice-oriented data imperfections
• Data quality engineering is more effective when also focused on structure-oriented causes
  – Ensuring the quality of shared data across system boundaries
[Diagram, ordered from closer to the user to closer to the architect:]
• Data Representation Quality: as presented to the user
• Data Value Quality: as maintained in the system
• Data Model Quality: as understood by developers
• Data Architecture Quality: as an organizational asset
46.
Full Set of Data Quality Attributes
47.
Data Value Quality
48.
Data Representation Quality
49.
Data Model Quality
50.
Data Architecture Quality
51.
Extended data life cycle model with metadata sources and uses
[Diagram: life cycle phases connected through Metadata & Data Storage]
• Metadata Creation: Define Data Architecture; Define Data Model Structures
• Metadata Refinement: Correct Structural Defects; Update Implementation
• Metadata Structuring: Implement Data Model Views; Populate Data Model Views
• Data Refinement: Correct Data Value Defects; Re-store Data Values
• Data Creation: Create Data; Verify Data Values
• Data Assessment: Assess Data Values; Assess Metadata
• Data Utilization: Inspect Data; Present Data
• Data Manipulation: Manipulate Data; Update Data
Flow labels: architecture refinements, data architecture, data architecture and data models, corrected data, data performance metadata, facts & meanings, shared data, updated data
Entry points marked in the diagram: "Starting point for new system development" and "Starting point for existing systems"
52.
Data Quality Engineering
[Figure not reproduced in this transcript; only its check marks survive]
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
53.
Goals and Principles
• To measurably improve the quality of data in relation to defined business expectations
• To define requirements and specifications for integrating data quality control into the system development life cycle
• To provide defined processes for measuring, monitoring, and reporting conformance to acceptable levels of data quality
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
54.
Activities
• Develop and Promote Data Quality Awareness
• Set and Evaluate Data Quality Service Levels
• Test and Validate Data Quality Requirements
• Profile, Analyze, and Assess Data Quality
• Continuously Measure and Monitor Data Quality
• Monitor Operational DQM Procedures and Performance
• Define Data Quality Business Rules
• Define Data Quality Metrics
• Manage Data Quality Issues
• Clean and Correct Data Quality Defects
• Define Data Quality Requirements
• Design and Implement Operational DQM Procedures
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
55.
Primary Deliverables
• Improved Quality Data
• Data Management Operational Analysis
• Data profiles
• Data Quality Certification Reports
• Data Quality Service Level Agreements
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
56.
Roles and Responsibilities
Suppliers:
• External Sources
• Regulatory Bodies
• Business Subject Matter Experts
• Information Consumers
• Data Producers
• Data Architects
• Data Modelers
• Data Stewards
Participants:
• Data Quality Analysts
• Data Analysts
• Database Administrators
• Data Stewards
• Other Data Professionals
• DRM Director
• Data Stewardship Council
Consumers:
• Data Stewards
• Data Professionals
• Other IT Professionals
• Knowledge Workers
• Managers and Executives
• Customers
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
57.
Polling Question #2
What is one guiding principle for data quality?
a. Business process owners will agree to and abide by data quality SLAs
b. Identify a blue record for all data elements
c. Upstream data consumers specify data quality expectations
59.
Technology
• Data Profiling Tools
• Statistical Analysis Tools
• Data Cleansing Tools
• Data Integration Tools
• Issue and Event Management Tools
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
60.
Overview: Data Quality Tools
4 categories of activities:
1) Analysis
2) Cleansing
3) Enhancement
4) Monitoring
Principal tools:
1) Data Profiling
2) Parsing and Standardization
3) Data Transformation
4) Identity Resolution and Matching
5) Enhancement
6) Reporting
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
61.
DQ Tool #1: Data Profiling
• Data profiling is the assessment of value distribution and clustering of values into domains
• Need to be able to distinguish between good and bad data before making any improvements
• Data profiling is a set of algorithms for 2 purposes:
  – Statistical analysis and assessment of the data quality values within a data set
  – Exploring relationships that exist between value collections within and across data sets
• At its most advanced, data profiling takes a series of prescribed rules from data quality engines. It then assesses the data, annotates and tracks violations to determine if they comprise new or inferred data quality rules
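A very small profile of a single column shows what "value distribution and clustering of values into domains" can look like in practice. The Python below is a sketch, not a profiling tool: the function names, the sample state codes, and the character-shape encoding (`A` for letters, `9` for digits) are assumptions chosen for the example.

```python
from collections import Counter


def shape(value):
    """Reduce a value to its character pattern, e.g. 'VA' -> 'AA', '23' -> '99'."""
    pattern = []
    for ch in str(value):
        if ch.isdigit():
            pattern.append("9")
        elif ch.isalpha():
            pattern.append("A")
        else:
            pattern.append(ch)
    return "".join(pattern)


def profile_column(values):
    """Summarize one column: nulls, distinct values, frequent values, patterns."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
        "patterns": Counter(shape(v) for v in non_null),
    }


# Hypothetical 'state' column with mixed conventions and missing values.
states = ["VA", "VA", "va", "Virginia", None, "23", ""]
profile = profile_column(states)
```

Here the pattern counts alone already separate the two-letter codes from the spelled-out name and the numeric outlier, which is the kind of signal an analyst would turn into a candidate rule.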
62.
DQ Tool #1: Data Profiling, cont’d
• Data profiling vs. data quality: business context and semantic/logical layers
  – Data quality is concerned with proscriptive rules
  – Data profiling looks for patterns when rules are adhered to and when rules are violated; able to provide input into the business context layer
• It is incumbent that data profiling services notify all concerned parties of whatever is discovered
• Profiling can be used to…
  – …notify the help desk that valid changes in the data are about to cause an avalanche of “skeptical user” calls
  – …notify business analysts of precisely where they should be working today in terms of shifts in the data
63.
DQ Tool #2: Parsing & Standardization
• Data parsing tools enable the definition of patterns that feed into a rules engine used to distinguish between valid and invalid data values
• Actions are triggered upon matching a specific pattern
• When an invalid pattern is recognized, the application may attempt to transform the invalid value into one that meets expectations
• Data standardization is the process of conforming to a set of business rules and formats that are set up by data stewards and administrators
• Data standardization example:
  – Bringing all the different formats of “street” into a single format, e.g. “STR”, “ST.”, “STRT”, “STREET”, etc.
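The "street" example can be sketched with a single regular expression. This is a deliberately naive illustration: the variant list comes from the slide, while the choice of "ST" as the target form, the upper-casing, and the function name are assumptions. A real rule set would also need to handle cases such as "St. James", where "St." means "Saint".

```python
import re

# Longest variants first so that "STREET" is not matched as "STR" or "ST".
STREET_VARIANTS = re.compile(r"\b(STREET|STRT|STR|ST)\b\.?")


def standardize_street(address, target="ST"):
    """Upper-case the address and replace every street variant with the target form."""
    return STREET_VARIANTS.sub(target, address.upper()).strip()


examples = ["12 Main Street", "12 Main STR", "12 main st.", "12 Main Strt"]
standardized = [standardize_street(a) for a in examples]
```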
64.
DQ Tool #3: Data Transformation
• Upon identification of data errors, trigger data rules to transform the flawed data
• Perform standardization and guide rule-based transformations by mapping data values in their original formats and patterns into a target representation
• Parsed components of a pattern are subjected to rearrangement, corrections, or any changes as directed by the rules in the knowledge base
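Mapping values from several original formats into one target representation can be illustrated with dates. In this sketch the list of accepted source formats plays the role of the rules in the knowledge base, and ISO 8601 is the assumed target; both choices, like the function name, are illustrative.

```python
from datetime import datetime

# Assumed source formats, tried in order; the target is ISO 8601 (YYYY-MM-DD).
SOURCE_FORMATS = ["%m/%d/%y", "%m/%d/%Y", "%Y-%m-%d"]


def to_iso_date(raw):
    """Return the date in ISO 8601 form, or None if no known format matches."""
    for fmt in SOURCE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # this format did not fit; try the next rule
    return None


converted = [to_iso_date(v) for v in ["10/09/12", "1/26/2010", "2012-10-04", "not a date"]]
```

Returning `None` for unparseable input keeps the flawed value visible for manual correction instead of silently guessing.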
65.
DQ Tool #4: Identity Resolution & Matching
• Data matching enables analysts to identify relationships between records for de-duplication or group-based processing
• Matching is central to maintaining data consistency and integrity throughout the enterprise
• The matching process should be used in the initial migration of data into a single repository
2 basic approaches to matching:
• Deterministic
  – Relies on defined patterns/rules for assigning weights and scores to determine similarity
  – Predictable
  – Dependent on the rule developers' anticipations
• Probabilistic
  – Relies on statistical techniques for assessing the probability that any pair of records represents the same entity
  – Not reliant on rules
  – Probabilities can be refined based on experience -> matchers can improve precision as more data is analyzed
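The contrast between the two approaches can be sketched side by side. The deterministic scorer below adds fixed, hand-assigned weights for agreeing fields. The probabilistic scorer uses a Fellegi-Sunter style log-likelihood ratio, which is one common statistical technique and is chosen here as an assumption; the slide does not name a specific method. All records, weights, and the m/u probabilities are invented for illustration; in practice the probabilities would be estimated from data and refined over time.

```python
import math


def norm(value):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value).lower().split())


def deterministic_score(a, b, weights):
    """Rule-based: add a fixed weight for every field that agrees exactly."""
    return sum(w for f, w in weights.items() if norm(a[f]) == norm(b[f]))


def probabilistic_score(a, b, m_u):
    """Statistical: sum log2 likelihood ratios per field.

    m = P(field agrees | same entity), u = P(field agrees | different entities).
    """
    total = 0.0
    for f, (m, u) in m_u.items():
        if norm(a[f]) == norm(b[f]):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


rec_a = {"name": "Jane Doe", "zip": "23060", "dob": "1980-01-01"}
rec_b = {"name": "jane  doe", "zip": "23060", "dob": "1980-01-02"}
rec_c = {"name": "John Roe", "zip": "90210", "dob": "1975-05-05"}

WEIGHTS = {"name": 50, "zip": 20, "dob": 30}                      # hand-assigned
M_U = {"name": (0.95, 0.01), "zip": (0.90, 0.10), "dob": (0.98, 0.02)}  # hypothetical

det_ab = deterministic_score(rec_a, rec_b, WEIGHTS)  # name and zip agree
prob_ab = probabilistic_score(rec_a, rec_b, M_U)     # positive: likely same entity
prob_ac = probabilistic_score(rec_a, rec_c, M_U)     # negative: likely different
```

The deterministic score is predictable but only as good as the weights its developers anticipated, whereas the probabilistic score changes as the m and u estimates are updated.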
66.
DQ Tool #5: Enhancement
Definition:
• A method for adding value to information by accumulating additional information about a base set of entities and then merging all the sets of information to provide a focused view. Improves master data.
Examples of data enhancements:
• Time/date stamps
• Auditing information
• Contextual information
• Geographic information
• Demographic information
• Psychographic information
Benefits:
• Enables use of third party data sources
• Allows you to take advantage of the information and research carried out by external data vendors to make data more meaningful and useful
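Enhancement as defined here amounts to merging extra attributes onto a base set of entities. The sketch below adds geographic information and a time stamp to customer records; the lookup table stands in for a third-party data source, and the function name, field names, and timestamp value are assumptions for the example.

```python
def enhance(records, geo_by_zip, loaded_at):
    """Return new records with geographic attributes and a load timestamp merged in."""
    enhanced = []
    for record in records:
        extra = geo_by_zip.get(record["zip"], {})  # no match: nothing to add
        enhanced.append({**record, **extra, "enhanced_at": loaded_at})
    return enhanced


# Hypothetical base set and a stand-in for an external geographic data source.
base = [{"id": 1, "zip": "23060"}, {"id": 2, "zip": "00000"}]
geo_by_zip = {"23060": {"city": "Glen Allen", "state": "VA"}}

result = enhance(base, geo_by_zip, "2012-10-09T00:00:00Z")
```

Building new dictionaries rather than updating the originals keeps the base set untouched, so the enhancement can be rerun when the external source changes.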
67.
DQ Tool #6: Reporting
• Good reporting supports:
  – Inspection and monitoring of conformance to data quality expectations
  – Monitoring performance of data stewards conforming to data quality SLAs
  – Workflow processing for data quality incidents
  – Manual oversight of data cleansing and correction
• Data quality tools provide dynamic reporting and monitoring capabilities
• Enables analysts and data stewards to support and drive the methodology for ongoing DQM and improvement with a single, easy-to-use solution
• Associate report results with:
  – Data quality measurement
  – Metrics
  – Activity
69.
Guiding Principles
1) Manage data as a core organizational asset
2) Identify a gold record for all data elements
3) All data elements will have a standardized data definition, data type, and acceptable value domain
4) Leverage data governance for the control and performance of DQM
5) Use industry and international data standards whenever possible
6) Downstream data consumers specify data quality expectations
7) Define business rules to assert conformance to data quality expectations
8) Validate data instances and data sets against defined business rules
9) Business process owners will agree to and abide by data quality SLAs
10) Apply data corrections at the original source if possible
11) If it is not possible to correct data at the source, forward data corrections to the owner of the original source. Influence on data brokers to conform to local requirements may be limited
12) Report measured levels of data quality to appropriate data stewards, business process owners, and SLA managers
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
70.
Interdependencies - Tools alone cannot do the job!
• Education and Training (People)
• Data Cleansing and Prevention (Process)
• Data Quality Tools (Technology)
71.
Summary: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
73.
Recommended Reading