Data-Ed Engineering Solutions to Data Quality Challenges
DATAVERSITY
1.
Data Quality Engineering
This presentation provides guidance to organizations considering or preparing for data quality initiatives. It illustrates how organizations with chronic business challenges can often trace the root of the problem to poor data quality. Showing how data quality can be engineered provides a useful framework in which to develop an organizational approach. This in turn allows organizations to more quickly distinguish data problems caused by structural issues from practice-oriented defects. Participants will also learn the importance of practicing data quality engineering quantification.
Date: October 9, 2012
Time: 2:00 PM ET
Presented by: Dr. Peter Aiken
[Figure: data life cycle model]
• Metadata Creation (starting point for new system development): Define Data Architecture, Define Data Model Structures
• Metadata Refinement: Correct Structural Defects, Update Implementation
• Metadata Structuring: Implement Data Model Views, Populate Data Model Views
• Data Refinement: Correct Data Value Defects, Re-store Data Values
• Data Creation: Create Data, Verify Data Values
• Metadata & Data Storage
• Data Assessment: Assess Data Values, Assess Metadata
• Data Utilization: Inspect Data, Present Data
• Data Manipulation (starting point for existing systems): Manipulate Data, Update Data
• Flow labels: architecture refinements, data architecture, data architecture and data models, corrected data, facts & meanings, data performance metadata, shared data, updated data
Produced by Data Blueprint, 10124-C W. Broad St, Glen Allen, VA 23060 • Classification: Education • Date: 10/09/12 • © Copyright this and previous years by Data Blueprint - all rights reserved!
2.
Get Social With Us!
• Live Twitter Feed: Join the conversation! Follow us: @datablueprint, @paiken. Ask questions and submit your comments: #dataed
• Like Us on Facebook: www.facebook.com/datablueprint. Post questions and comments. Find industry news, insightful content and event updates.
• Join the Group: Data Management & Business Intelligence. Ask questions, gain insights and collaborate with fellow data management professionals.
3.
Meet Your Presenter: Dr. Peter Aiken
• Internationally recognized thought leader in the data management field, with 30 years of experience
– Recipient of multiple international awards
– Founder, Data Blueprint (http://datablueprint.com)
• 7 books and dozens of articles
• Experienced with 500+ data management practices in 20 countries
• Multi-year immersions with organizations as diverse as the US DoD, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia and Walmart
4.
Data Quality Engineering
5.
Outline
  1. Data Management Introduction
  2. Data Quality Definitions & Overview
  3. DQM Cycle
  4. DQ Awareness & Requirements
  5. DQ Dimensions
  6. Data Quality Tools
  7. Guiding Principles
  8. References and Q&A
Tweeting now: #dataed
6.
The DAMA Guide to the Data Management Body of Knowledge
Published by DAMA International
• The professional association for Data Managers (40 chapters worldwide)
DMBoK organized around
• Primary data management functions focused around data delivery to the organization
• Several environmental elements
[Figure: Data Management Functions]
7.
The DAMA Guide to the Data Management Body of Knowledge
• Amazon: http://www.amazon.com/DAMA-Guide-Management-Knowledge-DAMA-DMBOK/dp/0977140083
• Or enter the terms "dama dm bok" at the Amazon search engine
[Figure: Environmental Elements]
8.
What is the CDMP?
• Certified Data Management Professional
• DAMA International and ICCP
• Membership in a distinct group made up of your fellow professionals
• Recognition for your specialized knowledge in a choice of 17 specialty areas
• Series of 3 exams
• For more information, please visit:
– http://www.dama.org/i4a/pages/index.cfm?pageid=3399
– http://iccp.org/certification/designations/cdmp
#dataed
9.
Data Management
10.
Data Management
• Data Program Coordination: Manage data coherently.
• Organizational Data Integration: Share data across boundaries.
• Data Stewardship: Assign responsibilities for data.
• Data Development: Engineer data delivery systems.
• Data Support Operations: Maintain data availability.
11.
Data Management
12.
Overview: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
13.
Overview: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
14.
Outline
  1. Data Management Introduction
  2. Data Quality Definitions & Overview
  3. DQM Cycle
  4. DQ Awareness & Requirements
  5. DQ Dimensions
  6. Data Quality Tools
  7. Guiding Principles
  8. References and Q&A
Tweeting now: #dataed
15.
Definitions
Data Quality Management
• Planning, implementation and control activities that apply quality management techniques to measure, assess, improve, and ensure the fitness of data for use
• "Entails the establishment and deployment of roles, responsibilities concerning the acquisition, maintenance, dissemination, and disposition of data." http://www2.sas.com/proceedings/sugi29/098-29.pdf
• Critical support process in organizational change management
• Continuous process for defining the parameters for specifying acceptable levels of data quality to meet business needs and for ensuring that data quality meets these levels
Data Quality
• Synonymous with information quality, since poor data quality results in inaccurate information and poor business performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
16.
Overview: DQM Concepts and Activities
1) Data Quality Management Approach
2) Develop and promote data quality awareness
3) Define data quality requirements
4) Profile, analyze and assess data quality
5) Define data quality metrics
6) Define data quality business rules
7) Test and validate data quality requirements
8) Set and evaluate data quality service levels
9) Measure and monitor data quality
10) Manage data quality issues
11) Clean and correct data quality defects
12) Design and implement operational DQM procedures
13) Monitor operational DQM procedures and performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
17.
Concepts and Activities
• Data quality expectations provide the inputs necessary to define the data quality framework:
– Requirements
– Inspection policies
– Measures and monitors that reflect changes in data quality and performance
• The data quality framework requirements reflect 3 aspects of business data expectations:
1) A manner to record the expectation in business rules
2) A way to measure the quality of data within that dimension
3) An acceptability threshold
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
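The three aspects of a business data expectation can be sketched in code. This is a minimal illustration and not something taken from the DAMA guide: the `Expectation` class, its field names, the 95% threshold, and the sample customer records are all hypothetical, chosen only to show a rule, a measure, and a threshold side by side.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


@dataclass(frozen=True)
class Expectation:
    """One business data expectation (hypothetical structure)."""
    name: str
    dimension: str                   # e.g. "completeness"
    rule: Callable[[Record], bool]   # 1) the expectation recorded as a business rule
    threshold: float                 # 3) minimum acceptable share of conforming records

    def measure(self, records: Iterable[Record]) -> float:
        """2) Measure quality as the fraction of records that satisfy the rule."""
        rows = list(records)
        if not rows:
            return 1.0  # assumption: an empty data set violates nothing
        passed = sum(1 for row in rows if self.rule(row))
        return passed / len(rows)

    def is_acceptable(self, records: Iterable[Record]) -> bool:
        return self.measure(records) >= self.threshold


# Illustrative data only
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": None},
]

email_present = Expectation(
    name="customer email is populated",
    dimension="completeness",
    rule=lambda row: bool(row.get("email")),
    threshold=0.95,
)

score = email_present.measure(customers)
acceptable = email_present.is_acceptable(customers)
print(f"{email_present.name}: {score:.0%} (acceptable: {acceptable})")
```

Keeping the rule, the measure, and the threshold in one object is only one possible design; it makes each expectation self-describing, which may help when the same rule is later reused for monitoring.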
18.
Outline
  1. Data Management Introduction
  2. Data Quality Definitions & Overview
  3. DQM Cycle
  4. DQ Awareness & Requirements
  5. DQ Dimensions
  6. Data Quality Tools
  7. Guiding Principles
  8. References and Q&A
Tweeting now: #dataed
19.
The DQM Cycle
• The general approach to DQM is a version of the Deming cycle
• Deming proposes a problem-solving model known as "plan-do-study-act" or "plan-do-check-act"
• The cycle begins by:
1) Identifying data issues that are critical to the achievement of business objectives
2) Defining business requirements for data quality
3) Identifying key data quality dimensions
4) Defining business rules critical to ensuring high quality data
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
20.
The DQM Cycle: (1) Plan
• Plan for the assessment of the current state and identification of key metrics for measuring quality
• The data quality team assesses the scope of known issues
• This involves:
– Determining cost and impact
– Evaluating alternatives for addressing them
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
21.
The DQM Cycle: (2) Deploy
Deploy processes for measuring and improving the quality of data:
• Data profiling
• Institute inspections and monitors to identify data issues when they occur
• Fix flawed processes that are the root cause of data errors or correct errors downstream
• When it is not possible to correct errors at their source, correct them at their earliest point in the data flow
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
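As a rough illustration of the data profiling step, the sketch below summarizes a single column using only the Python standard library. The function name `profile_column`, the treatment of empty strings as missing, and the sample order records are assumptions made for this example; dedicated profiling tools report far more than this.

```python
from collections import Counter
from typing import Any, Dict, Mapping, Sequence


def profile_column(rows: Sequence[Mapping[str, Any]], column: str) -> Dict[str, Any]:
    """Summarize one column: row count, missing values, distinct values, most common value."""
    values = [row.get(column) for row in rows]
    # Assumption: None and empty strings both count as missing.
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Illustrative data only
orders = [
    {"order_id": 1, "state": "VA"},
    {"order_id": 2, "state": "va"},
    {"order_id": 3, "state": ""},
    {"order_id": 4, "state": "VA"},
    {"order_id": 5, "state": None},
]

state_profile = profile_column(orders, "state")
print(state_profile)
```

Even this small profile surfaces two candidate issues in the sample: missing state values, and inconsistent casing ("VA" versus "va") inflating the distinct count.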
22.
The DQM Cycle: (3) Monitor
Monitor the quality of data as measured against the defined business rules
• If data quality meets defined thresholds for acceptability, the processes are in control and the level of data quality meets the business requirements
• If data quality falls below acceptability thresholds, notify data stewards so they can take action during the next stage
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
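The monitoring step can be pictured as a comparison of measured conformance against acceptability thresholds. The following is a minimal sketch, not part of the original slides; the rule names, scores, and threshold values are hypothetical.

```python
def rules_below_threshold(measured, thresholds):
    """Return the rules whose measured conformance is below the acceptable level.

    `measured` and `thresholds` map a rule name to a fraction between 0 and 1.
    Both dictionaries and their contents are illustrative assumptions.
    """
    return sorted(rule for rule, score in measured.items() if score < thresholds[rule])


# Hypothetical measurements from one monitoring run.
measured = {"customer_email_present": 0.99, "order_total_non_negative": 0.90}
thresholds = {"customer_email_present": 0.98, "order_total_non_negative": 0.95}

# Rules in this list would be reported to the data stewards; an empty list
# would mean the processes are in control.
to_notify = rules_below_threshold(measured, thresholds)
```

In practice the notification itself would go through whatever channel the organization's data stewards use; this sketch only identifies which rules need attention.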
23.
The DQM Cycle: (4) Act
Act to resolve any identified issues to improve data quality and better meet business expectations
• New cycles begin as new data sets come under investigation or as new data quality requirements are identified for existing data sets
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
24.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
Tweeting now: #dataed
25.
Develop and Promote DQ Awareness
• Promoting data quality awareness is essential to ensure buy-in of necessary stakeholders in the organization
• Ensure that the right people in the organization are aware of the existence of data quality issues
• Awareness increases the chance of success of any DQM program
• Awareness includes:
– Relating material impacts to data issues
– Ensuring systematic approaches to regulators
– Oversight of the quality of organizational data
– Socializing the concept that data quality problems cannot be solely addressed by technology solutions
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
26.
Polling Question #1
Which is not a step to promote data quality awareness?
a) Training on the core concepts of data quality
b) Establish data governance framework for data quality
c) Create a data architecture map
27.
Develop and Promote DQ Awareness: Steps
1) Training on the core concepts of data quality
2) Establish data governance framework for data quality
3) Create a data quality oversight board that has a reporting hierarchy associated with the different data governance roles
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
28.
Define DQ Requirements
• Data quality must be understood within the context of ‘fitness for use’
• Data quality requirements are often hidden within defined business policies
• Incremental detailed review and iterative refinement of business policies help to identify those information requirements which become data quality rules
• Steps for incremental detailed review:
– Identify key data components associated with business policies
– Determine how identified data assertions affect the business
– Evaluate how data errors are categorized within a set of data quality dimensions
– Specify the business rules that measure the occurrence of data errors
– Provide a means for implementing measurement processes that assess conformance to those business rules
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
29.
Data Quality Dimensions
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
30.
Profile, Analyze and Assess DQ
Data assessment using 2 different approaches:
1) Bottom-up
2) Top-down
Bottom-up assessment:
• Inspection and evaluation of the data sets
• Highlight potential issues based on the results of automated processes
Top-down assessment:
• Engage business users to document their business processes and the corresponding critical data dependencies
• Understand how their processes consume data and which data elements are critical to the success of the business application
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
31.
Define DQ Metrics
• Metrics development occurs as part of the strategy/design/plan step
• Process for defining data quality metrics:
1) Select one of the identified critical business impacts
2) Evaluate the dependent data elements, create and update processes associated with that business impact
3) List any associated data requirements
4) Specify the associated dimension of data quality and one or more business rules to use to determine conformance of the data to expectations
5) Describe the process for measuring conformance
6) Specify an acceptability threshold
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
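The six steps above can be captured in a single metric definition. The sketch below is one possible way to record them, assuming records are plain dictionaries; the class name `DataQualityMetric`, its fields, and the example values are all hypothetical and not taken from the DAMA guide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataQualityMetric:
    business_impact: str            # step 1: the critical business impact
    data_elements: list             # step 2: dependent data elements
    requirement: str                # step 3: associated data requirement
    dimension: str                  # step 4: data quality dimension
    rule: Callable[[dict], bool]    # step 4: business rule, True when a record conforms
    threshold: float                # step 6: acceptability threshold (0..1)

    def measure(self, records):
        """Step 5: conformance is the share of records that satisfy the rule."""
        if not records:
            return 1.0  # assumption: an empty data set counts as conformant
        passed = sum(1 for record in records if self.rule(record))
        return passed / len(records)

    def is_acceptable(self, records):
        return self.measure(records) >= self.threshold


# Hypothetical metric: postal codes must be present on shipping addresses.
metric = DataQualityMetric(
    business_impact="Returned shipments due to undeliverable addresses",
    data_elements=["postal_code"],
    requirement="Every shipping address carries a postal code",
    dimension="completeness",
    rule=lambda record: bool(record.get("postal_code")),
    threshold=0.75,
)
records = [{"postal_code": "23060"}, {"postal_code": ""}, {"postal_code": "10115"}, {}]
score = metric.measure(records)
```

Keeping the rule, the dimension, and the threshold together makes it easier to trace a failed measurement back to the business impact that motivated it.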
32.
Test and Validate DQ Requirements
• Data profiling tools analyze data to find potential anomalies
• Use the same tools for rule validation
• Rules discovered or defined during the data quality assessment phase are referenced in measuring conformance as part of the operational process
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
33.
Set and Evaluate DQ Service Levels
• Data quality inspection and monitoring are used to measure and monitor compliance with defined data quality rules
• Data quality SLAs specify the organization’s expectations for response and remediation
• Operational data quality control defined in data quality SLAs includes:
– Data elements covered by the agreement
– Business impacts associated with data flaws
– Data quality dimensions associated with each data element
– Quality expectations for each data element of the identified dimensions in each application or system in the value chain
– Methods for measuring against those expectations
– (…)
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
34.
Measure and Monitor DQ
• DQM procedures depend on available data quality measuring and monitoring services
• 2 contexts for control/measurement of conformance to data quality business rules exist:
– In-stream: collect in-stream measurements while creating data
– In batch: perform batch activities on collections of data instances assembled in a data set
• Apply measurements at 3 levels of granularity:
– Data element value
– Data instance or record
– Data set
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
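The three levels of granularity can be illustrated with one check per level. This is only a sketch with made-up rules and order records; the function names and fields are assumptions chosen for illustration.

```python
def check_value(value):
    """Element level: a quantity must be a non-negative integer."""
    return isinstance(value, int) and value >= 0


def check_record(record):
    """Record level: the ship date must not precede the order date.

    Dates are ISO 8601 strings, so string comparison matches date order.
    """
    return record["ship_date"] >= record["order_date"]


def check_data_set(records):
    """Data set level: order ids must be unique across the whole set."""
    ids = [record["order_id"] for record in records]
    return len(ids) == len(set(ids))


# Hypothetical batch of orders with one flaw at each level.
orders = [
    {"order_id": 1, "quantity": 2, "order_date": "2012-10-01", "ship_date": "2012-10-03"},
    {"order_id": 2, "quantity": -1, "order_date": "2012-10-02", "ship_date": "2012-10-01"},
    {"order_id": 2, "quantity": 5, "order_date": "2012-10-04", "ship_date": "2012-10-04"},
]
value_ok = [check_value(order["quantity"]) for order in orders]
record_ok = [check_record(order) for order in orders]
set_ok = check_data_set(orders)
```

The element and record checks could run in-stream as each record is created, while the data set check only makes sense in batch, once the records have been assembled.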
35.
Manage DQ Issues
• Supporting the enforcement of the data quality SLA requires a mechanism for reporting and tracking data quality incidents and activities for researching and resolving those incidents
• A data quality incident reporting system can provide this capability
• It can log the evaluation, initial diagnosis, and actions associated with data quality events
Clean & Correct DQ Defects
Perform data correction in 3 ways:
1) Automated correction
2) Manual directed correction
3) Manual correction
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
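An incident reporting system of the kind described above needs, at minimum, a record that accumulates the evaluation, diagnosis, and actions for each incident. The sketch below is a hypothetical data structure, not a description of any particular tool; the class, field names, and example notes are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Incident:
    incident_id: int
    description: str
    severity: str
    status: str = "open"
    history: list = field(default_factory=list)

    def log(self, kind, note):
        """Append a timestamped entry such as an evaluation, diagnosis, or action."""
        self.history.append((datetime.now(timezone.utc), kind, note))

    def resolve(self, note):
        """Record the resolving action and close the incident."""
        self.log("resolution", note)
        self.status = "resolved"


# Hypothetical incident moving through its life cycle.
incident = Incident(1, "Postal codes missing in nightly customer load", "high")
incident.log("evaluation", "About 4% of new customer records affected")
incident.log("diagnosis", "Web form stopped marking the field as required")
incident.resolve("Form validation restored; affected records corrected manually")
```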
36.
Manage DQ Issues: Example
Data quality incident tracking focuses on training staff to recognize when data issues appear and how they are to be classified, logged and tracked according to the data quality SLA
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
37.
Design and Implement Operational DQM Procedures
1) Inspection and monitoring
2) Diagnosis and evaluation of remediation alternatives
3) Resolve issues
4) Reporting
Monitor Operational DQM Procedures and Performances
1) Accountability is critical to governance protocols overseeing data quality control
2) All issues must be assigned
3) The tracking process should specify and document the ultimate issue accountability
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
39.
Example: Data Quality Interview Session Summary
• During mid-February, the Data Governance Team and Data Blueprint conducted ten qualitative interview sessions with groups of individuals who interact with data on a regular basis
• A series of patterns emerged as participants shared stories about the impact of poor data quality on the client, its products, and its customers
• These patterns highlight gaps in best practices for ensuring data quality, i.e. the extent to which data is “fit for use”
• Our preliminary analysis evaluated these stories against attributes of four data quality dimensions
• At this early stage of the post-interview process, we are seeking confirmation of our assumptions and method
40.
Which Activities Support Quality Data?
• Data quality best practices depend on both
– Practice-oriented activities
– Structure-oriented activities
Quality Data:
• Practice-oriented activities focus on the capture and manipulation of data
• Structure-oriented activities focus on the data implementation
41.
Quality Dimensions
Practice-oriented causes
• Stem from a failure to apply rigor when capturing and manipulating data, such as:
– Edit masking
– Range checking of input data
– CRC-checking of transmitted data
Structure-oriented causes
• Occur because of data and metadata that has been arranged imperfectly. For example:
– When the data is in the system but we just can't access it;
– When a correct data value is provided as the wrong response to a query; or
– When data is not provided because it is unavailable or inaccessible to the customer
• Developer focus within system boundaries instead of within organization boundaries
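The three practice-oriented safeguards named above (edit masking, range checking, CRC checking) can each be sketched in a few lines. The mask, the bounds, and the payload below are illustrative assumptions; `zlib.crc32` and `re.fullmatch` come from the Python standard library.

```python
import re
import zlib

# Edit mask: a hypothetical US ZIP code format, five digits with optional +4.
ZIP_MASK = re.compile(r"\d{5}(-\d{4})?")


def matches_mask(text):
    """Edit masking: accept input only if it fits the expected format."""
    return ZIP_MASK.fullmatch(text) is not None


def in_range(value, low, high):
    """Range checking: accept input only if it lies within plausible bounds."""
    return low <= value <= high


def with_checksum(payload: bytes):
    """Sender side: attach a CRC-32 checksum to the transmitted payload."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, checksum: int) -> bool:
    """Receiver side: recompute the CRC-32 and compare with the one sent."""
    return zlib.crc32(payload) == checksum


payload, crc = with_checksum(b"customer_id=42;age=37")
corrupted = b"customer_id=42;age=87"  # one byte altered in transit
```

A CRC detects accidental corruption in transit; it does not protect against deliberate tampering, and none of these checks can tell whether a well-formed value is actually true.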
42.
Practice-Oriented Activities
• Affect the Data Value Quality and Data Representation Quality
• Examples of improper practice-oriented activities:
– Allowing imprecise or incorrect data to be collected when requirements specify otherwise
– Presenting data out of sequence
• Typically diagnosed in bottom-up manner: find and fix the resulting problem
• Addressed by imposing more rigorous data-handling governance
Practice-oriented activities: Quality of Data Values, Quality of Data Representation
43.
Structure-Oriented Activities
• Affect the Data Model Quality and Data Architecture Quality
• Examples of improper structure-oriented activities:
– Providing a correct response but incomplete data to a query because the user did not comprehend the system data structure
– Costly maintenance of inconsistent data used by redundant systems
• Typically diagnosed in top-down manner: root cause fixes
• Addressed through fundamental data structure governance
Structure-oriented activities: Quality of Data Models, Quality of Data Architecture
44.
4 Dimensions of Data Quality
An organization’s overall data quality is a function of four distinct components, each with its own attributes:
• Data Value (practice-oriented): the quality of data as stored & maintained in the system
• Data Representation (practice-oriented): the quality of representation for stored values; perfect data values stored in a system that are inappropriately represented can be harmful
• Data Model (structure-oriented): the quality of data logically representing user requirements related to data entities, associated attributes, and their relationships; essential for effective communication among data suppliers and consumers
• Data Architecture (structure-oriented): the coordination of data management activities in cross-functional system development and operations
45.
Effective Data Quality Engineering
• Data quality engineering has been focused on operational problem correction
– Directing attention to practice-oriented data imperfections
• Data quality engineering is more effective when also focused on structure-oriented causes
– Ensuring the quality of shared data across system boundaries
From closer to the user to closer to the architect:
• Data Representation Quality: as presented to the user
• Data Value Quality: as maintained in the system
• Data Model Quality: as understood by developers
• Data Architecture Quality: as an organizational asset
46.
Full Set of Data Quality Attributes
47.
Data Value Quality
48.
Data Representation Quality
49.
Data Model Quality
50.
Data Architecture Quality
51.
Extended data life cycle model with metadata sources and uses
• Metadata Creation (starting point for new system development): Define Data Architecture; Define Data Model Structures
• Metadata Refinement: Correct Structural Defects; Update Implementation
• Metadata Structuring: Implement Data Model Views; Populate Data Model Views
• Data Creation: Create Data; Verify Data Values
• Data Utilization (starting point for existing systems): Inspect Data; Present Data
• Data Manipulation: Manipulate Data; Update Data
• Data Assessment: Assess Data Values; Assess Metadata
• Data Refinement: Correct Data Value Defects; Re-store Data Values
• Metadata & Data Storage
Flows between stages: architecture refinements; data architecture; data architecture and data models; facts & meanings; shared data; updated data; data performance metadata; corrected data
52.
Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
53.
Goals and Principles
§ To measurably improve the quality of data in relation to defined business expectations
§ To define requirements and specifications for integrating data quality control into the system development life cycle
§ To provide defined processes for measuring, monitoring, and reporting conformance to acceptable levels of data quality
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
54.
Activities
• Develop and Promote Data Quality Awareness
• Set and Evaluate Data Quality Service Levels
• Test and Validate Data Quality Requirements
• Profile, Analyze, and Assess Data Quality
• Continuously Measure and Monitor Data Quality
• Monitor Operational DQM Procedures and Performance
• Define Data Quality Business Rules
• Define Data Quality Metrics
• Manage Data Quality Issues
• Clean and Correct Data Quality Defects
• Define Data Quality Requirements
• Design and Implement Operational DQM Procedures
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
55.
Primary Deliverables
• Improved Quality Data
• Data Management Operational Analysis
• Data profiles
• Data Quality Certification Reports
• Data Quality Service Level Agreements
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
56.
Roles and Responsibilities
Suppliers:
§ External Sources
§ Regulatory Bodies
§ Business Subject Matter Experts
§ Information Consumers
§ Data Producers
§ Data Architects
§ Data Modelers
§ Data Stewards
Participants:
§ Data Quality Analysts
§ Data Analysts
§ Database Administrators
§ Data Stewards
§ Other Data Professionals
§ DRM Director
§ Data Stewardship Council
Consumers:
§ Data Stewards
§ Data Professionals
§ Other IT Professionals
§ Knowledge Workers
§ Managers and Executives
§ Customers
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
57.
Polling Question #2
What is one guiding principle for data quality?
a. Business process owners will agree to and abide by data quality SLAs
b. Identify a blue record for all data elements
c. Upstream data consumers specify data quality expectations
59.
Technology
• Data Profiling Tools
• Statistical Analysis Tools
• Data Cleansing Tools
• Data Integration Tools
• Issue and Event Management Tools
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
60.
Overview: Data Quality Tools
4 categories of activities:
1) Analysis
2) Cleansing
3) Enhancement
4) Monitoring
Principal tools:
1) Data Profiling
2) Parsing and Standardization
3) Data Transformation
4) Identity Resolution and Matching
5) Enhancement
6) Reporting
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
61.
DQ Tool #1: Data Profiling
• Data profiling is the assessment of value distribution and clustering of values into domains
• Need to be able to distinguish between good and bad data before making any improvements
• Data profiling is a set of algorithms for 2 purposes:
– Statistical analysis and assessment of the data quality values within a data set
– Exploring relationships that exist between value collections within and across data sets
• At its most advanced, data profiling takes a series of prescribed rules from data quality engines. It then assesses the data, annotates and tracks violations to determine if they comprise new or inferred data quality rules
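A very small version of the statistical side of profiling is a value distribution plus a count of character patterns for one column. The sketch below uses only the standard library; the column contents and the pattern notation (9 for a digit, A for a letter) are assumptions made for illustration, not the behaviour of any specific profiling product.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, null count, distinct values, top values."""
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


def value_pattern(text):
    """Reduce a value to its shape: digits become 9, letters become A."""
    return "".join("9" if c.isdigit() else "A" if c.isalpha() else c for c in text)


# Hypothetical state column with inconsistent entries.
states = ["VA", "VA", "va", None, "Virginia", "VA", ""]
profile = profile_column(states)
patterns = Counter(value_pattern(s) for s in states if s)
```

Here the profile surfaces the nulls and the competing spellings, and the pattern counts show that one value does not have the two-letter shape the others share. Deciding which of these are defects is still a judgment for someone who knows the business rules.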
62.
DQ Tool #1: Data Profiling, cont’d
• Data profiling vs. data quality: business context and semantic/logical layers
– Data quality is concerned with proscriptive rules
– Data profiling looks for patterns when rules are adhered to and when rules are violated; able to provide input into the business context layer
• It is incumbent on data profiling services to notify all concerned parties of whatever is discovered
• Profiling can be used to…
– …notify the help desk that valid changes in the data are about to cause an avalanche of “skeptical user” calls
– …notify business analysts of precisely where they should be working today in terms of shifts in the data
63.
DQ Tool #2: Parsing & Standardization
• Data parsing tools enable the definition of patterns that feed into a rules engine used to distinguish between valid and invalid data values
• Actions are triggered upon matching a specific pattern
• When an invalid pattern is recognized, the application may attempt to transform the invalid value into one that meets expectations
• Data standardization is the process of conforming to a set of business rules and formats that are set up by data stewards and administrators
• Data standardization example:
– Bringing all the different formats of “street” into a single format, e.g. “STR”, “ST.”, “STRT”, “STREET”, etc.
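The “street” example above can be sketched as a lookup of known variants. The choice of “ST” as the target format and the decision to rewrite only the last token of the address line are assumptions for this illustration; a real rule set would be defined by the data stewards.

```python
# Variants listed on the slide, compared after removing a trailing period.
STREET_VARIANTS = {"STR", "ST", "STRT", "STREET"}
TARGET = "ST"  # hypothetical standard format


def standardize_street(line):
    """Upper-case an address line and normalize a trailing street-type token.

    Only the final token is rewritten, so that a leading "ST." meaning
    "Saint" (as in a street or city name) is left alone.
    """
    tokens = line.upper().split()
    if tokens and tokens[-1].rstrip(".") in STREET_VARIANTS:
        tokens[-1] = TARGET
    return " ".join(tokens)
```

For example, `standardize_street("12 Main Str.")` and `standardize_street("12 Main Street")` both return `"12 MAIN ST"`, while a line that ends in another street type passes through unchanged apart from the upper-casing.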
64.
DQ Tool #3: Data Transformation
• Upon identification of data errors, trigger data rules to transform the flawed data
• Perform standardization and guide rule-based transformations by mapping data values in their original formats and patterns into a target representation
• Parsed components of a pattern are subjected to rearrangement, corrections, or any changes as directed by the rules in the knowledge base
65.
DQ Tool #4: Identity Resolution & Matching
• Data matching enables analysts to identify relationships between records for de-duplication or group-based processing
• Matching is central to maintaining data consistency and integrity throughout the enterprise
• The matching process should be used in the initial migration of data into a single repository
2 basic approaches to matching:
• Deterministic
– Relies on defined patterns/rules for assigning weights and scores to determine similarity
– Predictable
– Dependent on the rule developers' anticipations
• Probabilistic
– Relies on statistical techniques for assessing the probability that any pair of records represents the same entity
– Not reliant on rules
– Probabilities can be refined based on experience -> matchers can improve precision as more data is analyzed
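The contrast between the two approaches can be sketched with two scoring functions. The slides do not name a specific probabilistic technique; the second function uses the Fellegi-Sunter style log-likelihood weight as one common choice, and all field names, weights, and probabilities are invented for illustration.

```python
import math


def deterministic_score(a, b, weights):
    """Deterministic: sum hand-assigned weights of the fields that agree exactly."""
    return sum(w for name, w in weights.items() if a.get(name) == b.get(name))


def probabilistic_weight(a, b, m, u):
    """Probabilistic: sum of log2 likelihood ratios per field.

    m[name] is the assumed probability that the field agrees when two records
    are the same entity; u[name] is the probability that it agrees by chance.
    """
    total = 0.0
    for name in m:
        if a.get(name) == b.get(name):
            total += math.log2(m[name] / u[name])
        else:
            total += math.log2((1 - m[name]) / (1 - u[name]))
    return total


# Hypothetical records and parameters.
rec1 = {"last_name": "SMITH", "zip": "23060", "birth_year": 1970}
rec2 = {"last_name": "SMITH", "zip": "23060", "birth_year": 1971}
rec3 = {"last_name": "JONES", "zip": "10115", "birth_year": 1985}

weights = {"last_name": 5, "zip": 3, "birth_year": 2}
m = {"last_name": 0.95, "zip": 0.90, "birth_year": 0.90}
u = {"last_name": 0.01, "zip": 0.05, "birth_year": 0.02}

det_12 = deterministic_score(rec1, rec2, weights)
det_13 = deterministic_score(rec1, rec3, weights)
prob_12 = probabilistic_weight(rec1, rec2, m, u)
prob_13 = probabilistic_weight(rec1, rec3, m, u)
```

The deterministic weights stay as written until someone edits the rules, whereas the m and u probabilities can be re-estimated from reviewed match results, which is how a probabilistic matcher can improve as more data is analyzed.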
66.
DQ Tool #5: Enhancement
Definition:
• A method for adding value to information by accumulating additional information about a base set of entities and then merging all the sets of information to provide a focused view. Improves master data.
Benefits:
• Enables use of third party data sources
• Allows you to take advantage of the information and research carried out by external data vendors to make data more meaningful and useful
Examples of data enhancements:
• Time/date stamps
• Auditing information
• Contextual information
• Geographic information
• Demographic information
• Psychographic information
67.
DQ Tool #6: Reporting
• Good reporting supports:
– Inspection and monitoring of conformance to data quality expectations
– Monitoring performance of data stewards conforming to data quality SLAs
– Workflow processing for data quality incidents
– Manual oversight of data cleansing and correction
• Data quality tools provide dynamic reporting and monitoring capabilities
• Enables analysts and data stewards to support and drive the methodology for ongoing DQM and improvement with a single, easy-to-use solution
• Associate report results with:
– Data quality measurement
– Metrics
– Activity
69.
Guiding Principles
1) Manage data as a core organizational asset
2) Identify a gold record for all data elements
3) All data elements will have a standardized data definition, data type, and acceptable value domain
4) Leverage data governance for the control and performance of DQM
5) Use industry and international data standards whenever possible
6) Downstream data consumers specify data quality expectations
7) Define business rules to assert conformance to data quality expectations
8) Validate data instances and data sets against defined business rules
9) Business process owners will agree to and abide by data quality SLAs
10) Apply data corrections at the original source if possible
11) If it is not possible to correct data at the source, forward data corrections to the owner of the original source. Influence on data brokers to conform to local requirements may be limited
12) Report measured levels of data quality to appropriate data stewards, business process owners, and SLA managers
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
70.
Interdependencies - Tools alone cannot do the job!
• People: Education and Training
• Process: Data Cleansing and Prevention
• Technology: Data Quality Tools
71.
Summary: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
73.
Recommended Reading