T5 Test Data Management
5/11/17 9:45

Data Quality at the Speed of Work

Presented by:
Shauna Ayers
Catherine Cruz Agosto
Availity

Brought to you by:
350 Corporate Way, Suite 400, Orange Park, FL 32073
888-268-8770 · 904-278-0524 - info@techwell.com - http://www.starwest.techwell.com/

Shauna Ayers

Shauna Ayers has been untangling the Gordian knots of IT systems for more than seventeen years, analyzing data systems and testing both software and data quality in the manufacturing, medical device, and healthcare industries. Shauna found her passion in developing creative solutions for the analysis and testing of sensitive and highly regulated data sets at industry leaders such as Blue Cross Blue Shield of Florida (now Florida Blue), Vistakon (a subsidiary of Johnson & Johnson), and Availity.

Catherine Cruz Agosto

Catherine Cruz Agosto found her software engineering experience at Baxter Healthcare and Boeing-subsidiary Insitu provided an excellent foundation for finding more effective and user-friendly approaches to complex technical problems. Catherine has developed more efficient and innovative data quality testing solutions at healthcare intermediary Availity, expanding their automated data quality testing processes to accommodate diverse and dissimilar data sources, thus facilitating analysis, testing, and controls for data integration, analytics, and healthcare data reporting.

Data Quality at the Speed of Work
By Shauna Ayers and Catherine Cruz Agosto

Overview
•  Definitions
•  Why is this important?
•  What strategies can we use?
•  What benefits do these activities bring us?
•  What tools do we use?
•  Case Studies
•  Communication
•  Conclusion
	
  
Definitions
●  Data quality (DQ) is data's fitness and
usability for its intended purpose.
●  Data quality assurance is the monitoring
and analysis of data sets and the
processes that create or manipulate data,
in order to ensure the data’s quality meets
the company's needs.
●  DQ Issue: Incorrect or unexpected
behavior from the data as a result of
an unknown data scenario, an upstream change,
a flaw in logic, missing requirements, etc.
○  Timing Issue: A type of issue/defect
in which the root cause stems from
the timing between two or more
components of the system that
depend on each other.
Why is this important?
•  Consumers expect data to be instantly available
•  Consumers expect near-zero downtime
•  Automation and algorithmic transactions cause a small
data issue to snowball quickly
•  If consumers don’t feel they can trust your data, they
won’t be your customers for long
	
  
What strategies can we use?
●  Types of Testing
○  Exploratory
○  Manual
○  Automated
●  Continuous Regression
○  Production Monitoring
vs Monitoring Lower
Environments
●  Continuous Data Profiling
What strategies can we use? (continued)
●  Types of Checks and how to use them to identify timing issues
○  Business Rule Validations: Type of test that verifies all of the
acceptance criteria by comparing the source data to the target
data.
■  This type of check catches any discrepancies or deviations
from the acceptance criteria.
○  Null Checks: Type of test that verifies key fields are not null.
■  Verify that fields expected to be populated are populated at the
initial write, rather than by a later update.
○  Duplicate Checks: Type of test that checks for any unexpected
duplication of records, typically by use of alternate key.
■  Can be used to spot duplications that are created over time.
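
A minimal sketch of the null and duplicate checks described above, assuming records are available as a list of dictionaries. The field names (`claim_id`, `member_id`, `status`) and the sample rows are hypothetical and do not come from the presentation.

```python
from collections import Counter

# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"claim_id": "C1", "member_id": "M1", "status": "PAID"},
    {"claim_id": "C2", "member_id": None, "status": "PAID"},
    {"claim_id": "C1", "member_id": "M1", "status": "PAID"},
]

def null_check(rows, key_fields):
    """Return (row index, field) pairs where a key field is missing or None."""
    return [
        (i, field)
        for i, row in enumerate(rows)
        for field in key_fields
        if row.get(field) is None
    ]

def duplicate_check(rows, alternate_key):
    """Return alternate-key values that occur more than once."""
    counts = Counter(tuple(row.get(f) for f in alternate_key) for row in rows)
    return [key for key, n in counts.items() if n > 1]

null_failures = null_check(rows, ["claim_id", "member_id"])
duplicate_keys = duplicate_check(rows, ["claim_id"])
print(null_failures)
print(duplicate_keys)
```

Running the null check right after the initial write, and the duplicate check on a schedule, is one way to surface the timing-related cases the slide mentions.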
What strategies can we use? (continued)
●  More types of Checks and how to use them
○  Environment Checks: Type of test that verifies if the process run is
within tolerance.
■  Can be used to identify if and when process is running behind,
which can explain any data issues with downstream processes.
○  Count Checks: Type of test that compares the count of records in
the source to the count of records in the target.
■  A timing issue is one potential cause of a count mismatch.
○  Compare Checks: Type of test that compares the alternate keys of
records in the source to the alternate keys of records in the target.
■  A mismatch in data could indicate a potential timing issue.
■  A compare check can supply the details behind a count check
discrepancy.
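
The count and compare checks above can be sketched as follows. This is an illustration under assumptions: source and target are in-memory lists of dictionaries, and `order_id` is a hypothetical alternate key.

```python
def count_check(source_rows, target_rows):
    """Compare record counts; return (matches, source_count, target_count)."""
    return len(source_rows) == len(target_rows), len(source_rows), len(target_rows)

def compare_check(source_rows, target_rows, alternate_key):
    """Return (keys missing from the target, keys unexpected in the target)."""
    def keys(rows):
        return {tuple(row[f] for f in alternate_key) for row in rows}
    source_keys, target_keys = keys(source_rows), keys(target_rows)
    return sorted(source_keys - target_keys), sorted(target_keys - source_keys)

# Hypothetical data: one source record has not yet arrived in the target.
source = [{"order_id": 1}, {"order_id": 2}, {"order_id": 3}]
target = [{"order_id": 1}, {"order_id": 3}]

counts_match, n_source, n_target = count_check(source, target)
missing_in_target, extra_in_target = compare_check(source, target, ["order_id"])
print(counts_match, n_source, n_target)
print(missing_in_target, extra_in_target)
```

The count check is cheap and only signals that something differs; the compare check names the specific keys, which is what makes it useful for explaining a count discrepancy.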
What strategies can we use? (continued)
●  Even more types of checks and how to use them
○  Domain Integrity Checks: Type of test that verifies the values
used in specified field exist in the corresponding code set.
■  Could indicate discrepancy between timing of value added to
code set and use of code value.
○  System Version Checks: Type of test that checks when there are
changes to the version the system is running on.
■  Changes and/or updates to system versions can cause
unexpected issues such as difference in process behavior,
difference in system clocks, etc.
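
A domain integrity check can be sketched as a membership test against the code set. The `status` field and the code values below are hypothetical.

```python
def domain_integrity_check(rows, field, code_set):
    """Map each value of `field` that is not in code_set to the row indexes using it."""
    violations = {}
    for i, row in enumerate(rows):
        value = row[field]
        if value not in code_set:
            violations.setdefault(value, []).append(i)
    return violations

# Hypothetical code set and records; "REVERSED" is used before it was added to the code set.
status_codes = {"PAID", "DENIED", "PENDING"}
claims = [
    {"status": "PAID"},
    {"status": "REVERSED"},
    {"status": "PENDING"},
    {"status": "REVERSED"},
]

violations = domain_integrity_check(claims, "status", status_codes)
print(violations)
```

A value that fails today and passes after the next code set refresh points to the timing discrepancy described above rather than to bad data.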
What benefits do these activities bring us?
•  Opportunity to fix issues before the customer sees or
reports them
•  Faster localization of root causes
•  Better visibility of chronic issues rooted in timing and
environment
•  Better visibility of changes in input profiles
•  Cleaner integration with existing operational support
	
  
What tools do we use?
●  Buying DQ testing software
o  Common tools: Informatica
Data Quality, Datamartist,
Microsoft Data Profiling Task
o  All tools have some sort of
limitations
o  Can get expensive
●  Creating custom test harnesses
o  Seems more time-consuming up front
o  More control and fewer limitations compared to purchased tools
●  A machine cannot replace a human
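
As a rough idea of what a custom test harness might look like at its smallest, the sketch below registers named checks and runs them all. The `Harness` and `CheckResult` names, and the convention that a check returns a list of failures, are assumptions made for illustration and are not the presenters' design.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

class Harness:
    """Registers named checks and runs them all, collecting results."""

    def __init__(self):
        self._checks = []

    def register(self, name: str, check: Callable[[], List]) -> None:
        # A check is any callable returning a list of failures (empty means pass).
        self._checks.append((name, check))

    def run(self) -> List[CheckResult]:
        results = []
        for name, check in self._checks:
            try:
                failures = check()
                results.append(CheckResult(name, not failures, f"{len(failures)} failure(s)"))
            except Exception as exc:  # a crashing check is reported, not fatal
                results.append(CheckResult(name, False, f"error: {exc}"))
        return results

# Hypothetical data and checks.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
harness = Harness()
harness.register("id not null", lambda: [r for r in rows if r["id"] is None])
harness.register("name not null", lambda: [r for r in rows if r["name"] is None])
results = harness.run()
for result in results:
    print(result)
```

The results still need a person to interpret them, which is consistent with the slide's last point.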
Case Studies: Data Integration Timing
●  Definition: The timing of ETL processes in relation to each other and the
supporting systems they depend on. Risks affect execution order,
dependencies, and load rule boundaries across processes.
●  Useful Checks:
o  Count/ Compare checks
o  Tolerance/Threshold checks
(includes cycle time checks)
o  Environment checks
o  Business Rule Validations
●  Case Studies
o  Hybrid systems – the
velocity/dependency trap
o  Clock syncs sink ships
o  Who watches the watchmen?
o  Surge Protection
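
A tolerance check on cycle time and a simple execution-order check might look like the sketch below. The job names, timestamps, and the 30-minute tolerance are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta

def cycle_time_check(started, finished, max_duration):
    """Return True when the run's elapsed time is within the allowed maximum."""
    return (finished - started) <= max_duration

def dependency_order_check(upstream_finished, downstream_started):
    """Return True when the downstream job started after its upstream job finished."""
    return downstream_started >= upstream_finished

# Hypothetical run metadata: the extract ran long and the load started before it ended.
extract_start = datetime(2017, 5, 11, 1, 0)
extract_end = datetime(2017, 5, 11, 1, 50)
load_start = datetime(2017, 5, 11, 1, 45)

within_tolerance = cycle_time_check(extract_start, extract_end, timedelta(minutes=30))
ordered = dependency_order_check(extract_end, load_start)
print(within_tolerance, ordered)
```

Both checks fail here: the extract exceeded its tolerance, and the load began five minutes before the extract finished, which is the kind of execution-order risk named in the definition.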
	
  
Case Studies: Operational Dependencies
●  Definition: Two or more
processes of a system or
components of a process that
rely on each other.
●  Useful Checks:
○  Codesets
○  BRV
○  Null Checks
○  System Version Checks
○  Count/ Compare checks
○  Environment Checks
●  Case Studies
○  Rocket Failure
○  Data Warehousing
○  UI to Backend
Case Studies: Reference Data Management
●  Definition: Reference values are used to drive categorization, routing and
filtering, and may provide part of the focus for dimensional data. They are
normally controlled data sets. Some
●  Useful Checks:
o  Domain checks
o  Tolerance/Threshold checks
o  Consistency checks
●  Case Studies
o  Point-of-Use Domain Checks
o  Rate of Dimensional Growth (runaway conditions in the content)
o  Process violations
	
  
	
  
Case Studies: Data Integrity
●  Definition: The correctness of
the data in or output from the
system
●  Useful Checks:
o  BRV
o  Null Checks
o  Domain Checks
o  Duplicate Checks
o  Count/ Compare Checks
o  Environment Checks
●  Case Studies
o  Transaction Processing
o  Reporting
Communication: Proactive Notification Alerts
•  Automated notification mechanisms can be integrated easily with
existing operational alert mechanisms (e.g., pager duty)
•  Notifications and alerts can be tailored to support and reinforce
data stewardship
	
  
Communication: Business Intelligence Dashboards
●  External Dashboards
○  Potential Users: Customers, Production Support, Customer Service,
Business
●  Internal Dashboards
○  Display more granular data regarding processes and/ or tests
○  Drill-through
Communication: Trends Analysis
●  Performance and tolerance
checks over time reveal cyclic
impacts from maintenance
activities or correlation of
surges in quality issues to
specific business activities.
These drive preventive
measures, capacity planning
and performance tuning.
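
One simple way to look for cyclic impacts is to group check metrics by a calendar unit. The sketch below averages run durations by weekday; the history values and the idea of a weekly cycle are hypothetical, used only to show the grouping.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical history of (run date, duration in minutes) recorded by a tolerance check.
history = [
    (date(2017, 5, 1), 20), (date(2017, 5, 2), 21), (date(2017, 5, 7), 45),
    (date(2017, 5, 8), 19), (date(2017, 5, 9), 22), (date(2017, 5, 14), 47),
]

def mean_by_weekday(history):
    """Average duration per weekday number (0 = Monday, 6 = Sunday)."""
    buckets = defaultdict(list)
    for run_date, minutes in history:
        buckets[run_date.weekday()].append(minutes)
    return {day: mean(values) for day, values in buckets.items()}

averages = mean_by_weekday(history)
print(averages)
```

In this made-up history the Sunday runs take roughly twice as long as the weekday runs, the sort of pattern that would prompt a look at what else is scheduled then.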
	
  
Conclusion
●  Proactive data quality saves an
organization time and money.
●  Data is the fastest changing
element of an organization; there
is no cookie-cutter way of
monitoring or testing, but there
are known strategies that can
help navigate the course.
●  Metadata about data quality
testing can be used to
communicate issues faster, more
easily target the correct parties,
and provide insights as to the
health of the systems that drive
the organization.
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Data quality overview
Data quality overviewData quality overview
Data quality overview
Alex Meadows
 
System analysis and design
System analysis and design System analysis and design
System analysis and design
Razan Al Ryalat
 

Was ist angesagt? (20)

System analysis
System analysisSystem analysis
System analysis
 
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
 
Data quality overview
Data quality overviewData quality overview
Data quality overview
 
2013 OHSUG - Oracle Clinical and RDC Training for Data Management and Clinica...
2013 OHSUG - Oracle Clinical and RDC Training for Data Management and Clinica...2013 OHSUG - Oracle Clinical and RDC Training for Data Management and Clinica...
2013 OHSUG - Oracle Clinical and RDC Training for Data Management and Clinica...
 
Understand your data dependencies – Key enabler to efficient modernisation
 Understand your data dependencies – Key enabler to efficient modernisation  Understand your data dependencies – Key enabler to efficient modernisation
Understand your data dependencies – Key enabler to efficient modernisation
 
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
 
Crisp dm
Crisp dmCrisp dm
Crisp dm
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architecture
 
Hi600 u03_inst_slides
Hi600 u03_inst_slidesHi600 u03_inst_slides
Hi600 u03_inst_slides
 
Machine Learning and Multi Drug Resistant(MDR) Infections case study
Machine Learning and Multi Drug Resistant(MDR) Infections case studyMachine Learning and Multi Drug Resistant(MDR) Infections case study
Machine Learning and Multi Drug Resistant(MDR) Infections case study
 
Text Analytics for Legal work
Text Analytics for Legal workText Analytics for Legal work
Text Analytics for Legal work
 
An overview of big data analytics
An overview of big data analytics An overview of big data analytics
An overview of big data analytics
 
Hi600 u04_inst_slides
Hi600 u04_inst_slidesHi600 u04_inst_slides
Hi600 u04_inst_slides
 
Machine Learning in ICU mortality prediction
Machine Learning in ICU mortality predictionMachine Learning in ICU mortality prediction
Machine Learning in ICU mortality prediction
 
CCXG Special Event, November 2020, Michael Vartanyan
CCXG Special Event, November 2020, Michael VartanyanCCXG Special Event, November 2020, Michael Vartanyan
CCXG Special Event, November 2020, Michael Vartanyan
 
Computer Assisted Audit Techniques (CAATS) - IS AUDIT
Computer Assisted Audit Techniques (CAATS) - IS AUDITComputer Assisted Audit Techniques (CAATS) - IS AUDIT
Computer Assisted Audit Techniques (CAATS) - IS AUDIT
 
Data Visualization: Sales forecasting
Data Visualization: Sales forecastingData Visualization: Sales forecasting
Data Visualization: Sales forecasting
 
Machine Learning in Healthcare: A Case Study
Machine Learning in Healthcare: A Case StudyMachine Learning in Healthcare: A Case Study
Machine Learning in Healthcare: A Case Study
 
System analysis and design
System analysis and design System analysis and design
System analysis and design
 
Machine Learning For Stock Broking
Machine Learning For Stock BrokingMachine Learning For Stock Broking
Machine Learning For Stock Broking
 

Ähnlich wie Data Quality at the Speed of Work

593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward
Vinny (Gurvinder) Ahuja
 
How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...
How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...
How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...
Perficient
 
Test data documentation ss
Test data documentation ssTest data documentation ss
Test data documentation ss
AshwiniPoloju
 

Ähnlich wie Data Quality at the Speed of Work (20)

Document Control in FDA Regulated Environments - When and how to automate
Document Control in FDA Regulated Environments - When and how to automateDocument Control in FDA Regulated Environments - When and how to automate
Document Control in FDA Regulated Environments - When and how to automate
 
Techniques for effective test data management in test automation.pptx
Techniques for effective test data management in test automation.pptxTechniques for effective test data management in test automation.pptx
Techniques for effective test data management in test automation.pptx
 
Top 30 Data Analyst Interview Questions.pdf
Top 30 Data Analyst Interview Questions.pdfTop 30 Data Analyst Interview Questions.pdf
Top 30 Data Analyst Interview Questions.pdf
 
Agile Testing Process Analytics: From Data to Insightful Information
Agile Testing Process Analytics: From Data to Insightful InformationAgile Testing Process Analytics: From Data to Insightful Information
Agile Testing Process Analytics: From Data to Insightful Information
 
Measuring Data Quality with DataOps
Measuring Data Quality with DataOpsMeasuring Data Quality with DataOps
Measuring Data Quality with DataOps
 
Data Quality
Data QualityData Quality
Data Quality
 
System testing
System testingSystem testing
System testing
 
Preparing a data migration plan: A practical guide
Preparing a data migration plan: A practical guidePreparing a data migration plan: A practical guide
Preparing a data migration plan: A practical guide
 
593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward
 
Mind Map Test Data Management Overview
Mind Map Test Data Management OverviewMind Map Test Data Management Overview
Mind Map Test Data Management Overview
 
Xybion Webinar - Rumors, Risks and Realities of spreadsheet validation
Xybion Webinar - Rumors, Risks and Realities of spreadsheet validationXybion Webinar - Rumors, Risks and Realities of spreadsheet validation
Xybion Webinar - Rumors, Risks and Realities of spreadsheet validation
 
How to apply machine learning into your CI/CD pipeline
How to apply machine learning into your CI/CD pipelineHow to apply machine learning into your CI/CD pipeline
How to apply machine learning into your CI/CD pipeline
 
Survival Guide: Taming the Data Quality Beast
Survival Guide: Taming the Data Quality BeastSurvival Guide: Taming the Data Quality Beast
Survival Guide: Taming the Data Quality Beast
 
Design testabilty
Design testabiltyDesign testabilty
Design testabilty
 
tool support for testing
tool support for testingtool support for testing
tool support for testing
 
How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...
How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...
How to Migrate Drug Safety and Pharmacovigilance Data Cost-Effectively and wi...
 
Lecture 2 - Security Requirments.ppt
Lecture 2 - Security Requirments.pptLecture 2 - Security Requirments.ppt
Lecture 2 - Security Requirments.ppt
 
Data Quality in Test Automation Navigating the Path to Reliable Testing
Data Quality in Test Automation Navigating the Path to Reliable TestingData Quality in Test Automation Navigating the Path to Reliable Testing
Data Quality in Test Automation Navigating the Path to Reliable Testing
 
Software Development Life Cycle (SDLC).pptx
Software Development Life Cycle (SDLC).pptxSoftware Development Life Cycle (SDLC).pptx
Software Development Life Cycle (SDLC).pptx
 
Test data documentation ss
Test data documentation ssTest data documentation ss
Test data documentation ss
 

Mehr von TechWell

Mehr von TechWell (20)

Failing and Recovering
Failing and RecoveringFailing and Recovering
Failing and Recovering
 
Instill a DevOps Testing Culture in Your Team and Organization
Instill a DevOps Testing Culture in Your Team and Organization Instill a DevOps Testing Culture in Your Team and Organization
Instill a DevOps Testing Culture in Your Team and Organization
 
Test Design for Fully Automated Build Architecture
Test Design for Fully Automated Build ArchitectureTest Design for Fully Automated Build Architecture
Test Design for Fully Automated Build Architecture
 
System-Level Test Automation: Ensuring a Good Start
System-Level Test Automation: Ensuring a Good StartSystem-Level Test Automation: Ensuring a Good Start
System-Level Test Automation: Ensuring a Good Start
 
Build Your Mobile App Quality and Test Strategy
Build Your Mobile App Quality and Test StrategyBuild Your Mobile App Quality and Test Strategy
Build Your Mobile App Quality and Test Strategy
 
Testing Transformation: The Art and Science for Success
Testing Transformation: The Art and Science for SuccessTesting Transformation: The Art and Science for Success
Testing Transformation: The Art and Science for Success
 
Implement BDD with Cucumber and SpecFlow
Implement BDD with Cucumber and SpecFlowImplement BDD with Cucumber and SpecFlow
Implement BDD with Cucumber and SpecFlow
 
Develop WebDriver Automated Tests—and Keep Your Sanity
Develop WebDriver Automated Tests—and Keep Your SanityDevelop WebDriver Automated Tests—and Keep Your Sanity
Develop WebDriver Automated Tests—and Keep Your Sanity
 
Ma 15
Ma 15Ma 15
Ma 15
 
Eliminate Cloud Waste with a Holistic DevOps Strategy
Eliminate Cloud Waste with a Holistic DevOps StrategyEliminate Cloud Waste with a Holistic DevOps Strategy
Eliminate Cloud Waste with a Holistic DevOps Strategy
 
Transform Test Organizations for the New World of DevOps
Transform Test Organizations for the New World of DevOpsTransform Test Organizations for the New World of DevOps
Transform Test Organizations for the New World of DevOps
 
The Fourth Constraint in Project Delivery—Leadership
The Fourth Constraint in Project Delivery—LeadershipThe Fourth Constraint in Project Delivery—Leadership
The Fourth Constraint in Project Delivery—Leadership
 
Resolve the Contradiction of Specialists within Agile Teams
Resolve the Contradiction of Specialists within Agile TeamsResolve the Contradiction of Specialists within Agile Teams
Resolve the Contradiction of Specialists within Agile Teams
 
Pin the Tail on the Metric: A Field-Tested Agile Game
Pin the Tail on the Metric: A Field-Tested Agile GamePin the Tail on the Metric: A Field-Tested Agile Game
Pin the Tail on the Metric: A Field-Tested Agile Game
 
Agile Performance Holarchy (APH)—A Model for Scaling Agile Teams
Agile Performance Holarchy (APH)—A Model for Scaling Agile TeamsAgile Performance Holarchy (APH)—A Model for Scaling Agile Teams
Agile Performance Holarchy (APH)—A Model for Scaling Agile Teams
 
A Business-First Approach to DevOps Implementation
A Business-First Approach to DevOps ImplementationA Business-First Approach to DevOps Implementation
A Business-First Approach to DevOps Implementation
 
Databases in a Continuous Integration/Delivery Process
Databases in a Continuous Integration/Delivery ProcessDatabases in a Continuous Integration/Delivery Process
Databases in a Continuous Integration/Delivery Process
 
Mobile Testing: What—and What Not—to Automate
Mobile Testing: What—and What Not—to AutomateMobile Testing: What—and What Not—to Automate
Mobile Testing: What—and What Not—to Automate
 
Cultural Intelligence: A Key Skill for Success
Cultural Intelligence: A Key Skill for SuccessCultural Intelligence: A Key Skill for Success
Cultural Intelligence: A Key Skill for Success
 
Turn the Lights On: A Power Utility Company's Agile Transformation
Turn the Lights On: A Power Utility Company's Agile TransformationTurn the Lights On: A Power Utility Company's Agile Transformation
Turn the Lights On: A Power Utility Company's Agile Transformation
 

Kürzlich hochgeladen

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 

Kürzlich hochgeladen (20)

Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 

Data Quality at the Speed of Work

  • 1.                 T5   Test  Data  Management   5/11/17  9:45             Data  Quality  at  the  Speed  of  Work     Presented  by:         Shauna  Ayers   Catherine  Cruz  Agosto     Availity       Brought  to  you  by:                 350  Corporate  Way,  Suite  400,  Orange  Park,  FL  32073     888-­‐-­‐-­‐268-­‐-­‐-­‐8770  ·∙·∙  904-­‐-­‐-­‐278-­‐-­‐-­‐0524  -­‐  info@techwell.com  -­‐  http://www.starwest.techwell.com/      
  • 2.         Shauna  Ayers     Shauna  Ayers  has  been  untangling  the  Gordian  knots  of  IT  systems  for  more  than   seventeen  years,  analyzing  data  systems  and  testing  both  software  and  data  quality   in  the  manufacturing,  medical  device,  and  healthcare  industries.  Shauna  found  her   passion  in  developing  creative  solutions  for  the  analysis  and  testing  of  sensitive  and   highly  regulated  data  sets  at  industry  leaders  such  as  Blue  Cross  Blue  Shield  of   Florida  (now  Florida  Blue),  Vistakon  (a  subsidiary  of  Johnson  &  Johnson),  and   Availity.     Catherine  Cruz  Agosto     Catherine  Cruz  Agosto  found  her  software  engineering  experience  at  Baxter   Healthcare  and  Boeing-­‐subsidiary  Insitu  provided  an  excellent  foundation  for   finding  more  effective  and  user-­‐friendly  approaches  to  complex  technical  problems.   Catherine  has  developed  more  efficient  and  innovative  data  quality  testing  solutions   at  healthcare  intermediary  Availity,  expanding  their  automated  data  quality  testing   processes  to  accommodate  diverse  and  dissimilar  data  sources,  thus  facilitating   analysis,  testing,  and  controls  for  data  integration,  analytics,  and  healthcare  data   reporting.  
  • 3. Data  Quality  at  the   Speed  of  Work     By  Shauna  Ayers  and  Catherine  Cruz  Agosto  
  • 4. Overview   •  Definitions •  Why is this important? •  What strategies can we use? •  What benefits do these activities bring us? •  What tools do we use? •  Case Studies •  Communication •  Conclusion  
  • 5. Defini+ons   ●  Data quality (DQ) is data's fitness and usability for its intended purpose. ●  Data quality assurance is the monitoring and analysis of data sets and the processes that create or manipulate data, in order to ensure the data’s quality meets the company's needs. ●  DQ Issue: Incorrect or unexpected behavior from the data as a result of unknown data scenario, upstream change, flaw in logic, missing requirements, etc. ○  Timing Issue: A type of issue/defect in which the root cause stems from the timing between two or more components of the system that depend on each other.
  • 6. Why  is  this  important?   •  Consumers expect data to be instantly available •  Consumers expect near-zero downtime •  Automation and algorithmic transactions cause a small data issue to snowball quickly •  If consumers don’t feel they can trust your data, they won’t be your customers for long  
  • 7. What  strategies  can  we  use?   ●  Types of Testing ○  Exploratory ○  Manual ○  Automated ●  Continuous Regression ○  Production Monitoring vs Monitoring Lower Environments ●  Continuous Data Profiling
  • 8. What  strategies  can  we  use?     (con6nued)   ●  Types of Checks and how to use them to identify timing issues ○  Business Rule Validations: Type of test that verifies all of the acceptance criteria by comparing the source data to the target data. ■  This type of check catches any discrepancies or deviations from the acceptance criteria. ○  Null Checks: Type of test that verifies key fields are not null ■  Verify that fields that are expected to be populated are done so from the initial write, instead of as an update later on. ○  Duplicate Checks: Type of test that checks for any unexpected duplication of records, typically by use of alternate key. ■  Can be used to spot duplications that are created over time.
  • 9. What strategies can we use? (continued)   ●  More types of Checks and how to use them ○  Environment Checks: Type of test that verifies that the process run is within tolerance. ■  Can be used to identify if and when a process is running behind, which can explain data issues in downstream processes. ○  Count Checks: Type of test that compares the count of records in the source to the count of records in the target. ■  A timing issue is a potential cause of a count mismatch. ○  Compare Checks: Type of test that compares the alternate keys of records in the source to the alternate keys of records in the target. ■  A mismatch in data could indicate a potential timing issue. ■  A compare check can provide the details behind a count check discrepancy.
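As a rough sketch of how a count check and a compare check relate, the Python below first compares record counts and then uses the alternate key to show which records account for the difference. The function names, the `order_id` key and the sample rows are assumptions made for illustration.

```python
def count_check(source_rows, target_rows):
    """Return source count minus target count; zero means the counts match."""
    return len(source_rows) - len(target_rows)

def compare_check(source_rows, target_rows, alternate_key):
    """Return alternate keys missing from the target and keys unexpected in it."""
    def keys(rows):
        return {tuple(row[f] for f in alternate_key) for row in rows}

    source_keys, target_keys = keys(source_rows), keys(target_rows)
    return {
        "missing_in_target": sorted(source_keys - target_keys),
        "unexpected_in_target": sorted(target_keys - source_keys),
    }

# Hypothetical source and target extracts.
source = [{"order_id": 1}, {"order_id": 2}, {"order_id": 3}]
target = [{"order_id": 1}, {"order_id": 3}]

count_diff = count_check(source, target)
details = compare_check(source, target, ["order_id"])
```

Because the compare check works on sets of keys, it ignores repeated keys. A duplicate check would be needed alongside it to catch those.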
  • 10. What strategies can we use? (continued)   ●  Even more types of checks and how to use them ○  Domain Integrity Checks: Type of test that verifies that the values used in a specified field exist in the corresponding code set. ■  A failure could indicate a discrepancy between the time a value was added to the code set and the time the code value was used. ○  System Version Checks: Type of test that detects changes to the version the system is running on. ■  Changes and/or updates to system versions can cause unexpected issues such as differences in process behavior, differences in system clocks, etc.
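A domain integrity check can be approximated as a set-membership test against the code set. The sketch below is a minimal example; the `status` field, the code values and the function name are hypothetical.

```python
def domain_integrity_check(rows, field, code_set):
    """Return the sorted distinct values of `field` that are not in `code_set`."""
    return sorted({row[field] for row in rows if row[field] not in code_set})

# Hypothetical code set and data rows.
valid_status_codes = {"A", "I", "P"}
rows = [{"status": "A"}, {"status": "X"}, {"status": "P"}, {"status": "X"}]

unknown_codes = domain_integrity_check(rows, "status", valid_status_codes)
```

If a value reported here appears in the code set shortly afterwards, that suggests the timing discrepancy the slide describes: the code was used before it was published.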
  • 11. What benefits do these activities bring us?   •  Opportunity to fix issues before the customer sees or reports them •  Faster localization of root causes •  Better visibility of chronic issues rooted in timing and environment •  Better visibility of changes in input profiles •  Cleaner integration with existing operational support
  • 12. What tools do we use?   ●  Buying DQ testing software o  Common tools: Informatica Data Quality, Datamartist, Microsoft Data Profiling Task o  All tools have limitations of some kind o  Can get expensive ●  Creating custom test harnesses o  Seems more time-consuming up front o  More control and fewer limitations than purchased tools ●  A machine cannot replace a human
  • 13. Case Studies: Data Integration Timing   ●  Definition: The timing of ETL processes in relation to each other and the supporting systems they depend on. Risks affect execution order, dependencies, and load rule boundaries across processes. ●  Useful Checks: o  Count/Compare checks o  Tolerance/Threshold checks (includes cycle time checks) o  Environment checks o  Business Rule Validations ●  Case Studies o  Hybrid systems: the velocity/dependency trap o  Clock syncs sink ships o  Who watches the watchmen? o  Surge Protection
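To illustrate the tolerance and cycle time checks listed above, the sketch below tests whether one ETL run finished within an allowed duration and whether a dependent load started only after its upstream run completed. The timestamps, the 30-minute tolerance and the function names are assumptions chosen for the example, not details from the case studies.

```python
from datetime import datetime, timedelta

def cycle_time_check(started_at, finished_at, tolerance):
    """Return (within_tolerance, elapsed) for one process run."""
    elapsed = finished_at - started_at
    return elapsed <= tolerance, elapsed

def dependency_order_check(upstream_finished_at, downstream_started_at):
    """True when the downstream load began only after the upstream run finished."""
    return downstream_started_at >= upstream_finished_at

# Hypothetical run log: the extract overran and the load started too early.
extract_start = datetime(2017, 5, 11, 1, 0)
extract_end = datetime(2017, 5, 11, 1, 50)
load_start = datetime(2017, 5, 11, 1, 45)

ok, elapsed = cycle_time_check(extract_start, extract_end, timedelta(minutes=30))
in_order = dependency_order_check(extract_end, load_start)
```

Both checks assume the timestamps come from clocks that agree with each other. When they are recorded on different hosts, clock drift can produce false results in either direction.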
  • 14. Case Studies: Operational Dependencies   ●  Definition: Two or more processes of a system or components of a process that rely on each other. ●  Useful Checks: ○  Codesets ○  Business Rule Validations (BRV) ○  Null Checks ○  System Version Checks ○  Count/Compare Checks ○  Environment Checks ●  Case Studies ○  Rocket Failure ○  Data Warehousing ○  UI to Backend
  • 15. Case Studies: Reference Data Management   ●  Definition: Reference values are used to drive categorization, routing and filtering, and may provide part of the focus for dimensional data. They are normally controlled data sets. Some ●  Useful Checks: o  Domain checks o  Tolerance/Threshold checks o  Consistency checks ●  Case Studies o  Point-of-Use Domain Checks o  Rate of Dimensional Growth (runaway conditions in the content) o  Process violations
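One possible reading of the "Rate of Dimensional Growth" case is a threshold check on how fast a reference or dimension table grows between two profiling runs. The sketch below assumes that reading. The row counts and the 10% limit are invented for illustration.

```python
def growth_rate_check(previous_count, current_count, max_growth_ratio):
    """Return (within_threshold, ratio), where ratio is the fractional growth."""
    if previous_count <= 0:
        raise ValueError("previous_count must be positive")
    ratio = (current_count - previous_count) / previous_count
    return ratio <= max_growth_ratio, ratio

# Hypothetical counts from two consecutive profiling runs of a dimension table.
within, ratio = growth_rate_check(
    previous_count=200, current_count=260, max_growth_ratio=0.10
)
```

A controlled reference set is normally expected to grow slowly, so a sudden jump like this one (30% against a 10% limit) would be worth investigating as a possible runaway condition.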
  • 16. Case Studies: Data Integrity   ●  Definition: The correctness of the data in, or output by, the system ●  Useful Checks: o  Business Rule Validations (BRV) o  Null Checks o  Domain Checks o  Duplicate Checks o  Count/Compare Checks o  Environment Checks ●  Case Studies o  Transaction Processing o  Reporting
  • 17. Communication: Proactive Notification Alerts   •  Automated notification mechanisms can be integrated easily with existing operational alert mechanisms (e.g., PagerDuty) •  Notifications and alerts can be tailored to support and reinforce data stewardship
  • 18. Communication: Business Intelligence Dashboards   ●  External Dashboards ○  Potential Users: Customers, Production Support, Customer Service, Business ●  Internal Dashboards ○  Display more granular data regarding processes and/or tests ○  Drill-through
  • 19. Communication: Trends Analysis   ●  Performance and tolerance checks over time reveal cyclic impacts from maintenance activities or correlation of surges in quality issues to specific business activities. These drive preventive measures, capacity planning and performance tuning.
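A very simple form of the trend analysis described here is to aggregate check failures by day of the week and look for a recurring pattern. The sketch below does that. The result history and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def failures_by_weekday(results):
    """Sum failure counts per weekday from (run_date, failure_count) pairs."""
    totals = defaultdict(int)
    for run_date, failures in results:
        totals[WEEKDAYS[run_date.weekday()]] += failures
    return dict(totals)

# Hypothetical history of daily tolerance-check failures.
history = [
    (date(2017, 5, 7), 12),   # Sunday
    (date(2017, 5, 8), 1),    # Monday
    (date(2017, 5, 14), 9),   # Sunday
    (date(2017, 5, 15), 0),   # Monday
]

weekday_totals = failures_by_weekday(history)
```

In this invented history the failures cluster on Sundays, the kind of cyclic pattern that might line up with a weekly maintenance window.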
  • 20. Conclusion   ●  Proactive data quality saves an organization time and money. ●  Data is the fastest-changing element of an organization; there is no cookie-cutter way of monitoring or testing it, but there are known strategies that can help navigate the course. ●  Metadata about data quality testing can be used to communicate issues faster, more easily target the correct parties, and provide insights into the health of the systems that drive the organization.