W8
Concurrent Class
10/2/2013 1:45:00 PM

"Data Warehouse Testing: It’s
All about the Planning"
Presented by:
Geoff Horne
NZTester Magazine

Brought to you by:

340 Corporate Way, Suite 300, Orange Park, FL 32073
888-268-8770 ∙ 904-278-0524 ∙ sqeinfo@sqe.com ∙ www.sqe.com
Geoff Horne
NZTester Magazine
Geoff Horne has an extensive background in test program/project directorship and
management, architecture, and general consulting. In New Zealand Geoff established and ran
ISQA as a testing consultancy which enjoys a local and international clientele in Australia, the
US, and the United Kingdom. He has held senior test management roles across a number of
diverse industry sectors, and is editor and publisher of the recently launched NZTester
magazine. Geoff has authored a variety of white papers on software testing and is a regular
speaker at the STAR conferences.
9/20/2013

Data Warehouse Test Effectiveness
It’s All about the Planning!
Assuring Data Warehouse
Content, Structure and Quality

© Wayne Yaddow, 2013, wyaddow@gmail.com

Agenda
Challenges of DWH testing
Planning for DWH tests
Tester skills for DWH testing
Basic ETL verifications
Defects you can expect to find
Testing tools identified

The Data Testing Process

Plan QA for typical DWH phases

Data Model Example

Source to Target Mapping

Plan QA for DWH Lifecycle
Primary goals for verification
– Data completeness
– Data transformations
– Data quality
– Performance and scalability
– Integration testing
– User-acceptance testing
– Regression testing

Planning the DWH QA Strategy
Carefully review:
– Requirements documentation
– Data models for source and target schemas
– Source to target mappings
– ETL / stored proc design & logic
– CA deployment tasks / steps
– Required QA tools

Challenges for DWH Testers (1)
1. Often inadequate ETL design documents
2. Source table field values unexpectedly null
3. Excessive ETL errors discovered after entry to QA
4. Source data does not meet table mapping specs
(ex., dirty data)
5. Source to target mappings:
   1. Often not reviewed by all stakeholders
   2. Not consistently maintained through dev lifecycle
   3. Therefore, in error

Challenges for DWH Testers (2)
6. Data models not maintained
7. Target data does not meet mapping specifications
8. Duplicate field values when defined to be
DISTINCT
9. ETL SQL / transformation errors that lead to missing rows and
invalid field values
10. Constraint violations in source data
11. Table keys are incorrect for important RDB
linkages

Challenges for DWH Testers (3)
12. Huge source data volumes and a wide variety of data types.
13. Source data quality that must be profiled before
loading to DWH
14. Redundancy, duplicate source data.
15. Many source data records to be rejected
16. ETL logs w/ messages to be acted upon.
17. Source field values may be missing where they
should always be present.

Challenges for DWH Testers (4)
19. Source data history & business rules may not
be available.
20. SMEs and business rules may not be available
21. Data ETLs must often pass through
multiple phases
22. Transaction-level traceability will be difficult to
attain in a data warehouse.
23. The data warehouse will be a strategic
enterprise resource and heavily relied upon

Plan for QA Tools

Identify QA skills (1)
• Understanding fundamental DWH and DB concepts
• High skill w/SQL and stored procedures
• Understanding of data used by the business
• Developing strategies, test plans and test cases
specific to DWH and the business
• Creating effective ETL test cases / scenarios based on
loading technology and business requirements
• Understanding of data models, data mapping
documents, ETL design and ETL coding; ability to
provide feedback to designers and developers

Identify QA skills (2)
• Experience with Oracle, SQL Server, Sybase, DB2
technology
• Informatica session troubleshooting
• Deploying DB code to databases
• Unix scripting, Autosys, Anthill, etc.
• SQL editors
• Data profiling
• Use of Excel & MS Access for data analysis

Basic ETL Verifications (1)
• Verify data mappings, source to target
• Verify that all table fields were loaded from source
to staging
• Verify that keys were properly generated using
sequence generator
• Verify that not-null fields were populated
• Verify no data truncation in each field
• Verify data types and formats are as specified in
design phase
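
A few of the verifications above (not-null fields populated, no data truncation) can be expressed as simple queries. The following is a minimal sketch, assuming a hypothetical `src_customer` / `stg_customer` table pair in an in-memory SQLite database; the table names, columns, and seeded defects are illustrative only and are not taken from the presentation.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory source/staging pair (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_customer (cust_id INTEGER, name TEXT, email TEXT);
        CREATE TABLE stg_customer (cust_id INTEGER, name TEXT, email TEXT);
        INSERT INTO src_customer VALUES
            (1, 'Alice Anderson', 'alice@example.com'),
            (2, 'Bob Brown',      'bob@example.com'),
            (3, 'Carol Clark',    'carol@example.com');
        -- Staging load with two seeded defects: a null email, a truncated name.
        INSERT INTO stg_customer VALUES
            (1, 'Alice Anderson', 'alice@example.com'),
            (2, 'Bob Brown',      NULL),
            (3, 'Carol Cl',       'carol@example.com');
    """)
    return conn


def null_violations(conn, table, column):
    """Count rows where a column specified as not-null is null.

    Identifiers are interpolated directly, so pass only trusted names.
    """
    sql = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(sql).fetchone()[0]


def truncated_rows(conn, src, tgt, key, column):
    """Return keys whose target value is shorter than the source value."""
    sql = f"""
        SELECT s.{key}
        FROM {src} AS s
        JOIN {tgt} AS t ON s.{key} = t.{key}
        WHERE LENGTH(t.{column}) < LENGTH(s.{column})
        ORDER BY s.{key}
    """
    return [row[0] for row in conn.execute(sql)]
```

Calling `null_violations(conn, "stg_customer", "email")` on the demo database reports the one seeded null, and `truncated_rows(...)` on the `name` column returns the key of the shortened row. On a real warehouse the same queries would run through whatever SQL editor or driver the team already uses.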

Basic ETL Verifications (2)
• Verify no duplicate records in target tables.
• Verify transformations based on data low-level
designs (LLDs)
• Verify that numeric fields are populated with
correct precision
• Verify that every ETL session completed with only
planned exceptions
• Verify all cleansing, transformation, error and
exception handling
• Verify PL/SQL calculations and data mappings
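
The duplicate-record and numeric-precision checks above could be sketched as follows. This is an illustration only: the `src_sales` / `tgt_sales` tables are hypothetical, and it assumes the design specifies that amounts are stored rounded to two decimal places.

```python
import sqlite3


def build_demo_db():
    """Create hypothetical source and target tables with seeded defects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_sales (sale_id INTEGER, amount REAL);
        CREATE TABLE tgt_sales (sale_id INTEGER, amount REAL);
        INSERT INTO src_sales VALUES (1, 10.126), (2, 5.5), (3, 7.333);
        -- Seeded defects: sale 2 is loaded twice, sale 3 lost a decimal place.
        INSERT INTO tgt_sales VALUES (1, 10.13), (2, 5.5), (2, 5.5), (3, 7.3);
    """)
    return conn


def duplicate_keys(conn, table, key_columns):
    """Return (key..., count) tuples for keys that occur more than once."""
    cols = ", ".join(key_columns)
    sql = f"""
        SELECT {cols}, COUNT(*)
        FROM {table}
        GROUP BY {cols}
        HAVING COUNT(*) > 1
        ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()


def precision_mismatches(conn, src, tgt, key, column, places):
    """Return keys whose target value differs from the source value
    rounded to `places` decimals (the assumed specification)."""
    sql = f"""
        SELECT DISTINCT s.{key}
        FROM {src} AS s
        JOIN {tgt} AS t ON s.{key} = t.{key}
        WHERE ABS(t.{column} - ROUND(s.{column}, ?)) > 1e-9
        ORDER BY s.{key}
    """
    return [row[0] for row in conn.execute(sql, (places,))]
```

The small tolerance in `precision_mismatches` avoids flagging floating-point noise; a warehouse that stores amounts in a fixed-point `DECIMAL` type could compare for exact equality instead.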

Examples of DWH Defects
1. Inadequate ETL and stored procedure design documents
2. Field values are null when specified as “Not Null”.
3. Field constraints and SQL not coded correctly for
Informatica ETL
4. Excessive ETL errors discovered after entry to QA
5. Source data does not meet table mapping specifications
(ex., dirty data)
6. Source to target mappings: 1) often not reviewed, 2) in
error and 3) not consistently maintained through dev
lifecycle

Examples of DWH Defects
7. Data models are not adequately maintained during
development lifecycle
8. Target data does not meet mapping specifications
9. Duplicate field values when defined to be DISTINCT
10. ETL SQL / transformation errors leading to missing rows
and invalid field values
11. Constraint violations in source
12. Target data is incorrectly stored in nonstandard formats
13. Table keys are incorrect for important relationship linkages
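
Defects such as incorrect table keys for relationship linkages often show up as orphaned foreign keys. A minimal sketch of such a check follows, assuming a hypothetical `fact_sales` table that references a `dim_product` dimension; neither name comes from the presentation.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical fact/dimension pair with one orphaned key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fact_sales (sale_id INTEGER, product_id INTEGER);
        INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
        -- Product 9 does not exist in the dimension; sale 104 has no key.
        INSERT INTO fact_sales VALUES
            (100, 1), (101, 2), (102, 9), (103, 9), (104, NULL);
    """)
    return conn


def orphan_keys(conn, child, child_fk, parent, parent_pk):
    """Return distinct foreign-key values in `child` with no row in `parent`."""
    sql = f"""
        SELECT DISTINCT c.{child_fk}
        FROM {child} AS c
        LEFT JOIN {parent} AS p ON c.{child_fk} = p.{parent_pk}
        WHERE c.{child_fk} IS NOT NULL
          AND p.{parent_pk} IS NULL
        ORDER BY c.{child_fk}
    """
    return [row[0] for row in conn.execute(sql)]
```

Null foreign keys are excluded here on purpose, since they are better reported by a separate not-null check.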

Verifying Data Loads
From RTTS

Planning for DWH QA (1)
Data integration planning (Data model, LLD’s)
1. Gain understanding of data to be reported by the
application (e.g., profiling)… and the tables upon
which each user report will be based
2. Review, understand data model – gain understanding
of keys, flows from source to target
3. Review, understand data LLD’s and mappings: add,
update sequences for all sources of each target table

Planning for DWH QA (2)
ETL planning and testing (source inputs & ETL design)
1. Participate in ETL design reviews
2. Gain in-depth knowledge of ETL sessions, the order
of execution, constraints, transformations
3. Participate in development ETL test case reviews
4. After ETL’s are run, use checklists for QA
assessments of rejects, session failures, errors

Planning for DWH QA (3)
Assess ETL logs: session, workflow, errors
1. Review ETL workflow outputs, source to target
counts
2. Verify source to target mapping docs with loaded
tables using TOAD and other tools
3. After ETL runs or manual data loads, assess data in
every table with focus on key fields (dirty data,
incorrect formats, duplicates, etc.). Use TOAD, Excel
tools. (SQL queries, filtering, etc.)
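
The source to target count review above can be paired with a key-level reconciliation, so that a count difference is traced to specific rows. The sketch below assumes hypothetical `src_orders` and `tgt_orders` tables sharing an `order_id` key; it illustrates the idea and is not the queries used by the presenter.

```python
import sqlite3


def build_demo_db():
    """Create hypothetical source and target tables; order 3 is not loaded."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_orders (order_id INTEGER, status TEXT);
        CREATE TABLE tgt_orders (order_id INTEGER, status TEXT);
        INSERT INTO src_orders VALUES
            (1, 'NEW'), (2, 'NEW'), (3, 'PAID'), (4, 'PAID'), (5, 'SHIPPED');
        INSERT INTO tgt_orders VALUES
            (1, 'NEW'), (2, 'NEW'), (4, 'PAID'), (5, 'SHIPPED');
    """)
    return conn


def row_counts(conn, src, tgt):
    """Return (source_count, target_count) for a quick completeness check."""
    src_count = conn.execute(f"SELECT COUNT(*) FROM {src}").fetchone()[0]
    tgt_count = conn.execute(f"SELECT COUNT(*) FROM {tgt}").fetchone()[0]
    return src_count, tgt_count


def missing_in_target(conn, src, tgt, key):
    """Return keys present in the source but absent from the target."""
    sql = f"SELECT {key} FROM {src} EXCEPT SELECT {key} FROM {tgt} ORDER BY 1"
    return [row[0] for row in conn.execute(sql)]
```

Keys returned by `missing_in_target` should then be matched against the ETL reject and error logs, so that each missing row is either a planned rejection or a defect.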

Planning for DWH QA (4)
GUI and report validations
1. Compare report data with target data.
2. Verify that reporting meets user expectations
Analytics test team data validation
1. Test data as it is integrated into application
2. Provide tools and tests for data validation

DQ tools / techniques for QA team
TOAD / SQL Navigator
• Data profiling for value range & boundary analysis
• Null field analysis
• Row counting
• Data type analysis
• Referential integrity analysis
• Distinct value analysis by field
• Duplicate data analysis (fields and rows)
• Cardinality analysis
• Stored procedures & package verification
Excel
• Data filtering for profile analysis
• Data value sampling
• Data type analysis

MS Access
• Table and data analysis across schemas
QTP
• Automated testing of templates and application screens
• RTTS QuerySurge
Analytics Tools
• J – statistics, visualization, data manipulation
• Perl – data manipulation, scripting
• R – statistics
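
Several of the profiling techniques listed above (row counting, null field analysis, distinct value analysis, value range analysis) reduce to a single aggregate query per column. The following sketch assumes a hypothetical `orders` table in an in-memory SQLite database; it stands in for the kind of query a tester might run in a SQL editor and does not represent any particular tool's feature.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical table with a null and an outlying value."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER, amount INTEGER);
        INSERT INTO orders VALUES
            (1, 10), (2, 20), (3, 20), (4, NULL), (5, 500);
    """)
    return conn


def profile_column(conn, table, column):
    """Return row, null, and distinct counts plus the value range."""
    sql = f"""
        SELECT COUNT(*),
               SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END),
               COUNT(DISTINCT {column}),
               MIN({column}),
               MAX({column})
        FROM {table}
    """
    rows, nulls, distinct, low, high = conn.execute(sql).fetchone()
    return {
        "rows": rows,
        "nulls": nulls,        # None when the table is empty
        "distinct": distinct,  # nulls are not counted as a distinct value
        "min": low,
        "max": high,
    }
```

Comparing the profile of a source column with the profile of its target column is a quick way to spot unexpected nulls or out-of-range values before writing more detailed tests.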

Bottom Line Recommendations
• Involve test team in entire DWH SDLC
• Profile source and target data
• Remember: DWH QA is much more than
source and target record counts
• Develop testers’ SQL and DWH skills
• Assure availability of source to target mapping
document
• Plan for regression and automated testing

Planning Dev/Unit Tests
Unit testing checklist
Some programmers are not well trained as testers. They may like to program, deploy the
code, and move on to the next development task without a thorough unit test. A checklist
helps database programmers systematically test their code before formal QA testing.

• Check the mapping of fields that support data staging and data marts.
• Check for duplicate values generated by sequence generators.
• Check the correctness of surrogate keys that uniquely identify rows of data.
• Check the data-type constraints of the fields present in staging and core levels.
• Check the data loading status and error messages after ETLs (extracts, transformations, loads).
• Look for string columns that are incorrectly left- or right-trimmed.
• Make sure all tables and specified fields were loaded from source to staging.
• Verify that not-null fields were populated.
• Verify that no data truncation occurred in any field.
• Make sure data types and formats are as specified during database design.
• Make sure there are no duplicate records in target tables.
• Make sure data transformations are correctly based on business rules.
• Verify that numeric fields are populated with the correct precision.
• Make sure every ETL session completed with only planned exceptions.
• Verify all data cleansing, transformation, and error and exception handling.
• Verify stored procedure calculations and data mappings.
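
As one example of turning a checklist item into a repeatable unit test, the check for incorrectly trimmed string columns might look like the sketch below. It assumes a hypothetical `stg_country` table and assumes the specification requires values to be stored without leading or trailing spaces.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical staging table with two untrimmed codes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stg_country (row_id INTEGER, code TEXT);
        INSERT INTO stg_country VALUES
            (1, 'NZ'), (2, ' AU'), (3, 'US '), (4, NULL);
    """)
    return conn


def untrimmed_values(conn, table, key, column):
    """Return keys whose string value has leading or trailing spaces.

    Null values are skipped; a separate not-null check should cover them.
    """
    sql = f"""
        SELECT {key}
        FROM {table}
        WHERE {column} <> TRIM({column})
        ORDER BY {key}
    """
    return [row[0] for row in conn.execute(sql)]
```

A developer could run a handful of such functions after each ETL run and fail the build when any of them returns rows.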

Planning for Performance Tests
As the volume of data in the warehouse grows, ETL execution times can be expected to
increase, and query performance often degrades. These changes can be mitigated by
having a solid technical architecture and efficient ETL design. The aim of performance testing
is to point out potential weaknesses in the ETL design, such as reading a file multiple times or
creating unnecessary intermediate files. A performance and scalability testing checklist helps
discover performance issues.

• Load the database with peak expected production volumes to help ensure that the volume of
data can be loaded by the ETL process within the agreed-on window.
• Compare ETL loading times to loads performed with a smaller amount of data to anticipate
scalability issues.
• Compare the ETL processing times component by component to pinpoint any areas of
weakness.
• Monitor the timing of the reject process and consider how large volumes of rejected data
will be handled.
• Perform simple and multiple join queries to validate query performance on large database
volumes.
• Work with business users to develop sample queries and acceptable performance criteria
for each query.
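
The comparison of load times at different data volumes can be scripted so that it is repeatable. The sketch below times a bulk insert into an in-memory SQLite table as a stand-in for an ETL load; the table name, the generated rows, and the load window are all hypothetical, and a real test would time the actual ETL job against production-like volumes.

```python
import sqlite3
import time


def timed_load(row_count):
    """Load `row_count` generated rows and return (rows_loaded, seconds)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tgt_events (event_id INTEGER PRIMARY KEY, payload TEXT)"
    )
    rows = ((i, f"payload-{i}") for i in range(row_count))
    start = time.perf_counter()
    with conn:  # commit once at the end of the load
        conn.executemany("INSERT INTO tgt_events VALUES (?, ?)", rows)
    elapsed = time.perf_counter() - start
    loaded = conn.execute("SELECT COUNT(*) FROM tgt_events").fetchone()[0]
    conn.close()
    return loaded, elapsed


def within_window(elapsed_seconds, window_seconds):
    """Return True when the load finished inside the agreed-on window."""
    return elapsed_seconds <= window_seconds
```

Running `timed_load` at a small volume and again at the peak expected volume gives two timings to compare; a load time that grows much faster than the row count is a hint of a scalability issue worth investigating component by component.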

Recommendations for data
verifications
Detailed Recommendations for Data Development and QA

1. Need analysis of a) source data quality and b) data field profiles before input to Informatica and other
data-build services.
2. QA should participate in all data model and data mapping reviews.
3. Need complete review of ETL error logs and resolution of errors by ETL teams before DB turn-over to QA.
4. Early use of QC during ETL and stored procedure testing to target vulnerable process areas.
5. Substantially improved documentation of PL/SQL stored procedures.
6. QA needs dev or separate environment for early data testing. QA should be able to modify data in order
to perform negative tests. (QA currently does only positive tests because the application and database
tests work in parallel in the same environment.)
7. Need substantially enhanced verification of target tables after each ETL load before data turn-over to QA.
8. Need mandatory maintenance of data models and source to target mapping / transformation rules
documents from elaboration until transition.
9. Investments in more Informatica and off-the-shelf data quality analysis tools for pre- and post-ETL.
10. Investments in automated DB regression test tools and training to support frequent data loads.

Plan QA for All DWH Dev. Phases

Plan methods & tools for testing

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Kürzlich hochgeladen (20)

Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Data Warehouse Testing: It’s All about the Planning

  • 1. W8 Concurrent Class 10/2/2013 1:45:00 PM "Data Warehouse Testing: It’s All about the Planning" Presented by: Geoff Horne NZTester Magazine Brought to you by: 340 Corporate Way, Suite 300, Orange Park, FL 32073 888-268-8770 ∙ 904-278-0524 ∙ sqeinfo@sqe.com ∙ www.sqe.com
  • 2. Geoff Horne NZTester Magazine Geoff Horne has an extensive background in test program/project directorship and management, architecture, and general consulting. In New Zealand Geoff established and ran ISQA as a testing consultancy which enjoys a local and international clientele in Australia, the US, and the United Kingdom. He has held senior test management roles across a number of diverse industry sectors, and is editor and publisher of the recently launched NZTester magazine. Geoff has authored a variety of white papers on software testing and is a regular speaker at the STAR conferences.
  • 3. 9/20/2013
    Data Warehouse Test Effectiveness
    It’s All about the Planning!
    Assuring Data Warehouse Content, Structure and Quality
    © Wayne Yaddow, 2013, wyaddow@gmail.com

    Agenda
    – Challenges of DWH testing
    – Planning for DWH tests
    – Tester skills for DWH testing
    – Basic ETL verifications
    – Defects you can expect to find
    – Testing tools identified
  • 4. The Data Testing Process

    Plan QA for typical DWH phases
  • 6. Plan QA for DWH Lifecycle
    Primary goals for verification
    – Data completeness
    – Data transformations
    – Data quality
    – Performance and scalability
    – Integration testing
    – User-acceptance testing
    – Regression testing

    Planning the DWH QA Strategy
    Carefully review:
    – Requirements documentation
    – Data models for source and target schemas
    – Source to target mappings
    – ETL / stored proc design & logic
    – CA deployment tasks / steps
    – Required QA tools
  • 7. Challenges for DWH Testers (1)
    1. Often inadequate ETL design documents
    2. Source table field values unexpectedly null
    3. Excessive ETL errors discovered after entry to QA
    4. Source data does not meet table mapping specs (e.g., dirty data)
    5. Source to target mappings:
       1. Often not reviewed by all stakeholders
       2. Not consistently maintained through dev lifecycle
       3. Therefore, in error

    Challenges for DWH Testers (2)
    6. Data models not maintained
    7. Target data does not meet mapping specifications
    8. Duplicate field values when defined to be DISTINCT
    9. ETL SQL / transformation errors that lead to missing rows and invalid field values
    10. Constraint violations in source data
    11. Table keys are incorrect for important RDB linkages
  • 8. Challenges for DWH Testers (3)
    12. Huge source data volumes and variety of data types
    13. Source data quality that must be profiled before loading to DWH
    14. Redundant, duplicate source data
    15. Many source data records to be rejected
    16. ETL logs with messages to be acted upon
    17. Source field values may be missing where they should always be present

    Challenges for DWH Testers (4)
    19. Source data history & business rules may not be available
    20. SMEs and business rules may not be available
    21. Since data ETLs must often pass through multiple phases
    22. Transaction-level traceability will be difficult to attain in a data warehouse
    23. The data warehouse will be a strategic enterprise resource and heavily relied upon
  • 9. Plan for QA Tools

    Identify QA skills (1)
    – Understanding fundamental DWH and DB concepts
    – Strong skills with SQL and stored procedures
    – Understanding of data used by the business
    – Developing strategies, test plans and test cases specific to DWH and the business
    – Creating effective ETL test cases / scenarios based on loading technology and business requirements
    – Understanding of data models, data mapping documents, ETL design and ETL coding; ability to provide feedback to designers and developers
  • 10. Identify QA skills (2)
    – Experience with Oracle, SQL Server, Sybase, DB2 technology
    – Informatica session troubleshooting
    – Deploying DB code to databases
    – Unix scripting, Autosys, Anthill, etc.
    – SQL editors
    – Data profiling
    – Use of Excel & MS Access for data analysis

    Basic ETL Verifications (1)
    – Verify data mappings, source to target
    – Verify that all table fields were loaded from source to staging
    – Verify that keys were properly generated using sequence generator
    – Verify that not-null fields were populated
    – Verify no data truncation in each field
    – Verify data types and formats are as specified in design phase
  • 11. Basic ETL Verifications (2)
    – Verify no duplicate records in target tables
    – Verify transformations based on data low-level designs (LLDs)
    – Verify that numeric fields are populated with correct precision
    – Verify that every ETL session completed with only planned exceptions
    – Verify all cleansing, transformation, error and exception handling
    – Verify PL/SQL calculations and data mappings

    Examples of DWH Defects
    1. Inadequate ETL and stored procedure design documents
    2. Field values are null when specified as “Not Null”
    3. Field constraints and SQL not coded correctly for Informatica ETL
    4. Excessive ETL errors discovered after entry to QA
    5. Source data does not meet table mapping specifications (e.g., dirty data)
    6. Source to target mappings: 1) often not reviewed, 2) in error and 3) not consistently maintained through dev lifecycle
  • 12. Examples of DWH Defects
    7. Data models are not adequately maintained during development lifecycle
    8. Target data does not meet mapping specifications
    9. Duplicate field values when defined to be DISTINCT
    10. ETL SQL / transformation errors leading to missing rows and invalid field values
    11. Constraint violations in source
    12. Target data is incorrectly stored in nonstandard formats
    13. Table keys are incorrect for important relationship linkages

    Verifying Data Loads
    From RTTS
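Several of the verifications and defect types listed above (not-null fields left empty, duplicate values in fields defined as distinct, truncated data) can be checked with plain SQL. The following is a minimal sketch using Python's built-in sqlite3 module. The tables `src_customer` and `tgt_customer`, their columns, and the helper function names are hypothetical and exist only for illustration; a real warehouse would run equivalent queries against its own source and target schemas.

```python
import sqlite3

# Hypothetical source and target tables, used for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (cust_id INTEGER, name TEXT, email TEXT);
    CREATE TABLE tgt_customer (cust_key INTEGER, cust_id INTEGER, name TEXT, email TEXT);
    INSERT INTO src_customer VALUES (1, 'Alice Anderson', 'alice@example.com'),
                                    (2, 'Bob Brown', 'bob@example.com'),
                                    (3, 'Carol Clark', 'carol@example.com');
    INSERT INTO tgt_customer VALUES (100, 1, 'Alice Anderson', 'alice@example.com'),
                                    (101, 2, 'Bob Br', NULL),
                                    (102, 2, 'Bob Br', NULL);
""")

def null_count(conn, table, column):
    """Count rows where a field specified as not-null is actually null."""
    sql = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(sql).fetchone()[0]

def duplicate_values(conn, table, column):
    """Return (value, occurrences) for values repeated in a field defined as distinct."""
    sql = (f"SELECT {column}, COUNT(*) FROM {table} "
           f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}")
    return conn.execute(sql).fetchall()

def truncated_values(conn, src, tgt, key, column):
    """Return keys whose target value is shorter than the matching source value."""
    sql = (f"SELECT DISTINCT s.{key} FROM {src} s JOIN {tgt} t ON s.{key} = t.{key} "
           f"WHERE LENGTH(t.{column}) < LENGTH(s.{column}) ORDER BY s.{key}")
    return [row[0] for row in conn.execute(sql)]

nulls = null_count(conn, "tgt_customer", "email")
dups = duplicate_values(conn, "tgt_customer", "cust_id")
truncated = truncated_values(conn, "src_customer", "tgt_customer", "cust_id", "name")
print("null emails:", nulls)
print("duplicate cust_id values:", dups)
print("truncated names for cust_id:", truncated)
```

The truncation check assumes the field is copied without transformation; where the mapping document specifies trimming or reformatting, the comparison would need to apply the same rule to the source side first.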
  • 13. Planning for DWH QA (1)
    Data integration planning (data model, LLDs)
    1. Gain understanding of data to be reported by the application (e.g., profiling) and of the tables upon which each user report will be based
    2. Review, understand data model: gain understanding of keys, flows from source to target
    3. Review, understand data LLDs and mappings: add, update sequences for all sources of each target table

    Planning for DWH QA (2)
    ETL planning and testing (source inputs & ETL design)
    1. Participate in ETL design reviews
    2. Gain in-depth knowledge of ETL sessions, the order of execution, constraints, transformations
    3. Participate in development ETL test case reviews
    4. After ETLs are run, use checklists for QA assessments of rejects, session failures, errors
  • 14. Planning for DWH QA (3)
    Assess ETL logs: session, workflow, errors
    1. Review ETL workflow outputs, source to target counts
    2. Verify source to target mapping docs with loaded tables using TOAD and other tools
    3. After ETL runs or manual data loads, assess data in every table with focus on key fields (dirty data, incorrect formats, duplicates, etc.). Use TOAD and Excel tools (SQL queries, filtering, etc.)

    Planning for DWH QA (4)
    GUI and report validations
    1. Compare report data with target data
    2. Verify that reporting meets user expectations
    Analytics test team data validation
    1. Test data as it is integrated into application
    2. Provide tools and tests for data validation
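The source to target count review mentioned above can be extended to show which rows are missing and which values differ. Below is a minimal sketch in Python with sqlite3, assuming a direct one-to-one mapping with no transformation; the `src_orders` and `tgt_orders` tables and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical source and target tables for a direct (untransformed) mapping.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER, amount REAL);
    CREATE TABLE tgt_orders (order_id INTEGER, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, 25.5), (3, 7.25), (4, 99.0);
    INSERT INTO tgt_orders VALUES (1, 10.0), (2, 25.0), (4, 99.0);
""")

def row_counts(conn, src, tgt):
    """Return (source_count, target_count) for a quick completeness check."""
    s = conn.execute(f"SELECT COUNT(*) FROM {src}").fetchone()[0]
    t = conn.execute(f"SELECT COUNT(*) FROM {tgt}").fetchone()[0]
    return s, t

def missing_in_target(conn, src, tgt, key):
    """Return source keys that have no matching row in the target."""
    sql = (f"SELECT s.{key} FROM {src} s LEFT JOIN {tgt} t ON s.{key} = t.{key} "
           f"WHERE t.{key} IS NULL ORDER BY s.{key}")
    return [row[0] for row in conn.execute(sql)]

def value_mismatches(conn, src, tgt, key, column):
    """Return (key, source_value, target_value) where the loaded value differs."""
    sql = (f"SELECT s.{key}, s.{column}, t.{column} FROM {src} s "
           f"JOIN {tgt} t ON s.{key} = t.{key} "
           f"WHERE s.{column} <> t.{column} ORDER BY s.{key}")
    return conn.execute(sql).fetchall()

counts = row_counts(conn, "src_orders", "tgt_orders")
missing = missing_in_target(conn, "src_orders", "tgt_orders", "order_id")
mismatches = value_mismatches(conn, "src_orders", "tgt_orders", "order_id", "amount")
print("counts (source, target):", counts)
print("missing order_id values:", missing)
print("amount mismatches:", mismatches)
```

Matching counts alone would not reveal the changed amount on order 2, which is why a row-level comparison is sketched alongside the count.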
  • 15. DQ tools / techniques for QA team
    TOAD / SQL Navigator
    – Data profiling for value range & boundary analysis
    – Null field analysis
    – Row counting
    – Data type analysis
    – Referential integrity analysis
    – Distinct value analysis by field
    – Duplicate data analysis (fields and rows)
    – Cardinality analysis
    – Stored procedures & package verification
    Excel
    – Data filtering for profile analysis
    – Data value sampling
    – Data type analysis
    MS Access
    – Table and data analysis across schemas
    QTP
    – Automated testing of templates and application screens
    RTTS QuerySurge
    Analytics Tools
    – J – statistics, visualization, data manipulation
    – Perl – data manipulation, scripting
    – R – statistics

    Bottom Line Recommendations
    – Involve test team in entire DWH SDLC
    – Profile source and target data
    – Remember: DWH QA is much more than source and target record counts
    – Develop testers’ SQL and DWH skills
    – Assure availability of source to target mapping document
    – Plan for regression and automated testing
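The profiling techniques listed above (null field analysis, distinct value analysis, value range analysis, referential integrity analysis) are normally run from a SQL editor, but the underlying queries are simple. This is a minimal sketch in Python with sqlite3; the `dim_product` and `fact_sales` tables and the function names are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

# Hypothetical dimension and fact tables, used for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER, category TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER, product_key INTEGER, qty INTEGER);
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO fact_sales VALUES (10, 1, 3), (11, 2, NULL), (12, 9, 5), (13, 1, 3);
""")

def profile_column(conn, table, column):
    """Return row count, null count, distinct count, and value range for one field."""
    sql = (f"SELECT COUNT(*), "
           f"SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END), "
           f"COUNT(DISTINCT {column}), MIN({column}), MAX({column}) FROM {table}")
    rows, nulls, distinct, lo, hi = conn.execute(sql).fetchone()
    return {"rows": rows, "nulls": nulls, "distinct": distinct, "min": lo, "max": hi}

def orphan_keys(conn, child, parent, key):
    """Return child key values with no matching parent row (referential integrity check)."""
    sql = (f"SELECT DISTINCT c.{key} FROM {child} c LEFT JOIN {parent} p "
           f"ON c.{key} = p.{key} "
           f"WHERE c.{key} IS NOT NULL AND p.{key} IS NULL ORDER BY c.{key}")
    return [row[0] for row in conn.execute(sql)]

qty_profile = profile_column(conn, "fact_sales", "qty")
orphans = orphan_keys(conn, "fact_sales", "dim_product", "product_key")
print("qty profile:", qty_profile)
print("orphan product_key values:", orphans)
```

Running the same profile against source and target for each mapped field gives a quick comparison that goes beyond record counts.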
  • 16. Planning Dev/Unit Tests
    Unit testing checklist
    Some programmers are not well trained as testers. They may like to program, deploy the code, and move on to the next development task without a thorough unit test. A checklist will help database programmers systematically test their code before formal QA testing.
    – Check the mapping of fields that support data staging and in data marts.
    – Check for duplication of values generated using sequence generators.
    – Check the correctness of surrogate keys that uniquely identify rows of data.
    – Check for data-type constraints of the fields present in staging and core levels.
    – Check the data loading status and error messages after ETLs (extracts, transformations, loads).
    – Look for string columns that are incorrectly left- or right-trimmed.
    – Make sure all tables and specified fields were loaded from source to staging.
    – Verify that not-null fields were populated.
    – Verify that no data truncation occurred in each field.
    – Make sure data types and formats are as specified during database design.
    – Make sure there are no duplicate records in target tables.
    – Make sure data transformations are correctly based on business rules.
    – Verify that numeric fields are populated precisely.
    – Make sure every ETL session completed with only planned exceptions.
    – Verify all data cleansing, transformation, and error and exception handling.
    – Verify stored procedure calculations and data mappings.

    Planning for Performance Tests
    As the volume of data in the warehouse grows, ETL execution times can be expected to increase, and query performance often degrades. These changes can be mitigated by having a solid technical architecture and efficient ETL design.
    The aim of performance testing is to point out potential weaknesses in the ETL design, such as reading a file multiple times or creating unnecessary intermediate files. A performance and scalability testing checklist helps discover performance issues.
    – Load the database with peak expected production volumes to help ensure that the volume of data can be loaded by the ETL process within the agreed-on window.
    – Compare ETL loading times to loads performed with a smaller amount of data to anticipate scalability issues.
    – Compare the ETL processing times component by component to pinpoint any areas of weakness.
    – Monitor the timing of the reject process and consider how large volumes of rejected data will be handled.
    – Perform simple and multiple join queries to validate query performance on large database volumes.
    – Work with business users to develop sample queries and acceptable performance criteria for each query.
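The idea of comparing load times at different data volumes can be sketched as follows. This is an illustrative Python example only: it times a synthetic insert into an in-memory SQLite table rather than a real ETL session, and the table name `tgt_fact`, the row volumes, and the 60-second window are assumptions.

```python
import sqlite3
import time

def time_load(n_rows):
    """Load n_rows synthetic rows into a fresh table; return (rows_loaded, seconds)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tgt_fact (id INTEGER, amount REAL)")
    rows = ((i, i * 1.5) for i in range(n_rows))
    start = time.perf_counter()
    conn.executemany("INSERT INTO tgt_fact VALUES (?, ?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start
    loaded = conn.execute("SELECT COUNT(*) FROM tgt_fact").fetchone()[0]
    conn.close()
    return loaded, elapsed

def within_window(elapsed_seconds, window_seconds):
    """Check a measured load time against the agreed-on load window."""
    return elapsed_seconds <= window_seconds

# Compare a small load with a larger one to see how time per row scales.
results = {n: time_load(n) for n in (1_000, 10_000)}
for n, (loaded, seconds) in results.items():
    per_row_us = seconds / loaded * 1e6
    print(f"{n} rows requested, {loaded} loaded, "
          f"{seconds:.4f} s total, {per_row_us:.2f} microseconds per row, "
          f"within 60 s window: {within_window(seconds, 60.0)}")
```

If the time per row rises noticeably as volume grows, that suggests a scalability issue worth investigating component by component, as the checklist recommends.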
  • 17. Recommendations for data verifications
    Detailed Recommendations for Data Development and QA
    1. Need analysis of a) source data quality and b) data field profiles before input to Informatica and other data-build services.
    2. QA should participate in all data model and data mapping reviews.
    3. Need complete review of ETL error logs and resolution of errors by ETL teams before DB turn-over to QA.
    4. Early use of QC during ETL and stored procedure testing to target vulnerable process areas.
    5. Substantially improved documentation of PL/SQL stored procedures.
    6. QA needs dev or separate environment for early data testing. QA should be able to modify data in order to perform negative tests. (QA currently does only positive tests because the application and database tests work in parallel in the same environment.)
    7. Need substantially enhanced verification of target tables after each ETL load before data turn-over to QA.
    8. Need mandatory maintenance of data models and source to target mapping / transformation rules documents from elaboration until transition.
    9. Investments in more Informatica and off-the-shelf data quality analysis tools for pre- and post-ETL.
    10. Investments in automated DB regression test tools and training to support frequent data loads.

    Plan QA for All DWH Dev. Phases
  • 18. Plan methods & tools for testing