Presenter: Bill Hayduk, Founder / President
Presenter: Jeff Bocarsly, Ph.D., Senior Architect
built by
QuerySurge™
The average organization loses $14.2 million annually
through poor Data Quality.
- Gartner
46% of companies cite Data Quality as a barrier
for adopting Business Intelligence products.
- InformationWeek
The cost per patient data of Phase 3 clinical studies of
new pharmaceuticals exceeds $26,000.
- Journal of Clinical Research Best Practices
(1) Data Integrity (2) Compliance
(1) Data Integrity
High risk of defects that are not readily visible:
• Missing Data
• Truncation of Data
• Data Type Mismatch
• Null Translation Errors
• Incorrect Type Translation
• Misplaced Data
• Extra Records
• Transformation Logic Errors/Holes
• Simple/Small Errors
• Sequence Generator Errors
• Undocumented Requirements
• Not Enough Records
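Several of the defect types listed above can be detected mechanically by comparing source and target rows that share a key. The following is a minimal sketch under stated assumptions, not a description of how QuerySurge works internally: the function name `classify_defects`, the row layout, and the sample data are all hypothetical and chosen only for illustration.

```python
from typing import Any

Row = dict[str, Any]


def classify_defects(source: dict[int, Row], target: dict[int, Row]) -> list[tuple[int, str, str]]:
    """Compare rows keyed by id and return (id, column, defect) tuples."""
    found = []
    for key, src_row in source.items():
        if key not in target:
            found.append((key, "*", "missing record"))
            continue
        tgt_row = target[key]
        for col, src_val in src_row.items():
            tgt_val = tgt_row.get(col)
            # Identical value and identical type: nothing to report.
            if src_val == tgt_val and type(src_val) is type(tgt_val):
                continue
            if (src_val is None) != (tgt_val is None):
                found.append((key, col, "null translation error"))
            elif type(src_val) is not type(tgt_val):
                found.append((key, col, "data type mismatch"))
            elif isinstance(src_val, str) and src_val.startswith(tgt_val):
                # Target holds only a leading part of the source string.
                found.append((key, col, "truncation"))
            else:
                found.append((key, col, "value mismatch"))
    # Keys that exist only in the target are extra records.
    for key in sorted(target.keys() - source.keys()):
        found.append((key, "*", "extra record"))
    return found


# Hypothetical sample data (not from the presentation).
source = {
    1: {"name": "Alexander", "dose_mg": 50},
    2: {"name": "Maria", "dose_mg": 20},
    3: {"name": "Chen", "dose_mg": 10},
}
target = {
    1: {"name": "Alexan", "dose_mg": 50},    # truncated string
    2: {"name": "Maria", "dose_mg": "20"},   # integer loaded as text
    4: {"name": "Extra", "dose_mg": 5},      # record with no source
}

defects = classify_defects(source, target)
for defect in defects:
    print(defect)
```

Checks such as transformation logic errors or undocumented requirements need knowledge of the mapping rules, so they are out of reach for a generic comparison like this one.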
(2) Compliance
Need to comply with Part 11 (FDA 21 CFR Part 11) mandates:
• historical test information
• test version history
• test execution data: who, what & when
• test cycle information
• visibility of assets
• archived test results
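To make the "who, what & when" requirement concrete, here is a minimal sketch of an append-only execution log. It is an illustration only and makes no claim to satisfy Part 11 by itself; the names `ExecutionRecord` and `AuditLog`, the field names, and the sample entries are assumptions introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ExecutionRecord:
    """One immutable audit entry for a single test execution."""
    executed_by: str       # who ran the test
    test_name: str         # what was run
    test_version: int      # which version of the test was run
    outcome: str           # "pass" or "fail"
    executed_at: datetime  # when it ran (UTC)


class AuditLog:
    """Append-only log: entries can be added and read, never edited."""

    def __init__(self) -> None:
        self._entries: list[ExecutionRecord] = []

    def record(self, executed_by: str, test_name: str,
               test_version: int, outcome: str) -> ExecutionRecord:
        entry = ExecutionRecord(executed_by, test_name, test_version,
                                outcome, datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def history(self, test_name: str) -> tuple[ExecutionRecord, ...]:
        """Return every execution of one test, oldest first."""
        return tuple(e for e in self._entries if e.test_name == test_name)


# Hypothetical usage.
log = AuditLog()
log.record("jdoe", "patient_dim_load", 1, "fail")
log.record("jdoe", "patient_dim_load", 2, "pass")
log.record("asmith", "visit_fact_load", 1, "pass")
history = log.history("patient_dim_load")
for entry in history:
    print(entry.executed_by, entry.test_version, entry.outcome, entry.executed_at.isoformat())
```

Freezing the record and returning tuples keeps reviewers from altering past results by accident, which mirrors the read-only review and archiving needs the compliance slides describe.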
• Periodic data reporting to FDA
• Periodic data reporting to int’l bodies
(1) Data Integrity (2) Compliance
• FDA announced audits
• Unannounced FDA audits
Consequences: severe financial and business
Need a testing solution that can…
• automate the manual testing of data
• compare millions of rows of data quickly
• flag mismatches and inconsistencies in data sets
• provide flexibility in scheduling test runs
• generate informative reports that can easily be shared with the team
• validate up to 100% of all data, mitigating the risk
Need a testing solution that can…
• track test history
• provide reporting on test version history
• record all test execution by testing owner’s name and date
• deliver auditable reports of test cycles
• store all test outcomes and test data
• offer a read-only user for reviewing test assets
• support archiving of results
QuerySurge™
a software division of
QuerySurge is the smart testing solution that automates the data validation & testing process
Use Cases
[Diagram labels: Mainframe; ETL (three stages); Business Intelligence & Analytics; typical data issue areas]
C-level executives are using BI & Analytics to make critical business decisions with the assumption that the underlying data is fine. We know it is not.
• Web-based…
• Supported OS...
• Connects through… to any JDBC-compliant data source
• QuerySurge Controller
• QuerySurge Server
• DB Server (MySQL)
• App Server (Tomcat)
• QuerySurge Agents (ships with 10 Agents)
• Installs in the Cloud, on a VM, or on a Bare Metal Server
• Market leader and visionary in automated data testing
• Launched in 2012; has 150+ customers in 30 countries
• Partner Ecosystem boasts 50+ world-renowned technology companies, global System Integrators, & regional consulting firms
• Briefs leading analyst firms and appears in their reports as the recommended solution for data testing
• Named one of the Top 50 companies driving Big Data Innovation by Database Trends and Applications
• In-depth online Knowledge Base, formal classroom training, & free Customer Training Portal
• Sets the gold standard for PoC and post-sale support
QuerySurge supports the following data stores…
• Amazon Redshift, Elastic MapReduce, DynamoDB
• Apache Hadoop/Hive, Spark
• Cassandra
• Cloudera
• Couchbase
• Exasol
• Flat Files (delimited, fixed-width)
• Google BigQuery
• Hortonworks
• IBM (Cognos, Db2, Netezza, Informix, BigInsights, Cloudant, MDM)
• JSON files
• Mainframe
• MapR
• Micro Focus Vertica
• Microsoft (SQL Server DWH, HDInsight, PDW, SSAS, Excel, Access, SharePoint)
• MicroStrategy
• MongoDB
• Oracle (Oracle DB, MySQL, Exadata, NoSQL, Hadoop)
• Pivotal Greenplum
• PostgreSQL
• Salesforce
• SAP (Business Objects, HANA, IQ, ASE, Altiscale Data Cloud)
• Snowflake
• Tableau
• Teradata, Aster
• Workday
• XML
…and any other data store
QuerySurge connects to any 2 points at one time
Source Data
• Data Stores: Databases, Data Warehouses, Data Marts
• Flat Files: Fixed Width, Delimited, Excel, JSON
• Business Intelligence Reports
Target Data
• Big Data stores: Hadoop, NoSQL
• Data Warehouses
• XML
• Web Services
• Business Intelligence Reports
Queries: SQL, HQL
Comparison of every data set
Results (pass/fail): Data Intelligence Reports, Data Analytics Dashboard, automated emails
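The idea of connecting to two points, running one query against each, and comparing every row can be sketched with two in-memory SQLite databases. This is an assumption-laden illustration, not QuerySurge's implementation: the function `run_query_pair`, the table names, and the sample rows are hypothetical, and SQLite stands in for whatever source and target stores are actually involved.

```python
import sqlite3
from collections import Counter


def run_query_pair(source_conn, source_sql, target_conn, target_sql):
    """Run one query per side and compare the complete result sets."""
    # Counters keep duplicate rows, which plain sets would silently merge.
    source_rows = Counter(source_conn.execute(source_sql).fetchall())
    target_rows = Counter(target_conn.execute(target_sql).fetchall())
    return {
        "passed": source_rows == target_rows,
        "missing_in_target": sorted((source_rows - target_rows).elements()),
        "unexpected_in_target": sorted((target_rows - source_rows).elements()),
    }


# Hypothetical source system: names stored in two columns.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE patients (id INTEGER, first TEXT, last TEXT)")
source.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                   [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")])

# Hypothetical target warehouse: names concatenated by the ETL step.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE dim_patient (patient_id INTEGER, full_name TEXT)")
target.executemany("INSERT INTO dim_patient VALUES (?, ?)",
                   [(1, "Ada Lovelace"), (2, "Alan Turin")])  # second name truncated

# The source query applies the expected transformation so both sides line up.
result = run_query_pair(
    source, "SELECT id, first || ' ' || last FROM patients",
    target, "SELECT patient_id, full_name FROM dim_patient",
)
print(result)
```

Writing the transformation into the source query is what lets a simple row-for-row comparison verify transformed data.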
ETL Developer: Codes data movement based on Mapping Requirements
Data Tester: Tests data movement based on Mapping Requirements
[Diagram labels: Source Data; Big Data lake; Data Warehouse; Data Mart; BI & Analytics; ETL steps between stages]
Testing Point #1, Testing Point #2, Testing Point #3
Testing Point #4: BI User extracts data for reports; Tester tests BI Reports
Automate the entire testing cycle
• Automate the launch, execution, comparison, & emailed results
Smart Query Wizards - no coding needed
• Query Wizards create tests visually, without writing SQL
Test across different platforms
• Data Warehouse, Hadoop, NoSQL, DB, mainframe, flat files, XML, JSON, BI Reports
Data Analytics & Intelligence
• Data Analytics Dashboard, Data Intelligence Reports, emailed results, back-end data access
Create Custom Tests
• Modularize functions with snippets, set thresholds, stage data, check data types
DevOps & Continuous Delivery
• API Integration with Build, Configuration, ETL & QA mgmt solutions
• Design Library
• Scheduler
• Run-Time Dashboard
• Query Wizards
• Data Intelligence Reports
• Data Analytics Dashboard
• QuerySurge for DevOps
Fast and Easy.
No programming needed.
• Perform 80% of all data tests with no SQL coding
• Opens up testing to novices & non-technical members
• Speeds up testing for skilled coders
• Provides a huge return on investment
Design Library
• Create custom Query Pairs (source & target SQL for tests that have transformations)
Scheduling
• Build groups of Query Pairs
• Schedule Test Runs
  • Run immediately
  • Run at a set date/time
  • Have an event kick it off
Data Intelligence Reports
• Examine and automatically email test results
Run Dashboard
• View real-time execution
• Analyze real-time results
[Screenshot: test suite run summary (Large Suite, Jan 5, 2019; 6 minutes)]
[Screenshot: Row Failure Drill-Down]
• View data reliability & pass rate
• Add, move, filter, zoom in on any data widget & underlying data
• Verify build success or failure
Execution
• Run Test Scenario
• Kill Test Scenario
Retrieve
• Test Suite Results
• Individual Test Results
• Source and Target Data
• Failed Record Data
• Test Suite Execution Status
Create / Modify / Delete
• QueryPairs
• Datastore Connections
• Test Suites
• Staging Tables
• Query Snippets
• Staging Queries
With the new expanded QuerySurge DevOps API, customers can perform design and analysis operations from outside QuerySurge, which allows QuerySurge to be adopted and integrated into any DevOps process that centers on data.
QuerySurge Server
Front Line Support:
• Technical Resources available for POCs (7:30am – 9:00pm New York time)
• Web conferencing sessions
• QuerySurge Customer Portal (free)
• QuerySurge Partner Portal (free)
Additional Support:
• Ticket support
• Knowledge Base
• Videos / Slide decks
(1) a Trial in the Cloud of QuerySurge, including a self-learning tutorial that works with sample data, for 3 days, or
(2) a Downloaded Trial of QuerySurge, including a self-learning tutorial with sample data or your data, for 15 days, or
(3) a Proof of Concept of QuerySurge, including a kickoff & setup meeting and weekly meetings with our team of experts, for 30 days
For more information, go to http://www.querysurge.com/compare-trial-options
Fortune 500 firm:
Clinical Trial Data
Challenge
How can a Data Warehouse team assure data integrity over multiple builds when the cost per patient data of Phase 3 clinical studies exceeds $26,000 and the volume of live case data exceeds 1 TB?
Strategy
Implement QuerySurge™ to dramatically increase coverage of data that is verified for each build.
Implementation
• 1,000 SQL queries written to compare case data from the source systems to the DWH after ETL.
• QuerySurge™ automated the scheduling, test runs, comparisons and reporting for each build.
Metrics
• 500 mappings
• 2.5 million data items
• 1.25 billion verifications
• Complete run finished in 7 days
• 45% of data was covered
• 14 builds were deployed
• 115 defects were discovered and remediated
Benefits
• 10-fold increase in the speed of testing
• Huge increase in coverage of data (from less than 0.1% to 45%)
• Production defects discovered that were missed in previous cycles
• Huge savings on clean records (115 defects x $26,000/record)
• Huge time savings (3.6 years x 10 people)
• Avoidance of lawsuits and FDA fines
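The case-study figures can be cross-checked with a few lines of arithmetic. This sketch only multiplies out the numbers quoted on the slides; reading the verification count as mappings times data items is an assumption that happens to match the stated total, and the variable names are illustrative.

```python
from fractions import Fraction

# Verifications: assumed to be mappings x data items (matches the stated total).
mappings = 500
data_items = 2_500_000
verifications = mappings * data_items

# Savings on clean records: 115 defects x $26,000 per record.
defects_found = 115
cost_per_record_usd = 26_000
record_savings_usd = defects_found * cost_per_record_usd

# Coverage rose from less than 0.1% to 45%, so the factor is at least this.
coverage_before_pct = Fraction(1, 10)
coverage_after_pct = Fraction(45)
min_coverage_factor = coverage_after_pct / coverage_before_pct

# Time savings: 3.6 years x 10 people, expressed in person-years.
person_years = Fraction("3.6") * 10

print(f"verifications:       {verifications:,}")
print(f"record savings:      ${record_savings_usd:,}")
print(f"coverage factor:     at least {min_coverage_factor}x")
print(f"time savings:        {person_years} person-years")
```

Exact fractions are used for the percentages and years so the results are not affected by floating-point rounding.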
For more on Pharma & QuerySurge, go to
www.querysurge.com/solutions/pharmaceutical-industry

Data Warehouse Testing in the Pharmaceutical Industry

  • 1. 1 Presenter Bill Hayduk Founder / President Presenter Jeff Bocarsly, Ph.D. Senior Architect built bybuilt by QuerySurge™
  • 2. built by The average organization loses $14.2 million annually through poor Data Quality. - Gartner 46% of companies cite Data Quality as a barrier for adopting Business Intelligence products. - InformationWeek The cost per patient data of Phase 3 clinical studies of new pharmaceuticals exceeds $26,000. - Journal of Clinical Research Best Practices built by QuerySurge™
  • 4. (1) Data Integrity (2) Compliance built by QuerySurge™
  • 5. (1) Data Integrity high risk of defects that are not readily visible Missing Data Truncation of Data Data Type Mismatch Null Translation errors Incorrect Type Translation Misplaced Data Extra Records Transformation Logic Errors/Holes Simple/Small Errors Sequence Generator errors Undocumented Requirements Not Enough Records built by QuerySurge™
  • 6. (2) Compliance Need to comply with Part 11 mandates historical test information test version history test execution data: who, what & when test cycle information visibility of assets archived test results built by QuerySurge™
  • 7.  Periodic data reporting to FDA  Periodic data reporting to int’l bodies (1) Data Integrity (2) Compliance  FDA announced audits  Unannounced FDA audits Consequences Severe financial and business built by QuerySurge™
  • 9.  automate the manual testing of data  compare millions of rows of data quickly  flag mismatches and inconsistencies in data sets  provide flexibility in scheduling test runs  generate informative reports that can easily be shared with the team  validate up to 100% of all of all data, mitigating the risk Need a testing solution that can… built by QuerySurge™
  • 10.  track test history  provide reporting on test version history  record all test execution by testing owner’s name and date  deliver auditable reports of test cycles  store all test outcomes and test data  offer a read-only user for reviewing test assets  support archiving of results Need a testing solution that can… built by QuerySurge™
  • 11. a software division ofQuerySurge™
  • 12. is the smart testing solution that automates the data validation & testing process QuerySurge QuerySurge™ a software division of Use Cases
  • 13. ETL ETL Mainframe Business Intelligence & Analytics C-level executives are using BI & Analytics to make critical business decisions with the assumption that the underlying data is fine We know it is not ETL Typical data issue areas
  • 14. Web-based… Supported OS... Connects through… …to any JDBC compliant data source QuerySurge™ QuerySurge Controller QuerySurge Server DB Server (MySQL) App Server (Tomcat) QuerySurge Agents (Ships with 10 Agents) a software division of Installs... …in the Cloud…on a VM…on a Bare Metal Server
  • 15. • Market leader and visionary in automated data testing • Launched in 2012, has 150+ customers in 30 countries • Partner Ecosystem boasts 50+ world-renowned Technology companies, global System Integrators, & regional consulting firms • Briefs leading analyst firms and appears in their reports as the recommended solution for data testing • Named Top 50 companies driving Big Data Innovation by Database Trends and Applications • In-depth, online Knowledge Base, formal classroom training, & free Customer Training Portal • Sets the gold standard for PoC and post-sale support a software division ofQuerySurge™
  • 16. QuerySurge supports the following data stores… • Amazon Redshift, Elastic Map Reduce, DynamoDB • Apache Hadoop/Hive, Spark • Cassandra • Cloudera • Couchbase • Exasol • Flat Files (delimited, fixed-width) • Google BigQuery • Hortonworks • IBM (Cognos ,Db2, Netezza, Informix, Big Insights, Cloudant, MDM) • JSON files • Mainframe • MAPR • Micro Focus Vertica • Microsoft (SQL Server DWH, HDInsight, PDW, SSAS, Excel, Access, SharePoint) • MicrosStrategy • MongoDB • Oracle (Oracle DB, MySQL, Exadata, NoSQL, Hadoop) • Pivotal GreenPlum • PostgreSQL • Salesforce • SAP (Business Objects, HANA, IQ, ASE, Altiscale Data Cloud) • Snowflake • Tableau • Teradata, Aster • Workday • XML …and any other data store Flat Files Excel
  • 17. QuerySurge connects to any 2 points at one time SQL HQL SQL Comparison of every data set Source Data Target Data Data Intelligence Reports, Data Analytics Dashboard, automated emails Results – pass/fail Target Data Big Data stores • Hadoop • NoSQL Data Warehouses XML Web Services Source Data Data Stores • Databases • Data Warehouses • Data Marts Flat Files • Fixed Width • Delimited • Excel • JSON Business Intelligence Reports Business Intelligence Reports
  • 18. ETL Developer: Codes data movement based on Mapping Requirements Data Warehouse ETL Data Tester: Tests data movement based on Mapping Requirements Data Mart ETL Source Data Big Data lake Testing Point #1 Testing Point #2 Testing Point #3 BI & Analytics BI User extracts data for reports Testing Point #4 Tester tests BI Reports
  • 19. a software division ofQuerySurge™ Automate the entire testing cycle  Automate the launch, execution, comparison, & emailed results Smart Query Wizards - no coding needed  Query Wizards create tests visually, without writing SQL Test across different platforms  Data Warehouse, Hadoop, NoSQL, DB, mainframe, flat files, XML, JSON, BI Reports Data Analytics & Intelligence  Data Analytics Dashboard, Data Intelligence Reports, emailed results, back-end data access Create Custom Tests  Modularize functions with snippets, set thresholds, stage data, check data types DevOps & Continuous Delivery  API Integration with Build, Configuration, ETL & QA mgmt solutions
  • 20. Query Wizards • Design Library • Scheduler • Run-Time Dashboard • Data Intelligence Reports • Data Analytics Dashboard • QuerySurge for DevOps
  • 21. Fast and Easy. No programming needed. • Perform 80% of all data tests with no SQL coding • Opens up testing to novices & non-technical members • Speeds up testing for skilled coders • Provides a huge Return-On-Investment
  • 23. Design Library: create custom Query Pairs (source & target SQL for tests that have transformations) • Scheduling: build groups of Query Pairs; schedule Test Runs (run immediately, run at a set date/time, or have an event kick it off)
  • 24. Data Intelligence Reports: examine and automatically email test results • Run Dashboard: view real-time execution; analyze real-time results
  • 25. [Screenshot: "Large Suite" test run, Jan 5, 2019; timestamps shown: 16:20:44 and 4:24 PM (Start Time); 6 minutes]
  • 27. • View data reliability & pass rate • Add, move, filter, zoom in on any data widget & underlying data • Verify build success or failure
  • 28. QuerySurge Server API operations: Run Test Scenario • Kill Test Scenario Execution • Test Suite Results • Individual Test Results • Source and Target Data • Failed Record Data • Test Suite Execution Status • Retrieve QueryPairs • Create / Modify / Delete: Datastore Connections, Test Suites, Staging Tables, Query Snippets, Staging Queries. With the new expanded QuerySurge DevOps API, customers can now perform design and analysis operations from outside QuerySurge, which allows QuerySurge to be adopted and integrated into any DevOps process that centers on data.
  • 29. Front Line Support: • Technical Resources available for POCs (7:30am – 9:00pm New York time) • Web conferencing sessions • QuerySurge Customer Portal (free) • QuerySurge Partner Portal (free) Additional Support: • Ticket support • Knowledge Base • Videos / Slide decks
  • 30. (1) a Trial in the Cloud of QuerySurge, including a self-learning tutorial that works with sample data, for 3 days; or (2) a Downloaded Trial of QuerySurge, including a self-learning tutorial with sample data or your data, for 15 days; or (3) a Proof of Concept of QuerySurge, including a kickoff & setup meeting and weekly meetings with our team of experts, for 30 days. For more information, go to http://www.querysurge.com/compare-trial-options
  • 31. Fortune 500 firm: Clinical Trial Data
  • 32. Challenge: How can a Data Warehouse team assure data integrity over multiple builds when the cost per patient data of Phase 3 clinical studies exceeds $26,000 and the volume of live case data is > 1 TB? Strategy: Implement QuerySurge™ to dramatically increase coverage of data that is verified for each build. Implementation: • 1,000 SQL queries written to compare case data from the source systems to the DWH after ETL. • QuerySurge™ automated the scheduling, test runs, comparisons and reporting for each build.
  • 33. Metrics: • 500 mappings • 2.5 million data items • 1.25 billion verifications • Complete run finished in 7 days • 45% of data was covered • 14 builds were deployed • 115 defects were discovered and remediated. Benefits: • 10-fold increase in the speed of testing • Huge increase in coverage of data (from less than 0.1% to 45%) • Production defects discovered that were missed in previous cycles • Huge savings on clean records (115 defects x $26,000/record) • Huge time savings (3.6 years x 10 people) • Avoidance of lawsuits and FDA fines
  • 34. For more on Pharma & QuerySurge, go to www.querysurge.com/solutions/pharmaceutical-industry
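
The headline figures on the Metrics slide (slide 33) can be cross-checked with simple arithmetic. The sketch below is illustrative only. The slides do not say how the 1.25 billion verifications were derived, so multiplying mappings by data items is an assumption that happens to reproduce the quoted number. Treating each defect as one $26,000 patient record follows the slide's own "115 defects x $26,000/record" note.

```python
# Figures quoted on the case-study slides.
mappings = 500
data_items = 2_500_000
defects = 115
cost_per_patient_record_usd = 26_000

# Assumption (not stated in the slides): every data item is verified
# once per mapping, which reproduces the quoted 1.25 billion.
verifications = mappings * data_items

# Follows the slide's "115 defects x $26,000/record" note, i.e. each
# defect is counted as one patient record that would otherwise be lost.
savings_usd = defects * cost_per_patient_record_usd

print(f"verifications: {verifications:,}")
print(f"savings on clean records: ${savings_usd:,}")
```

Under these assumptions the verification count matches the slide, and the clean-record savings come to just under $3 million.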

Editor's notes

  1. Other Pharmaceutical Industry Complexities: • Industry consolidation causing massive integration of data • FDA 21 CFR Part 11 compliance • A broad variety of data types and sources may be fed into a data warehouse: general Pharma-specific information exchange formats (e.g., HL7 feeds, CDISC feeds, other XML grammars) and multiple proprietary and internal data formats, which may have been acquired in the process of industry consolidation.
  2. QuerySurge can automate the comparison of all data from source files and databases, through the different legs of the ETL process, to the target data warehouse. QuerySurge can be scheduled to run immediately, at a set time (for example, next Monday at 11:00 pm), or when an event occurs, such as the end of the current ETL process. QuerySurge executes tests that automate the comparison of target data to source data, comparing millions of rows of data in minutes. On completion of the run, QuerySurge produces summary and detailed reports that can be viewed immediately or shared with the team via the automated email scheduler. QuerySurge can validate up to 100% of your data, mitigating risk, and its reports highlight every data difference, down to the individual character.
  3. - tracks test history (user, date, each test version) - provides reporting on test version history for convenient auditing - supports tracking of deviations from approved tests - records all test execution owners by name and date - delivers auditable results reporting of test cycles - stores all test outcomes and test data for post-facto review or audit - offers a read-only user type for reviewing test assets - supports off-database archiving of results (for future restore) for effective long-term results data management
  4. QuerySurge provides insight into the health of your data across your organization through BI dashboards and reports. It is a collaborative tool that supports distributed use throughout your organization and provides a sharable, holistic view of your data's health and of the maturity of your organization's data management.
  5. Your distributed team from around the world can use any of these web browsers: Internet Explorer, Chrome, Firefox and Safari. QuerySurge installs on Windows and Linux. QuerySurge connects to any JDBC-compliant data source, even if it is not listed here.
  6. QuerySurge finds bad data by natively connecting to any data source, whether it is a database, flat file or XML, and to any data target, whether it is a database, file, XML, data warehouse or Hadoop implementation. QuerySurge pulls data from the source and the target, compares them very quickly (typically in a few minutes), and then produces reports that show every data difference, even if there are millions of rows and hundreds of columns in the test. These reports can be automatically emailed to your team. You can pick from a multitude of reports or export the results so that you can build your own reports.
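
The source-versus-target comparison described in note 6 can be sketched in a few lines of Python. This is a minimal illustration of the general idea, not QuerySurge's implementation or API. The function name `compare_query_pair`, the table names, and the sample rows are all hypothetical, and two in-memory SQLite databases stand in for a real source system and data warehouse.

```python
import sqlite3
from collections import Counter


def compare_query_pair(source_conn, source_sql, target_conn, target_sql):
    """Run one query per side and report row-level differences.

    Hypothetical helper: rows are compared as multisets, so duplicate
    rows are counted rather than collapsed.
    """
    source_rows = Counter(source_conn.execute(source_sql).fetchall())
    target_rows = Counter(target_conn.execute(target_sql).fetchall())
    missing = list((source_rows - target_rows).elements())
    extra = list((target_rows - source_rows).elements())
    return {
        "passed": not missing and not extra,
        "missing_in_target": missing,
        "extra_in_target": extra,
    }


# Stand-in source system with three patient records (made-up data).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE patients (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [(1, "Alice Johnson"), (2, "Bob Smith"), (3, "Carol Diaz")],
)

# Stand-in warehouse after a faulty load: one name is truncated and
# one record is missing.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE dim_patient (patient_id INTEGER, patient_name TEXT)")
target.executemany(
    "INSERT INTO dim_patient VALUES (?, ?)",
    [(1, "Alice Johnson"), (2, "Bob Smi")],
)

result = compare_query_pair(
    source, "SELECT id, name FROM patients ORDER BY id",
    target, "SELECT patient_id, patient_name FROM dim_patient ORDER BY patient_id",
)
print(result)
```

Here the truncated name appears both as a row missing from the target and as an extra row in the target, while the dropped record appears only as missing. These correspond to the "Truncation of Data" and "Missing Data" defect types listed earlier in the deck.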