© 2017 IDERA, Inc. All rights reserved.
Proprietary and confidential.
A MEDICAL DATA WAREHOUSE MODEL
Michael R Blaha, DSc.
blaha@computer.org
www.superdataguy.com
WAREHOUSE DATA FLOW
SOURCE DATA DETAILS
▪ Epic Clarity is 99+% of source data
• All the data of a major hospital
• Small amount of data from other apps and external sources
• No scrubbing of source data
• 10,000+ Clarity tables
• Some tables have 200+ columns
• Much redundant data
• Clarity holds both transaction data and rollup data
• No referential integrity
• Odd primary keys (PKs) for many-to-many tables
• Table1 PK + line_number
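
With no referential integrity in the source, orphaned child rows are worth checking for before loading. The following is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`encounter`, `encounter_dx`, `enc_id`, `dx_code`) are hypothetical stand-ins, not actual Clarity names; only the "parent PK + line_number" key shape comes from the slide.

```python
import sqlite3

def find_orphans(conn):
    """Return child rows whose parent key has no match (anti-join)."""
    sql = """
        SELECT c.enc_id, c.line_number
        FROM encounter_dx AS c
        LEFT JOIN encounter AS p ON p.enc_id = c.enc_id
        WHERE p.enc_id IS NULL
        ORDER BY c.enc_id, c.line_number
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical parent table.
    CREATE TABLE encounter (enc_id INTEGER PRIMARY KEY);
    -- Hypothetical child table keyed by parent PK + line_number.
    -- No FOREIGN KEY is declared, mirroring the source system.
    CREATE TABLE encounter_dx (
        enc_id      INTEGER NOT NULL,
        line_number INTEGER NOT NULL,
        dx_code     TEXT,
        PRIMARY KEY (enc_id, line_number)
    );
    INSERT INTO encounter VALUES (1), (2);
    INSERT INTO encounter_dx VALUES
        (1, 1, 'DX_A'), (1, 2, 'DX_B'), (3, 1, 'DX_C');
""")

orphans = find_orphans(conn)
print(orphans)  # child rows pointing at a missing encounter
```

The same anti-join pattern can be pointed at any parent/child pair; running it per load gives an early warning that the warehouse would otherwise inherit dangling keys.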
OTHER DATA FLOW DETAILS
▪ Data warehouse platform is Netezza
▪ The staging tables maintain a history of operational data
• One staging table for each major source
• Staging schema = operational schema + surrogate key + effective date + expiration date
▪ Informatica for staging data + a new commercial tool
▪ Informatica for ETL + agile SQL queries
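
One way to picture "operational schema + surrogate key + effective date + expiration date" is the sketch below, written with Python's sqlite3 module rather than Netezza or Informatica. The table `stg_patient`, its columns, and the `9999-12-31` open-ended sentinel are assumptions made for illustration; the slides do not say how current rows are marked.

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed sentinel meaning "still current"

def stage_row(conn, patient_id, city, load_date):
    """Record a new version of an operational row only if it changed."""
    cur = conn.execute(
        "SELECT stg_key, city FROM stg_patient "
        "WHERE patient_id = ? AND expiration_date = ?",
        (patient_id, OPEN_END),
    ).fetchone()
    if cur is not None and cur[1] == city:
        return  # unchanged: leave the current version open
    if cur is not None:
        # Close the previous version as of this load.
        conn.execute(
            "UPDATE stg_patient SET expiration_date = ? WHERE stg_key = ?",
            (load_date, cur[0]),
        )
    conn.execute(
        "INSERT INTO stg_patient "
        "(patient_id, city, effective_date, expiration_date) "
        "VALUES (?, ?, ?, ?)",
        (patient_id, city, load_date, OPEN_END),
    )

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stg_patient (
        stg_key         INTEGER PRIMARY KEY AUTOINCREMENT, -- surrogate key
        patient_id      INTEGER NOT NULL,                  -- operational columns
        city            TEXT,
        effective_date  TEXT NOT NULL,
        expiration_date TEXT NOT NULL
    )
""")
stage_row(conn, 42, "Chicago", "2017-01-01")
stage_row(conn, 42, "Chicago", "2017-02-01")    # no change, no new row
stage_row(conn, 42, "St. Louis", "2017-03-01")  # change closes the old row

history = conn.execute(
    "SELECT city, effective_date, expiration_date "
    "FROM stg_patient ORDER BY stg_key"
).fetchall()
print(history)
```

Each operational key ends up with a chain of non-overlapping date ranges, so the warehouse can be rebuilt as of any past load date.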
BUS ARCHITECTURE
▪ Definition: A data warehouse with dimensions that are consistently defined across facts (conformed dimensions)
▪ 200+ dimensions; 100+ facts; Kimball approach
▪ Fully commented data model with naming standards
▪ We created our own DW schema
• This project preceded Epic’s DW schema
▪ DW has strong correspondence to Clarity
• Little abstraction
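
A bus matrix can be treated as plain data: each fact maps to the set of dimensions it uses. The sketch below shows two questions such a structure answers. The fact and dimension names echo this deck, but the specific assignments are illustrative guesses, not the matrix from the slides, and both helper functions are hypothetical.

```python
# Bus matrix as a mapping: fact name -> set of dimension names.
# These assignments are illustrative guesses, not the deck's actual matrix.
bus_matrix = {
    "Order_Medication": {
        "Date", "Encounter", "Medication", "Patient", "Pharmacy", "Provider",
    },
    "Order_Procedure": {
        "Date", "Department", "Encounter", "Patient", "Procedure", "Provider",
    },
    "Patient_Appointment": {
        "Appointment", "Date", "Department", "Patient", "Provider",
    },
}

def shared_dimensions(matrix, *facts):
    """Dimensions common to all named facts: the axes usable for drill-across."""
    sets = [matrix[f] for f in facts]
    return sorted(set.intersection(*sets))

def facts_using(matrix, dimension):
    """Facts affected if the definition of one dimension changes."""
    return sorted(f for f, dims in matrix.items() if dimension in dims)

common = shared_dimensions(bus_matrix, "Order_Medication", "Order_Procedure")
impacted = facts_using(bus_matrix, "Department")
print(common)
print(impacted)
```

Because the dimensions are defined once and reused, results from two facts can be combined on any dimension in `common` without reconciling definitions first.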
PARTIAL BUS ARCHITECTURE
Dimensions (columns): Acct, Appt, BenP, Carr, CosC, Covr, Date, Dept, Diag, DRG, Enc, Med, Pat, Payr, Phar, Prob, Proc, Prov, RevC, Unav, Vend
Account_Coverage X X X X X
Account_Diagnosis X X X X X
Account_Procedure X X X X X X
Account_Summary X X X X X X X X X X X X
Billing_Tx_Detail X X X X X X X X X X
Encounter X X X X X X
PARTIAL BUS ARCHITECTURE (CONT.)
Dimensions (columns): Acct, Appt, BenP, Carr, CosC, Covr, Date, Dept, Diag, DRG, Enc, Med, Pat, Payr, Phar, Prob, Proc, Prov, RevC, Unav, Vend
Order_Medication X X X X X X
Order_Procedure X X X X X X X
Patient_Appt X X X X X X
Provider_Avail X X X X X X
Readmission X X X X X X X X X X
Referral X X X X X X X X X
SAMPLE DIMENSIONS
▪ Account
▪ Appointment
▪ Benefit_Plan
▪ Carrier
▪ Cost_Center
▪ Coverage
▪ Date
▪ Department
▪ Diagnosis
▪ DRG
▪ Encounter
SAMPLE DIMENSIONS (CONT.)
▪ Medication
▪ Patient
▪ Payor
▪ Pharmacy
▪ Problem_List
▪ Procedure
▪ Provider
▪ Revenue_Code
▪ Unavailable_Reason
▪ Vendor
SAMPLE FACTS
▪ Account_Coverage – patient account and coverage data
▪ Account_Diagnosis – data about diagnoses
▪ Account_Procedure – procedures for an account
▪ Account_Summary – an accumulating snapshot fact
▪ Billing_Transaction_Detail – hospital billing transactions
▪ Encounter – the current state of a patient encounter
▪ Order_Medication – data for medications
▪ Order_Procedure – data for procedures and lab orders
▪ Patient_Appointment – data for appointments
▪ Provider_Availability – time slots for a provider’s schedule
▪ Readmission – data for hospital readmission per the Affordable Care Act (ACA)
▪ Referral – a patient handoff from one provider to another
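
To give a feel for what populating a Readmission fact involves, here is a deliberately simplified sketch using Python's sqlite3 module. It assumes a 30-day window measured from discharge to the next admission and ignores exclusions such as planned readmissions or transfers. The `stay` table and its columns are hypothetical, and this is not the project's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical inpatient stay table with ISO-8601 date strings.
    CREATE TABLE stay (
        stay_id        INTEGER PRIMARY KEY,
        patient_id     INTEGER NOT NULL,
        admit_date     TEXT NOT NULL,
        discharge_date TEXT NOT NULL
    );
    INSERT INTO stay VALUES
        (1, 10, '2017-01-01', '2017-01-05'),
        (2, 10, '2017-01-20', '2017-01-22'),
        (3, 10, '2017-06-01', '2017-06-03'),
        (4, 11, '2017-02-01', '2017-02-02');
""")

# Self-join: pair each index stay with any later stay of the same
# patient that begins within the window after the index discharge.
READMIT_SQL = """
    SELECT idx.stay_id AS index_stay,
           re.stay_id  AS readmit_stay,
           CAST(julianday(re.admit_date)
                - julianday(idx.discharge_date) AS INTEGER) AS days_out
    FROM stay AS idx
    JOIN stay AS re
      ON re.patient_id = idx.patient_id
     AND re.stay_id <> idx.stay_id
     AND julianday(re.admit_date) >= julianday(idx.discharge_date)
     AND julianday(re.admit_date) - julianday(idx.discharge_date) <= ?
    ORDER BY idx.stay_id, re.stay_id
"""

readmissions = conn.execute(READMIT_SQL, (30,)).fetchall()
print(readmissions)  # (index_stay, readmit_stay, days_out) tuples
```

Even this toy version shows why the real query gets complex: every business rule (window boundaries, exclusions, which stay counts as the index) becomes another join condition that has to be validated.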
SAMPLE SUBJECT AREA
STAFFING BREAKDOWN
▪ 4 managers
▪ 2 data modelers + advanced tools/techniques
▪ 6 ETL
▪ 5 business analysts
▪ 1 DBA
▪ 1 metadata
▪ 2 secretaries
▪ 2 legacy software
RETROSPECTIVE
▪ Too much of “build it and they will come”
▪ There were errors in staging data!
▪ There was code review for ETL scripts
• I’m not confident that the ETL scripts were correct
• We should have used SQL to validate ETL scripts
▪ Agile analytics
• We had complex SQL for calculating readmissions
• The SQL was validated as correct
• The legacy software was wrong!
▪ Issue of Clarity rollup vs. DW rollup
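
Using SQL to validate ETL output could look something like the reconciliation sketch below, which compares row counts, a measure total, and missing keys between a staging table and the fact it feeds. The helper `reconcile` and the tables `stg_billing_tx` and `fact_billing_tx` are hypothetical, and amounts are held as integer cents to keep the comparison exact.

```python
import sqlite3

def reconcile(conn, source, target, key, measure):
    """Compare row count, measure total, and missing keys between two tables.

    Table and column names are interpolated into the SQL text, so they
    must come from trusted code, never from user input.
    """
    src_count, src_total = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM({measure}), 0) FROM {source}"
    ).fetchone()
    tgt_count, tgt_total = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM({measure}), 0) FROM {target}"
    ).fetchone()
    missing = [
        row[0]
        for row in conn.execute(
            f"SELECT {key} FROM {source} "
            f"EXCEPT SELECT {key} FROM {target} ORDER BY 1"
        )
    ]
    return {
        "count_diff": src_count - tgt_count,
        "total_diff": src_total - tgt_total,
        "missing_keys": missing,
    }

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_billing_tx  (tx_id INTEGER PRIMARY KEY, amount_cents INTEGER);
    CREATE TABLE fact_billing_tx (tx_id INTEGER PRIMARY KEY, amount_cents INTEGER);
    INSERT INTO stg_billing_tx  VALUES (1, 1000), (2, 2550), (3, 400);
    -- Simulate an ETL run that silently dropped transaction 3.
    INSERT INTO fact_billing_tx VALUES (1, 1000), (2, 2550);
""")

report = reconcile(conn, "stg_billing_tx", "fact_billing_tx",
                   "tx_id", "amount_cents")
print(report)
```

A report of all zeros and an empty key list does not prove the ETL is right, but any non-zero difference is cheap, concrete evidence that it is wrong, which code review alone may not surface.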
THANKS!
Any questions?
You can find me at:
@michaelrblaha
blaha@computer.org
www.superdataguy.com

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

A Medical Data Warehouse Model

  • 1. A MEDICAL DATA WAREHOUSE MODEL
    Michael R Blaha, DSc.
    blaha@computer.org
    www.superdataguy.com
    © 2017 IDERA, Inc. All rights reserved. Proprietary and confidential.
  • 2. WAREHOUSE DATA FLOW
  • 3. SOURCE DATA DETAILS
    • Epic Clarity is 99+% of source data
      ◦ All the data of a major hospital
      ◦ Small amount of data from other apps and external sources
      ◦ No scrubbing of source data
      ◦ 10,000+ Clarity tables
      ◦ Some tables have 200+ columns
      ◦ Much redundant data
      ◦ Clarity also has both transaction data and rollup data
      ◦ No referential integrity
      ◦ Odd PKs of many-to-many tables
      ◦ Table1 PK + line_number
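The last three bullets of this slide (no referential integrity, and many-to-many tables keyed by the parent table's PK plus a line number) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than Clarity or Netezza, and the tables `encounter` and `encounter_dx`, their columns, and the helper `find_orphans` are all hypothetical names chosen for illustration. The query is a plain anti-join that lists child rows whose parent key does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; the names and columns are illustrative, not Clarity's.
    CREATE TABLE encounter (enc_id INTEGER PRIMARY KEY);
    CREATE TABLE encounter_dx (
        enc_id      INTEGER NOT NULL,  -- the parent table's PK
        line_number INTEGER NOT NULL,  -- position of the row within its parent
        dx_id       INTEGER NOT NULL,
        PRIMARY KEY (enc_id, line_number)
        -- deliberately no FOREIGN KEY, mirroring a source without referential integrity
    );
    INSERT INTO encounter VALUES (1), (2);
    INSERT INTO encounter_dx VALUES (1, 1, 100), (1, 2, 101), (2, 1, 100), (3, 1, 102);
""")

def find_orphans(conn):
    """Return (enc_id, line_number) for child rows whose parent row is missing."""
    return conn.execute("""
        SELECT c.enc_id, c.line_number
        FROM encounter_dx AS c
        LEFT JOIN encounter AS p ON p.enc_id = c.enc_id
        WHERE p.enc_id IS NULL
        ORDER BY c.enc_id, c.line_number
    """).fetchall()

orphans = find_orphans(conn)
print(orphans)
```

Because the source is not scrubbed, a check of this kind would presumably have to run on the warehouse side before dimension keys are assigned.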
  • 4. OTHER DATA FLOW DETAILS
    • Data warehouse platform is Netezza
    • The staging tables maintain a history of operational data
      ◦ One staging table for each major source
      ◦ Staging schema = operational schema + surrogate key + effective date + expiration date
    • Informatica for staging data + a new commercial tool
    • Informatica for ETL + agile SQL queries
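The staging-schema bullet above (operational schema + surrogate key + effective date + expiration date) can be sketched as follows. This is a minimal illustration with Python's sqlite3, not the Netezza or Informatica setup the slides describe. The table `stg_patient`, its columns, and the helper `load_version` are hypothetical, and treating a NULL expiration date as "current row" is an assumption.

```python
import sqlite3

def build_staging(conn):
    # Operational columns (pat_id, pat_name) plus the three staging additions.
    conn.execute("""
        CREATE TABLE stg_patient (
            stg_key         INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            pat_id          TEXT NOT NULL,   -- operational key (hypothetical)
            pat_name        TEXT,            -- operational attribute (hypothetical)
            effective_date  TEXT NOT NULL,
            expiration_date TEXT             -- NULL marks the current version (assumption)
        )""")

def load_version(conn, pat_id, pat_name, load_date):
    """Insert a new version only when the operational row changed."""
    row = conn.execute(
        "SELECT stg_key, pat_name FROM stg_patient "
        "WHERE pat_id = ? AND expiration_date IS NULL", (pat_id,)).fetchone()
    if row is not None:
        if row[1] == pat_name:
            return  # unchanged, so history stays as it is
        # Close out the old version before adding the new one.
        conn.execute("UPDATE stg_patient SET expiration_date = ? WHERE stg_key = ?",
                     (load_date, row[0]))
    conn.execute(
        "INSERT INTO stg_patient (pat_id, pat_name, effective_date) VALUES (?, ?, ?)",
        (pat_id, pat_name, load_date))

conn = sqlite3.connect(":memory:")
build_staging(conn)
load_version(conn, "P1", "Ann Smith", "2017-01-01")
load_version(conn, "P1", "Ann Smith", "2017-01-02")  # no change, no new row
load_version(conn, "P1", "Ann Jones", "2017-02-01")  # change, new version
history = conn.execute(
    "SELECT pat_name, effective_date, expiration_date "
    "FROM stg_patient ORDER BY stg_key").fetchall()
print(history)
```

With this layout, the state of the operational data on any given day can be recovered by filtering on the effective and expiration dates.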
  • 5. BUS ARCHITECTURE
    • Definition: A data warehouse with dimensions that are consistently defined across facts
    • 200+ dimensions; 100+ facts; Kimball approach
    • Fully commented data model with naming standards
    • We created our own DW schema
      ◦ This project preceded Epic’s DW schema
    • DW has strong correspondence to Clarity
      ◦ Little abstraction
  • 6. PARTIAL BUS ARCHITECTURE
    Dimension columns: Acct, Appt, BenP, Carr, CosC, Covr, Date, Dept, Diag, DRG, Enc, Med, Pat, Payr, Phar, Prob, Proc, Prov, RevC, Unav, Vend
    Fact rows (X marks as extracted; their column positions are not preserved):
    Account_Coverage   X X X X X
    Account_Diagnosis  X X X X X
    Account_Procedure  X X X X X X
    Account_Summary    X X X X X X X X X X X X
    Billing_Tx_Detail  X X X X X X X X X X
    Encounter          X X X X X X
  • 7. PARTIAL BUS ARCHITECTURE (CONT.)
    Dimension columns: Acct, Appt, BenP, Carr, CosC, Covr, Date, Dept, Diag, DRG, Enc, Med, Pat, Payr, Phar, Prob, Proc, Prov, RevC, Unav, Vend
    Fact rows (X marks as extracted; their column positions are not preserved):
    Order_Medication  X X X X X X
    Order_Procedure   X X X X X X X
    Patient_Appt      X X X X X X
    Provider_Avail    X X X X X X
    Readmission       X X X X X X X X X X
    Referral          X X X X X X X X X
  • 8. SAMPLE DIMENSIONS
    • Account
    • Appointment
    • Benefit_Plan
    • Carrier
    • Cost_Center
    • Coverage
    • Date
    • Department
    • Diagnosis
    • DRG
    • Encounter
  • 9. SAMPLE DIMENSIONS (CONT.)
    • Medication
    • Patient
    • Payor
    • Pharmacy
    • Problem_List
    • Procedure
    • Provider
    • Revenue_Code
    • Unavailable_Reason
    • Vendor
  • 10. SAMPLE FACTS
    • Account_Coverage – patient account and coverage data
    • Account_Diagnosis – data about diagnoses
    • Account_Procedure – procedures for an account
    • Account_Summary – an accumulating snapshot fact
    • Billing_Transaction_Detail – hospital billing transactions
    • Encounter – the current state of a patient encounter
    • Order_Medication – data for medications
    • Order_Procedure – data for procedures and lab orders
    • Patient_Appointment – data for appointments
    • Provider_Availability – time slots for a provider’s schedule
    • Readmission – data for hospital readmission per ACA
    • Referral – a patient handoff from one provider to another
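To make the Readmission fact more concrete, here is a heavily simplified sketch of how a readmission flag might be derived with SQL. The slides do not state the rule that was used, so everything here is an assumption: the 30-day window, the hypothetical `admission` table and its columns, and the choice to compare each admission with the patient's most recent prior discharge. Real readmission measures also apply exclusions (for example planned readmissions) that this sketch ignores. It runs on Python's sqlite3, not Netezza.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical inpatient admissions; dates are ISO-8601 text.
    CREATE TABLE admission (
        enc_id INTEGER PRIMARY KEY,
        pat_id TEXT NOT NULL,
        admit_date TEXT NOT NULL,
        discharge_date TEXT NOT NULL
    );
    INSERT INTO admission VALUES
        (1, 'P1', '2017-01-01', '2017-01-05'),
        (2, 'P1', '2017-01-20', '2017-01-22'),  -- 15 days after discharge of enc 1
        (3, 'P1', '2017-03-15', '2017-03-16'),  -- 52 days after discharge of enc 2
        (4, 'P2', '2017-02-01', '2017-02-03'),
        (5, 'P2', '2017-03-10', '2017-03-11');  -- 35 days after discharge of enc 4
""")

READMIT_SQL = """
    WITH with_prior AS (
        SELECT a.enc_id, a.admit_date,
               (SELECT MAX(p.discharge_date)
                  FROM admission AS p
                 WHERE p.pat_id = a.pat_id
                   AND p.enc_id <> a.enc_id
                   AND p.discharge_date <= a.admit_date) AS prior_discharge
        FROM admission AS a
    )
    SELECT enc_id,
           CAST(julianday(admit_date) - julianday(prior_discharge) AS INTEGER)
               AS days_since_discharge
    FROM with_prior
    WHERE prior_discharge IS NOT NULL
      AND julianday(admit_date) - julianday(prior_discharge) <= ?
    ORDER BY enc_id
"""

def find_readmissions(conn, window_days=30):
    """Return (enc_id, days_since_discharge) for admissions inside the window."""
    return conn.execute(READMIT_SQL, (window_days,)).fetchall()

readmits = find_readmissions(conn)
print(readmits)
```

Only encounter 2 is flagged with the default window; widening `window_days` to 60 would also flag encounters 3 and 5.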
  • 11. SAMPLE SUBJECT AREA
  • 12. SAMPLE SUBJECT AREA
  • 13. SAMPLE SUBJECT AREA
  • 14. STAFFING BREAKDOWN
    • 4 managers
    • 2 data modelers + advanced tools/techniques
    • 6 ETL
    • 5 business analysts
    • 1 DBA
    • 1 metadata
    • 2 secretary
    • 2 legacy software
  • 15. RETROSPECTIVE
    • Too much of “build it and they will come”
    • There were errors in staging data!
    • There was code review for ETL scripts
      ◦ I’m not confident that the ETL scripts were correct
      ◦ We should have used SQL to validate ETL scripts
    • Agile analytics
      ◦ We had complex SQL for calculating readmissions
      ◦ The SQL was validated as correct
      ◦ The legacy software was wrong!
    • Issue of Clarity rollup vs DW rollup
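One way to read "We should have used SQL to validate ETL scripts" is as a reconciliation query that compares what is current in staging with what landed in the warehouse. The sketch below is an assumption about what such a check could look like, not the project's actual validation. The tables `stg_billing_tx` and `fact_billing_tx`, their columns, and the helper `reconcile` are hypothetical, and it runs on Python's sqlite3 rather than Netezza.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical staging table (with history) and warehouse fact table.
    CREATE TABLE stg_billing_tx (tx_id INTEGER, amount_cents INTEGER, expiration_date TEXT);
    CREATE TABLE fact_billing_tx (tx_id INTEGER, amount_cents INTEGER);
    INSERT INTO stg_billing_tx VALUES
        (1, 10000, NULL),
        (2,  2550, NULL),
        (3,  7000, '2017-01-31'),  -- superseded version of tx 3
        (3,  7500, NULL);          -- current version of tx 3
    -- Simulated ETL defect: tx 3 never reached the fact table.
    INSERT INTO fact_billing_tx VALUES (1, 10000), (2, 2550);
""")

def reconcile(conn):
    """Compare current staging rows with the fact table by count, total, and key."""
    src = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount_cents), 0) "
        "FROM stg_billing_tx WHERE expiration_date IS NULL").fetchone()
    tgt = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount_cents), 0) "
        "FROM fact_billing_tx").fetchone()
    missing = [row[0] for row in conn.execute("""
        SELECT s.tx_id
        FROM stg_billing_tx AS s
        LEFT JOIN fact_billing_tx AS f ON f.tx_id = s.tx_id
        WHERE s.expiration_date IS NULL AND f.tx_id IS NULL
        ORDER BY s.tx_id""")]
    return {
        "row_counts": (src[0], tgt[0]),   # (staging, warehouse)
        "totals": (src[1], tgt[1]),       # (staging, warehouse), in cents
        "missing_in_dw": missing,
    }

report = reconcile(conn)
print(report)
```

Counts and totals show that something is off, and the anti-join names the affected keys, which is the part a code review of the ETL script alone would not reveal.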
  • 16. THANKS!
    Any questions? You can find me at:
    @michaelrblaha
    blaha@computer.org
    www.superdataguy.com