The Four Essential Zones of a
Healthcare Data Lake
© 2016 Health Catalyst
Proprietary. Feel free to share but we would appreciate a Health Catalyst citation.
The Four Zones of a Healthcare Data Lake
Health Catalyst has published articles describing
early- and late-binding data warehouse
architectures, comparing data lakes to data
warehouses, and explaining how health systems
can leverage unique data lake functions within
their existing analytic platforms.
The evolving healthcare data environment
created the need for data lakes, but they are a
significant IT investment.
Understanding the relationship between an
enterprise data warehouse (EDW) and a data
lake, and its zones, is fundamental to investing
in the right technology with the appropriate
financial and human resources.
Why a Data Lake Is Necessary
In healthcare today, outcomes improvement efforts are fueled by limited
information, primarily healthcare encounter data (Figure 1).
Figure 1: The human health data ecosystem is large, though we use
very little of it for improving outcomes.
To see more of the picture, bring it into focus,
and understand what really impacts outcomes,
we need genomic and familial data, outcomes
data, 7×24 biometric data, consumer data, and
socio-economic data.
The complete ecosystem of data necessary for
massive outcomes improvements will increase
the total amount of healthcare data tenfold.
According to a 2014 IDC report, the healthcare
digital universe is growing 48 percent per year.
In 2013, the industry generated 4.4 zettabytes
(1 zettabyte = 10^21 bytes) of data. By 2020, it will generate
44 zettabytes.
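The figures above can be checked with compound-growth arithmetic. This is a minimal sketch, assuming growth compounds once per year over the seven years from 2013 to 2020; the function names are illustrative and not from the IDC report.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `rate` once per year for `years` years."""
    return start * (1 + rate) ** years


# Figures quoted on the slide, in zettabytes.
rate_for_tenfold = implied_annual_growth(4.4, 44.0, 7)
volume_at_48_percent = project(4.4, 0.48, 7)

print(f"Implied yearly growth for 4.4 -> 44 ZB: {rate_for_tenfold:.1%}")
print(f"4.4 ZB growing 48% per year for 7 years: {volume_at_48_percent:.1f} ZB")
```

Under these assumptions, a tenfold increase over seven years corresponds to roughly 39 percent per year, while 48 percent per year compounds to roughly 68 zettabytes, so the two quoted figures may describe different measurements.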
Unfortunately, this data volume would explode
the data warehouse of most organizations.
Fortunately, a data lake can handle this volume.
The Benefits of a Data Lake
The benefits of a data lake as a supplement to an EDW are numerous in
terms of scale, schema, processing workloads, data accessibility, data
complexity, and data usability:
• A data lake, typically designed using Apache Hadoop, is the preferred choice for larger
structured and unstructured datasets coming from multiple internal and external sources,
such as radiology, physician notes, and claims. This removes data silos.
• A data lake doesn’t demand definitions on the data it ingests. The data can be refined once
the questions are known.
• A data lake offers great flexibility on the tools and technology used to run queries. These
benefits are instrumental to socializing data access and developing a data-driven culture
across the organization.
• A data lake is prepared for the future of healthcare data with the ability to integrate patient
data from implanted monitors and wearable fitness devices.
The Data Lake’s Strength Leads to a Weakness
A data lake can scale to petabytes of both
structured and unstructured data and can
ingest data at a variety of speeds, from batch
to real-time.
Unfortunately, these capabilities have led to a
negative side effect.
Gartner’s hype cycle for 2017 shows that data
lakes have passed the “peak of inflated
expectations” and have started the slide into the
“trough of disillusionment.”
This isn’t surprising. Often, an industry develops
a concept thinking it will solve world hunger,
then learns its real-life limitations.
Initially, data lakes were predicted to
solve all of healthcare’s outcomes
problems, but they have ended up just
collecting petabytes of data.
Now, data lake users see a lot of detritus
that can’t be used to build anything. The
data lake has become a data swamp.
Understanding and creating zones
within a data lake are the keys to
draining the swamp.
The Four Zones of a Data Lake
Data lake zones provide structural governance
for the assets in the data lake.
To define zones, Zaloni excerpts content from
the ebook, “Big Data: Data Science and
Advanced Analytics.”
The book’s authors write that “zones allow the
logical and/or physical separation of data that
keeps the environment secure, organized,
and agile.”
Zones are physically created through
“exclusive servers or clusters,” or virtually
created through “the deliberate structuring of
directories and access privileges.”
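The "virtual" approach quoted above, structuring directories and access privileges, can be sketched in a few lines. This is an illustration only: the root path, the role names, and the role-to-zone mapping are hypothetical assumptions, not a layout prescribed by the slides or by any particular platform.

```python
from pathlib import PurePosixPath

# Hypothetical lake root; the four zone names follow the slides.
LAKE_ROOT = PurePosixPath("/datalake")
ZONES = ("raw", "trusted", "refined", "sandbox")

# Hypothetical read privileges per role (an assumption for illustration).
READ_ACCESS = {
    "etl_developer": {"raw", "trusted", "refined", "sandbox"},
    "data_scientist": {"raw", "trusted", "refined", "sandbox"},
    "department_manager": {"trusted", "refined"},
}


def zone_path(zone: str, *parts: str) -> PurePosixPath:
    """Build a path inside one zone, e.g. /datalake/raw/emr/encounters."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return LAKE_ROOT.joinpath(zone, *parts)


def zone_of(path: PurePosixPath) -> str:
    """Return the zone a path belongs to (its first directory under the root)."""
    return path.relative_to(LAKE_ROOT).parts[0]


def can_read(role: str, path: PurePosixPath) -> bool:
    """Check whether a role may read a path, based only on the path's zone."""
    return zone_of(path) in READ_ACCESS.get(role, set())


print(zone_path("raw", "emr", "encounters"))
print(can_read("department_manager", zone_path("raw", "emr")))
```

Keeping the zone as the first directory level means a single rule per zone can govern everything beneath it, which is one way to read "deliberate structuring."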
Healthcare analytics architectures need a
data lake to collect the sheer volume of raw
data that comes in from the various
transactional source systems used in
healthcare (e.g., EMR data, billing data,
costing data, ERP data, etc.).
Data then populates into various zones
within the data lake.
To effectively allocate resources for building
and managing the data lake, it helps to define
each zone, understand their relationships
with one another, know the types of data
stored in each zone, and identify each
zone’s typical user.
Data lakes are divided into four zones (Figure 2).
Organizations may label
these zones differently
according to individual
or industry preference,
but their functions are
essentially the same.
Figure 2: Data lake zones
The Raw Data Zone
In the raw zone, data is stored in its native
format, without transformation or binding to
any business rules.
Often the only organization or structure
added in this layer is outlining what data
came from what source system.
Health Catalyst calls those areas in the
raw zone source marts. Though all data
starts in the raw zone, it’s too vast a
landscape for less technical users.
Typical users include ETL developers,
data stewards, data analysts, and data
scientists, who are defined by their ability
to derive new knowledge and insights
amid vast amounts of data.
This user base tends to be small and
spends a lot of time sifting through data,
then pushing it into other zones.
The Trusted Data Zone
Source data is ingested into the EDW,
then used to build shared data marts in the
trusted data zone.
Terminology is standardized at this point
(e.g., RxNorm, SNOMED, etc.). The
trusted data zone holds data that serves
as universal truth across the organization.
A broader group of people has applied
extensive governance to this data, which
has more comprehensive definitions that
the entire organization can stand behind.
Trusted data could include building blocks,
such as the number of ED visits in a
certain period, inpatient admission rates
from one year to the next, or the number of
members in risk-based contracts.
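As an illustration of one such building block, the sketch below counts ED visits in a period. The record layout, field names, and sample rows are hypothetical; a trusted-zone definition would be agreed on by the organization rather than hard-coded like this.

```python
from datetime import date

# Hypothetical encounter records; field names are assumptions for illustration.
encounters = [
    {"encounter_id": 1, "type": "ED", "date": date(2016, 1, 5)},
    {"encounter_id": 2, "type": "inpatient", "date": date(2016, 1, 9)},
    {"encounter_id": 3, "type": "ED", "date": date(2016, 1, 20)},
    {"encounter_id": 4, "type": "ED", "date": date(2016, 2, 2)},
]


def count_ed_visits(records, start: date, end: date) -> int:
    """Count ED encounters whose date falls in the inclusive range [start, end]."""
    return sum(
        1 for r in records
        if r["type"] == "ED" and start <= r["date"] <= end
    )


january_ed_visits = count_ed_visits(encounters, date(2016, 1, 1), date(2016, 1, 31))
print(january_ed_visits)
```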
The Refined Data Zone
Meaning is applied to raw data so it can be
integrated into a common format and used by
specific lines of business.
Data in the refined zone is grouped into Subject
Area Marts (SAMs, often referred to as data marts).
A department manager looking for end-of-month
numbers would query a SAM rather than the EDW.
SAMs are the source of truth for specific domains.
They take subsets of data from the larger pool and
add value that’s meaningful to finance, clinical,
operations, supply chain, or other areas.
Refined data is used by a broad group of
people, but is not yet blessed by everyone in
the organization.
In other words, people beyond specific subject
areas may not be able to derive meaning from
refined data.
A SAM gets promoted to the trusted zone
when the definitions applied to its data
elements have broadened to a much larger
group of people.
The Sandbox Data Zone
Anyone can decide to move data from
the raw, trusted, or refined zones into the
sandbox data zone.
Here, data from all of these zones can
be morphed for private use.
Once sandbox information has been
vetted, it is promoted for broader use
in the refined data zone.
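The movement between zones described on these slides can be sketched as a small set of rules: data may be copied into the sandbox from the raw, trusted, or refined zones, and vetted data is promoted from sandbox to refined and from refined to trusted. The class and function names are hypothetical, and the sketch covers only these moves, not every path data can take through a lake.

```python
from dataclasses import dataclass, replace

# Moves described on the slides; everything else is left out of this sketch.
COPY_TO_SANDBOX_FROM = {"raw", "trusted", "refined"}
PROMOTIONS = {"sandbox": "refined", "refined": "trusted"}


@dataclass(frozen=True)
class Dataset:
    name: str
    zone: str


def copy_to_sandbox(ds: Dataset) -> Dataset:
    """Anyone may copy data from raw, trusted, or refined into the sandbox."""
    if ds.zone not in COPY_TO_SANDBOX_FROM:
        raise ValueError(f"cannot copy from zone: {ds.zone}")
    return replace(ds, zone="sandbox")


def promote(ds: Dataset, vetted: bool) -> Dataset:
    """Promote one step (sandbox -> refined -> trusted), but only once vetted."""
    if ds.zone not in PROMOTIONS:
        raise ValueError(f"no promotion defined from zone: {ds.zone}")
    if not vetted:
        raise PermissionError("dataset must be vetted before promotion")
    return replace(ds, zone=PROMOTIONS[ds.zone])


experiment = copy_to_sandbox(Dataset("appendectomy_los", "raw"))
sam = promote(experiment, vetted=True)    # now in the refined zone
shared = promote(sam, vetted=True)        # now in the trusted zone
print(experiment.zone, sam.zone, shared.zone)
```

In practice the `vetted` flag would stand for a governance decision made by a progressively broader group of people at each step.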
Zones and Their Data Definitions
For an example of the data type in each
zone, consider length of stay (LOS).
There are dozens of ways to define LOS
using ED presentation time, admit time,
registration time, cut time, post-
observation time, and discharge time.
The clinical definition of LOS for an
appendectomy may be from cut time to
discharge time, but the corporate
definition may be from admit time to
discharge time.
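The two definitions above can be made concrete with a short sketch. The timestamps and field names are invented for illustration; only the choice of start and end events comes from the slide.

```python
from datetime import datetime, timedelta

# Hypothetical timestamps for one appendectomy stay.
stay = {
    "admit_time": datetime(2016, 3, 1, 8, 0),
    "cut_time": datetime(2016, 3, 1, 14, 0),
    "discharge_time": datetime(2016, 3, 3, 8, 0),
}

# Each LOS definition is just a (start event, end event) pair.
LOS_DEFINITIONS = {
    "clinical_appendectomy": ("cut_time", "discharge_time"),
    "corporate": ("admit_time", "discharge_time"),
}


def length_of_stay(stay: dict, definition: str) -> timedelta:
    """Compute LOS for one stay under the named definition."""
    start_field, end_field = LOS_DEFINITIONS[definition]
    return stay[end_field] - stay[start_field]


for name in LOS_DEFINITIONS:
    hours = length_of_stay(stay, name) / timedelta(hours=1)
    print(f"{name}: {hours:.0f} hours")
```

The same stay yields 42 hours under the clinical definition and 48 hours under the corporate one, which is why a definition has to be agreed on before a measure can be shared.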
A SAM that focuses on appendectomy
might choose to use the clinical
definition, which differs from the
global definition (i.e., the definition in the
trusted zone).
For an individual SAM definition of LOS
to be promoted to the trusted zone, it
needs to be vetted through a broader
group of people to confirm it has
universal application.
Directors who have financial responsibility
over a single line of business may need to
evaluate their department’s productivity.
They may need to see things a certain way,
such as excluding corporate overhead, over
which they have no control.
This is what makes the SAM more specific
to one area. The data definition has been
vetted and agreed to by a group of people,
though it has yet to reach global agreement.
The Right Technology for the Right Zone
Different technology can run on top of
different zones in a data lake. The data lake
itself typically runs on Hadoop, which is
optimal for handling huge data volumes.
Relational databases such as SQL Server are
more user friendly and will provide data to a
larger user base.
SQL queries can run on top of Hadoop to
produce data marts and SAMs in the trusted
and refined zones.
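To show the idea of a SQL query producing a SAM, the sketch below uses Python's built-in sqlite3 module purely as a stand-in for a SQL-on-Hadoop engine or SQL Server. The table, view, and column names are hypothetical, and the overhead exclusion mirrors the department-director example from an earlier slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real platform

conn.executescript("""
-- Hypothetical raw-zone table, loaded as-is from a costing source system.
CREATE TABLE raw_costs (department TEXT, category TEXT, amount REAL);
INSERT INTO raw_costs VALUES
  ('radiology', 'labor', 1000.0),
  ('radiology', 'supplies', 250.0),
  ('radiology', 'corporate_overhead', 400.0),
  ('surgery', 'labor', 2000.0);

-- Hypothetical refined-zone SAM: one department's costs, with corporate
-- overhead excluded because the department has no control over it.
CREATE VIEW sam_radiology_costs AS
  SELECT category, SUM(amount) AS total
  FROM raw_costs
  WHERE department = 'radiology'
    AND category <> 'corporate_overhead'
  GROUP BY category;
""")

rows = conn.execute(
    "SELECT category, total FROM sam_radiology_costs ORDER BY category"
).fetchall()
print(rows)
```

The raw table is left untouched; the department-specific meaning lives entirely in the view definition, which is what would later be vetted for promotion.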
Hortonworks refers to a Connected Data
Architecture, in which “data pools need to
ensure that connected data can flow freely to
the place where it is optimal for the business
to get value from it.”
Zones may not live on the same data
technology. Much of the data will live in a data
lake, but more refined zones may have a
portion of their data that resides in an EDW or
smaller data marts.
Data Lakes Are Integral to a Larger Operating System
Earlier, we said that huge data volumes have turned
data lakes into data swamps, which is remedied
through a larger healthcare analytics ecosystem.
Some, or all, of a data operating system can be
deployed over the top of any healthcare data lake.
The Health Catalyst® Data Operating System
(DOS™) (Figure 3) can index, catalog,
analyze, and provide insights from the terabytes,
and growing, of data assets in a health system. It gives
health system leaders the knowledge they need
to produce massive outcomes improvements. Those leaders include:
• IT departments
• clinicians
• population health managers
• financial leaders
Figure 3: The Health Catalyst Data Operating System.
DOS enables a data lake to be built with
the required governance and meaning
added to the data so it is easily
organized into the appropriate zones.
Data can then be used according to zone
by the various data consumers in a
health system.
DOS also allows data to be analyzed and
consumed by the Fabric Services layer to
accelerate the development of innovative
data-first applications.
The volume of healthcare data is
mushrooming, and data architectures
need to get ahead of the growth.
Vast volumes of data will continue to flow
into the EDW.
A data lake is required to make data
accessible to a subset of ETL developers,
data stewards, data analysts, and data
scientists.
Data lakes allow data to be moved into
various zones for experimentation and
research, or for customization into shared
data marts and SAMs.
To prevent data lakes from becoming mired
in the petabytes of data now swamping
healthcare, the new architecture presented
by the data operating system offers a
breakthrough in analytics engineering that
can renew the life of a data lake and
accommodate the big-bang growth of
healthcare data.
More about this topic
Link to original article for a more in-depth discussion.
The Four Essential Zones of a Healthcare Data Lake
What Is a Healthcare Data Lake and Why Do You Need One? Imagine a Supermarket
Imran Qureshi, Chief Software Development Officer
Data Lake vs. Data Warehouse: Which is Right for Healthcare?
Jared Crapo, Sales, Senior VP
Data Warehouse Tools: Faster Time-to-Value for Your Healthcare Data Warehouse
Doug Adamson, Chief Technology Officer, VP
Comparing the Three Major Approaches to Healthcare Data Warehousing: A Deep Dive
Review (White Paper) Steve Barlow, Senior VP of Client Operations and Co-Founder
The Health Catalyst Data Operating System (DOS™) Solution
Health Catalyst Solution
Bryan Hinton
Bryan joined Health Catalyst in February 2012. Before joining the Catalyst team, he spent
six years with Intel and four years with The Church of Jesus Christ of Latter-Day
Saints. While at Intel, Bryan was on teams responsible for Intel's factory reporting systems
and equipment maintenance prediction.
At the LDS Church, he led the .NET Development Center of Excellence and was responsible for the
Application Lifecycle Management (ALM) processes and tools used for development at the Church.
Bryan graduated from Brigham Young University with a degree in Computer Science.

Weitere ähnliche Inhalte

Was ist angesagt?

The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star Schema
DATAVERSITY
 
Sistemas de base de datos vs sistemas de archivos
Sistemas de base de datos vs sistemas de archivosSistemas de base de datos vs sistemas de archivos
Sistemas de base de datos vs sistemas de archivos
Universidad de Panamá
 

Was ist angesagt? (20)

Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Data Modeling on Azure for Analytics
Data Modeling on Azure for AnalyticsData Modeling on Azure for Analytics
Data Modeling on Azure for Analytics
 
Accelerate and modernize your data pipelines
Accelerate and modernize your data pipelinesAccelerate and modernize your data pipelines
Accelerate and modernize your data pipelines
 
Building a Federated Data Directory Platform for Public Health
Building a Federated Data Directory Platform for Public HealthBuilding a Federated Data Directory Platform for Public Health
Building a Federated Data Directory Platform for Public Health
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
Netezza vs Teradata vs Exadata
Netezza vs Teradata vs ExadataNetezza vs Teradata vs Exadata
Netezza vs Teradata vs Exadata
 
Customer-Centric Data Management for Better Customer Experiences
Customer-Centric Data Management for Better Customer ExperiencesCustomer-Centric Data Management for Better Customer Experiences
Customer-Centric Data Management for Better Customer Experiences
 
Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Batch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing DifferenceBatch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing Difference
 
Teradata
TeradataTeradata
Teradata
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
 
The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star Schema
 
Analisys services 2005 cubos olap con o sin data warehouse
Analisys services 2005 cubos olap con o sin data warehouseAnalisys services 2005 cubos olap con o sin data warehouse
Analisys services 2005 cubos olap con o sin data warehouse
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Sistemas de base de datos vs sistemas de archivos
Sistemas de base de datos vs sistemas de archivosSistemas de base de datos vs sistemas de archivos
Sistemas de base de datos vs sistemas de archivos
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 

Ähnlich wie The Four Essential Zones of a Healthcare Data Lake

Big Data in Healthcare Made Simple Where It Stands Today and Where .pdf
Big Data in Healthcare Made Simple Where It Stands Today and Where .pdfBig Data in Healthcare Made Simple Where It Stands Today and Where .pdf
Big Data in Healthcare Made Simple Where It Stands Today and Where .pdf
annamalaiagencies
 
Healthcare Information Systems - Past, Present, and Future
Healthcare Information Systems - Past, Present, and FutureHealthcare Information Systems - Past, Present, and Future
Healthcare Information Systems - Past, Present, and Future
Health Catalyst
 

Ähnlich wie The Four Essential Zones of a Healthcare Data Lake (20)

Healthcare Data Warehouse Models Explained
Healthcare Data Warehouse Models ExplainedHealthcare Data Warehouse Models Explained
Healthcare Data Warehouse Models Explained
 
Healthcare Analytics Platform: DOS Delivers the 7 Essential Components
Healthcare Analytics Platform: DOS Delivers the 7 Essential ComponentsHealthcare Analytics Platform: DOS Delivers the 7 Essential Components
Healthcare Analytics Platform: DOS Delivers the 7 Essential Components
 
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingBig Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
 
Seven Ways DOS™ Simplifies the Complexities of Healthcare IT
Seven Ways DOS™ Simplifies the Complexities of Healthcare ITSeven Ways DOS™ Simplifies the Complexities of Healthcare IT
Seven Ways DOS™ Simplifies the Complexities of Healthcare IT
 
Big Data Analytics in Hospitals By Dr.Mahboob ali khan Phd
Big Data Analytics in Hospitals By Dr.Mahboob ali khan PhdBig Data Analytics in Hospitals By Dr.Mahboob ali khan Phd
Big Data Analytics in Hospitals By Dr.Mahboob ali khan Phd
 
Big Data in Healthcare Made Simple Where It Stands Today and Where .pdf
Big Data in Healthcare Made Simple Where It Stands Today and Where .pdfBig Data in Healthcare Made Simple Where It Stands Today and Where .pdf
Big Data in Healthcare Made Simple Where It Stands Today and Where .pdf
 
The Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of HealthcareThe Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of Healthcare
 
The Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of HealthcareThe Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of Healthcare
 
Data Lake vs. Data Warehouse: Which is Right for Healthcare?
Data Lake vs. Data Warehouse: Which is Right for Healthcare?Data Lake vs. Data Warehouse: Which is Right for Healthcare?
Data Lake vs. Data Warehouse: Which is Right for Healthcare?
 
Aiding Analytics Adoption Via Metadata-Driven Architecture: If You Build It, ...
Aiding Analytics Adoption Via Metadata-Driven Architecture: If You Build It, ...Aiding Analytics Adoption Via Metadata-Driven Architecture: If You Build It, ...
Aiding Analytics Adoption Via Metadata-Driven Architecture: If You Build It, ...
 
The Biggest Barriers to Healthcare Interoperability
The Biggest Barriers to Healthcare InteroperabilityThe Biggest Barriers to Healthcare Interoperability
The Biggest Barriers to Healthcare Interoperability
 
Healthcare Interoperability: New Tactics and Technology
Healthcare Interoperability: New Tactics and TechnologyHealthcare Interoperability: New Tactics and Technology
Healthcare Interoperability: New Tactics and Technology
 
Database vs Data Warehouse: A Comparative Review
Database vs Data Warehouse: A Comparative ReviewDatabase vs Data Warehouse: A Comparative Review
Database vs Data Warehouse: A Comparative Review
 
Is That Data Valid? Getting Accurate Financial Data in Healthcare
Is That Data Valid? Getting Accurate Financial Data in HealthcareIs That Data Valid? Getting Accurate Financial Data in Healthcare
Is That Data Valid? Getting Accurate Financial Data in Healthcare
 
Eight Reasons Why Chief Data Officers Will Help Healthcare Organizations Thri...
Eight Reasons Why Chief Data Officers Will Help Healthcare Organizations Thri...Eight Reasons Why Chief Data Officers Will Help Healthcare Organizations Thri...
Eight Reasons Why Chief Data Officers Will Help Healthcare Organizations Thri...
 
Data lake ppt
Data lake pptData lake ppt
Data lake ppt
 
Healthcare data's perfect storm
Healthcare data's perfect stormHealthcare data's perfect storm
Healthcare data's perfect storm
 
Optimising Data Lakes for Financial Services
Optimising Data Lakes for Financial ServicesOptimising Data Lakes for Financial Services
Optimising Data Lakes for Financial Services
 
Healthcare Information Systems - Past, Present, and Future
Healthcare Information Systems - Past, Present, and FutureHealthcare Information Systems - Past, Present, and Future
Healthcare Information Systems - Past, Present, and Future
 
Innovative Healthcare Partnerships: Making the Most of Merging Resources and ...
Innovative Healthcare Partnerships: Making the Most of Merging Resources and ...Innovative Healthcare Partnerships: Making the Most of Merging Resources and ...
Innovative Healthcare Partnerships: Making the Most of Merging Resources and ...
 

Mehr von Health Catalyst

Mehr von Health Catalyst (20)

Looking Ahead: Market Trends Impacting Key Healthcare Issues
Looking Ahead: Market Trends Impacting Key Healthcare IssuesLooking Ahead: Market Trends Impacting Key Healthcare Issues
Looking Ahead: Market Trends Impacting Key Healthcare Issues
 
2024 HCAT Healthcare Technology Insights
2024 HCAT Healthcare Technology Insights2024 HCAT Healthcare Technology Insights
2024 HCAT Healthcare Technology Insights
 
Three Keys to a Successful Margin: Charges, Costs, and Labor
Three Keys to a Successful Margin: Charges, Costs, and LaborThree Keys to a Successful Margin: Charges, Costs, and Labor
Three Keys to a Successful Margin: Charges, Costs, and Labor
 
2024 CPT® Updates (Professional Services Focused) - Part 3
2024 CPT® Updates (Professional Services Focused) - Part 32024 CPT® Updates (Professional Services Focused) - Part 3
2024 CPT® Updates (Professional Services Focused) - Part 3
 
2024 CPT® Code Updates (HIM Focused) - Part 2
2024 CPT® Code Updates (HIM Focused) - Part 22024 CPT® Code Updates (HIM Focused) - Part 2
2024 CPT® Code Updates (HIM Focused) - Part 2
 
2024 CPT® Code Updates (CDM Focused) - Part 1
2024 CPT® Code Updates (CDM Focused) - Part 12024 CPT® Code Updates (CDM Focused) - Part 1
2024 CPT® Code Updates (CDM Focused) - Part 1
 
What’s Next for Hospital Price Transparency in 2024 and Beyond
What’s Next for Hospital Price Transparency in 2024 and BeyondWhat’s Next for Hospital Price Transparency in 2024 and Beyond
What’s Next for Hospital Price Transparency in 2024 and Beyond
 
Automated Patient Reported Outcomes (PROs) for Hip & Knee Replacement
Automated Patient Reported Outcomes (PROs) for Hip & Knee ReplacementAutomated Patient Reported Outcomes (PROs) for Hip & Knee Replacement
Automated Patient Reported Outcomes (PROs) for Hip & Knee Replacement
 
2024 Medicare Physician Fee Schedule (MPFS) Final Rule Updates
2024 Medicare Physician Fee Schedule (MPFS) Final Rule Updates2024 Medicare Physician Fee Schedule (MPFS) Final Rule Updates
2024 Medicare Physician Fee Schedule (MPFS) Final Rule Updates
 
What's Next for OPPS: A Look at the 2024 Final Rule
What's Next for OPPS: A Look at the 2024 Final RuleWhat's Next for OPPS: A Look at the 2024 Final Rule
What's Next for OPPS: A Look at the 2024 Final Rule
 
Insight into the 2024 ICD-10 PCS Updates - Part 2
Insight into the 2024 ICD-10 PCS Updates - Part 2Insight into the 2024 ICD-10 PCS Updates - Part 2
Insight into the 2024 ICD-10 PCS Updates - Part 2
 
Vitalware Insight Into the 2024 ICD10 CM Updates.pdf
Vitalware Insight Into the 2024 ICD10 CM Updates.pdfVitalware Insight Into the 2024 ICD10 CM Updates.pdf
Vitalware Insight Into the 2024 ICD10 CM Updates.pdf
 
Driving Value: Boosting Clinical Registry Value Using ARMUS Solutions
Driving Value: Boosting Clinical Registry Value Using ARMUS SolutionsDriving Value: Boosting Clinical Registry Value Using ARMUS Solutions
Driving Value: Boosting Clinical Registry Value Using ARMUS Solutions
 
Tech-Enabled Managed Services: Not Your Average Outsourcing
Tech-Enabled Managed Services: Not Your Average OutsourcingTech-Enabled Managed Services: Not Your Average Outsourcing
Tech-Enabled Managed Services: Not Your Average Outsourcing
 
2023 Mid-Year CPT/HCPCS Code Set Updates
2023 Mid-Year CPT/HCPCS Code Set Updates2023 Mid-Year CPT/HCPCS Code Set Updates
2023 Mid-Year CPT/HCPCS Code Set Updates
 
How Managing Chronic Conditions Is Streamlined with Digital Technology
How Managing Chronic Conditions Is Streamlined with Digital TechnologyHow Managing Chronic Conditions Is Streamlined with Digital Technology
How Managing Chronic Conditions Is Streamlined with Digital Technology
 
COVID-19: After the Public Health Emergency Ends
COVID-19: After the Public Health Emergency EndsCOVID-19: After the Public Health Emergency Ends
COVID-19: After the Public Health Emergency Ends
 
Automated Medication Compliance Tools for the Provider and Patient
Automated Medication Compliance Tools for the Provider and PatientAutomated Medication Compliance Tools for the Provider and Patient
Automated Medication Compliance Tools for the Provider and Patient
 
A Facility-Focused Guide to Applying Modifiers Corectly.pptx
A Facility-Focused Guide to Applying Modifiers Corectly.pptxA Facility-Focused Guide to Applying Modifiers Corectly.pptx
A Facility-Focused Guide to Applying Modifiers Corectly.pptx
 
Self-Service Analytics: How to Use Healthcare Business Intelligence
Self-Service Analytics: How to Use Healthcare Business IntelligenceSelf-Service Analytics: How to Use Healthcare Business Intelligence
Self-Service Analytics: How to Use Healthcare Business Intelligence
 



The Four Essential Zones of a Healthcare Data Lake

  • 1. The Four Essential Zones of a Healthcare Data Lake
  • 2. © 2016 Health Catalyst Proprietary. Feel free to share but we would appreciate a Health Catalyst citation. The Four Zones of a Healthcare Data Lake: Health Catalyst has published articles describing early- and late-binding data warehouse architectures, comparing data lakes to data warehouses, and explaining how health systems can leverage unique data lake functions within their existing analytic platforms. The evolving healthcare data environment created the need for data lakes, but they are a significant IT investment. Understanding the relationship between an enterprise data warehouse (EDW) and a data lake, along with the lake’s zones, is fundamental to investing in the right technology with the appropriate financial and human resources.
  • 3. Why a Data Lake Is Necessary: In healthcare today, outcomes improvement efforts are fueled by limited information, primarily healthcare encounter data (Figure 1). Figure 1: The human health data ecosystem is large, though we use very little of it for improving outcomes.
  • 4. To see more of the picture, bring it into focus, and understand what really impacts outcomes, we need genomic and familial data, outcomes data, 7×24 biometric data, consumer data, and socioeconomic data. The complete ecosystem of data necessary for massive outcomes improvements will increase the total amount of healthcare data tenfold.
  • 5. According to a 2014 IDC report, the healthcare digital universe is growing 48 percent per year. In 2013, the industry generated 4.4 zettabytes of data (1 zettabyte = 10^21 bytes). By 2020, it will generate 44 zettabytes. Unfortunately, this data volume would overwhelm the data warehouses of most organizations. Fortunately, a data lake can handle this volume.
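The growth figures quoted above can be sanity-checked with compound-growth arithmetic. The following is a minimal sketch; the function names are illustrative and not part of the original deck. It computes the annual rate implied by growing from 4.4 to 44 zettabytes between 2013 and 2020, and what 48 percent annual growth would yield over the same seven years.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate needed to go from start to end."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `rate` annually for `years` years."""
    return start * (1 + rate) ** years


if __name__ == "__main__":
    # Figures quoted in the slide: 4.4 ZB in 2013, 44 ZB by 2020.
    rate = implied_annual_growth(4.4, 44.0, 2020 - 2013)
    print(f"Implied annual growth: {rate:.1%}")
    # The slide also quotes 48 percent growth per year.
    print(f"4.4 ZB at 48%/yr for 7 years: {project(4.4, 0.48, 7):.1f} ZB")
```

Tenfold growth over seven years works out to about 39 percent per year, while 48 percent compounded over seven years gives roughly 68 zettabytes, so the quoted rate and the quoted endpoints may be measured over different bases or periods.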
  • 6. The Benefits of a Data Lake: The benefits of a data lake as a supplement to an EDW are numerous in terms of scale, schema, processing workloads, data accessibility, data complexity, and data usability. A data lake, typically built on Apache Hadoop, is the preferred choice for larger structured and unstructured datasets coming from multiple internal and external sources, such as radiology, physician notes, and claims. This removes data silos. A data lake doesn’t demand definitions for the data it ingests. The data can be refined once the questions are known. A data lake offers great flexibility in the tools and technology used to run queries. These benefits are instrumental to socializing data access and developing a data-driven culture across the organization. A data lake is prepared for the future of healthcare data with the ability to integrate patient data from implanted monitors and wearable fitness devices.
  • 7. The Data Lake’s Strength Leads to a Weakness: A data lake can scale to petabytes of both structured and unstructured data and can ingest data at a variety of speeds, from batch to real-time. Unfortunately, these capabilities have led to a negative side effect. Gartner’s hype cycle for 2017 shows that data lakes have passed the “peak of inflated expectations” and have started the slide into the “trough of disillusionment.” This isn’t surprising. Often, an industry develops a concept thinking it will solve world hunger, then learns its real-life limitations.
  • 8. Initially, data lakes were predicted to solve all of healthcare’s outcomes problems, but they have ended up just collecting petabytes of data. Now, data lake users see a lot of detritus that can’t be used to build anything. The data lake has become a data swamp. Understanding and creating zones within a data lake are the keys to draining the swamp.
  • 9. The Four Zones of a Data Lake: Data lake zones provide structural governance for the assets in the data lake. To define zones, Zaloni excerpts content from the ebook, “Big Data: Data Science and Advanced Analytics.” The book’s authors write that “zones allow the logical and/or physical separation of data that keeps the environment secure, organized, and agile.” Zones are physically created through “exclusive servers or clusters,” or virtually created through “the deliberate structuring of directories and access privileges.”
  • 10. Healthcare analytics architectures need a data lake to collect the sheer volume of raw data that comes in from the various transactional source systems used in healthcare (e.g., EMR data, billing data, costing data, and ERP data). Data then populates the various zones within the data lake. To effectively allocate resources for building and managing the data lake, it helps to define each zone, understand the zones’ relationships with one another, know the types of data stored in each zone, and identify each zone’s typical user.
  • 11. Data lakes are divided into four zones (Figure 2). Organizations may label these zones differently according to individual or industry preference, but their functions are essentially the same. Figure 2: Data lake zones
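The "virtual" approach to zones quoted above, structuring directories and access privileges, can be sketched on an ordinary file system. This is a minimal illustration only: the directory names and POSIX permission modes are assumptions chosen for the example, and a real data lake would use the access-control mechanism of its own storage platform.

```python
import stat
import tempfile
from pathlib import Path

# Hypothetical zone names and POSIX modes (assumptions for illustration).
ZONE_MODES = {
    "raw": 0o750,      # small technical user base
    "refined": 0o755,  # readable by specific lines of business
    "trusted": 0o755,  # readable across the organization
    "sandbox": 0o700,  # private experimentation
}


def create_zones(root: Path) -> dict[str, Path]:
    """Create one directory per zone under root and apply its mode."""
    zones = {}
    for name, mode in ZONE_MODES.items():
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        path.chmod(mode)
        zones[name] = path
    return zones


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, path in create_zones(Path(tmp)).items():
            print(name, oct(stat.S_IMODE(path.stat().st_mode)))
```

Keeping the zone-to-privilege mapping in one table makes the governance decision explicit and easy to review, which is the point of separating zones in the first place.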
  • 12. The Raw Data Zone: In the raw zone, data is stored in its native format, without transformation or binding to any business rules. Often the only organization or structure added in this layer is a record of which data came from which source system. Health Catalyst calls those areas in the raw zone source marts. Though all data starts in the raw zone, it’s too vast a landscape for less technical users.
  • 13. Typical users of the raw zone include ETL developers, data stewards, data analysts, and data scientists, who are defined by their ability to derive new knowledge and insights amid vast amounts of data. This user base tends to be small and spends a lot of time sifting through data, then pushing it into other zones.
  • 14. The Trusted Data Zone: Source data is ingested into the EDW, then used to build shared data marts in the trusted data zone. Terminology is standardized at this point (e.g., RxNorm and SNOMED). The trusted data zone holds data that serves as universal truth across the organization. A broader group of people has applied extensive governance to this data, which has more comprehensive definitions that the entire organization can stand behind.
  • 15. Trusted data could include building blocks, such as the number of ED visits in a certain period, inpatient admission rates from one year to the next, or the number of members in risk-based contracts.
  • 16. The Refined Data Zone: Meaning is applied to raw data so it can be integrated into a common format and used by specific lines of business. Data in the refined zone is grouped into Subject Area Marts (SAMs, often referred to as data marts). A department manager looking for end-of-month numbers would query a SAM rather than the EDW. SAMs are the source of truth for specific domains. They take subsets of data from the larger pool and add value that’s meaningful to finance, clinical, operations, supply chain, or other areas.
  • 17. Refined data is used by a broad group of people, but is not yet blessed by everyone in the organization. In other words, people beyond specific subject areas may not be able to derive meaning from refined data. A SAM gets promoted to the trusted zone when the definitions applied to its data elements have been accepted by a much larger group of people.
  • 18. The Sandbox Data Zone: Anyone can decide to move data from the raw, trusted, or refined zones into the sandbox data zone. Here, data from all of these zones can be morphed for private use. Once sandbox information has been vetted, it is promoted for broader use in the refined data zone.
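The movement of data between the four zones described above can be summarized as a small set of rules. The sketch below is one reading of those slides, not a Health Catalyst implementation: the `Zone` enum, the `ALLOWED_MOVES` table, and the `vetted` flag are all hypothetical names introduced for illustration.

```python
from enum import Enum


class Zone(Enum):
    RAW = "raw"
    REFINED = "refined"
    TRUSTED = "trusted"
    SANDBOX = "sandbox"


# Assumed reading of the slides: raw data feeds every other zone, any zone
# can be copied into the sandbox, sandbox work is promoted to refined, and
# refined SAMs are promoted to trusted.
ALLOWED_MOVES = {
    Zone.RAW: {Zone.REFINED, Zone.TRUSTED, Zone.SANDBOX},
    Zone.REFINED: {Zone.TRUSTED, Zone.SANDBOX},
    Zone.TRUSTED: {Zone.SANDBOX},
    Zone.SANDBOX: {Zone.REFINED},
}

# Promotions that the slides say happen only after vetting.
REQUIRES_VETTING = {(Zone.SANDBOX, Zone.REFINED), (Zone.REFINED, Zone.TRUSTED)}


def can_move(source: Zone, target: Zone) -> bool:
    """Return True if data may flow from source to target."""
    return target in ALLOWED_MOVES[source]


def move(dataset: dict, target: Zone, vetted: bool = False) -> dict:
    """Return a copy of dataset placed in target, enforcing the rules."""
    source = dataset["zone"]
    if not can_move(source, target):
        raise ValueError(f"{source.value} -> {target.value} is not allowed")
    if (source, target) in REQUIRES_VETTING and not vetted:
        raise PermissionError(f"{source.value} -> {target.value} needs vetting")
    return {**dataset, "zone": target}


if __name__ == "__main__":
    experiment = {"name": "los_experiment", "zone": Zone.SANDBOX}
    promoted = move(experiment, Zone.REFINED, vetted=True)
    print(promoted["name"], promoted["zone"].value)
```

Encoding the rules as data rather than scattered conditionals keeps them easy to adjust when an organization labels or connects its zones differently.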
  • 19. Zones and Their Data Definitions: For an example of the data type in each zone, consider length of stay (LOS). There are dozens of ways to define LOS using ED presentation time, admit time, registration time, cut time, post-observation time, and discharge time. The clinical definition of LOS for an appendectomy may be from cut time to discharge time, but the corporate definition may be from admit time to discharge time.
  • 20. A SAM that focuses on appendectomy might choose to use the clinical definition, which differs from the global definition (i.e., the definition in the trusted zone). For an individual SAM definition of LOS to be promoted to the trusted zone, it needs to be vetted by a broader group of people to confirm it has universal application.
  • 21. Directors who have financial responsibility over a single line of business may need to evaluate their department’s productivity. They may need to see things a certain way, such as excluding corporate overhead, over which they have no control. This is what makes the SAM more specific to one area. The data definition has been vetted and agreed to by a group of people, though it has yet to reach global agreement.
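The length-of-stay example above can be made concrete by computing LOS for one encounter under both definitions. This is a minimal sketch: the timestamps are invented sample values, and the field names (`admit_time`, `cut_time`, `discharge_time`) are assumptions rather than names from any particular source system.

```python
from datetime import datetime, timedelta

# Each definition names the start and end timestamps it uses.
LOS_DEFINITIONS = {
    "clinical": ("cut_time", "discharge_time"),     # appendectomy SAM
    "corporate": ("admit_time", "discharge_time"),  # trusted-zone definition
}


def length_of_stay(encounter: dict, definition: str) -> timedelta:
    """Compute LOS for an encounter under the named definition."""
    start_key, end_key = LOS_DEFINITIONS[definition]
    return encounter[end_key] - encounter[start_key]


# Invented sample encounter for illustration only.
encounter = {
    "admit_time": datetime(2016, 3, 1, 8, 0),
    "cut_time": datetime(2016, 3, 1, 14, 0),
    "discharge_time": datetime(2016, 3, 3, 10, 0),
}

if __name__ == "__main__":
    for name in LOS_DEFINITIONS:
        hours = length_of_stay(encounter, name).total_seconds() / 3600
        print(f"{name}: {hours:.0f} hours")
```

For this sample encounter the clinical definition gives 44 hours and the corporate definition gives 50 hours, which is why a SAM-level definition has to be vetted before it can stand in for the organization-wide one.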
  • 22. The Right Technology for the Right Zone: Different technology can run on top of different zones in a data lake. The data lake itself typically runs on Hadoop, which is optimal for handling huge data volumes. Relational databases such as SQL Server are more user friendly and can provide data to a larger user base. SQL queries can run on top of Hadoop to produce data marts and SAMs in the trusted and refined zones.
  • 23. Hortonworks refers to a Connected Data Architecture, in which “data pools need to ensure that connected data can flow freely to the place where it is optimal for the business to get value from it.” Zones may not live on the same data technology. Much of the data will live in a data lake, but more refined zones may have a portion of their data that resides in an EDW or smaller data marts.
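The idea of using SQL to derive a subject area mart from raw data can be illustrated with a small, self-contained example. In this sketch Python's built-in `sqlite3` module stands in for whatever SQL engine runs over the lake; the table, view, and column names are hypothetical, and the rows are invented.

```python
import sqlite3


def build_ed_sam() -> list[tuple]:
    """Load sample raw rows, define a SAM-style view, and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Raw zone: data as it arrived from a source system.
        CREATE TABLE raw_encounters (
            encounter_id INTEGER,
            department   TEXT,
            visit_type   TEXT,
            charge       REAL
        );
        INSERT INTO raw_encounters VALUES
            (1, 'ED', 'emergency', 1200.0),
            (2, 'ED', 'emergency', 800.0),
            (3, 'Surgery', 'inpatient', 15000.0);

        -- Refined zone: a subject area mart for ED visits.
        CREATE VIEW sam_ed_visits AS
            SELECT department,
                   COUNT(*)    AS visits,
                   SUM(charge) AS total_charges
            FROM raw_encounters
            WHERE visit_type = 'emergency'
            GROUP BY department;
        """
    )
    rows = conn.execute(
        "SELECT department, visits, total_charges FROM sam_ed_visits"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in build_ed_sam():
        print(row)
```

A department manager would query the view rather than the raw table, which mirrors the separation between the raw zone and the marts built on top of it.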
  • 24. Data Lakes Are Integral to a Larger Operating System: Earlier, we said that huge data volumes have turned data lakes into data swamps, a problem that is remedied through a larger healthcare analytics ecosystem. Some, or all, of a data operating system can be deployed over the top of any healthcare data lake.
  • 25. The Health Catalyst® Data Operating System (DOS™) (Figure 3 on next slide) can index, catalog, analyze, and provide insights from the terabytes and growing data assets in a health system and provide IT departments, clinicians, population health managers, and financial leaders with the knowledge they need to produce massive outcomes improvements.
  • 26. Figure 3: The Health Catalyst Data Operating System.
  • 27. DOS enables a data lake to be built with the required governance and meaning added to the data so it is easily organized into the appropriate zones. Data can then be used according to zone by the various data consumers in a health system. DOS also allows data to be analyzed and consumed by the Fabric Services layer to accelerate the development of innovative data-first applications.
  • 28. The volume of healthcare data is mushrooming, and data architectures need to get ahead of the growth. Vast volumes of data will continue to flow into the EDW. A data lake is required to make data accessible to a subset of ETL developers, data stewards, data analysts, and data scientists. Data lakes allow data to be moved into various zones for experimentation and research, or for customization into shared data marts and SAMs.
  • 29. To prevent data lakes from becoming mired in the petabytes of data now swamping healthcare, the new architecture presented by the data operating system offers a breakthrough in analytics engineering that can renew the life of a data lake and accommodate the big-bang growth of healthcare data.
  • 30. For more information: “This book is a fantastic piece of work” – Robert Lindeman MD, FAAP, Chief Physician Quality Officer
  • 31. More about this topic: Link to original article for a more in-depth discussion: The Four Essential Zones of a Healthcare Data Lake. What Is a Healthcare Data Lake and Why Do You Need One? Imagine a Supermarket (Imran Qureshi, Chief Software Development Officer); Data Lake vs. Data Warehouse: Which is Right for Healthcare? (Jared Crapo, Sales, Senior VP); Data Warehouse Tools: Faster Time-to-Value for Your Healthcare Data Warehouse (Doug Adamson, Chief Technology Officer, VP); Comparing the Three Major Approaches to Healthcare Data Warehousing: A Deep Dive Review (White Paper) (Steve Barlow, Senior VP of Client Operations and Co-Founder); The Health Catalyst Data Operating System (DOS™) Solution (Health Catalyst Solution).
  • 32. Bryan Hinton joined Health Catalyst in February 2012. Prior to joining the Catalyst team, Bryan spent six years with Intel and four years with The Church of Jesus Christ of Latter-Day Saints. While at Intel, Bryan was on teams responsible for Intel's factory reporting systems and equipment maintenance prediction. At the LDS Church, he led the .NET Development Center of Excellence and was responsible for the Application Lifecycle Management (ALM) processes and tools used for development at the Church. Bryan graduated from Brigham Young University with a degree in Computer Science. Other Clinical Quality Improvement Resources: Click to read additional information at www.healthcatalyst.com