1. Hadoop in Healthcare – A No-nonsense Q & A
© 2014 Health Catalyst
www.healthcatalyst.com Proprietary. Feel free to share but we would appreciate a Health Catalyst citation.
By Jared Crapo
2. Hadoop in Healthcare
Hadoop is used in all kinds of applications
at companies such as Facebook and LinkedIn.
The potential for Big Data and Hadoop in healthcare and
in managing healthcare data is exciting, but it has not yet
been fully realized.
3. Although healthcare analytics has not yet been
hampered by hospital systems not using Hadoop,
it never hurts to look ahead and consider the
possibilities.
Hadoop is an indispensable tool for efficiently storing and
processing large quantities of data. Its unique capabilities will offer
new ways of thinking about how we use healthcare data and
analytics to provide improved patient care at reduced costs.
What follows is a Q & A on Hadoop
and its implications for the future of
healthcare.
4. Question 1: What is Hadoop?
Hadoop is an open-source framework for
distributed data storage and analysis.
Much of its early development took
place at Yahoo!, based on
research papers published by
Google.
Hadoop implements Google’s
MapReduce programming model: it splits
a large job into many parts,
sends those parts to
many different processing nodes,
and then combines the results
from each node.
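The split, process, and combine steps described above can be sketched in a few lines of Python. This is only an illustration of the MapReduce idea on a single machine, not Hadoop's actual API: the admission records, the function names, and the use of threads as stand-ins for processing nodes are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical admission records, already split into partitions the way a
# cluster would split a large data set across nodes.
PARTITIONS = [
    ["asthma", "diabetes", "asthma"],
    ["copd", "asthma"],
    ["diabetes", "copd", "copd"],
]


def map_partition(records):
    """Map step: emit a (key, 1) pair for every record in one partition."""
    return [(diagnosis, 1) for diagnosis in records]


def shuffle(mapped_partitions):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_group(values):
    """Reduce step: combine the values for one key into a single result."""
    return sum(values)


def run_job(partitions):
    # Threads stand in for processing nodes; each one maps a single partition.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_partition, partitions))
    grouped = shuffle(mapped)
    return {key: reduce_group(values) for key, values in grouped.items()}


counts = run_job(PARTITIONS)
print(counts)
```

On a real cluster the partitions would live on different machines and the framework would handle scheduling and failures, but the shape of the job is the same: map each part independently, group by key, then reduce.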
5. Question 1: What is Hadoop? (continued)
Hadoop also refers to the tools and
software that work with and
enhance Hadoop’s core storage
and processing components:
Hive – a SQL-like query language for Hadoop
Pig – a high-level query language for MapReduce
HBase – a column-oriented data store that runs on top of the Hadoop Distributed File System (HDFS)
Spark – a general-purpose cluster computing framework
6. Question 2: What are some key reasons to adopt Hadoop?
Large companies are moving to
Hadoop for two main reasons:
1. Enormous data sets
2. Costs
For example, Yahoo! implemented
42,000 nodes in several different
Hadoop clusters with a combined
capacity of about 200 petabytes
(200,000 terabytes).
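To put those figures in perspective, a quick back-of-the-envelope calculation gives the average capacity per node. It uses only the two numbers quoted above and assumes decimal units (1 petabyte = 1,000 terabytes); it says nothing about how the capacity was actually distributed across Yahoo!'s clusters.

```python
# Figures quoted in the text for Yahoo!'s Hadoop clusters.
NODES = 42_000
CAPACITY_PB = 200

# Decimal units, matching "200 petabytes (200,000 terabytes)".
TB_PER_PB = 1_000

capacity_tb = CAPACITY_PB * TB_PER_PB
avg_tb_per_node = capacity_tb / NODES

print(f"Total capacity: {capacity_tb:,} TB")
print(f"Average per node: {avg_tb_per_node:.1f} TB")
```

An average of a few terabytes per node is consistent with the point that each machine is ordinary; the scale comes from the number of machines.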
7. Question 2: What are some key reasons to adopt Hadoop? (continued)
Even if existing database
applications could accommodate
these large data sets, the cost of
typical enterprise hardware and
disk storage becomes prohibitive.
Hadoop was designed from the
beginning to run on commodity
hardware, which substantially
reduces the need for expensive
hardware infrastructure.
Because Hadoop is open source,
there are no licensing fees for the
software either, another substantial
savings.
8. Question 3: How will Hadoop impact and/or change healthcare analytics?
Hadoop has been called the most
significant data processing
platform for big data analytics in
healthcare.
Using Hadoop, researchers can
now use data sets that were
traditionally impossible to handle.
A team in Colorado is correlating
air quality data with asthma
admissions.
Life sciences companies use
genomic and proteomic data to
speed drug development.
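As a rough idea of what correlating air quality with asthma admissions involves, the sketch below computes a Pearson correlation coefficient between two daily series. The numbers are invented toy values used only to make the example run, the variable names are hypothetical, and nothing here reflects the Colorado team's actual data or methods.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Toy values only: a daily particulate reading and same-day asthma admissions.
daily_pm25 = [8, 12, 20, 35, 15]
daily_asthma_admissions = [3, 4, 6, 9, 5]

r = pearson(daily_pm25, daily_asthma_admissions)
print(f"Pearson r = {r:.3f}")
```

A high coefficient would only indicate that the two series move together. A real study would also need to account for lag, seasonality, and confounding factors, and would run across far more data than fits on one machine, which is where a platform like Hadoop comes in.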
9. Question 3: How will Hadoop impact and/or change healthcare analytics? (continued)
Healthcare analytics is generally not
held back by the capability of the
data processing platforms. There are
a few exceptions in life sciences.
But for most healthcare providers,
the limiting factor is the willingness
and ability to let data inform and
change the way care is delivered.
Today, it takes more than a decade
for compelling clinical evidence to
become common clinical practice.
It’s not how much data you have that matters, but how you use it.
10. Question 4: How will clinicians use outside data sources?
Data from other clinical providers in
your geography can be very useful.
Claims data give a broad picture but
not a deep one.
Data from other non-traditional
sources also has surprising
relevance; in some cases, it’s a
better predictor than clinical data.
For example, EPA data on geographical toxic chemical load adds
insight into cancer rates for long-term residents. The CMS-HCC
risk adjustment model can help providers understand why
patients in their area seem to have higher or lower risk for certain
disease conditions. A household size of one increases the risk of
readmission because there is no other caregiver in the home.
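In practice, using outside data usually means joining it to clinical records on a shared key such as geography. The sketch below shows one simple way to do that. The record layout, the region identifiers, and the toxic load values are all hypothetical and exist only to illustrate the join.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Discharge:
    """A simplified clinical record (hypothetical fields)."""
    patient_id: str
    region_id: str
    household_size: int


@dataclass
class EnrichedDischarge:
    """The clinical record plus attributes derived from outside data."""
    patient_id: str
    lives_alone: bool
    toxic_load_index: Optional[float]


def enrich(discharge, toxic_load_by_region):
    """Join one discharge to outside data keyed by region."""
    return EnrichedDischarge(
        patient_id=discharge.patient_id,
        # A household size of one means no other caregiver in the home.
        lives_alone=discharge.household_size == 1,
        # None when the outside data set has no entry for this region.
        toxic_load_index=toxic_load_by_region.get(discharge.region_id),
    )


# Made-up outside data set and discharges for illustration.
TOXIC_LOAD_BY_REGION = {"region-a": 0.72, "region-b": 0.15}
DISCHARGES = [
    Discharge("p1", "region-a", 1),
    Discharge("p2", "region-b", 4),
    Discharge("p3", "region-c", 1),
]

enriched = [enrich(d, TOXIC_LOAD_BY_REGION) for d in DISCHARGES]
for row in enriched:
    print(row)
```

Keeping missing outside values as `None` rather than a default number makes gaps in coverage visible to whoever builds a model on top of the joined data.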
11. Question 5: What are the drawbacks of Hadoop?
What do CTOs, CIOs, and other IT leaders need to consider?
Hadoop is a very young technology,
and its capabilities and tools are
relatively immature. The number of
people with Hadoop experience is
also small.
You will be competing for these people
with large technology and financial
services companies. People with
Hadoop experience are in high
demand.
12. Question 5: What are the drawbacks of Hadoop? (continued)
You should also consider alternate
hardware maintenance schemes.
Hadoop was designed for
commodity hardware, which
generally experiences higher
failure rates.
Instead of purchasing hardware
maintenance, you should plan to
have spare nodes on standby.
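One rough way to size a standby pool is to estimate how many nodes are likely to fail while a replacement is on order. The sketch below does that with a simple average. The failure rate and replacement lead time are assumed inputs you would need to supply from your own hardware experience; they are not figures from this presentation.

```python
import math


def spare_nodes_needed(cluster_nodes, annual_failure_rate, replacement_days):
    """Estimate spares needed to cover failures during the replacement lead time.

    This is an average-case estimate: expected failures per year, scaled to the
    number of days it takes to get a replacement node, rounded up.
    """
    if cluster_nodes < 0 or replacement_days < 0:
        raise ValueError("cluster_nodes and replacement_days must be non-negative")
    if not 0 <= annual_failure_rate <= 1:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    expected_failures_per_year = cluster_nodes * annual_failure_rate
    failures_during_lead_time = expected_failures_per_year * replacement_days / 365
    return math.ceil(failures_during_lead_time)


# Hypothetical inputs: 200 nodes, 5% annual failure rate, 14-day lead time.
spares = spare_nodes_needed(200, 0.05, 14)
print(f"Keep at least {spares} spare node(s) on standby")
```

Because this is an average, a cautious plan would add a margin on top of the result, since failures tend to arrive in clusters rather than evenly through the year.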
The good news is that commercial
database vendors, including
Microsoft, Oracle, and Teradata,
are all racing to integrate Hadoop
into their offerings.
13. Question 6: Where is Hadoop headed and how will it impact big data?
Fifteen years ago, we didn’t capture
data unless we knew we needed it.
The cost to capture and store it was
just too high.
Fifteen years from now, reductions in
the cost to capture and store data
will likely mean that we will capture
and store everything.
Hadoop is a huge leap forward in
our ability to efficiently store and
process large quantities of data and
allows creative thinking about how to
apply the resulting answers in a
meaningful and useful way.
14. More about this topic
Five Reasons Healthcare Data Is Different
Dan LeSueur, Vice President, Technical Operations
Big Data in Healthcare: Separating the Hype from Reality
Jared Crapo, Vice President
In Healthcare Predictive Analytics, Sometimes Big Data Is a Big Mess
David Crockett, Senior Director, Research and Predictive Analytics
Data Alone Is Not Enough: A Clinical Perspective
(free, on-demand webinar, transcript, and slides)
Dale Sanders, Senior Vice President, Strategy and John Kenagy, MD
Using Healthcare Data: Healthcare Analytics Adoption Model (white paper)
Dale Sanders, Senior Vice President, Strategy
16. Other Clinical Quality Improvement Resources
Jared Crapo joined Health Catalyst in February 2013 as a Vice President.
Prior to coming to Catalyst, he worked for Medicity as the Chief of Staff to
the CEO. During his tenure at Medicity, he was also the Director of Product
Management and the Director of Product Strategy.
Jared co-founded Allviant, a spin-out of Medicity, that created consumer health
management tools. In his early career, he developed physician accounting systems
and health claims payment systems.