SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Vamshi Punugoti & Bryan Lari
MD Anderson Cancer Center
June 2016
HDP @ MD ANDERSON
Starting the Hadoop Journey at a Global Leader
in Cancer Research
Agenda
• About MD Anderson
• Big Data Program
• Our Hadoop Implementation
• Lessons Learned
• Next Steps
• Who we are
– One of the worlds largest centers devoted exclusively to cancer care
– Created by the Texas legislature in 1941
– Named one of the nation's top two hospitals for cancer care every
year since the survey began in 1990
• Mission
– MD Anderson’s mission is to eliminate cancer in Texas, the nation and
the world through exceptional programs that integrate patient care,
research and prevention.
About MD Anderson
About MD Anderson cont.
Patient Care Education
Research
Moon Shots Program
• Launched in 2012 – to make a giant leap for patients
• Accelerating the pace of converting scientific discoveries into
clinical advances that reduce cancer deaths
• Transdisciplinary team-science approach
• Transformative professional platforms
List of Moon Shots
12 Total Moon Shots
B-cell Lymphoma Lung Cancer
Breast Cancer Melanoma
Colorectal Cancer Multiple Myeloma
Glioblastoma Ovarian Cancer
HPV-Related Cancers Pancreatic Cancer
Leukemia (CLL, MDS, AML) Prostate Cancer
http://www.cancermoonshots.org
Volume
Variety
Velocity
Veracity
Gulf of Mexico Analogy
Goals of Big Data Program
• Data driven organization
• All “types” of data
• “Access” for all customers
• Clinicians
• Researchers
• Administrative / Operational
• Enable discovery of “insights”
• Improve patient care
• Increase research discoveries
• Improve operations
• Govern data like an asset
• Provide a platform / environment to enable all these things
To provide the right information to the right people at the right time with the right tools
Goal
data
insight
Insights
Make big data additive and build upon foundation
What are we doing today?
• FIRE Enterprise Data Warehouse
• Natural Language Processing (NLP)
• Data Governance
• Hadoop NoSQL
• Cognitive Computing
• Data Visualization
• Evolving our Platform / Architecture
• Identifying big data use cases
• Training & Skills
• Federated Institutional Reporting Environment
• Centralized data repository supporting analytics,
decision making, and business intelligence
• Central repository for historical and operational data
• Break-down data silos
Enterprise
RepositorySource Systems
Dashboards
KPI’s
Analytic
Reports
Analytics
& Reporting
Discoveries
Improve
Patient Care
Quality / Perf
Improvements
Genomic
FIRE Program
Radiology
Labs
Epic / Clarity
Legacy Systems
• Vast amounts of unstructured data are
stored on MDACC servers.
• Conventional ETL tools are not designed
to mine unstructured data.
• Suite of tools make up the NLP Pipeline
• Dictionaries were created to help Epic
go-live (Provider Friendly Terminology)
• Other examples:
• Diagnosis from the pathology reports
• Comorbidities
• Family Cancer History
• Cytogenetics
• Obituary text
• ICD10 Coding
• Structured results feeding Moonshot TRA and OEA
• Etc.
IBM ECM
NLP
Engine
Unstructured Data
Sources
Post NLP
Database
HDWF
(FIRE)
NLP Pipeline - Overview
Enterprise
Business
Clinical Big Data
Peoplesoft
Systems of Record
Systems of Reporting
Systems of Insights
Kronos
Point of Sale
Volunteer Services
Rotary House
MyHR
UTPD
Facilities
Clinic Station
Epic
Lab
GE IDX
Cerner
CARE
EPM
Hyperion
Oracle Business Intelligence
Smart View
Web Analytics
FIRE
EIW
Business Objects
Crystal
Hyperion Interactive Reporting
Facebook
Twitter
UPS
Center for Disease Control
The Weather Channel
LinkedIN
Youtube
oracle.com
Yelp!
Reuters
Google
U.S. Census
Medical Devices
Medical Equipment
Building Controls
Campus Video
Real-time Location Service
Wayfinding
Data
Visualization
Ad Hoc
Cognitive
Computing
Big Data for Analytics & Cognitive Computing
Presentation
Cohort Explorer
Parking Garages
Pharmacy
Research
LCDR
Melcore
Gemini
IPCT
Data Governance
Data Stewardship
Data Portal
Data Profiling
and Quality
Data
Standardization
Compliance
Metadata and
Business
Glossary
Master Data
Management
Data
Repository
Dashboards
KPI’s
Analytic
Reports
Analytics & Informatics
Discoveries
Improve
Patient Care
Quality / Perf
Improvements
Data Mgt & Operations
Data Lake
Data Discovery
Profiling
Standards / Quality
Big Data (Structured and NoSQL)
Insight Apps
Genomic
Radiology
Labs
Epic / Clarity
Legacy Systems
Big Data – High Level
Big Data Technical Architecture
Our Hadoop Implementation
Our Hadoop Implementation cont.
Our Hadoop Implementation cont.
Average number of messages per day: 1,556,688
Estimated amount of storage increase per day: 5.7 GB
Number of channels currently being used: 24
Estimated daily message processing capacity: 4,320,000
Our Hadoop Implementation cont.
Medical Device Data Flow
Data Source Data Capture MDA Big DataData Lake
Access Portals
(Analytics/Visualization)
Integration HUB Data ingestion
Processing
Channels
HBase
Data Loader
Capsule
Capsule
DB
Medical
Device
End-Users
FIRE/Big Data
Cloverleaf
Engine
Epic
TCP-based Data
Listener - Flume
HIVE
PIG
HUNK
Sqoop
Validated HL7
with Patient ID
(from Epic)
HL7
Raw HL7
(from Capsule)
Cleanse &
Transform
Raw HL7
Validated HL7
Our Hadoop Implementation cont.
Developer
Workstation/Sandbox
SVN
(source control server)
Bamboo
(build server)
HDP Dev Cluster
HDP QA Cluster
HDP Prod Cluster
Daily Checkin/Checkout
Development Cycle
On Dev Lead Approval:
Build, Unit Test, Deploy & Tag
On Successful UAT
& Release Approval:
Deploy Per
Last Successful
Build Tag
Smoke Test
Before Updating Task status
Periodic Integration & Validation:
Build, Unit Test
& Notify On Error
Development
Cycle
Deployment
Cycle
process
1. It’s complex
2. It’s a journey
3. Leverage existing strengths
4. Collaborate openly
5. Learn from experts
6. One cluster – multiple use cases
7. Follow best practices
Lessons Learned – what went well
people
1. Continue to expand/evolve our platform
2. Ingest more data and data types
3. Identify high value use cases
4. Develop/Train people with new skills
Next Steps
Train People with new Skills
Accessing data
Computing data
Visualizing data
Insights &
Cognitive Computing
Starting the Hadoop Journey at a Global Leader in Cancer Research

Weitere ähnliche Inhalte

Was ist angesagt?

BIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareBIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareSkillspeed
 
Data science workshop
Data science workshopData science workshop
Data science workshopHortonworks
 
Maven and google pharma r&d (1)
Maven and google pharma r&d  (1)Maven and google pharma r&d  (1)
Maven and google pharma r&d (1)Matt Barnes
 
Health care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cureHealth care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cureEdureka!
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti
 
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Yellowfin
 
1 PSUT Big Data Class, introduction
1 PSUT Big Data Class,  introduction1 PSUT Big Data Class,  introduction
1 PSUT Big Data Class, introductionAkram Al-Kouz
 
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...StampedeCon
 
Data-driven Healthcare
Data-driven HealthcareData-driven Healthcare
Data-driven HealthcareLindaWatson19
 
Real-Time Clinical Analytics
Real-Time Clinical AnalyticsReal-Time Clinical Analytics
Real-Time Clinical AnalyticsDataWorks Summit
 
(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...
(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...
(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...Amazon Web Services
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESDATAVERSITY
 
Reveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search SolutionReveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search Solutiond-Wise Technologies
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
 
Complex Analytics using Open Source Technologies
Complex Analytics using Open Source TechnologiesComplex Analytics using Open Source Technologies
Complex Analytics using Open Source TechnologiesDataWorks Summit
 
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Seeling Cheung
 
Big data and the Healthcare Sector
Big data and the Healthcare Sector Big data and the Healthcare Sector
Big data and the Healthcare Sector Chris Groves
 

Was ist angesagt? (20)

Hadoop in Healthcare Systems
Hadoop in Healthcare SystemsHadoop in Healthcare Systems
Hadoop in Healthcare Systems
 
BIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareBIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in Healthcare
 
Hadoop Enabled Healthcare
Hadoop Enabled HealthcareHadoop Enabled Healthcare
Hadoop Enabled Healthcare
 
Data science workshop
Data science workshopData science workshop
Data science workshop
 
Maven and google pharma r&d (1)
Maven and google pharma r&d  (1)Maven and google pharma r&d  (1)
Maven and google pharma r&d (1)
 
Health care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cureHealth care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cure
 
BDaas- BigData as a service
BDaas- BigData as a service  BDaas- BigData as a service
BDaas- BigData as a service
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to Production
 
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
 
1 PSUT Big Data Class, introduction
1 PSUT Big Data Class,  introduction1 PSUT Big Data Class,  introduction
1 PSUT Big Data Class, introduction
 
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
 
Data-driven Healthcare
Data-driven HealthcareData-driven Healthcare
Data-driven Healthcare
 
Real-Time Clinical Analytics
Real-Time Clinical AnalyticsReal-Time Clinical Analytics
Real-Time Clinical Analytics
 
(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...
(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...
(HLS305) Transforming Cancer Treatment: Integrating Data to Deliver on the Pr...
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDES
 
Reveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search SolutionReveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search Solution
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Complex Analytics using Open Source Technologies
Complex Analytics using Open Source TechnologiesComplex Analytics using Open Source Technologies
Complex Analytics using Open Source Technologies
 
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
 
Big data and the Healthcare Sector
Big data and the Healthcare Sector Big data and the Healthcare Sector
Big data and the Healthcare Sector
 

Andere mochten auch

Paul Capello Cerner Health Conference
Paul Capello Cerner Health ConferencePaul Capello Cerner Health Conference
Paul Capello Cerner Health Conferencetigcap
 
UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...
UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...
UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...CTSI at UCSF
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementDataWorks Summit/Hadoop Summit
 

Andere mochten auch (6)

Paul Capello Cerner Health Conference
Paul Capello Cerner Health ConferencePaul Capello Cerner Health Conference
Paul Capello Cerner Health Conference
 
SmartSense Suite
SmartSense SuiteSmartSense Suite
SmartSense Suite
 
UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...
UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...
UCSF Informatics Day 2014 - Doug Berman, "A Brief Tour of UCSF’s Clinical Dat...
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop Management
 
Knowledge from Noise
Knowledge from Noise Knowledge from Noise
Knowledge from Noise
 
Keep your Hadoop Cluster at its Best
Keep your Hadoop Cluster at its BestKeep your Hadoop Cluster at its Best
Keep your Hadoop Cluster at its Best
 

Ähnlich wie Starting the Hadoop Journey at a Global Leader in Cancer Research

Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...
Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...
Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...VMware Tanzu
 
Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...
Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...
Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...Aridhia Informatics Ltd
 
Data Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemData Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemWarren Kibbe
 
Building an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-MakingBuilding an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-MakingDenodo
 
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204Kees van Bochove
 
Big Data at Geisinger Health System: Big Wins in a Short Time
Big Data at Geisinger Health System: Big Wins in a Short TimeBig Data at Geisinger Health System: Big Wins in a Short Time
Big Data at Geisinger Health System: Big Wins in a Short TimeDataWorks Summit
 
Data Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemData Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemWarren Kibbe
 
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...Perficient, Inc.
 
The Role of Data Lakes in Healthcare
The Role of Data Lakes in HealthcareThe Role of Data Lakes in Healthcare
The Role of Data Lakes in HealthcarePerficient, Inc.
 
Big Data in Pediatric Critical Care by Mohit Mehra
Big Data in Pediatric Critical Care by Mohit MehraBig Data in Pediatric Critical Care by Mohit Mehra
Big Data in Pediatric Critical Care by Mohit MehraData Con LA
 
Supporting a Collaborative R&D Organization with a Dynamic Big Data Solution
Supporting a Collaborative R&D Organization with a Dynamic Big Data SolutionSupporting a Collaborative R&D Organization with a Dynamic Big Data Solution
Supporting a Collaborative R&D Organization with a Dynamic Big Data SolutionSaama
 
2016.10 HPDA in Precision Medicine
2016.10 HPDA in Precision Medicine2016.10 HPDA in Precision Medicine
2016.10 HPDA in Precision MedicineMichael Atkins
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...David Peyruc
 
Bridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through TechnologyBridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through TechnologySaama
 
The Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of HealthcareThe Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of HealthcareHealth Catalyst
 
Enterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareEnterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareDATA360US
 

Ähnlich wie Starting the Hadoop Journey at a Global Leader in Cancer Research (20)

Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...
Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...
Challenges in Clinical Research: Aridhia Disrupts Technology Approach to Rese...
 
Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...
Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...
Challenges in Clinical Research: Aridhia's Disruptive Technology Approach to ...
 
Data Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemData Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health System
 
Building an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-MakingBuilding an Intelligent Biobank to Power Research Decision-Making
Building an Intelligent Biobank to Power Research Decision-Making
 
Big data analystics
Big data analysticsBig data analystics
Big data analystics
 
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
 
Big Data at Geisinger Health System: Big Wins in a Short Time
Big Data at Geisinger Health System: Big Wins in a Short TimeBig Data at Geisinger Health System: Big Wins in a Short Time
Big Data at Geisinger Health System: Big Wins in a Short Time
 
Data Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemData Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health System
 
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran...
 
2015 04-18-wilson cg
2015 04-18-wilson cg2015 04-18-wilson cg
2015 04-18-wilson cg
 
The Role of Data Lakes in Healthcare
The Role of Data Lakes in HealthcareThe Role of Data Lakes in Healthcare
The Role of Data Lakes in Healthcare
 
Big Data in Pediatric Critical Care by Mohit Mehra
Big Data in Pediatric Critical Care by Mohit MehraBig Data in Pediatric Critical Care by Mohit Mehra
Big Data in Pediatric Critical Care by Mohit Mehra
 
Big Data for Library Services (2017)
Big Data for Library Services (2017)Big Data for Library Services (2017)
Big Data for Library Services (2017)
 
Supporting a Collaborative R&D Organization with a Dynamic Big Data Solution
Supporting a Collaborative R&D Organization with a Dynamic Big Data SolutionSupporting a Collaborative R&D Organization with a Dynamic Big Data Solution
Supporting a Collaborative R&D Organization with a Dynamic Big Data Solution
 
2016.10 HPDA in Precision Medicine
2016.10 HPDA in Precision Medicine2016.10 HPDA in Precision Medicine
2016.10 HPDA in Precision Medicine
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
 
Bridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through TechnologyBridging Health Care and Clinical Trial Data through Technology
Bridging Health Care and Clinical Trial Data through Technology
 
The Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of HealthcareThe Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of Healthcare
 
Enterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for HealthcareEnterprise Analytics: Serving Big Data Projects for Healthcare
Enterprise Analytics: Serving Big Data Projects for Healthcare
 
Big Data in Clinical Research
Big Data in Clinical ResearchBig Data in Clinical Research
Big Data in Clinical Research
 

Mehr von DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

Mehr von DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Kürzlich hochgeladen

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Kürzlich hochgeladen (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

Starting the Hadoop Journey at a Global Leader in Cancer Research

  • 1. Vamshi Punugoti & Bryan Lari MD Anderson Cancer Center June 2016 HDP @ MD ANDERSON Starting the Hadoop Journey at a Global Leader in Cancer Research
  • 2. Agenda • About MD Anderson • Big Data Program • Our Hadoop Implementation • Lessons Learned • Next Steps
  • 3. • Who we are – One of the worlds largest centers devoted exclusively to cancer care – Created by the Texas legislature in 1941 – Named one of the nation's top two hospitals for cancer care every year since the survey began in 1990 • Mission – MD Anderson’s mission is to eliminate cancer in Texas, the nation and the world through exceptional programs that integrate patient care, research and prevention. About MD Anderson
  • 4. About MD Anderson cont. Patient Care Education Research
  • 5. Moon Shots Program • Launched in 2012 – to make a giant leap for patients • Accelerating the pace of converting scientific discoveries into clinical advances that reduce cancer deaths • Transdisciplinary team-science approach • Transformative professional platforms List of Moon Shots 12 Total Moon Shots B-cell Lymphoma Lung Cancer Breast Cancer Melanoma Colorectal Cancer Multiple Myeloma Glioblastoma Ovarian Cancer HPV-Related Cancers Pancreatic Cancer Leukemia (CLL, MDS, AML) Prostate Cancer http://www.cancermoonshots.org
  • 6.
  • 8. Gulf of Mexico Analogy
  • 9. Goals of Big Data Program • Data driven organization • All “types” of data • “Access” for all customers • Clinicians • Researchers • Administrative / Operational • Enable discovery of “insights” • Improve patient care • Increase research discoveries • Improve operations • Govern data like an asset • Provide a platform / environment to enable all these things
  • 10. To provide the right information to the right people at the right time with the right tools Goal data insight
  • 12. Make big data additive and build upon foundation
  • 13. What are we doing today? • FIRE Enterprise Data Warehouse • Natural Language Processing (NLP) • Data Governance • Hadoop NoSQL • Cognitive Computing • Data Visualization • Evolving our Platform / Architecture • Identifying big data use cases • Training & Skills
  • 14. • Federated Institutional Reporting Environment • Centralized data repository supporting analytics, decision making, and business intelligence • Central repository for historical and operational data • Break-down data silos Enterprise RepositorySource Systems Dashboards KPI’s Analytic Reports Analytics & Reporting Discoveries Improve Patient Care Quality / Perf Improvements Genomic FIRE Program Radiology Labs Epic / Clarity Legacy Systems
  • 15. • Vast amounts of unstructured data are stored on MDACC servers. • Conventional ETL tools are not designed to mine unstructured data. • Suite of tools make up the NLP Pipeline • Dictionaries were created to help Epic go-live (Provider Friendly Terminology) • Other examples: • Diagnosis from the pathology reports • Comorbidities • Family Cancer History • Cytogenetics • Obituary text • ICD10 Coding • Structured results feeding Moonshot TRA and OEA • Etc. IBM ECM NLP Engine Unstructured Data Sources Post NLP Database HDWF (FIRE) NLP Pipeline - Overview
  • 16. Enterprise Business Clinical Big Data Peoplesoft Systems of Record Systems of Reporting Systems of Insights Kronos Point of Sale Volunteer Services Rotary House MyHR UTPD Facilities Clinic Station Epic Lab GE IDX Cerner CARE EPM Hyperion Oracle Business Intelligence Smart View Web Analytics FIRE EIW Business Objects Crystal Hyperion Interactive Reporting Facebook Twitter UPS Center for Disease Control The Weather Channel LinkedIN Youtube oracle.com Yelp! Reuters Google U.S. Census Medical Devices Medical Equipment Building Controls Campus Video Real-time Location Service Wayfinding Data Visualization Ad Hoc Cognitive Computing Big Data for Analytics & Cognitive Computing Presentation Cohort Explorer Parking Garages Pharmacy Research LCDR Melcore Gemini IPCT
  • 17. Data Governance Data Stewardship Data Portal Data Profiling and Quality Data Standardization Compliance Metadata and Business Glossary Master Data Management
  • 18. Data Repository Dashboards KPI’s Analytic Reports Analytics & Informatics Discoveries Improve Patient Care Quality / Perf Improvements Data Mgt & Operations Data Lake Data Discovery Profiling Standards / Quality Big Data (Structured and NoSQL) Insight Apps Genomic Radiology Labs Epic / Clarity Legacy Systems
  • 19. Big Data – High Level
  • 20. Big Data Technical Architecture
  • 23. Our Hadoop Implementation cont. Average number of messages per day: 1,556,688 Estimated amount of storage increase per day: 5.7 GB Number of channels currently being used: 24 Estimated daily message processing capacity: 4,320,000
  • 24. Our Hadoop Implementation cont. Medical Device Data Flow Data Source Data Capture MDA Big DataData Lake Access Portals (Analytics/Visualization) Integration HUB Data ingestion Processing Channels HBase Data Loader Capsule Capsule DB Medical Device End-Users FIRE/Big Data Cloverleaf Engine Epic TCP-based Data Listener - Flume HIVE PIG HUNK Sqoop Validated HL7 with Patient ID (from Epic) HL7 Raw HL7 (from Capsule) Cleanse & Transform Raw HL7 Validated HL7
  • 25. Our Hadoop Implementation cont. Developer Workstation/Sandbox SVN (source control server) Bamboo (build server) HDP Dev Cluster HDP QA Cluster HDP Prod Cluster Daily Checkin/Checkout Development Cycle On Dev Lead Approval: Build, Unit Test, Deploy & Tag On Successful UAT & Release Approval: Deploy Per Last Successful Build Tag Smoke Test Before Updating Task status Periodic Integration & Validation: Build, Unit Test & Notify On Error Development Cycle Deployment Cycle
  • 26. process 1. It’s complex 2. It’s a journey 3. Leverage existing strengths 4. Collaborate openly 5. Learn from experts 6. One cluster – multiple use cases 7. Follow best practices Lessons Learned – what went well people
  • 27. 1. Continue to expand/evolve our platform 2. Ingest more data and data types 3. Identify high value use cases 4. Develop/Train people with new skills Next Steps
  • 28. Train People with new Skills Accessing data Computing data Visualizing data Insights & Cognitive Computing