Graphical Structure Learning accelerated with POWER9

Arghya Kusum Das, Ph.D.
Assistant Professor, University of Wisconsin-Platteville
In collaboration with
Radha Nagarajan, Ph.D.
Director, COSH, Marshfield Clinic Health System (Digital Health, Data Science, Bioinformatics, RWE)
o Overview of Graphical Models
o Implementation
o Preliminary Findings
o Healthcare Applications
Overview
Graphs/Networks: Comprised of nodes and edges
nodes/vertices: represent the entities of interest
edges: represent the associations/relationships between the nodes.
Graphical models: Model the associations between the entities as a graph.
Example:
nodes: COVID subjects
edges: association between the COVID subjects (e.g. contact tracing)
© searchengineland.com
Why Graphical Models?
o system-level abstractions: Graphical models can reveal system-level properties and behavior not
apparent in the reductionist representation. System-level abstractions are especially critical in
developing targeted interventions.
e.g. model COVID spread in a given community from contact tracing¹; use the model to assist in
targeted community-based interventions/policies
e.g. model the signaling mechanism initiated by the COVID spike protein; use the model to identify
potential target molecules for drugs to minimize disease severity/inflammation²
o in-silico models: Graphical models can be experimented on in a controlled and cost-effective manner. This
includes posing questions to these models (e.g. inference).
e.g. given the evidence that a subject has cough, fever, sore throat and shortness of breath,
determine the probability that the subject is COVID +ve
o causal associations: Graphical models may reveal causal associations³ under certain implicit assumptions
(Note: we are attempting to decipher causality from observational data!)
¹ https://www.cdc.gov/coronavirus/2019-ncov/daily-life-coping/contact-tracing.html
² https://www.cebm.net/covid-19/dexamethasone/
³ Pearl, J [2009] Causality: Models, Reasoning and Inference.
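The inference query in the in-silico bullet (probability of COVID +ve given observed symptoms) can be sketched with a Naïve Bayes calculation. This is only an illustration: every probability below is an invented placeholder rather than a clinical estimate, and the assumption that symptoms are conditionally independent given the test result is a simplification that a learned graphical structure would not necessarily share.

```python
# All numbers are hypothetical placeholders, not clinical estimates.
prior = {"pos": 0.10, "neg": 0.90}  # assumed P(COVID status)

# Assumed P(symptom present | COVID status); symptoms are treated as
# conditionally independent given the status (Naive Bayes assumption).
likelihood = {
    "cough": {"pos": 0.70, "neg": 0.20},
    "fever": {"pos": 0.80, "neg": 0.10},
    "sore_throat": {"pos": 0.50, "neg": 0.20},
    "shortness_of_breath": {"pos": 0.40, "neg": 0.05},
}


def posterior(evidence):
    """Return P(status | all symptoms in `evidence` are present)."""
    unnormalized = {}
    for status, p in prior.items():
        for symptom in evidence:
            p *= likelihood[symptom][status]
        unnormalized[status] = p
    total = sum(unnormalized.values())
    return {status: p / total for status, p in unnormalized.items()}


result = posterior(["cough", "fever", "sore_throat", "shortness_of_breath"])
print(result)
```

With these placeholder numbers the evidence moves the probability of a positive result well above its prior, which is the kind of query the slide refers to as posing questions to the model.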
Problem:
What we have: Data across an informed set of variables (D)
What we need: Graphical structure (G) representing the associations between these variables
Pair-wise dependencies:
Direct associations between a given pair of nodes determined using similarity measures
Note: Associations between a pair of variables may not be direct and can be mediated through a third
variable. Conclusions based on pair-wise dependencies, while helpful, may be incomplete.
e.g. Loss of Taste (L) and Disease Severity (D) may not be associated as such (i.e.
marginally independent). However, L and D may be associated given that the subject has COVID (C)
[Figure: graphs over the nodes L, D and C]
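The pattern described in the example (L and D marginally independent, yet associated once C is known) can be checked numerically on a small joint distribution. The table below is hypothetical and was chosen only so that the pattern holds; it is not derived from any data in this deck and makes no claim about the real relationship between the three variables.

```python
from itertools import product

# Hypothetical P(C=1 | L, D); L and D are taken as independent fair coins.
p_c1_given_ld = {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.6, (1, 1): 0.7}

# Build the full joint p(L, D, C) over binary variables.
joint = {}
for l, d, c in product((0, 1), repeat=3):
    pc1 = p_c1_given_ld[(l, d)]
    joint[(l, d, c)] = 0.25 * (pc1 if c == 1 else 1.0 - pc1)


def prob(**fixed):
    """Sum the joint over all cells that match the fixed values of l, d, c."""
    total = 0.0
    for (l, d, c), p in joint.items():
        cell = {"l": l, "d": d, "c": c}
        if all(cell[name] == value for name, value in fixed.items()):
            total += p
    return total


# Marginal check: P(L=1, D=1) - P(L=1) P(D=1) is zero under independence.
marginal_gap = prob(l=1, d=1) - prob(l=1) * prob(d=1)

# Conditional check given C=1: a non-zero gap means L and D are associated.
pc = prob(c=1)
conditional_gap = prob(l=1, d=1, c=1) / pc - (prob(l=1, c=1) / pc) * (prob(d=1, c=1) / pc)

print(marginal_gap, conditional_gap)
```

A pair-wise similarity measure computed on L and D alone would see no association here, which is why the slide calls conclusions from pair-wise dependencies potentially incomplete.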
What we need: Graphical structure
Approach: Bayesian structure learning
- Models the joint probability distribution across the given informed set of variables
- Incorporates conditional dependencies between a given set of variables in an iterative manner
Data?
o multivariate: more than one variable is measured
o Can be longitudinal or cross-sectional
longitudinal:
a continuous process is sampled as a function of time, resulting in a time series
challenging to obtain as several factors have to be controlled
cross-sectional:
replicate measurements of a continuous process are sampled in a given time window (snapshot)
relatively easier to obtain
Note: The approaches to be discussed implicitly assume that the properties of the data are preserved across
the replicate realizations.
Question: Given cross-sectional data on Loss of Taste (Yes/No), Disease Severity (Yes/No) and Result of
COVID test (+/-), can we model the association between them?
Three popular approaches for structure learning (static):
o Constraint-based: Learn the structure using conditional independence tests
o Search and score: Learn the structure that best fits the data using a greedy search with a scoring
criterion
o Hybrid: Learn the structure using a combination of constraint-based and search-and-score
approaches
Subject  C (+/-)  D (Y/N)  L (Y/N)
1        +        Y        Y
2        +        Y        N
3        -        Y        Y
4        -        N        Y
...      ...      ...      ...
[Figure: nodes C, D and L with the edges between them unknown (?)]
Bayesian network structure learning:
o Exhaustive Enumeration: Number of possible structures grows super-exponentially with the number
of nodes n¹.
a_n = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} 2^{k(n-k)} a_{n-k}, \qquad a_0 = 1
Note: Exhaustive enumeration in general is not computationally feasible from a practical standpoint.
¹ Robinson, R. W. "Counting Labeled Acyclic Digraphs." In New Directions in Graph Theory (Ed. F. Harary). New
Nodes DAGs
1 1
2 3
3 25
4 543
5 29281
. .
. .
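The recurrence for the number of labeled DAGs can be evaluated directly, which reproduces the table above and shows how quickly the count grows. The function name `count_dags` is introduced here for illustration only.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes via the recurrence with a_0 = 1."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k - 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in range(1, 11):
    print(n, count_dags(n))
```

Python integers have arbitrary precision, so the loop can be extended to the 28 variables of a data set such as HEPMASS to see why exhaustive enumeration is impractical.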
Markov Equivalence Class: probabilistically indistinguishable graphical structures.
p(L, D, C) = p(L \mid C) \, p(C) \, p(D \mid C)
p(L, D, C) = p(L \mid C) \, p(D) \, p(C \mid D)
p(L, D, C) = p(L) \, p(C \mid L) \, p(D \mid C)
Note: Even if exhaustive enumeration were possible, structures can be learned only up to the Markov
equivalence class.
[Figure: three Markov-equivalent structures over the nodes C, D and L]
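A quick numerical check can make the notion of probabilistically indistinguishable structures concrete: starting from one joint distribution over L, D and C, each of the three factorizations listed for the Markov equivalence class reproduces the same joint. The parameter values are hypothetical and serve only to generate a joint in which L and D are independent given C.

```python
from itertools import product

# Hypothetical parameters used only to generate a joint distribution.
p_c = {0: 0.7, 1: 0.3}
p_l1_given_c = {0: 0.1, 1: 0.8}  # P(L=1 | C=c)
p_d1_given_c = {0: 0.2, 1: 0.6}  # P(D=1 | C=c)


def bern(p1, x):
    """Probability of outcome x for a binary variable with P(x=1) = p1."""
    return p1 if x == 1 else 1.0 - p1


# Cells are indexed as (l, d, c).
joint = {
    (l, d, c): bern(p_l1_given_c[c], l) * p_c[c] * bern(p_d1_given_c[c], d)
    for l, d, c in product((0, 1), repeat=3)
}


def marginal(keep):
    """Marginalize the joint onto the cell positions listed in `keep`."""
    out = {}
    for cell, p in joint.items():
        key = tuple(cell[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


pL, pD, pC = marginal((0,)), marginal((1,)), marginal((2,))
pLC, pDC = marginal((0, 2)), marginal((1, 2))


def fork(l, d, c):      # p(L|C) p(C) p(D|C)
    return (pLC[(l, c)] / pC[(c,)]) * pC[(c,)] * (pDC[(d, c)] / pC[(c,)])


def chain_from_d(l, d, c):  # p(L|C) p(D) p(C|D)
    return (pLC[(l, c)] / pC[(c,)]) * pD[(d,)] * (pDC[(d, c)] / pD[(d,)])


def chain_from_l(l, d, c):  # p(L) p(C|L) p(D|C)
    return pL[(l,)] * (pLC[(l, c)] / pL[(l,)]) * (pDC[(d, c)] / pC[(c,)])


max_diff = max(
    abs(f(*cell) - p)
    for cell, p in joint.items()
    for f in (fork, chain_from_d, chain_from_l)
)
print(max_diff)
```

Because all three factorizations fit the same distribution equally well, observational data alone cannot prefer one edge orientation over the others.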
Search and Score (Hill Climbing):
P(G \mid D) \propto P(D \mid G) \, P(G)
where P(D \mid G) is the likelihood and P(G) is the prior.
Theoretical considerations on the complexity of greedy search under certain assumptions have been
investigated¹
¹ Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing
Search and Score (Hill Climbing)
Hill-climbing is a sequential algorithm. The present structure G* is generated by modifying the
previous structure (G) as in Step 4, and its score is computed, in an iterative manner
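The modification step can be sketched as neighbor generation. The sketch assumes the usual single-edge move set for hill-climbing over DAGs (add, delete or reverse one edge, keeping the graph acyclic); the steps of the algorithm shown on the original slide are not reproduced in this transcript, so this move set and the names `is_acyclic` and `neighbors` are assumptions.

```python
def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge; a leftover means a cycle."""
    remaining = set(nodes)
    edges = set(edges)
    while remaining:
        roots = {
            v for v in remaining
            if not any(t == v and s in remaining for s, t in edges)
        }
        if not roots:
            return False
        remaining -= roots
    return True


def neighbors(nodes, edges):
    """Yield every DAG reachable from `edges` by one addition, deletion or reversal."""
    edges = frozenset(edges)
    for u in nodes:
        for v in nodes:
            if u == v:
                continue
            if (u, v) in edges:
                yield edges - {(u, v)}  # deletion never creates a cycle
                flipped = (edges - {(u, v)}) | {(v, u)}
                if is_acyclic(nodes, flipped):
                    yield flipped  # reversal
            elif (v, u) not in edges:
                added = edges | {(u, v)}
                if is_acyclic(nodes, added):
                    yield added  # addition


nodes = ("C", "D", "L")
current = {("C", "D"), ("D", "L")}
candidates = list(neighbors(nodes, current))
print(len(candidates))
```

In a full search each candidate would be scored and the best-scoring one would become the next G; only the candidate generation is shown here.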
BIC Score = \sum_{i=1}^{n} \log P(X_i \mid \Pi_{X_i}) - \frac{d}{2} \log n
Opportunities for distributing the computation in the hill climbing approach
o The potential structures interrogated in Step 4(a) can be distributed
o BIC score of a candidate structure is the sum of the scores of its local structures, hence can be
distributed
o The greedy aspect of hill-climbing in conjunction with Markov equivalence can result in convergence
to a local optimum, which encourages repeating the procedure with multiple random restarts; these in
turn can be distributed
In the BIC score, (d/2) log n is the regularization term, with d = number of parameters
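A minimal sketch of the BIC score for discrete data follows, written so that the decomposition into local scores is explicit. It assumes maximum-likelihood estimates from counts, takes the sample size as the n inside the logarithm, and counts parameters per node as (number of states - 1) times the number of parent configurations. The function names and the toy data are hypothetical, and the sketch uses only the standard library rather than the Pandas/NetworkX implementation mentioned in the deck.

```python
from collections import Counter
from math import log, prod


def local_bic(rows, node, parents):
    """Local score of one node given its parents: log-likelihood minus penalty."""
    n_samples = len(rows)
    joint_counts = Counter((tuple(r[p] for p in parents), r[node]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    # Maximum-likelihood log-likelihood: sum of N_jk * log(N_jk / N_j).
    loglik = sum(
        count * log(count / parent_counts[config])
        for (config, _), count in joint_counts.items()
    )
    n_states = len({r[node] for r in rows})
    n_configs = prod(len({r[p] for r in rows}) for p in parents)  # 1 if no parents
    d = (n_states - 1) * n_configs
    return loglik - 0.5 * d * log(n_samples)


def bic(rows, structure):
    """Network score as the sum of local scores; `structure` maps node -> parents."""
    return sum(local_bic(rows, node, parents) for node, parents in structure.items())


# Toy data (invented): D always equals C.
rows = [{"C": 0, "D": 0}] * 20 + [{"C": 1, "D": 1}] * 20
with_edge = {"C": [], "D": ["C"]}
without_edge = {"C": [], "D": []}
print(bic(rows, with_edge), bic(rows, without_edge))
```

Since the network score is a plain sum, a single-edge change only requires recomputing the local score of the node whose parent set changed, and the local scores can be evaluated on separate workers.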
Implementation: Architecture
*Image from IC922 Redbook
x86:
Server: HPE ProLiant DL580 servers
CPU Type: Intel Xeon EX-series
Cores per node: 16
DRAM: 512GB
POWER9:
Server: IC922
CPU Type: DD2.3 POWER9 processor modules
Cores per node: 160 virtual cores
Access to up to 32 DIMMs
Sustained bandwidth 28.8 GB
Implementation:
o Data description: HEPMASS¹,² (10.5 x 10^6 samples comprising 28 variables, Baldi et al., 2016¹). All
continuous normalized features were discretized into binary categorical variables by
thresholding about their mean.
o Python Implementation:
Bayesian network using Pandas, NetworkX
¹ Baldi P, et al. [2016] Parameterized Neural Networks for High-Energy Physics. The Eur. Phys. J. C 76(235).
² Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing.
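The discretization step in the data description can be sketched as below. The deck does not say which side of the mean maps to which category or how ties are handled, so coding values strictly above the mean as 1 is an assumption, as is the function name; plain lists stand in for the Pandas columns used in the actual implementation.

```python
from statistics import mean


def binarize_about_mean(columns):
    """Map each continuous column to 0/1 by thresholding about its own mean.

    `columns` is a dict of name -> list of floats. Values strictly above the
    column mean become 1, all others 0 (an assumed convention).
    """
    binary = {}
    for name, values in columns.items():
        threshold = mean(values)
        binary[name] = [1 if value > threshold else 0 for value in values]
    return binary


# Invented toy values, not taken from HEPMASS.
toy = {"f0": [0.0, 1.0, 2.0, 5.0], "f1": [-1.0, 1.0, -1.0, 1.0]}
print(binarize_about_mean(toy))
```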
[Figure: candidate network structures over the nodes A, B, C, D and E]
Multiple Cores Architecture: Dask Distributed
Python/Dask APIs
Parallel Restart: spawning multiple Hill Climbing instances on the data
SHA-256 hash confirms uniqueness of visited graph
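The two ideas on this slide, parallel restarts and hashing visited graphs, can be sketched together. This is a stand-in rather than the deck's implementation: `concurrent.futures` from the standard library replaces Dask Distributed, `random_restart` only draws a random starting DAG instead of running a full hill-climbing search, and the canonical form fed to SHA-256 (a sorted edge list) is an assumption.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def graph_hash(edges):
    """SHA-256 digest of a sorted edge list, so equal graphs hash equally."""
    canonical = ";".join(f"{u}->{v}" for u, v in sorted(edges))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def random_restart(seed, nodes=("A", "B", "C", "D", "E"), n_edges=4):
    """Placeholder for one hill-climbing instance: returns a random starting DAG."""
    rng = random.Random(seed)
    order = list(nodes)
    rng.shuffle(order)
    # Edges only point forward in the shuffled order, which guarantees acyclicity.
    forward_pairs = [
        (order[i], order[j])
        for i in range(len(order))
        for j in range(i + 1, len(order))
    ]
    return frozenset(rng.sample(forward_pairs, n_edges))


# Spawn several restarts concurrently, one seed per instance.
with ThreadPoolExecutor(max_workers=4) as pool:
    starts = list(pool.map(random_restart, range(8)))

# Keep one copy per distinct graph, keyed by its digest.
visited = {}
for edges in starts:
    visited.setdefault(graph_hash(edges), edges)

print(len(starts), len(visited))
```

Sorting the edges before hashing is what makes the digest independent of the order in which edges were added, so two workers that reach the same graph by different routes produce the same key.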
Performance of structure learning on POWER and x86:
Mean and standard deviation of the computational time across 5 runs of the HEPMASS data with
Hill-Climbing. A two-sample t-test with unequal variance was used to compare the times between x86 and
POWER architectures (# implies significant difference).
The computational times differed significantly (p < 0.001) between the x86 and the POWER
architectures, with the POWER architecture taking considerably less time than x86. As expected, BIC
score takes less computational time than K2 score and these scores
[Figure: Performance of x86 and POWER 9 on HEPMASS (BIC Score). x-axis: Max Fan in (1, 2, 3); y-axis: Time (Seconds), 0 to 50000; series: x86, POWER; # at each Max Fan in value]
[Figure: Performance of x86 and POWER 9 on HEPMASS (K2 Score). x-axis: Max Fan in (1, 2, 3); y-axis: Time (Seconds), 0 to 50000; series: x86, POWER; # at each Max Fan in value]
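The two-sample t-test with unequal variance (Welch's test) used for these comparisons can be sketched as follows. The timing values are invented placeholders, not the measurements behind the figures, and the sketch stops at the test statistic and degrees of freedom because the standard library has no t-distribution; in practice `scipy.stats.ttest_ind` with `equal_var=False` returns the p-value as well.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical run times in seconds for 5 runs on each architecture.
x86_times = [100.0, 102.0, 98.0, 101.0, 99.0]
power_times = [50.0, 52.0, 48.0, 51.0, 49.0]

t_stat, dof = welch_t(x86_times, power_times)
print(t_stat, dof)
```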
Performance of structure learning on POWER and x86 with varying Map Tasks:
Mean and standard deviation of the computational time across 5 runs of the HEPMASS data with
Hill-Climbing. A two-sample t-test with unequal variance was used to compare the times between x86 and
POWER architectures (# implies significant difference).
There was a statistically significant difference in the computational time between the x86 and the POWER
architectures when random restarts were distributed as map task jobs. As the number of map tasks was
increased, the computation time decreased across both POWER and x86, and the separation in the average
time between x86 and POWER increased.
[Figure: Performance of POWER and x86 with varying Map Tasks (BIC Score). x-axis: Map Tasks (2, 4, 8, 16, 32, 64, 128); y-axis: Time (Seconds), 0 to 50000; series: x86, POWER. # corresponds to p < 0.05 (2 to 32 map tasks); * corresponds to p < 0.0001 (64 and 128 map tasks)]
Healthcare – current trends:
o Explosion in Digital Healthcare Data:
- Source Systems: Continued digitization from multiple sources (EHR, Claims, Registries, IoT) and multiple types
(Text, Image, Signals)
- Multiscale Profiles: Emphasis on capturing the complete description of patients.
- Common Data Models: Develop approaches for sharing observational healthcare data (OMOP/OHDSI) across
multiple organizations and research networks (e.g. HIE, PCORNet)
- High-throughput: molecular data (e.g. Next Generation Sequencing)
- FHIR: Development of Fast Healthcare Interoperability Resources (FHIR) for enhanced interoperability across systems
and devices
o Explosion in Analytics Adoption:
- Descriptive, Predictive, Prescriptive Analytics
- Shift from storage to analytics and consensus-based to evidence-based/data-driven approaches to impact
outcomes/KPIs.
- Surge in the adoption of Machine Learning (ML) and Artificial Intelligence (AI) approaches.
Healthcare Applications:
Graphical Models – where do they fit in
o Healthcare data sets are inherently multivariate and noisy, owing to
several factors. Probabilistic graphical models are especially suited to
handling noisy data.
o Associations in multivariate healthcare data may be unknown.
Graphical models can discover novel associations (hypothesis
generation) in addition to validating known associations (hypothesis
testing). Deciphering these associations is critical in prescribing
targeted interventions.
o Graphical models fall under ML and AI¹. They can be used for descriptive,
predictive and prescriptive analytics (e.g. Naïve Bayes Classifier). AI
aspect of graphical models: answer queries posed from the evidence
provided about a disease.
o Healthcare applications of graphical models include: Diagnostic
Reasoning, Prognostic Reasoning and Treatment Selection, Discovering
functional associations²
o Emphasis on inferring causal associations from observational
healthcare data with potential to complement classical approaches (e.g.
RCT³), RCTs being idealizations.
o Interpretable and easily visualized for critical evaluation in healthcare
settings.
Need: Architectures and programming environment that can implement
¹ Russell, S., Norvig, P. [2020] Artificial Intelligence: A Modern Approach, 4th ed
² Lucas PJF et al. [2004] Bayesian networks in biomedicine and health-care. Artif. Intell. Med. 30(3):201-14
³ Berwick, D [2008] The Science of Improvement, JAMA, 1182-1184
⁴ Mclachlan, S et al. [2020] Bayesian networks in healthcare: Distribution by medical condition. Artificial Intelligence in Medicine. 107, 101912
Summary
o Structure learning is computationally intensive, especially across large data sets and large numbers of variables
o Preliminary findings revealed marked improvement in performance using POWER architectures in
addressing computational challenges of structure learning approaches such as hill-climbing
o Need for a more detailed investigation using a battery of data sets and across distinct graphical model
algorithms
o Graphical modeling approaches in general have considerable healthcare applications. Their ability to reason
under uncertainty makes them especially well suited to healthcare analytics.
o https://onstituteacademy.herokuapp.com
Acknowledgements
Marco Scutari, Ph.D. Senior Researcher, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA),
Switzerland
Terry Leatherland, Trish Froeschle, Thomas Prokop, IBM, USA
Ganesan Narayanswami, OpenPOWER leader in Education and Research
1 von 20

Recomendados

ME Synopsis von
ME SynopsisME Synopsis
ME SynopsisPoonam Debnath
264 views4 Folien
Textual Data Partitioning with Relationship and Discriminative Analysis von
Textual Data Partitioning with Relationship and Discriminative AnalysisTextual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative AnalysisEditor IJMTER
293 views12 Folien
ON DISTRIBUTED FUZZY DECISION TREES FOR BIG DATA von
 ON DISTRIBUTED FUZZY DECISION TREES FOR BIG DATA ON DISTRIBUTED FUZZY DECISION TREES FOR BIG DATA
ON DISTRIBUTED FUZZY DECISION TREES FOR BIG DATANexgen Technology
287 views6 Folien
A Novel Approach to Mathematical Concepts in Data Mining von
A Novel Approach to Mathematical Concepts in Data MiningA Novel Approach to Mathematical Concepts in Data Mining
A Novel Approach to Mathematical Concepts in Data Miningijdmtaiir
41 views4 Folien
WITH SEMANTICS AND HIDDEN MARKOV MODELS TO AN ADAPTIVE LOG FILE PARSER von
WITH SEMANTICS AND HIDDEN MARKOV MODELS TO AN ADAPTIVE LOG FILE PARSERWITH SEMANTICS AND HIDDEN MARKOV MODELS TO AN ADAPTIVE LOG FILE PARSER
WITH SEMANTICS AND HIDDEN MARKOV MODELS TO AN ADAPTIVE LOG FILE PARSERijnlc
7 views14 Folien
IEEE Datamining 2016 Title and Abstract von
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstracttsysglobalsolutions
89 views14 Folien

Más contenido relacionado

Was ist angesagt?

Parallel KNN for Big Data using Adaptive Indexing von
Parallel KNN for Big Data using Adaptive IndexingParallel KNN for Big Data using Adaptive Indexing
Parallel KNN for Big Data using Adaptive IndexingIRJET Journal
42 views4 Folien
A0360109 von
A0360109A0360109
A0360109iosrjournals
549 views9 Folien
Dimensionality Reduction Techniques for Document Clustering- A Survey von
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyIJTET Journal
385 views4 Folien
Big Data Processing using a AWS Dataset von
Big Data Processing using a AWS DatasetBig Data Processing using a AWS Dataset
Big Data Processing using a AWS DatasetVishva Abeyrathne
112 views3 Folien
C0312023 von
C0312023C0312023
C0312023iosrjournals
307 views4 Folien
A Comparison of Computation Techniques for DNA Sequence Comparison von
A Comparison of Computation Techniques for DNA Sequence Comparison A Comparison of Computation Techniques for DNA Sequence Comparison
A Comparison of Computation Techniques for DNA Sequence Comparison IJORCS
458 views6 Folien

Was ist angesagt?(19)

Parallel KNN for Big Data using Adaptive Indexing von IRJET Journal
Parallel KNN for Big Data using Adaptive IndexingParallel KNN for Big Data using Adaptive Indexing
Parallel KNN for Big Data using Adaptive Indexing
IRJET Journal42 views
Dimensionality Reduction Techniques for Document Clustering- A Survey von IJTET Journal
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A Survey
IJTET Journal385 views
A Comparison of Computation Techniques for DNA Sequence Comparison von IJORCS
A Comparison of Computation Techniques for DNA Sequence Comparison A Comparison of Computation Techniques for DNA Sequence Comparison
A Comparison of Computation Techniques for DNA Sequence Comparison
IJORCS458 views
Ensemble based Distributed K-Modes Clustering von IJERD Editor
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes Clustering
IJERD Editor496 views
IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic... von IRJET Journal
IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic...IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic...
IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic...
IRJET Journal72 views
Ba2419551957 von IJMER
Ba2419551957Ba2419551957
Ba2419551957
IJMER285 views
A TALE of DATA PATTERN DISCOVERY IN PARALLEL von Jenny Liu
A TALE of DATA PATTERN DISCOVERY IN PARALLELA TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
Jenny Liu124 views
Clustering for Stream and Parallelism (DATA ANALYTICS) von DheerajPachauri
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
DheerajPachauri1.3K views
Investigating the 3D structure of the genome with Hi-C data analysis von tuxette
Investigating the 3D structure of the genome with Hi-C data analysisInvestigating the 3D structure of the genome with Hi-C data analysis
Investigating the 3D structure of the genome with Hi-C data analysis
tuxette1.6K views
Reproducibility and differential analysis with selfish von tuxette
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfish
tuxette301 views

Similar a Graphical Structure Learning accelerated with POWER9

Implementing a neural network potential for exascale molecular dynamics von
Implementing a neural network potential for exascale molecular dynamicsImplementing a neural network potential for exascale molecular dynamics
Implementing a neural network potential for exascale molecular dynamicsPFHub PFHub
1.1K views15 Folien
Data dissemination and materials informatics at LBNL von
Data dissemination and materials informatics at LBNLData dissemination and materials informatics at LBNL
Data dissemination and materials informatics at LBNLAnubhav Jain
348 views1 Folie
PointNet von
PointNetPointNet
PointNetResearch Fellow
3.5K views13 Folien
Achieving Portability and Efficiency in a HPC Code Using Standard Message-pas... von
Achieving Portability and Efficiency in a HPC Code Using Standard Message-pas...Achieving Portability and Efficiency in a HPC Code Using Standard Message-pas...
Achieving Portability and Efficiency in a HPC Code Using Standard Message-pas...Derryck Lamptey, MPhil, CISSP
173 views9 Folien
Massive parallelism with gpus for centrality ranking in complex networks von
Massive parallelism with gpus for centrality ranking in complex networksMassive parallelism with gpus for centrality ranking in complex networks
Massive parallelism with gpus for centrality ranking in complex networksijcsit
279 views17 Folien
Ijciet 10 01_153-2 von
Ijciet 10 01_153-2Ijciet 10 01_153-2
Ijciet 10 01_153-2IAEME Publication
30 views12 Folien

Similar a Graphical Structure Learning accelerated with POWER9(20)

Implementing a neural network potential for exascale molecular dynamics von PFHub PFHub
Implementing a neural network potential for exascale molecular dynamicsImplementing a neural network potential for exascale molecular dynamics
Implementing a neural network potential for exascale molecular dynamics
PFHub PFHub1.1K views
Data dissemination and materials informatics at LBNL von Anubhav Jain
Data dissemination and materials informatics at LBNLData dissemination and materials informatics at LBNL
Data dissemination and materials informatics at LBNL
Anubhav Jain348 views
Massive parallelism with gpus for centrality ranking in complex networks von ijcsit
Massive parallelism with gpus for centrality ranking in complex networksMassive parallelism with gpus for centrality ranking in complex networks
Massive parallelism with gpus for centrality ranking in complex networks
ijcsit279 views
Graph Signal Processing for Machine Learning A Review and New Perspectives - ... von lauratoni4
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
lauratoni4198 views
Model Evaluation in the land of Deep Learning von Pramit Choudhary
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep Learning
Pramit Choudhary735 views
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS von ijcsit
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKSEVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
ijcsit12 views
An Efficient Algorithm to Calculate The Connectivity of Hyper-Rings Distribut... von ijitcs
An Efficient Algorithm to Calculate The Connectivity of Hyper-Rings Distribut...An Efficient Algorithm to Calculate The Connectivity of Hyper-Rings Distribut...
An Efficient Algorithm to Calculate The Connectivity of Hyper-Rings Distribut...
ijitcs143 views
Laplacian-regularized Graph Bandits von lauratoni4
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Bandits
lauratoni4122 views
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs von Jason Riedy
Scalable and Efficient Algorithms for Analysis of Massive, Streaming GraphsScalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Jason Riedy762 views
Engineering Data Science Objectives for Social Network Analysis von David Gleich
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network Analysis
David Gleich452 views
1104.0355 von sudddd44
1104.03551104.0355
1104.0355
sudddd44333 views
X-TREPAN: A MULTI CLASS REGRESSION AND ADAPTED EXTRACTION OF COMPREHENSIBLE D... von cscpconf
X-TREPAN: A MULTI CLASS REGRESSION AND ADAPTED EXTRACTION OF COMPREHENSIBLE D...X-TREPAN: A MULTI CLASS REGRESSION AND ADAPTED EXTRACTION OF COMPREHENSIBLE D...
X-TREPAN: A MULTI CLASS REGRESSION AND ADAPTED EXTRACTION OF COMPREHENSIBLE D...
cscpconf149 views
X-TREPAN : A Multi Class Regression and Adapted Extraction of Comprehensible ... von csandit
X-TREPAN : A Multi Class Regression and Adapted Extraction of Comprehensible ...X-TREPAN : A Multi Class Regression and Adapted Extraction of Comprehensible ...
X-TREPAN : A Multi Class Regression and Adapted Extraction of Comprehensible ...
csandit523 views
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds von Subhajit Sahu
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Subhajit Sahu17 views
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds von Subhajit Sahu
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Subhajit Sahu44 views
Scalable Constrained Spectral Clustering von 1crore projects
Scalable Constrained Spectral ClusteringScalable Constrained Spectral Clustering
Scalable Constrained Spectral Clustering
1crore projects74 views

Más de Ganesan Narayanasamy

Chip Design Curriculum development Residency program von
Chip Design Curriculum development Residency programChip Design Curriculum development Residency program
Chip Design Curriculum development Residency programGanesan Narayanasamy
153 views9 Folien
Basics of Digital Design and Verilog von
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and VerilogGanesan Narayanasamy
392 views25 Folien
180 nm Tape out experience using Open POWER ISA von
180 nm Tape out experience using Open POWER ISA180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISAGanesan Narayanasamy
247 views10 Folien
Workload Transformation and Innovations in POWER Architecture von
Workload Transformation and Innovations in POWER Architecture Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture Ganesan Narayanasamy
217 views31 Folien
OpenPOWER Workshop at IIT Roorkee von
OpenPOWER Workshop at IIT RoorkeeOpenPOWER Workshop at IIT Roorkee
OpenPOWER Workshop at IIT RoorkeeGanesan Narayanasamy
167 views10 Folien
Deep Learning Use Cases using OpenPOWER systems von
Deep Learning Use Cases using OpenPOWER systemsDeep Learning Use Cases using OpenPOWER systems
Deep Learning Use Cases using OpenPOWER systemsGanesan Narayanasamy
175 views19 Folien

Más de Ganesan Narayanasamy(20)

Workload Transformation and Innovations in POWER Architecture von Ganesan Narayanasamy
Workload Transformation and Innovations in POWER Architecture Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ... von Ganesan Narayanasamy
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems von Ganesan Narayanasamy
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systemsAI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems

Último

LLMs in Production: Tooling, Process, and Team Structure von
LLMs in Production: Tooling, Process, and Team StructureLLMs in Production: Tooling, Process, and Team Structure
LLMs in Production: Tooling, Process, and Team StructureAggregage
57 views77 Folien
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... von
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...ShapeBlue
183 views18 Folien
Qualifying SaaS, IaaS.pptx von
Qualifying SaaS, IaaS.pptxQualifying SaaS, IaaS.pptx
Qualifying SaaS, IaaS.pptxSachin Bhandari
1.1K views8 Folien
"Running students' code in isolation. The hard way", Yurii Holiuk von
"Running students' code in isolation. The hard way", Yurii Holiuk "Running students' code in isolation. The hard way", Yurii Holiuk
"Running students' code in isolation. The hard way", Yurii Holiuk Fwdays
36 views34 Folien
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue von
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueShapeBlue
139 views15 Folien
Optimizing Communication to Optimize Human Behavior - LCBM von
Optimizing Communication to Optimize Human Behavior - LCBMOptimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBMYaman Kumar
38 views49 Folien

Último(20)

LLMs in Production: Tooling, Process, and Team Structure von Aggregage
LLMs in Production: Tooling, Process, and Team StructureLLMs in Production: Tooling, Process, and Team Structure
LLMs in Production: Tooling, Process, and Team Structure
Aggregage57 views
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... von ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue183 views
"Running students' code in isolation. The hard way", Yurii Holiuk von Fwdays
"Running students' code in isolation. The hard way", Yurii Holiuk "Running students' code in isolation. The hard way", Yurii Holiuk
"Running students' code in isolation. The hard way", Yurii Holiuk
Fwdays36 views
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue von ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
ShapeBlue139 views
Optimizing Communication to Optimize Human Behavior - LCBM von Yaman Kumar
Optimizing Communication to Optimize Human Behavior - LCBMOptimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBM
Yaman Kumar38 views
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ... von ShapeBlue
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
ShapeBlue129 views
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit... von ShapeBlue
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
ShapeBlue162 views
NTGapps NTG LowCode Platform von Mustafa Kuğu

Graphical Structure Learning accelerated with POWER9

  • 1. Arghya Kusum Das, Ph.D., Assistant Professor, University of Wisconsin-Platteville. In collaboration with Radha Nagarajan, Ph.D., Director, COSH, Marshfield Clinic Health System (Digital Health, Data Science, Bioinformatics, RWE). Graphical Structure Learning Accelerated with POWER9
  • 2. Overview
      o Overview of Graphical Models
      o Implementation
      o Preliminary Findings
      o Healthcare Applications
  • 3. Graphs/Networks: composed of nodes and edges
      o Nodes (vertices): represent the entities of interest.
      o Edges: represent the associations/relationships between the nodes.
      o Graphical models: model the associations between the entities as a graph.
      o Example: the nodes are COVID subjects; the edges are associations between the COVID subjects (e.g. contact tracing).
      [Image © searchengineland.com]
  • 4. Why Graphical Models?
      o System-level abstractions: Graphical models can reveal system-level properties and behavior not apparent in the reductionist representation. System-level abstractions are especially critical in developing targeted interventions.
        - e.g. model COVID spread in a given community from contact tracing [1]; use the model to assist in targeted community-based interventions/policies.
        - e.g. model the signaling mechanism initiated by the COVID spike protein; use the model to identify potential target molecules for drugs to minimize disease severity/inflammation [2].
      o In-silico models: Graphical models can be experimented on in a controlled and cost-effective manner. This includes posing questions to these models (e.g. inference).
        - e.g. given the evidence that a subject has cough, fever, sore throat and shortness of breath, determine the probability that the subject is COVID +ve.
      o Causal associations: Graphical models may reveal causal associations [3] under certain implicit assumptions. (Note: we are attempting to decipher causality from observational data!)
      [1] https://www.cdc.gov/coronavirus/2019-ncov/daily-life-coping/contact-tracing.html
      [2] https://www.cebm.net/covid-19/dexamethasone/
      [3] Pearl, J [2009] Causality: Models, Reasoning and Inference.
  • 5. Problem:
      o What we have: data across an informed set of variables (D).
      o What we need: a graphical structure (G) representing the associations between these variables.
      o Pair-wise dependencies: direct associations between a given pair of nodes, determined using similarity measures.
      Note: Associations between a pair of variables may not be direct and can be mediated through a third variable. Conclusions based on pair-wise dependencies, while helpful, may be incomplete.
      e.g. Loss of Taste (L) and Disease Severity (D) may not be associated as such (i.e. marginally independent). However, L and D may be associated given that the subject has COVID (C).
      [Figure: graphs over the nodes L and D, and over the nodes C, D and L]
  • 6. What we need: Graphical structure
      Approach: Bayesian structure learning
        - Models the joint probability distribution across the given informed set of variables
        - Incorporates conditional dependencies between a given set of variables in an iterative manner
      [Figure: graph over the nodes C, D and L]
  • 7. Data?
      o Multivariate: more than one variable is measured.
      o Can be longitudinal or cross-sectional:
        - Longitudinal: a continuous process is sampled as a function of time, resulting in a time series. Challenging to obtain, as several factors have to be controlled.
        - Cross-sectional: replicate measurements of a continuous process are sampled in a given time window (snapshot). Relatively easier to obtain.
      Note: The approaches to be discussed implicitly assume that the properties of the data are preserved across the replicate realizations.
  • 8. Question: Given cross-sectional data on Loss of Taste (Yes/No), Disease Severity (Yes/No) and Result of COVID test (+/-), can we model the association between them?
      Three popular approaches for structure learning (static):
      o Constraint-based: learn the structure using conditional independence tests.
      o Search and score: learn the structure that best fits the data using a greedy search with a scoring criterion.
      o Hybrid: learn the structure using a combination of constraint-based and search-and-score approaches.

      Subject   C (+/-)   D (Y/N)   L (Y/N)
      1         +         Y         Y
      2         +         Y         N
      3         -         Y         Y
      4         -         N         Y
      .         .         .         .

      [Figure: nodes C, D and L with unknown edges (?)]
  • 9. Bayesian network structure learning:
      o Exhaustive enumeration: the number of possible structures grows super-exponentially with the number of nodes n [1].

        $a_n = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} 2^{k(n-k)} a_{n-k}, \qquad a_0 = 1$

        Nodes   DAGs
        1       1
        2       3
        3       25
        4       543
        5       29281
        .       .

      Note: Exhaustive enumeration in general is not computationally feasible from a practical standpoint.
      [1] Robinson, R. W. "Counting Labeled Acyclic Digraphs." In New Directions in Graph Theory (Ed. F. Harary). New
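The recurrence on this slide can be evaluated directly. The following is a minimal Python sketch; the function name `count_dags` is an illustrative choice and does not come from the slides.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via the recurrence on the slide."""
    if n == 0:
        return 1  # a_0 = 1
    return sum(
        (-1) ** (k - 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in range(1, 6):
    print(n, count_dags(n))
```

For n = 1 to 5 this reproduces the counts in the table above (1, 3, 25, 543, 29281). Larger n shows how quickly exhaustive enumeration becomes impractical.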
  • 10. Markov Equivalence Class: probabilistically indistinguishable graphical structures.

        $p(L, D, C) = p(L \mid C)\, p(C)\, p(D \mid C)$
        $p(L, D, C) = p(L \mid C)\, p(D)\, p(C \mid D)$
        $p(L, D, C) = p(L)\, p(C \mid L)\, p(D \mid C)$

      Note: Even if exhaustive enumeration were possible, structures can be learned only up to the Markov equivalence class.
      [Figure: the three corresponding graphs over the nodes C, D and L]
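One way to see why the three factorizations on this slide cannot be told apart from data is to apply Bayes' rule to one pair of factors at a time. The short derivation below is a sketch of that argument and is not taken from the slides.

```latex
\begin{align*}
p(L \mid C)\, p(C)\, p(D \mid C)
  &= p(L \mid C)\, \bigl[\, p(C)\, p(D \mid C) \,\bigr]
   = p(L \mid C)\, p(D)\, p(C \mid D), \\
p(L \mid C)\, p(C)\, p(D \mid C)
  &= \bigl[\, p(L \mid C)\, p(C) \,\bigr]\, p(D \mid C)
   = p(L)\, p(C \mid L)\, p(D \mid C).
\end{align*}
```

Each step only rewrites a joint term, $p(C)\,p(D \mid C) = p(C, D) = p(D)\,p(C \mid D)$ and likewise for $L$ and $C$, so all three structures describe the same joint distribution.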
  • 11. Search and Score (Hill Climbing):

        $P(G \mid D) \propto P(D \mid G)\, P(G)$

      where $P(D \mid G)$ is the likelihood and $P(G)$ is the prior.
      Theoretical considerations on the complexity of greedy search under certain assumptions have been investigated [1].
      [1] Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing
  • 12. Search and Score (Hill Climbing)
      Hill-climbing is a sequential algorithm. The score of the present structure G* is generated by modifying the previous structure (G), as in Step 4, in an iterative manner.

        $\text{BIC Score} = \sum_{i=1}^{n} \log P(X_i \mid \Pi_{X_i}) - \frac{d}{2} \log n$

      The second term is the regularization term, with d = #parameters.
      Opportunities for distributing the computation in the hill-climbing approach:
      o The potential structures interrogated in Step 4(a) can be distributed.
      o The BIC score of a candidate structure is the sum of the scores of its local structures, hence it can be distributed.
      o The greedy aspect of hill-climbing, in conjunction with Markov equivalence, can result in convergence to a local optimum. This encourages repeating the procedure with multiple random restarts, which in turn can be distributed.
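The decomposability of the BIC score mentioned on this slide can be illustrated with a small sketch for discrete data. This is not the authors' implementation: the helper names `local_bic` and `bic`, the `levels` mapping and the toy rows are all hypothetical, and the penalty is assumed to use the number of samples inside the logarithm.

```python
import math
from collections import Counter


def local_bic(rows, child, parents, levels):
    """BIC contribution of one node given its parent set (hypothetical helper)."""
    n_samples = len(rows)
    # Counts of (parent configuration, child value) and of parent configuration.
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    # d = free parameters of this node: (child levels - 1) x parent configurations.
    d = (levels[child] - 1) * math.prod(levels[p] for p in parents)
    return loglik - 0.5 * d * math.log(n_samples)


def bic(rows, dag, levels):
    """Network score as a sum of local scores; dag maps node -> list of parents."""
    return sum(local_bic(rows, node, parents, levels) for node, parents in dag.items())


# Toy data (assumption): L copies C exactly, so the edge C -> L should help.
rows = [{"C": 0, "L": 0}, {"C": 0, "L": 0}, {"C": 1, "L": 1}, {"C": 1, "L": 1}]
levels = {"C": 2, "L": 2}
empty = {"C": [], "L": []}
with_edge = {"C": [], "L": ["C"]}
print(bic(rows, empty, levels), bic(rows, with_edge, levels))
```

Because the score is a sum over nodes, adding or removing a single edge only requires rescoring the affected child. Candidate moves can therefore be scored independently of one another.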
  • 13. Implementation: Architecture (*Image from IC922 Redbook)
      o x86:
        - Server: HPE ProLiant DL580 servers
        - CPU Type: Intel Xeon EX-series
        - Cores per node: 16
        - DRAM: 512GB
      o POWER9:
        - Server: IC922
        - CPU Type: DD2.3 POWER9 processor modules
        - Cores per node: 160 virtual cores
        - Access up to 32 DIMM
        - Sustained bandwidth 28.8 GB
  • 14. Implementation:
      o Data description: HEPMASS [1,2] ($10.5 \times 10^6$ samples comprising 28 variables; Baldi et al., 2016 [1]). All continuous normalized features were discretized into binary categorical variables by thresholding about their mean.
      o Python implementation: Bayesian network using Pandas, NetworkX
      [1] Baldi P, et al. [2016] Parameterized Neural Networks for High-Energy Physics. The Eur. Phys. J. C 76(235).
      [2] Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing.
      [Figure: candidate graphs over the nodes A, B, C, D and E]
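The discretization step described on this slide could look roughly like the sketch below. It is written in plain Python so that it is self-contained; the function name `binarize_about_mean`, the toy columns and the choice of a strict `>` comparison are assumptions, not details from the slides.

```python
from statistics import fmean


def binarize_about_mean(columns):
    """Map each continuous column to 0/1 by thresholding about its own mean.

    Assumption: values strictly above the mean become 1, all others 0.
    """
    binary = {}
    for name, values in columns.items():
        mu = fmean(values)
        binary[name] = [1 if v > mu else 0 for v in values]
    return binary


# Toy columns (hypothetical, not HEPMASS values).
toy = {"f0": [-1.0, 0.0, 2.0, 3.0], "f1": [0.5, 0.5, 0.5, 2.5]}
binary = binarize_about_mean(toy)
print(binary)
```

With a pandas DataFrame `df`, a comparable one-liner would be `(df > df.mean()).astype(int)`, which compares every column against its own mean.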
  • 15. Multiple Cores Architecture: Dask Distributed
      o Python/Dask APIs
      o Parallel restart: spawning multiple Hill Climbing instances over the data
      o SHA-256 hash confirms uniqueness of visited graph
      [Figure: data distributed to multiple Hill Climbing instances, each exploring graphs over the nodes A, B, C, D and E]
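The two ideas on this slide, hashing visited graphs and running restarts in parallel, can be sketched with the standard library alone. This is an illustration under stated assumptions: `graph_digest` and `restart` are hypothetical names, `restart` is a placeholder that returns a random DAG instead of running hill-climbing, and a thread pool stands in for Dask Distributed.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def graph_digest(edges):
    """SHA-256 of a canonical (sorted) edge list, so edge order does not matter."""
    canonical = ";".join(f"{u}->{v}" for u, v in sorted(edges))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def restart(seed):
    """Placeholder for one hill-climbing run: returns a random DAG over A..E."""
    rng = random.Random(seed)
    nodes = "ABCDE"
    # Only edges u -> v with u < v are allowed, so the result is acyclic.
    edges = {(u, v) for u in nodes for v in nodes if u < v and rng.random() < 0.5}
    return graph_digest(edges), edges


# Run the restarts concurrently and keep one structure per distinct digest.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(restart, range(8)))

unique = {}
for digest, edges in results:
    unique.setdefault(digest, edges)
print(len(results), "restarts,", len(unique), "distinct graphs")
```

Storing digests rather than whole edge sets keeps the set of visited graphs compact and cheap to compare across workers.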
  • 16. Performance of structure learning on POWER and x86: Mean and standard deviation of the computational time across 5 runs of the HEPMASS data with Hill-Climbing. A two-sample t-test with unequal variance was used to compare the times between the x86 and POWER architectures (# implies significant difference). The computational times differed significantly (p < 0.001) between the x86 and the POWER architectures, with the POWER architecture taking considerably less time than x86. As expected, the BIC score takes less computational time than the K2 score and these scores
      [Figure: Performance of x86 and POWER 9 on HEPMASS (BIC Score); x-axis: Max Fan in (1, 2, 3); y-axis: Time (Seconds); series: x86, POWER]
      [Figure: Performance of x86 and POWER 9 on HEPMASS (K2 Score); x-axis: Max Fan in (1, 2, 3); y-axis: Time (Seconds); series: x86, POWER]
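The two-sample t-test with unequal variance (Welch's test) named on this slide can be sketched as follows. The timing values are made-up placeholders, not the measurements behind the figures, and `welch_t` is an illustrative helper name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical run times in seconds for 5 runs per architecture (NOT real data).
x86_times = [41000.0, 40500.0, 41800.0, 40900.0, 41300.0]
power_times = [15200.0, 15050.0, 15400.0, 15100.0, 15300.0]
t_stat, dof = welch_t(x86_times, power_times)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

The p-value follows from the t distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.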
  • 17. Performance of structure learning on POWER and x86 with varying Map Tasks: Mean and standard deviation of the computational time across 5 runs of the HEPMASS data with Hill-Climbing. A two-sample t-test with unequal variance was used to compare the times between the x86 and POWER architectures (# implies significant difference). There was a statistically significant difference in the computational time between the x86 and the POWER architectures when random restarts were distributed as map task jobs. As the number of map tasks was increased, the computation time decreased on both POWER and x86, and the separation in the average time between x86 and POWER increased.
      # corresponds to p < 0.05; * corresponds to p < 0.0001
      [Figure: Performance of POWER and x86 with varying Map Tasks (BIC Score); x-axis: Map Tasks (2, 4, 8, 16, 32, 64, 128); y-axis: Time (Seconds); series: x86, POWER]
  • 18. Healthcare: current trends
      o Explosion in Digital Healthcare Data:
        - Source Systems: continued digitization from multiple sources (EHR, Claims, Registries, IoT) and multiple types (Text, Image, Signals)
        - Multiscale Profiles: emphasis on capturing the complete description of patients
        - Common Data Models: approaches for sharing observational healthcare data (OMOP/OHDSI) across multiple organizations and research networks (e.g. HIE, PCORNet)
        - High-throughput: molecular data (e.g. Next Generation Sequencing)
        - FHIR: development of Fast Healthcare Interoperability Resources for enhanced interoperability across systems and devices
      o Explosion in Analytics Adoption:
        - Descriptive, Predictive, Prescriptive Analytics
        - Shift from storage to analytics and from consensus-based to evidence-based/data-driven approaches to impact outcomes/KPIs
        - Surge in the adoption of Machine Learning (ML) and Artificial Intelligence (AI) approaches
  • 19. Healthcare Applications: Graphical Models, where do they fit in?
      o Healthcare data sets are inherently multivariate and noisy, owing to several factors. Probabilistic graphical models are especially suited to handling noisy data.
      o Associations in multivariate healthcare data may be unknown. Graphical models can discover novel associations (hypothesis generation) in addition to validating known associations (hypothesis testing). Deciphering these associations is critical in prescribing targeted interventions.
      o Graphical models fall under ML and AI [1]. They can be used for descriptive, predictive and prescriptive analytics (e.g. Naïve Bayes Classifier). AI aspect of graphical models: answering queries posed from the evidence provided about a disease.
      o Healthcare applications of graphical models include diagnostic reasoning, prognostic reasoning and treatment selection, and discovering functional associations [2].
      o Emphasis on inferring causal associations from observational healthcare data, with the potential to complement classical approaches (e.g. RCTs [3]), RCTs being idealizations.
      o Interpretable and easily visualized for critical evaluation in healthcare settings.
      Need: Architectures and programming environment that can implement
      [1] Russell, S. Norvig, P. [2020] Artificial Intelligence: A Modern Approach, 4th ed
      [2] Lucas PJF et al. [2004] Bayesian networks in biomedicine and health-care. Artif. Intell. Med. 30(3):201-14
      [3] Berwick, D [2008] The Science of Improvement, JAMA, 1182-1184
      [4] Mclachlan, S et al. [2020] Bayesian networks in healthcare: Distribution by medical condition. Artificial Intelligence in Medicine. 107, 101912
  • 20. Summary
      o Structure learning is computationally intensive, especially across large data sets and large numbers of variables.
      o Preliminary findings revealed a marked improvement in performance using POWER architectures in addressing the computational challenges of structure learning approaches such as hill-climbing.
      o There is a need for a more detailed investigation using a battery of data sets and across distinct graphical model algorithms.
      o Graphical modeling approaches in general have considerable healthcare applications. Their ability to reason under uncertainty makes them especially well suited to healthcare analytics.
      o https://onstituteacademy.herokuapp.com
      Acknowledgements
      o Marco Scutari, Ph.D., Senior Researcher, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), Switzerland
      o Terry Leatherland, Trish Froeschle, Thomas Prokop, IBM, USA
      o Ganesan Narayanswami, OpenPOWER leader in Education and Research