Mining electronic health records (EHRs) has recently gained importance, but most efforts are restricted to analyzing drugs, diseases and their associations. In biomedical research, network analysis has provided a conceptual framework for interpreting protein-protein interaction and gene-disease association networks through large-scale network maps.
We analyze associations between drugs, diseases, devices and procedures mined from EHRs, using network analysis to extract hidden “modules of care” for hypothesis generation. Specifically, we annotated the textual notes in the EHRs of one million patients in the Stanford Clinical Data Warehouse with disease, drug, procedure and device terms using ontologies such as SNOMED-CT and RxNorm. We then used standard co-occurrence statistics to establish associations between these clinical concepts and to construct networks. Hidden modules of care (clusters of diseases, drugs, procedures and devices) that are useful for hypothesis generation are extracted through network analysis approaches and visualized using Cytoscape.
We present a comparative effectiveness study of cilostazol versus a control group in peripheral artery disease (PAD) patients (see Figure 1) and compare the results of the network analysis against those of standard methods such as regression analysis. We believe that network analysis can uncover hidden (“latent”) modules of care that standard approaches miss because they do not account for the connectivity of clinical events and entities.
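The co-occurrence and module-extraction steps described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which co-occurrence statistic or clustering method was used, so lift (observed over expected co-occurrence) and connected components stand in for them here. The function names, thresholds and toy patient data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(patient_concepts, min_lift=1.5, min_count=2):
    """Return {(a, b): lift} for concept pairs that co-occur more than expected.

    Lift is an assumed stand-in for the unspecified co-occurrence statistic.
    """
    n = len(patient_concepts)
    single = Counter()
    pair = Counter()
    for concepts in patient_concepts:
        unique = sorted(set(concepts))
        single.update(unique)
        pair.update(combinations(unique, 2))
    edges = {}
    for (a, b), n_ab in pair.items():
        # lift = P(a, b) / (P(a) * P(b)); a value of 1.0 means independence
        lift = (n_ab * n) / (single[a] * single[b])
        if n_ab >= min_count and lift >= min_lift:
            edges[(a, b)] = lift
    return edges


def modules(edges):
    """Group concepts into connected components of the thresholded network.

    Connected components are an assumed, simplistic substitute for the
    network analysis approaches mentioned in the abstract.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        result.append(comp)
    return result


# Made-up annotations: one set of concept labels per patient.
toy_patients = [
    {"PAD", "cilostazol", "stent"},
    {"PAD", "cilostazol", "stent"},
    {"PAD", "cilostazol"},
    {"asthma", "albuterol"},
    {"asthma", "albuterol", "inhaler"},
    {"asthma", "albuterol", "inhaler"},
]
toy_edges = cooccurrence_edges(toy_patients)
toy_modules = modules(toy_edges)
```

On the toy data, the two groups of concepts end up in separate modules. A real analysis would likely need a statistic that corrects for multiple testing and a clustering method that can split densely connected networks.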
8. Cohort building – restrict by follow-up time
[Figure: histogram of follow-up time in peripheral artery disease patients (5757 PAD patients). x-axis: follow-up time in 30-day intervals (0 to 6000, with 365 marked); y-axis: frequency (0 to 3000). Inset patient timeline: follow-up time runs from PAD (t_PAD) to the last note (t_last).]
7/17/12 NetBio SIG, 2012
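The follow-up restriction named on this slide could be sketched as below, taking follow-up time as the number of days between the PAD date and the last note. The 365-day cutoff is an assumption suggested only by the 365 mark on the histogram axis. The slide does not state the threshold actually used, and the function names and records are hypothetical.

```python
from collections import Counter
from datetime import date


def follow_up_days(t_pad, t_last):
    """Days between the PAD date and the last note."""
    return (t_last - t_pad).days


def restrict_by_follow_up(patients, min_days=365):
    """Keep patients with at least min_days of follow-up (assumed cutoff)."""
    return {
        pid: dates
        for pid, dates in patients.items()
        if follow_up_days(*dates) >= min_days
    }


def follow_up_histogram(patients, bin_days=30):
    """Count patients per 30-day follow-up interval, keyed by bin index."""
    counts = Counter(
        follow_up_days(*dates) // bin_days for dates in patients.values()
    )
    return dict(sorted(counts.items()))


# Made-up example records: patient id -> (t_PAD, t_last).
toy_cohort = {
    "p1": (date(2010, 1, 1), date(2010, 3, 1)),  # 59 days of follow-up
    "p2": (date(2009, 1, 1), date(2011, 1, 1)),  # 730 days of follow-up
    "p3": (date(2010, 1, 1), date(2011, 1, 1)),  # 365 days of follow-up
}
kept = restrict_by_follow_up(toy_cohort)
histogram = follow_up_histogram(toy_cohort)
```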
9. Cohort building – set time
[Figure: patient timelines. Cilostazol patient: PAD (t_PAD), CIL (t_CIL = t_0), last note (t_last); median = 22 days. Other PAD patient: PAD (t_PAD), t_0 (22 days), last note (t_last). Labels: Matching (before), Outcome (after).]
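One reading of this slide is that cilostazol patients take their first cilostazol date as the index time t_0, while other PAD patients are assigned a t_0 that falls the median PAD-to-cilostazol gap (22 days on the slide) after their PAD date. The sketch below follows that interpretation, which is an assumption and not something the slide spells out. The function names and dates are hypothetical.

```python
from datetime import date, timedelta
from statistics import median


def median_gap_days(treated):
    """Median number of days from t_PAD to t_CIL among treated patients."""
    return median((t_cil - t_pad).days for t_pad, t_cil in treated.values())


def assign_index_dates(treated, controls):
    """Return {patient id: t_0} for treated and control patients.

    Treated: t_0 = t_CIL. Controls: t_0 = t_PAD + median treated gap
    (an assumed interpretation of the slide).
    """
    gap = timedelta(days=median_gap_days(treated))
    index = {pid: t_cil for pid, (t_pad, t_cil) in treated.items()}
    index.update({pid: t_pad + gap for pid, t_pad in controls.items()})
    return index


# Made-up records. Treated: id -> (t_PAD, t_CIL); controls: id -> t_PAD.
toy_treated = {
    "a": (date(2010, 1, 1), date(2010, 1, 11)),  # 10-day gap
    "b": (date(2010, 2, 1), date(2010, 2, 23)),  # 22-day gap
    "c": (date(2010, 3, 1), date(2010, 4, 10)),  # 40-day gap
}
toy_controls = {"d": date(2010, 5, 1)}
index_dates = assign_index_dates(toy_treated, toy_controls)
```

With t_0 fixed this way, data before t_0 would be available for matching and data after t_0 for outcomes, in line with the "before" and "after" labels on the slide.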