Discovering the hidden treasure of data using graph analytic
Ana Paula Appel (IBM research) @PAPIs Connect, São Paulo 2017
IBM Research – Brazil
view from Rio de Janeiro Lab
Mission: To be known for our science and technology and vital to IBM, Brazil, our
clients in the region and worldwide
© 2015 IBM Corporation
• Paid Claims
• Total: 109M
• Doctors: 220k (almost half of all doctors in Brazil)
• Patients: 2.2M
• Unique Doctor-Patient pairs: 11.6M
• Other support data:
• Company
• Providers
• Authorizations ~3M
• Claim denials ~13M
• Geolocation
• ...
Over 40 tables,
hundreds of fields
Healthcare Data: Claims
CLAIM
• Physician ID
• Patient ID
• Timestamp
• Service code
• Disease – ICD9
• (80+ extra rows)
PhysID    ICD9   PatientID   DATE
SP45962   -      1001        09/04/13
SP45962   Z017   1001        26/04/13
SP47108   Z017   1001        06/12/13
SP47108   Z017   1001        16/12/13
SP45962   -      1002        11/07/13
SP45962   Z017   1002        12/07/13
SP45962   -      1002        19/08/13
SP59938   Z000   1002        24/10/13
…         …      …           …
Bipartite graph
Weighted graph
Directed graph
A Network Approach
• Bipartite network of doctors and patients
• |V|=2.4M, |E|=11.6M
• Keep only the largest connected component (92%-99% of all links)
• Collapse repeated doctor-patient edges into a single edge whose weight is the number of claims
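The preprocessing steps can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: the helper names `build_weighted_bipartite` and `largest_component` are hypothetical, the input is the handful of sample claim rows shown earlier, and using the claim count as the edge weight is an assumption.

```python
from collections import Counter, defaultdict, deque

# (physician ID, patient ID) pairs from the sample claim rows.
claims = [
    ("SP45962", "1001"), ("SP45962", "1001"),
    ("SP47108", "1001"), ("SP47108", "1001"),
    ("SP45962", "1002"), ("SP45962", "1002"),
    ("SP45962", "1002"), ("SP59938", "1002"),
]

def build_weighted_bipartite(claims):
    """Collapse repeated doctor-patient claims into one weighted edge."""
    return Counter(claims)

def largest_component(edges):
    """Return the node set of the largest connected component (BFS)."""
    adj = defaultdict(set)
    for doctor, patient in edges:
        # Prefixes keep the doctor and patient node sets disjoint.
        d, p = ("D", doctor), ("P", patient)
        adj[d].add(p)
        adj[p].add(d)
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in comp:
                    comp.add(nxt)
                    queue.append(nxt)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

weights = build_weighted_bipartite(claims)
component = largest_component(weights)
# Keep only the edges whose doctor endpoint lies in the largest component.
kept = {edge: w for edge, w in weights.items() if ("D", edge[0]) in component}
```

At the scale reported on the slide (millions of nodes and edges), a graph library or a distributed engine would be the more realistic choice; the plain-Python version is only meant to make the two steps concrete.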
Mutual Reference
Same patient visits two doctors + happens in both directions = Reciprocal Link
[Figure: patients' visits to doctors a and b over time. Edge a→b: w(ab) = 17, Δt = 7 days. Edge b→a: w(ba) = 8, Δt = 2 days. Timeline: patient 1 visits a, then b; patient 2 visits b, then a.]
Goal
Identify strong connections between each pair of physicians, in particular, the outliers.
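One way to derive such directed physician-to-physician links from visit records is sketched below. It is an illustration under stated assumptions, not the method from the talk: a link a→b is counted whenever a patient's consecutive visits go from doctor a to a different doctor b, Δt is the mean gap in days, and the visit data, `directed_links`, and `reciprocal_pairs` are all hypothetical. The slides do not define the rm metric, so it is not computed here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical visits (patient, doctor, date), loosely mirroring the timeline figure.
visits = [
    (1, "a", date(2013, 4, 1)),
    (1, "b", date(2013, 4, 8)),    # a -> b after 7 days
    (2, "b", date(2013, 5, 1)),
    (2, "a", date(2013, 5, 3)),    # b -> a after 2 days
    (3, "a", date(2013, 6, 1)),
    (3, "c", date(2013, 6, 11)),   # a -> c, never reciprocated
]

def directed_links(visits):
    """Map (doctor_from, doctor_to) -> (weight, mean gap in days)."""
    by_patient = defaultdict(list)
    for patient, doctor, day in visits:
        by_patient[patient].append((day, doctor))
    gaps = defaultdict(list)
    for seq in by_patient.values():
        seq.sort()  # chronological order per patient
        for (d1, doc1), (d2, doc2) in zip(seq, seq[1:]):
            if doc1 != doc2:  # assumption: only doctor changes count as a link
                gaps[(doc1, doc2)].append((d2 - d1).days)
    return {pair: (len(g), sum(g) / len(g)) for pair, g in gaps.items()}

def reciprocal_pairs(links):
    """Return doctor pairs that are linked in both directions."""
    return sorted((a, b) for (a, b) in links if a < b and (b, a) in links)

links = directed_links(visits)
mutual = reciprocal_pairs(links)
```

Pairs returned by `reciprocal_pairs` are candidates for mutual reference; ranking them, and flagging outliers, would require a score such as the rm value shown in the results table.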
Mutual Reference
Conclusions and Insights
• Claim data is rich enough to identify connections among physicians
and how partnerships between them form.
• Mutual Reference is an indicator of a relationship between
physicians and can potentially feed other analyses,
especially on large volumes of data.
• The proposed metric makes it feasible to recompute the
analysis of that relationship frequently.
Physician A   Physician B   rm     Rank
MMS028        MMS027        1      1
MSP145        MSP144        0.31   10
Mutual Reference
• Specialties that appear most often
• Ophthalmology to ophthalmology
• Gynecology and obstetrics to gynecology and obstetrics
• In DF, most consultations occur at irregular intervals
• MDF010 and MDF009 with 267 consultations and
an average interval of 0 days
• Top pair:
• 205 from MMS028 to MMS027
• 196 from MMS027 to MMS028
Patient Loyalty
Goal
Identify (and quantify) doctors that have recurring patients in a systematic way,
suggesting ‘loyalty’
1. Consider patients with many visits to doctors
2. Compute the relative weight for each doctor visited
3. Count the relative number of ‘loyal’ patients for that doctor
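The three steps above can be sketched as follows. This is a minimal illustration, not the scoring used in the talk: the thresholds `MIN_VISITS` and `LOYAL_SHARE`, the function `loyalty_scores`, and the consultation data are all assumptions made for the example.

```python
from collections import Counter, defaultdict

MIN_VISITS = 4      # assumption: what counts as "many visits"
LOYAL_SHARE = 0.75  # assumption: share of visits that makes a patient 'loyal'

# Hypothetical (patient, doctor) consultations.
consultations = (
    [("p1", "d1")] * 5 + [("p1", "d2")] * 1    # p1: 5 of 6 visits to d1
    + [("p2", "d1")] * 2 + [("p2", "d2")] * 2  # p2: evenly split
    + [("p3", "d2")] * 4                       # p3: all visits to d2
    + [("p4", "d1")] * 1                       # p4: too few visits, ignored
)

def loyalty_scores(consultations, min_visits=MIN_VISITS, loyal_share=LOYAL_SHARE):
    """Return, per doctor, the fraction of frequent patients who are loyal."""
    per_patient = defaultdict(Counter)
    for patient, doctor in consultations:
        per_patient[patient][doctor] += 1

    patients_seen = Counter()   # frequent patients per doctor
    loyal_patients = Counter()  # loyal frequent patients per doctor
    for counts in per_patient.values():
        total = sum(counts.values())
        if total < min_visits:           # step 1: keep frequent patients only
            continue
        for doctor, n in counts.items():
            patients_seen[doctor] += 1
            if n / total >= loyal_share: # step 2: relative weight per doctor
                loyal_patients[doctor] += 1
    # Step 3: relative number of loyal patients for each doctor.
    return {d: loyal_patients[d] / patients_seen[d] for d in patients_seen}

scores = loyalty_scores(consultations)
```

With this toy data, `d1` has one loyal patient out of two frequent ones and `d2` has one out of three. The slide also stresses visits recurring "in a systematic way", which would call for looking at the regularity of visit intervals as well; that part is not covered by this sketch.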
[Figure: consultations over time]
Summary & Take Home Messages
• Networks are all about relationships, as most data is.
• Network-derived insights are usually not reachable from other analyses.
• Complex Networks methods are very valuable to data science.
• Large healthcare claims database from a Brazilian insurance company.
• Applied complex network methods to find how physicians build their
networks.
• Examples: Temporality, reciprocity and ‘loyalty’.