SAS M2007 Presentation
GregPotts
Data Preparation for Data Mining in Health Care using SAS
SAS M2007 Presentation
1.
Data Preparation for Data Mining in Health Care using SAS
S. Greg Potts, MBA
SAS M2007 Data Mining Conference | October 12, 2007 | Las Vegas
©2007 Arkansas Foundation for Medical Care, Inc.
2.
Introduction
Data mining practitioners are well aware that most of the total effort required to complete a data mining project is not spent on the "trendier" aspects of the project, such as problem definition or algorithm/technique selection, application, and interpretation of the results.
Unfortunately, most of the time (up to 80% is often cited) is spent "in the trenches":
• acquiring the data (most business data today is stored in transactional data warehouses, where the data elements essential for mining are dispersed across multiple tables);
• getting to know the data (conducting exploratory data analysis); and
• preparing the data mining table (summarizing data to the "unit of analysis" and creating derived variables to be used as targets and inputs in the modeling analysis) in the form required by the data mining algorithm (most current algorithms require a one-row-per-subject data table).
This presentation covers two case studies in using SAS to extract and prepare data for data mining. The first case explores how to extract and prepare transactional (Medicaid claims) data for directed data mining, where the goal is to explain or predict the value(s) of a particular target variable. The second case explores data extraction and preparation for undirected data mining (cluster analysis) using hospital-level data supplied to Medicare Quality Improvement Organizations (QIOs).
Contents
AFMC and QI in Medicare and Medicaid . . . 3
AFMC as QIO and EQRO . . . 4
Data Preparation for Directed Data Mining . . . 6
Data Preparation for Undirected Data Mining . . . 16
References . . . 28
3.
AFMC and QI in Medicare and Medicaid
• Medicare provides health insurance for people age 65 and over, those with permanent kidney failure, and certain people with disabilities (more than 400,000 individuals in Arkansas).
• Medicaid is a jointly funded, Federal-State medical assistance program for certain low-income and needy people (more than 600,000 individuals in Arkansas).
4.
AFMC as QIO and EQRO
• CMS-designated (Centers for Medicare & Medicaid Services) Quality Improvement Organization (QIO) for the state of Arkansas. Assists providers (hospitals, physicians, nursing homes, etc.) with measuring and reporting quality measures and redesigning care processes; provides statistical support and assistance in interpreting data results.
• External Quality Review Organization (EQRO-like) and Review Agent for the Arkansas Medicaid Program: prior authorization reviews, retrospective reviews, HEDIS measures, patient satisfaction surveys, and data mining.
• Multidisciplinary team of clinicians, statisticians, and consultants.
• Goal: to ensure that everyone receives the right care, at the right place, at the right time, every time.
5.
Today's Health Care Environment
Many challenges exist in today's dynamic health care environment.
[Diagram labels:]
• Employer/consumer demands for accountability and transparency
• Rising costs
• Explosion in clinical knowledge
• Quality improvement
• Wide variations in quality
• Provider/payer data silos
• Industry fragmentation
6.
Data Preparation for Directed Data Mining
7.
Directed Data Mining in Medicaid
• AFMC Medicaid data mining projects are often conducted with an eye toward identifying high cost drivers for utilization review and/or cost containment, or toward identifying care coordination inefficiencies, which can be opportunities for quality improvement.
• Projects are client- and/or literature-based.
• In directed data mining, the goal is to explain or predict the value(s) of a target variable.
• The recipient/member is often the "unit of analysis". The target is (usually) total costs ($) per member, while the inputs are (usually) binary and represent diagnosis, procedure, drug, and provider type code classes.
• Techniques used include decision trees and (infrequently) regression.
8.
Medicaid Data Source
• The Arkansas Medicaid Decision Support System (DSS) Data Warehouse contains over 6 years of historical medical claims data among 500+ data tables (some containing millions of rows of data) at granular levels of detail, along with eligibility and demographic data.
Example tables and their columns:
Claims Analysis: Clm Num, Dtl Num, Recip ID, Amt Paid, NDC Code
Drug Detail: NDC Code, Drug Nam, Drug Class, ...
Eligibility: Recip ID, Plan Code, Elig Curr, Elig Beg, Elig End, ...
Enrollment: Recip ID, County, DOB, Gender, Race, ...
9.
Data Acquisition: Medicaid Data Queries
• Access the Oracle Data Warehouse through BusinessObjects to learn table relationships and build Oracle SQL queries when necessary.
• AFMC licenses the SAS/ACCESS Interface to ODBC.
• Established a Windows ODBC connection with the Oracle Data Warehouse. Data is acquired via SQL queries.
• Copy and paste the Oracle SQL code into a SAS program and execute it.
• Advantages: 1) pulls the data back into a SAS data set, and 2) bypasses BusinessObjects' query size limitations.
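The acquisition pattern above (send a SQL query to the warehouse, pull the result set back into a local data set) can be sketched in Python. This is only an illustration of the pattern, not the SAS/ACCESS code used in the talk: an in-memory SQLite database stands in for the Oracle warehouse, and the table and column names (`claims`, `drug_detail`, `recip_id`, and so on) are assumptions loosely modeled on the columns listed on slide 8.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory stand-in for the warehouse (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE claims (clm_num TEXT, dtl_num INTEGER, recip_id TEXT,
                             amt_paid REAL, ndc_code TEXT);
        CREATE TABLE drug_detail (ndc_code TEXT, drug_class TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO claims VALUES (?, ?, ?, ?, ?)",
        [
            ("050600407", 1, "001", 25.00, None),
            ("050600408", 1, "002", 106.45, "00406035705"),
            ("050600409", 1, "002", 45.13, None),
        ],
    )
    conn.execute("INSERT INTO drug_detail VALUES ('00406035705', '280808')")
    return conn


def fetch_claims(conn: sqlite3.Connection) -> list:
    """Run one SQL query and return the result set as a list of dicts."""
    query = """
        SELECT c.recip_id, c.clm_num, c.dtl_num, c.amt_paid, d.drug_class
        FROM claims AS c
        LEFT JOIN drug_detail AS d ON d.ndc_code = c.ndc_code
        ORDER BY c.recip_id, c.clm_num, c.dtl_num
    """
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    return [dict(row) for row in conn.execute(query)]


rows = fetch_claims(build_demo_warehouse())
```

The join to the drug table happens inside the query, so the local data set already carries the drug class for pharmacy claims and a missing value for medical claims.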
10.
Medicaid Transactional Claims Data Example
Recipient ID | Clm Num   | DTL Num | Prov Type | Dx     | Proc  | Amt_Paid | NDC Code    | Drug Class | …
001          | 050600407 | 1       | Phys      | 250.00 | 99212 | $25.00   |             |            | …
001          | 050600407 | 2       | Phys      | 640.01 | 81000 | $43.38   |             |            | …
002          | 050600408 | 1       | Pharm     |        |       | $106.45  | 00406035705 | 280808     | …
002          | 050600409 | 1       | Phys      | 250.10 | 99212 | $45.13   |             |            | …
003          | 050600410 | 1       | Phys      | 427.0  | A0426 | $240.46  |             |            | …
003          | 050600411 | 2       | Phys      | 427.0  | A0390 | $225.72  |             |            | …
• Multiple rows of claim detail data make up one claim per recipient/member.
• The challenge is to summarize the data to the recipient level and to create the target and input variables to be used in the modeling analysis.
11.
SAS Procedures for Transactional Data Preparation: Summing Costs
• Use PROC SQL to summarize paid amounts (AMT_PAID) to the recipient level and to break claim costs out by medical vs. pharmacy.
• Total costs (the target) are calculated from the variables created in these tables using a DATA step later in the data prep program.
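As a rough illustration of the summarization step, the sketch below rolls claim detail rows up to the recipient level and splits costs into medical and pharmacy. It is a Python stand-in for the logic, not the PROC SQL code from the talk. The rows are the six sample rows from slide 10; treating provider type `Pharm` as the pharmacy bucket and everything else as medical is an assumption made for this example.

```python
from collections import defaultdict
from decimal import Decimal

# Claim detail rows from slide 10: (recipient_id, provider_type, amt_paid)
CLAIM_DETAILS = [
    ("001", "Phys", "25.00"),
    ("001", "Phys", "43.38"),
    ("002", "Pharm", "106.45"),
    ("002", "Phys", "45.13"),
    ("003", "Phys", "240.46"),
    ("003", "Phys", "225.72"),
]


def summarize_costs(details):
    """Sum paid amounts per recipient, split into medical and pharmacy."""
    totals = defaultdict(
        lambda: {"medical": Decimal("0"), "pharmacy": Decimal("0")}
    )
    for recip_id, prov_type, amt_paid in details:
        # Assumption: provider type "Pharm" marks a pharmacy claim.
        bucket = "pharmacy" if prov_type == "Pharm" else "medical"
        totals[recip_id][bucket] += Decimal(amt_paid)
    # Total cost per recipient is the modeling target.
    for costs in totals.values():
        costs["total"] = costs["medical"] + costs["pharmacy"]
    return dict(totals)


summary = summarize_costs(CLAIM_DETAILS)
```

`Decimal` is used instead of floats so that dollar amounts add up exactly.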
12.
SAS Procedures for Transactional Data Preparation: Creating Binary Inputs
• Use PROC SQL to create a table of recipient IDs with nonmissing Dx codes.
• Use FIRST. by-processing to create an enumeration variable.
• Sort the data set.
• Convert the sorted data from rows to columns using PROC TRANSPOSE.
• Use an array to create binary input variables (0,1) for code classes.
• Repeat the process for Procedure Class, Drug Class, and Provider Types.
As shown, a number of SAS procedures and language statements are needed to transform transactional data into a one-row-per-unit-of-analysis format suitable for directed data mining techniques.
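The end result of the steps above (one row per recipient, one 0/1 column per code class) can be sketched compactly in Python. This does not mirror the SAS steps one for one; it only shows the target shape. The first two diagnosis ranges come from slide 13. `DX_GROUP_3` and `DX_GROUP_4` and their ranges are placeholders chosen so that the slide 10 sample codes land somewhere, and the function names are hypothetical.

```python
# Hypothetical diagnosis code classes, keyed by the integer part of the code.
# Ranges for groups 1 and 2 appear on slide 13; groups 3 and 4 are placeholders.
DX_GROUPS = {
    "DX_GROUP_1": (1, 139),
    "DX_GROUP_2": (140, 239),
    "DX_GROUP_3": (240, 279),
    "DX_GROUP_4": (390, 459),
}

# (recipient_id, dx_code) pairs from slide 10; None marks a missing Dx code.
DX_ROWS = [
    ("001", "250.00"),
    ("001", "640.01"),
    ("002", None),
    ("002", "250.10"),
    ("003", "427.0"),
    ("003", "427.0"),
]


def dx_group(dx_code):
    """Return the code class for a numeric diagnosis code, or None."""
    major = dx_code.split(".")[0]
    if not major.isdigit():  # non-numeric codes are ignored in this sketch
        return None
    for name, (low, high) in DX_GROUPS.items():
        if low <= int(major) <= high:
            return name
    return None


def build_binary_inputs(rows):
    """Build one row per recipient with a 0/1 flag for each code class."""
    table = {}
    for recip_id, dx_code in rows:
        flags = table.setdefault(recip_id, dict.fromkeys(DX_GROUPS, 0))
        if dx_code is None:
            continue
        group = dx_group(dx_code)
        if group is not None:
            flags[group] = 1
    return table


binary_inputs = build_binary_inputs(DX_ROWS)
```

A recipient with several claims in the same class still gets a single 1, which is what makes the inputs binary rather than counts.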
13.
Sample Directed Data Mining Modeling Data Set
Total Cost = f (Clinical Diagnoses, Procedures, Prescription Drugs, Providers Seen, Age, Gender) per Recipient
The mining data set contains both interval variables (e.g., Total Costs) and binary variables (e.g., DX_GROUP_1).
Target variable: Total Costs SFY 2004. Input variables: all remaining columns.
Recipient ID | Total Costs SFY 2004 | Dx Group1 (001-139 Infectious & Parasitic Diseases) | Dx Group2 (140-239 Neoplasms) | … | Procedure Code Group1 (00100-01999 Anesthesiology) | … | Thera Class1 (04:00.00 Antihistamines) | …
001          | $15,232.84           | 1 | 1 | … | 1 | … | 1 | …
002          | $2,006.72            | 1 | 0 | … | 1 | … | 0 | …
003          | $8,354.89            | 0 | 1 | … | 1 | … | 1 | …
…
14.
Why use Decision Trees?
• Excellent tool for data mining profiling.
• Allow you to easily see patterns in data with respect to a target variable (here an interval target: total cost in dollars per recipient).
• The decision tree algorithm "reads" the data and determines the best variable on which to "split" the data.
• Splits continue as long as they are statistically significant.
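To make the idea of choosing the "best variable on which to split" concrete, here is a minimal Python sketch for binary inputs and an interval target. It scores each candidate split by the reduction in the sum of squared errors, which is a simplification swapped in for illustration; it is not the significance-based criterion the slide refers to, and it is not the algorithm used by the SAS software. The three rows are taken from the sample modeling data set on slide 13.

```python
from statistics import fmean

# Rows from slide 13: total cost (target) and two binary diagnosis inputs.
ROWS = [
    {"total_cost": 15232.84, "DX_GROUP_1": 1, "DX_GROUP_2": 1},
    {"total_cost": 2006.72, "DX_GROUP_1": 1, "DX_GROUP_2": 0},
    {"total_cost": 8354.89, "DX_GROUP_1": 0, "DX_GROUP_2": 1},
]


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(rows, target, inputs):
    """Return (input_name, gain) for the split that most reduces the SSE."""
    parent = sse([r[target] for r in rows])
    best_name, best_gain = None, 0.0
    for name in inputs:
        left = [r[target] for r in rows if r[name] == 0]
        right = [r[target] for r in rows if r[name] == 1]
        if not left or not right:
            continue  # this input does not separate the rows at all
        gain = parent - (sse(left) + sse(right))
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


split_name, split_gain = best_binary_split(
    ROWS, "total_cost", ["DX_GROUP_1", "DX_GROUP_2"]
)
```

A full tree would apply the same search recursively to each child node and stop when a split no longer passes a significance test or a minimum node size.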
15.
Sample Decision Tree Results
SFY 2004 Continuously Enrolled Medicaid Recipients with a Diabetes Dx (250.0X-250.9X) (Partial Tree Results)
[Figure: partial decision tree]
Callout: This group of recipients with a Diabetes Dx accrued average total costs more than three times those of other recipients with the same diagnosis. Why?
Child nodes from a decision tree may require more drill-down analysis or model tuning in the form of variable reduction and reapplication.
16.
Data Preparation for Undirected Data Mining
17.
Undirected Data Mining in Medicare
• AFMC Medicare data mining projects are often conducted with an eye toward identifying opportunities for quality improvement and safeguarding Medicare program funds.
• In undirected data mining, the goal is to uncover the hidden structure in data without respect to a target variable.
• Cluster analysis (PROC FASTCLUS) is a technique often used.
18.
Hospital Payment Monitoring Program (HPMP)
• The Centers for Medicare and Medicaid Services (CMS) developed the Hospital Payment Monitoring Program (HPMP) primarily to calculate and monitor the Medicare fee-for-service paid claims error rate for inpatient acute care hospital services.
• Under contracts with CMS, several companies, including QIOs like AFMC, are responsible for operating the HPMP.
HPMP is a QIO Statement of Work (SOW) priority.
19.
HPMP, QIOs, and Reporting
• Each quarter, QIOs receive state hospital-level data containing the calculated paid claims rates and other summary measures for 14 different target areas identified by CMS as prone to payment errors. This quarterly report is known as the First-Look Analysis Tool for Hospital Outlier Monitoring (FATHOM).
Example target area paid claims rate:
Target 12: Three Day Transfer to SNF
Numerator: count of discharges to a SNF with a three-day length of stay
Denominator: count of all discharges to a SNF or swing bed
• From the FATHOM data, QIOs can generate and distribute the Program for Evaluating Payment Patterns Electronic Report (PEPPER). The PEPPERs are hospital-specific reports that allow Inpatient Prospective Payment System (IPPS) hospitals to compare their own billing practices in the 14 target areas with other IPPS hospitals within the state.
• Each QIO uses these data tools to work with hospitals in its state to reduce improper admissions and Diagnosis-Related Group (DRG) payment errors.
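The target area rate described above is a simple ratio of the two counts. A small Python sketch follows; the function name and the counts in the usage line are made up for illustration and are not taken from FATHOM data.

```python
def paid_claims_rate(numerator_count, denominator_count):
    """Return the target area rate, or None when there are no discharges."""
    if denominator_count == 0:
        return None  # rate is undefined without any qualifying discharges
    return numerator_count / denominator_count


# Hypothetical hospital: 12 three-day-stay discharges to a SNF
# out of 80 discharges to a SNF or swing bed.
example_rate = paid_claims_rate(12, 80)
```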
20.
Resource Challenge and Answer
• With 50+ IPPS Arkansas hospitals and limited staff, AFMC wanted a way to identify hospitals that have been "extreme outliers" (95th percentile or above on two measures) over time, thus indicating a need for close monitoring and possible notification.
• Answer: cluster analysis to segment hospitals into 3 "like" groups based on 5 of the 11 hospital/target-level calculated measures in the data, then graphing the results of two of the 5 measures to determine "extreme outliers".
21.
Data Acquisition: Medicare Hospital-Level Data
• The FATHOM hospital-level data is provided to each QIO by another CMS/HPMP contractor in a Microsoft Access database (*.mdb).
• PROC IMPORT is used to import the data table from the *.mdb file.
22.
SAS Procedures for Data Preparation at UA-level: Standardization
• Per the SAS/STAT 9.1.3 Online Documentation regarding PROC FASTCLUS, pp. 1382-1383: "Variables with larger variances exert a larger influence in calculating the clusters...Therefore it is necessary to standardize the variables before performing the cluster analysis."
• PROC STANDARD is used to standardize all variables used in the cluster analysis to mean=0, std=1 prior to invoking PROC FASTCLUS.
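Standardizing to mean 0 and standard deviation 1 amounts to subtracting the mean and dividing by the standard deviation. The Python sketch below shows the arithmetic for a single variable. It uses the sample standard deviation (n - 1 divisor), which should correspond to the usual default divisor but is stated here as an assumption; the input values are made up.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 divisor)
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


z_scores = standardize([2.0, 4.0, 6.0])
```

After this step every variable is on the same scale, so none of them dominates the distance calculations in the cluster analysis simply because of its units.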
23.
Cluster Analysis
• Run PROC FASTCLUS against the output data set containing the standardized variables.
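For readers unfamiliar with this style of clustering, the sketch below shows a plain k-means iteration in Python: assign each point to its nearest center, recompute the centers, and repeat until nothing changes. It is a generic illustration under simplifying assumptions (the first k points serve as initial seeds, and the data are made up), not a reimplementation of PROC FASTCLUS, which has its own seed selection and options.

```python
import math


def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, k, max_iter=10):
    """Cluster points into k groups; returns (labels, centers)."""
    centers = [tuple(p) for p in points[:k]]  # naive seeding: first k points
    labels = []
    for _ in range(max_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centers.append(centers[i])  # keep an empty cluster's seed
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return labels, centers


# Made-up standardized measures for six hospitals, three obvious groups.
POINTS = [(0, 0), (10, 10), (-10, 10), (0, 1), (10, 11), (-10, 11)]
labels, centers = kmeans(POINTS, k=3)
```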
24.
Plot to Produce Visual Representation
• Run PROC GPLOT to plot two of the five variables used in the cluster analysis to determine outliers (note: the highlighted syntax on the slide produces red vertical and horizontal reference lines at the 95th percentile value of each variable being plotted).
25.
Plot Results
• The (standardized) variables used in the cluster analysis that are being plotted are:
Outlier Value: a single number from -10 to 10, describing how unusual a hospital is compared to all IPPS hospitals in the state.
Outlier Times Count: the outlier value weighted by the number of discharges. This measure captures both the unusualness of the hospital's target outlier value and the volume of target discharges.
• Outlier facilities are those facilities that consistently fall in the upper-rightmost quadrant (>95th percentile on BOTH measures).
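The quadrant rule can also be applied directly to the data rather than read off the plot. The Python sketch below flags hospitals at or above the 95th percentile on both measures for a single quarter. The nearest-rank percentile definition, the field names, and the use of "at or above" (following the wording on slide 20) are assumptions of this example; checking consistency over time would mean repeating this per quarter and counting how often each hospital is flagged.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def extreme_outliers(hospitals, pct=95):
    """Return ids of hospitals at or above the cutoff on BOTH measures."""
    cut_value = percentile([h["outlier_value"] for h in hospitals], pct)
    cut_count = percentile([h["outlier_times_count"] for h in hospitals], pct)
    return [
        h["id"]
        for h in hospitals
        if h["outlier_value"] >= cut_value
        and h["outlier_times_count"] >= cut_count
    ]


# Made-up data: 20 hospitals with increasing values on both measures.
HOSPITALS = [
    {"id": f"H{i:02d}", "outlier_value": i, "outlier_times_count": i * 10}
    for i in range(1, 21)
]
flagged = extreme_outliers(HOSPITALS)
```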
26.
Sample Letter to Hospitals Regarding Outlier Status
27.
Conclusions
• Most of the time in a data mining project is spent acquiring and preparing data, not on algorithm/technique selection and application.
• SAS has a vast arsenal of tools to help you acquire, prepare/transform, and mine your data.
28.
References
• Data Preparation for Data Mining Using SAS, Mamdouh Refaat (Morgan Kaufmann/Elsevier): http://books.elsevier.com/us/mk/us/subindex.asp?isbn=9780123735775&country=United+States&community=mk&ref=&mscssid=KHFKVKDF6HNU8HJND9V1RB81QR712R2E or http://www.amazon.com/PreparationMiningKaufmannManagementSystems/dp/0123735777/ref=pd_bbs_sr_1/10255832127763329?ie=UTF8&s=books&qid=1179937157&sr=11
• Base SAS Procedures and syntax (PROC IMPORT, PROC SQL, PROC TRANSPOSE): http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_913/base_proc_8977_new.pdf
• SAS Language Reference (Arrays): http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_913/base_lrconcept_9196.pdf
• Decision Trees in Enterprise Miner: http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_91/em_gs_7281.pdf
• SAS/STAT Procedures (PROC FASTCLUS): http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_91/stat_ug_7313.pdf or http://www2.sas.com/proceedings/sugi24/Stats/p27024.pdf
• Hospital Payment Monitoring Program (HPMP): http://oig.hhs.gov/oas/reports/region3/30500007.pdf
29.
Acknowledgements
• M2007 Co-Chairs:
  • Jerry Oglesby, SAS Institute
  • Goutam Chakraborty, Oklahoma State University
• Rona Bellinger, AFMC Manager of Web & Graphic Services
• Karen Gabel and Tori Gammill, AFMC HPMP Team Members
• AFMC's Data Mining Team
30.
Questions?
Contact:
S. Greg Potts, MBA
Data Mining Team Leader
Arkansas Foundation for Medical Care
Office of Projects and Analysis
401 West Capitol, Suite 410
Little Rock, AR 72201
(501) 212-8734 Phone
(501) 375-1201 Fax
Email: spotts@afmc.org