4. #PSLS2015 Innovations & Insights Leadership Summit
WHAT IS BIG DATA?
• The collective concepts surrounding the challenges
and benefits of working with very large data sets.
• Challenges for Big Data:
• Collection
• Storage
• Data Curation (active management of data)
• Unstructured Data
• Benefits of Big Data:
• Removes the Uncertainty Inherent in Inferential Statistics
• Enhanced Predictive Modeling
• Fast, Agile Decision Support
• Analytics: Discovering and communicating
meaningful patterns found in data
• Barriers
• Automated Data Collection (building automation,
clinical alarm analysis)
• Expensive BI tools
• Time Consuming
• Requires employees with highly specialized
education
• Advanced Mathematics
• Statistical Programming
• Needed to interpret results
“Given enough data, everything is statistically significant” – Douglas Merrill
CLINICAL ENGINEERING AND BIG DATA
• Most Clinical Engineering Departments are
sitting on a wealth of untapped data
• Actionable data can be found in your
Computerized Maintenance Management
System (CMMS)
• Data can be used as part of a Performance
Improvement Initiative (LEAN, Six Sigma)
• Realistic Analytics
• Easy to act on
• Affordable (Excel, XLMiner, Tableau)
WHY EVEN BOTHER???
• Pose a Question
• Gather Data
• Prepare and Clean Data
• Choose Your Tool / Technique
• Summarize and Visualize Data
• Analyze and Interpret Results
STEPS TO ANALYZING DATA
DEVICE RELIABILITY
• MMC uses Brand A
• CMC, NMC, OMC use Brand B
• Popular belief was that Brand A was the most unreliable
• Data proved the opposite was true
DIG DOWN TO DEVICE LEVEL
20 / 1,412 = 1.4%
1.4% of Total Pump Inventory Accounted for 12.7% of All Repairs.
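The comparison on this slide can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures shown above (20 of 1,412 pumps, 12.7% of repairs); the variable names are illustrative, not taken from any CMMS.

```python
# Figures taken from the slide; variable names are illustrative.
flagged_pumps = 20
total_pumps = 1412
repair_share = 0.127  # share of all pump repairs attributed to the flagged pumps

# Share of the pump inventory that the flagged pumps represent.
inventory_share = flagged_pumps / total_pumps

# How over-represented the flagged pumps are in the repair history.
over_representation = repair_share / inventory_share

print(f"Inventory share:     {inventory_share:.1%}")
print(f"Repair share:        {repair_share:.1%}")
print(f"Over-representation: {over_representation:.1f}x")
```

Put this way, the flagged pumps generate repairs at roughly nine times the rate their share of the inventory would suggest.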
MORE DEVICE LEVEL DISCOVERY
61 + 37 + 27 = 125 repairs
So at ½ hour a repair you have 62.5 hours. That is roughly 8 working days spent on these 3 machines alone.
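The labor estimate can be checked the same way. The sketch below uses the three repair counts and the half-hour figure from the slide; the 8-hour working day is an assumption added here to convert hours into days. Under that assumption the total comes to a little under 8 days, and a shorter working day would push it over 8.

```python
# Repair counts and the half-hour estimate come from the slide.
repairs_per_machine = [61, 37, 27]
hours_per_repair = 0.5

# Assumption (not stated on the slide): an 8-hour working day.
hours_per_workday = 8

total_repairs = sum(repairs_per_machine)
total_hours = total_repairs * hours_per_repair
workdays = total_hours / hours_per_workday

print(f"{total_repairs} repairs, {total_hours} hours, about {workdays:.1f} working days")
```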
• Excel Add On
• Statistical Data Mining
• Limited, But Easy to Learn
• I will demonstrate two of the more popular
techniques: Association (Market Basket
Analysis) and Clustering
XLMINER
• Used by Retailers to Determine Whether an Item Is Likely to Be Purchased Based on Other Items Purchased
• Ventilators - Parts used in the last 500 repairs
• Data Set Sample:
MARKET BASKET ANALYSIS
MARKET BASKET(XLMINER)
Lift Ratio: Determines How Much More Likely C Will Be Used if A Is Used
Confidence: Shows the Strength of the Rule
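To make confidence and lift concrete, here is a minimal sketch that computes them by hand for a single rule. It is not XLMiner's implementation. The part names and the eight repair "baskets" are hypothetical, invented purely for illustration, and the helper names `support` and `rule_metrics` are assumptions as well.

```python
# Hypothetical repair "baskets": the set of parts used on each work order.
repairs = [
    {"flow sensor", "O2 cell", "filter"},
    {"flow sensor", "O2 cell"},
    {"battery", "filter"},
    {"flow sensor", "O2 cell", "battery"},
    {"filter"},
    {"O2 cell", "filter"},
    {"flow sensor", "O2 cell"},
    {"battery"},
]

def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes the antecedent and consequent each appear in at least one basket.
    """
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)   # P(C | A)
    lift = confidence / support(consequent, baskets)   # P(C | A) / P(C)
    return both, confidence, lift

rule_support, confidence, lift = rule_metrics({"flow sensor"}, {"O2 cell"}, repairs)
print(f"support={rule_support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the second part is used more often when the first part is used than it is across all repairs; a lift near 1 means the two parts are used independently of each other.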
• This Information Can Be Used to Put Together
a Smarter Buying Plan
• All parts in one order
• Save money on shipping
• Improved discounts for volume shopping
• Equipment downtime is shortened as multiple orders do
not have to be placed
MARKET BASKET
• Market Segmentation
• Microsoft used it when trying to break into
server world
• Hardcore UNIX
• LINUX Dabblers
• Market Followers
• Financially Driven
• LINUX Enthusiasts
So how can we use clustering?
CLUSTERING
• Focused on Diagnostic Ultrasound
• Queried 3 Years of Repair Work Order Data
• Focused on following variables (WO Count,
Labor Minutes, Labor Cost, No Problem Found
Count, Parts Cost, Travel Cost)
• Performed a K-Means Cluster Analysis
CLUSTERING EXAMPLE (XLMINER)
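Before clustering, the work orders have to be rolled up into one row per device with the variables listed above. The sketch below shows one way that roll-up might look. The records and field names are hypothetical and would need to be mapped to whatever a given CMMS export actually provides.

```python
from collections import defaultdict

# Hypothetical work-order records; field names are illustrative only.
work_orders = [
    {"device_id": "US-001", "labor_minutes": 45, "labor_cost": 60.0,
     "parts_cost": 0.0, "travel_cost": 0.0, "no_problem_found": True},
    {"device_id": "US-001", "labor_minutes": 90, "labor_cost": 120.0,
     "parts_cost": 350.0, "travel_cost": 25.0, "no_problem_found": False},
    {"device_id": "US-002", "labor_minutes": 30, "labor_cost": 40.0,
     "parts_cost": 0.0, "travel_cost": 0.0, "no_problem_found": True},
]

SUMMED_FIELDS = ("labor_minutes", "labor_cost", "parts_cost", "travel_cost")

def summarize_by_device(records):
    """Roll work orders up into one row of totals per device."""
    totals = defaultdict(lambda: {"wo_count": 0, "npf_count": 0,
                                  **{field: 0 for field in SUMMED_FIELDS}})
    for wo in records:
        row = totals[wo["device_id"]]
        row["wo_count"] += 1
        row["npf_count"] += int(wo["no_problem_found"])
        for field in SUMMED_FIELDS:
            row[field] += wo[field]
    return dict(totals)

summary = summarize_by_device(work_orders)
for device_id, row in summary.items():
    print(device_id, row)
```

Because the variables are on very different scales (counts, minutes, dollars), it is common to standardize each column before running a distance-based method such as K-Means.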
K-MEANS CLUSTERING
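We randomly place 3 centroids. Then, using vectors, we group the data points based on their distance to the centroids. Each centroid is then moved to the center of its group, and the process repeats until the groupings stop changing. These groupings become our clusters.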
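The procedure described on this slide can be sketched in plain Python. This is a minimal illustration of the standard K-Means loop, not XLMiner's implementation. The two-dimensional points are hypothetical, and the initial centroids are drawn at random from the data points themselves, which is one common way of "randomly placing" them.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Minimal K-Means: assign points to the nearest centroid, then recenter."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # groupings have stopped changing
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points, e.g. (work order count, labor hours) per device.
points = [(1, 1), (1, 2), (2, 1),
          (10, 10), (10, 11), (11, 10),
          (20, 1), (21, 1), (20, 2)]

centroids, clusters = k_means(points, k=3)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The result depends on where the centroids start, so it is common to run the algorithm several times with different seeds and keep the tightest grouping.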
CLUSTER 4 (HIGH WO AND NPF)
A Review of Work Order History Pointed to User Error
End User Refresher Training Would Reduce These Repair Calls.
DUAL AXIS CHARTS
•Allow visual comparison of large
data sets.
•Easy to create using Tableau
•Interactive – allow for quick drill
down
• Communication:
• Share your findings
• Get your information into the hands of decision
makers
• Keep people informed
SUMMARIZE AND VISUALIZE DATA
SIMPLE VISUALIZATION (CRYSTAL REPORTS)
Used to Keep Staff Informed
Can Be Emailed Out Daily
Snapshot of How We Stand
Hadoop, SAS, MongoDB, Tableau, data mining, Big Data: these words are very popular right now. They are everywhere you look: television ads, web ads, I even saw them on the sides of taxi cabs in London. Everyone is talking about it, but the real question is “What is it?”
Big Data is a phrase used to describe a multidisciplinary approach to the analysis of large data sets. Often credited to Francis X. Diebold, an economist at the University of Pennsylvania, the term has many businesses searching for talent and many universities scrambling to find a way to produce that talent.
Big Data comes with its own set of challenges. How do we collect all of this data? Where do we store it? Who is going to care for and maintain this data? How do we discern meaning from the flood of unstructured data?
It does have many benefits, of course; otherwise no one would bother with it. The first is that it removes the uncertainty that is inherent in inferential statistics. If you think back to your college statistics course, inferential statistics dealt with the problems that arise when you try to find meaning in data samples. If you are running for mayor in a city with 100,000 registered voters and you poll 5,000 random people, you can use inferential statistics to build a confidence interval that will give you a good idea of how the vote is going to turn out (within a margin of error of a few percentage points).
Well, imagine you could poll 99,000 of those 100,000 voters. With that kind of data, your margin of error all but disappears. Inferential statistics are no longer needed. That is Big Data. Walmart doesn’t have to guess what the hottest-selling items are based on a sampling of stores; they have the actual data. And they are analyzing it as close to real time as possible.
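The polling example can be put into numbers. The sketch below uses the textbook margin-of-error formula for a proportion with a finite population correction. It assumes a simple random sample, the worst-case proportion p = 0.5, and z = 1.96 for 95% confidence; none of these assumptions are stated in the talk.

```python
import math

def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Margin of error for a proportion, with finite population correction.

    Assumes a simple random sample; p=0.5 is the worst case and z=1.96
    corresponds to a 95% confidence level.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # The correction shrinks the error as the sample approaches the whole population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

population = 100_000
for sample in (5_000, 99_000):
    moe = margin_of_error(sample, population)
    print(f"n={sample:>6,}: margin of error about {moe:.2%}")
```

With 5,000 respondents the margin of error is already between 1 and 2 percentage points; with 99,000 it shrinks to a few hundredths of a point, which is the sense in which the uncertainty disappears.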
With such a wealth of data, and such a level of confidence in it, you can develop enhanced predictive models and use those models for fast, agile decision making.
Big Data for Clinical Engineering is still in its developing stage. Most Big Data applications I see are on the clinical side (alarm management) or the plant operations side (building automation). The biggest issue is that most Clinical Engineering Departments (CEDs) simply do not produce the amounts of data that would generally qualify as Big Data.
But that is okay because Big Data analytics are time consuming. They require hefty investments in expensive BI tools and require employees with highly specialized and advanced educations.
For right now, I believe most CEDs should focus on descriptive analytics.
Most CEDs are sitting on a wealth of untapped data. A typical CMMS contains plenty of actionable data simply waiting to be discovered. If you don’t analyze the data, you cannot act on it.
Once data is discovered, it can be used as part of a performance improvement initiative, especially data-driven ones like LEAN and Six Sigma.
Data that is not analyzed cannot be acted on.
Without data, decisions are made with emotions and personal prejudices.
Okay, so we know we want to use analytics, but the big question is “How do we do it?”
Well, no matter how simple or complex your project is, data analysis can always be broken down into the following steps:
Pose a Question
Gather Data
Prepare and Clean Data
Choose Your Tool / Technique
Summarize and Visualize Data
Analyze and Interpret Results