1. Business Analytics and Optimization: A Technical Introduction
Oleksandr Romanko, Ph.D.
Senior Research Analyst, Risk Analytics – Business Analytics, IBM
Adjunct Professor, University of Toronto
Toronto SMAC Meetup
September 18, 2014
2. © 2014 IBM Corporation
Making the world work better – pioneering the science
2008
1973
1969
1981
3.
IBM Centennial: 100 Years of Innovation
9.
What is analytics?
Descriptive Analytics: What has happened?
Predictive Analytics: What will happen?
Prescriptive Analytics: What should we do?
[Diagram labels: Data, Analyze, Insight, Decide, Action, Business Value]
Analytics is the scientific process of deriving insights from data in order to make decisions.
11.
IBM Business Analytics portfolio
Industry Solutions: Financial Services, Public Sector, Distribution, Industrial, Communications
Functional Solutions: Customer, Risk, Finance, Operations
Software Categories: Risk Analytics, Business Intelligence, Predictive Analytics, Performance Management, Decision Management
Core Capabilities: REPORT, MODEL, COLLABORATE, PREDICT, ANALYZE, PLAN, Visualize, Discover, Forecast, Mine, Govern, Score, Decide, Simulate, Contribute, Survey
Solution areas: Budgeting & Forecasting, Financial Consolidation, Disclosure Management, Risk Identification, Risk & Control Assessment, Resource Optimization, Social Media Analytics, Profitability Modeling & Optimization, Production Planning, Asset Management, Customer Acquisition, Customer Lifetime Value, Customer Loyalty & Retention, Risk Mitigation Planning, Risk Aware Decisioning, Sales Performance Management
12.
Operations research
Operations Research (O.R.) is the discipline of applying advanced analytical methods to help make better decisions
Analytical techniques:
Simulation – giving you the ability to try out approaches and test ideas for improvement
Optimization – narrowing your choices to the very best when there are virtually innumerable feasible options and comparing them is difficult
Probability and Statistics – helping you measure risk, mine data to find valuable connections and insights, test conclusions, and make reliable forecasts
Mathematical Modeling – algorithms and software
13.
Our planet is a complex, dynamic, highly interconnected $54 Trillion system-of-systems (OECD-based analysis)
Communication: $3.96 Tn
Transportation: $6.95 Tn
Leisure / Recreation / Clothing: $7.80 Tn
Healthcare: $4.27 Tn
Food: $4.89 Tn
Infrastructure: $12.54 Tn
Govt. & Safety: $5.21 Tn
Finance: $4.58 Tn
Electricity: $2.94 Tn
Education: $1.36 Tn
Water: $0.13 Tn
Global system-of-systems $54 Trillion (100% of WW 2008 GDP)
Legend for system inputs: Same Industry, Business Support, IT Systems, Energy Resources, Machinery, Materials, Trade
Note: 1. Size of bubbles represents systems’ economic values 2. Arrows represent the strength of systems’ interaction
Source: IBV analysis based on OECD
This chart shows ‘systems‘ (not ‘industries‘)
14.
Economists estimate that all systems carry inefficiencies of up to $15 Tn, of which $4 Tn could be eliminated
Global economic value of the system-of-systems: $54 Trillion (100% of WW 2008 GDP)
Inefficiencies: $15 Trillion (28% of WW 2008 GDP)
Improvement potential: $4 Trillion (7% of WW 2008 GDP)
How to read the chart:
For example, the Healthcare system's value is $4,270B. It carries an estimated inefficiency of 42%. Of that 42% inefficiency, economists estimate that ~34% can be eliminated, which is 34% x 42% ≈ 14% of the system's value.
Source: IBM economists survey 2009; n= 480
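The chart-reading arithmetic can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted on the slide; the variable names are illustrative and not part of the original deck.

```python
# Figures quoted on the slide (USD billions and shares).
healthcare_value_bn = 4270      # value of the Healthcare system
inefficiency_share = 0.42       # estimated inefficiency as a share of system value
eliminable_share = 0.34         # share of that inefficiency that could be eliminated

# Inefficiency and improvement potential in USD billions.
inefficiency_bn = healthcare_value_bn * inefficiency_share
eliminable_bn = inefficiency_bn * eliminable_share

# Improvement potential as a share of the whole system's value (34% x 42%).
eliminable_share_of_value = inefficiency_share * eliminable_share

# The same kind of ratio check for the global totals (USD trillions).
global_value_tn = 54
global_inefficiency_tn = 15
global_improvement_tn = 4
inefficiency_pct_of_gdp = 100 * global_inefficiency_tn / global_value_tn
improvement_pct_of_gdp = 100 * global_improvement_tn / global_value_tn

print(f"Healthcare inefficiency: ${inefficiency_bn:,.0f}B")
print(f"Healthcare improvement potential: ${eliminable_bn:,.0f}B "
      f"({eliminable_share_of_value:.1%} of system value)")
print(f"Global inefficiency: {inefficiency_pct_of_gdp:.0f}% of WW 2008 GDP")
print(f"Global improvement potential: {improvement_pct_of_gdp:.0f}% of WW 2008 GDP")
```

The global ratios round to the 28% and 7% shown on the slide.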
Analysis of inefficiencies in the planet's system-of-systems
[Bubble chart. Axes: system inefficiency as % of total economic value; improvement potential as % of system inefficiency. Highlighted example: Healthcare, 42% inefficiency, of which 34% can be eliminated.]
Note: The size of each bubble indicates the absolute value of the system in USD billions.
Education: 1,360
Building & Transport Infrastructure: 12,540
Healthcare: 4,270
Government & Safety: 5,210
Electricity: 2,940
Financial: 4,580
Food & Water: 4,890
Transportation (Goods & Passenger): 6,950
Leisure / Recreation / Clothing: 7,800
Communication: 3,960
15.
History of analytics
16.
History of business analytics
17.
Business Analytics Examples
18.
Pit stop analytics
Calculations showed that time spent changing tires and refilling the tank was more than offset by the improved performance of the car on the track.
1. Softer tires stuck to the track better during turns than their harder cousins, though they wore out more quickly.
2. Less gas in the tank translated into a lighter, and therefore faster, car.
Optimized F1 pit teams can change four tires in two seconds
21.
We can collect information from almost everything to make better decisions
1 billion camera phones in existence able to document accidents, damage, and crimes
30 billion RFID tags embedded into our world and across entire ecosystems
85% of new automobiles will contain event data recorders collecting travel information
Instrumented, Interconnected, Intelligent
22.
Big data
Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analytics, and visualizing. Source: Wikipedia
24.
Applications of big data analytics
Homeland Security
Finance
Smarter Healthcare
Multi-channel sales
Telecom
Manufacturing
Traffic Control
Trading Analytics
Fraud and Risk
Log Analysis
Search Quality
Retail: Churn, NBO
25.
Police use analytics to reduce crime (video)
26.
Marketing and supply chain analytics (video)
28.
Intelligent transport systems
Real-time monitoring & forecasting of congestion in cities enables real-time action to reduce traffic and emissions
– Can charge drivers at point of use for access to city centers
Stockholm Congestion Tax Project
– Involves 18 barrier-free control points
– Allows differentiated pricing by time of day, congestion level, and potentially emissions level
– Results:
• Traffic reduced by 100,000 vehicle passages per day (25%)
• Public transportation passengers increased by 40,000 / day
• Congestion during peak hours and CO2 emissions were dramatically reduced
29.
Analytics for green vehicles and technology (video)
30.
Artificial intelligence
Source: A Brief Overview and Thoughts for Healthcare Education and Performance Improvement by Watson Team
31.
Artificial intelligence
Source: A Brief Overview and Thoughts for Healthcare Education and Performance Improvement by Watson Team
[Diagram: matching a question against an evidence passage]
Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
Evidence passage: On 27th May 1498, Vasco da Gama landed in Kappad Beach
Paraphrased passage: On the 27th of May 1498, Vasco da Gama landed in Kappad Beach
Terms highlighted in the question: celebrated, May 1898, 400th anniversary, arrival in, India, Portugal, explorer
Terms highlighted in the passage: 27th May 1498, Vasco da Gama, landed in, Kappad Beach
Reasoning types and their resources: Temporal Reasoning (Date Math), Statistical Paraphrasing (Paraphrases), GeoSpatial Reasoning (Geo-KB)
Search Far and Wide
Explore many hypotheses
Find Judge Evidence
Many inference algorithms
32.
Artificial intelligence (video)
36.
Bluemix
www.bluemix.net
38.
Business Analytics Education
39.
IBM Academic Initiative program
Cognos, SPSS, ILOG
40.
Master of Business Analytics programs – top 20 universities
41.
Industry support for Master of Business Analytics programs
42.
Business Analytics programs – curriculum
Applied Statistics and Probability
Fundamentals of Computational Mathematics
Data Mining and Knowledge Discovery
Simulation Modelling
Optimization
Financial Decision Making
Computational Methods for Business Data Analysis
Computational Finance and Risk Management
Visual Analytics and Knowledge Representation
Mathematical Modelling for Business
Machine Learning, Cognitive Computing and Artificial Intelligence
Marketing Analytics
Strategies for Managing Innovations
Analytics of Web, Social Networks and Business News
44.
What kind of data are we dealing with?
Types of data
•Quantitative
•Categorical (ordered, unordered)
Data collection
•Independent observations (one observation per subject)
•Dependent observations (repeated observation of the same subject, relationships within groups, relationships over time or space)
Type of data drives the direction of your analysis
•How to plot
•How to summarize
•How to draw inferences and conclusions
•How to issue predictions
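How the type of data drives the summary can be sketched in code. The example below is a minimal illustration under stated assumptions: the `summarize` helper and the sample values are hypothetical, and the numeric-versus-non-numeric test is a crude stand-in for a real judgment about data type (numeric codes such as postal codes are categorical even though they look like numbers).

```python
from collections import Counter
from statistics import mean, median


def summarize(values):
    """Choose a summary based on whether the values look quantitative or categorical."""
    is_quantitative = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_quantitative:
        # Quantitative data: averages are meaningful.
        return {"kind": "quantitative", "mean": mean(values), "median": median(values)}
    # Categorical data: count how often each category occurs.
    return {"kind": "categorical", "counts": dict(Counter(values))}


ages = [23, 35, 41, 29]                          # hypothetical quantitative sample
segments = ["gold", "silver", "gold", "bronze"]  # hypothetical categorical sample

age_summary = summarize(ages)
segment_summary = summarize(segments)
print(age_summary)
print(segment_summary)
```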
45.
Quantitative data
Examples: temperature, age, income
Quick check: “Does it make sense to calculate an average?”
Appropriate summary statistics:
–Mean and Median
–Standard Deviation
–Percentiles
More advanced predictive methods: Regression, Time Series Analysis, …
Plot your data!
46.
Summarizing quantitative data
One-number summaries
–Mean: the average, obtained by summing all observations and dividing by the number of observations.
–Median: the center value, below and above which you will find 50% of the observations.
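The two one-number summaries can be computed with Python's standard library. The sketch below uses two hypothetical datasets (not taken from the slides) that share a median but have very different shapes.

```python
from statistics import mean, median

# Two hypothetical samples: one tightly clustered, one widely spread.
clustered = [18, 19, 20, 21, 22]
spread_out = [2, 5, 20, 60, 95]

clustered_mean = mean(clustered)
clustered_median = median(clustered)
spread_mean = mean(spread_out)
spread_median = median(spread_out)

print(f"clustered:  mean={clustered_mean}, median={clustered_median}")
print(f"spread out: mean={spread_mean}, median={spread_median}")
```

Both samples have the same median, yet their means and ranges differ, so a single number hides the difference between them.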
Summarizing your data with one number may not tell the whole story:
[Figure: three distributions with medians 19.8, 19.8, and 10.5]
47.
Flaw of averages
“Plans based on average assumptions are wrong on average”
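One way to see why the quote holds: when the outcome depends nonlinearly on an uncertain input, the outcome at the average input differs from the average outcome. The sketch below is an illustration only; the capacity limit and the demand scenarios are hypothetical numbers, not from the presentation.

```python
from statistics import mean

CAPACITY = 100  # hypothetical: at most 100 units can be sold per period


def units_sold(demand, capacity=CAPACITY):
    """Sales are capped by capacity, which makes the outcome nonlinear in demand."""
    return min(demand, capacity)


# Hypothetical, equally likely demand scenarios.
demand_scenarios = [50, 100, 150]

# Plan built on the average assumption.
average_demand = mean(demand_scenarios)
plan_based_on_average = units_sold(average_demand)

# Average of the outcomes across all scenarios.
average_of_outcomes = mean(units_sold(d) for d in demand_scenarios)

print(f"Sales at average demand: {plan_based_on_average}")
print(f"Average sales over scenarios: {average_of_outcomes:.1f}")
```

The plan based on average demand overstates expected sales, because high-demand scenarios cannot make up for low-demand ones once capacity is reached.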
Average depth 3 ft
48.
Standard deviation
“Most observations fall within ±2 standard deviations of the mean.”
If the data is normally distributed, ~95% of observations fall within ±2 standard deviations of the mean.
Standard Deviation = 4.2
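The ±2 standard deviation rule can be checked numerically. This sketch assumes a mean of 19.8, which is not stated on this slide but is the midpoint of the interval it reports; the standard deviation of 4.2 is from the slide.

```python
from statistics import NormalDist

mean_value = 19.8   # assumption: midpoint of the interval reported on the slide
std_dev = 4.2       # standard deviation from the slide

# Interval covering mean ± 2 standard deviations.
lower = mean_value - 2 * std_dev
upper = mean_value + 2 * std_dev

# Share of a normal distribution that falls inside that interval.
dist = NormalDist(mu=mean_value, sigma=std_dev)
share_within = dist.cdf(upper) - dist.cdf(lower)

print(f"Interval: {lower:.1f} to {upper:.1f}")
print(f"Share of observations inside: {share_within:.1%}")
```

The exact share under a normal distribution is slightly above 95%, which is why the rule is usually quoted as approximate.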
~95% of observations between 11.4 and 28.2
49.
Descriptive statistics - example
Random sample of 5000 customers of a credit card company
Statistic | Amount spent on primary card last month | Debt to income ratio (x100)
N (valid) | 5000 | 5000
N (missing) | 0 | 0
Mean | 1683.7340 | 9.9578
Median | 1690.0670 | 8.8000
Std. Deviation | 210.26680 | 6.42317
Minimum | .00 | .00
Maximum | 2482.72 | 43.10
50.
Percentiles
Generalizations of the median (50th percentile).
The pth percentile is the data point below which p percent of the observations fall.
Often used to compare a single observation to a general population.
Examples:
–Standardized test scores: if you scored in the 93rd percentile, your score was higher than that of 93% of test takers.
–Child growth percentiles
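Comparing a single observation to a population can be sketched as follows. The `percentile_rank` helper and the test scores are hypothetical, and the quartile cut points rely on the default interpolation method of `statistics.quantiles`; other tools may use slightly different conventions.

```python
from statistics import median, quantiles


def percentile_rank(population, value):
    """Percent of observations in the population that fall strictly below value."""
    below = sum(1 for x in population if x < value)
    return 100.0 * below / len(population)


# Hypothetical population: 100 test takers with scores 1 through 100.
scores = list(range(1, 101))

my_rank = percentile_rank(scores, 94)
print(f"A score of 94 is higher than {my_rank:.0f}% of test takers")

# Quartiles (25th, 50th, 75th percentiles); the middle one is the median.
quartile_cuts = quantiles(scores, n=4)
print(f"Quartile cut points: {quartile_cuts}")
```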
51.
Percentiles - example
Percentiles can be another way of describing how spread out data values are. Example: 5-Number Summary: Minimum – 25th percentile – Median (50th percentile) – 75th percentile – Maximum
Statistic | Amount spent on primary card last month | Debt to income ratio (x100)
Minimum | .00 | .00
25th percentile | 1567.4658 | 5.1250
50th percentile | 1690.0670 | 8.8000
75th percentile | 1814.5430 | 13.5000
Maximum | 2482.72 | 43.10
52.
Distributions: Normal distribution
54.
Distributions
Estimate of the probability distribution of global mean temperature resulting from a doubling of CO2 relative to its pre-industrial value, made from 100,000 simulations