1. TECHNOLOGY STRATEGIES
FOR BIG DATA ANALYTICS
BERNARD BLAIS
PRINCIPAL, GLOBAL TECHNOLOGY PRACTICE
Copyright © 2012, SAS Institute Inc. All rights reserved.
2. THE CHALLENGE?
[Chart labels: VOLUME, VARIETY, VELOCITY; axes: DATA SIZE, TODAY to THE FUTURE]
3. Technology Checklist for Big Data Analytics
• A flexible architecture that supports many data types and usage patterns
• Upstream use of analytics to optimize data relevance
• Real-time visualization and advanced analytics to accelerate understanding and action
• Collaborative approaches to align Business and IT executives
4. THE ANALYTICS LIFECYCLE
How can you create competitive advantage?
Lifecycle stages: Identify / Formulate Problem; Data Preparation; Data Exploration; Transform & Select; Build Model; Validate Model; Deploy Model; Evaluate / Monitor Results
Roles:
• BUSINESS MANAGER: Domain Expert; Makes Decisions; Evaluates Processes and ROI
• DATA SCIENTIST: Data Exploration; Data Visualization
• IT SYSTEMS / MANAGEMENT: Model Validation; Model Deployment; Model Monitoring; Data Preparation
• DATA MINER / STATISTICIAN: Exploratory Analysis; Descriptive Segmentation; Predictive Modeling
5. HIGH-PERFORMANCE ANALYTICS: KEY COMPONENTS
6. HIGH-PERFORMANCE ANALYTICS: KEY COMPONENTS
[Diagram: the analytics lifecycle from slide 4, repeated under this heading with the same roles and stages.]
7. HIGH-PERFORMANCE ANALYTICS: KEY COMPONENTS
CORE OPPORTUNITY
• PREPARE (Grid Computing / In Memory): BIGGER DATA
• DEVELOP (In Memory): BETTER RESULTS
• DEPLOY (In Database / In Memory): FASTER DECISIONS
8. HIGH-PERFORMANCE ANALYTICS: SAS® GRID COMPUTING
9. HIGH-PERFORMANCE ANALYTICS: SAS® IN-DATABASE
10. HOW DO WE MANAGE DATA IN THE PHYSICAL WORLD?
1. Acquire 2. Determine Relevance
3. Store
Trash Cache Storage
11. HOW DO WE MANAGE INFORMATION IN THE IT WORLD?
• Relevance is traditionally determined at query time: “Acquire, Store, Analyze”
• A Big Data Analytics strategy requires a new approach: “Stream it, Score it, Store it”
[Diagram labels: Users, Queries, Systems, Data Acquisition, Data Transformations, Data Normalization, DATA]
12. INFORMATION MANAGEMENT: STREAM IT, SCORE IT, STORE IT
[Diagram labels: ENTERPRISE; DECISIONS / ACTIONS / DATA; LOW COST STORAGE; RAW; RELEVANT DATA]
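The “Stream it, Score it, Store it” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SAS's implementation: the `Sinks` class, the `toy_score` rule, the threshold, and the sample events are all hypothetical, and routing raw data to a low-cost archive while only relevant data reaches the curated store is one reading of the diagram labels.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Sinks:
    """Hypothetical destinations: a cheap raw archive and a curated store."""
    raw_archive: list = field(default_factory=list)
    relevant_store: list = field(default_factory=list)


def stream_score_store(events: Iterable[dict],
                       score: Callable[[dict], float],
                       threshold: float,
                       sinks: Sinks) -> Sinks:
    """Decide relevance as each event arrives, not later at query time."""
    for event in events:                     # stream it
        relevance = score(event)             # score it
        sinks.raw_archive.append(event)      # raw copy goes to low-cost storage
        if relevance >= threshold:           # store it (relevant data only)
            sinks.relevant_store.append({**event, "relevance": relevance})
    return sinks


def toy_score(event: dict) -> float:
    # Made-up relevance rule: larger amounts are more relevant, capped at 1.0.
    return min(event["amount"] / 1000.0, 1.0)


events = [{"id": 1, "amount": 50}, {"id": 2, "amount": 900}, {"id": 3, "amount": 1500}]
sinks = stream_score_store(events, toy_score, threshold=0.8, sinks=Sinks())
print(len(sinks.raw_archive), [e["id"] for e in sinks.relevant_store])
```

In the traditional “Acquire, Store, Analyze” pattern, the `score` call would instead happen inside each later query, against everything that had been stored.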
13. CUSTOMER CASE STUDY: TRADITIONAL ANALYTICS PROCESS
3 HRS
DATA EXPLORATION, MODEL DEVELOPMENT, MODEL DEPLOYMENT
14. CUSTOMER CASE STUDY: HIGH-PERFORMANCE ANALYTICS PROCESS
Past Approach
• Daily process begins with flat file creation at 6:30am; SLA delivered at ~9:30am.
• File transferred to SQL Server, limited to ~350K customer records based on specific criteria.
• 300 step process to support data mining life cycle.
30 MINUTES TO SCORE ~350K customers
In-Database Approach
• Daily process begins at 4:00am with EDW load.
• All business operational data loaded directly to EDW. No flat file or intermediate processing is needed.
• 10 step process
• Scoring and customer selection done in-database against ALL customer rows: 12 minutes
4 MINUTES TO SCORE ~40M customers
Value
- Scope of customer analysis: 350K vs. 40M
- Monthly collections: $1M-$3M per month
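To make “scoring and customer selection done in-database” concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for the EDW. Everything in it is an assumption for illustration: the `customers` table, its columns, the linear score coefficients, and the 0.5 cut-off are invented, and the customer's actual model and database are not described in the slides. The point is only that the score is expressed as SQL, so rows are scored where they live instead of being exported to a flat file first.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, balance REAL, days_late INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 1200.0, 5), (2, 300.0, 45), (3, 80.0, 90), (4, 2500.0, 0)],
)

# Hypothetical linear scoring model pushed down into the database as SQL.
# Scoring and selection both run inside the engine, against every row.
SCORE_SQL = """
SELECT id, 0.02 * days_late + 0.0001 * balance AS score
FROM customers
WHERE 0.02 * days_late + 0.0001 * balance >= ?
ORDER BY score DESC
"""

selected = conn.execute(SCORE_SQL, (0.5,)).fetchall()
selected_ids = [row[0] for row in selected]
print(selected_ids)
```

Only the short list of selected customers leaves the database; the extract, transfer, and load steps of the past approach have no counterpart here.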
15. HIGH-PERFORMANCE ANALYTICS: SAS® IN-MEMORY ANALYTICS
16. IN-MEMORY ARCHITECTURE: EXPLORATION AND VISUALIZATION
> 1.1 BILLION RECORDS
10 SECONDS
17. IN-MEMORY ARCHITECTURE: MODEL DEVELOPMENT & DEPLOYMENT
From 5½ HRS to 82 SECONDS
18. CUSTOMER CASE STUDY: Customer Segmentation
Tailored and Real-time Marketing Campaigns
Billions of Purchase Transactions
19. CUSTOMER CASE STUDY: TRADITIONAL ANALYTICS PROCESS
167 Hours
DATA EXPLORATION, MODEL DEVELOPMENT, MODEL DEPLOYMENT
20. CUSTOMER CASE STUDY: IN-MEMORY ANALYTICS PROCESS
DATA EXPLORATION, MODEL DEVELOPMENT, MODEL DEPLOYMENT: from 167 Hours to 84 SECONDS
Bottom-line Impact: Tens of Millions of Dollars
21. SAS HIGH-PERFORMANCE ANALYTICS
22. SAS HIGH-PERFORMANCE ANALYTICS
23. SAS HIGH-PERFORMANCE ANALYTICS
24. BEST PRACTICE: Business Analytics Maturity Assessment
Overview:
Two-day on-site discovery session focused on understanding the client’s business and IT
objectives, key initiatives, existing information management and analytics architecture, top
challenges, and priorities.
Process:
• Review current business requirements, timeframes, critical success factors, and key
business metrics (e.g. customer retention, customer acquisition).
• Review operational data sources to support business priorities.
• Review analytical priorities, strategy, process, and gaps.
Deliverables:
• Technology roadmap to optimize the client’s current and future IT-enabled analytical
process.
• Projected high-level ROI analysis resulting from proposed analytical architecture and
process improvements.
25. SAS: PROVEN VALUE PROPOSITION ACROSS MULTIPLE INDUSTRIES
(Industry / Use case / Value)
• FINANCIAL SERVICES / Risk Management / 356X faster risk calculations; faster in/out markets
• PUBLIC SECTOR / Revenue Leakage / Better able to audit; detect issues pre-refund
• TELCO / Campaign Optimization / 15% better campaign response rates
• RETAIL / Inventory Management / Markdown optimization, from 30 hours to 2 hours
• SERVICES COMPANY / Promotions Management / More precise than competition; coupon redemption rate +15%
26. RETAIL USE CASE: In-database Model Scoring
4½ HRS
Overview:
The largest customer behavior marketing company in the world, Catalina Marketing analyzes and
predicts shoppers’ buying behaviors to generate customized point-of-sale color coupons,
advertisements and informational messages for retail stores and pharmacies nationwide.
Process and Deliverables:
Leveraging in-database scoring, Catalina Marketing automated the execution of scoring models against its entire database of 140 million consumers.
Impact:
Catalina Marketing has reduced its model-scoring times from 4.5 hours to around 60 seconds
using SAS Scoring Accelerator. As a result, it is able to use more complex, varied models to obtain
analytical results faster for more efficient, reliable decisions -- improving brand performance on
behalf of its food, drug, and mass advertising and marketing partners.
60 SECONDS
Implementation of marketing campaigns in days vs. more than 1 month before.
27. FINANCIAL SERVICES USE CASE: Credit Risk on Banking Data
Overview:
Data Source: Bank loan portfolio covering:
3 million loans;
5,000 stress scenarios;
40 time horizons;
Transition matrix approach
3
MINUTES
Process and Deliverables:
Estimates of credit losses under stress over multiple horizons.
Completed compute time: under 3 minutes.
Impact:
Fast estimates of credit losses under stress over multiple horizons enable the bank to make changes to lending practices throughout the day.
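As a rough illustration of what a transition matrix approach involves, the sketch below propagates a portfolio through credit states over several horizons under a base and a stressed scenario. It is a toy under loud assumptions: the three states (performing, delinquent, default), both matrices, and the four horizons are invented, and the slides do not describe the bank's actual model. The last lines only multiply out the portfolio dimensions quoted on the slide.

```python
def step(dist, matrix):
    """Advance a state distribution by one period: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def default_share_by_horizon(start, matrix, horizons):
    """Cumulative share of the portfolio in default after 1..horizons periods."""
    shares = []
    dist = list(start)
    for _ in range(horizons):
        dist = step(dist, matrix)
        shares.append(dist[-1])  # last state is default, treated as absorbing
    return shares


# Hypothetical one-period transition matrices; each row sums to 1.
# States: performing, delinquent, default.
BASE = [[0.95, 0.04, 0.01],
        [0.30, 0.60, 0.10],
        [0.00, 0.00, 1.00]]
STRESS = [[0.90, 0.07, 0.03],
          [0.20, 0.60, 0.20],
          [0.00, 0.00, 1.00]]

start = [1.0, 0.0, 0.0]  # every loan starts as performing (assumption)
base = default_share_by_horizon(start, BASE, horizons=4)
stress = default_share_by_horizon(start, STRESS, horizons=4)

# Scale implied by the slide if every loan is evaluated in every scenario
# and at every horizon: 3 million loans x 5,000 scenarios x 40 horizons.
combinations = 3_000_000 * 5_000 * 40
print(base, stress, combinations)
```

The product of the three portfolio dimensions is 600 billion loan-scenario-horizon combinations, which gives a sense of why the compute time is the headline figure.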
28. PUBLIC SECTOR USE CASE: Text Mining on Unstructured Data
5½ HRS
Overview:
Data source: 700,000 accident reports from the USA’s National Highway Traffic Safety Administration, covering vehicle makes and models, manufacturing date, purchase date, failures, mileage, number of cylinders, car components, accident information, etc.
Process and Deliverables:
Text Mining on accident reports. Analyze, Understand, Validate and Predict contents.
Report on content categorization. Text mining process runs in 1 minute 22 seconds on a High Performance Analytics Server, instead of 5½ hours on a regular server.
Impact:
A 99% reduction in run time means the whole process can now be considered an ITERATIVE, DYNAMIC process.
An analyst can run it 20 times before lunch, each time fine-tuning the model and improving the output, instead of maybe twice during the whole week.
82 SECONDS
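The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the two run times quoted above (5½ hours and 82 seconds); the variable names are ours.

```python
before_s = 5.5 * 3600   # 5½ hours on a regular server, in seconds
after_s = 82            # 1 minute 22 seconds on the high-performance server

speedup = before_s / after_s              # how many times faster
reduction = 1 - after_s / before_s        # fraction of run time removed
twenty_runs_min = 20 * after_s / 60       # wall-clock minutes for 20 runs

print(f"{speedup:.1f}x faster, {reduction:.1%} less time, "
      f"20 runs in {twenty_runs_min:.1f} minutes")
```

Twenty runs take well under half an hour, which is consistent with the “20 times before lunch” claim, and the time reduction rounds to the 99% quoted.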
29. UTILITIES USE CASE: Forecasting On Smart Meter Data
Overview:
Oklahoma Gas & Electric Company (OG&E) serves nearly 800,000 customers in Oklahoma and western Arkansas. It was named the 2011 Utility of the Year.
OG&E uses SAS Analytics to forecast energy demand, plan for future changes to its energy portfolio, and optimize programs that encourage wiser use of energy.
Process and Deliverables:
Use smart meter data coming from customers every 15 minutes (versus once a month) to
create and measure the effectiveness of programs that reduce energy consumption.
Impact:
What previously took one to three days can now be done in a matter of hours.
We've gone from receiving 12 records for each customer per year to over 30,000 records per year.
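The jump from 12 to “over 30,000” records follows directly from the 15-minute reading interval. A quick check, using the slide's figures and treating “nearly 800,000 customers” as an upper bound of 800,000 (our assumption):

```python
readings_per_hour = 60 // 15                     # one reading every 15 minutes
records_per_year = readings_per_hour * 24 * 365  # per customer, non-leap year
monthly_records_per_year = 12                    # one reading per month

growth_factor = records_per_year / monthly_records_per_year
fleet_records_per_year = 800_000 * records_per_year  # upper bound (assumption)

print(records_per_year, growth_factor, fleet_records_per_year)
```

That is 35,040 records per customer per year, 2,920 times the monthly volume, and on the order of 28 billion records per year across the customer base.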
30. CONCLUSION: What High-Performance Analytics Really Means
It’s not just about incredible speed; it’s also about:
Confidence: No more sampling, subsetting, summarizing
Accuracy: More complex models, more variables
Efficiency: Leverage the Analytical Brain on valuable tasks
Agility: Adapt and (re)Act faster
31. Technology Checklist for Big Data Analytics
• A flexible architecture that supports many data types and usage patterns
• Upstream use of analytics to optimize data relevance
• Real-time visualization and advanced analytics to accelerate understanding and action
• Collaborative approaches to align Business and IT executives