This document summarizes a presentation given by Revolution Analytics on using R for marketing analytics. It discusses challenges like needing to make decisions faster based on more data and predictive models. It provides examples of companies using Revolution's R software to improve results, such as increasing lift for a client by 14% and saving another $270k. The presentation promotes Revolution's R software for handling big data and analytics faster through techniques like parallel processing and distributed computing. It argues Revolution R is the leading commercial provider of high performance R software.
5. Revolution Confidential
Today's Challenge: Accelerating Business Cadence
Changing Business Environment
• Fact-Based Decisions Require More Data
• Need to Understand Tradeoffs and Best Course of Action
• Predictive Models Need to Continually Deliver Lift
• Reduced Shelf Life for Predictive Models
Faster Time to Value
• Reduce Analytic Cycle Time
• Build & Deploy Models Faster
• Eliminate Time-Consuming Data Movements
Rapid Customer-Facing Decisions
• Score More Frequently
• Need to Make Best Decision in Real Time
10. Can we be more innovative in marketing analytics… and precise in our targeting… using new and "old" data… in less time?
11. How fast can the marketing data scientist innovate to drive better precision in model output? …
… and can you get it (scale of data / scale of model scoring) into production?
… at an acceptable price point?
13. ScaleR: High Performance Scalable Parallel External Memory Algorithms
Data Prep, Distillation & Descriptive Analytics
R Data Step
• Data import: Delimited, Fixed, SAS, SPSS, ODBC
• Variable creation & transformation
• Recode variables
• Factor variables
• Missing value handling
• Sort
• Merge
• Split
• Aggregate by category (means, sums)
Descriptive Statistics
• Min / Max
• Mean
• Median (approx.)
• Quantiles (approx.)
• Standard Deviation
• Variance
• Correlation
• Covariance
• Sum of Squares (cross product matrix for set variables)
• Pairwise Cross tabs
• Risk Ratio & Odds Ratio
• Cross-Tabulation of Data (standard tables & long form)
• Marginal Summaries of Cross Tabulations
Statistical Tests
• Chi Square Test
• Kendall Rank Correlation
• Fisher's Exact Test
• Student's t-Test
Sampling
• Subsample (observations & variables)
• Random Sampling
14. ScaleR: High Performance Scalable Parallel External Memory Algorithms
Statistical Modeling
Predictive Models
• Sum of Squares (cross product matrix for set variables)
• Multiple Linear Regression
• Generalized Linear Models (GLM): all exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie); standard link functions including cauchit, identity, log, logit, probit; user-defined distributions & link functions
• Covariance & Correlation Matrices
• Logistic Regression
• Classification & Regression Trees
• Predictions/scoring for models
• Residuals for all models
Data Visualization
• Histogram
• Line Plot
• Scatter Plot
• Lorenz Curve
• ROC Curves (actual data and predicted values)
Variable Selection
• Stepwise Regression
Machine Learning
Cluster Analysis
• K-Means
Classification
• Decision Trees
Simulation
• Monte Carlo
• Parallel Random Number Generation
15. Revolution example: multi-use predictive analytics
• User Churn: predict the likelihood of a user leaving a particular game
• User Community Impact: understand the impact players have on communities
• Promotional Pricing: understand user purchase behavior better
• Game Content Optimization: understand user behavior to develop new games
16. Example of what we do: DataSong, marketing attribution and optimisation
Company: Data Song Software, San Francisco, www.datasong.com
Industry: software / services for marketing attribution and campaign optimization
Challenge: economically develop a scalable, high-performing R-powered Big Data Analytics platform on which to provide services to clients
Solution:
• Revolution R Enterprise for Big Data Analytics and Hadoop for data management
• Customized exploratory data analysis and GAM survival models to drive NBA and targeting
• Saved one client $270,000 on one campaign
• Generated 14% lift for another client
"We saw about a 4x performance improvement on 50 million records. It works brilliantly."
- CEO, John Wallace, DataSong
17. Example of what we do: [X+1], digital marketing analytics
Company: [X+1], New York, www.xplusone.com
Industry: software and services for optimized digital marketing through multi-channel visitor experiences on personalized websites and real-time digital audience targeting
Challenge: needed real-time analytics and automated model updates, and needed to include new data types and manage quickly growing data volumes
Solution:
• Revolution R Enterprise for Big Data Analytics and a distributed computing platform for data management
• Higher lift of real-time multi-channel ad targeting analytics derived from use of more data and attributes
• Higher lift through higher-precision audience targeting and tailored messaging
2X data, 2X attributes, no impact on performance
19. PEMAs Beat In-Memory Algorithms
• Parallel external memory algorithms (PEMAs)
• Exploit distributed and streaming data
• Deliver scalability and performance
• Split computations so not all data has to be in memory at one time
• "Automatically" parallelize and distribute algorithms
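The idea of splitting a computation so that not all data is in memory at once can be illustrated with a small sketch. This is not ScaleR's implementation, and the deck does not show any code. It is a minimal Python example, using hypothetical function names (`chunks`, `summarize`, `merge`, `chunked_mean_variance`), of how a mean and sample variance might be computed one chunk at a time by reducing each chunk to a small summary and then merging the summaries with the standard pairwise update formula. It assumes the input holds at least two values.

```python
from functools import reduce
from itertools import islice


def chunks(values, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(values)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def summarize(block):
    """Reduce one chunk to (count, mean, sum of squared deviations)."""
    n = len(block)
    mean = sum(block) / n
    m2 = sum((x - mean) ** 2 for x in block)
    return n, mean, m2


def merge(a, b):
    """Combine two partial summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


def chunked_mean_variance(values, chunk_size=1000):
    """Mean and sample variance, holding only one chunk in memory at a time.

    Assumes `values` yields at least two numbers.
    """
    partials = map(summarize, chunks(values, chunk_size))
    n, mean, m2 = reduce(merge, partials)
    return mean, m2 / (n - 1)


if __name__ == "__main__":
    # Hypothetical data: a generator stands in for rows streamed from disk.
    stream = (float(i % 7) for i in range(10_000))
    print(chunked_mean_variance(stream, chunk_size=512))
```

Because `summarize` only needs its own chunk and `merge` only needs two small summaries, the per-chunk step could in principle run on separate cores or machines, which is the property the slide attributes to PEMAs.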
20. Revolution R Enterprise: High Performance, Multi-Platform Analytics Platform
DeployR: Web Services Software Development Kit
DevelopR: Integrated Development Environment
ConnectR: High Speed & Direct Connectors (Teradata, HDFS (both), HBase, Netezza, SAS, SPSS, CSV, ODBC)
ScaleR: High Performance Big Data Analytics
DistributedR: Streaming, In-Memory Distributed Computing Framework (IBM PureData, IBM Platform LSF, HPC Server, MS Azure Burst, Windows & Red Hat Servers)
RevoR: Performance Enhanced Open Source R + Open Source R packages