Big-data, real-time R?
Yes, you can!
David Smith
Revolution Analytics
@revodavid
REAL TIME
BIG DATA
PREDICTIVE ANALYTICS
Buzzword
Bingo!
Real-time Deployment
1. Data distillation
2. Model development and validation
3. Model deployment
4. Real-time model scoring
5. Model refresh
Photo: Sarah&Boston (flickr: pocheco), Creative Commons BY-SA 2.0
“Big Data”
1. Data Distillation in Hadoop
Unstructured Data
Analytics Data Mart
Structured Data
Log Files
Sensor Streams
Language Text
HDFS Load
Map-Reduce
RHadoop rmr
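The distillation step on this slide (unstructured logs reduced to structured data via map-reduce) can be sketched in a few lines. This is a minimal in-memory illustration in Python, not RHadoop or rmr code; the log format, the sample records, and the function names `map_line`, `reduce_counts`, and `distill` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical raw log lines in the form "timestamp user_id action".
RAW_LOG = [
    "2013-07-10T09:00:01 u1 view",
    "2013-07-10T09:00:05 u2 view",
    "2013-07-10T09:01:12 u1 purchase",
    "malformed line",
    "2013-07-10T09:02:40 u1 view",
]

def map_line(line):
    """Map step: emit ((user_id, action), 1) for each well-formed line."""
    parts = line.split()
    if len(parts) != 3:
        return []  # drop records that do not parse
    _, user_id, action = parts
    return [((user_id, action), 1)]

def reduce_counts(key, values):
    """Reduce step: total the counts emitted for one key."""
    return key, sum(values)

def distill(lines):
    """Run map, shuffle (group by key), and reduce in memory."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_line(line):
            groups[key].append(value)
    return dict(reduce_counts(k, v) for k, v in groups.items())

# Structured result: one row per (user, action) with a count.
table = distill(RAW_LOG)
```

On a real cluster the grouping between the map and reduce steps is done by the framework across machines; here it is a dictionary, which is enough to show how messy records become a small structured table.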
2. The Model Development Cycle
Feature Selection
Sampling
Aggregation
Variable Transformation
Model Estimation
Model Refinement
Model Comparison / Benchmarking
Predictive Model
R White Paper
bit.ly/r-is-hot
Structured Data
Big-Data Predictive Models with ScaleR
3. Deployment Options
Unknown factors
SQL / Rules Engine
Code (C++, Java, R, Hadoop)
PMML Engine
Factors known in advance
Batch Lookup Tables
Factors
Scores
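The split between "factors known in advance" and "unknown factors" can be illustrated with a small sketch. When every factor combination can be listed ahead of time, scores can be precomputed in batch and served from a lookup table; otherwise the model has to be evaluated at request time. The coefficients, factor names, and function names below are hypothetical, and a logistic model is assumed only for the sake of the example.

```python
import math

# Hypothetical logistic-regression coefficients exported from the modeling step.
INTERCEPT = -2.0
COEF = {"returning_customer": 1.2, "mobile_browser": -0.4}

def score(factors):
    """Score one case on demand: logistic function of a linear predictor."""
    z = INTERCEPT + sum(COEF[name] * value for name, value in factors.items())
    return 1.0 / (1.0 + math.exp(-z))

def build_lookup_table():
    """Batch option: precompute a score for every combination of known factors."""
    table = {}
    for returning in (0, 1):
        for mobile in (0, 1):
            factors = {"returning_customer": returning, "mobile_browser": mobile}
            table[(returning, mobile)] = score(factors)
    return table

LOOKUP = build_lookup_table()

def score_from_table(returning, mobile):
    """At request time the batch option is a dictionary lookup."""
    return LOOKUP[(returning, mobile)]
```

The lookup table only works while the factor space stays small and enumerable; continuous or high-cardinality factors push the deployment toward the code, SQL/rules, or PMML options listed on the slide.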
4. Real-Time Scoring
Factors
Scores
“IO VAPOURA” by Jaya Prime, flickr.com/photos/sanjayaprime/4924462993, CC-BY 2.0
Decision Tree
Logistic Regression
Neural Network
K-means clustering
Ensemble Model
Predictive Model
User ID
Browser
Time/Date / Location
Previous purchases
Friend data
Any known information
Product of most interest
Offer of most likely sale
Most relevant link
Forecast sale value
Optimal Bid
Prediction or Selection
Scoring Rules
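As a rough sketch of the factors-to-scores flow on this slide, the example below scores each candidate offer for one visitor and selects the "offer of most likely sale". The two offers, their coefficients, and the factor names are invented for illustration, and a logistic model per offer is an assumption; any of the model types listed above could sit behind the same interface.

```python
import math

# Hypothetical per-offer logistic models: an intercept plus weights on two factors.
OFFER_MODELS = {
    "blender_coupon": {"intercept": -1.5, "previous_purchases": 0.30, "is_evening": 0.2},
    "free_shipping": {"intercept": -1.0, "previous_purchases": 0.05, "is_evening": 0.6},
}

def sale_probability(model, factors):
    """Estimated probability of a sale for one offer given the known factors."""
    z = model["intercept"] + sum(model[name] * value for name, value in factors.items())
    return 1.0 / (1.0 + math.exp(-z))

def best_offer(factors):
    """Score every offer and return the most likely sale plus all scores."""
    scores = {
        offer: sale_probability(model, factors)
        for offer, model in OFFER_MODELS.items()
    }
    return max(scores, key=scores.get), scores

# One incoming request: the factors known at that moment.
offer, scores = best_offer({"previous_purchases": 5, "is_evening": 0})
```

The scoring function itself is cheap arithmetic, which is why a model fitted offline on large data can still answer within a real-time latency budget.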
5. Model refresh
Factors
Scores
Actual Outcomes
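One way to read this slide is as a feedback loop: scores are compared with the actual outcomes that arrive later, and the model is re-estimated when the two drift apart. The sketch below assumes probabilistic scores and 0/1 outcomes and uses the Brier score as the error measure; the metric, the threshold, and the function names are illustrative choices, not something the slides specify.

```python
def brier_score(scores, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((s - o) ** 2 for s, o in zip(scores, outcomes)) / len(scores)

def needs_refresh(scores, outcomes, baseline, tolerance=0.05):
    """Flag a refresh when recent error exceeds the validation baseline by more than tolerance."""
    return brier_score(scores, outcomes) > baseline + tolerance

# Hypothetical recent scores with two possible sets of observed outcomes.
recent_scores = [0.9, 0.8, 0.2, 0.7]
good_outcomes = [1, 1, 0, 1]      # outcomes agree with the scores
drifted_outcomes = [0, 0, 1, 0]   # outcomes contradict the scores

refresh_if_good = needs_refresh(recent_scores, good_outcomes, baseline=0.05)
refresh_if_drifted = needs_refresh(recent_scores, drifted_outcomes, baseline=0.05)
```

In practice the comparison would run over a rolling window of recent traffic, and a flagged refresh would send the logged factors and outcomes back into the model development cycle.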
Big Data Real Time
Kilobytes/Sec
Megabytes/Sec
Gigabytes
Terabytes
Petabytes
Exabytes
Seconds
Milliseconds
Minutes
Minutes
Hours
Real-World Examples
Revolution Analytics Case Studies
Why did I buy that blender?
Just browsing in the mall
TV ad / magazine ad
Coupon in the mail
“Just moved” promo email
Webstore recommendation
Browsing catalog
UpStream: Attribution Modeling
• ETL
• Marketing channel data
• Behavioral variables
• Promotional data
• Overlay data
• Exploratory data analysis
• Time-to-event models
• GAM survival models
• Scoring for inference
• Scoring for prediction
• 5 billion scores per day per retailer
UPSTREAM DATA FORMAT
CUSTOM VARIABLES (PMML)
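To put the "5 billion scores per day per retailer" figure in real-time terms, a quick back-of-the-envelope conversion helps. The calculation below assumes the load is spread evenly over 24 hours, which real traffic will not be, so it gives an average rate rather than a peak.

```python
SCORES_PER_DAY = 5_000_000_000   # figure quoted on the slide, per retailer
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

# Average scoring rate, assuming perfectly uniform load (an assumption).
average_rate = SCORES_PER_DAY / SECONDS_PER_DAY

# Time available per score if a single sequential stream had to keep up.
average_budget_us = 1_000_000 / average_rate

print(f"{average_rate:,.0f} scores/sec on average")
print(f"{average_budget_us:.2f} microseconds per score for one stream")
```

That works out to roughly 58,000 scores per second on average, or about 17 microseconds per score for a single stream, which suggests why scoring at this volume is typically batched or run in parallel rather than handled one request at a time.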
ACI
Top-20 mutual fund company
$125B assets
Research and data-driven
Innovative
• Collaboration
• Speed
• Deployment Process
• Adoption
• Results
Analytics Function Library
rACI Package (w/ RevoR)
Model Building Function Library
Data Acquisition Function Library
Portfolio Optimization and Simulation API
Market Data from Thomson Reuters (QA-Direct)
American Century Quant Proprietary Data
Additional 3rd Party Data Vendors
Live Analytics
PRODUCTION MODEL GENERATION AND TRADING PROCESSES
Data Feeds
PREDICTIVE
ANALYTICS
BIG DATA
REAL TIME
www.revolutionanalytics.com +1 650 646 9545 Twitter: @RevolutionR
The leading enterprise provider of software and services for Open Source R
Big-Data, Real-Time R?
Yes, you can!
David Smith
@revodavid

Big data real time R - useR! 2013 - David Smith


Editor's notes

  1. Fast. Scalable. In production.
  2. Data as “new oil”: a valuable commodity. Big Data is crude oil: messy, hard to get at, and with contaminants in it.
  3. Model development process. Not just about the computational speed. Also about the productivity of the developer.
  4. Start off with stuff we know in real time.
  5. Demographics: consumer, product, market. Actions: web clicks, email clicks, mobile app usage, call center logs, social, search … Outcomes: impressions, touches, orders (retail, online, mobile). Strategic allocation.
  6. Outcome is “buying” instead of “dying”
  7. From Revolution Analytics. We help companies deploy predictive models created in R to real-time production systems.