Revolution Confidential




    Revolution R Enterprise for IBM Netezza




1                                        © 2012 IBM Corporation
IBM Netezza with Revolution Analytics

• High-performance, in-database analytics platform for Big Data
     – Massively parallel processing delivers 10-100x performance
     – Run analytics in-database and eliminate data movement
     – Scalable architecture fosters experimentation
• Innovation with Advanced Analytics
     – Analytic modeling with most current statistical methods and 2,500+ open source packages
• Enterprise ready advanced analytics software, services & support
     – Security, IDE, training, professional services
     – Web Services stack enables integration with front-end presentation layer

Revolution Analytics




March 1, 2012
What is R?

• Data analysis software
• A programming language
     – Development platform designed by and for statisticians
     – Object-oriented: vector, matrix, model, …
     – Built-in libraries of algorithms
• An environment
     – Huge library of algorithms for data access, data manipulation, analysis and graphics
• An open-source software project
     – Free, open, and active
• A community
     – Thousands of contributors, 2 million users
     – Resources and help in every domain

Download the white paper "R is Hot": bit.ly/r-is-hot

• Most advanced statistical analysis software available
• Half the cost of commercial alternatives
• 2M+ Users
• 2,500+ Applications

"The professor who invented analytic software for the experts now wants to take it to the masses"

Statistics, Predictive Analytics, Data Mining, Visualization
Finance, Life Sciences, Manufacturing, Retail, Telecom, Social Media, Government
Power, Productivity, Enterprise Readiness

Revolution R Enterprise has the Open-Source R Engine at the core

[Diagram components]
• R Engine, Language Libraries
• Open Source R Packages: 2,500 community packages and growing exponentially
• Build Assurance
• Multi-Threaded Math Libraries
• Technology Partners
• Web Services API
• Big Data Analysis
• Parallel Tools
• Technical Support
• Revolution Productivity Environment

Working with Revolution R Enterprise for IBM Netezza

Revolution R Enterprise for IBM Netezza inside the IBM Netezza Architecture

[Diagram: IBM Netezza Analytics]

In-Database Paradigms for using R

• In-database Scoring
     – Family of apply functions which score analytic models by using data parallelism
     – Underlying truism is that there is a fact that can be applied across all data
     – Examples: customer lifetime value, credit score, affinity, good stock/bad stock
• Big Data Analytics
     – Family of parallelized, in-database analytics that have R wrappers and work on the entire data set
     – Underlying truism exists across all data
     – Examples: clustering of all data to determine groupings; models that are applied across a whole data set (decision trees); data transformation (variable selection, correlation)
• Grouped by Row (tapply)
     – Data and task parallelism: a data flow technique to apply analytics to naturally occurring groups of data using non-parallelized analytics
     – Underlying relationship in data is by a group
     – Examples: forecasting by store, stock symbol, etc.; building a model for each customer, product, etc.
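
The "Grouped by Row" paradigm above can be illustrated locally with base R. This is only a sketch of the idea (apply an ordinary, non-parallelized function independently to each naturally occurring group); it runs in local memory, not in-database, and the `sales` data frame and `trend_slope` helper are hypothetical names made up for the illustration.

```r
# Hypothetical local data: weekly unit sales for two stores (not from the deck).
sales <- data.frame(
  store = c("A", "A", "A", "B", "B", "B"),
  week  = c(1, 2, 3, 1, 2, 3),
  units = c(10, 12, 14, 20, 18, 16)
)

# An ordinary, non-parallelized analytic: slope of a linear trend of units over weeks.
trend_slope <- function(df) unname(coef(lm(units ~ week, data = df))["week"])

# split() forms one data frame per store; sapply() runs the function on each group.
slopes <- sapply(split(sales, sales$store), trend_slope)
print(slopes)
```

Because each group is processed independently, the same per-group function could in principle be handed to different workers, which is the property the in-database grouped apply relies on.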


Access In-Database Language Support from R

SQL, Java, C, Python, Fortran, C++

Open Source R Package Support

2500+ community packages

Horizontal
• Bayesian
• Cluster
• Distributions
• Graphics
• Graphical Models
• Machine Learning
• Multivariate
• Natural Language Processing
• Optimization
• Robust Statistical Metrics
• Spatial
• Survival Analysis
• Time Series

Vertical
• Econometrics
• Experimental Design
• Computational Physics
• Clinical Trials
• Environmetrics
• Finance
• Genetics
• Medical Imaging
• Pharmacokinetics
• Phylogenetics
• Psychometrics
• Social Sciences

Using Revolution R Enterprise with IBM Netezza

[Architecture diagram]
• Business Intelligence, Excel or Third-Party Application, connected over HTTP to the RevoDeployR Server (Web Services Interface for R)
• Revolution R Enterprise - Workstation, using RODBC & nzODBC
• Revolution R Enterprise - Server, using RODBC & nzODBC
• R packages integrate and push analytics processing in-database
• IBM Netezza appliance: a Host and five S-Blades, each running IBM Netezza Analytics

Deploying Revolution R Enterprise to IBM Netezza

• Open a remote terminal connection to the Host
• Create your R script
• Compile and register your R script as an AE (UDAP)
• Execute SQL that will invoke the registered AE
• Go back to the Revolution R client to retrieve results and continue additional analysis

[Diagram: IBM Netezza appliance with a Host and five S-Blades, each running IBM Netezza Analytics]

Revolution R Enterprise Client Configuration

• Revolution R Enterprise
     – Productivity Environment
• R Package Dependencies
     – RODBC
     – caTools
     – tree
     – bitops
     – e1071
     – rgl
     – ca
     – MASS
     – XML
• Netezza ODBC Drivers
• 'nz' R Packages
     – nzA, nzR, nzMatrix

IBM Netezza In-Database Analytics from Revolution R

nzR Package
• Encapsulate database and expose "R"-like constructs
• R data.frame = database table
• Apply an R function to a row of data or grouped rows of data

nzA Package
• Entry point to the nzAnalytics
• Explicitly parallelized algorithms that run in database

nzMatrix Package
• Encapsulation of matrices and operations in database
• nz.matrix construct in R to access matrices in the database
• R operations on nz.matrix translate to matrix stored procedure operations

nzR Package

Basic Functions
       Database Connection    nzConnect, nzConnectDSN
       SQL Execution          nzQuery, nzScalarQuery
       Data Management        nzDeleteTable, as.nz.data.frame, nz.data.frame
       Apply an R function    nzApply, nzTApply, nzGroupedApply
       R Package Management   nzInstallPackages, nzIsPackageInstalled

Sample Code
       # load packages
       library(nzr)

       # connect to a database via ODBC
       nzConnect("admin", "xyz", "127.0.0.1", "iclasstest")

       # load the iris table
       nzdf <- nz.data.frame("iris")

       # run nzTApply against the nz data frame
       fun <- function(x) max(x[,1])
       nzTApply(nzdf, nzdf[,5], fun)
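
For readers without access to an appliance, the sample above has a rough local analogue in base R, using R's built-in `iris` data instead of a database table. This is a sketch under an assumption: that the in-database call groups rows by the fifth column and applies `fun` to each group, as the slide's "grouped rows" description suggests. The deck does not show the actual output of `nzTApply`.

```r
# Local, in-memory analogue of the grouped apply, using the built-in iris data.
fun <- function(x) max(x[, 1])      # maximum of the first column (Sepal.Length)

groups <- split(iris, iris[, 5])    # one data frame per value of column 5 (Species)
result <- sapply(groups, fun)       # apply fun to each group independently
print(result)
```

The result is one maximum sepal length per species: 5.8 for setosa, 7.0 for versicolor, and 7.9 for virginica.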




nzA Package

 Data Manipulation
      Moments                         nz.moments
      Quantiles                       nz.quantile, nz.quartile
      Outlier Detection               nz.outliers
      Frequency Table                 nz.bitable
      Histogram                       nz.hist
      Pearson's Correlation           nz.corr
      Spearman's Correlation          nz.spearman.corr, nz.spearman.corr.s
      Covariance                      nz.cov, nz.cov.matrix
      Mutual Information              nz.mutualinfo
      Chi-Square Test                 nzChisq.test, nz.chisq.test
      t-Test                          t.ls.test, t.me.test, t.pmd.test, t.umd.test

      Mann-Whitney-Wilcoxon Test      nz.mww.test
      Wilcoxon Test                   nz.wilcoxon.test
      Canonical Correlation           nz.canonical.corr
      One-Way ANOVA                   nzAnova, nz.anova.CRD.test, nz.anova.RBD.test

      Principal Component Analysis    nzPCA

      Tree-Shaped Bayesian Networks   nz.TBNet Apply, nz.TBNet Grow, nz.BigBNControl,
                                      nz.TBNet1g2p, nz.TBNet1g,nz.TBNet2g




nzA Package

 Data Transformations
      Discretization                      nz.efdisc, nz.emdisc, nz.ewdisc
      Standardization and Normalization   nz.std.norm
      Data Imputation                     nz.impute.data




 Model Diagnostics
      Misclassification Error             nz.cerror
      Confusion Matrix                    nz.acc, nz.CMATRIX STATS
      Mean Absolute Error                 nz.mae
      Mean Square Error                   nz.mse
      Relative Absolute Error             nz.rae
      Percentage Split                    nz.percentage.split
      Cross-Validation                    nz.cross.validation




nzA Package

 Classification
       Naive Bayes           nzNaiveBayes, nz.naivebayes, nz.predict.naivebayes
       Decision Trees        nzDecTree, nz.dectree, nz.grow.dectree, nz.print.dectree,
                             nz.prune.dectree, nz.predict.dectree
       Nearest Neighbors     nz.knn

 Regression
       Linear Regression     nzLm
       Regression Trees      nzRegTree, nz.regtree, nz.grow.regtree, nz.print.regtree,
                             nz.predict.regtree

 Clustering
       K-Means Clustering    nzKMeans, nz.kmeans, nz.predict.kmeans
       Divisive Clustering   nz.divcluster, nz.predict.divcluster

 Associative Rule Mining
       FP-Growth             nz.fpgrowth, nz.prepare.fpgrowth

nzMatrix Package

 Data Manipulation
      Coerce or point to a nz.matrix   as.nz.matrix, as.nz.matrix.matrix, nz.matrix
      Combine Matrices                 nzCBind, nzRBind
      Create Matrices From Tables      nzCreateMatrixFromTable, nzCreateTableFromMatrix
      Create Special Matrices          nzIdentityMatrix, nzNormalMatrix, nzOnesMatrix,
                                       nzRandomMatrix, nzVecToDiag
      Decomposition                    nzSVD, svd, nzEigen
      Delete Matrices                  nzDeleteMatrix, nzDeleteMatrixByName
      Dimensions                       dim, NCOL, ncol, NROW, nrow
      Mathematical Functions           abs, add, subtr, ceiling, div, exp, floor, ln, log10, mod,
                                       mult, nzPowerMatrix, pow, rounding, sqrt, trunc
      Matrix Engine Initialization     nzMatrixEngineInitialization
      Matrix Info                      is.nz.matrix, isSparse, nzExistMatrix, nzExistMatrixByName,
                                       nzGetValidMatrixName
      Operators                        *, +, -, <, ==, >, nzKronecker, nzPMax, nzPMin, nzSetValue,
                                       [, scale, t
      Printing Matrices                print.nz.matrix
      Solve                            nzInv, nzSolve, nzSolveLLS
      Sparse Matrices                  isSparse, nzSparse2matrix
      Summaries                        nzAll, nzAny, nzMax, nzMin, nzSsq, nzSum, nzTr

Demonstration: Using Revolution R with IBM Netezza

Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise

Presented by:
Derek M Norton, Senior Sales Engineer

Use Case – Credit Risk

• We have a dataset of individuals and their credit risk
     – Stored on the Netezza appliance
• The goal is to model whether someone is "approvable" for a loan.
• This use case will follow a modeling process (though condensed) from start to finish.
• I will discuss each of the parts, and at the end there will be a demo of the code.

Modeling Exercise

1.   Learning more about the data
2.   Prepare the data for modeling
3.   Fit models to the data
4.   Model Performance

1. Learning more about the data

• Connect to the IBM Netezza appliance
• Summarize the data
• Visualize the data

[Figure: two frequency plots (y-axis: Frequency, 0 to 300). Left panel, "Continuous Variable": x from 0 to 25. Right panel, "Discrete Variable": High School Diploma, Bachelors Degree, Masters Degree, Professional Degree, PhD.]

2. Prepare the data for modeling

• Split the data into 70/30 Training/Test sets
• Transform some variables
     – Discretize numeric variables for later use
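
As a local illustration of these two preparation steps, here is a minimal base R sketch. It does not use the in-database functions from the demo; the `credit` data frame, its column names, and the choice of four equal-width bins are all assumptions made up for the example.

```r
set.seed(42)  # fixed seed so the random split is reproducible

# Hypothetical stand-in for the credit data; column names are illustrative only.
n <- 1000
credit <- data.frame(
  income     = runif(n, min = 20000, max = 150000),
  approvable = sample(c("yes", "no"), n, replace = TRUE)
)

# 70/30 split: draw 70% of the row indices for training, keep the rest for test.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- credit[train_idx, ]
test  <- credit[-train_idx, ]

# Equal-width discretization into 4 bins, using the same fixed breaks for both sets.
breaks <- seq(20000, 150000, length.out = 5)
train$income_bin <- cut(train$income, breaks = breaks, include.lowest = TRUE)
test$income_bin  <- cut(test$income,  breaks = breaks, include.lowest = TRUE)

print(table(train$income_bin))
```

Fixing the break points before binning keeps the training and test sets on the same bin boundaries, so a model fitted on one can be scored on the other.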
3. Fit models to the data

• Build two different models to predict if an individual is "approvable"
     – Decision Tree
     – Naïve Bayes
4. Model Performance

• Examine confusion matrices to determine:
     – Training performance
     – Test performance
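
A confusion matrix cross-tabulates actual against predicted classes. The following base R sketch shows the idea on six made-up cases; the `actual` and `predicted` vectors are hypothetical and do not come from the demo's models.

```r
# Hypothetical actual and predicted labels for six cases (illustration only).
actual    <- factor(c("yes", "yes", "no", "no",  "yes", "no"), levels = c("no", "yes"))
predicted <- factor(c("yes", "no",  "no", "yes", "yes", "no"), levels = c("no", "yes"))

# Rows are actual classes, columns are predicted classes.
cm <- table(actual = actual, predicted = predicted)

# Accuracy is the share of cases on the diagonal (correct predictions).
accuracy <- sum(diag(cm)) / sum(cm)

print(cm)
print(accuracy)
```

Computing the same table once on the training set and once on the test set gives the two performance views listed above; a large gap between them would suggest overfitting.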
Demo
Summary

• Familiar environment for R Developers
     – World-class productivity tools
     – Enterprise class service, support and integration
• Execution of analytics in-database
     – Analytic computing distributed across Netezza nodes and run in a massively parallel manner
     – Each Netezza node gets a data slice and analytics are pushed down from the Host to the individual nodes
• Capabilities
     – R Code executed on Netezza nodes in row-by-row fashion or on groups of rows
     – Enables access to explicitly parallelized algorithms running on entire data set
     – Large-scale parallel matrix operations on database tables
• Performance
     – 10-100x Performance improvements

Contact Us

Bill Zanine
Business Solutions Executive, Analytics Solutions
IBM Netezza
wzanine@us.ibm.com

Derek Norton
Solutions Executive
Revolution Analytics
derek.norton@revolutionanalytics.com

www.revolutionanalytics.com +1 (650) 646 9545 Twitter: @RevolutionR

Weitere ähnliche Inhalte

Was ist angesagt?

CIGNEX Datamatics - Alfresco CLMS Solution
CIGNEX Datamatics - Alfresco CLMS SolutionCIGNEX Datamatics - Alfresco CLMS Solution
CIGNEX Datamatics - Alfresco CLMS SolutionAlfresco Software
 
Is Your IT Infrastructure Future-Proof?
Is Your IT Infrastructure Future-Proof? Is Your IT Infrastructure Future-Proof?
Is Your IT Infrastructure Future-Proof? Internap
 
Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...John Archer
 
Track 3, session 3,big data infrastructure by sunil brid
Track 3, session 3,big data infrastructure by sunil bridTrack 3, session 3,big data infrastructure by sunil brid
Track 3, session 3,big data infrastructure by sunil bridEMC Forum India
 
Leveraging Digital Content Services to Increase Customer Lifetime Value
Leveraging Digital Content Services to Increase Customer Lifetime ValueLeveraging Digital Content Services to Increase Customer Lifetime Value
Leveraging Digital Content Services to Increase Customer Lifetime Valuenewbaymarketing
 
Congress 2012: Enterprise Cloud Adoption – an Evolution from Infrastructure ...
Congress 2012:  Enterprise Cloud Adoption – an Evolution from Infrastructure ...Congress 2012:  Enterprise Cloud Adoption – an Evolution from Infrastructure ...
Congress 2012: Enterprise Cloud Adoption – an Evolution from Infrastructure ...eurocloud
 
Grid07 4 Tzannetakis
Grid07 4 TzannetakisGrid07 4 Tzannetakis
Grid07 4 Tzannetakisimec.archive
 
Big Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick BuddenbaumBig Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick BuddenbaumIntelAPAC
 
Net App At A Glance
Net App At A GlanceNet App At A Glance
Net App At A Glancenburgett
 
Open Source DWBI-A Primer
Open Source DWBI-A PrimerOpen Source DWBI-A Primer
Open Source DWBI-A Primerpartha69
 

Was ist angesagt? (12)

CIGNEX Datamatics - Alfresco CLMS Solution
CIGNEX Datamatics - Alfresco CLMS SolutionCIGNEX Datamatics - Alfresco CLMS Solution
CIGNEX Datamatics - Alfresco CLMS Solution
 
Is Your IT Infrastructure Future-Proof?
Is Your IT Infrastructure Future-Proof? Is Your IT Infrastructure Future-Proof?
Is Your IT Infrastructure Future-Proof?
 
Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...
 
Value Stories - 7th Issue
Value Stories - 7th Issue Value Stories - 7th Issue
Value Stories - 7th Issue
 
Track 3, session 3,big data infrastructure by sunil brid
Track 3, session 3,big data infrastructure by sunil bridTrack 3, session 3,big data infrastructure by sunil brid
Track 3, session 3,big data infrastructure by sunil brid
 
Leveraging Digital Content Services to Increase Customer Lifetime Value
Leveraging Digital Content Services to Increase Customer Lifetime ValueLeveraging Digital Content Services to Increase Customer Lifetime Value
Leveraging Digital Content Services to Increase Customer Lifetime Value
 
Congress 2012: Enterprise Cloud Adoption – an Evolution from Infrastructure ...
Congress 2012:  Enterprise Cloud Adoption – an Evolution from Infrastructure ...Congress 2012:  Enterprise Cloud Adoption – an Evolution from Infrastructure ...
Congress 2012: Enterprise Cloud Adoption – an Evolution from Infrastructure ...
 
Nuxeo at 10
Nuxeo at 10Nuxeo at 10
Nuxeo at 10
 
Grid07 4 Tzannetakis
Grid07 4 TzannetakisGrid07 4 Tzannetakis
Grid07 4 Tzannetakis
 
Big Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick BuddenbaumBig Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick Buddenbaum
 
Net App At A Glance
Net App At A GlanceNet App At A Glance
Net App At A Glance
 
Open Source DWBI-A Primer
Open Source DWBI-A PrimerOpen Source DWBI-A Primer
Open Source DWBI-A Primer
 

Andere mochten auch

Shortening the feedback loop
Shortening the feedback loopShortening the feedback loop
Shortening the feedback loopJosh Baer
 
Apa 2017 diego rodriguez 3
Apa 2017 diego rodriguez 3Apa 2017 diego rodriguez 3
Apa 2017 diego rodriguez 3Larry Hines
 
Mozilla And Social Media
Mozilla And Social MediaMozilla And Social Media
Mozilla And Social Mediajorendorff
 
Making Employee Referral programs work
Making Employee Referral programs workMaking Employee Referral programs work
Making Employee Referral programs workmanishwisestep
 
xe nâng Hyundai động cơ xăng LPG
xe nâng Hyundai động cơ xăng LPGxe nâng Hyundai động cơ xăng LPG
xe nâng Hyundai động cơ xăng LPGĐoàn Hưng Thai
 
Airbus A380
Airbus A380Airbus A380
Airbus A380rubal_9
 
Presentation On Fighter Planes
Presentation On Fighter PlanesPresentation On Fighter Planes
Presentation On Fighter PlanesKunal Dhingra
 
Maruti suzuki ppt
Maruti suzuki pptMaruti suzuki ppt
Maruti suzuki pptanurag77
 
Development Applications 2008 05 26
Development Applications 2008 05 26Development Applications 2008 05 26
Development Applications 2008 05 26jgabateman
 
0 to 2,500 Customers with No Cold Calls
0 to 2,500 Customers with No Cold Calls0 to 2,500 Customers with No Cold Calls
0 to 2,500 Customers with No Cold CallsHubSpot
 
MasterPlus - Sistema Binário
MasterPlus - Sistema BinárioMasterPlus - Sistema Binário
MasterPlus - Sistema BinárioMasterplusBrasil
 
Strip your charts
Strip your chartsStrip your charts
Strip your chartsuwseidl
 

Andere mochten auch (19)

Apani Ov V9
Apani Ov V9Apani Ov V9
Apani Ov V9
 
Indian Railway
Indian RailwayIndian Railway
Indian Railway
 
Shortening the feedback loop
Shortening the feedback loopShortening the feedback loop
Shortening the feedback loop
 
Apa 2017 diego rodriguez 3
Apa 2017 diego rodriguez 3Apa 2017 diego rodriguez 3
Apa 2017 diego rodriguez 3
 
Mozilla And Social Media
Mozilla And Social MediaMozilla And Social Media
Mozilla And Social Media
 
Making Employee Referral programs work
Making Employee Referral programs workMaking Employee Referral programs work
Making Employee Referral programs work
 
xe nâng Hyundai động cơ xăng LPG
xe nâng Hyundai động cơ xăng LPGxe nâng Hyundai động cơ xăng LPG
xe nâng Hyundai động cơ xăng LPG
 
Air Powered Car
Air Powered CarAir Powered Car
Air Powered Car
 
Scaling docker @ovh
Scaling docker @ovhScaling docker @ovh
Scaling docker @ovh
 
Airbus A380
Airbus A380Airbus A380
Airbus A380
 
How Gas Turbine Engine Works
How Gas Turbine Engine WorksHow Gas Turbine Engine Works
How Gas Turbine Engine Works
 
Presentation On Fighter Planes
Presentation On Fighter PlanesPresentation On Fighter Planes
Presentation On Fighter Planes
 
Maruti suzuki ppt
Maruti suzuki pptMaruti suzuki ppt
Maruti suzuki ppt
 
Development Applications 2008 05 26
Development Applications 2008 05 26Development Applications 2008 05 26
Development Applications 2008 05 26
 
Acoples rapidos
Acoples rapidosAcoples rapidos
Acoples rapidos
 
0 to 2,500 Customers with No Cold Calls
0 to 2,500 Customers with No Cold Calls0 to 2,500 Customers with No Cold Calls
0 to 2,500 Customers with No Cold Calls
 
MasterPlus - Sistema Binário
MasterPlus - Sistema BinárioMasterPlus - Sistema Binário
MasterPlus - Sistema Binário
 
Strip your charts
Strip your chartsStrip your charts
Strip your charts
 
Apresentacao
ApresentacaoApresentacao
Apresentacao
 

Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A Step-by-Step Approach for Acceleration and Innovation

  • 1. Revolution Confidential. Revolution R Enterprise for IBM Netezza. © 2012 IBM Corporation
  • 2. IBM Netezza with Revolution Analytics
      • High-performance, in-database analytics platform for Big Data
          – Massively parallel processing delivers 10-100x performance
          – Run analytics in-database and eliminate data movement
          – Scalable architecture fosters experimentation
      • Innovation with Advanced Analytics
          – Analytic modeling with most current statistical methods and 2,500+ open source packages
      • Enterprise ready advanced analytics software, services & support
          – Security, IDE, training, professional services
          – Web Services stack enables integration with front-end presentation layer
  • 3. Revolution Analytics (March 1, 2012)
  • 4. What is R?
      • Data analysis software
      • A programming language
          – Development platform designed by and for statisticians
          – Object-oriented: vector, matrix, model, …
          – Built-in libraries of algorithms
      • An environment
          – Huge library of algorithms for data access, data manipulation, analysis and graphics
      • An open-source software project
          – Free, open, and active
      • A community
          – Thousands of contributors, 2 million users
          – Resources and help in every domain
      • Download the white paper "R is Hot": bit.ly/r-is-hot
  • 5. Revolution Confidential Most advanced statistical analysis software available The professor who invented analytic software for Half the cost of the experts now wants to take it to the masses commercial alternatives 2M+ Users Power 2,500+ Applications Finance Statistics Life Sciences Predictive Manufacturing Analytics Productivity Retail Data Mining Telecom Enterprise Social Media Readiness Visualization Government 5
  • 6. Revolution R Enterprise has the Open-Source R Engine at the core
      • 2,500 community packages and growing exponentially
      • [Diagram labels: Multi-Threaded Math Libraries, Parallel Tools, Big Data Analysis, Web Services API, Technology Partners, Revolution Productivity Environment, Technical Support, Build Assurance, Open Source R Packages, R Engine Language Libraries]
  • 7. Working with Revolution R Enterprise for IBM Netezza
  • 8. Revolution R Enterprise for IBM Netezza inside the IBM Netezza Architecture [Diagram: IBM Netezza Analytics]
  • 9. In-Database Paradigms for using R
      • In-database Scoring
          – Family of apply functions which score analytic models by using data parallelism
          – Underlying truism is that there is a fact that can be applied across all data
          – Examples: customer lifetime value, credit score, affinity, good stock/bad stock
      • Big Data Analytics
          – Family of parallelized, in-database analytics that have R wrappers and work on the entire data set
          – Underlying truism exists across all data
          – Examples: clustering of all data to determine groupings; models that are applied across a whole data set (decision trees); data transformation (variable selection, correlation)
      • Grouped by Row (tapply)
          – Data and Task Parallelism
              • Data flow technique to apply analytics to naturally occurring groups of data using non-parallelized analytics
          – Underlying relationship in data is by a group
          – Examples: forecasting by store, stock symbol, etc.; building a model for each customer, product, etc.
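
The "grouped by row" paradigm (one model per naturally occurring group) can be sketched locally in base R. This is only an illustration of the idea, assuming the built-in `mtcars` data and an arbitrary grouping by cylinder count as a stand-in for "by store" or "by customer"; it does not use the in-database `nz` functions, whose signatures are not shown on this slide.

```r
# Local, base-R sketch of the "grouped by row" paradigm: one model per group.
# mtcars ships with R; grouping by cyl is an arbitrary illustrative choice.
groups <- split(mtcars, mtcars$cyl)

# Fit an independent linear model within each group.
models <- lapply(groups, function(df) lm(mpg ~ wt, data = df))

# Collect the per-group slope on weight into a named numeric vector.
slopes <- sapply(models, function(m) unname(coef(m)["wt"]))
print(slopes)
```

Each group's model is fitted without reference to the others, which is what makes this pattern a candidate for running the groups in parallel.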
  • 10. Access In-Database Language Support from R: SQL, Java, C, Python, Fortran, C++
  • 11. Open Source R Package Support (2500+ community packages)
      • Horizontal: Bayesian, Cluster, Distributions, Graphics, Graphical Models, Machine Learning, Multivariate, Natural Language Processing, Optimization, Robust Statistical Metrics, Spatial, Survival Analysis, Time Series
      • Vertical: Econometrics, Experimental Design, Computational Physics, Clinical Trials, Environmetrics, Finance, Genetics, Medical Imaging, Pharmacokinetics, Phylogenetics, Psychometrics, Social Sciences
  • 12. Using Revolution R Enterprise with IBM Netezza
      [Architecture diagram, top to bottom:]
      • Business Intelligence, Excel or Third-Party Application (HTTP)
      • RevoDeployR Server: Web Services Interface for R
      • Revolution R Enterprise - Workstation and Revolution R Enterprise - Server, each using RODBC & nzODBC
      • R Packages integrate and push analytics processing in-database
      • IBM Netezza Analytics Host
      • Five S-Blades, each running IBM Netezza Analytics
  • 13. Deploying Revolution R Enterprise to IBM Netezza
      • Remote terminal connection to Host
      • Create your R Script
      • Compile and Register your R Script as an AE (UDAP)
      • Execute SQL that will invoke the registered AE
      • Go back to the Revolution R client to retrieve results and continue additional analysis
      [Diagram: IBM Netezza Analytics Host above five S-Blades, each running IBM Netezza Analytics]
  • 14. Revolution R Enterprise Client Configuration
      • Revolution R Enterprise
          – Productivity Environment
      • R Package Dependencies
          – RODBC, caTools, tree, bitops, e1071, rgl, ca, MASS, XML
      • Netezza ODBC Drivers
      • 'nz' R Packages
          – nzA, nzR, nzMatrix
  • 15. IBM Netezza In-Database Analytics from Revolution R
      • nzR Package
          – Encapsulate database and expose "R"-like constructs
          – R data.frame = database table
          – Apply an R function to a row of data or grouped rows of data
      • nzA Package
          – Entry point to the nzAnalytics
          – Explicitly parallelized algorithms that run in database
      • nzMatrix Package
          – Encapsulation of matrices and operations in database
          – nz.matrix construct in R to access matrices in the database
          – R operations on nz.matrix translate to matrix stored procedure operations
  • 16. nzR Package
      • Basic Functions
          – Database Connection: nzConnect, nzConnectDSN
          – SQL Execution: nzQuery, nzScalarQuery, nzDeleteTable
          – Data Management: as.nz.data.frame, nz.data.frame
          – Apply an R function: nzApply, nzTApply, nzGroupedApply
          – R Package Management: nzInstallPackages, nzIsPackageInstalled
      • Sample Code
            #load packages
            library(nzr)
            #connect to a database via ODBC
            nzConnect("admin", "xyz", "127.0.0.1", "iclasstest")
            #load the iris table
            nzdf <- nz.data.frame("iris")
            #run a nzTApply against the nz dataframe
            fun <- function(x) max(x[,1])
            nzTApply(nzdf, nzdf[,5], fun)
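
For readers without access to a Netezza appliance, the logic of the sample code above can be tried on R's built-in `iris` data frame. This is a base-R analogue only: it assumes the database table has the same columns as the local `iris` data, and the shape of the value returned by `nzTApply` itself is not shown on the slide.

```r
# Base-R analogue of the slide's nzTApply call, run on the local iris data
# frame instead of a Netezza table (no database connection involved).
fun <- function(x) max(x[, 1])   # same function as in the sample code

# Split rows by the 5th column (Species) and apply fun to each group,
# giving the maximum of the 1st column (Sepal.Length) per species.
result <- sapply(split(iris, iris[, 5]), fun)
print(result)
```

The grouping column and the function are the same as in the slide; only the data source and the apply mechanism differ.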
  • 17. nzA Package
      • Data Manipulation
          – Moments: nz.moments
          – Quantiles: nz.quantile, nz.quartile
          – Outlier Detection: nz.outliers
          – Frequency Table: nz.bitable
          – Histogram: nz.hist
          – Pearson's Correlation: nz.corr
          – Spearman's Correlation: nz.spearman.corr, nz.spearman.corr.s
          – Covariance: nz.cov, nz.cov.matrix
          – Mutual Information: nz.mutualinfo
          – Chi-Square Test: nzChisq.test, nz.chisq.test
          – t-Test: t.ls.test, t.me.test, t.pmd.test, t.umd.test
          – Mann-Whitney-Wilcoxon Test: nz.mww.test
          – Wilcoxon Test: nz.wilcoxon.test
          – Canonical Correlation: nz.canonical.corr
          – One-Way ANOVA: nzAnova, nz.anova.CRD.test, nz.anova.RBD.test
          – Principal Component Analysis: nzPCA
          – Tree-Shaped Bayesian Networks: nz.TBNet Apply, nz.TBNet Grow, nz.BigBNControl, nz.TBNet1g2p, nz.TBNet1g, nz.TBNet2g
  • 18. nzA Package
      • Data Transformations
          – Discretization: nz.efdisc, nz.emdisc, nz.ewdisc
          – Standardization and Normalization: nz.std.norm
          – Data Imputation: nz.impute.data
      • Model Diagnostics
          – Misclassification Error: nz.cerror
          – Confusion Matrix: nz.acc, nz.CMATRIX STATS
          – Mean Absolute Error: nz.mae
          – Mean Square Error: nz.mse
          – Relative Absolute Error: nz.rae
          – Percentage Split: nz.percentage.split
          – Cross-Validation: nz.cross.validation
  • 19. nzA Package
      • Classification
          – Naive Bayes: nzNaiveBayes, nz.naivebayes, nz.predict.naivebayes
          – Decision Trees: nzDecTree, nz.dectree, nz.grow.dectree, nz.print.dectree, nz.prune.dectree, nz.predict.dectree
          – Nearest Neighbors: nz.knn
      • Clustering
          – K-Means Clustering: nzKMeans, nz.kmeans, nz.predict.kmeans
          – Divisive Clustering: nz.divcluster, nz.predict.divcluster
      • Associative Rule Mining
          – FP-Growth: nz.fpgrowth, nz.prepare.fpgrowth
      • Regression
          – Linear Regression: nzLm
          – Regression Trees: nzRegTree, nz.regtree, nz.grow.regtree, nz.print.regtree, nz.predict.regtree
  • 20. nzMatrix Package
      • Data Manipulation
          – Coerce or point to a nz.matrix: as.nz.matrix, as.nz.matrix.matrix, nz.matrix
          – Combine Matrices: nzCBind, nzRBind
          – Create Matrices From Tables: nzCreateMatrixFromTable, nzCreateTableFromMatrix
          – Create Special Matrices: nzIdentityMatrix, nzNormalMatrix, nzOnesMatrix, nzRandomMatrix, nzVecToDiag
          – Decomposition: nzSVD, svd, nzEigen
          – Delete Matrices: nzDeleteMatrix, nzDeleteMatrixByName
          – Dimensions: dim, NCOL, ncol, NROW, nrow
          – Mathematical Functions: abs, add, aubtr, ceiling, div, exp, floor, ln, log10, mod, mult, nzPowerMatrix, pow, rounding, sqrt, trunc
          – Matrix Engine Initialization: nzMatrixEngineInitialization
          – Matrix Info: is.nz.matrix, isSparse, nzExistMatrix, nzExistMatrixByName, nzGetValidMatrixName
          – Operators: *, +, -, <, ==, >, nzKronecker, nzPMax, nzPMin, nzSetValue, [, scale, t
          – Printing Matrices: print.nz.matrix
          – Solve: nzInv, nzSolve, nzSolveLLS
          – Sparse Matrices: isSparse, nzSparse2matrix
          – Summaries: nzAll, nzAny, nzMax, nzMin, nzSsq, nzSum, nzTr
  • 21. Demonstration: Using Revolution R with IBM Netezza
  • 22. Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise. Presented by: Derek M Norton, Senior Sales Engineer
  • 23. Use Case: Credit Risk
      • We have a dataset comprised of individuals and their credit risk, stored on the Netezza Appliance
      • The goal is to model if someone is "approvable" for a loan.
      • This use case will follow a modeling process (though condensed) from start to finish.
      • I will discuss each of the parts and at the end there will be a demo of the code
  • 24. Modeling Exercise
      1. Learning more about the data
      2. Prepare the data for modeling
      3. Fit models to the data
      4. Model Performance
  • 25. 1. Learning more about the data
      • Connect to the IBM Netezza appliance
      • Summarize the data
      • Visualize the data
      [Figure: two frequency plots. "Continuous Variable": histogram of frequency against x. "Discrete Variable": bar chart of frequency by education level (High School Diploma, Bachelors Degree, Masters Degree, Professional Degree, PhD).]
  • 26. 2. Prepare the data for modeling
      • Split the data into 70/30 Training/Test sets
      • Transform some variables
      • Discretize numeric variables for later use
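
The three preparation steps above can be sketched in base R. The credit risk data from the demo is not available here, so this sketch assumes the built-in `iris` data as a stand-in; the seed, the log transformation, and the choice of three equal-width bins are arbitrary illustrative choices, not the presenter's settings.

```r
set.seed(42)   # arbitrary seed, used only to make the split reproducible

# 1. Split the rows 70/30 into training and test sets.
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# 2. Transform a variable (log of a positive numeric column).
train$LogPetalLength <- log(train$Petal.Length)

# 3. Discretize a numeric variable into three equal-width bins.
train$SepalBin <- cut(train$Sepal.Length, breaks = 3,
                      labels = c("low", "mid", "high"))
print(table(train$SepalBin))
```

In practice the same transformation and bin boundaries would also be applied to the test set so that the two sets stay comparable.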
  • 27. 3. Fit models to the data
      • Build two different models to predict if an individual is "approvable"
          – Decision Tree
          – Naïve Bayes
  • 28. 4. Model Performance
      • Examine confusion matrices to determine:
          – Training performance
          – Test performance
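
A confusion matrix of the kind examined above can be built in base R with `table()`. The label vectors below are made up purely for illustration (they are not results from the demo), and the variable names are hypothetical.

```r
# Made-up actual and predicted labels, for illustration only.
lv <- c("no", "yes")
actual    <- factor(c("yes", "yes", "no", "no", "yes", "no", "no", "yes"), levels = lv)
predicted <- factor(c("yes", "no",  "no", "no", "yes", "yes", "no", "yes"), levels = lv)

# Rows are actual classes, columns are predicted classes.
cm <- table(actual = actual, predicted = predicted)
print(cm)

# Accuracy is the share of cases on the diagonal (correct predictions).
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

Computing the same table once on the training set and once on the test set gives the two performance views listed on the slide; a large gap between them suggests overfitting.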
  • 29. Demo
  • 30. Summary
      • Familiar environment for R Developers
          – World-class productivity tools
          – Enterprise class service, support and integration
      • Execution of analytics in-database
          – Analytic computing distributed across Netezza nodes and run in a massively parallel manner
          – Each Netezza node gets a data slice and analytics are pushed down from the Host to the individual nodes
      • Capabilities
          – R Code executed on Netezza nodes in row-by-row fashion or on groups of rows
          – Enables access to explicitly parallelized algorithms running on entire data set
          – Large-scale parallel matrix operations on database tables
      • Performance
          – 10-100x Performance improvements
  • 31. Contact Us
      • Bill Zanine, Business Solutions Executive, Analytics Solutions, IBM Netezza: wzanine@us.ibm.com
      • Derek Norton, Solutions Executive, Revolution Analytics: derek.norton@revolutionanalytics.com
      • www.revolutionanalytics.com, +1 (650) 646 9545, Twitter: @RevolutionR