Solution Spotlight Presents
Integrating R and Hadoop
Part of Revolution Analytics’ Big Analytics Strategy
Contact us at info@revolutionanalytics.com

Outline
- Introduction to Revolution Analytics
- Opportunity and Challenges of Big Analytics
- Revolution Analytics’ Support of Integration between R and Hadoop
- Contact Info
Open Source Analytics for the Enterprise
The professor who invented analytic software for the experts now wants to take it to the masses
2M+ Users
2,500+ Applications
Finance, Statistics, Life Sciences, Predictive Analytics, Manufacturing, Retail, Data Mining, Telecom, Social Media, Visualization, Government
Revolution has garnered tremendous attention from media and analysts
Big Analytics, Big Advantages
Big Analytics could be:
- Simple algorithms running on “Big Data”
- Compute-intensive algorithms running on either “Big Data” or small data sets
- Advanced analytic routines for data visualization or statistical analysis

Extracting Value with Big Analytics
Big Analytics’ Advantages
- Predict the Future
- Understand Risk and Uncertainty
- Embrace Complexity
- Identify the Unusual
- Think Big
Big Analytics Challenges
- Computations are data intensive (i.e., require large amounts of data)
- To be effective, must rely on data parallelism
  - Data is distributed across compute nodes
  - Same task is run in parallel on each of the data partitions
- Examples of distributed computing frameworks that support data parallelism
  - Traditional file-based analytics using on-premise clusters
  - Hadoop and MapReduce
  - In-database analytics using parallel hardware architectures
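The data-parallel pattern named on the Big Analytics Challenges slide (distribute the data, run the same task on each partition, combine the partial results) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: threads stand in for compute nodes, and the helper names `partition`, `partial_stats`, and `parallel_mean` are hypothetical, not part of any Revolution Analytics product.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_parts):
    """Split a list into n_parts roughly equal contiguous chunks."""
    size, extra = divmod(len(values), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_stats(chunk):
    """The task run on every partition: return (count, sum)."""
    return len(chunk), sum(chunk)


def parallel_mean(values, n_parts=4):
    """Compute a mean from per-partition partial results."""
    chunks = partition(values, n_parts)
    # Threads are a stand-in for compute nodes (illustrative assumption).
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = list(pool.map(partial_stats, chunks))
    total_n = sum(n for n, _ in results)
    total = sum(s for _, s in results)
    return total / total_n


data = list(range(1, 101))
print(parallel_mean(data))  # 50.5
```

Only the small (count, sum) pairs travel back to the caller, which is the sense in which this kind of design reduces data movement.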
Key Objectives for Big Analytics Deployments
Best performance is achieved when these Big Analytics challenges are overcome:
- Avoid sampling / aggregation
- Reduce data movement and replication
- Bring the analytics as close as possible to the data
- Optimize computation speed
Revolution Analytics’ support for R and Hadoop helps overcome these challenges
Revolution Analytics’ RevoConnectRs for Hadoop
- RevoHDFS provides connectivity from R to HDFS, and RevoHBase provides connectivity from R to HBASE
  - Allows an R programmer to manipulate Hadoop data stores directly in HDFS and HBASE
- RevoHStream allows MapReduce jobs to be developed in R and executed as Hadoop Streaming jobs
  - Gives R programmers the ability to write MapReduce jobs in R using Hadoop Streaming
R/Hadoop – Revolution Analytics
Hadoop Streaming package for executing MapReduce jobs from R.
[Architecture diagram. Labels: HDFS, HBASE, R Client, Job Tracker, Task Tracker, Task Node, R Map, R Reduce]
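For readers unfamiliar with Hadoop Streaming: map and reduce steps are ordinary programs that read lines from standard input and write tab-separated key/value lines to standard output, and the reducer receives its input sorted by key. The sketch below illustrates that contract in Python with a word count. It is not the RevoHStream API, which the slides do not show; the function names and the local `run_locally` driver are hypothetical stand-ins for the framework.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Reduce step: records arrive sorted by key; sum the counts per key."""
    pairs = (rec.split("\t", 1) for rec in sorted_records)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(v) for _, v in group)}"


def run_locally(lines):
    """Hypothetical stand-in for the framework: map, sort by key, reduce."""
    return list(reducer(sorted(mapper(lines))))


print(run_locally(["big data big analytics", "Big R"]))
# ['analytics\t1', 'big\t3', 'data\t1', 'r\t1']
```

In a real streaming job the mapper and reducer would be separate scripts wired to stdin and stdout, and the sort between them would be done by Hadoop rather than by `sorted`.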
RevoHDFS
R package for working with HDFS
- Connect and Browse HDFS
- Read/Write/Delete/Copy/Rename files
Examples:
- Read an HDFS text file into a data frame
- Serialize a data frame to HDFS
- Stream lines from HDFS text file that can be used with biglm or bigglm
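The last RevoHDFS example, streaming lines into a model fit, depends on the model being updatable one record (or chunk) at a time so the full file never has to sit in memory. A minimal Python sketch of that idea follows, under stated assumptions: an in-memory `io.StringIO` stands in for the HDFS text stream, the input is hypothetical `x,y` lines, and plain running sums are used for a one-predictor least-squares fit. This is a simplified substitute technique, not the update scheme biglm or bigglm actually use.

```python
import io


def fit_line_streaming(lines):
    """Fit y = a + b*x from 'x,y' text lines in a single pass."""
    n = sx = sy = sxx = sxy = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        x_str, y_str = line.split(",")
        x, y = float(x_str), float(y_str)
        # Only these five running sums are kept, never the raw rows.
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Stand-in for a text stream read from HDFS (illustrative assumption).
text_file = io.StringIO("1,3\n2,5\n3,7\n4,9\n")
print(fit_line_streaming(text_file))  # (1.0, 2.0)
```

Because the state is a handful of sums, partial results from several file partitions could also be added together before solving, which ties back to the data-parallel approach described earlier in the deck.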
RevoHBase
R package for working with HBASE
- Connect and Browse HBASE
- Get Rows/Columns of an HBASE table
- Write data to HBASE table
- Create/Delete HBASE table
Examples:
- Create a data frame in R from a collection of Rows/Columns from HBASE
- Update an HBASE table with values from a data frame
RevoHStream
RevoHStream – R package capable of performing the following types of analysis using Hadoop Streaming:
- Simulations: Monte Carlo and other stochastic analysis
- R ‘apply’ family of operations (tapply, lapply)
- Binning, quantiles, summaries and crosstabs for input to displays (ggplot, lattice)
- Data transformations
- Data Mining
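Monte Carlo simulation, the first analysis type listed for RevoHStream, fits a map/reduce layout naturally: each map task runs an independent batch of random draws, and a reduce step pools the counts. The Python sketch below shows the shape of such a job under assumptions of my own choosing: the quantity estimated (pi), the per-task seeds, and the names `simulate` and `combine` are all illustrative and do not come from the slides.

```python
import random


def simulate(seed, n_draws):
    """One map task: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # a separate seed per task keeps runs reproducible
    hits = 0
    for _ in range(n_draws):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, n_draws


def combine(results):
    """Reduce step: pool the counts and convert them to an estimate of pi."""
    hits = sum(h for h, _ in results)
    draws = sum(n for _, n in results)
    return 4.0 * hits / draws


# Eight hypothetical map tasks, run sequentially here for simplicity.
results = [simulate(seed, 20_000) for seed in range(8)]
print(combine(results))
```

Each task returns only two integers, so the reduce step stays cheap no matter how many draws each task performs.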

Ransomware_Q4_2023. The report. [EN].pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Navi Mumbai Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls đŸ„° 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Revolution Analytics

  • 2. Integrating R and Hadoop: Part of Revolution Analytics’ Big Analytics Strategy. Contact us at info@revolutionanalytics.com
  • 3. Outline: Introduction to Revolution Analytics; Opportunity and Challenges of Big Analytics; Revolution Analytics’ Support of Integration between R and Hadoop; Contact Info
  • 4.
  • 6. 2,500+ ApplicationsFinance Statistics Life Sciences Predictive Analytics Manufacturing Retail Data Mining Telecom Social Media Visualization Government
  • 7. Revolution has garnered tremendous attention from media and analysts
  • 8. Big Analytics, Big Advantages: Big Analytics could be simple algorithms running on “Big Data”; compute-intensive algorithms running on either “Big Data” or small data sets; advanced analytic routines for data visualization or statistical analysis.
  • 9. Extracting Value with Big Analytics: Big Analytics’ advantages: Predict the Future; Understand Risk and Uncertainty; Embrace Complexity; Identify the Unusual; Think Big
  • 10. Big Analytics Challenges: Computations are data intensive (i.e., they require large amounts of data). To be effective, they must rely on data parallelism: data is distributed across compute nodes, and the same task is run in parallel on each of the data partitions. Examples of distributed computing frameworks that support data parallelism: traditional file-based analytics using on-premise clusters; Hadoop and MapReduce; in-database analytics using parallel hardware architectures.
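
The data-parallel pattern described on this slide can be sketched in plain R on a single machine. This is a minimal illustration, not Revolution Analytics code: the partitions are in-memory vectors rather than blocks on cluster nodes, `lapply` stands in for the parallel workers, and all variable names are assumptions made for the example.

```r
# Hypothetical illustration: partitions live in memory here, not on cluster nodes.
set.seed(42)
x <- rnorm(1000)

# "Distribute" the data by splitting it into 4 partitions.
partitions <- split(x, rep(1:4, length.out = length(x)))

# Run the same task on every partition (lapply stands in for parallel workers).
partial <- lapply(partitions, function(p) c(sum = sum(p), n = length(p)))

# Combine the partial results into the global mean.
totals <- Reduce(`+`, partial)
parallel_mean <- unname(totals["sum"] / totals["n"])
```

Each partition returns only a small summary (a sum and a count), so the raw data never has to move, which is the point of bringing the analytics close to the data.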
  • 11. Key Objectives for Big Analytics Deployments: Best performance is achieved when these Big Analytics challenges are overcome: avoid sampling / aggregation; reduce data movement and replication; bring the analytics as close as possible to the data; and optimize computation speed. Revolution Analytics’ support for R and Hadoop helps overcome these challenges.
  • 12. Revolution Analytics’ RevoConnectRs for Hadoop: RevoHDFS and RevoHBase provide connectivity from R to HDFS and HBASE, allowing an R programmer to manipulate Hadoop data stores directly. RevoHStream allows MapReduce jobs to be developed in R and executed as Hadoop Streaming jobs, giving R programmers the ability to write MapReduce jobs in R using Hadoop Streaming.
  • 13.
  • 14. Hadoop Streaming package for executing MapReduce jobs from R. [Diagram labels: R, Map, Reduce, Task Tracker, Task Node, R Client, Job Tracker]
  • 15. RevoHDFS: R package for working with HDFS. Connect and browse HDFS; read/write/delete/copy/rename files. Examples: read an HDFS text file into a data frame; serialize a data frame to HDFS; stream lines from an HDFS text file that can be used with biglm or bigglm.
  • 16. RevoHBase: R package for working with HBASE. Connect and browse HBASE; get rows/columns of an HBASE table; write data to an HBASE table; create/delete an HBASE table. Examples: create a data frame in R from a collection of rows/columns from HBASE; update an HBASE table with values from a data frame.
  • 17. RevoHStream: R package capable of performing the following types of analysis using Hadoop Streaming: simulations (Monte Carlo and other stochastic analysis); R ‘apply’ family of operations (tapply, lapply); binning, quantiles, summaries and crosstabs for input to displays (ggplot, lattice); data transformations; data mining.
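
To make the link between the ‘apply’ family and MapReduce concrete, here is a small sketch of a `tapply`-style grouped sum written as explicit map, shuffle, and reduce steps. It runs in ordinary R with no Hadoop involved; the data and the names `map_fn` and `reduce_fn` are made up for illustration and are not part of RevoHStream.

```r
# Toy input: (group, value) records; names and data are illustrative only.
records <- data.frame(group = c("a", "b", "a", "c", "b", "a"),
                      value = c(1, 2, 3, 4, 5, 6),
                      stringsAsFactors = FALSE)

# Map: emit one key-value pair per record.
map_fn <- function(rec) list(key = rec$group, val = rec$value)
pairs <- lapply(seq_len(nrow(records)), function(i) map_fn(records[i, ]))

# Shuffle: group values by key (Hadoop does this between map and reduce).
keys <- vapply(pairs, function(p) p$key, character(1))
vals <- vapply(pairs, function(p) p$val, numeric(1))
grouped <- split(vals, keys)

# Reduce: summarise the values collected for each key.
reduce_fn <- function(k, vv) sum(vv)
result <- mapply(reduce_fn, names(grouped), grouped)
```

The result matches `tapply(records$value, records$group, sum)`; the difference under Hadoop Streaming would be that the map and reduce functions run on separate task nodes.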
  • 18. Example MapReduce Algorithm: Logistic Regression

      ## Create a test set as follows:
      ## rhwrite(lapply(1:100, function(i) {
      ##   eps = rnorm(1, sd = 10)
      ##   keyval(i, list(x = c(i, i + eps), y = 2 * (eps > 0) - 1))
      ## }), "/tmp/logreg")
      ## Run as:
      ## rhLogisticRegression("/tmp/logreg", 10, 2, 0.05)
      ## The maximum likelihood solution diverges (-inf, inf) for a
      ## separable dataset such as the above.
      rhLogisticRegression = function(input, iterations, dims, alpha) {
        plane = rep(0, dims)
        g = function(z) 1 / (1 + exp(-z))
        for (i in 1:iterations) {
          gradient = rhread(revoMapReduce(input,
            ## map: gradient contribution of one record, all under key 1
            map = function(k, v) keyval(1, v$y * v$x * g(-v$y * (plane %*% v$x))),
            ## reduce: sum the contributions column-wise
            reduce = function(k, vv) keyval(k, apply(do.call(rbind, vv), 2, sum)),
            combine = T))
          plane = plane + alpha * gradient[[1]]$val
        }
        plane
      }
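
For readers without a Hadoop cluster, the same gradient iteration can be tried locally. The sketch below is an assumption-laden stand-in, not the RevoHStream implementation: `lapply` plays the role of the map phase, `colSums` plays the reduce phase, the data is an in-memory list instead of an HDFS file, and the dot product is written as `sum(plane * v$x)` so that it yields a plain scalar. The function name `local_logistic_regression` is hypothetical.

```r
set.seed(1)
# Same shape as the slide's test set: x = c(i, i + eps), y = +1 or -1.
data <- lapply(1:100, function(i) {
  eps <- rnorm(1, sd = 10)
  list(x = c(i, i + eps), y = 2 * (eps > 0) - 1)
})

g <- function(z) 1 / (1 + exp(-z))

local_logistic_regression <- function(data, iterations, dims, alpha) {
  plane <- rep(0, dims)
  for (i in seq_len(iterations)) {
    # Map: one gradient contribution per record.
    contributions <- lapply(data, function(v)
      v$y * v$x * g(-v$y * sum(plane * v$x)))
    # Reduce: sum the contributions column-wise.
    gradient <- colSums(do.call(rbind, contributions))
    plane <- plane + alpha * gradient
  }
  plane
}

plane <- local_logistic_regression(data, 10, 2, 0.05)
```

Because every record contributes an independent term to the gradient, the map step needs no communication between records, which is what makes the algorithm a natural fit for MapReduce.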
  • 19. Get more information about Revolution Analytics’ Big Analytics Solutions, including R connectors for Hadoop: 1 855-GET-REVO, http://www.revolutionanalytics.com/big-analytics