2. Integrating R and Hadoop: Part of Revolution Analytics' Big Analytics Strategy. Contact us at info@revolutionanalytics.com
3. Outline
- Introduction to Revolution Analytics
- Opportunity and Challenges of Big Analytics
- Revolution Analytics' Support of Integration between R and Hadoop
- Contact Info
6. 2,500+ Applications: Finance, Statistics, Life Sciences, Predictive Analytics, Manufacturing, Retail, Data Mining, Telecom, Social Media, Visualization, Government
8. Big Analytics, Big Advantages
Big Analytics could be:
- Simple algorithms running on "Big Data"
- Compute-intensive algorithms running on either "Big Data" or small data sets
- Advanced analytic routines for data visualization or statistical analysis
9. Extracting Value with Big Analytics
Big Analytics' advantages:
- Predict the Future
- Understand Risk and Uncertainty
- Embrace Complexity
- Identify the Unusual
- Think Big
10. Big Analytics Challenges
- Computations are data intensive (i.e. require large amounts of data)
- To be effective, they must rely on data parallelism:
  - Data is distributed across compute nodes
  - The same task is run in parallel on each of the data partitions
- Examples of distributed computing frameworks that support data parallelism:
  - Traditional file-based analytics using on-premise clusters
  - Hadoop and MapReduce
  - In-database analytics using parallel hardware architectures
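The data-parallel pattern the slide describes (partition the data, run the same task on every partition, combine the partial results) can be sketched in a few lines of plain R. This is purely illustrative and not any Revolution Analytics API: lapply stands in for the cluster's compute nodes.

```r
# Data parallelism in miniature: the data is split into partitions, the same
# task runs on every partition, and the partial results are combined.
# A framework like MapReduce runs this same loop with the partitions living
# on different nodes of the cluster.
x <- as.numeric(1:1000000)
partitions <- split(x, cut(seq_along(x), 4, labels = FALSE))  # "distribute" into 4 parts
partial    <- lapply(partitions, sum)                         # same task on each partition
total      <- Reduce(`+`, partial)                            # combine the partial results
total == sum(x)  # TRUE
```

The key property is that each partition's task is independent, so the work scales out simply by adding nodes.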
11. Key Objectives for Big Analytics Deployments
Best performance is achieved when these Big Analytics challenges are overcome:
- Avoid sampling / aggregation
- Reduce data movement and replication
- Bring the analytics as close as possible to the data
- Optimize computation speed
Revolution Analytics' support for R and Hadoop helps overcome these challenges.
12. Revolution Analytics' RevoConnectRs for Hadoop
- RevoHDFS provides connectivity from R to HDFS, and RevoHBase provides connectivity from R to HBase
  - Allows an R programmer to manipulate Hadoop data stores directly from HDFS and HBase
- RevoHStream allows MapReduce jobs to be developed in R and executed as Hadoop Streaming jobs
  - Gives R programmers the ability to write MapReduce jobs in R using Hadoop Streaming
14. Hadoop Streaming: package for executing MapReduce jobs from R.
[Architecture diagram: the R Client submits a job to the Job Tracker; Task Trackers on each Task Node run the R Map and Reduce tasks.]
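Hadoop Streaming itself is language-agnostic: it runs any executable as the map or reduce task, feeding input records on stdin and reading tab-separated key/value lines from stdout. The sketch below uses hypothetical helper names (not the RevoHStream API) and expresses a word-count map and reduce as R functions over character vectors, so the data flow can be followed without a cluster.

```r
# In a real Streaming job these two functions would be wrapped in R scripts
# reading stdin and writing stdout; Hadoop sorts the map output by key
# before it reaches the reducer.

map_words <- function(lines) {
  # emit one "word<TAB>1" record for every word in the input lines
  words <- unlist(strsplit(tolower(lines), "[^a-z]+"))
  words <- words[nzchar(words)]
  paste(words, 1, sep = "\t")
}

reduce_counts <- function(kv_lines) {
  # map output arrives grouped/sorted by key; total the counts per key
  parts <- strsplit(kv_lines, "\t")
  keys  <- vapply(parts, `[`, "", 1)
  vals  <- as.integer(vapply(parts, `[`, "", 2))
  counts <- tapply(vals, keys, sum)
  paste(names(counts), counts, sep = "\t")
}

out <- reduce_counts(map_words(c("big data big analytics")))
# "analytics\t1" "big\t2" "data\t1"
```

Because both tasks only read standard input and write standard output, the same R code runs unchanged whether launched locally or by a Task Tracker.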
15. RevoHDFS: R package for working with HDFS
- Connect to and browse HDFS
- Read/Write/Delete/Copy/Rename files
Examples:
- Read an HDFS text file into a data frame
- Serialize a data frame to HDFS
- Stream lines from an HDFS text file that can be used with biglm or bigglm
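The slides do not show RevoHDFS's function signatures, so as a stand-in the sketch below streams a local text file in fixed-size chunks. This chunked access pattern is what lets incremental fitters such as biglm/bigglm work on files larger than memory; in practice the HDFS connection would replace the local one.

```r
# Stream a delimited file chunk by chunk instead of loading it whole.
# The temp file here is a stand-in for an HDFS text file.
tmp <- tempfile()
writeLines(paste(1:10, (1:10)^2, sep = ","), tmp)

con <- file(tmp, open = "r")
total_rows <- 0
repeat {
  lines <- readLines(con, n = 4)          # one chunk of lines
  if (length(lines) == 0) break
  chunk <- read.csv(text = lines, header = FALSE,
                    col.names = c("x", "y"))
  # here each chunk could be fed to update(fit, chunk) for a biglm model
  total_rows <- total_rows + nrow(chunk)
}
close(con)
total_rows  # 10
```

Only one chunk is ever resident in memory, so the file size is bounded by disk (or HDFS), not RAM.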
16. RevoHBase: R package for working with HBase
- Connect to and browse HBase
- Get rows/columns of an HBase table
- Write data to an HBase table
- Create/Delete an HBase table
Examples:
- Create a data frame in R from a collection of rows/columns from HBase
- Update an HBase table with values from a data frame
17. RevoHStream
RevoHStream is an R package capable of performing the following types of analysis using Hadoop Streaming:
- Simulations (Monte Carlo and other stochastic analysis)
- R "apply" family of operations (tapply, lapply, ...)
- Binning, quantiles, summaries and crosstabs for input to displays (ggplot, lattice)
- Data transformations
- Data mining
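Simulations are a natural fit for this model because each map task can run an independent batch of replications, with the reduce step merely combining totals. A local sketch of the pattern (no Hadoop involved; sapply plays the role of the parallel map tasks, and the function names are introduced here for illustration), estimating pi by Monte Carlo:

```r
# Each "map task" runs an independent, seed-keyed batch of simulations;
# the "reduce" step just totals the hits across tasks.
run_task <- function(n, seed) {
  set.seed(seed)                  # distinct seed per task keeps batches independent
  x <- runif(n); y <- runif(n)
  sum(x * x + y * y <= 1)         # darts landing inside the quarter circle
}

n_tasks <- 8; n_per_task <- 50000
hits   <- sapply(seq_len(n_tasks), function(s) run_task(n_per_task, s))
pi_hat <- 4 * sum(hits) / (n_tasks * n_per_task)
```

Since the batches never communicate, this workload scales almost perfectly as tasks are spread over more nodes.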
18. Example MapReduce Algorithm: Logistic Regression

## create the test set as follows:
## rhwrite(lapply(1:100, function(i) {
##   eps = rnorm(1, sd = 10)
##   keyval(i, list(x = c(i, i + eps), y = 2 * (eps > 0) - 1))
## }), "/tmp/logreg")
## run as: rhLogisticRegression("/tmp/logreg", 10, 2, 0.05)
## note: the maximum likelihood solution diverges to (-Inf, Inf) for a
## separable dataset such as the above
rhLogisticRegression = function(input, iterations, dims, alpha) {
  plane = rep(0, dims)
  g = function(z) 1 / (1 + exp(-z))
  for (i in 1:iterations) {
    gradient = rhread(revoMapReduce(input,
      map = function(k, v)
        keyval(1, v$y * v$x * g(-v$y * (plane %*% v$x))),
      reduce = function(k, vv)
        keyval(k, apply(do.call(rbind, vv), 2, sum)),
      combine = TRUE))
    plane = plane + alpha * gradient[[1]]$val
  }
  plane
}
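For readers checking the math without a cluster, the same gradient-ascent update can be written as a hedged local equivalent: each record contributes y_i * x_i * g(-y_i * (plane . x_i)), and the reduce step is just a sum over records. The name logreg_local is introduced here for illustration and is not part of the slide's API.

```r
# Local equivalent of the MapReduce gradient step above (no Hadoop).
logreg_local <- function(X, y, iterations, alpha) {
  plane <- rep(0, ncol(X))
  g <- function(z) 1 / (1 + exp(-z))
  for (i in seq_len(iterations)) {
    margins  <- as.vector(X %*% plane)
    # per-record contribution y_i * x_i * g(-y_i * margin_i), summed over
    # records; this sum is what the map tasks emit and the reducer adds up
    gradient <- colSums(X * (y * g(-y * margins)))
    plane    <- plane + alpha * gradient
  }
  plane
}

X <- rbind(c(0, 1), c(1, 0))
y <- c(1, -1)                   # label is the sign of x2 - x1
w <- logreg_local(X, y, iterations = 100, alpha = 0.5)
sign(as.vector(X %*% w)) == y   # both TRUE
```

As the slide's comment warns, on separable data like this the weights grow without bound; only the direction of the separating plane stabilizes.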
19. Get more information about Revolution Analytics' Big Analytics Solutions, including R connectors for Hadoop: http://www.revolutionanalytics.com/big-analytics or 1-855-GET-REVO