We’ll get started soon… 
Q&A box is available for your questions 
Webinar will be recorded for future viewing 
Thank you for joining! 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved
Combine SAS High-Performance Capabilities with Hadoop YARN
We do Hadoop.
Your speakers… 
Arun Murthy, Founder and Architect 
Hortonworks 
@acmurthy 
Paul Kent, Vice President Big Data 
SAS 
@hornpolish 
Agenda 
• Introduction to YARN 
• SAS Workloads on the Cluster 
• SAS Workloads: Resource Settings 
• SAS and YARN 
• YARN Futures 
• Next Steps 
The 1st Generation of Hadoop: Batch
HADOOP 1.0
Built for Web-Scale Batch Apps
• All other usage patterns must leverage that same infrastructure
• Forces the creation of silos for managing mixed workloads
[Figure: single-app silos labeled BATCH, INTERACTIVE, and ONLINE, on HDFS]
Hadoop MapReduce Classic 
JobTracker 
§ Manages cluster resources and job scheduling 
TaskTracker 
§ Per-node agent 
§ Manages tasks
MapReduce Classic: Limitations 
Scalability 
§ Maximum Cluster size – 4,000 nodes 
§ Maximum concurrent tasks – 40,000 
§ Coarse synchronization in JobTracker 
Availability 
§ Failure kills all queued and running jobs 
Hard partition of resources into map and reduce slots 
§ Low resource utilization 
Lacks support for alternate paradigms and services 
§ Iterative applications implemented using MapReduce are 10x slower 
Our Vision: Hadoop as Next-Gen Platform
Hadoop 1
• Silos & largely batch
• Single processing engine
Hadoop 2
• Multiple engines, single data set
• Batch, interactive & real-time
[Figure: two stacks compared. Hadoop 1: Script (Pig), SQL (Hive), and Others (Storm, Solr, etc.) on MapReduce (Cluster Resource Management & Data Processing) over HDFS (Hadoop Distributed File System). Hadoop 2: Script (Pig), SQL (Hive), Java (Cascading), Real-time (HBase), Others (Accumulo, Storm, Solr, Spark), and ISV Engines, with Tez under several of them, on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System).]
YARN: Taking Hadoop Beyond Batch
Applications Run Natively IN Hadoop
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with Predictable Performance and Quality of Service
[Figure: application types running on YARN (Cluster Resource Management) over HDFS2 (Redundant, Reliable Storage): BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), OTHER (Search) (Weave…)]
YARN
Hortonworks Data Platform
[Figure: Hortonworks Data Platform stack. Engines on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System): Batch (MR), Script (Pig), SQL (Hive), Java (Cascading), NoSQL (HBase, Accumulo), Stream (Storm), In-Memory (Spark), PaaS (Kubernetes), LASR, HPA, and Others (Engines), with Tez and Slider appearing as intermediate layers under several engines.]
5 Key Benefits of YARN 
1. Scale 
2. New Programming Models & Services
3. Improved cluster utilization 
4. Agility 
5. Beyond Java 
Concepts 
Application 
§ An application is a temporal job or a service submitted to YARN
§ Examples
– MapReduce job (job)
– HBase cluster (service)
Container
§ Basic unit of allocation
§ Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
– container_0 = 2 GB, 1 CPU
– container_1 = 1 GB, 6 CPU
§ Replaces the fixed map/reduce slots 
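To make the difference from fixed slots concrete, the following is a minimal sketch of how many containers of each example shape could fit on one node when both memory and CPU are counted. The helper name `containers_per_node` and the node size (48 GB, 12 CPUs) are assumptions for illustration only, and the arithmetic is a simplified model, not how the YARN scheduler is implemented.

```shell
#!/bin/sh
# Hypothetical helper (not part of YARN): how many containers of a given
# shape fit on one node when memory and CPU are both taken into account.
# Arguments: node_mem_gb node_cpus container_mem_gb container_cpus
containers_per_node() {
    by_mem=$(( $1 / $3 ))   # limit imposed by memory
    by_cpu=$(( $2 / $4 ))   # limit imposed by CPU
    # The scarcer resource decides how many containers fit.
    if [ "$by_mem" -lt "$by_cpu" ]; then
        echo "$by_mem"
    else
        echo "$by_cpu"
    fi
}

# Assumed node size (not from the slides): 48 GB of memory, 12 CPUs.
echo "container_0 (2 GB, 1 CPU): $(containers_per_node 48 12 2 1) per node"
echo "container_1 (1 GB, 6 CPU): $(containers_per_node 48 12 1 6) per node"
```

Under these assumed numbers the memory-heavy shape is limited by CPU count and the CPU-heavy shape even more so, which is the kind of mix a fixed map/reduce slot layout cannot express.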
Design Centre 
Split up the two major functions of JobTracker 
§ Cluster resource management 
§ Application life-cycle management 
MapReduce becomes user-land library 
YARN Architecture - Walkthrough
[Figure: Client2, a ResourceManager with its Scheduler, and a grid of NodeManagers hosting AM 1 with containers 1.1, 1.2, and 1.3 and AM2 with containers 2.1, 2.2, 2.3, and 2.4]
Multi-Tenancy with YARN 
Economics as queue-capacity 
§ Hierarchical Queues
SLAs 
§ Preemption 
Resource Isolation 
§ Linux: cgroups 
§ MS Windows: Job Control 
§ Roadmap: Virtualization (Xen, KVM) 
Administration 
§ Queue ACLs 
§ Run-time re-configuration for queues 
§ Charge-back 
[Figure: Capacity Scheduler hierarchical queues in the ResourceManager Scheduler. Queue labels and capacities under root, in order of appearance: Adhoc 10%, DW 70%, Mrkting 20%; Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%.]
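In the Capacity Scheduler, each queue's capacity is a percentage of its parent, so a leaf queue's guaranteed share of the whole cluster is the product of the percentages along its path from root. The sketch below works that out. The helper name `absolute_capacity` is hypothetical, and the path root → DW (70%) → Prod (80%) → P0 (70%) is an assumed nesting, since the parent-child relationships in the slide's diagram cannot be recovered from the text.

```shell
#!/bin/sh
# Hypothetical helper: guaranteed share of the cluster for a leaf queue.
# Arguments: the capacity percentage of each queue on the path from
# root down to the leaf, in order.
absolute_capacity() {
    share=100
    for pct in "$@"; do
        # Each level takes its percentage of whatever the parent holds.
        share=$(awk -v s="$share" -v p="$pct" 'BEGIN { printf "%.2f", s * p / 100 }')
    done
    echo "$share"
}

# Assumed path (nesting not confirmed by the slide): DW 70% -> Prod 80% -> P0 70%
echo "P0 guaranteed share: $(absolute_capacity 70 80 70)% of the cluster"
```

With this assumed nesting, P0 is guaranteed a little under two fifths of the cluster, even though its own setting reads 70%.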
YARN Applications 
Data processing applications and services 
§ Services - Slider 
§ Real-time event processing – Storm, S4, other commercial platforms 
§ Tez – Generic framework to run a complex DAG 
§ MPI: OpenMPI, MPICH2 
§ Master-Worker 
§ Machine Learning: Spark 
§ Graph processing: Giraph 
§ Enabled by allowing the use of paradigm-specific application masters
Run all on the same Hadoop cluster! 
SHARE! 
Customers are: 
wrapping up POCs 
building Bigger Clusters 
assembling their Data { Lake, Reservoir } 
want their software to SHARE the cluster 
Copyright © 2014, SAS Institute Inc. All rights reserved.
SAS Workloads on the Cluster 
SAS Workloads on the Cluster - Video 
SAS Workloads on the Cluster 
Some Requests are for a significant slice of the cluster 
Reservation will be ALL DAY, ALL WEEK, ALL MONTH? 
Memory typically fixed (15% of cluster) 
CPU floor, would like the spare capacity when available 
Some Requests are more short term 
Memory can be estimated 
Duration can be capped 
CPU floor, would like spare capacity 
SAS Workloads – Resource Settings 
How much should you reserve? 
not a perfect science yet 
Long Running? 
LASR server by percent of total memory 
More like a batch request? 
HPA procedure by anecdotal experience 
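As a rough illustration of sizing "by percent of total memory", the sketch below converts a percentage of a node's memory into megabytes. The helper name `percent_of_mem_mb` is hypothetical, the 64 GB node is an assumed size, and the 15% figure is the one mentioned earlier for long-running reservations. Whether the result is the right unit for a setting such as TKMPI_MEMSIZE is also an assumption; the slides do not state its unit.

```shell
#!/bin/sh
# Hypothetical helper: convert a share of memory into megabytes.
# Arguments: total_kb (memory in kB), percent (integer share to reserve)
percent_of_mem_mb() {
    echo $(( $1 * $2 / 100 / 1024 ))
}

# Worked example with an assumed 64 GB node (67108864 kB) and a 15% share.
echo "15% of 64 GB: $(percent_of_mem_mb 67108864 15) MB"

# On Linux, the node's real total can be read from /proc/meminfo.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "15% of this node: $(percent_of_mem_mb "$total_kb" 15) MB"
fi
```

Integer division rounds down, which errs on the side of reserving slightly less than the target share.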
SAS Workloads – Resource Settings 
if [ "$USER" = "lasradm" ]; then
    # Custom settings for running under the lasradm account.
    export TKMPI_ULIMIT="-v 50000000"
    export TKMPI_MEMSIZE=50000
    export TKMPI_CGROUP="cgexec -g cpu:75"
fi
# if [ "$TKMPI_APPNAME" = "lasr" ]; then
#     # Custom settings for a lasr process running under any account.
#     export TKMPI_ULIMIT="-v 50000000"
#     export TKMPI_MEMSIZE=50000
#     export TKMPI_CGROUP="cgexec -g cpu:75"
# fi
YARN: Taking Hadoop Beyond Batch
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with Predictable Performance and Quality of Service
Applications Run Natively IN Hadoop
[Figure: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), and ONLINE (HBase) running on YARN over HDFS2 (Redundant, Reliable Storage)]
YARN Futures 
YARN – Delegated Container Model
[Figure sequence across four slides, each showing a ResourceManager with its Scheduler, a grid of NodeManagers, and AM 1. Standard flow: (1) allocate, (2) container, (3) startContainer, resulting in Container 1.1. Delegated flow: (1) allocate, (2) container, (3) delegateContainer, then steps 4 to 6, which are shown only graphically and involve ServiceX.]
PaaS - Kubernetes-on-YARN 
YARN as the default enterprise-class scheduler and resource manager for Kubernetes and OpenShift 3
§ First-class support for containerization and mainstream PaaS
§ Updated Go language bindings for YARN
§ Uses the container delegation model
Labels – Constraint Specifications
[Figure: a ResourceManager with its Scheduler and a grid of NodeManagers, some labeled "w/ GPU". MR AM 1 runs map 1.1, map 1.2, and reduce 1.1; DL-AM runs DL1.1, DL1.2, and DL1.3.]
Reservations - SLAs via Allocation Planning 
Next Steps… 
More about SAS & Hortonworks 
http://hortonworks.com/partner/SAS/ 
Download the Hortonworks Sandbox 
Learn Hadoop 
Build Your Analytic App 
Try Hadoop 2 
Contact us: events@hortonworks.com 

Más contenido relacionado

Was ist angesagt?

Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopHortonworks
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopHortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3Hortonworks
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun ConnollyHortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Hortonworks
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchHortonworks
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Hortonworks
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - finalHortonworks
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Hortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHortonworks
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Hortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache HadoopHortonworks
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 

Was ist angesagt? (20)

Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun Connolly
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - final
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices Workshop
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 

Ähnlich wie Combine SAS High-Performance Capabilities with Hadoop YARN

How YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopHow YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopPOSSCON
 
Overview of slider project
Overview of slider projectOverview of slider project
Overview of slider projectSteve Loughran
 
Hadoop - Looking to the Future By Arun Murthy
Hadoop - Looking to the Future By Arun MurthyHadoop - Looking to the Future By Arun Murthy
Hadoop - Looking to the Future By Arun Murthyhuguk
 
Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Hortonworks
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionWangda Tan
 
Hadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA Edition
Hadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA EditionHadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA Edition
Hadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA EditionBig Data Joe™ Rossi
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformBikas Saha
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopHortonworks
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARNDataWorks Summit
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Big Data Joe™ Rossi
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemShivaji Dutta
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingDataWorks Summit
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 
Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340
Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340
Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340Big Data Joe™ Rossi
 

Ähnlich wie Combine SAS High-Performance Capabilities with Hadoop YARN (20)

Running Services on YARN
Running Services on YARNRunning Services on YARN
Running Services on YARN
 
How YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopHow YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in Hadoop
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
 
Overview of slider project
Overview of slider projectOverview of slider project
Overview of slider project
 
Hadoop - Looking to the Future By Arun Murthy
Hadoop - Looking to the Future By Arun MurthyHadoop - Looking to the Future By Arun Murthy
Hadoop - Looking to the Future By Arun Murthy
 
Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
 
Hadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA Edition
Hadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA EditionHadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA Edition
Hadoop: Past, Present and Future - v2.2 - SQLSaturday #326 - Tampa BA Edition
 
Hoya for Code Review
Hoya for Code ReviewHoya for Code Review
Hoya for Code Review
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute Platform
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARN
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystem
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Hadoop YARN Services
Hadoop YARN ServicesHadoop YARN Services
Hadoop YARN Services
 
Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340
Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340
Hadoop: Past, Present and Future - v2.1 - SQLSaturday #340
 

Mehr von Hortonworks

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Combine SAS High-Performance Capabilities with Hadoop YARN

  • 1. We’ll get started soon… Q&A box is available for your questions Webinar will be recorded for future viewing Thank you for joining! © Hortonworks Inc. 2011 – 2014. All Rights Reserved
  • 2. Combine SAS High-Performance Capabilities with Hadoop YARN. We do Hadoop.
  • 3. Your speakers… Arun Murthy, Founder and Architect, Hortonworks (@acmurthy); Paul Kent, Vice President Big Data, SAS (@hornpolish)
  • 4. Agenda • Introduction to YARN • SAS Workloads on the Cluster • SAS Workloads: Resource Settings • SAS and YARN • YARN Futures • Next Steps
  • 5. The 1st Generation of Hadoop: Batch. HADOOP 1.0: Built for Web-Scale Batch Apps • All other usage patterns must leverage that same infrastructure • Forces the creation of silos for managing mixed workloads [Diagram: separate single-app silos labelled BATCH, INTERACTIVE and ONLINE, with HDFS underneath]
  • 6. Hadoop MapReduce Classic. JobTracker § Manages cluster resources and job scheduling. TaskTracker § Per-node agent § Manages tasks
  • 7. MapReduce Classic: Limitations. Scalability § Maximum cluster size: 4,000 nodes § Maximum concurrent tasks: 40,000 § Coarse synchronization in JobTracker. Availability § Failure kills all queued and running jobs. Hard partition of resources into map and reduce slots § Low resource utilization. Lacks support for alternate paradigms and services § Iterative applications implemented using MapReduce are 10x slower
  • 8. Our Vision: Hadoop as Next-Gen Platform. Hadoop 1 • Silos & largely batch • Single processing engine. Hadoop 2 • Multiple engines, single data set • Batch, interactive & real-time. [Diagram: in Hadoop 1, Pig (script), Hive (SQL) and others run on MapReduce (cluster resource management & data processing) over HDFS (Hadoop Distributed File System); in Hadoop 2, Pig, Hive and Cascading (Java) run on Tez alongside HBase (real-time), Accumulo, Storm, Solr, Spark and other ISV engines, on YARN: Data Operating System (cluster resource management) over HDFS.]
  • 9. YARN: Taking Hadoop Beyond Batch. Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service. Applications Run Natively IN Hadoop: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), OTHER (Search) (Weave…), on YARN (Cluster Resource Management) over HDFS2 (Redundant, Reliable Storage)
  • 10. YARN Hortonworks Data Platform [Diagram: Pig (script), Hive (SQL), Cascading (Java) and other engines on Tez; Batch MR; HBase and Accumulo (NoSQL), Storm (stream) and other engines on Slider; Spark (in-memory); Kubernetes (PaaS); LASR and HPA; all on YARN: Data Operating System (cluster resource management) over HDFS (Hadoop Distributed File System).]
  • 11. 5 Key Benefits of YARN 1. Scale 2. New Programming Models & Services 3. Improved cluster utilization 4. Agility 5. Beyond Java
  • 12. Concepts. Application § An application is a temporal job or a service submitted to YARN § Examples – MapReduce job (job) – HBase cluster (service). Container § Basic unit of allocation § Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.) – container_0 = 2 GB, 1 CPU – container_1 = 1 GB, 6 CPU § Replaces the fixed map/reduce slots
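The container idea on slide 12 can be illustrated with a small sketch. It is not how YARN itself is implemented: the `try_allocate` helper, the 8 GB / 8 vcore node and the third request are all hypothetical, and only the `container_0` and `container_1` sizes come from the slide. The point is that a request is granted against whatever memory and CPU remain, rather than against a fixed map or reduce slot.

```sh
# Hypothetical node capacity (not from the slides): 8 GB of memory, 8 vcores.
node_mem_mb=8192
node_vcores=8

# try_allocate NAME MEM_MB VCORES
# Grants the request only if both the memory and the CPU still fit on the node.
try_allocate() {
  if [ "$2" -le "$node_mem_mb" ] && [ "$3" -le "$node_vcores" ]; then
    node_mem_mb=$(( node_mem_mb - $2 ))
    node_vcores=$(( node_vcores - $3 ))
    echo "$1 granted (${node_mem_mb} MB and ${node_vcores} vcores left)"
  else
    echo "$1 rejected"
    return 1
  fi
}

try_allocate container_0 2048 1          # 2 GB, 1 CPU (from the slide)
try_allocate container_1 1024 6          # 1 GB, 6 CPU (from the slide)
try_allocate container_2 1024 2 || true  # hypothetical; only 1 vcore is left
```

The two requests from the slide fit side by side even though their shapes differ; the hypothetical third one is turned away on CPU alone, although plenty of memory remains.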
  • 13. Design Centre. Split up the two major functions of JobTracker § Cluster resource management § Application life-cycle management. MapReduce becomes a user-land library
  • 14. YARN Architecture - Walkthrough [Diagram: Client2 talks to the ResourceManager (Scheduler); NodeManagers across the cluster host AM 1 with Containers 1.1 to 1.3 and AM2 with Containers 2.1 to 2.4.]
  • 15. Multi-Tenancy with YARN. Economics as queue-capacity § Hierarchical Queues. SLAs § Preemption. Resource Isolation § Linux: cgroups § MS Windows: Job Control § Roadmap: Virtualization (Xen, KVM). Administration § Queue ACLs § Run-time re-configuration for queues § Charge-back. [Diagram: ResourceManager Scheduler with Capacity Scheduler hierarchical queues under root, labelled Adhoc 10%, DW 70%, Mrkting 20%; Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%.]
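In a hierarchical queue tree each percentage is relative to the parent queue, so a leaf's share of the whole cluster is the product of the percentages along its path. The sketch below shows that arithmetic. The `abs_capacity` helper is hypothetical, and the paths used (DW containing Prod, which in turn contains P0; Mrkting containing Prod) are an assumption about how the flattened diagram labels nest.

```sh
# abs_capacity PCT [PCT ...]
# Multiplies the per-level queue percentages along a path from root to leaf
# and prints the leaf's share of the whole cluster, truncated to one decimal.
abs_capacity() {
  ac_share=1000000   # parts per million of the cluster; starts at 100%
  for ac_pct in "$@"; do
    ac_share=$(( ac_share * ac_pct / 100 ))
  done
  printf '%d.%d\n' $(( ac_share / 10000 )) $(( ac_share % 10000 / 1000 ))
}

abs_capacity 70 70 70   # assumed path root -> DW -> Prod -> P0
abs_capacity 20 80      # assumed path root -> Mrkting -> Prod
```

Under those assumed paths the two calls print 34.3 and 16.0, that is, P0 would be guaranteed roughly a third of the cluster and the marketing production queue about a sixth.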
  • 16. YARN Applications. Data processing applications and services § Services: Slider § Real-time event processing: Storm, S4, other commercial platforms § Tez: generic framework to run a complex DAG § MPI: OpenMPI, MPICH2 § Master-Worker § Machine Learning: Spark § Graph processing: Giraph § Enabled by allowing the use of a paradigm-specific application master. Run all on the same Hadoop cluster!
  • 17. SHARE! Customers are: wrapping up POCs; building Bigger Clusters; assembling their Data { Lake, Reservoir }; want their software to SHARE the cluster. Copyright © 2014, SAS Institute Inc. All rights reserved.
  • 18. SAS Workloads on the Cluster
  • 19. SAS Workloads on the Cluster - Video
  • 20. SAS Workloads on the Cluster. Some requests are for a significant slice of the cluster § Reservation will be ALL DAY, ALL WEEK, ALL MONTH? § Memory typically fixed (15% of cluster) § CPU floor, would like the spare capacity when available. Some requests are more short term § Memory can be estimated § Duration can be capped § CPU floor, would like spare capacity
  • 21. SAS Workloads on the Cluster
  • 22. SAS Workloads – Resource Settings. How much should you reserve? Not a perfect science yet. Long running? LASR server: by percent of total memory. More like a batch request? HPA procedure: by anecdotal experience.
  • 23. SAS Workloads – Resource Settings
      if [ "$USER" = "lasradm" ]; then
        # Custom settings for running under the lasradm account.
        export TKMPI_ULIMIT="-v 50000000"
        export TKMPI_MEMSIZE=50000
        export TKMPI_CGROUP="cgexec -g cpu:75"
      fi
      # if [ "$TKMPI_APPNAME" = "lasr" ]; then
      # Custom settings for a lasr process running under any account.
      # export TKMPI_ULIMIT="-v 50000000"
      # export TKMPI_MEMSIZE=50000
      # export TKMPI_CGROUP="cgexec -g cpu:75"
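Slide 22 suggests sizing a long-running LASR server as a percent of total memory, and slide 23 shows the resulting settings. A minimal sketch of that calculation follows. The `mem_settings` helper and the 100000 MB total are hypothetical. It also assumes that `TKMPI_MEMSIZE` is in megabytes and the `-v` limit in kilobytes, which is consistent with the 1000:1 ratio between the two values on the slide but is not stated there.

```sh
# mem_settings TOTAL_MB PERCENT
# Prints export lines sized to PERCENT of TOTAL_MB.
# Assumption: TKMPI_MEMSIZE is in MB and the ulimit -v value is in KB.
mem_settings() {
  ms_mb=$(( $1 * $2 / 100 ))
  ms_kb=$(( ms_mb * 1000 ))   # same 1000:1 ratio as the values on the slide
  printf 'export TKMPI_MEMSIZE=%d\n' "$ms_mb"
  printf 'export TKMPI_ULIMIT="-v %d"\n' "$ms_kb"
}

# Hypothetical node with 100000 MB, half of it reserved for LASR.
mem_settings 100000 50
```

With these hypothetical inputs the output reproduces the two memory lines from the slide; the `TKMPI_CGROUP` CPU setting is independent of memory and is left out of the sketch.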
  • 24. YARN: Taking Hadoop Beyond Batch. Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service. Applications Run Natively IN Hadoop: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), ONLINE (HBase), on YARN over HDFS2 (Redundant, Reliable Storage)
  • 25. YARN Futures
  • 26. YARN – Delegated Container Model [Diagram: steps 1 to 3 among AM 1, the ResourceManager Scheduler and the NodeManagers, labelled allocate!, container! and startContainer!; Container 1.1 runs on a NodeManager.]
  • 27. YARN – Delegated Container Model [Diagram: steps 1 to 4, labelled allocate!, container! and delegateContainer!; ServiceX runs on a NodeManager alongside AM 1.]
  • 28. YARN – Delegated Container Model [Diagram: step 5, with ServiceX and AM 1 on NodeManagers under the ResourceManager Scheduler.]
  • 29. YARN – Delegated Container Model [Diagram: step 6, with ServiceX and AM 1 on NodeManagers under the ResourceManager Scheduler.]
  • 30. PaaS - Kubernetes-on-YARN. YARN as the default enterprise-class scheduler and resource manager for Kubernetes and OpenShift 3 • First-class support for containerization and mainstream PaaS • Updated Go language bindings for YARN • Uses container delegation model
  • 31. Labels – Constraint Specifications [Diagram: ResourceManager Scheduler over NodeManagers, some marked "w/ GPU"; MR AM 1 with map 1.1, map 1.2 and reduce 1.1; DL-AM with DL 1.1, DL 1.2 and DL 1.3.]
  • 32. Reservations - SLAs via Allocation Planning
  • 33. YARN Hortonworks Data Platform [Diagram: same platform stack as slide 10.]
  • 34. Next Steps… More about SAS & Hortonworks: http://hortonworks.com/partner/SAS/ • Download the Hortonworks Sandbox • Learn Hadoop • Build Your Analytic App • Try Hadoop 2 • Contact us: events@hortonworks.com