Dynamic Resource Allocation in Apache Spark
Yuta Imai
The talk about Dynamic Resource Allocation and External Shuffle Service.
1.
Dynamic Resource Allocation in Apache Spark
Yuta Imai (@imai_factory)
2.
1. RDD Graph
    // sc is an existing SparkContext (for example, the one spark-shell provides)
    val text = "Hello Spark, this is my first Spark application."
    val textArray = text.split(" ").map(_.replaceAll(" ",""))
    val result = sc.parallelize(textArray)
      .map(item => (item, 1))
      .reduceByKey((x,y) => x + y)
      .collect()
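To see what this pipeline computes without a cluster, the sketch below reproduces it with plain Scala collections; `groupBy` plus a sum stands in for `reduceByKey`, and the object name `WordCountSketch` is only illustrative. One detail it makes visible: after splitting on spaces, `replaceAll(" ", "")` has nothing left to remove, and punctuation stays attached, so "Spark," and "Spark" end up as separate keys.

```scala
// Plain Scala collections stand in for the RDD here; no Spark dependency is needed.
object WordCountSketch {
  val text = "Hello Spark, this is my first Spark application."

  // Same tokenization as the slide: split on single spaces.
  val textArray: Array[String] = text.split(" ")

  // Local equivalent of map(item => (item, 1)).reduceByKey(_ + _):
  // group identical tokens, then sum the 1s in each group.
  val counts: Map[String, Int] =
    textArray
      .map(item => (item, 1))
      .groupBy { case (word, _) => word }
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit =
    counts.toSeq.sortBy(_._1).foreach { case (w, n) => println(s"$w -> $n") }
}
```

Every token in this sentence is distinct once punctuation is taken into account, so each key has a count of 1.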
3.
2. DAG Scheduler
[Diagram: the RDD graph built by the code on the previous slide]
Array → sc.parallelize() → ParallelCollectionRDD (Partition0 to Partition3) → .map(…) → MapPartitionsRDD (Partition0 to Partition3) → .reduceByKey(…) → ShuffledRDD (Partition0, Partition1) → .collect() → Array
4.
2. DAG Scheduler
[Same RDD graph, now annotated with dependency types]
• Narrow Dependency: ParallelCollectionRDD → MapPartitionsRDD (.map(…))
• Shuffle Dependency: MapPartitionsRDD → ShuffledRDD (.reduceByKey(…))
5.
2. DAG Scheduler
[Same RDD graph, now divided into stages and tasks]
• Stage0: ParallelCollectionRDD and MapPartitionsRDD (narrow dependency); Task0 to Task3, one per partition.
• Stage1: ShuffledRDD (after the shuffle dependency); Task4 and Task5, one per partition.
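The stage split on this slide can be sketched as a small model: walk the chain of operations, keep narrow dependencies in the current stage, and start a new stage at each shuffle dependency. This is a simplified illustration for a linear chain, not Spark's DAGScheduler; `Op`, `Stage`, and `splitIntoStages` are hypothetical names.

```scala
object StageSketch {
  // Hypothetical model of one transformation in a linear RDD chain.
  sealed trait Dependency
  case object Narrow extends Dependency
  case object Shuffle extends Dependency

  final case class Op(name: String, dependency: Dependency, partitions: Int)

  final case class Stage(id: Int, ops: List[Op]) {
    // One task per partition of the last RDD in the stage.
    def numTasks: Int = ops.last.partitions
  }

  // Cut the chain into stages: a shuffle dependency starts a new stage,
  // narrow dependencies are pipelined into the current one.
  def splitIntoStages(chain: List[Op]): List[Stage] = {
    val groups = chain.foldLeft(List.empty[List[Op]]) {
      case (Nil, op) => List(List(op))
      case (current :: done, op) if op.dependency == Narrow => (op :: current) :: done
      case (acc, op) => List(op) :: acc
    }
    groups.reverse.zipWithIndex.map { case (ops, id) => Stage(id, ops.reverse) }
  }

  // The chain from the slides. The source RDD has no parent, so its
  // dependency field is ignored; Narrow is used as a placeholder.
  val chain: List[Op] = List(
    Op("parallelize", Narrow, 4),
    Op("map", Narrow, 4),
    Op("reduceByKey", Shuffle, 2)
  )
}
```

Calling `StageSketch.splitIntoStages(StageSketch.chain)` gives two stages with 4 and 2 tasks, which matches the six tasks (Task0 to Task5) on the slide.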
6.
3. Task Scheduler
[Diagram: Task0 to Task3, one per partition (Partition0 to Partition3) of the pipelined RDDs, are assigned to executors.]
7.
[Diagram: two worker nodes. On each, an executor thread runs the pipelined iterator.map(…).map(...)... chain. One worker node is also shown with storage holding a shuffle file.]
8.
DYNAMIC RESOURCE ALLOCATION
9.
Dynamic Resource Allocation
• Adds extra executors to an app that has pending tasks.
  – Removes the need to plan an app's resources exactly in advance.
• Removes idle executors from an app.
  – Lets a long-running app release executors it is not using.
10.
Overview
[Diagram built up over slides 10 to 16: tasks plotted against executors]
• Insufficient capacity: there are more pending tasks than the executors can run, so executors are added step by step.
• Optimal capacity: the number of executors matches the task load.
• Idle executors: once tasks complete (✔), some executors sit idle; they are removed until capacity is optimal again.
17.
Request Policy
• An app starts with a user-specified number of executors:
    ./bin/spark-submit --class <main-class> --master <master-url> --num-executors <# of executors>
• After spark.dynamicAllocation.schedulerBacklogTimeout seconds with pending tasks, the app starts requesting new executors.
• While tasks remain pending, the app requests more executors every spark.dynamicAllocation.sustainedSchedulerBacklogTimeout seconds, doubling the number requested each round: 1, 2, 4, 8, 16…
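The ramp-up described above can be sketched as a small schedule calculation. This is an illustration under the assumption that tasks stay pending for every round; it ignores any upper limit on executors, and `Request` and `requestSchedule` are hypothetical names. The two parameters mirror the settings on the slide, with times in seconds.

```scala
object RequestPolicySketch {
  // One request round: when it fires and how many executors it asks for.
  final case class Request(atSecond: Long, executorsRequested: Int)

  // Assumes the backlog persists for all rounds. Values passed in are
  // illustrative and not claimed to be Spark defaults.
  def requestSchedule(
      schedulerBacklogTimeout: Long,
      sustainedSchedulerBacklogTimeout: Long,
      rounds: Int
  ): List[Request] =
    List.tabulate(rounds) { round =>
      Request(
        // First request after the initial timeout, then one per sustained timeout.
        atSecond = schedulerBacklogTimeout + round * sustainedSchedulerBacklogTimeout,
        // Doubling per round: 1, 2, 4, 8, 16, ...
        executorsRequested = 1 << round
      )
    }
}
```

For example, `RequestPolicySketch.requestSchedule(5, 2, 5)` yields requests at seconds 5, 7, 9, 11, and 13 for 1, 2, 4, 8, and 16 executors. Doubling lets a small backlog be handled cautiously while a large one is still reached in few rounds.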
18.
Remove Policy
• An app removes an executor once it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds.
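The removal rule is a simple timeout check, sketched below. `ExecutorState` and `executorsToRemove` are hypothetical bookkeeping names for illustration and do not reflect how Spark tracks executors internally; times are in seconds.

```scala
object RemovePolicySketch {
  // Hypothetical record: when an executor last ran a task and what it runs now.
  final case class ExecutorState(id: String, lastBusyAtSecond: Long, runningTasks: Int)

  // An executor is removable when it runs nothing and has been idle
  // for longer than the idle timeout.
  def executorsToRemove(
      executors: Seq[ExecutorState],
      nowSecond: Long,
      executorIdleTimeout: Long
  ): Seq[String] =
    executors.collect {
      case e if e.runningTasks == 0 && nowSecond - e.lastBusyAtSecond > executorIdleTimeout =>
        e.id
    }
}
```

With a 60-second timeout at second 100, an executor idle since second 0 is removed, one idle since second 90 is kept, and one still running tasks is kept regardless of its timestamp.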
19.
External Shuffle Service
[Diagram, slides 19 to 21: two worker nodes, each with an executor whose threads run iterator.map(…).map(...)... and with local storage. In the final slide, a Shuffle Service runs on each worker node alongside the executor.]
Because the shuffle service on each worker node serves the shuffle files, an idle executor can be removed without losing the shuffle output it wrote.
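As a rough configuration sketch, the settings named in this talk can be collected and rendered as `--conf key=value` arguments for spark-submit. The keys are Spark configuration properties; the timeout values are example choices, and `toSubmitArgs` is a hypothetical helper. The shuffle service itself must also be running on every worker node, which this sketch does not cover.

```scala
object DynamicAllocationConfSketch {
  // Configuration keys from the talk, plus the two switches that enable
  // dynamic allocation and the external shuffle service.
  // The timeout values are example choices.
  val settings: Seq[(String, String)] = Seq(
    "spark.dynamicAllocation.enabled" -> "true",
    "spark.shuffle.service.enabled" -> "true",
    "spark.dynamicAllocation.schedulerBacklogTimeout" -> "1s",
    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout" -> "1s",
    "spark.dynamicAllocation.executorIdleTimeout" -> "60s"
  )

  // Render each setting as a pair of spark-submit arguments: --conf key=value
  def toSubmitArgs(conf: Seq[(String, String)]): Seq[String] =
    conf.flatMap { case (key, value) => Seq("--conf", s"$key=$value") }

  def main(args: Array[String]): Unit =
    println(toSubmitArgs(settings).mkString(" "))
}
```

Keeping the settings as data makes it easy to print them, diff them between environments, or pass them to a launcher script.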