COMP 5704 Project Presentation
Towards Using Smart Hill Climbing Heuristic Search Strategy in Hadoop/YARN Dynamic Parameter Tuning
Ali Davoudian, Pablo Navarro
School of Computer Science
Carleton University, Ottawa, Canada
MapReduce Framework
[Figure: a MapReduce job. Three input chunks feed Map#1, Map#2, and Map#3; each map task i emits two groups, Gi1 and Gi2; the shuffle sends G11, G21, G31 to Red#1 and G12, G22, G32 to Red#2; each reducer writes one output chunk (Output Chunk-1, Output Chunk-2).]
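The map, shuffle, and reduce flow on this slide can be sketched in a few lines of Python. This is a toy, single-process word count rather than Hadoop code: the function names and the character-sum partitioner are illustrative assumptions, and the three input chunks and two reducers simply mirror the slide.

```python
from collections import defaultdict

def map_task(chunk):
    """Map phase: emit a (word, 1) pair for every word in one input chunk."""
    return [(word, 1) for word in chunk.split()]

def partition(key, num_reducers):
    """Assign a key to a reducer (assumed partitioner, deterministic across runs)."""
    return sum(ord(ch) for ch in key) % num_reducers

def run_job(chunks, num_reducers=2):
    # Shuffle: the output of every map task is split into one group per reducer.
    groups = [defaultdict(list) for _ in range(num_reducers)]
    for chunk in chunks:
        for key, value in map_task(chunk):
            groups[partition(key, num_reducers)][key].append(value)
    # Reduce phase: each reducer sums the values of the keys it received
    # and produces one output chunk.
    return [{key: sum(values) for key, values in group.items()} for group in groups]

outputs = run_job(["a b a", "b c", "a c c"])
```

Each key lands in exactly one reducer, so the output chunks can be concatenated without further merging.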
Hadoop/YARN Configuration Parameters
• Hadoop/YARN configuration parameters have a significant
effect on the cost of MapReduce jobs.
[Figure: map-side pipeline: read from HDFS, collect into the circular memory buffer (sized by io.sort.mb), sort and spill to disk, merge.]
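To make the effect of io.sort.mb concrete, here is a deliberately simplified model of how many spill files one map task writes. It assumes that a spill starts once the buffer reaches a fill threshold (0.80 is the Hadoop default for the spill percentage) and it ignores the space taken by record metadata, so real jobs would spill somewhat more often. The function name and the 400 MB map output are hypothetical.

```python
import math

def estimated_spills(map_output_mb, io_sort_mb=100, spill_percent=0.80):
    """Rough number of spill files for one map task (simplified model)."""
    if map_output_mb <= 0:
        return 0
    # Amount of buffered data that triggers a spill to disk.
    usable_mb = io_sort_mb * spill_percent
    return math.ceil(map_output_mb / usable_mb)

# Hypothetical map task producing 400 MB of intermediate data.
spills_by_buffer = {size: estimated_spills(400, io_sort_mb=size)
                    for size in (100, 200, 400)}
```

Under this model a larger buffer means fewer spills and less merge work, at the price of more memory per map task, which is the kind of trade-off a tuner has to search over.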
Configuration Parameter Tuning
• Which configuration gives the minimum MR job cost?
1. Manual-tuning
• Challenge: combinatorial explosion problem
2. Auto-tuning
I. Static
II. Dynamic
• Search-based methods
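The combinatorial explosion behind manual tuning is easy to quantify. The sketch below counts the configurations in a small grid; the parameter names are real Hadoop 2 properties, but the candidate values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical candidate values for four Hadoop parameters.
candidates = {
    "mapreduce.task.io.sort.mb": [50, 100, 200, 400],
    "mapreduce.map.sort.spill.percent": [0.6, 0.7, 0.8, 0.9],
    "mapreduce.task.io.sort.factor": [10, 25, 50, 100],
    "mapreduce.job.reduces": [1, 2, 4, 8, 16],
}

def grid_size(space):
    """Number of distinct configurations in a full grid over the space."""
    return math.prod(len(values) for values in space.values())

total = grid_size(candidates)  # 4 * 4 * 4 * 5 = 320
```

Four parameters already give 320 test runs; twenty parameters with four values each would give 4**20, roughly 1.1 trillion, which is why exhaustive manual tuning does not scale.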
Static Parameter Tuning
[Figure: initial MR job configuration → execute a test run with profiling enabled → profiling outputs → performance analyzer → MR job configuration → MR job]
• Time consuming
• Not cost-effective
• If the data set or hardware changes, the tests
must be repeated
Dynamic Parameter Tuning
[Figure: while the MR job runs from its initial configuration, the autotuner combines wave execution, a cost analyzer, and smart hill climbing exploration to reach an optimum or near-optimum configuration.]
Search-based Auto-tuning
• Define an objective function Y as a proxy for the cost of the
MR job.
  • E.g., the average execution time of containers
• Assumption:
• Problem: which configuration gives the minimum, or a
near-minimum, cost?
• Challenge: 𝒇 is unknown (a black box)
Heuristic Search Methods
I. Simulated annealing
  • Uses the Metropolis Monte Carlo sampling strategy
  • Can escape local optima; convergence to a global optimum is guaranteed only asymptotically, under a sufficiently slow cooling schedule
  • Converges slowly to the solution
II. Recursive random search
  • Uses the recursive random sampling strategy
  • May be inefficient, as restarting naïve random sampling can waste effort
III. Hill climbing
  • Uses the gradient-based sampling strategy
  • May get stuck at a local optimum
IV. Genetic algorithms
V. Particle swarm optimization
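The local-optimum weakness of plain hill climbing is easy to reproduce. The sketch below is an illustration, not the presenters' code: it uses a made-up quartic cost with one local and one global minimum, and it compares neighbouring points instead of computing a gradient.

```python
def cost(x):
    """Toy cost: local minimum near x = 1.13, global minimum near x = -1.30."""
    return x**4 - 3 * x**2 + x

def hill_climb(f, start, step=0.01, max_iters=10_000):
    """Greedy descent: move to the cheaper neighbour until neither one improves."""
    x = start
    for _ in range(max_iters):
        best = min((x - step, x + step), key=f)
        if f(best) >= f(x):
            break  # no neighbour is cheaper: a (possibly local) minimum
        x = best
    return x

stuck = hill_climb(cost, start=2.0)    # ends in the local minimum
lucky = hill_climb(cost, start=-2.0)   # ends in the global minimum
```

The result depends entirely on the starting point, which is the behaviour that restarts and smarter sampling are meant to address.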
Smart Hill Climbing Exploration - SHC
• Collect m sample configurations c1, ..., cm from the whole configuration space S
• From the sample points obtained and their costs, determine a
reduced or shifted subspace S' that is most likely to
contain an optimal or near-optimal configuration
• Determine the best configuration with regard to
all the sample points obtained so far
• Collect m sample configurations c1, ..., cm from the subspace S'
[Flowchart edge labels: Restart, Focus]
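A minimal one-dimensional sketch of the sample, focus, and restart loop on this slide follows. It is an assumption-laden illustration rather than the presenters' implementation: it draws plain uniform random samples instead of wLHS, shrinks the subspace by a fixed factor around the best point, and the names `shrink`, `min_width`, and `restarts` are hypothetical.

```python
import random

def smart_hill_climb(cost, low, high, m=10, shrink=0.5,
                     min_width=1e-3, restarts=3, seed=0):
    """Return (best_x, best_cost, evaluations) for a 1-D black-box cost."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    evaluations = 0
    for _restart in range(restarts):        # "Restart": back to the whole space S
        lo, hi = low, high
        run_x, run_y = None, float("inf")
        while hi - lo > min_width:
            # Collect m sample configurations from the current (sub)space.
            for _sample in range(m):
                x = rng.uniform(lo, hi)
                y = cost(x)
                evaluations += 1
                if y < run_y:
                    run_x, run_y = x, y
            # "Focus": the next subspace S' is a smaller window centred on
            # the best point of this run, clipped to the original bounds.
            half = (hi - lo) * shrink / 2
            lo = max(low, run_x - half)
            hi = min(high, run_x + half)
        if run_y < best_y:
            best_x, best_y = run_x, run_y
    return best_x, best_y, evaluations

x, y, n = smart_hill_climb(lambda v: (v - 1.0) ** 2, -3.0, 3.0)
```

Because the window is re-centred on the best point each round, the search can still drift towards the optimum even when one round's window misses it.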
Bypassing Local Searches – Approach 1
Reduces the overhead, but is less noise resilient
Bypassing Local Searches – Approach 2
More noise resilient, but increases the overhead
Weighted Latin Hypercube Sampling - wLHS
• Determine K equal-sized, non-overlapping intervals I1, …, IK in the
space of each parameter Pi
• Calculate the cost Y of each configuration
• Determine the general trend and correlation of Y with
each configuration parameter Pi
• Determine K equal-probability, non-overlapping intervals I1, …, IK in
the space of each parameter Pi according to its PDF
• Randomly select one parameter value from each interval
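The last two steps, equal-probability intervals followed by one draw per interval, can be sketched for a single parameter as below. The sketch assumes the PDF is already available as a piecewise-constant histogram with positive weights; how the slides derive that PDF from the trend of Y is not specified, so the weights here are hypothetical.

```python
import random

def equal_probability_edges(edges, weights, k):
    """Split [edges[0], edges[-1]] into k intervals of equal probability mass.

    weights[i] is the (unnormalised, positive) mass of [edges[i], edges[i + 1]].
    """
    total = float(sum(weights))
    result = [edges[0]]
    acc, i = 0.0, 0
    for j in range(1, k):
        target = total * j / k
        # Skip whole histogram bins until the target mass falls inside bin i.
        while acc + weights[i] < target:
            acc += weights[i]
            i += 1
        frac = (target - acc) / weights[i]
        result.append(edges[i] + frac * (edges[i + 1] - edges[i]))
    result.append(edges[-1])
    return result

def sample_one_per_interval(bounds, rng):
    """Latin hypercube step: one uniform draw from each interval."""
    return [rng.uniform(a, b) for a, b in zip(bounds, bounds[1:])]

edges = [0.0, 1.0, 2.0, 3.0]
weights = [1.0, 1.0, 4.0]   # hypothetical: low costs were seen near the right end
bounds = equal_probability_edges(edges, weights, k=3)   # [0.0, 2.0, 2.5, 3.0]
samples = sample_one_per_interval(bounds, random.Random(0))
```

The intervals narrow where the weight is high, so more samples fall in the region that looked promising while every part of the space still receives one sample.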
wLHS – Example
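[Figure: curve over the parameter p, with the vertical axis, labelled PDF(p), marked at 0, 1/3, 2/3, and 1; the axis points P0 = A, P1, P2, and P3 = B bound the intervals I1, I2, I3; one random draw per interval (r1, r2, r3) gives the sampled values C1.p, C2.p, C3.p.]
Assumptions:
• One-dimensional configurations
• Parameter space: [A, B]
• Number of samples: 3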
Implementation and Experiments
• The algorithms were implemented in Java for one-dimensional
configuration optimization outside of the
Hadoop/YARN context, to reduce complexity and make
testing easier.
• Experiments were run on artificial cost functions to
look for optimal configurations.
Weighted Latin Hypercube Sampling
• wLHS was tested to verify whether the interval sizes
shift after it learns where good (low-cost) values are
found.
• A simple artificial cost function was used for these
experiments.
Rosenbrock Function
[Figure: Rosenbrock function; cost (0 to 180) plotted for x from -1 to 1.]
Round 1 wLHS
[Figure: LHS round 1; cost(x) (0 to 180) against x values (-1 to 1).]
Round 4 wLHS
[Figure: wLHS round 4; cost(x) (0 to 180) against x values (-1 to 1).]
Intervals After 20 Rounds
[Figure: intervals after 20 rounds; x from -1 to 1, vertical axis from -1000 to 3000.]
SHC Experiments
• Smart hill climbing was implemented and tested using
more complex cost functions.
• Two different versions of SHC were implemented:
  • SHC Original version (Approach 1).
  • SHC MROnline version (Approach 2).
• The tests focused on:
  • Finding the best parameters (lowest cost).
  • Number of cost function executions.
  • Noise resilience.
SHC Used for a Complex Function
[Figure: the complex test function for x from -3 to 3 (vertical axis 0 to 6); series: Function, Optima.]
Real optimum = 2.2615652875
Average Number of Cost Function Executions
[Figure: bar chart of the average number of cost function executions (axis 0 to 200) for SHC MROnline, SHCO with 80% acceptance, and SHCO with 20% acceptance.]
The Effect of Noise on the Precision of SHC
[Figure: proportional average distance to the global optimum (0 to 250) against percentage of noise (0 to 70), for Gaussian and uniform noise.]
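One plausible way to inject a given percentage of noise into a cost function is sketched below. The slides do not say how the noise was applied, so everything here is an assumption: the noise is multiplicative, "p% Gaussian" is read as a standard deviation of p/100, and "p% uniform" as a range of plus or minus p/100. The wrapper name `with_noise` is hypothetical.

```python
import random

def with_noise(cost, percent, kind="gaussian", rng=None):
    """Wrap a cost function so each evaluation is perturbed by relative noise."""
    rng = rng or random.Random(0)
    scale = percent / 100.0

    def noisy(x):
        if kind == "gaussian":
            factor = rng.gauss(0.0, scale)        # assumed: std dev = percent/100
        elif kind == "uniform":
            factor = rng.uniform(-scale, scale)   # assumed: range = +/- percent/100
        else:
            raise ValueError(f"unknown noise kind: {kind}")
        return cost(x) * (1.0 + factor)

    return noisy

# A constant cost of 10 with 20% uniform noise stays within [8, 12].
noisy_cost = with_noise(lambda x: 10.0, 20, kind="uniform")
values = [noisy_cost(0.0) for _ in range(1000)]
```

Uniform noise is bounded while Gaussian noise has unbounded tails, which is one reason the two distributions can affect a search differently at the same nominal percentage.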
Distribution of Results on the Curve for 20% Gaussian Noise
[Figure: x from -3 to 3, vertical axis 0 to 6; series: shape of the function, results with no noise, results with noise.]
Distribution of Results on the Curve for 40% Gaussian Noise
[Figure: x from -3 to 3, vertical axis 0 to 6; series: shape of the function, results with no noise, results with noise.]
Distribution of Results on the Curve for 60% Gaussian Noise
[Figure: x from -3 to 3, vertical axis 0 to 6; series: shape of the function, results with no noise, results with noise.]
Future Work
• Implementing SHC inside the Hadoop/YARN
environment
• Enhancing our current version to tune N-dimensional
Hadoop/YARN configurations
• Including tuning rules in our tuning algorithms
• Assessing the feasibility of other heuristic search
algorithms, such as MOWILE (More With Less).
Questions
1. What does auto-tuning mean?
2. What is the dynamic auto-tuning technique?
3. Were our tests executed in the Hadoop environment or in
a simulation environment?
4. What kind of distributions are being used in our noise
generation?
