Hadoop Performance Tuning 
      A case study 
          Milind Bhandarkar 
           (Suhas Gogate) 
              (Viraj Bhat) 
Do we really need to tune it?
    Case study: Recommendation Engine
         20 dedicated machines ($160,000 for 3 years)
         3 hours runtime every day
            Needed 15% improvement per year
               Equivalent to 3 additional machines ($24,000 for 3 years)
         4 full-time programmers
            2 months for needed improvement ($120,000)

    Throw more hardware at it!
         Happy ending: 100 machines, 1 hour runtime = $20 per day
            $22,000 for 3 years
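
The cost figures on this slide can be re-derived with a few lines of arithmetic. The sketch below only uses numbers from the slide; the function names are illustrative, and a 365-day year is assumed for the rental estimate.

```python
# Figures are copied from the slide; only the arithmetic is added here.
MACHINES = 20
CLUSTER_COST_3Y = 160_000        # dollars, 20 dedicated machines for 3 years
PROGRAMMER_COST = 120_000        # dollars, 4 programmers for 2 months
IMPROVEMENT_PER_YEAR = 0.15      # required yearly improvement


def extra_machines(machines: int, improvement: float) -> int:
    """Machines needed to absorb the improvement with hardware alone."""
    return round(machines * improvement)


def machine_cost_3y(cluster_cost: float, machines: int) -> float:
    """Cost of one dedicated machine over the 3-year period."""
    return cluster_cost / machines


def rented_cost(dollars_per_day: float, years: int = 3) -> float:
    """Cost of renting capacity at a flat daily rate (assumes 365-day years)."""
    return dollars_per_day * 365 * years


added = extra_machines(MACHINES, IMPROVEMENT_PER_YEAR)
hardware_cost = added * machine_cost_3y(CLUSTER_COST_3Y, MACHINES)
print(f"extra machines: {added}, cost over 3 years: ${hardware_cost:,.0f}")
print(f"tuning by hand: ${PROGRAMMER_COST:,}")
print(f"renting at $20/day for 3 years: ${rented_cost(20):,.0f}")
```

The rental estimate comes out slightly under the slide's rounded $22,000.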

                                                                            6/29/09
Hadoop
    Highly configurable commodity cluster computing programming framework
         Hides messy details of parallelization and exposes only a sequential map-reduce API to programmers
         Hadoop runtime system takes care of partitioning, scheduling and program execution
              Hints from user
         Allows programmers with little or no experience in parallel programming to write parallel programs
         Exposes various knobs for tuning jobs

    Tuning individual jobs is a non-trivial task
         165+ tunable parameters in the Hadoop framework
         Tuning a single parameter can have an adverse impact on others


Map-Reduce Application Optimization
    Job Performance: user perspective
         Reduce end-to-end execution time
              Yield quicker analysis of user data

    Cluster Utilization: provider perspective
         Service provider makes use of cluster resources across multiple users
         Increase overall throughput in terms of number of jobs per unit time




Approach
    Application Agnostic
         Without instrumentation or modification of user code
         Instrument the underlying Hadoop (map-reduce) runtime framework

    Case Study
         Model Trainer based on MaxEnt
         Iterative Application
              Each iteration is a Map-Reduce job
         Uses Hadoop Streaming
         Written in C++

MaxEnt
    Maximum entropy models are considered promising avenues for text classification
         Long training times make entropy research difficult

    To offset this, the trainer uses the Hadoop Map-Reduce programming framework to speed execution
         200 machines, 16-18 hours runtime




MaxEnt (contd)
    MaxEnt uses Hadoop Streaming

    C++ binaries as Mappers and Reducers

    File-based splits
         Large "mapred.min.split.size" to allow file-size splits
                Results in 5.9k maps, each map operating on 230 MB of idx data
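
Why a large minimum split size yields one map per file can be seen from the split-size rule used by Hadoop's file input formats, which takes the larger of the minimum split size and the smaller of the goal size and block size. The sketch below is a simplified illustration: the 128 MB block size is an assumption (the slides do not give it), and the split count ignores any slack the framework allows on the last split.

```python
import math

MB = 1024 * 1024


def compute_split_size(goal_size: int, min_size: int, block_size: int) -> int:
    """Split size rule: never below min_size, otherwise capped by the block size."""
    return max(min_size, min(goal_size, block_size))


def splits_for_file(file_size: int, split_size: int) -> int:
    """Simplified split count for one file (ignores slack on the last split)."""
    return max(1, math.ceil(file_size / split_size))


file_size = 230 * MB          # from the slide: about 230 MB of idx data per map
block_size = 128 * MB         # assumption: the HDFS block size is not given in the slides

default_split = compute_split_size(file_size, min_size=1, block_size=block_size)
large_min_split = compute_split_size(file_size, min_size=1024 * MB, block_size=block_size)

print(splits_for_file(file_size, default_split))    # block-sized splits
print(splits_for_file(file_size, large_min_split))  # one split for the whole file
```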




MaxEnt baseline
    Optimizations already used
         Speculative execution
         Cache intermediate results in the iteration stages using -cacheFile option
    Establish Baseline
         Benchmark MaxEnt code on dedicated cluster for reproducible results

    Hadoop Version     Time sec (M:R)
    0.17 (100 nodes)   1545 (5900:800)
    0.18 (100 nodes)   1459 (5900:800)




Methodology for Tuning MaxEnt 
    Changing Map/Reduces 

    Reducing Intermediate data  
         Use of combiners 

    Reducing disk spill on map side 

    Compressing map output 

    Reducing Reduce side disk spill 

    Effect of Increasing Map Slots 

Changing Number of Maps/Reduces
    Optimal number of Maps and Reduces
       Typically a larger number of maps is better
          Low map re-execution time in case of failure
          Computation to communication overlap is better
       Typically 0.95 times the total reduce slots (95% of them) is a good number for reduce tasks
          If map local o/p is very large (exceeding the local disk capacity)
          Reduce logic needs scale out (cpu bound)
          In this case typically use a larger cluster for the job.
       Shuffle requires M * R transfers

    Hadoop Version   Time sec (M:R)
    0.17             1545 (5900:800)
    0.17             1740 (12,300:380)
    0.18             1459 (5900:800)
    0.18             1197 (5900:380)
    0.18             1255 (5900:200)
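
The two rules of thumb on this slide are easy to check numerically. The sketch below is illustrative: the function names are made up, and the cluster shape (100 nodes, 4 reduce slots per node) is read from other slides in this deck rather than stated here.

```python
def recommended_reduces(nodes: int, reduce_slots_per_node: int, factor: float = 0.95) -> int:
    """Reduce task count as a fraction of the cluster's total reduce slots."""
    return round(nodes * reduce_slots_per_node * factor)


def shuffle_transfers(maps: int, reduces: int) -> int:
    """Every reduce fetches one segment from every map: M * R transfers."""
    return maps * reduces


# Assumed cluster shape: 100 nodes with 4 reduce slots each.
print(recommended_reduces(nodes=100, reduce_slots_per_node=4))

# M:R combinations that appear in the timing table.
for maps, reduces in [(5900, 800), (12300, 380), (5900, 380), (5900, 200)]:
    print(f"{maps}:{reduces} -> {shuffle_transfers(maps, reduces):,} transfers")
```

Under these assumptions the recommended count matches the 380 reduces used in the faster runs, and halving the reduces roughly halves the number of shuffle transfers.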




Use of combiner

    Hadoop Version   Combiner   Time sec (M:R)
    0.17             No         1545 (5900:800)
    0.17             Yes        1256 (5900:800)
    0.18             No         1197 (5900:380)
    0.18             Yes        934 (5900:380)




Combiner Efficiency
    Significant gain is obtained by using the combiner at both the map and reduce stages in the feature count application

    Combiner Efficiency = [1 - (Combine O/P Records) / (Combine I/P Records)] * 100
         Combiner Efficiency at map stage is around 40%
         Combiner Efficiency at reduce stage is around 22%
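
The efficiency formula can be tried on a toy feature-count combiner. This is a minimal sketch: the `combine` function and the sample records are hypothetical, chosen only to make the arithmetic visible, and are not the MaxEnt code.

```python
from collections import Counter


def combine(records):
    """Sum-style combiner: merge (key, count) pairs that share a key."""
    totals = Counter()
    for key, count in records:
        totals[key] += count
    return list(totals.items())


def combiner_efficiency(input_records: int, output_records: int) -> float:
    """[1 - (combine output records / combine input records)] * 100."""
    return (1 - output_records / input_records) * 100


# Hypothetical map output: five records over three distinct features.
map_output = [("f1", 1), ("f2", 1), ("f1", 1), ("f3", 1), ("f1", 1)]
combined = combine(map_output)
efficiency = combiner_efficiency(len(map_output), len(combined))
print(combined)
print(f"combiner efficiency: {efficiency:.0f}%")
```

On a real job the two record counts would come from the job's combine input and output record counters rather than from list lengths.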



Avoiding Map side Disk spill
    P = Partition
    KS = Key Start, VS = Value Start
    Sort buffer is a byte array which wraps around
    Partition and Index buffers maintain the location of key and value for records, for in-memory quick sort
    Data is spilled to disk when buffer usage exceeds io.sort.spill.percent
         For both the index and data buffers

    [Figure: sort buffers of Map Task M0. Index and partition buffers (int arrays holding P, KS, VS): io.sort.record.percent * io.sort.mb. Sort buffer (wrap-around byte array), spill threshold: io.sort.mb * io.sort.spill.percent. Total size: io.sort.mb. Input from HDFS, output to local partition.]
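
The interplay of the three parameters can be sketched as a small calculation. This is a rough model under stated assumptions, not the framework's code: it assumes 16 bytes of index/partition bookkeeping per record, applies the spill percentage separately to the data part and the index part, and uses 0.05 and 0.80 as example values for io.sort.record.percent and io.sort.spill.percent.

```python
RECORD_METADATA_BYTES = 16   # assumption: bookkeeping bytes kept per record


def sort_buffer_layout(io_sort_mb: int, record_percent: float, spill_percent: float) -> dict:
    """Split io.sort.mb into an index part and a data part, with spill thresholds."""
    total = io_sort_mb * 1024 * 1024
    index_bytes = int(total * record_percent)
    index_bytes -= index_bytes % RECORD_METADATA_BYTES
    data_bytes = total - index_bytes
    max_records = index_bytes // RECORD_METADATA_BYTES
    return {
        "data_bytes": data_bytes,
        "max_records": max_records,
        "data_spill_at": int(data_bytes * spill_percent),
        "record_spill_at": int(max_records * spill_percent),
    }


def spill_trigger(layout: dict, avg_record_bytes: int) -> str:
    """Which limit is reached first for records of a given average size."""
    records_until_data_limit = layout["data_spill_at"] // avg_record_bytes
    if records_until_data_limit <= layout["record_spill_at"]:
        return "data buffer"
    return "index buffer"


for mb in (256, 350):    # the two io.sort.mb values compared in this deck
    layout = sort_buffer_layout(mb, record_percent=0.05, spill_percent=0.80)
    print(mb, layout, spill_trigger(layout, avg_record_bytes=100))
```

In this model, jobs with many small records hit the index limit long before the data buffer fills, which is why the record percentage matters as much as the overall buffer size.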



Reducing disk spill on map side
    Hadoop Version   io.sort.mb   Time sec (M:R)
    0.18             256          934 (5900:380)
    0.18             350          921 (5900:380)




Analyzing effect of minimizing disk spill

    Time sec (M:R)   Map Local Bytes   Map Local Bytes   Map HDFS Bytes   Avg. Map   Avg. Reduce
                     Read (TB)         Written (TB)      Read (TB)        Time sec   Time sec
    934 (5900:380)   1.6               2.6               1.29             42         872
    921 (5900:380)   0                 0.789             1.29             39         853




Effect of compressing map output
    Hadoop Version   Map Output Compression   Time sec (M:R)
    0.18             No                       921 (5900:380)
    0.18             Yes                      1308 (5900:380)




Reducing Reduce side disk spill
    Reduce side disk spill is caused when merging map output causes the Hadoop framework to write merges to local disk

    Reduce side disk spill is the difference between total map output bytes (MOB) for all maps and total reducer local bytes written (RLBW)

    2 parameters determine reduce side disk spill
         fs.inmemory.size.mb determines the size of the merge buffer allocated by the framework
         io.sort.factor determines the number of segments to keep in memory before writing to disk




Effect of Minimizing Reduce side disk spill
    fs.inmemory.size.mb = 350MB and io.sort.factor = 380
       Larger amount of memory allocated for the in-memory file system used to merge map outputs at the reduces
       io.sort.factor was increased to keep more segments in memory before merging

    Hadoop Version   Time sec (M:R)
    0.18             1308 (5900:380)
    0.18             834 (5900:380)
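
Because MaxEnt runs through Hadoop Streaming, settings like these would typically be passed on the command line with -jobconf. The sketch below only assembles such a command. The jar name, mapper and reducer binaries, and input and output paths are hypothetical placeholders; the parameter values are the ones quoted in this deck.

```python
import shlex


def streaming_command(jar, mapper, reducer, input_path, output_path, jobconf):
    """Build a Hadoop Streaming command line with one -jobconf per setting."""
    cmd = ["hadoop", "jar", jar,
           "-input", input_path, "-output", output_path,
           "-mapper", mapper, "-reducer", reducer]
    for key, value in sorted(jobconf.items()):
        cmd += ["-jobconf", f"{key}={value}"]
    return cmd


# Values quoted in the slides; all file names and paths are hypothetical.
tuned = {
    "mapred.reduce.tasks": 380,
    "io.sort.mb": 350,
    "fs.inmemory.size.mb": 350,
    "io.sort.factor": 380,
}
cmd = streaming_command("hadoop-streaming.jar", "./maxent_mapper", "./maxent_reducer",
                        "idx_input", "model_output", tuned)
print(shlex.join(cmd))
```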




Increasing number of map and reduce slots
    Map-reduce computations normally require
         A larger number of maps than reduces

    Increase the number of map slots depending on the nature of the map computations
         If the map is CPU intensive, increasing map slots could slow down map tasks
         If the map is I/O intensive, increasing map slots could increase local disk contention

    Caveat
         Consider resources used by the task tracker and data nodes when increasing map or reduce slots


Opt 6: Effect of Increasing Map Slots
    Hadoop Version   Map Slots + Reduce Slots   Time sec (M:R)
    0.18             4+4                        834 (5900:380)
    0.18             7+4                        732 (5900:380)




Conclusions
    Application agnostic tuning was beneficial for the MaxEnt application
         Runtime reduced from 1459 seconds to 732 seconds
         Total time spent: 2 days

    Optimizations may not be additive
         Tuning one parameter can have an adverse impact on others
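
The non-additive behaviour shows up when the Hadoop 0.18 timings reported across the deck are laid out in slide order. The sketch below only tabulates those reported numbers. The step labels are paraphrased, and the slides do not state which earlier settings each run kept, so the per-step deltas should be read as indicative.

```python
# Hadoop 0.18 timings in seconds, in the order the slides report them.
steps = [
    ("baseline (5900:800)", 1459),
    ("380 reduces", 1197),
    ("combiner", 934),
    ("io.sort.mb = 350", 921),
    ("map output compression", 1308),
    ("reduce-side spill tuning", 834),
    ("7+4 map+reduce slots", 732),
]


def step_deltas(steps):
    """Change in runtime contributed by each step relative to the previous one."""
    return [(name, cur - prev) for (_, prev), (name, cur) in zip(steps, steps[1:])]


deltas = step_deltas(steps)
for name, delta in deltas:
    print(f"{name:28s} {delta:+5d} s")

speedup = steps[0][1] / steps[-1][1]
print(f"overall speedup: {speedup:.2f}x")
```

One step makes the job slower rather than faster, so the final gain is not simply the sum of independent wins.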




Vaidya 
    Hadoop Vaidya (Vaidya in Sanskrit language means "one who knows", or "a 
     physician") 




Hadoop Vaidya: Rule based performance diagnostics
    Rule based performance diagnosis of M/R jobs
          M/R performance analysis expertise is captured and provided as an input through a set of pre-defined diagnostic rules
          Detects performance problems by postmortem analysis of a job by executing the diagnostic rules against the job execution counters
          Provides targeted advice against individual performance problems

     Extensible framework
          More hints can be added, based on a hint template and published job counters data structures
          Write complex hints using existing simpler hints

Hadoop Vaidya
    Input data used for evaluating the rules
          Job History, Job Configuration (xml)

     Patch committed: contrib/vaidya (Hadoop 0.20.0)
         http://issues.apache.org/jira/browse/HADOOP-4179

     Future enhancements
          Online progress analysis of the Map/Reduce jobs
          Adapter for Chukwa: performance data collection system



Diagnostic Test Rule
<DiagnosticTest>
    <Title>Balanaced Reduce Partitioning</Title>
    <ClassName>org.apache.hadoop.chukwa.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
    <Description>This rule tests as to how well the input to reduce tasks is balanced</Description>
    <Importance>High</Importance>
    <SuccessThreshold>0.20</SuccessThreshold>
    <Prescription>advice</Prescription>
    <InputElement>
        <PercentReduceRecords>0.85</PercentReduceRecords>
    </InputElement>
</DiagnosticTest>
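
A rule description like this is plain XML, so its fields can be inspected with any XML parser. The sketch below uses Python's standard library to read a trimmed copy of the rule. The `load_rule` helper and its dictionary layout are illustrative and are not how Vaidya itself loads rules.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the rule from the slide (the title is reproduced verbatim).
RULE_XML = """
<DiagnosticTest>
    <Title>Balanaced Reduce Partitioning</Title>
    <ClassName>org.apache.hadoop.chukwa.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
    <Importance>High</Importance>
    <SuccessThreshold>0.20</SuccessThreshold>
    <InputElement>
        <PercentReduceRecords>0.85</PercentReduceRecords>
    </InputElement>
</DiagnosticTest>
"""


def load_rule(xml_text: str) -> dict:
    """Read one DiagnosticTest element into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    inputs = {child.tag: child.text.strip() for child in root.find("InputElement")}
    return {
        "title": root.findtext("Title").strip(),
        "class_name": root.findtext("ClassName").strip(),
        "importance": root.findtext("Importance").strip(),
        "success_threshold": float(root.findtext("SuccessThreshold")),
        "inputs": inputs,
    }


rule = load_rule(RULE_XML)
print(rule["title"], rule["success_threshold"], rule["inputs"])
```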




Diagnostic Report
<TestReportElement>	
      <TestTitle>Balanaced Reduce Partitioning</TestTitle>	
      <TestDescription>	
            This rule tests as to how well the input to reduce tasks is balanced	
      </TestDescription>	
      <TestImportance>HIGH</TestImportance>	
      <TestResult>POSITIVE</TestResult>	
      <TestSeverity>0.16</TestSeverity>	
      <ReferenceDetails>	
            * TotalReduceTasks: 4096	
            * BusyReduceTasks processing 0.85% of total records: 3373	
            * Impact: 0.17	
      </ReferenceDetails>	
      <TestPrescription>	
           * Use the appropriate partitioning function	
           * For streaming job consider following partitioner and hadoop config parameters	
              * org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner	
              * -jobconf stream.map.output.field.separator, -jobconf stream.num.map.output.key.fields	
      </TestPrescription>	
  </TestReportElement>	




Executing and Writing your own tests
    Execute $HADOOP_HOME/contrib/vaidya/bin/vaidya.sh
          Job's configuration file: -jobconf job_conf.xml
          Job history log file: -joblog job_history_log_file
          Test description file: -testconf postex_diagnostic_tests.xml
              Default test is available at: $HADOOP_HOME/contrib/vaidya/conf/postex_diagnostic_tests.xml
          Location of the final report file: -report report_file
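
Put together, the options above form a single command line. The sketch below assembles it from the flags listed on this slide. The Hadoop home directory and all file names are hypothetical placeholders, and the command is only printed, not executed.

```python
import posixpath
import shlex


def vaidya_command(hadoop_home, job_conf, job_log, test_conf, report_file):
    """Assemble the vaidya.sh invocation from the four options on the slide."""
    script = posixpath.join(hadoop_home, "contrib", "vaidya", "bin", "vaidya.sh")
    return [script,
            "-jobconf", job_conf,
            "-joblog", job_log,
            "-testconf", test_conf,
            "-report", report_file]


# Hypothetical paths, for illustration only.
cmd = vaidya_command("/opt/hadoop", "job_conf.xml", "job_history_log_file",
                     "my_tests.xml", "report.xml")
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run` on a machine where Hadoop is installed.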

    Writing and executing your own test rules is not very hard
         The test class for your new test case should extend org.apache.hadoop.vaidya.DiagnosticTest
         evaluate()
         getPrescription()
         getReferenceDetails()
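
The shape of such a test can be sketched outside Java. The class below is a Python analogue, not the Vaidya API: it mirrors the three method names from the slide, and its balance check (count how many of the busiest reduce tasks are needed to cover a given share of the records) is an assumption about how a rule of this kind could work. The sample report's numbers are at least roughly compatible with that ratio (1 - 3373/4096 is about 0.176).

```python
class BalancedReducePartitioningSketch:
    """Python analogue of a diagnostic test; not the real Java base class."""

    def __init__(self, reduce_input_records, percent_reduce_records=0.85):
        self.records = list(reduce_input_records)
        self.percent = percent_reduce_records
        self.busy_tasks = 0
        self.impact = 0.0

    def evaluate(self) -> float:
        """Impact is high when few reduce tasks handle most of the records."""
        counts = sorted(self.records, reverse=True)
        target = self.percent * sum(counts)
        covered = 0
        busy = 0
        for count in counts:
            if covered >= target:
                break
            covered += count
            busy += 1
        self.busy_tasks = busy
        self.impact = 1 - busy / len(counts)
        return self.impact

    def get_prescription(self) -> str:
        return "Use the appropriate partitioning function"

    def get_reference_details(self) -> str:
        return (f"TotalReduceTasks: {len(self.records)}, "
                f"BusyReduceTasks: {self.busy_tasks}, Impact: {self.impact:.2f}")


# Hypothetical per-reduce input record counts.
skewed = BalancedReducePartitioningSketch([90, 4, 3, 3])
balanced = BalancedReducePartitioningSketch([25, 25, 25, 25])
print(skewed.evaluate(), skewed.get_reference_details())
print(balanced.evaluate(), balanced.get_reference_details())
```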


Questions?





Hadoop performance optimization tips
 
Ultra Fast SOM using CUDA
Ultra Fast SOM using CUDAUltra Fast SOM using CUDA
Ultra Fast SOM using CUDA
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profile
 
Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)
 
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorGS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
 
mapreduce.pptx
mapreduce.pptxmapreduce.pptx
mapreduce.pptx
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduce
 
MapReduce
MapReduceMapReduce
MapReduce
 
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersMapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters
 
E031201032036
E031201032036E031201032036
E031201032036
 
Map reduce - simplified data processing on large clusters
Map reduce - simplified data processing on large clustersMap reduce - simplified data processing on large clusters
Map reduce - simplified data processing on large clusters
 
MapReduce-Notes.pdf
MapReduce-Notes.pdfMapReduce-Notes.pdf
MapReduce-Notes.pdf
 
Intermachine Parallelism
Intermachine ParallelismIntermachine Parallelism
Intermachine Parallelism
 

Mehr von George Ang

Wrapper induction construct wrappers automatically to extract information f...
Wrapper induction   construct wrappers automatically to extract information f...Wrapper induction   construct wrappers automatically to extract information f...
Wrapper induction construct wrappers automatically to extract information f...George Ang
 
Opinion mining and summarization
Opinion mining and summarizationOpinion mining and summarization
Opinion mining and summarizationGeorge Ang
 
Huffman coding
Huffman codingHuffman coding
Huffman codingGeorge Ang
 
Do not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar textDo not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar textGeorge Ang
 
大规模数据处理的那些事儿
大规模数据处理的那些事儿大规模数据处理的那些事儿
大规模数据处理的那些事儿George Ang
 
腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势George Ang
 
腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程George Ang
 
腾讯大讲堂04 im qq
腾讯大讲堂04 im qq腾讯大讲堂04 im qq
腾讯大讲堂04 im qqGeorge Ang
 
腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道George Ang
 
腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化George Ang
 
腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间George Ang
 
腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨George Ang
 
腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站George Ang
 
腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程George Ang
 
腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagement腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagementGeorge Ang
 
腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享George Ang
 
腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍George Ang
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍George Ang
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍George Ang
 
腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享George Ang
 

Mehr von George Ang (20)

Wrapper induction construct wrappers automatically to extract information f...
Wrapper induction   construct wrappers automatically to extract information f...Wrapper induction   construct wrappers automatically to extract information f...
Wrapper induction construct wrappers automatically to extract information f...
 
Opinion mining and summarization
Opinion mining and summarizationOpinion mining and summarization
Opinion mining and summarization
 
Huffman coding
Huffman codingHuffman coding
Huffman coding
 
Do not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar textDo not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar text
 
大规模数据处理的那些事儿
大规模数据处理的那些事儿大规模数据处理的那些事儿
大规模数据处理的那些事儿
 
腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势
 
腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程
 
腾讯大讲堂04 im qq
腾讯大讲堂04 im qq腾讯大讲堂04 im qq
腾讯大讲堂04 im qq
 
腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道
 
腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化
 
腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间
 
腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨
 
腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站
 
腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程
 
腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagement腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagement
 
腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享
 
腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
 
腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享
 

Kürzlich hochgeladen

Vip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts Service
Vip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts ServiceVip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts Service
Vip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts ServiceApsara Of India
 
Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...
Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...
Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...Riya Pathan
 
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...First NO1 World Amil baba in Faisalabad
 
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts ServiceVIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts ServiceApsara Of India
 
Udaipur Call Girls 9602870969 Call Girl in Udaipur Rajasthan
Udaipur Call Girls 9602870969 Call Girl in Udaipur RajasthanUdaipur Call Girls 9602870969 Call Girl in Udaipur Rajasthan
Udaipur Call Girls 9602870969 Call Girl in Udaipur RajasthanApsara Of India
 
Kolkata Call Girls Service +918240919228 - Kolkatanightgirls.com
Kolkata Call Girls Service +918240919228 - Kolkatanightgirls.comKolkata Call Girls Service +918240919228 - Kolkatanightgirls.com
Kolkata Call Girls Service +918240919228 - Kolkatanightgirls.comKolkata Call Girls
 
Hi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort Services
Hi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort ServicesHi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort Services
Hi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort ServicesApsara Of India
 
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzersQUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzersSJU Quizzers
 
High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...
High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...
High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...Riya Pathan
 
Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7
Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7
Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7Riya Pathan
 
Call Girls Near Delhi Pride Hotel New Delhi 9873777170
Call Girls Near Delhi Pride Hotel New Delhi 9873777170Call Girls Near Delhi Pride Hotel New Delhi 9873777170
Call Girls Near Delhi Pride Hotel New Delhi 9873777170Sonam Pathan
 
5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...
5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...
5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...Apsara Of India
 
Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...
Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...
Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...Amil Baba Company
 
5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services
5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services
5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort ServicesApsara Of India
 
Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...
Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...
Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...srsj9000
 
Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170
Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170
Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170Sonam Pathan
 
Hot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtS
Hot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtSHot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtS
Hot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtSApsara Of India
 
Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...
Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...
Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...Amil Baba Company
 
Call Girls Delhi {Safdarjung} 9711199012 high profile service
Call Girls Delhi {Safdarjung} 9711199012 high profile serviceCall Girls Delhi {Safdarjung} 9711199012 high profile service
Call Girls Delhi {Safdarjung} 9711199012 high profile servicerehmti665
 
1681275559_haunting-adeline and hunting.pdf
1681275559_haunting-adeline and hunting.pdf1681275559_haunting-adeline and hunting.pdf
1681275559_haunting-adeline and hunting.pdfTanjirokamado769606
 

Kürzlich hochgeladen (20)

Vip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts Service
Vip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts ServiceVip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts Service
Vip Udaipur Call Girls 9602870969 Dabok Airport Udaipur Escorts Service
 
Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...
Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...
Housewife Call Girls Sonagachi - 8250192130 Booking and charges genuine rate ...
 
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
 
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts ServiceVIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
 
Udaipur Call Girls 9602870969 Call Girl in Udaipur Rajasthan
Udaipur Call Girls 9602870969 Call Girl in Udaipur RajasthanUdaipur Call Girls 9602870969 Call Girl in Udaipur Rajasthan
Udaipur Call Girls 9602870969 Call Girl in Udaipur Rajasthan
 
Kolkata Call Girls Service +918240919228 - Kolkatanightgirls.com
Kolkata Call Girls Service +918240919228 - Kolkatanightgirls.comKolkata Call Girls Service +918240919228 - Kolkatanightgirls.com
Kolkata Call Girls Service +918240919228 - Kolkatanightgirls.com
 
Hi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort Services
Hi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort ServicesHi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort Services
Hi Class Call Girls In Goa 7028418221 Call Girls In Anjuna Beach Escort Services
 
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzersQUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
 
High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...
High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...
High Profile Call Girls Sodepur - 8250192130 Escorts Service with Real Photos...
 
Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7
Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7
Kolkata Call Girl Bagbazar 👉 8250192130 ❣️💯 Available With Room 24×7
 
Call Girls Near Delhi Pride Hotel New Delhi 9873777170
Call Girls Near Delhi Pride Hotel New Delhi 9873777170Call Girls Near Delhi Pride Hotel New Delhi 9873777170
Call Girls Near Delhi Pride Hotel New Delhi 9873777170
 
5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...
5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...
5* Hotel Call Girls In Goa 7028418221 Call Girls In Calangute Beach Escort Se...
 
Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...
Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...
Amil baba in Pakistan amil baba Karachi amil baba in pakistan amil baba in la...
 
5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services
5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services
5* Hotel Call Girls In Goa 7028418221 Call Girls In North Goa Escort Services
 
Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...
Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...
Hifi Laxmi Nagar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ D...
 
Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170
Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170
Call Girls Near Taurus Sarovar Portico Hotel New Delhi 9873777170
 
Hot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtS
Hot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtSHot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtS
Hot Call Girls In Goa 7028418221 Call Girls In Vagator Beach EsCoRtS
 
Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...
Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...
Amil Baba in Pakistan Kala jadu Expert Amil baba Black magic Specialist in Is...
 
Call Girls Delhi {Safdarjung} 9711199012 high profile service
Call Girls Delhi {Safdarjung} 9711199012 high profile serviceCall Girls Delhi {Safdarjung} 9711199012 high profile service
Call Girls Delhi {Safdarjung} 9711199012 high profile service
 
1681275559_haunting-adeline and hunting.pdf
1681275559_haunting-adeline and hunting.pdf1681275559_haunting-adeline and hunting.pdf
1681275559_haunting-adeline and hunting.pdf
 

Berkeley Performance Tuning

  • 1. Hadoop Performance Tuning: A case study
      Milind Bhandarkar (Suhas Gogate) (Viraj Bhat)
  • 2. Do we really need to tune it? (6/29/09)
      Case study: Recommendation Engine
        20 dedicated machines ($160,000 for 3 years)
        3 hours runtime every day
          Needed 15% improvement per year
          Equivalent to 3 additional machines ($24,000 for 3 years)
        4 full-time programmers
          2 months for needed improvement ($120,000)
      Throw more hardware at it!
        Happy ending: 100 machines, 1 hour runtime = $20 per day
          $22,000 for 3 years
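The cost figures on this slide can be checked with a few lines of arithmetic. The sketch below is in Python (the deck itself contains no such code) and uses only the numbers quoted on the slide. The function names are hypothetical, and a 365-day year is an assumption.

```python
def per_machine_cost(total_cost, machines):
    """Cost of one dedicated machine over the same period."""
    return total_cost / machines


def flat_rate_cost(cost_per_day, years, days_per_year=365):
    """Total cost of paying a flat daily rate for a number of years."""
    return cost_per_day * days_per_year * years


# 20 dedicated machines cost $160,000 for 3 years.
machine_3y = per_machine_cost(160_000, 20)

# Three more machines of the same kind, as on the slide.
extra_hardware = 3 * machine_3y

# "Happy ending": $20 per day, every day, for 3 years.
on_demand = flat_rate_cost(20, 3)

print("per machine, 3 years:", machine_3y)
print("3 additional machines:", extra_hardware)
print("$20 per day for 3 years:", on_demand)
```

The last figure comes out slightly under the slide's $22,000, which appears to be a rounded value.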
  • 3. Hadoop
      Highly configurable commodity cluster computing programming framework
        Hides messy details of parallelization and exposes only a sequential map-reduce API to programmers
        Hadoop run-time system takes care of partitioning, scheduling and program execution
          Hints from user
        Allows programmers with little or no experience in parallel programming to write parallel programs
        Exposes various knobs for tuning jobs
      Tuning individual jobs is a non-trivial task
        165+ tunable parameters in the Hadoop framework
        Tuning a single parameter can have an adverse impact on the others
  • 4. Map-Reduce Application Optimization
      Job performance: user perspective
        Reduce end-to-end execution time
          Yield quicker analysis of user data
      Cluster utilization: provider perspective
        Service provider makes use of cluster resources across multiple users
        Increase overall throughput in terms of number of jobs per unit time
  • 5. Approach
      Application agnostic
        Without instrumentation or modification of user code
        Instrument the underlying run-time Hadoop (map-reduce) framework
      Case study
        Model trainer based on MaxEnt
        Iterative application
          Each iteration is a Map-Reduce job
        Uses Hadoop Streaming
          Written in C++
  • 6. MaxEnt
      Maximum entropy models are considered promising avenues for text classification
      Long training times make entropy research difficult
      To offset this, the trainer uses the Hadoop Map-Reduce programming framework to speed execution
      200 machines, 16-18 hours runtime
  • 7. MaxEnt (contd)
      MaxEnt uses Hadoop Streaming
        C++ binaries as mappers and reducers
      File-based splits
        Large "mapred.min.split.size" to allow file-size splits
        Results in 5.9k maps, each map operating on 230 MB of idx data
  • 8. MaxEnt baseline
      Optimizations already used
        Speculative execution
        Cache intermediate results in the iteration stages using the -cacheFile option
      Establish baseline
        Benchmark MaxEnt code on a dedicated cluster for reproducible results

      Hadoop Version   Time sec (M:R)
      0.17             1545 (5900:800) (100 nodes)
      0.18             1459 (5900:800) (100 nodes)
  • 9. Methodology for Tuning MaxEnt
      Changing number of maps/reduces
      Reducing intermediate data
        Use of combiners
      Reducing disk spill on map side
      Compressing map output
      Reducing reduce side disk spill
      Effect of increasing map slots
  • 10. Changing Number of Maps/Reduces
      Optimal number of maps and reduces
        Typically a larger number of maps is better
          Low map re-execution time in case of failure
          Computation to communication overlap is better
        Typically 0.95 times (95% of) the total reduce slots is a good number of reduce tasks
        If map local output is very large (exceeding the local disk capacity)
        Reduce logic needs scale out (CPU bound)
          In this case typically use a larger cluster for the job
        Shuffle requires M * R transfers

      Hadoop Version   Time sec (M:R)
      0.17             1545 (5900:800)
      0.17             1740 (12,300:380)
      0.18             1459 (5900:800)
      0.18             1197 (5900:380)
      0.18             1255 (5900:200)
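The two rules of thumb on this slide (reduce tasks as a fraction of reduce slots, and M * R shuffle transfers) can be written down as a small Python sketch. It reads the slide's 0.95 figure as a fraction of the total reduce slots. The inputs of 100 nodes and 4 reduce slots per node are taken from the baseline and map-slot slides; combining them this way is an assumption, and the function names are hypothetical.

```python
def suggested_reduces(nodes, reduce_slots_per_node, factor=0.95):
    """Number of reduce tasks as a fraction of the cluster's reduce slots."""
    return round(nodes * reduce_slots_per_node * factor)


def shuffle_transfers(maps, reduces):
    """Every reduce fetches one segment from every map: M * R transfers."""
    return maps * reduces


# Assumed cluster shape: 100 nodes with 4 reduce slots each.
print("suggested reduces:", suggested_reduces(100, 4))

# Shuffle fan-in for two of the (M:R) settings in the table.
print("transfers at 5900:800:", shuffle_transfers(5900, 800))
print("transfers at 5900:380:", shuffle_transfers(5900, 380))
```

Under these assumptions the heuristic lands on the 380 reduces used in the faster runs, and dropping from 800 to 380 reduces cuts the number of shuffle transfers by more than half.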
  • 11. Use of combiner

      Hadoop Version   Combiner   Time sec (M:R)
      0.17             No         1545 (5900:800)
      0.17             Yes        1256 (5900:800)
      0.18             No         1197 (5900:380)
      0.18             Yes        934 (5900:380)
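The deck does not show the combiner used by MaxEnt, which runs as C++ binaries under Hadoop Streaming. As a generic illustration only, the Python sketch below shows what a summing combiner does: it merges values for the same key on the map side so that fewer records reach the shuffle. The sample data and function names are hypothetical, and this kind of combiner is only valid when the reduce operation is associative and commutative.

```python
from collections import defaultdict


def map_output():
    """Hypothetical map output: (key, count) pairs with repeated keys."""
    return [("a", 1), ("b", 1), ("a", 1), ("a", 1), ("b", 1)]


def combine(pairs):
    """Sum values per key locally, before anything is sent to a reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


raw = map_output()
combined = combine(raw)
print(len(raw), "records before combining")
print(len(combined), "records after combining:", combined)
```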
  • 13. Avoiding Map side Disk spill
      P = Partition
      KS = Key Start, VS = Value Start
      Sort buffer is a byte array which wraps around
      Partition and index buffers maintain the location of key and value for records, for in-memory quick sort
      Data is spilled to disk when io.sort.record.percent is exceeded
        For both the index and data buffers
      [Figure: Map task M0 and its map sort buffers. Index and partition buffers (int arrays holding P, KS, VS) of size io.sort.record.percent * io.sort.mb; sort buffer (wrap-around byte array) of size io.sort.mb with a spill mark at io.sort.mb * io.sort.spill.percent. Other labels: HDFS, Local, Partition.]
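The buffer split described on this slide can be turned into a rough sizing calculation. The Python sketch below is not Hadoop code: it assumes that each record costs 16 bytes of accounting space (one index int plus the P, KS and VS ints, 4 bytes each) and that the spill threshold applies to both the accounting and the data part. The values 0.05 and 0.80 for io.sort.record.percent and io.sort.spill.percent are assumptions chosen for illustration; 256 is the io.sort.mb value from the next slide.

```python
MB = 1024 * 1024


def sort_buffer_layout(io_sort_mb, record_percent, spill_percent,
                       bytes_per_record=16):
    """Split io.sort.mb into accounting and data space and find spill points.

    bytes_per_record is an assumption: one index int plus the P, KS and VS
    ints shown on the slide, at 4 bytes each.
    """
    total = io_sort_mb * MB
    accounting = int(total * record_percent)
    accounting -= accounting % bytes_per_record  # keep whole records only
    data = total - accounting
    max_records = accounting // bytes_per_record
    return {
        "data_bytes": data,
        "max_records": max_records,
        "spill_after_records": int(max_records * spill_percent),
        "spill_after_bytes": int(data * spill_percent),
    }


layout = sort_buffer_layout(io_sort_mb=256, record_percent=0.05,
                            spill_percent=0.80)
for name, value in layout.items():
    print(name, value)
```

A job with many small records hits the record limit first, while a job with few large records hits the byte limit first, which is why the two percentages have to be tuned together with io.sort.mb.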
  • 14. Reducing disk spill on map side

      Hadoop Version   io.sort.mb   Time sec (M:R)
      0.18             256          934 (5900:380)
      0.18             350          921 (5900:380)
  • 15. Analyzing effect of minimizing disk spill

      Time sec (M:R)   Map Local Bytes Read (TB)   Map Local Bytes Written (TB)   Map HDFS Bytes Read (TB)   Avg. Map Time (sec)   Avg. Reduce Time (sec)
      934 (5900:380)   1.6                         2.6                            1.29                       42                    872
      921 (5900:380)   0                           0.789                          1.29                       39                    853
  • 16. Effect of compressing map output

      Hadoop Version   Map Output Compression   Time sec (M:R)
      0.18             No                       921 (5900:380)
      0.18             Yes                      1308 (5900:380)
  • 17. Reducing Reduce side disk spill
      Reduce side disk spill is caused when merging map output causes the Hadoop framework to write merges to local disk
      Reduce side disk spill is the difference between total map output bytes (MOB) for all maps and total reducer local bytes written (RLBW)
      2 parameters determine reduce side disk spill
        fs.inmemory.size.mb determines the size of the merge buffer allocated by the framework
        io.sort.factor determines the number of segments to keep in memory before writing to disk
  • 18. Effect of Minimizing Reduce side disk spill
      fs.inmemory.size.mb = 350MB and io.sort.factor = 380
        Larger amount of memory allocated for the in-memory file system used to merge map outputs at the reduces
        io.sort.factor was increased to keep more segments in memory before merging

      Hadoop Version   Time sec (M:R)
      0.18             1308 (5900:380)
      0.18             834 (5900:380)
  • 19. Increasing number of map and reduce slots
      Map-reduce computations normally require
        A larger number of maps than reduces
      Increase the number of map slots depending on the nature of the map computations
        If the map is CPU intensive, increasing map slots could slow down map tasks
        If the map is I/O intensive, increasing map slots could increase local disk contention
      Caveat
        Consider resources used by task tracker and data nodes when increasing map or reduce slots
  • 20. Opt 6: Effect of Increasing Map Slots

      Hadoop Version   Map Slots + Reduce Slots   Time sec (M:R)
      0.18             4+4                        834 (5900:380)
      0.18             7+4                        732 (5900:380)
  • 21. Conclusions
      Application agnostic tuning was beneficial for the MaxEnt application
        Runtime reduced from 1459 seconds to 732 seconds
        Total time spent: 2 days
      Optimizations may not be additive
        Tuning one parameter can have an adverse impact on the others
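The runtimes reported across the deck can be lined up to see how much each step contributed. The Python sketch below uses only the Hadoop 0.18 timings quoted on the slides. Treating them as a single linear chain is a simplification, and map output compression (1308 seconds) is left out because it made the run slower. The step labels are shorthand, not names from the deck.

```python
# (step, runtime in seconds) in the order the deck applies them on Hadoop 0.18.
steps = [
    ("baseline, 800 reduces", 1459),
    ("380 reduces", 1197),
    ("combiner", 934),
    ("io.sort.mb = 350", 921),
    ("reduce side spill tuning", 834),
    ("7+4 map and reduce slots", 732),
]


def improvements(steps):
    """Percent of runtime saved by each step relative to the previous one."""
    out = []
    for (_, before), (name, after) in zip(steps, steps[1:]):
        out.append((name, 100.0 * (before - after) / before))
    return out


for name, pct in improvements(steps):
    print(f"{name}: {pct:.1f}% less runtime than the previous step")

total = 100.0 * (steps[0][1] - steps[-1][1]) / steps[0][1]
print(f"overall: {total:.1f}% less runtime")
```

The per-step savings differ widely, which fits the slide's point that the gains are not simply additive.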
  • 22. Vaidya
      Hadoop Vaidya (Vaidya in the Sanskrit language means "one who knows", or "a physician")
  • 23. Hadoop Vaidya: Rule-based performance diagnostics
      Rule-based performance diagnosis of M/R jobs
        M/R performance analysis expertise is captured and provided as an input through a set of pre-defined diagnostic rules
        Detects performance problems by postmortem analysis of a job, executing the diagnostic rules against the job execution counters
        Provides targeted advice against individual performance problems
      Extensible framework
        Can add more and more hints
          Based on a hint template and published job counters data structures
        Write complex hints using existing simpler hints
  • 24. Hadoop Vaidya
      Input data used for evaluating the rules
        Job History, Job Configuration (xml)
      Patch committed: contrib/vaidya (Hadoop 0.20.0)
        http://issues.apache.org/jira/browse/HADOOP-4179
      Future enhancements
        Online progress analysis of the Map/Reduce jobs
        Adapter for Chukwa: performance data collection system
  • 25. Diagnostic Test Rule
      <DiagnosticTest>
        <Title>Balanaced Reduce Partitioning</Title>
        <ClassName>org.apache.hadoop.chukwa.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
        <Description>This rule tests as to how well the input to reduce tasks is balanced</Description>
        <Importance>High</Importance>
        <SuccessThreshold>0.20</SuccessThreshold>
        <Prescription>advice</Prescription>
        <InputElement>
          <PercentReduceRecords>0.85</PercentReduceRecords>
        </InputElement>
      </DiagnosticTest>
  • 26. Diagnostic Report
      <TestReportElement>
        <TestTitle>Balanaced Reduce Partitioning</TestTitle>
        <TestDescription>
          This rule tests as to how well the input to reduce tasks is balanced
        </TestDescription>
        <TestImportance>HIGH</TestImportance>
        <TestResult>POSITIVE</TestResult>
        <TestSeverity>0.16</TestSeverity>
        <ReferenceDetails>
          * TotalReduceTasks: 4096
          * BusyReduceTasks processing 0.85% of total records: 3373
          * Impact: 0.17
        </ReferenceDetails>
        <TestPrescription>
          * Use the appropriate partitioning function
          * For streaming job consider following partitioner and hadoop config parameters
          * org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
          * -jobconf stream.map.output.field.separator, -jobconf stream.num.map.output.key.fields
        </TestPrescription>
      </TestReportElement>
  • 27. Executing and Writing your own tests
      Execute $HADOOP_HOME/contrib/vaidya/bin/vaidya.sh
        Job's configuration file: -jobconf job_conf.xml
        Job history log file: -joblog job_history_log_file
        Test description file: -testconf postex_diagnostic_tests.xml
          Default test is available at: $HADOOP_HOME/contrib/vaidya/conf/postex_diagnostic_tests.xml
        Location of the final report file: -report report_file
      Writing and executing your own test rules is not very hard
        The test class for your new test case should extend org.apache.hadoop.vaidya.DiagnosticTest
          evaluate()
          getPrescription()
          getReferenceDetails()
  • 28. Questions?