Map-Reduce Programming & Best Practices
Apache Hadoop India Summit 2011
Basant Verma, Yahoo! India R&D
February 16, 2011

Hadoop Components
- HDFS (Hadoop Distributed File System)
  - Modeled on GFS
  - Reliable, high-bandwidth file system that can store terabytes and petabytes of data
- Map-Reduce
  - Uses the map/reduce metaphor from the Lisp language
  - A distributed processing framework that processes the data stored on HDFS as key-value pairs
- Diagram labels: Client 1, Client 2, Processing Framework, DFS
Word Count DataFlow

Word Count
    $ cat ~/wikipedia.txt | sed -e 's/ /\n/g' | grep . | sort | uniq -c > ~/frequencies.txt

MR for Word-Count
    mapper (filename, file-contents):
        for each word in file-contents:
            emit (word, 1)

    reducer (word, values[]):
        sum = 0
        for each value in values:
            sum = sum + value
        emit (word, sum)
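
The mapper/reducer pseudocode can be tried out locally with a minimal Python sketch. This is not Hadoop code: the `run_job` driver, which groups mapper output by key in memory, is a hypothetical stand-in for the framework's shuffle, and whitespace splitting is an assumed tokenizer.

```python
from collections import defaultdict

def mapper(filename, contents):
    """Emit (word, 1) for every whitespace-separated word."""
    for word in contents.split():
        yield word, 1

def reducer(word, values):
    """Sum the counts collected for one word."""
    yield word, sum(values)

def run_job(inputs):
    """Hypothetical in-memory driver: map, group by key, then reduce."""
    groups = defaultdict(list)
    for filename, contents in inputs.items():
        for key, value in mapper(filename, contents):
            groups[key].append(value)
    result = {}
    for key in sorted(groups):  # visit keys in sorted order, as a reducer would
        for out_key, out_value in reducer(key, groups[key]):
            result[out_key] = out_value
    return result

counts = run_job({"a.txt": "to be or not to be", "b.txt": "be quick"})
print(counts)
```

In a real job the grouping step is spread across machines; the sketch only shows the contract between the two functions.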
MR Dataflow
MapReduce Pipeline
Pipeline Details

Available tweaks and optimizations!
- Input to Maps
- Map only jobs
- Combiner
- Compression
- Speculation
- Fault Tolerance
- Buffer Size
- Parallelism (threads)
- Partitioner
- Reporter
- DistributedCache
- Task child environment settings

Input to Map
- Maps should process a significant amount of data to minimize the effect of per-task overhead.
- Process multiple files per map for jobs with a very large number of small input files.
- Process large chunks of data for large-scale processing.
- Use as few maps as possible to process data in parallel, without creating bad failure-recovery cases.
- Unless the application's maps are heavily CPU bound, there is almost no reason to ever require more than 60,000-70,000 maps for a single application.
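
To make "multiple files per map" concrete, here is a rough Python sketch that greedily packs small files into splits of a target size. The function name `pack_files`, the 128 MB target, and the file sizes are all assumptions for illustration. Hadoop's CombineFileInputFormat serves a similar purpose, but this sketch does not reproduce its algorithm.

```python
def pack_files(file_sizes_mb, target_split_mb=128):
    """Greedily group files, in the given order, into splits of at most target_split_mb.

    A file larger than the target ends up in a split of its own.
    """
    splits, current, current_size = [], [], 0
    for name, size in file_sizes_mb:
        if current and current_size + size > target_split_mb:
            splits.append(current)          # close the split that is full
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        splits.append(current)
    return splits

files = [("part-%d" % i, 16) for i in range(20)]  # 20 files of 16 MB each
splits = pack_files(files)
print(len(files), "files ->", len(splits), "maps")
```

With one map per file this input would need 20 maps; packed this way it needs far fewer, each doing more work per unit of startup overhead.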

Map only jobs
- Run a map-only job once to generate data.
- Run multiple jobs with different reduce implementations on the generated data.
- Map-only jobs write directly to HDFS.

Combiner
- Provides map-side aggregation of data.
- Not every record emitted by the Mapper needs to be shipped to the reducers.
- The reduce code can be used as the combiner. Example: word count!
- Helps reduce network traffic for the shuffle.
- Results in less disk space usage.
- However, it is important to ensure that the Combiner really does provide sufficient aggregation.
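
A small Python sketch of what map-side aggregation buys for word count. The function names and the sample sentence are made up for illustration; the point is only that summing is associative and commutative, so partial sums on the map side do not change the final result.

```python
from collections import Counter

def map_words(text):
    """Mapper output without a combiner: one (word, 1) record per word."""
    return [(word, 1) for word in text.split()]

def combine(records):
    """Map-side aggregation: sum the values per key before the shuffle."""
    totals = Counter()
    for word, count in records:
        totals[word] += count
    return sorted(totals.items())

raw = map_words("the cat and the hat and the bat")
combined = combine(raw)
print(len(raw), "records before,", len(combined), "after")
```

How much this helps depends on how often keys repeat within a single map's output. If nearly every key is unique, the combiner adds work without shrinking the shuffle.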

Compression
- Map and Reduce outputs can be compressed.
- Compressing intermediate data helps reduce disk usage and network I/O.
- Compression helps reduce the total data size on the DFS.
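
As a rough illustration of why intermediate data compresses well, the Python sketch below compresses a batch of short, repetitive key-value records. The record layout is invented, and `zlib` is used only because it ships with Python; it says nothing about which codecs a given cluster has configured.

```python
import zlib

# Assumed stand-in for map output: many short, repetitive key-value records.
records = "".join("word%d\t1\n" % (i % 50) for i in range(10000)).encode("utf-8")
compressed = zlib.compress(records)

print(len(records), "bytes raw,", len(compressed), "bytes compressed")

# Compression is lossless: the reducer side gets the same bytes back.
restored = zlib.decompress(compressed)
```

The saving in disk and network I/O is paid for with CPU time on both ends, so the benefit is largest when the job is I/O bound.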

Shuffle
- Shuffle phase performance depends on the traffic across the cross-bar between the map tasks and the reduce tasks, which must be minimized.
- Compression of intermediate output
- Use of Combiner

Reduces
- Configure an appropriate number of reduces.
  - Too few hurt the nodes.
  - Too many hurt the cross-bar.
- All reduces should complete in a single wave.
- Each reduce should process at least 1-2GB of data, and at most 5-10GB of data, in most scenarios.
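
The sizing guidance can be turned into a back-of-the-envelope calculation. The Python sketch below is one possible reading, not a rule from the talk: the function name `suggest_reduces`, the choice of 2 GB and 10 GB as bounds (taken from the upper ends of the ranges above), and the preference for filling the available reduce slots are all assumptions.

```python
import math

def suggest_reduces(shuffle_gb, reduce_slots, min_gb=2.0, max_gb=10.0):
    """Pick a reduce count that fits in one wave when the per-reduce bounds allow it."""
    fewest = max(1, math.ceil(shuffle_gb / max_gb))   # keeps each reduce at <= max_gb
    most = max(1, math.floor(shuffle_gb / min_gb))    # keeps each reduce at >= min_gb
    # Use as many reduces as the slots allow (single wave), within the bounds.
    # If even `fewest` exceeds the slots, more than one wave is unavoidable.
    return max(fewest, min(most, reduce_slots))

for shuffle_gb in (50, 400, 2000):
    n = suggest_reduces(shuffle_gb, reduce_slots=100)
    print(shuffle_gb, "GB ->", n, "reduces,", shuffle_gb / n, "GB each")
```

The last case shows the trade-off named in the checklist: when the data per reducer would be too big, a second wave is preferable to oversized reduces.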

Partitioner
- Distribute data evenly across reduces.
- Uneven distribution will hurt the whole job runtime.
- Default is the hash partitioner: hash(key) % num-reducers
- Why is a custom partitioner needed?
  - Sort
  - WordCount
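
The default rule, hash(key) % num-reducers, can be sketched in Python to see how skewed keys end up on one reducer. Two assumptions are made here: Java's `String.hashCode()` is used as the hash function (real Hadoop keys are Writable types with their own hash codes), and the sign bit is masked off before taking the remainder so the result is never negative. The key list is invented.

```python
from collections import Counter

def java_string_hash(s):
    """Java's String.hashCode() as a signed 32-bit integer (ASCII input assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h

def hash_partition(key, num_reducers):
    """Mask off the sign bit, then take the remainder."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers

def partition_sizes(keys, num_reducers):
    """Count how many records land on each reducer."""
    sizes = Counter(hash_partition(k, num_reducers) for k in keys)
    return [sizes.get(r, 0) for r in range(num_reducers)]

# One very frequent key plus ten rare ones.
keys = ["the"] * 90 + ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
print(partition_sizes(keys, 4))
```

All records for one key must go to the same reducer, so a single hot key makes one partition much larger than the rest, and the job finishes only when that reducer does.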

Output
- Output to a few large files, with each file spanning multiple HDFS blocks and appropriately compressed.
- The number of output artifacts is linearly proportional to the number of configured reduces.
- Compress outputs.
- Use appropriate file formats for output. E.g. a compressed text file is not a great idea unless it uses a splittable codec.
- Consider using Hadoop ARchive (HAR) to reduce namespace usage.
- Think of the consumers of your data-set.

Speculation
- Slow-running tasks can be speculated.
- Slowness is determined by the expected time the task will take to complete.
- Speculation kicks in only when there are no pending tasks.
- The total number of tasks that can be speculated for a job is capped to reduce wastage.

Fault Tolerance
- Data is stored as blocks on separate nodes.
- Nodes are composed of cheap commodity hardware.
- Tasks are independent of each other.
- New tasks can be scheduled on new nodes.
- The JobTracker tries a task 4 times (default) before giving up.
- A job can be configured to tolerate task failures up to N% of the total tasks.
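
The retry and tolerance rules can be pictured with a toy Python simulation. Everything here is hypothetical scaffolding (`run_with_retries`, `job_succeeds`, the `flaky` task factory); only the two numbers come from the slide: 4 attempts by default, and a configurable percentage of tasks that may fail.

```python
def run_with_retries(task, max_attempts=4):
    """Run task(attempt) until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return True, task(attempt)
        except Exception:
            continue  # the framework would reschedule, possibly on another node
    return False, None

def job_succeeds(task_results, max_failures_percent=0):
    """A job tolerates failed tasks up to max_failures_percent of all tasks."""
    failed = sum(1 for ok, _ in task_results if not ok)
    return failed * 100 <= max_failures_percent * len(task_results)

def flaky(fail_until):
    """Hypothetical task that fails on every attempt before fail_until."""
    def task(attempt):
        if attempt < fail_until:
            raise RuntimeError("simulated node failure")
        return attempt
    return task

results = [run_with_retries(flaky(n)) for n in (1, 3, 4, 5)]
print(results, job_succeeds(results), job_succeeds(results, 25))
```

Retrying is safe only because tasks are independent of each other and free of side effects that a second attempt would repeat.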

Reporter
- Used to report progress to the parent processes.
- Commonly used when the tasks try to:
  - Connect to a remote application such as a web service or database
  - Do some disk-intensive computation
  - Get blocked on some event
- One can also spawn a thread and make it report progress periodically.
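
The "spawn a thread" idea looks roughly like this in Python. It is a sketch of the pattern only: `run_with_heartbeat` and its `report` callback are hypothetical names, and in a Hadoop task the callback would be whatever progress call the Reporter provides.

```python
import threading
import time

def run_with_heartbeat(work, report, interval=0.01):
    """Run work() while a background thread calls report() periodically."""
    done = threading.Event()

    def heartbeat():
        report()                          # report once right away
        while not done.wait(interval):    # wait() returns True once done is set
            report()

    thread = threading.Thread(target=heartbeat, daemon=True)
    thread.start()
    try:
        return work()
    finally:
        done.set()       # stop the heartbeat even if work() raised
        thread.join()

beats = []
result = run_with_heartbeat(lambda: (time.sleep(0.05), "finished")[1],
                            lambda: beats.append(time.time()))
print(result, len(beats))
```

One caution with this design: the heartbeat proves only that the thread is alive, so a task that is truly hung will keep reporting and will not be timed out.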

Distributed Cache
- Efficient distribution of read-only files for applications.
- Localized automatically once the task is scheduled on the slave node.
- Cleaned up once no task running on the slave needs the cache files.
- Designed for a small number of mid-size files.
- Artifacts in the distributed cache should not require more I/O than the actual input to the application tasks.

Few tips for better performance
- Increase the memory/buffer allocated to the tasks (io.sort.mb).
- Increase the number of tasks that can be run in parallel.
- Increase the number of threads that serve the map outputs.
- Disable unnecessary logging.
- Find the optimal value of the DFS block size.
- Share the cluster between the DFS and MR for data locality.
- Turn on speculation.
- Run reducers in one wave, as they can be really costly.
- Make proper use of DistributedCache.

Anti-Patterns
- Processing thousands of small files (sized less than 1 HDFS block, typically 128MB) with one map processing a single small file.
- Processing very large data-sets with a small HDFS block size (i.e. 128MB), resulting in tens of thousands of maps.
- Applications with a large number (thousands) of maps with a very small runtime (e.g. 5s).
- Straightforward aggregations without the use of the Combiner.
- Applications with more than 60,000-70,000 maps.
- Applications processing large data-sets with very few reduces (such as 1).
- Applications using a single reduce for total order among the output records.
- Pig scripts processing large data-sets without using the PARALLEL keyword.

Anti-Patterns (Cont…)
- Applications processing data with a very large number of reduces, such that each reduce processes less than 1-2GB of data.
- Applications writing out multiple, small output files from each reduce.
- Applications using the DistributedCache to distribute a large number of artifacts and/or very large artifacts (hundreds of MBs each).
- Applications using more than 25 counters per task.
- Applications performing metadata operations (e.g. listStatus) on the file-system from the map/reduce tasks.
- Applications doing screen-scraping of the JobTracker web UI for the status of queues/jobs or, worse, the job history of completed jobs.
- Workflows comprising hundreds of small jobs processing small amounts of data with a very high job submission rate.

Debugging
- Side-effect files: write to external files from M/R code.
- Web UI: the web UI shows stdout/stderr.
- Isolation Runner: run the task on the tracker where the task failed. Switch to the workspace of the task and run IsolationRunner.
- Debug scripts: upload the script to the DFS, create a symlink, and pass the script in the conf file. One common use is to filter out exceptions from the logs/stderr/stdout.
- LocalJobRunner: runs a MapReduce job on the local node. It can be used for faster debugging and proof-of-concept work.

Task child environment settings
- The child task inherits the environment of the parent TaskTracker.
- The user can specify additional options to the child JVM via mapred.child.java.opts.
- An example showing multiple arguments and substitutions. It:
  - enables JVM GC logging
  - starts a passwordless JVM JMX agent so that it can connect with jconsole and get thread dumps
  - sets the maximum heap size of the child JVM to 512MB
  - adds an additional path to the java.library.path of the child JVM

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx512M -Djava.library.path=/home/mycompany/lib -verbose:gc
             -Xloggc:/tmp/@taskid@.gc
             -Dcom.sun.management.jmxremote.authenticate=false
             -Dcom.sun.management.jmxremote.ssl=false
      </value>
    </property>

Checklist
- Are your partitions uniform?
- Can you combine records at the map side?
- Are maps reading a DFS block's worth of data?
- Are you running a single reduce wave (unless the data size per reducer is too big)?
- Have you tried compressing intermediate data and final data?
- Are your buffer sizes large enough to minimize spills but small enough to stay clear of swapping?
- Do you see unexplained "long tails"? (These can be mitigated via speculative execution.)
- Are you keeping your cores busy? (via slot configuration)
- Is at least one system resource being loaded?

Editor's notes

  1. >90% of map tasks are data local. 10X gain with the use of Combiner.
  2. Check the speculation formula and update.
  3. IsolationRunner is intended to facilitate debugging by re-running a specific task, given left-over task files for a (typically failed) past job. Currently, it is limited to re-running map tasks. Property: mapreduce.task.files.preserve.failedtasks