Shell Script Rewrite Overview
Allen Wittenauer
Twitter: @_a__w_ (1 a 2 w 1)
Email: aw @ apache.org

What is the shell code?
    bin/*
    etc/hadoop/*sh
    libexec/*
    sbin/*
CUTTING, DOUG
1710 554 6239
2005
APACHE SOFTWARE FOUNDATION
https://www.flickr.com/photos/new_and_used_tires/6549497793/
https://www.flickr.com/photos/hkuchera/5084213883
https://www.flickr.com/photos/83633410@N07/7658225516/
“[The scripts] finally got to
you, didn’t they?”
Primary Goals
    Consistency
    Code and Config Simplification
    De-clash Parameters
    Documentation

Secondary Goals
    Backward Compatibility
    “Lost” Ideas and Fixes
https://www.flickr.com/photos/k6mmc/2176537668/

Tuesday, August 19, 2014 majority committed into trunk:

... followed by many fixes & enhancements from the community
https://www.flickr.com/photos/ifindkarma/9304374538/
https://www.flickr.com/photos/liveandrock/2650732780/

Old:
    hadoop -> hadoop-config.sh -> hadoop-env.sh
    yarn   -> yarn-config.sh   -> yarn-env.sh
    hdfs   -> hdfs-config.sh   -> hadoop-env.sh

New:
    hadoop -> hadoop-config.sh -> hadoop-functions.sh
                               -> hadoop-env.sh
    yarn   -> yarn-config.sh   -> hadoop-config.sh -> (above)
                               -> yarn-env.sh
    hdfs   -> hdfs-config.sh   -> hadoop-config.sh -> (above)
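A minimal sketch of the layered "New" chain, using throwaway files in a temp directory. The file contents and every sketch_* / SKETCH_* name are hypothetical; only the order of sourcing follows the slide.

```shell
#!/usr/bin/env bash
# Sketch of the layered sourcing chain using throwaway files.
# All sketch_* / SKETCH_* names and file contents are hypothetical.
SKETCH_LIBEXEC="$(mktemp -d)"

cat > "${SKETCH_LIBEXEC}/hadoop-functions.sh" <<'EOF'
sketch_debug() { echo "DEBUG: $*"; }
EOF

cat > "${SKETCH_LIBEXEC}/hadoop-env.sh" <<'EOF'
SKETCH_COMMON_SETTING="set once in hadoop-env.sh"
EOF

cat > "${SKETCH_LIBEXEC}/yarn-env.sh" <<'EOF'
SKETCH_YARN_SETTING="set in yarn-env.sh"
EOF

cat > "${SKETCH_LIBEXEC}/hadoop-config.sh" <<'EOF'
# Shared bootstrap: load the function library first, then site settings.
. "${SKETCH_LIBEXEC}/hadoop-functions.sh"
. "${SKETCH_LIBEXEC}/hadoop-env.sh"
EOF

cat > "${SKETCH_LIBEXEC}/yarn-config.sh" <<'EOF'
# Project bootstrap: reuse the shared chain, then add project settings.
. "${SKETCH_LIBEXEC}/hadoop-config.sh"
. "${SKETCH_LIBEXEC}/yarn-env.sh"
EOF

# What a "yarn" entry point would do before running a subcommand.
. "${SKETCH_LIBEXEC}/yarn-config.sh"
sketch_debug "common: ${SKETCH_COMMON_SETTING}"
sketch_debug "yarn: ${SKETCH_YARN_SETTING}"
```

Every entry point passes through the same shared bootstrap, so common settings only need to live in one file.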
Old:
    yarn-env.sh:
        JAVA_HOME=xyz
    hadoop-env.sh:
        JAVA_HOME=xyz
    mapred-env.sh:
        JAVA_HOME=xyz

New:
    hadoop-env.sh:
        JAVA_HOME=xyz
    OS X:
        JAVA_HOME=$(/usr/libexec/java_home)
Old:
    xyz_OPT="-Xmx4g" hdfs namenode
        java … -Xmx1000 … -Xmx4g …

    Command line size: ~2500 bytes

New:
    xyz_OPT="-Xmx4g" hdfs namenode
        java … -Xmx4g …

    Command line size: ~1750 bytes
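One way the duplicate option could be avoided, as a hedged sketch: append a default only when nothing matching is already set. The helper name sketch_add_param and the SKETCH_OPTS variable are hypothetical and are not taken from the slides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a value to a variable only if a matching
# string is not already present in it.
sketch_add_param() {
  local varname=$1   # name of the variable to extend
  local check=$2     # regex that marks the option as already present
  local value=$3     # text to append
  local current="${!varname}"   # bash indirect expansion
  if [[ ! "${current}" =~ ${check} ]]; then
    printf -v "${varname}" '%s' "${current:+${current} }${value}"
  fi
}

SKETCH_OPTS=""
sketch_add_param SKETCH_OPTS "Xmx" "-Xmx4g"      # user setting goes in first
sketch_add_param SKETCH_OPTS "Xmx" "-Xmx1000m"   # default is skipped
echo "java ${SKETCH_OPTS} ..."
```

Applying user settings before defaults means the default never lands on the command line, which keeps it short.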
$ TOOL_PATH=blah:blah:blah hadoop distcp /old /new
Error: could not find or load main class org.apache.hadoop.tools.DistCp

Old:
    $ bash -x hadoop distcp /old /new
    + this=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop
    +++ dirname -- /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop
    ++ cd -P -- /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin
    ++ pwd -P
    + bin=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin
    + DEFAULT_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec
    + HADOOP_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec
    + [[ -f /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec/hadoop-config.sh ]]
    …
New:
    $ TOOL_PATH=blah:blah:blah hadoop --debug distcp /tmp/ /1
    DEBUG: HADOOP_CONF_DIR=/home/aw/HADOOP/conf
    DEBUG: Initial CLASSPATH=/home/aw/HADOOP/conf
    …
    DEBUG: Append CLASSPATH: /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/*
    DEBUG: Injecting TOOL_PATH into CLASSPATH
    DEBUG: Rejected CLASSPATH: blah:blah:blah (does not exist)
    …
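A rough sketch of how a classpath helper could produce "Append" and "Rejected" messages like those in the debug output. The function names, the SKETCH_DEBUG switch and the "(duplicate)" message are assumptions for illustration, not the real implementation.

```shell
#!/usr/bin/env bash
# Hypothetical debug printer: only speaks when SKETCH_DEBUG is set.
sketch_debug() {
  if [[ -n "${SKETCH_DEBUG}" ]]; then
    echo "DEBUG: $*" 1>&2
  fi
}

# Hypothetical classpath helper: validate an entry before appending it.
sketch_add_classpath() {
  local entry=$1
  # For "dir/*" wildcard entries, test the directory itself.
  local probe=${entry%/\*}
  if [[ ! -e "${probe}" ]]; then
    sketch_debug "Rejected CLASSPATH: ${entry} (does not exist)"
    return 1
  fi
  # Skip entries that are already on the path.
  if [[ ":${SKETCH_CLASSPATH}:" == *":${entry}:"* ]]; then
    sketch_debug "Rejected CLASSPATH: ${entry} (duplicate)"
    return 1
  fi
  SKETCH_CLASSPATH="${SKETCH_CLASSPATH:+${SKETCH_CLASSPATH}:}${entry}"
  sketch_debug "Append CLASSPATH: ${entry}"
}

SKETCH_DEBUG=true
SKETCH_CLASSPATH=""
sketch_demo_dir="$(mktemp -d)"
sketch_add_classpath "${sketch_demo_dir}" || true
sketch_add_classpath "blah:blah:blah" || true
```

A bad TOOL_PATH then shows up as an explicit rejection instead of a missing-class error much later.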
Old:
    hdfs help
https://www.flickr.com/photos/joshuamckenty/2297179486/
New:
    hdfs help
Old:
    hadoop thisisnotacommand
        == stack trace
New:
    hadoop thisisnotacommand
        == hadoop help
Old:
    sbin/hadoop-daemon.sh start namenode
    sbin/yarn-daemon.sh start resourcemanager

New:
    bin/hdfs --daemon start namenode
    bin/yarn --daemon start resourcemanager

    + common daemon start/stop/status routines
hdfs namenode vs hadoop-daemon.sh namenode

Old:
    - effectively different code paths
    - no pid vs pid
        - wait for socket for failure
New:
    - same code path
    - hadoop-daemon.sh cmd => hdfs --daemon cmd
        - both generate pid
    - hdfs --daemon status namenode
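With a pid file always written, a status check becomes simple. A minimal sketch, assuming a pid file holding one process id; the function name and the return codes (modeled loosely on the usual init-script convention) are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical status check driven by a pid file.
sketch_daemon_status() {
  local pidfile=$1
  local pid
  if [[ -f "${pidfile}" ]]; then
    pid=$(cat "${pidfile}")
    # kill -0 sends no signal; it only tests that the process exists.
    if kill -0 "${pid}" > /dev/null 2>&1; then
      echo "running as process ${pid}"
      return 0
    fi
    echo "pid file exists, but process ${pid} is not running"
    return 1
  fi
  echo "not running"
  return 3
}

sketch_pidfile="$(mktemp)"
echo "$$" > "${sketch_pidfile}"
sketch_daemon_status "${sketch_pidfile}"          # this shell is alive
rm -f "${sketch_pidfile}"
sketch_daemon_status "${sketch_pidfile}" || true  # no pid file any more
```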
Old:
    “mkdir: cannot create <dir>”
    “chown: cannot change permission of <dir>”

New:
    “WARNING: <dir> does not exist. Creating.”
    “ERROR: Unable to create <dir>. Aborting.”
    “ERROR: Cannot write to <dir>.”
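A sketch of a directory check that could emit the "New" messages. The function names are hypothetical, and the sketch returns a failure code where the slide's wording suggests the real scripts abort.

```shell
#!/usr/bin/env bash
# Hypothetical error printer.
sketch_error() { echo "ERROR: $*" 1>&2; }

# Hypothetical directory check: create if missing, then confirm writability.
sketch_verify_dir() {
  local dir=$1
  if [[ ! -d "${dir}" ]]; then
    echo "WARNING: ${dir} does not exist. Creating." 1>&2
    if ! mkdir -p "${dir}" > /dev/null 2>&1; then
      sketch_error "Unable to create ${dir}. Aborting."
      return 1
    fi
  fi
  if [[ ! -w "${dir}" ]]; then
    sketch_error "Cannot write to ${dir}."
    return 1
  fi
}

sketch_base="$(mktemp -d)"
sketch_verify_dir "${sketch_base}/logs"
```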
Old:
    (foo) > (foo).out
    rm (foo).out
        = Open file handle

New:
    (foo) >> (foo).out
    rm (foo).out
        = Closed file handle
        = rotatable .out files
Old:
    sbin/*-daemons.sh -> slaves.sh blah
    (several hundred ssh processes later)
    *crash*

New:
    sbin/*-daemons.sh -> hadoop-functions.sh
    slaves.sh -> hadoop-functions.sh
    pdsh or (if enabled) xargs -P
    *real work gets done*
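The xargs -P idea in miniature: cap the number of concurrent connections instead of forking one ssh per host all at once. This is a sketch only; sketch_run_on_hosts is a hypothetical name, and echo stands in for ssh so nothing is actually contacted. Note that -P is a common GNU/BSD xargs extension rather than POSIX.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read host names on stdin and run the given command
# once per host, with at most ${parallel} running at the same time.
sketch_run_on_hosts() {
  local parallel=$1
  shift
  # -I{} substitutes each input line (one host) for {} in the command.
  xargs -P "${parallel}" -I{} "$@" {}
}

# echo stands in for ssh here; output order may vary between runs.
printf '%s\n' host1 host2 host3 | sketch_run_on_hosts 2 echo "would ssh to"
```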
Old:
    egrep -c '^#' hadoop-branch-2/…/*-env.sh
        hadoop-env.sh: 59
        mapred-env.sh: 21
        yarn-env.sh: 60
New:
    egrep -c '^#' hadoop-trunk/…/*-env.sh
        hadoop-env.sh: 333
        mapred-env.sh: 40
        yarn-env.sh: 112
        + hadoop-layout.sh.example: 77
        + hadoop-user-functions.sh.example: 109
But wait! There’s more!
HADOOP_namenode_USER=hdfs
    hdfs namenode only works as hdfs
    Fun: HADOOP_fs_USER=aw
        hadoop fs only works as aw

hadoop --loglevel WARN
    => WARN,whatever
hadoop --loglevel DEBUG --daemon start
    => start daemon in DEBUG mode
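A sketch of how a per-subcommand user restriction could be checked with bash indirect expansion. Only the HADOOP_(subcommand)_USER naming comes from the slide; the function name and the error text are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: refuse to run a subcommand as the wrong user.
sketch_verify_user() {
  local command=$1
  local uservar="HADOOP_${command}_USER"
  local required=${!uservar}   # value of the variable named by uservar
  local current
  current=$(id -un)
  if [[ -n "${required}" && "${required}" != "${current}" ]]; then
    echo "ERROR: ${command} can only be executed by ${required}." 1>&2
    return 1
  fi
}

HADOOP_fs_USER="$(id -un)"
sketch_verify_user fs && echo "fs allowed for $(id -un)"
```

Subcommands without a matching variable are left unrestricted, since the expansion is then empty.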
Old:
    HADOOP_HEAPSIZE=15234     <--- M only
    JAVA_HEAP_MAX="hahahah you set something in HADOOP_HEAPSIZE"

New:
    HADOOP_HEAPSIZE_MAX=15g
    HADOOP_HEAPSIZE_MIN=10g   <--- units!
    JAVA_HEAP_MAX removed =>
        no Xmx settings == Java default
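A sketch of turning such settings into JVM flags. The helper name is hypothetical, and treating a bare number as megabytes is an assumption made here to stay compatible with the old M-only style.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a heap flag from a size that may carry a unit.
sketch_heap_flag() {
  local flag=$1   # e.g. -Xmx or -Xms
  local size=$2
  if [[ -z "${size}" ]]; then
    return 0      # nothing set: emit no flag, so the JVM default applies
  fi
  # Assumption for this sketch: a bare number means megabytes.
  if [[ "${size}" =~ ^[0-9]+$ ]]; then
    size="${size}m"
  fi
  printf '%s\n' "${flag}${size}"
}

sketch_heap_flag -Xmx 15g      # prints -Xmx15g
sketch_heap_flag -Xms 10240    # prints -Xms10240m
sketch_heap_flag -Xmx ""       # prints nothing
```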
  
Old:
    Lots of different yet same variables for settings

New:
    Deprecated ~60 variables
    ${HDFS|YARN|KMS|HTTPFS|*}_{foo} =>
        HADOOP_{foo}
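A sketch of how an old variable could be mapped onto its HADOOP_ replacement while still being honored. The helper name, the warning text and the *_EXAMPLE_DIR variables are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: carry a deprecated variable over to its new name.
sketch_deprecate_envvar() {
  local oldvar=$1
  local newvar=$2
  if [[ -n "${!oldvar}" ]]; then
    echo "WARNING: ${oldvar} has been replaced by ${newvar}. Using value of ${oldvar}." 1>&2
    printf -v "${newvar}" '%s' "${!oldvar}"
  fi
}

# Hypothetical variable pair, for illustration only.
YARN_EXAMPLE_DIR="/tmp/sketch"
HADOOP_EXAMPLE_DIR=""
sketch_deprecate_envvar YARN_EXAMPLE_DIR HADOOP_EXAMPLE_DIR
echo "HADOOP_EXAMPLE_DIR=${HADOOP_EXAMPLE_DIR}"
```

Old configs keep working, but the rest of the code only ever reads the new name.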
Old:
    "I wonder what's in HADOOP_CLIENT_OPTS?"
    "I want to override just this one thing in *-env.sh."

New:
    ${HOME}/.hadooprc
shellprofile.d

    bash snippets to easily inject:
        classpath
        JNI
        Java command line options
        ... and more
https://www.flickr.com/photos/83633410@N07/7658230838/
Power Users Rejoice:
Function Overrides
Default *.out log rotation:

function hadoop_rotate_log
{
  local log=$1;
  local num=${2:-5};

  if [[ -f "${log}" ]]; then # rotate logs
    while [[ ${num} -gt 1 ]]; do
      let prev=${num}-1
      if [[ -f "${log}.${prev}" ]]; then
        mv "${log}.${prev}" "${log}.${num}"
      fi
      num=${prev}
    done
    mv "${log}" "${log}.${num}"
  fi
}

namenode.out.1 -> namenode.out.2
namenode.out -> namenode.out.1
Put a replacement rotate function w/gzip support in hadoop-user-functions.sh!

function hadoop_rotate_log
{
  local log=$1;
  local num=${2:-5};

  if [[ -f "${log}" ]]; then
    while [[ ${num} -gt 1 ]]; do
      let prev=${num}-1
      if [[ -f "${log}.${prev}.gz" ]]; then
        mv "${log}.${prev}.gz" "${log}.${num}.gz"
      fi
      num=${prev}
    done
    mv "${log}" "${log}.${num}"
    gzip -9 "${log}.${num}"
  fi
}

namenode.out.1.gz -> namenode.out.2.gz
namenode.out -> namenode.out.1
gzip -9 namenode.out.1 -> namenode.out.1.gz
What if we wanted to log
every daemon start in
syslog?
Default daemon starter:

function hadoop_start_daemon
{
  local command=$1
  local class=$2
  shift 2

  hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
  hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"

  export CLASSPATH
  exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
}
  
Put a replacement start function in hadoop-user-functions.sh!

function hadoop_start_daemon
{
  local command=$1
  local class=$2
  shift 2

  hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
  hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"

  export CLASSPATH
  logger -i -p local0.notice -t hadoop "Started ${command}"
  exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
}
Secure Daemons
What if we could start them
as non-root?
Setup:

sudoers (either /etc/sudoers or in LDAP):

    hdfs ALL=(root:root) NOPASSWD: /usr/bin/jsvc

hadoop-env.sh:

    HADOOP_SECURE_COMMAND=/usr/sbin/sudo
  
# hadoop-user-functions.sh: (partial code below)
function hadoop_start_secure_daemon
{
  …
  jsvc="${JSVC_HOME}/jsvc"

  # string comparison: != rather than the numeric -ne
  if [[ "${USER}" != "${HADOOP_SECURE_USER}" ]]; then
    hadoop_error "You must be ${HADOOP_SECURE_USER} in order to start a secure ${daemonname}"
    exit 1
  fi
  …
  exec /usr/sbin/sudo "${jsvc}" "-Dproc_${daemonname}" \
    -outfile "${daemonoutfile}" -errfile "${daemonerrfile}" \
    -pidfile "${daemonpidfile}" -nodetach -home "${JAVA_HOME}" \
    -user "${HADOOP_SECURE_USER}" \
    -cp "${CLASSPATH}" ${HADOOP_OPTS} "${class}" "$@"
}
$ hdfs datanode
sudo launches jsvc as root
jsvc launches secure datanode

In order to get --daemon start to work, one other function needs to get replaced*, but that’s a SMOP, now that you know how!

* - hadoop_start_secure_daemon_wrapper assumes it is running as root
Lots more, but out of time... e.g.:

    Internals for contributors
    Unit tests
    API documentation
    Other projects in the works
    ...

    Reminder: This is in trunk. Ask vendors their plans!
https://www.flickr.com/photos/nateone/3768979925
Altiscale copyright 2015. All rights reserved.

More Related Content

What's hot

2005_Structures and functions of Makefile
2005_Structures and functions of Makefile2005_Structures and functions of Makefile
2005_Structures and functions of Makefile
NakCheon Jung
 
Programming Hive Reading #4
Programming Hive Reading #4Programming Hive Reading #4
Programming Hive Reading #4
moai kids
 
2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku
ronnywang_tw
 
apache pig performance optimizations talk at apachecon 2010
apache pig performance optimizations talk at apachecon 2010apache pig performance optimizations talk at apachecon 2010
apache pig performance optimizations talk at apachecon 2010
Thejas Nair
 

What's hot (20)

Setting up a HADOOP 2.2 cluster on CentOS 6
Setting up a HADOOP 2.2 cluster on CentOS 6Setting up a HADOOP 2.2 cluster on CentOS 6
Setting up a HADOOP 2.2 cluster on CentOS 6
 
Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in Rust
 
Hadoop spark performance comparison
Hadoop spark performance comparisonHadoop spark performance comparison
Hadoop spark performance comparison
 
Perl Memory Use 201207 (OUTDATED, see 201209 )
Perl Memory Use 201207 (OUTDATED, see 201209 )Perl Memory Use 201207 (OUTDATED, see 201209 )
Perl Memory Use 201207 (OUTDATED, see 201209 )
 
DBD::Gofer 200809
DBD::Gofer 200809DBD::Gofer 200809
DBD::Gofer 200809
 
Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
 
2005_Structures and functions of Makefile
2005_Structures and functions of Makefile2005_Structures and functions of Makefile
2005_Structures and functions of Makefile
 
Tajo Seoul Meetup-201501
Tajo Seoul Meetup-201501Tajo Seoul Meetup-201501
Tajo Seoul Meetup-201501
 
Hvordan sette opp en OAI-PMH metadata-innhøster
Hvordan sette opp en OAI-PMH metadata-innhøsterHvordan sette opp en OAI-PMH metadata-innhøster
Hvordan sette opp en OAI-PMH metadata-innhøster
 
Commands documentaion
Commands documentaionCommands documentaion
Commands documentaion
 
Programming Hive Reading #4
Programming Hive Reading #4Programming Hive Reading #4
Programming Hive Reading #4
 
2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku
 
Devel::NYTProf v5 at YAPC::NA 201406
Devel::NYTProf v5 at YAPC::NA 201406Devel::NYTProf v5 at YAPC::NA 201406
Devel::NYTProf v5 at YAPC::NA 201406
 
apache pig performance optimizations talk at apachecon 2010
apache pig performance optimizations talk at apachecon 2010apache pig performance optimizations talk at apachecon 2010
apache pig performance optimizations talk at apachecon 2010
 
Ansible for Beginners
Ansible for BeginnersAnsible for Beginners
Ansible for Beginners
 
Package Management via Spack on SJTU π Supercomputer
Package Management via Spack on SJTU π SupercomputerPackage Management via Spack on SJTU π Supercomputer
Package Management via Spack on SJTU π Supercomputer
 
Using ngx_lua in UPYUN
Using ngx_lua in UPYUNUsing ngx_lua in UPYUN
Using ngx_lua in UPYUN
 
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
 
Oliver hookins puppetcamp2011
Oliver hookins puppetcamp2011Oliver hookins puppetcamp2011
Oliver hookins puppetcamp2011
 
Hadoop installation
Hadoop installationHadoop installation
Hadoop installation
 

Viewers also liked

Viewers also liked (6)

Deploying Grid Services Using Apache Hadoop
Deploying Grid Services Using Apache HadoopDeploying Grid Services Using Apache Hadoop
Deploying Grid Services Using Apache Hadoop
 
Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)
 
Apache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase ContributorsApache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase Contributors
 
Apache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile ProblemApache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile Problem
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedIn
 
Hadoop Performance at LinkedIn
Hadoop Performance at LinkedInHadoop Performance at LinkedIn
Hadoop Performance at LinkedIn
 

Similar to Apache Hadoop Shell Rewrite

Virtualization and automation of library software/machines + Puppet
Virtualization and automation of library software/machines + PuppetVirtualization and automation of library software/machines + Puppet
Virtualization and automation of library software/machines + Puppet
Omar Reygaert
 
Im trying to run make qemu-nox In a putty terminal but it.pdf
Im trying to run  make qemu-nox  In a putty terminal but it.pdfIm trying to run  make qemu-nox  In a putty terminal but it.pdf
Im trying to run make qemu-nox In a putty terminal but it.pdf
maheshkumar12354
 
Biicode OpenExpoDay
Biicode OpenExpoDayBiicode OpenExpoDay
Biicode OpenExpoDay
fcofdezc
 
Unix shell scripting basics
Unix shell scripting basicsUnix shell scripting basics
Unix shell scripting basics
Abhay Sapru
 
Unix Shell Scripting Basics
Unix Shell Scripting BasicsUnix Shell Scripting Basics
Unix Shell Scripting Basics
Dr.Ravi
 

Similar to Apache Hadoop Shell Rewrite (20)

One-Liners to Rule Them All
One-Liners to Rule Them AllOne-Liners to Rule Them All
One-Liners to Rule Them All
 
Naughty And Nice Bash Features
Naughty And Nice Bash FeaturesNaughty And Nice Bash Features
Naughty And Nice Bash Features
 
Shell scripting
Shell scriptingShell scripting
Shell scripting
 
Virtualization and automation of library software/machines + Puppet
Virtualization and automation of library software/machines + PuppetVirtualization and automation of library software/machines + Puppet
Virtualization and automation of library software/machines + Puppet
 
Really useful linux commands
Really useful linux commandsReally useful linux commands
Really useful linux commands
 
Im trying to run make qemu-nox In a putty terminal but it.pdf
Im trying to run  make qemu-nox  In a putty terminal but it.pdfIm trying to run  make qemu-nox  In a putty terminal but it.pdf
Im trying to run make qemu-nox In a putty terminal but it.pdf
 
Git::Hooks
Git::HooksGit::Hooks
Git::Hooks
 
Using Nix and Docker as automated deployment solutions
Using Nix and Docker as automated deployment solutionsUsing Nix and Docker as automated deployment solutions
Using Nix and Docker as automated deployment solutions
 
Biicode OpenExpoDay
Biicode OpenExpoDayBiicode OpenExpoDay
Biicode OpenExpoDay
 
EC2
EC2EC2
EC2
 
Unix shell scripting basics
Unix shell scripting basicsUnix shell scripting basics
Unix shell scripting basics
 
Unix Shell Scripting Basics
Unix Shell Scripting BasicsUnix Shell Scripting Basics
Unix Shell Scripting Basics
 
Bash is not a second zone citizen programming language
Bash is not a second zone citizen programming languageBash is not a second zone citizen programming language
Bash is not a second zone citizen programming language
 
myHadoop 0.30
myHadoop 0.30myHadoop 0.30
myHadoop 0.30
 
파이썬 개발환경 구성하기의 끝판왕 - Docker Compose
파이썬 개발환경 구성하기의 끝판왕 - Docker Compose파이썬 개발환경 구성하기의 끝판왕 - Docker Compose
파이썬 개발환경 구성하기의 끝판왕 - Docker Compose
 
Os Treat
Os TreatOs Treat
Os Treat
 
Automate Yo'self -- SeaGL
Automate Yo'self -- SeaGL Automate Yo'self -- SeaGL
Automate Yo'self -- SeaGL
 
Dev ninja -> vagrant + virtualbox + chef-solo + git + ec2
Dev ninja  -> vagrant + virtualbox + chef-solo + git + ec2Dev ninja  -> vagrant + virtualbox + chef-solo + git + ec2
Dev ninja -> vagrant + virtualbox + chef-solo + git + ec2
 
What we Learned Implementing Puppet at Backstop
What we Learned Implementing Puppet at BackstopWhat we Learned Implementing Puppet at Backstop
What we Learned Implementing Puppet at Backstop
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
 

Recently uploaded

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
chiefasafspells
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Recently uploaded (20)

WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 

Apache Hadoop Shell Rewrite

  • 1. Shell Script Rewrite Overview Allen Wittenauer
  • 2. Twitter: @_a__w_ (1 a 2 w 1) Email: aw @ apache.org!
  • 3. 3 What is the shell code?! ! ! bin/*! ! etc/hadoop/*sh! ! libexec/*! ! sbin/*! !
  • 4.
  • 5. CUTTING, DOUG 1710 554 6239 2005 APACHE SOFTWARE FOUNDATION
  • 7. 7
  • 9. 9
  • 10. 10
  • 12. “[The scripts] finally got to you, didn’t they?”
  • 13. 13 Primary Goals! Consistency! Code and Config Simplification! De-clash Parameters! Documentation! ! Secondary Goals! Backward Compatibility! “Lost” Ideas and Fixes!
  • 15. 15 ! ! Tuesday, August 19, 2014 majority committed into trunk:! ! ! ! ! ! ... followed by many fixes & enhancements from the community
  • 17. 17 Old:! ! hadoop -> hadoop-config.sh -> hadoop-env.sh! ! yarn -> yarn-config.sh -> yarn-env.sh! ! hdfs-> hdfs-config.sh -> hadoop-env.sh ! ! New:! ! hadoop -> hadoop-config.sh! -> hadoop-functions.sh! ! ! ! ! ! ! ! -> hadoop-env.sh! ! yarn -> yarn-config.sh! -> hadoop-config.sh -> (above)! ! ! ! ! ! ! -> yarn-env.sh! ! hdfs -> hdfs-config.sh! -> hadoop-config.sh -> (above)!
  • 18. 18 Old:! ! yarn-env.sh:!       JAVA_HOME=xyz   ! hadoop-env.sh:!     JAVA_HOME=xyz   ! mapred-env.sh:!     JAVA_HOME=xyz     New:! ! hadoop-env.sh!     JAVA_HOME=xyz   ! OS X:!     JAVA_HOME=$(/usr/libexec/java_home)
  • 19. 19 Old:! ! xyz_OPT=“-­‐Xmx4g”  hdfs  namenode       java  …  -­‐Xmx1000  …  -­‐Xmx4g  …     ! ! Command line size: ~2500 bytes! New:! ! xyz_OPT=“-­‐Xmx4g”  hdfs  namenode       java  …  -­‐Xmx4g  …   ! ! Command line size: ~1750 bytes
  • 20. 20 ! $  TOOL_PATH=blah:blah:blah  hadoop  distcp  /old  /new     Error:  could  not  find  or  load  main  class   org.apache.hadoop.tools.DistCp! ! Old:! ! $  bash  -­‐x  hadoop  distcp  /old  /new   +  this=/home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin/hadoop   +++  dirname  -­‐-­‐  /home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin/hadoop   ++  cd  -­‐P  -­‐-­‐  /home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin   ++  pwd  -­‐P   +  bin=/home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin   +  DEFAULT_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin/../libexec   +  HADOOP_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin/../libexec   +  [[  -­‐f  /home/aw/HADOOP/hadoop-­‐3.0.0-­‐SNAPSHOT/bin/../libexec/hadoop-­‐ config.sh  ]]   …   !
  • 21. 21 New:! ! $  TOOL_PATH=blah:blah:blah  hadoop  -­‐-­‐debug   distcp  /tmp/  /1     DEBUG:  HADOOP_CONF_DIR=/home/aw/HADOOP/conf     DEBUG:  Initial  CLASSPATH=/home/aw/HADOOP/conf             …     DEBUG:  Append  CLASSPATH:  /home/aw/HADOOP/ hadoop-­‐3.0.0-­‐SNAPSHOT/share/hadoop/mapreduce/*     DEBUG:  Injecting  TOOL_PATH  into  CLASSPATH     DEBUG:  Rejected  CLASSPATH:  blah:blah:blah  (does   not  exist)             …   !
  • 25. 25 Old:! !   hadoop  thisisnotacommand   ! ! == stack trace! New:!   hadoop  thisisnotacommand   ! ! == hadoop help
  • 26. 26 Old:! ! sbin/hadoop-­‐daemon.sh  start  namenode       sbin/yarn-­‐daemon.sh  start  resourcemanager   ! New:! ! bin/hdfs  -­‐-­‐daemon  start  namenode       bin/yarn  -­‐-­‐daemon  start  resourcemanager   ! ! + common daemon start/stop/status routines
  • 27. 27 hdfs  namenode vs hadoop-­‐daemon.sh  namenode   ! Old:!! ! - effectively different code paths! ! - no pid vs pid! ! ! - wait for socket for failure! New:! ! - same code path ! ! - hadoop-­‐daemon.sh  cmd => hdfs  -­‐-­‐daemon  cmd ! ! ! - both generate pid! ! - hdfs  -­‐-­‐daemon  status  namenode
  • 28. 28 Old:! ! “mkdir:  cannot  create  <dir>”! ! “chown:  cannot  change  permission  of  <dir>”! ! ! New:! ! “WARNING:  <dir>  does  not  exist.  Creating.”! ! “ERROR:  Unable  to  create  <dir>.  Aborting.”! ! “ERROR:  Cannot  write  to  <dir>.”
  • 29. 29 Old:! ! (foo)  >  (foo).out     rm  (foo).out       = Open file handle! ! New:!   (foo)  >>  (foo).out     rm  (foo).out   ! ! = Closed file handle! ! ! = rotatable .out files!
  • 30. 30 Old:! ! sbin/*-­‐daemons.sh  -­‐>  slaves.sh  blah! ! (several hundred ssh processes later)! ! *crash*! ! ! New:! ! sbin/*-­‐daemons.sh -> hadoop-­‐functions.sh   ! slaves.sh -> hadoop-­‐functions.sh   ! pdsh or (if enabled) xargs  -­‐P! ! *real work gets done*
  • 31. 31 Old:!   egrep  -­‐c  ‘^#’  hadoop-­‐branch-­‐2/…/*-­‐env.sh   ! ! ! hadoop-env.sh: 59! ! ! ! mapred-env.sh: 21! ! ! ! yarn-env.sh: 60! New:! ! egrep  -­‐c  ‘^#’  hadoop-­‐trunk/…/*-­‐env.sh   ! ! ! hadoop-env.sh: 333! ! ! ! mapred-env.sh: 40! ! ! ! yarn-env.sh: 112! ! ! ! + hadoop-layout.sh.example : 77! ! ! ! + hadoop-user-functions.sh.example: 109
  • 33. HADOOP_namenode_USER=hdfs
        hdfs namenode only works as hdfs
      Fun: HADOOP_fs_USER=aw
        hadoop fs only works as aw
    hadoop --loglevel WARN
        => WARN,whatever
    hadoop --loglevel DEBUG --daemon start
        => start daemon in DEBUG mode
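A per-subcommand user restriction such as `HADOOP_namenode_USER` can be checked with bash indirect expansion. The sketch below shows one way this might work; `demo_verify_user` and its error text are assumptions, and only the `HADOOP_<subcommand>_USER` naming pattern comes from the slide.

```shell
#!/usr/bin/env bash
# Hypothetical check; only the HADOOP_<subcommand>_USER pattern is from the slide.

demo_verify_user() {
  local subcommand=$1
  # Build the variable name, e.g. HADOOP_namenode_USER.
  # Assumes the subcommand contains only characters valid in a variable name.
  local uservar="HADOOP_${subcommand}_USER"
  # ${!uservar} reads the variable whose name is stored in uservar.
  local required=${!uservar}
  local current=${USER:-$(id -un)}

  if [[ -n "${required}" && "${required}" != "${current}" ]]; then
    echo "ERROR: ${subcommand} can only be executed by ${required}." 1>&2
    return 1
  fi
}
```

A subcommand with no matching variable is unrestricted, so existing setups keep working until an administrator opts in.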
  • 34. Old:
      HADOOP_HEAPSIZE=15234    <--- M only
      JAVA_HEAP_MAX="hahahah you set something in HADOOP_HEAPSIZE"
    New:
      HADOOP_HEAPSIZE_MAX=15g
      HADOOP_HEAPSIZE_MIN=10g    <--- units!
      JAVA_HEAP_MAX removed =>
        no Xmx settings == Java default
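The new heap variables could be translated into JVM flags roughly as follows. This is a sketch, not the shipped code: `demo_heap_opts` is a hypothetical name, and treating a bare number as megabytes is an assumption made here for backward compatibility with the old M-only behavior.

```shell
#!/usr/bin/env bash
# Hypothetical translation of HADOOP_HEAPSIZE_MAX/MIN into -Xmx/-Xms flags.

demo_heap_opts() {
  local max=${HADOOP_HEAPSIZE_MAX:-}
  local min=${HADOOP_HEAPSIZE_MIN:-}
  local opts=""

  # Assumption: a value with no unit suffix means megabytes.
  if [[ "${max}" =~ ^[0-9]+$ ]]; then
    max="${max}m"
  fi
  if [[ "${min}" =~ ^[0-9]+$ ]]; then
    min="${min}m"
  fi

  if [[ -n "${max}" ]]; then
    opts="-Xmx${max}"
  fi
  if [[ -n "${min}" ]]; then
    opts="${opts:+${opts} }-Xms${min}"
  fi

  # Empty output means no heap flags at all, so the JVM default applies.
  echo "${opts}"
}
```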
  • 35. Old:
      Lots of different yet same variables for settings
    New:
      Deprecated ~60 variables
      ${HDFS|YARN|KMS|HTTPFS|*}_{foo} =>
        HADOOP_{foo}
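Deprecating a variable without breaking existing configs usually means copying the old value into the new name and warning about it. A minimal sketch of such a shim; `demo_deprecate_envvar` and the warning text are hypothetical, and the `YARN_LOG_DIR` to `HADOOP_LOG_DIR` pair in the comment is used only as an instance of the `${YARN}_{foo} => HADOOP_{foo}` pattern.

```shell
#!/usr/bin/env bash
# Hypothetical deprecation shim for renamed environment variables.

demo_deprecate_envvar() {
  local oldvar=$1
  local newvar=$2

  # ${!oldvar} reads the variable whose name is stored in oldvar.
  if [[ -n "${!oldvar}" ]]; then
    echo "WARNING: ${oldvar} has been replaced by ${newvar}. Using value of ${oldvar}." 1>&2
    # printf -v assigns to the variable named by newvar (a global here).
    printf -v "${newvar}" '%s' "${!oldvar}"
  fi
}

# Example call (not executed here): demo_deprecate_envvar YARN_LOG_DIR HADOOP_LOG_DIR
```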
  • 36. Old:
      "I wonder what's in HADOOP_CLIENT_OPTS?"
      "I want to override just this one thing in *-env.sh."
    New:
      ${HOME}/.hadooprc
  • 37. shellprofile.d
      bash snippets to easily inject:
        classpath
        JNI
        Java command line options
        ... and more!
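A shell profile might look roughly like the sketch below. The hook naming (`hadoop_add_profile`, `_<name>_hadoop_classpath`, `hadoop_add_classpath`) follows my recollection of the Hadoop 3 shell documentation and should be verified against your release; the stand-in definitions, `demo_run_profile_hook`, `mytool`, and `MYTOOL_HOME` are all invented so the sketch can run outside Hadoop.

```shell
#!/usr/bin/env bash
# Sketch only. The first and last sections are stand-ins (assumptions) that
# imitate the framework; the middle section is what a profile file might hold.

# --- stand-ins so the sketch runs outside Hadoop ---
declare -a DEMO_PROFILES=()
hadoop_add_profile() { DEMO_PROFILES+=("$1"); }
hadoop_add_classpath() { CLASSPATH="${CLASSPATH:+${CLASSPATH}:}$1"; }

# --- possible contents of a shellprofile.d/mytool.sh (hypothetical tool) ---
hadoop_add_profile mytool

_mytool_hadoop_classpath() {
  # Inject the tool's jars; MYTOOL_HOME is a made-up variable.
  hadoop_add_classpath "${MYTOOL_HOME:-/opt/mytool}/lib/*"
}

# --- stand-in for the framework calling each registered profile's hook ---
demo_run_profile_hook() {
  local hook=$1
  local profile
  for profile in "${DEMO_PROFILES[@]}"; do
    # Only call hooks the profile actually defines.
    if declare -F "_${profile}_hadoop_${hook}" >/dev/null; then
      "_${profile}_hadoop_${hook}"
    fi
  done
}
```

The appeal of this design is that an add-on ships one small file instead of patching hadoop-env.sh.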
  • 40. Default *.out log rotation:
      function hadoop_rotate_log
      {
        local log=$1;
        local num=${2:-5};
        if [[ -f "${log}" ]]; then # rotate logs
          while [[ ${num} -gt 1 ]]; do
            let prev=${num}-1
            if [[ -f "${log}.${prev}" ]]; then
              mv "${log}.${prev}" "${log}.${num}"
            fi
            num=${prev}
          done
          mv "${log}" "${log}.${num}"
        fi
      }
    namenode.out.1 -> namenode.out.2
    namenode.out -> namenode.out.1
  • 41. Put a replacement rotate function w/gzip support in hadoop-user-functions.sh!
      function hadoop_rotate_log
      {
        local log=$1;
        local num=${2:-5};
        if [[ -f "${log}" ]]; then
          while [[ ${num} -gt 1 ]]; do
            let prev=${num}-1
            if [[ -f "${log}.${prev}.gz" ]]; then
              mv "${log}.${prev}.gz" "${log}.${num}.gz"
            fi
            num=${prev}
          done
          mv "${log}" "${log}.${num}"
          gzip -9 "${log}.${num}"
        fi
      }
    namenode.out.1.gz -> namenode.out.2.gz
    namenode.out -> namenode.out.1
    gzip -9 namenode.out.1 -> namenode.out.1.gz
  • 42. What if we wanted to log every daemon start in syslog?
  • 43. Default daemon starter:
      function hadoop_start_daemon
      {
        local command=$1
        local class=$2
        shift 2
        hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
        hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"
        export CLASSPATH
        exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
      }
  • 44. Put a replacement start function in hadoop-user-functions.sh!
      function hadoop_start_daemon
      {
        local command=$1
        local class=$2
        shift 2
        hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
        hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"
        export CLASSPATH
        # log the start to syslog before handing the process over to java
        logger -i -p local0.notice -t hadoop "Started ${command}"
        exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
      }
  • 46. What if we could start them as non-root?
  • 47. Setup:
      sudoers (either /etc/sudoers or in LDAP):
        hdfs ALL=(root:root) NOPASSWD: /usr/bin/jsvc
      hadoop-env.sh:
        HADOOP_SECURE_COMMAND=/usr/sbin/sudo
  • 48. # hadoop-user-functions.sh: (partial code below)
      function hadoop_start_secure_daemon
      {
        …
        jsvc="${JSVC_HOME}/jsvc"
        # string comparison: refuse to run as anyone but the secure user
        if [[ "${USER}" != "${HADOOP_SECURE_USER}" ]]; then
          hadoop_error "You must be ${HADOOP_SECURE_USER} in order to start a secure ${daemonname}"
          exit 1
        fi
        …
        exec /usr/sbin/sudo "${jsvc}" "-Dproc_${daemonname}" \
          -outfile "${daemonoutfile}" -errfile "${daemonerrfile}" \
          -pidfile "${daemonpidfile}" -nodetach -home "${JAVA_HOME}" \
          -user "${HADOOP_SECURE_USER}" \
          -cp "${CLASSPATH}" ${HADOOP_OPTS} "${class}" "$@"
      }
  • 49. $ hdfs datanode
      sudo launches jsvc as root
      jsvc launches secure datanode
    In order to get --daemon start to work, one other function needs to get replaced*, but that’s a SMOP, now that you know how!
    * - hadoop_start_secure_daemon_wrapper assumes it is running as root
  • 50. Lots more, but out of time... e.g.:
      Internals for contributors
      Unit tests
      API documentation
      Other projects in the works
      ...
    Reminder: This is in trunk. Ask vendors their plans!
  • 52. Altiscale copyright 2015. All rights reserved.