Self-serve Hadoop Performance Tuning with Dr. Elephant
1.
2. Mark Wagner
Engineer, Hadoop Infrastructure
LinkedIn
Dr. Elephant: Self-serve performance tuning for Hadoop
3.
Hadoop @ LinkedIn
• Thousands of users of Hadoop infrastructure
• Tens of thousands of jobs a day
• Thousands of registered projects
• Multiple analytics, experimentation, and metrics platforms built on top
• Diverse backgrounds and levels of experience with Hadoop
4.
Hadoop team @ LinkedIn
• Roll our own distribution
• Build next generation systems
• Optimize our investment in hardware
• Enable our users to be productive
6.
Optimizing people
Workflow tooling: Gradle DSL for Hadoop
• Nobody writes one Hadoop job
• How do you structure Hadoop codebases?

    hadoop {
      buildPath 'conf/jobs'
      propertyFile('common') {
        set properties: [
          'user.to.proxy': 'mwagner'
        ]
      }
      workflow('my-first-workflow') {
        commandJob('start-job') {
          uses 'echo "Hello, World!"'
        }
        pigLiJob('vowels') {
          uses 'src/main/pig/vowels.pig'
          depends 'start-job'
        }
        targets 'vowels'
      }
    }
7. Easier tuning?
Optimizing people
• Large investment in hardware
• Cost(People) >> Cost(Machines)
• Can’t throw machines at the problem forever
• Some tuning needed to get things running
• Minimum effort gives the worst of both worlds
8.
Barriers to tuning
Problems are not obvious
• What’s wrong with this job? Anything?

    ...
    2015-06-09 05:57:56,281 Stage-1 map = 95%, reduce = 0%, Cumulative CPU 12602.08 sec
    2015-06-09 05:58:17,821 Stage-1 map = 96%, reduce = 0%, Cumulative CPU 12688.5 sec
    2015-06-09 05:58:23,952 Stage-1 map = 97%, reduce = 0%, Cumulative CPU 12705.91 sec
    2015-06-09 05:58:24,976 Stage-1 map = 99%, reduce = 0%, Cumulative CPU 12710.31 sec
    2015-06-09 05:58:26,000 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 12712.08 sec
    2015-06-09 05:58:40,317 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 12714.17 sec
    MapReduce Total cumulative CPU time: 0 days 3 hours 31 minutes 54 seconds 170 msec
    Ended Job = job_1433389922983_133809
    MapReduce Jobs Launched:
    Job 0: Map: 35  Reduce: 1  Cumulative CPU: 12714.17 sec  HDFS Read: 23223452  HDFS Write: 18  SUCCESS
    Total MapReduce CPU Time Spent: 0 days 3 hours 31 minutes 54 seconds 170 msec
    OK
    1234567
    Time taken: 564.189 seconds, Fetched: 1 row(s)
    hive (default)>
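A tool can spot what a human scanning this log will miss. As a minimal sketch (hypothetical parsing logic, not Dr. Elephant's actual code), a heuristic can flag the suspicious shape of this job straight from the Hive summary line: 35 mappers funneling into a single reducer.

```python
import re

def check_parallelism(summary_line, min_reducers=2):
    """Flag MapReduce jobs whose reducer count suggests a bottleneck.

    Parses a Hive job-summary line such as:
      'Job 0: Map: 35 Reduce: 1 Cumulative CPU: 12714.17 sec ...'
    Returns None if the line doesn't look like a job summary.
    """
    m = re.search(r"Map:\s*(\d+)\s+Reduce:\s*(\d+)", summary_line)
    if not m:
        return None
    maps, reduces = int(m.group(1)), int(m.group(2))
    # Many mappers feeding very few reducers is a classic skew/fan-in smell.
    if reduces < min_reducers and maps > 10:
        return f"possible bottleneck: {maps} mappers feeding {reduces} reducer(s)"
    return "ok"

print(check_parallelism("Job 0: Map: 35 Reduce: 1 Cumulative CPU: 12714.17 sec"))
# → possible bottleneck: 35 mappers feeding 1 reducer(s)
```

The thresholds here are placeholders; the point is that the diagnosis can be mechanical rather than left to log-reading.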
10. Inter-related settings
Barriers to tuning
• What interface are you using?
• Did you set max split size?
• Did you set min split size?
• Did you have split combination enabled?
• How large are your files?
• Extend CombineFileInputFormat? CombineHiveInputFormat?
• What’s your maxCombinedSplitSize?
• What’s your block size?
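Even the basic (non-combined) path shows how these settings interlock: Hadoop's FileInputFormat computes each split as max(minSize, min(maxSize, blockSize)), so answers to the questions above change each other's effects. A small sketch of that rule:

```python
def compute_split_size(block_size, min_size, max_size):
    """FileInputFormat's split-size rule:
    split = max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

MB = 1024 * 1024

# With the defaults (min split = 1 byte, max split = Long.MAX_VALUE),
# the split size is simply the HDFS block size.
print(compute_split_size(128 * MB, 1, 2**63 - 1) // MB)   # → 128

# Raising min split size above the block size forces larger, fewer splits.
print(compute_split_size(128 * MB, 256 * MB, 2**63 - 1) // MB)  # → 256

# Lowering max split size below the block size forces smaller, more splits.
print(compute_split_size(128 * MB, 1, 64 * MB) // MB)     # → 64
```

Split combination (pig.maxCombinedSplitSize and friends) adds another layer on top of this, which is exactly why the slide's question tree fans out so quickly.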
11. Large Parameter Space
Barriers to tuning
mapreduce.task.io.sort.mb
mapreduce.job.min.split.size
pig.maxcombinedsplitsize
hive.auto.convert.join
mapreduce.task.io.sort.factor
hive.exec.reducers.bytes.per.reducer
pig.exec.reducers.max
pig.exec.reducers.bytes.per.reducer
hive.map.aggr
hive.groupby.skewindata
hive.multigroupby.singlemr
mapreduce.map.memory.mb
pig.cachedbag.memusage
hive.optimize.correlation
hive.exec.orc.dictionary.key.size.threshold
pig.exec.mapPartAgg
pig.exec.mapPartAgg.minReduction
pig.skewedjoin.reduce.memusage
mapreduce.map.sort.spill.percent
mapreduce.job.max.split.locations
mapreduce.reduce.shuffle.parallelcopies
mapreduce.reduce.shuffle.merge.percent
mapreduce.map.speculative
mapreduce.reduce.speculative
mapreduce.map.output.compress
mapreduce.job.ubertask.maxmaps
mapreduce.ifile.readahead.bytes
hive.exec.compress.intermediate
hive.merge.mapfiles
200+ configuration settings in MapReduce
300+ more in Hive
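Many of these parameters interact. For example, mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent jointly decide when map output spills to disk (defaults 100 MB and 0.80). A back-of-envelope sketch, ignoring that collection and spilling overlap in the real implementation:

```python
import math

def spill_threshold_mb(io_sort_mb=100, spill_percent=0.80):
    """Map output begins spilling once the in-memory sort buffer
    (mapreduce.task.io.sort.mb) fills to mapreduce.map.sort.spill.percent."""
    return io_sort_mb * spill_percent

def estimated_spills(map_output_mb, io_sort_mb=100, spill_percent=0.80):
    """Rough count of spill files a map task would write."""
    return max(1, math.ceil(map_output_mb / spill_threshold_mb(io_sort_mb, spill_percent)))

# A mapper emitting 400 MB with default settings spills several times...
print(estimated_spills(400))                   # → 5
# ...but fits in a single spill with a larger sort buffer.
print(estimated_spills(400, io_sort_mb=512))   # → 1
```

Each extra spill means extra disk I/O and a later merge pass (governed by mapreduce.task.io.sort.factor), which is why tuning one knob in isolation rarely tells the whole story.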
14. Expert intervention
Things that don’t work
• Not enough support resources available
• Poor coverage
• Difficult to prioritize efforts
• Delays user development
15. Extensive training
Things that don’t work
• Too many users
• Diverse backgrounds
• Scope is large and evolving
• Other responsibilities are more important
16. Goals
Dr. Elephant
• Help every user get the best performance out of their jobs
• Impose minimal burden on the user
• Development burden
• Intellectual burden
• Provide a platform for other performance related tools
21. Internals
Dr. Elephant
• All completed jobs are monitored
• Diagnostic information collected automatically
• REST API for everything
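With everything behind a REST API, other tools can pull diagnostics programmatically. The endpoint path and parameter name below are illustrative assumptions, not Dr. Elephant's documented API; this just sketches what a client-side query helper might look like:

```python
from urllib.parse import urlencode

# NOTE: '/rest/job' and the 'id' parameter are hypothetical placeholders
# standing in for whatever the real Dr. Elephant API exposes.
def job_report_url(base_url, job_id):
    """Build a URL to fetch the diagnostic report for one completed job."""
    return f"{base_url}/rest/job?{urlencode({'id': job_id})}"

url = job_report_url("http://drelephant.example.com",
                     "job_1433389922983_133809")
print(url)
# → http://drelephant.example.com/rest/job?id=job_1433389922983_133809
```

A scheduler or CI hook could call such an endpoint after each run and fail a build on a poor grade.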
22.
Dr. Elephant
Monitoring scheduled workflows
• Performance characteristics change
• Data growth
• Data distribution change
• Hardware change
• Incremental software change
• Monitor performance on each execution
• Compare behavior across revisions
    ======TOP 20 BAD JOBS YESTERDAY======
    JobId                       Score
    job_1431576474881_181412    36035
    job_1431576474881_185548    27710
    ...
    ======TOP 20 BAD FLOWS YESTERDAY======
    FlowUrl                     Score
    https://prod-azkaban/...    45379
    ...
    ======TOP 10 FLOWS WITH SIGNIFICANT PERFORMANCE CHANGE======
    Project    Flow         ChangeScore    User
    myProject  score-daily  48755          mwagner
    ...
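Ranking flows by "significant performance change" implies scoring each execution against its own history. As a toy sketch (a hypothetical formula, not Dr. Elephant's actual scoring), one could measure how far the latest runtime sits above the baseline of previous executions:

```python
def change_score(prev_runtimes_s, curr_runtime_s):
    """Score a flow's latest execution against its history.

    Returns 0.0 when the run is at or below the historical mean,
    and grows with the relative regression (1.0 ~= runtime doubled).
    Hypothetical metric for illustration only.
    """
    baseline = sum(prev_runtimes_s) / len(prev_runtimes_s)
    return max(0.0, (curr_runtime_s - baseline) / baseline)

# Three prior runs around 10 minutes, then one taking ~20 minutes:
print(round(change_score([600, 620, 610], 1220), 2))  # → 1.0

# A faster-than-usual run scores zero rather than negative.
print(change_score([600, 620, 610], 500))             # → 0.0
```

A real detector would also account for data growth and variance across runs, which is exactly why the slide lists those as causes of performance change.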
23. Automated audits
Dr. Elephant
• Separate cluster for critical workloads
• Audit before deployment
• Improved accuracy
• Faster turnaround
• Higher throughput
24.
Dr. Elephant
As an operator utility
• Global view of performance issues
• Search and identify jobs for extra attention
• Dr. Elephant sign-off as a requirement for capacity requests
25. Results and experiences
• Dr. Elephant can grade itself
• Social pressures encourage good behavior
• Tuning degrades over time
[Chart: fraction of healthy jobs over time; y-axis “Fraction”, 0 to 1]
26.
Dr. Elephant for all
• Plugins for other execution engines
• Tez, Spark on the way
• Allow the user community to build a knowledge-base
27.
Dr. Elephant today
• Evaluating 60,000+ jobs a day across multiple clusters
• Open source release coming soon