The document discusses performance issues on Hadoop clusters and proposes three solutions:
1) Data placement - Distributing data across heterogeneous cluster nodes according to their computing capabilities to reduce data transfer time.
2) Prefetching - Improving performance by preloading required data before tasks are assigned to reduce CPU stall time.
3) Preshuffling - Minimizing data shuffling during reduce by pipelining intermediate data between map and reduce tasks.
Improving Hadoop Performance with Data Placement and Prefetching
1. Performance Issues on Hadoop Clusters
Jiong Xie
Advisor: Dr. Xiao Qin
Committee Members:
Dr. Cheryl Seals
Dr. Dean Hendrix
University Reader:
Dr. Fa Foster Dai
2. Overview of My Research

Problem          Solution                                    Status
Data locality    Data placement on heterogeneous cluster    [HCW 10]
Data movement    Prefetching data from disk to memory       [Submitted to IPDPS]
Data shuffling   Reduce network congestion                  [To be submitted]
6. Hadoop Overview -- MapReduce Running System
(J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. OSDI '04, pages 137–150)
9. Existing Hadoop Clusters
• Observation 1: Cluster nodes are dedicated
  – Data locality issues
  – Data transfer time
• Observation 2: The number of nodes is increasing
  – Scalability issues
  – Shuffling overhead goes up
11. Solutions
• P1: Data placement (offline): distribute data across heterogeneous nodes
• P2: Prefetching (online): preload data before tasks run
• P3: Preshuffling: reduce traffic from intermediate data movement
12. Improving MapReduce Performance through Data Placement in Heterogeneous Hadoop Clusters
13. Motivational Example
[Figure: task-completion timeline, in minutes, for three nodes]
• Node A (fast): 1 task/min
• Node B (slow): 2x slower
• Node C (slowest): 3x slower
14. The Native Strategy
[Figure: per-node timelines, in minutes, showing loading, transferring, and processing phases]
• Node A: 6 tasks
• Node B: 3 tasks
• Node C: 2 tasks
15. Our Solution -- Reducing Data Transfer Time
[Figure: per-node timelines (Node A shown for comparison with Node A'), with loading, transferring, and processing phases]
• Node A': 6 tasks
• Node B': 3 tasks
• Node C': 2 tasks
16. Challenges
• Does the distribution strategy depend on applications?
• Initialization of data distribution
• The data skew problem
  – New data arrival
  – Data deletion
  – Data updating
  – Newly joining nodes
17. Measure Computing Ratios
• Computing ratio: each node's relative speed at processing the same amount of data
• Fast machines process large data sets
[Figure: task timelines; Node A: 1 task/min, Node B: 2x slower, Node C: 3x slower]
18. Measuring Computing Ratios
1. Run an application and collect the response time on each node
2. Assign ratio 1 to the node offering the shortest response time
3. Normalize the ratios of the other nodes
4. Calculate the least common multiple of these ratios
5. Determine the amount of data (number of file fragments) processed by each node

Node     Response time (s)   Ratio   # of file fragments   Speed
Node A   10                  1       6                     Fastest
Node B   20                  2       3                     Average
Node C   30                  3       2                     Slowest

(A sketch of this procedure follows.)
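The following minimal Python sketch illustrates steps 2-5 using the response times from the table above; the variable names are illustrative, not from the dissertation.

    from math import lcm  # Python 3.9+

    # Measured response times (seconds) for the same benchmark, from the table above.
    response_times = {"Node A": 10, "Node B": 20, "Node C": 30}

    # Steps 2-3: normalize against the fastest node, whose ratio is 1.
    # (Assumes the ratios normalize to integers, as in the table; measured
    # ratios such as 3.3 would need rounding before taking an LCM.)
    fastest = min(response_times.values())
    ratios = {node: t // fastest for node, t in response_times.items()}

    # Step 4: the least common multiple of the ratios (1, 2, 3 -> 6).
    total = lcm(*ratios.values())

    # Step 5: each node's file-fragment count is LCM / ratio, so faster
    # nodes receive proportionally more data.
    fragments = {node: total // r for node, r in ratios.items()}
    print(fragments)  # {'Node A': 6, 'Node B': 3, 'Node C': 2}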
19. Initialize Data Distribution
• Input files are split into 64MB blocks
• A round-robin data distribution algorithm assigns blocks to datanodes in portions matching the computing ratios (3:2:1)
[Figure: the Namenode distributes blocks 1-9 and a-c of File1 across Datanodes A, B, and C in a 3:2:1 portion]
(A sketch of ratio-weighted round-robin placement follows.)
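A minimal sketch of the ratio-weighted round-robin idea; the function name is hypothetical and the shares are the per-round fragment counts computed above. Each round hands out blocks to the nodes in proportion 3:2:1.

    def place_blocks(blocks, shares):
        """Assign blocks to nodes round-robin, weighted by each node's share.

        shares: {node: blocks_per_round}, e.g. {"A": 3, "B": 2, "C": 1}.
        Returns {node: [blocks]}.
        """
        placement = {node: [] for node in shares}
        it = iter(blocks)
        done = False
        while not done:
            for node, share in shares.items():
                for _ in range(share):
                    block = next(it, None)
                    if block is None:
                        done = True
                        break
                    placement[node].append(block)
                if done:
                    break
        return placement

    # 12 blocks of a 768MB file split into 64MB blocks.
    print(place_blocks(list(range(12)), {"A": 3, "B": 2, "C": 1}))
    # A gets 6 blocks, B gets 4, C gets 2 -- a 3:2:1 portion.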
20. Data Redistribution
1. Get the network topology, the computing ratios, and disk utilization
2. Build and sort two lists: an under-utilized node list (L1) and an over-utilized node list (L2)
3. Select a source node from L2 and a destination node from L1
4. Transfer data from the source to the destination
5. Repeat steps 3 and 4 until the lists are empty
[Figure: the Namenode directs block transfers among Datanodes A, B, and C toward the 3:2:1 portion]
(A sketch of this rebalancing loop follows.)
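A minimal sketch of the rebalancing loop; the names are hypothetical, and "utilization" is simplified here to blocks held relative to a node's target share. One block moves per iteration from the most over-utilized node to the most under-utilized one.

    def rebalance(holdings, targets):
        """holdings/targets: {node: block_count}. Mutates holdings;
        returns the list of (source, destination) block moves."""
        moves = []
        while True:
            # Step 2: build and sort the over- and under-utilized lists.
            over = sorted((n for n in holdings if holdings[n] > targets[n]),
                          key=lambda n: targets[n] - holdings[n])
            under = sorted((n for n in holdings if holdings[n] < targets[n]),
                           key=lambda n: holdings[n] - targets[n])
            if not over or not under:
                break                        # Step 5: stop when a list is empty.
            src, dst = over[0], under[0]     # Step 3: pick source and destination.
            holdings[src] -= 1               # Step 4: transfer one block.
            holdings[dst] += 1
            moves.append((src, dst))
        return moves

    # Node C holds too many blocks for its 3:2:1 share; they move toward A.
    print(rebalance({"A": 4, "B": 4, "C": 4}, {"A": 6, "B": 4, "C": 2}))
    # [('C', 'A'), ('C', 'A')]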
21. Experimental Environment

Node     CPU Model           CPU (Hz)     L1 Cache (KB)
Node A   Intel Core 2 Duo    2 x 1GHz     204
Node B   Intel Celeron       2.8GHz       256
Node C   Intel Pentium 3     1.2GHz       256
Node D   Intel Pentium 3     1.2GHz       256
Node E   Intel Pentium 3     1.2GHz       256

Five nodes in a heterogeneous Hadoop cluster
22. Benchmarks
• Grep: a tool that searches for a regular expression in a text file
• WordCount: a program that counts the words in a text file
• Sort: a program that lists the inputs in sorted order
23. Response Time of Grep and WordCount on Each Node
• The computing ratio is application dependent but data-size independent
24. Computing Ratios for Two Applications

Computing Node   Ratio for Grep   Ratio for WordCount
Node A           1                1
Node B           2                2
Node C           3.3              5
Node D           3.3              5
Node E           3.3              5

Computing ratios of the five nodes with respect to the Grep and WordCount applications
26. Impact of Data Placement on the Performance of Grep
27. Impact of Data Placement on the Performance of WordCount
28. Summary of Data Placement
P1: Data Placement Strategy
• Motivation: fast machines should process large data sets
• Problem: data locality in heterogeneous clusters
• Contributions: distribute data according to computing capability
  – Measure computing ratios
  – Initialize data placement
  – Redistribute data
30. Prefetching
• Goal: improving performance
• Approach
  – Make a best effort to guarantee data locality
  – Keep data close to computing nodes
  – Reduce CPU stall time
31. Challenges
• What to prefetch?
• How to prefetch?
• What size of blocks should be prefetched?
32. Dataflow in Hadoop
[Figure: baseline Hadoop dataflow. 1. Submit job; 2. Schedule; 3. Read input (Blocks 1 and 2 from HDFS); 4. Run map, writing output to the local FS; 5. Heartbeat; 6. Next task; 7. Read new file (reduce).]
33. Dataflow in Hadoop with Prefetching
[Figure: Hadoop dataflow with the predictive scheduler. 1. Submit job; 2. Schedule (+ one more task and its metadata); 3. Read input (Blocks 1 and 2 from HDFS); 4. Run map; 5. Heartbeat; 5.1. Read new file (prefetch); 6. Next task.]
41. Summary
P2: Predictive Scheduler and Prefetching
• Goal: move data before the task is assigned
• Problem: synchronizing tasks and data
• Contributions: preload the required data earlier than the task assignment
  – Predictive scheduler
  – Prefetching mechanism
  – Worker thread
(A sketch of the worker-thread idea follows.)
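A minimal sketch of the worker-thread idea, not the dissertation's actual implementation: while the CPU processes the current block, a background thread preloads the next block from disk into memory, so the map task does not stall on I/O. All names here are hypothetical.

    import queue
    import threading

    def process(data):
        """Stand-in for the map function applied to one in-memory block."""
        return len(data)

    def prefetch_worker(paths, out):
        """Worker thread: read blocks from disk ahead of the compute loop."""
        for path in paths:
            with open(path, "rb") as f:
                out.put((path, f.read()))  # Block is in memory before map needs it.
        out.put(None)                      # Signal that all blocks are loaded.

    def run(paths, depth=1):
        """Process blocks while the worker preloads ahead (at most `depth` blocks)."""
        out = queue.Queue(maxsize=depth)   # Bounded queue: prefetch one block ahead.
        threading.Thread(target=prefetch_worker, args=(paths, out),
                         daemon=True).start()
        while (item := out.get()) is not None:
            _path, data = item
            process(data)                  # CPU works while the next block is loading.

    # run(["block_0000", "block_0001"], depth=1) would overlap disk I/O with map work.

The bounded queue of depth one mirrors the conservative policy described later in the notes: two tasks assigned per node, one block prefetched.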
43. Preshuffling
• Observation 1: Too much data moves from map workers to reduce workers
  – Solution 1: Map nodes apply pre-shuffling (combiner-style) functions to their local output (see the sketch after this list)
• Observation 2: No reduce can start until a map is complete
  – Solution 2: Intermediate data is pipelined between mappers and reducers
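A minimal sketch of Solution 1 in plain Python (not Hadoop's actual Combiner API): aggregating map output locally before it crosses the network shrinks the shuffled data from one pair per word occurrence to one pair per distinct word.

    from collections import Counter

    def map_phase(lines):
        """Emit (word, 1) for every word -- the raw map output."""
        for line in lines:
            for word in line.split():
                yield (word, 1)

    def combine(pairs):
        """Combiner: pre-aggregate pairs on the map node before shuffling."""
        counts = Counter()
        for word, n in pairs:
            counts[word] += n
        return list(counts.items())   # One pair per distinct word.

    lines = ["the quick fox", "the lazy dog", "the fox"]
    raw = list(map_phase(lines))
    combined = combine(raw)
    print(len(raw), "pairs shuffled without combiner;", len(combined), "with")
    # 8 pairs shuffled without combiner; 5 with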
44. Preshuffling
• Goal: minimize data shuffling during the reduce phase
• Approach
  – Pipelining
  – Overlap map computation with data movement
  – Group map and reduce tasks
• Challenges
  – Synchronizing map and reduce
  – Data locality
45. Dataflow in Hadoop with Preshuffling
[Figure: Hadoop dataflow with preshuffling. 1. Submit job; 2. Schedule / new task; 3. Read input (Blocks 1 and 2 from HDFS), with the reduce requesting data via HTTP GET; 4. Run map / send data; 5. Write data (reduce output to HDFS); 6. Next task; heartbeat.]
46. PreShuffle
[Figure: map tasks grouped with fixed reduce tasks; each reducer sends data requests directly to its paired mappers]
56. Run Time Affected by Network Condition
Experiment results conducted by Yixian Yang
57. Traffic Volume Affected by Network Condition
Experiment results conducted by Yixian Yang
Editor's Notes
Comment: three colors.
1. Copy the MapReduce program to be executed to the master and to every worker machine. 2. The master decides which workers run the map program and which run the reduce program. 3. All data blocks are assigned to the workers running the map program for mapping. 4. The map results are stored on the workers' local disks. 5. The workers running the reduce program remotely read each map result, merge and sort them, and run the reduce program. 6. The results the user needs are written out.
Hadoop MapReduce Single master node, many worker nodes Client submits a job to master node Master splits each job into tasks (map/reduce), and assigns tasks to worker nodes Hadoop Distributed File System (HDFS) Single name node, many data nodes Files stored as large, fixed-size (e.g. 64MB) blocks HDFS typically holds map input and reduce output
Comment: no “want to”
Emphasize.
Comments: differences between shuffling overhead and data transfer time.
Comment: what is computing ratio.
Comment 1: make step descriptions short. Comment 2: animations map to steps.
blocks
More I/O features
Higher resolution
Comment: Titles of all the slides must use the same format. Up to 33.1% and 10.2%, with averages of 17.3% and 7.1%.
Which data block should be loaded, and where is it? How do we synchronize the computation with the prefetching? If data is loaded into the cache long before it is required, it wastes resources; conversely, if the data arrives later than the computation needs it, it is useless. How do we optimize the prefetching rate, that is, the best percentage of data to preload on each machine? MapReduce benefits most from the best prefetching rate while the prefetching overhead is minimized. When to prefetch controls how early to trigger the prefetching action on each node. In our previous research, we observed that a node processes blocks of the same size in a fixed time, so before the last block finishes, the next block starts loading. In this design, we estimate the execution time of each block on each node; since the execution time of the same application may differ across nodes, we collect and adapt the average running time of a block on each node. What to prefetch determines which block to load into memory. Initially, the predictive scheduler assigns two tasks to each node; when the preloading work is triggered, it accesses the data according to the required information of the second task. How much to prefetch decides the amount of data preloaded into memory. According to the scheduler, while one task is running, one or more tasks are waiting on the node and their metadata is already in memory; when the prefetching action is triggered, the preloading worker automatically loads data from disk to memory. In view of the large size of HDFS blocks, this number should not be aggressive: here we set the number of tasks to two and the number of prefetched blocks to one.
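A minimal sketch of the trigger-timing rule just described; the function names are hypothetical. The node's per-block execution time is tracked as a running average, and the prefetch fires just early enough that the next block arrives as the current one finishes.

    def update_avg(avg, sample, alpha=0.2):
        """Adapt the node's average per-block execution time (moving average)."""
        return sample if avg is None else (1 - alpha) * avg + alpha * sample

    def prefetch_trigger_time(block_start, avg_exec_time, load_time):
        """Fire the prefetch so the next block is in memory when this one finishes.

        Too early wastes memory holding idle data; too late leaves the CPU stalled.
        """
        return block_start + max(0.0, avg_exec_time - load_time)

    # A node averages 60s per block and loading a block takes 10s, so start
    # prefetching 50s into the current block.
    print(prefetch_trigger_time(block_start=0.0, avg_exec_time=60.0,
                                load_time=10.0))  # 50.0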
Comment: add animations.
Figure 4.3: Comparing the execution time of Grep in native and prefetching MapReduce. Improvements: Grep 9.5% (1GB) and 8.5% (2GB); WordCount 8.9% (1GB) and 8.1% (2GB).
Above: for a single large file, about a 9% improvement; for small files, 24%. As we increase the number of computing nodes, we get the same improvement through PSP.
The animations in the third part are fairly typical, but the relationship among slides 44-53 still needs to be distilled; ideally it should be strengthened.
There is one map task for each block of the input file; it applies the user-defined map function to each record in the block (record = <key, value>). There is a user-defined number of reduce tasks, and each reduce task is assigned a set of record groups (record group = all records with the same key). For each group, the reduce task applies the user-defined reduce function to the record values in that group. Reduce tasks read from every map task, and each read returns the record groups for that reduce task.
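A minimal pure-Python sketch of this record/group model, with WordCount as the user-defined functions (no Hadoop APIs): map emits <key, value> records, the shuffle groups records by key, and reduce folds each group's values.

    from itertools import groupby
    from operator import itemgetter

    def user_map(record):
        """User-defined map: one input line -> (word, 1) records."""
        for word in record.split():
            yield (word, 1)

    def user_reduce(key, values):
        """User-defined reduce: fold all values that share a key."""
        return (key, sum(values))

    def run_job(records):
        # Map phase: one logical map call per record in the block.
        pairs = [kv for record in records for kv in user_map(record)]
        # Shuffle: group records by key (sort then group, like Hadoop's sort/merge).
        pairs.sort(key=itemgetter(0))
        # Reduce phase: apply user_reduce to each record group.
        return [user_reduce(k, (v for _, v in grp))
                for k, grp in groupby(pairs, key=itemgetter(0))]

    print(run_job(["the quick fox", "the lazy dog"]))
    # [('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]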
To reduce the amount of transferred data, the map workers apply combiner functions to their local outputs before storing or transferring the intermediate data, reducing traffic. The combining strategy minimizes the amount of data that must be transferred to the reducers and speeds up the execution time of the overall job. To reduce communication between nodes, intermediate results are pipelined between fixed mappers and reducers: a reducer does not need to ask every map node in the cluster, and reducers begin processing data as soon as it is produced by the mappers. To exploit the overlap between data movement and map computation: our experiments show that the shuffle is always much longer than the map computation time, and when the network is saturated the overlap disappears entirely. Reducing network latency improves throughput but also increases the possibility of network conflicts; the pipeline idea can be borrowed to hide this overlap. To limit the communication between map and reduce: in MapReduce's parallel execution, a node's execution includes map computation, waiting for the shuffle, and reduce computation, and the critical path is the slowest node. To improve performance, the critical path has to be shortened. However, the natural overlap of one node's shuffle wait with another node's execution does not shorten the critical path; instead, the operations within a single node need to be overlapped. The reducer checks the available data from all the map nodes in the cluster; if some reduce nodes and map nodes can be grouped around particular key-value pairs, the network communication cost can be shortened.
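A minimal sketch of the pipelining idea using plain Python threads (hypothetical names, not the dissertation's implementation): a mapper paired with a fixed reducer pushes each intermediate pair into a shared queue as soon as it is produced, so the reducer starts consuming while the map is still running, overlapping map computation with data movement.

    import queue
    import threading
    from collections import Counter

    def mapper(lines, pipe):
        """Paired map task: emit intermediate pairs into the pipe as produced."""
        for line in lines:
            for word in line.split():
                pipe.put((word, 1))   # Shipped immediately, not after map ends.
        pipe.put(None)                # End-of-stream marker.

    def reducer(pipe, result):
        """Paired reduce task: start aggregating as soon as pairs arrive."""
        counts = Counter()
        while (item := pipe.get()) is not None:
            word, n = item
            counts[word] += n
        result.update(counts)

    pipe = queue.Queue(maxsize=64)    # Bounded pipe between the fixed map/reduce pair.
    result = {}
    t1 = threading.Thread(target=mapper, args=(["the quick fox", "the fox"], pipe))
    t2 = threading.Thread(target=reducer, args=(pipe, result))
    t1.start(); t2.start(); t1.join(); t2.join()
    print(result)   # {'the': 2, 'quick': 1, 'fox': 2}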
Figure 5.4.2 illustrates the average response time of native Hadoop for 1GB WordCount: the reduce phase does not start until all the map tasks finish. Figure 5.1(b) illustrates the average response time of preshuffling Hadoop for 1GB WordCount: the reduce tasks receive map outputs a little earlier and can begin sorting earlier, which reduces the time required for the final merge and achieves better cluster utilization, by 14%. Native Hadoop finishes all its map tasks earlier than preshuffling Hadoop, because preshuffling allows reduce tasks to begin executing more quickly; the reduce tasks hold their required resources for longer, causing the map phase to take longer.
Figure 5.2 reports the improvement of preshuffling over native Hadoop for WordCount. As the block size increases, preshuffling achieves a larger improvement; when the block size is less than 16MB, the preshuffling algorithm does not gain a significant performance improvement. The same trend appears in Figure 5.3. When the block size reaches 128MB or more, the improvement levels off at a fixed value. Preshuffling Hadoop gains on average 14% for WordCount and 12.5% for Sort. Figure 5.2: The improvement of preshuffling with different block sizes for WordCount. Figure 5.3: The improvement of preshuffling with different block sizes for Sort.
Average 14% improvement
Hadoop Archive: Hadoop Archive (HAR) is a file archiving tool that efficiently places small files into HDFS blocks. It packs multiple small files into a single HAR file, reducing namenode memory usage while still allowing transparent access to the files. To archive all the small files under a directory /foo/bar into /outputdir/zoo.har: hadoop archive -archiveName zoo.har -p /foo/bar /outputdir. The HAR block size can also be specified (with -Dhar.block.size). HAR is a file system layered on top of the Hadoop file system, so all fs shell commands work on HAR files; only the path format differs, and a HAR path can take either of two forms. Sequence file: a sequence file consists of a series of binary key/value pairs; with the small file's name as the key and its contents as the value, a large batch of small files can be merged into one large file. Hadoop-0.21.0 provides the SequenceFile Writer, Reader, and SequenceFileSorter classes for writing, reading, and sorting; for Hadoop versions below 0.21.0, see [3] for an implementation. CombineFileInputFormat: CombineFileInputFormat is a new InputFormat that combines multiple files into a single split, and it also takes data locality into account.
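A minimal pure-Python sketch of the sequence-file idea (not Hadoop's SequenceFile API): pack many small files into one large file as (filename, contents) key/value records with a simple length-prefixed layout.

    import struct

    def pack(paths, out_path):
        """Pack small files into one large file of length-prefixed key/value records."""
        with open(out_path, "wb") as out:
            for path in paths:
                key = path.encode()
                with open(path, "rb") as f:
                    value = f.read()
                # Record layout: key length, value length, key bytes, value bytes.
                out.write(struct.pack(">II", len(key), len(value)))
                out.write(key)
                out.write(value)

    def unpack(in_path):
        """Iterate (filename, contents) records back out of the packed file."""
        with open(in_path, "rb") as f:
            while header := f.read(8):
                klen, vlen = struct.unpack(">II", header)
                yield f.read(klen).decode(), f.read(vlen)

    # pack(["a.txt", "b.txt"], "packed.seq"); list(unpack("packed.seq"))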