Lab 2: Running a Hadoop Application
Zubair Nabi
zubair.nabi@itu.edu.pk
April 18, 2013
Zubair Nabi 2: Running a Hadoop Application April 18, 2013 1 / 8
Running Hadoop
First, format the Hadoop Distributed File System (HDFS); this is needed only once, since reformatting erases the existing HDFS metadata
Jump to the Hadoop directory and execute: bin/hadoop namenode -format
To start Hadoop and HDFS: bin/start-all.sh
To stop them: bin/stop-all.sh
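After bin/start-all.sh returns, it can help to confirm that the daemons actually came up. The following is a minimal sketch, assuming the JDK's jps tool is on the PATH and that a single-node Hadoop 1.x setup runs the five daemons listed in the loop. The function name missing_daemons is hypothetical and not part of Hadoop.

```shell
# Print the Hadoop 1.x daemons that do NOT appear in a jps-style listing.
# Reads lines of the form "<pid> <ClassName>" on stdin.
# missing_daemons is an illustrative helper, not a Hadoop command.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the class name exactly, so "NameNode" does not match
        # "SecondaryNameNode".
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage (after bin/start-all.sh): jps | missing_daemons
```

An empty result suggests all five daemons are running; any name printed is a daemon whose log is worth checking.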
Generating a dataset
Create a temporary directory to hold the data: mkdir /tmp/gutenberg
Jump to it: cd /tmp/gutenberg
Download text files:
wget www.gutenberg.org/etext/20417
wget www.gutenberg.org/etext/5000
wget www.gutenberg.org/etext/4300
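Before copying anything into HDFS, a quick sanity check on the downloads can save a failed job later. The sketch below assumes only that the dataset is a directory of regular files; check_dataset is a hypothetical helper name, and the default path is the one used above.

```shell
# Report each file in a dataset directory with its line count, and fail
# if the directory holds no files or any file is empty.
# check_dataset is an illustrative helper name.
check_dataset() {
    dir=${1:-/tmp/gutenberg}
    status=0
    count=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        count=$((count + 1))
        if [ -s "$f" ]; then
            printf '%s\t%s lines\n' "$f" "$(wc -l < "$f" | tr -d ' ')"
        else
            printf '%s\tEMPTY\n' "$f"
            status=1
        fi
    done
    # An empty directory is treated as a failure too.
    [ "$count" -gt 0 ] || status=1
    return "$status"
}

# Usage: check_dataset /tmp/gutenberg
```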
Copying the dataset to the HDFS
Jump to the Hadoop directory and execute: bin/hadoop dfs -copyFromLocal /tmp/gutenberg /ccw/gutenberg
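The copy can be followed by a listing of the destination to confirm that the files arrived. This is a small sketch, assuming it is run from the Hadoop directory. The HADOOP variable and the upload_dataset function are hypothetical names introduced here; only dfs -copyFromLocal and dfs -ls are Hadoop commands.

```shell
# Path to the hadoop launcher; override it when not running from the
# Hadoop directory. HADOOP and upload_dataset are illustrative names.
HADOOP=${HADOOP:-bin/hadoop}

# Copy a local directory into HDFS, then list the destination to confirm
# that the files arrived. Stops early if the copy fails.
upload_dataset() {
    src=$1
    dest=$2
    "$HADOOP" dfs -copyFromLocal "$src" "$dest" || return 1
    "$HADOOP" dfs -ls "$dest"
}

# Usage: upload_dataset /tmp/gutenberg /ccw/gutenberg
```

Setting HADOOP=echo turns the function into a dry run that only prints the commands it would execute.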
Running Wordcount
Execute: bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /ccw/gutenberg /ccw/gutenberg-output
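To get a feel for what the job computes, the same counting can be approximated locally with standard Unix tools. This is a rough sketch rather than the Hadoop implementation: it assumes the bundled example splits its input on whitespace and counts tokens case-sensitively, and local_wordcount is a hypothetical name.

```shell
# Count whitespace-separated tokens on stdin and print "token<TAB>count",
# sorted by token in byte order. Runs in a subshell so the locale change
# stays local. local_wordcount is an illustrative name.
local_wordcount() (
    LC_ALL=C
    export LC_ALL
    tr -s '[:space:]' '\n' |
        grep -v '^$' |
        sort |
        uniq -c |
        awk '{ print $2 "\t" $1 }'
)

# Usage: cat /tmp/gutenberg/* | local_wordcount | head
```

On a small input this gives a reference to compare against the job's output; on large inputs the single-machine sort is exactly the step that MapReduce distributes.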
Retrieving results from the HDFS
Copy to the local file system: bin/hadoop dfs -getmerge /ccw/gutenberg-output /tmp/gutenberg-output
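Once the merged file is on the local disk, the most frequent words can be pulled out with sort. The sketch below assumes each line of the merged output has the form word, a tab, then the count; top_words is a hypothetical helper name.

```shell
# Print the N most frequent entries from a "word<TAB>count" file,
# highest count first. top_words is an illustrative helper name.
top_words() {
    file=$1
    n=${2:-10}
    tab=$(printf '\t')
    # Sort numerically on the second (count) column, descending.
    LC_ALL=C sort -t "$tab" -k2,2nr "$file" | head -n "$n"
}

# Usage: top_words /tmp/gutenberg-output 20
```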
Accessing the web interface
JobTracker: http://localhost:50030
TaskTracker: http://localhost:50060
Reference(s)
Running Hadoop on Ubuntu Linux (Single-Node Cluster): http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluste