
hadoop: tracking MapReduce tasks

Discussion created by soujanyabargavi on Jun 19, 2018
Latest reply on Jun 21, 2018 by jesse_amd

I'm new to Hadoop, and this is probably a stupid question, but I've been searching for hours and cannot find out how to do it.

I'm running Hadoop MapReduce jobs with different numbers of mappers and reducers to compare performance (e.g. execution time). I want to check whether the number of mappers/reducers I specified was actually used, but I can't figure out how to do that.

Hadoop 1.2.1 is installed on a quad-core machine with hyper-threading, which I access over SSH, and MindMajix Hadoop is running in pseudo-distributed mode.

My MapReduce program is written in Python, so I'm using hadoop-streaming. This is how I ran the job:

$ hadoop jar /Users/hadoop/hadoop-1.2.1/contrib/streaming/hadoop-streaming-1.2.1.jar -file /Users/hadoop/map.py -mapper /Users/hadoop/map.py -file /Users/hadoop/reduce.py -reducer /Users/hadoop/reduce.py -input file:///Users/hadoop/inputfile -output file:///Users/hadoop/outputfile
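The command above does not show where the mapper and reducer counts are set. As a rough sketch of one way to vary them between runs, the helper below builds the same command line with the counts passed as generic `-D` options. It assumes the Hadoop 1.x property names `mapred.map.tasks` and `mapred.reduce.tasks`; the function name `build_streaming_command` and its parameters are hypothetical and only for illustration.

```python
def build_streaming_command(streaming_jar, mapper, reducer,
                            input_path, output_path,
                            num_maps=None, num_reduces=None):
    """Return the argv list for a hadoop-streaming run (illustrative helper)."""
    cmd = ["hadoop", "jar", streaming_jar]

    # Generic -D options have to come before the streaming-specific options.
    # Property names assume Hadoop 1.x.
    if num_maps is not None:
        cmd += ["-D", "mapred.map.tasks=%d" % num_maps]
    if num_reduces is not None:
        cmd += ["-D", "mapred.reduce.tasks=%d" % num_reduces]

    cmd += ["-file", mapper, "-mapper", mapper,
            "-file", reducer, "-reducer", reducer,
            "-input", input_path, "-output", output_path]
    return cmd


if __name__ == "__main__":
    cmd = build_streaming_command(
        "/Users/hadoop/hadoop-1.2.1/contrib/streaming/hadoop-streaming-1.2.1.jar",
        "/Users/hadoop/map.py",
        "/Users/hadoop/reduce.py",
        "file:///Users/hadoop/inputfile",
        "file:///Users/hadoop/outputfile",
        num_maps=4,       # example value, only a hint to the framework
        num_reduces=2,    # example value
    )
    # Print the command for inspection; pass the list to subprocess to run it.
    print(" ".join(cmd))
```

Note that `mapred.map.tasks` is only a hint: the number of map tasks actually launched follows the number of input splits, whereas `mapred.reduce.tasks` sets the number of reduce tasks directly. That is part of why checking the launched task counts after the run matters.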

I want to see log information that looks like this, or anything that provides this kind of information.
