5.4.3.1 Running Job Analyzer Using the Perfect Balance API

This section first explains how to prepare your code using the API and then how to run Job Analyzer.

Before You Start:

Before running Job Analyzer, invoke Balancer in your application code. Make the following updates to your code, and then recompile:
  • Import the Balancer class.

  • After the job finishes, optionally call Balancer.save().

    If Balancer ran, this optional method saves the partition file report into the _balancer subdirectory of the job output directory. It also writes a Job Analyzer report.

For example:

...
import oracle.hadoop.balancer.Balancer;
...
<Configure your job>
...
job.waitForCompletion(true);
Balancer.save(job);
...

After compiling the modified application, follow these steps to generate the Job Analyzer report:

  1. Log in to the server where you will submit the job that uses Perfect Balance.
  2. Set up Perfect Balance by taking the steps in "Getting Started with Perfect Balance."
  3. Run the job.
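
In Example 5-2, setting up Perfect Balance includes adding its JAR files to HADOOP_CLASSPATH. The following is a minimal sketch, assuming a POSIX shell, of a check that reports classpath entries that do not exist on the local file system. The function name check_classpath is hypothetical and is not part of Perfect Balance or Hadoop.

```shell
# Hypothetical helper (not part of Perfect Balance or Hadoop):
# print each entry of a colon-separated classpath that does not exist
# locally, and return a nonzero status if any entry is missing.
check_classpath() {
    rest=$1
    missing=0
    while [ -n "$rest" ]; do
        entry=${rest%%:*}                  # text before the first colon
        case $rest in
            *:*) rest=${rest#*:} ;;        # drop the entry just read
            *)   rest= ;;                  # that was the last entry
        esac
        [ -z "$entry" ] && continue        # skip empty entries (stray colons)
        case $entry in
            *'/*') path=${entry%/*} ;;     # wildcard entry: check its directory
            *)     path=$entry ;;
        esac
        if [ ! -e "$path" ]; then
            echo "missing classpath entry: $entry" >&2
            missing=1
        fi
    done
    return $missing
}
```

For example, running check_classpath "$HADOOP_CLASSPATH" before step 3 prints any missing entries to standard error and returns a nonzero status.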

The example below runs a script that does the following:

  • Sets the required variables.

  • Uses Perfect Balance to run a job with Job Analyzer (and without load balancing).

  • Creates the report in the default location.

  • Copies the HTML version of the report from HDFS to the /home/jdoe local directory.

  • Opens the report in a browser.

The output includes warnings, which you can ignore.

Example 5-2 Running Job Analyzer with Perfect Balance

$ cat ja_nobalance.sh
 
# set up perfect balance
BALANCER_HOME=/opt/oracle/orabalancer-<version>-h2
export HADOOP_CLASSPATH=${BALANCER_HOME}/jlib/orabalancer-<version>.jar:${BALANCER_HOME}/jlib/commons-math-2.2.jar:${HADOOP_CLASSPATH} 

# run the job
hadoop jar application_jarfile.jar ApplicationClass \
 -D application_config_property \
 -D mapreduce.input.fileinputformat.inputdir=jdoe_application/input \
 -D mapreduce.output.fileoutputformat.outputdir=jdoe_nobal_outdir \
 -D mapreduce.job.name=nobal \
 -D mapreduce.job.reduces=10 \
 -conf application_config_file.xml 

$ sh ja_nobalance.sh
14/04/14 14:52:42 INFO input.FileInputFormat: Total input paths to process : 5
14/04/14 14:52:42 INFO mapreduce.JobSubmitter: number of splits:5
14/04/14 14:52:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1397066986369_3478
14/04/14 14:52:43 INFO impl.YarnClientImpl: Submitted application application_1397066986369_3478
     .
     .
     .
File Input Format Counters 
Bytes Read=112652976
File Output Format Counters 
Bytes Written=384974202
 
$ hadoop fs -get jdoe_nobal_outdir/_balancer/jobanalyzer-report.html /home/jdoe
$ cd /home/jdoe
$ firefox jobanalyzer-report.html
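
Because Balancer.save() is optional, the report might not be present in the job output directory. As a minimal sketch, the copy step above could first check that the report exists. The function name fetch_ja_report is hypothetical. The hadoop fs -test -e and hadoop fs -get commands are standard Hadoop file system shell commands, and the report path is the one used in Example 5-2.

```shell
# Hypothetical helper: copy the Job Analyzer HTML report from a job output
# directory in HDFS to a local directory, but only if the report exists.
fetch_ja_report() {
    outdir=$1       # job output directory in HDFS, for example jdoe_nobal_outdir
    localdir=$2     # local destination directory, for example /home/jdoe
    report=$outdir/_balancer/jobanalyzer-report.html

    # "hadoop fs -test -e" exits with status 0 when the path exists
    if hadoop fs -test -e "$report"; then
        hadoop fs -get "$report" "$localdir"
    else
        echo "report not found: $report" >&2
        return 1
    fi
}

# Usage, with the values from Example 5-2:
# fetch_ja_report jdoe_nobal_outdir /home/jdoe
```

The function returns a nonzero status when the report is missing, so a calling script can stop before it tries to open the file in a browser.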