5.4.3.2 Collecting Additional Metrics

The Job Analyzer report includes the load metrics for each key if you call the oracle.hadoop.balancer.Balancer.configureCountingReducer() method before submitting the job.

This additional information provides a more detailed picture of the load for each reducer, with metrics that are not available in the standard Hadoop counters.


The Job Analyzer report also compares its predicted load with the actual load. The difference between these values measures how effective Perfect Balance was in balancing the job.

Job Analyzer might recommend key load coefficients for the Perfect Balance key load model, based on its analysis of the job load. To use these recommended coefficients when running a job with Perfect Balance, set the oracle.hadoop.balancer.linearKeyLoad.feedbackDir property to the directory containing the Job Analyzer report of a previously analyzed run of the job.

If the report contains recommended coefficients, then Perfect Balance automatically uses them. If Job Analyzer encounters an error while collecting the additional metrics, then the report does not contain the additional metrics.

Use the feedbackDir property when you do not know the values of the load model coefficients for a job, but you have the Job Analyzer output from a previous run of the job. Set feedbackDir to the directory where that output is stored. The values recommended in those files typically perform better than the Perfect Balance default values, because they are based on an analysis of your job's load.
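
One way to supply the feedbackDir property is as a Hadoop generic option (-D property=value) on the job command line. The following is a minimal sketch of building those arguments, not part of Perfect Balance itself. It assumes that your job driver parses generic options (for example, through ToolRunner). The class name FeedbackDirArgs, the method toGenericOptions, and the report directory path are hypothetical and used for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FeedbackDirArgs {
    // Property name as given in this section.
    static final String FEEDBACK_DIR_PROPERTY =
        "oracle.hadoop.balancer.linearKeyLoad.feedbackDir";

    // Converts property/value pairs into Hadoop generic options ("-D", "name=value").
    // Assumes the job driver accepts generic options on its command line.
    static List<String> toGenericOptions(Map<String, String> properties) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            args.add("-D");
            args.add(entry.getKey() + "=" + entry.getValue());
        }
        return args;
    }

    public static void main(String[] unused) {
        Map<String, String> properties = new LinkedHashMap<>();
        // Hypothetical directory that holds the Job Analyzer report of a previous run.
        properties.put(FEEDBACK_DIR_PROPERTY, "/user/example/jobanalyzer-output");
        // Prints the arguments to add to the job command line.
        System.out.println(String.join(" ", toGenericOptions(properties)));
    }
}
```

If your driver does not parse generic options, you can set the same property name and value in the job configuration instead.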

Alternatively, if you already know good values of the load model coefficients for your job, you can set the load model properties directly.

Running the job with these coefficients results in a more balanced job.