5.6.1 Modifying Your Java Code to Use Perfect Balance

The Perfect Balance installation directory contains a complete example, including input data, of a Java MapReduce program that uses the Perfect Balance API.

For a description of the inverted index example and execution instructions, see orabalancer-<version>-h2/examples/invindx/README.txt.

To explore the modified Java code, see orabalancer-<version>-h2/examples/jsrc/oracle/hadoop/balancer/examples/invindx/InvertedIndexMapred.java or InvertedIndexMapreduce.java.

The modifications to run Perfect Balance include the following:

  • The createBalancer method validates the configuration properties and returns a Balancer instance.

  • The waitForCompletion method samples the data and creates a partitioning plan.

  • The addBalancingPlan method adds the partitioning plan to the job configuration settings.

  • The configureCountingReducer method collects additional load statistics.

  • The save method saves the partition report and generates the Job Analyzer report.

Example 5-3 shows fragments from the inverted index Java code.

Example 5-3 Running Perfect Balance in a MapReduce Job

     .
     .
     .
import oracle.hadoop.balancer.Balancer;
     .
     .
     .
///// BEGIN: CODE TO INVOKE BALANCER (PART-1, before job submission) //////
    Configuration conf = job.getConfiguration();
    
    Balancer balancer = null;
    
    boolean useBalancer =
        conf.getBoolean("oracle.hadoop.balancer.driver.balance", true);
    if(useBalancer)
    {
      balancer = Balancer.createBalancer(conf);
      balancer.waitForCompletion();
      balancer.addBalancingPlan(conf);
    }
    
    if(conf.getBoolean("oracle.hadoop.balancer.tools.useCountingReducer", true))
    {
      Balancer.configureCountingReducer(conf);
    }
    ////////////// END: CODE TO INVOKE BALANCER (PART-1) //////////////////////
    
    boolean isSuccess = job.waitForCompletion(true);
    
    ///////////////////////////////////////////////////////////////////////////
    // BEGIN: CODE TO INVOKE BALANCER (PART-2, after job completion, optional)
    // If balancer ran, this saves the partition file report into the _balancer
    // sub-directory of the job output directory. It also writes a JobAnalyzer
    // report.
    Balancer.save(job);
    ////////////// END: CODE TO INVOKE BALANCER (PART-2) //////////////////////
     .
     .
     .
}