public class Partitioning extends MultipleInputsJob
setSamplingRatio(double) property (by default it is set to 0.1).

| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | PARTITION_RESULT_FILE |

| Constructor and Description |
|---|
| Partitioning() |

| Modifier and Type | Method and Description |
|---|---|
| void | configure(JobConf jobConf)<br>Validates and adds the current parameters to the job configuration |
| java.lang.String | getCmdOptions()<br>Gets a description of the arguments expected from command line. |
| java.util.Map<java.lang.String,java.lang.Object> | getCurrentCmdArgs(Configuration conf)<br>Returns the current driver properties in a map where each key-value is a name and value of a command line argument. |
| Path | getPartitionsPath() |
| double | getSamplingRatio()<br>Gets the ratio of the sample size to the input data size |
| static void | main(java.lang.String[] args) |
| void | processArgs(java.lang.String[] args, Configuration conf)<br>Extracts and validates arguments from the command line |
| int | run(java.lang.String[] args) |
| boolean | runFullPartitioningProcess(JobConf jobConf)<br>Runs the full partitioning process. |
| void | setSamplingRatio(double samplingRatio)<br>Sets the ratio of the sample size to the input data size so only a fraction of the whole input data is used for partitioning. |
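
The sampling ratio described for getSamplingRatio() and setSamplingRatio(double) can be illustrated with a small, self-contained sketch. It does not use or reproduce the Partitioning class: `SamplingRatioSketch`, `expectedSampleSize`, and `sample` are hypothetical names, the accepted range (0, 1] is an assumption, and the per-record random selection is only one plausible way to take a fraction of the input. Only the 0.1 default and the meaning of the ratio (sample size divided by input size) come from the documentation above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; NOT the Partitioning class or its implementation.
public class SamplingRatioSketch {

    // Default taken from the documentation above (0.1).
    private double samplingRatio = 0.1;

    public double getSamplingRatio() {
        return samplingRatio;
    }

    public void setSamplingRatio(double samplingRatio) {
        // Assumption: a ratio must lie in (0, 1]; the real class may validate differently.
        if (!(samplingRatio > 0.0 && samplingRatio <= 1.0)) {
            throw new IllegalArgumentException("samplingRatio must be in (0, 1]");
        }
        this.samplingRatio = samplingRatio;
    }

    // Ratio = sample size / input size, so the expected sample size is input size * ratio.
    public static long expectedSampleSize(long inputSize, double ratio) {
        return Math.round(inputSize * ratio);
    }

    // Keeps each record independently with probability samplingRatio.
    public <T> List<T> sample(List<T> records, long seed) {
        Random random = new Random(seed);
        List<T> kept = new ArrayList<>();
        for (T record : records) {
            if (random.nextDouble() < samplingRatio) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        SamplingRatioSketch sketch = new SamplingRatioSketch();
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            input.add(i);
        }
        System.out.println("expected sample size: "
                + expectedSampleSize(input.size(), sketch.getSamplingRatio()));
        System.out.println("actual sample size: " + sketch.sample(input, 42L).size());
    }
}
```

With a ratio of 0.1 and 1,000 input records, the expected sample holds about 100 records; the actual count varies with the random seed. A smaller ratio reduces the work done on the sample at the cost of a less representative view of the input.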

addInputDataSet, getInputs, getMultipleInputDataSetsParams, removeInputDataSet, setInputDataSets

getCmdOptionsWithInputDataSets, getCurrentCmdArgsAsString, getInput, getInputDataSet, getInputFormatClass, getJarClass, getOutput, getRecordInfoProviderClass, getSpatialConfig, setInput, setInputDataSet, setInputFormatClass, setJarClass, setOutput, setRecordInfoProviderClass, setSpatialConfig

public static final java.lang.String PARTITION_RESULT_FILE

public void configure(JobConf jobConf)
               throws java.lang.Exception
Description copied from class: BaseJob

public java.lang.String getCmdOptions()
Description copied from class: BaseJob
getCmdOptions in class BaseJob<java.lang.Object,java.lang.Object>

public java.util.Map<java.lang.String,java.lang.Object> getCurrentCmdArgs(Configuration conf)
Description copied from class: BaseJob
getCurrentCmdArgs in class BaseJob<java.lang.Object,java.lang.Object>
Parameters:
conf - a job configuration

public Path getPartitionsPath()

public double getSamplingRatio()

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

public void processArgs(java.lang.String[] args,
                        Configuration conf)
                 throws java.lang.Exception
Description copied from class: BaseJob
processArgs in class MultipleInputsJob
Parameters:
args - arguments from the command line
conf - the job configuration
Throws:
java.lang.Exception

public int run(java.lang.String[] args)
        throws java.lang.Exception
Throws:
java.lang.Exception

public boolean runFullPartitioningProcess(JobConf jobConf)
                                   throws java.lang.Exception
Parameters:
jobConf -
Throws:
java.lang.Exception

public void setSamplingRatio(double samplingRatio)
Parameters:
samplingRatio -

Copyright © 2017 Oracle and/or its affiliates. All Rights Reserved.