SpatialJoin (Oracle Big Data Spatial and Graph Vector Analysis Java API Reference)

java.lang.Object
- org.apache.hadoop.conf.Configured
- - oracle.spatial.hadoop.vector.mapred.job.BaseJob<java.lang.Object,java.lang.Object>
  - - oracle.spatial.hadoop.vector.mapred.job.MultipleInputsJob
    - - oracle.spatial.hadoop.vector.mapred.job.SpatialJoin

All Implemented Interfaces:

org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

Direct Known Subclasses:

SpatialJoin
```
public class SpatialJoin
extends MultipleInputsJob
```
This class can be used to run or configure a job that joins pairs of records from different data sets based on a spatial interaction between the records.
The spatial join relies on a partitioning result file, as generated by Partitioning. A spatial join can be executed as follows:
1. Create a SpatialJoin instance.
2. Set the input parameters, such as the input data sets, the output path and, optionally, a partitioning result file path.
3. Call preprocess(JobConf).
4. Call configure(JobConf).
5. Run the job.
Note that if a partitioning result file path is not set, partitioning is performed when calling preprocess(JobConf).
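
The call order in steps 3 to 5 above can be sketched without the Oracle or Hadoop jars on the classpath. In this minimal sketch, `JoinSteps` and `JobSubmitter` are hypothetical stand-ins and not part of the API: `JoinSteps` mirrors only the documented `preprocess` and `configure` signatures, and the type parameter `C` takes the place of `JobConf`. Stopping when `preprocess` returns false is an assumption based on its documented return value ("true if there were no errors").

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring the two SpatialJoin methods named in steps 3 and 4. */
interface JoinSteps<C> {
    boolean preprocess(C jobConf) throws Exception; // documented to return true if there were no errors
    void configure(C jobConf) throws Exception;
}

/** Hypothetical callback that submits the configured job (step 5). */
interface JobSubmitter<C> {
    void submit(C jobConf) throws Exception;
}

final class SpatialJoinWorkflowSketch {

    /**
     * Runs preprocess, configure and submit in that order.
     * Returns false, without configuring or submitting, if preprocess reports errors (assumption).
     */
    static <C> boolean runJoin(JoinSteps<C> join, C jobConf, JobSubmitter<C> submitter) throws Exception {
        if (!join.preprocess(jobConf)) {
            return false;
        }
        join.configure(jobConf);
        submitter.submit(jobConf);
        return true;
    }

    public static void main(String[] args) throws Exception {
        // A fake job that only records which steps were called.
        List<String> calls = new ArrayList<>();
        JoinSteps<String> fakeJoin = new JoinSteps<String>() {
            public boolean preprocess(String conf) { calls.add("preprocess"); return true; }
            public void configure(String conf) { calls.add("configure"); }
        };
        boolean ok = runJoin(fakeJoin, "conf", conf -> calls.add("run"));
        System.out.println(ok + " " + calls);
    }
}
```

With the real classes, a thin adapter that delegates to a SpatialJoin instance could implement `JoinSteps<JobConf>`. How the job is submitted in step 5 is left to the caller.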

Field Summary

Fields
Modifier and Type	Field and Description
`protected static java.lang.String`	`JOIN_FOLDER`
`protected static java.lang.String`	`PARTITIONING_FOLDER`
`protected org.apache.hadoop.fs.Path`	`partitioningResultPath`
`protected double`	`samplingRatio`
`protected SpatialOperationConfig`	`spatialOperationConfig`

Fields inherited from class oracle.spatial.hadoop.vector.mapred.job.MultipleInputsJob
inputDataSets, miConf

Fields inherited from class oracle.spatial.hadoop.vector.mapred.job.BaseJob
argsp, inputDataSet, jarClass, jobRegEntryPath, proxyIDS

Constructor Summary

Constructors
Constructor and Description
`SpatialJoin()`

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`configure(org.apache.hadoop.mapred.JobConf jobConf)` Validates and adds the current parameters to the job configuration
`protected void`	`defineGlobaBounds()`
`java.lang.String`	`getCmdOptions()` Gets a description of the arguments expected from command line.
`java.util.Map<java.lang.String,java.lang.Object>`	`getCurrentCmdArgs(org.apache.hadoop.conf.Configuration conf)` Returns the current driver properties in a map where each key-value pair is the name and value of a command line argument.
`protected InputDataSetCmdArgsParserHandler`	`getInputDataSetCmdParserHandler(org.apache.hadoop.conf.Configuration conf)` Gets the current instance of `InputDataSetCmdArgsParserHandler` used to parse command line parameters for the input data set
`protected InputDataSetConfiguratorHandler`	`getInputDataSetConfiguratorHandler(org.apache.hadoop.conf.Configuration conf)` Returns the current instance of `InputDataSetConfiguratorHandler` used to configure the input data set
`java.lang.String`	`getOutput()` Gets the job output path
`org.apache.hadoop.fs.Path`	`getPartitioningResultPath()` Gets the location of a previously generated partitioning result file for the input data sets
`protected java.lang.String`	`getRootOutput()`
`double`	`getSamplingRatio()` Gets the ratio of the sample size to the input data size used to sample when a partitioning result file is not set
`SpatialOperationConfig`	`getSpatialOperationConfig()` Gets the spatial operation configuration used to perform the spatial join
`protected boolean`	`isPartitioningRequired(org.apache.hadoop.conf.Configuration conf)`
`static void`	`main(java.lang.String[] args)`
`boolean`	`preprocess(org.apache.hadoop.mapred.JobConf jobConf)` Checks whether partitioning is required and if so, it runs the partitioning process.
`void`	`processArgs(java.lang.String[] args, org.apache.hadoop.conf.Configuration conf)` Extracts and validates arguments from the command line
`int`	`run(java.lang.String[] args)`
`void`	`setPartitioningResultPath(org.apache.hadoop.fs.Path partitioningResultPath)` Sets the location of a previously generated partitioning result file for the input data sets
`void`	`setSamplingRatio(double samplingRatio)` Sets the ratio of the sample size to the input data size used to sample when a partitioning result file is not set
`void`	`setSpatialOperationConfig(SpatialOperationConfig spatialOperationConfig)` Sets the spatial operation configuration used to perform the spatial join
`protected void`	`setupPartitioningResult(org.apache.hadoop.fs.Path partitioningResultPath, org.apache.hadoop.conf.Configuration conf)`

Methods inherited from class oracle.spatial.hadoop.vector.mapred.job.MultipleInputsJob
addInputDataSet, asMultiInputDataSet, configureInputs, configureInputs, getInputListCmdOptions, getInputs, getMultipleInputDataSetsParams, removeInputDataSet, setInputDataSets, updateInputDataSet

Methods inherited from class oracle.spatial.hadoop.vector.mapred.job.BaseJob
addJobRegistryEntry, addJobRegistryEntry, addJobRegistryEntry, configure, createJob, createJob, createJob, createJob, createJobConf, createJobConf, createJobConf, getCmdOptionsWithInputDataSets, getCmdOptionsWithInputDataSets, getCurrentCmdArgsAsString, getCurrentCmdArgsAsString, getInput, getInputDataSet, getInputFormatClass, getJarClass, getRecordInfoProviderClass, getSpatialConfig, runJob, runJob, setInput, setInputDataSet, setInputFormatClass, setJarClass, setOutput, setRecordInfoProviderClass, setSpatialConfig

Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf

- Field Detail
  - PARTITIONING_FOLDER
```
protected static final java.lang.String PARTITIONING_FOLDER
```
    See Also:
    
    Constant Field Values
  - JOIN_FOLDER
```
protected static final java.lang.String JOIN_FOLDER
```
    See Also:
    
    Constant Field Values
  - partitioningResultPath
```
protected org.apache.hadoop.fs.Path partitioningResultPath
```
  - spatialOperationConfig
```
protected SpatialOperationConfig spatialOperationConfig
```
  - samplingRatio
```
protected double samplingRatio
```
- Constructor Detail
  - SpatialJoin
```
public SpatialJoin()
```
- Method Detail
  - getInputDataSetConfiguratorHandler
```
protected InputDataSetConfiguratorHandler getInputDataSetConfiguratorHandler(org.apache.hadoop.conf.Configuration conf)
```
    Description copied from class: BaseJob
    
    Returns the current instance of InputDataSetConfiguratorHandler used to configure the input data set
    
    Specified by:
    
    getInputDataSetConfiguratorHandler in class BaseJob<java.lang.Object,java.lang.Object>
    
    Parameters:
    
    conf - a job configuration
    
    Returns:
    
    an instance of InputDataSetConfiguratorHandler
  - getInputDataSetCmdParserHandler
```
protected InputDataSetCmdArgsParserHandler getInputDataSetCmdParserHandler(org.apache.hadoop.conf.Configuration conf)
```
    Description copied from class: BaseJob
    
    Gets the current instance of InputDataSetCmdArgsParserHandler used to parse command line parameters for the input data set
    
    Specified by:
    
    getInputDataSetCmdParserHandler in class BaseJob<java.lang.Object,java.lang.Object>
    
    Parameters:
    
    conf - a job configuration
    
    Returns:
    
    an instance of InputDataSetCmdArgsParserHandler
  - getSamplingRatio
```
public double getSamplingRatio()
```
    Gets the ratio of the sample size to the input data size used to sample when a partitioning result file is not set
    
    Returns:
  - setSamplingRatio
```
public void setSamplingRatio(double samplingRatio)
```
    Sets the ratio of the sample size to the input data size used to sample when a partitioning result file is not set
    
    Parameters:
    
    samplingRatio -
  - getPartitioningResultPath
```
public org.apache.hadoop.fs.Path getPartitioningResultPath()
```
    Gets the location of a previously generated partitioning result file for the input data sets
    
    Returns:
  - setPartitioningResultPath
```
public void setPartitioningResultPath(org.apache.hadoop.fs.Path partitioningResultPath)
```
    Sets the location of a previously generated partitioning result file for the input data sets
    
    Parameters:
    
    partitioningResultPath -
  - getSpatialOperationConfig
```
public SpatialOperationConfig getSpatialOperationConfig()
```
    Gets the spatial operation configuration used to perform the spatial join
    
    Returns:
  - setSpatialOperationConfig
```
public void setSpatialOperationConfig(SpatialOperationConfig spatialOperationConfig)
```
    Sets the spatial operation configuration used to perform the spatial join
    
    Parameters:
    
    spatialOperationConfig -
  - getOutput
```
public java.lang.String getOutput()
```
    Description copied from class: BaseJob
    
    Gets the job output path
    
    Overrides:
    
    getOutput in class BaseJob<java.lang.Object,java.lang.Object>
    
    Returns:
    
    a path
  - getRootOutput
```
protected java.lang.String getRootOutput()
```
  - processArgs
```
public void processArgs(java.lang.String[] args,
                        org.apache.hadoop.conf.Configuration conf)
                 throws java.lang.Exception
```
    Description copied from class: BaseJob
    
    Extracts and validates arguments from the command line
    
    Overrides:
    
    processArgs in class MultipleInputsJob
    
    Parameters:
    
    args - arguments from the command line
    
    conf - the job configuration
    
    Throws:
    
    java.lang.Exception
  - getCurrentCmdArgs
```
public java.util.Map<java.lang.String,java.lang.Object> getCurrentCmdArgs(org.apache.hadoop.conf.Configuration conf)
```
    Description copied from class: BaseJob
    
    Returns the current driver properties in a map where each key-value pair is the name and value of a command line argument. Printing this information shows how to execute a similar job from the command line.
    
    Overrides:
    
    getCurrentCmdArgs in class BaseJob<java.lang.Object,java.lang.Object>
    
    Parameters:
    
    conf - a job configuration
    
    Returns:
    
    a map whose entries are of type <String,String> or, for multiple input data sets, <String, Map<String, Object>>. For nested maps, the entry types are the same as those of the enclosing map
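
The nested shape of the returned map can be walked recursively. The following is a minimal sketch that renders such a map as indented text. `CmdArgsPrinter` is a hypothetical helper, the argument names in `main` are made up for illustration, and the output layout is an assumption rather than the format the driver itself uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CmdArgsPrinter {

    /** Renders plain entries as "name=value" and nested maps as "name:" followed by indented entries. */
    static String render(Map<String, Object> args) {
        StringBuilder out = new StringBuilder();
        append(args, 0, out);
        return out.toString();
    }

    private static void append(Map<String, Object> args, int depth, StringBuilder out) {
        // Two spaces of indentation per nesting level.
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        for (Map.Entry<String, Object> entry : args.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Nested map: same entry types as the enclosing map, so recurse.
                out.append(indent).append(entry.getKey()).append(":\n");
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) value;
                append(nested, depth + 1, out);
            } else {
                out.append(indent).append(entry.getKey()).append('=').append(value).append('\n');
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical argument names, chosen only to show the nesting.
        Map<String, Object> firstInput = new LinkedHashMap<>();
        firstInput.put("exampleInputArg", "/data/a");
        Map<String, Object> cmdArgs = new LinkedHashMap<>();
        cmdArgs.put("exampleOutputArg", "/out");
        cmdArgs.put("exampleInput1", firstInput);
        System.out.print(render(cmdArgs));
    }
}
```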
  - getCmdOptions
```
public java.lang.String getCmdOptions()
```
    Description copied from class: BaseJob
    
    Gets a description of the arguments expected from command line.
    
    Overrides:
    
    getCmdOptions in class BaseJob<java.lang.Object,java.lang.Object>
    
    Returns:
    
    a text description
  - configure
```
public void configure(org.apache.hadoop.mapred.JobConf jobConf)
               throws java.lang.Exception
```
    Description copied from class: BaseJob
    
    Validates and adds the current parameters to the job configuration
    
    Overrides:
    
    configure in class BaseJob<java.lang.Object,java.lang.Object>
    
    Parameters:
    
    jobConf - the job configuration
    
    Throws:
    
    java.lang.Exception
  - setupPartitioningResult
```
protected void setupPartitioningResult(org.apache.hadoop.fs.Path partitioningResultPath,
                                       org.apache.hadoop.conf.Configuration conf)
```
  - isPartitioningRequired
```
protected boolean isPartitioningRequired(org.apache.hadoop.conf.Configuration conf)
                                  throws java.io.IOException
```
    Throws:
    
    java.io.IOException
  - preprocess
```
public boolean preprocess(org.apache.hadoop.mapred.JobConf jobConf)
                   throws java.lang.Exception
```
    Checks whether partitioning is required and if so, it runs the partitioning process.
    
    Parameters:
    
    jobConf - the job configuration
    
    Returns:
    
    true if there were no errors
    
    Throws:
    
    java.lang.Exception
  - defineGlobaBounds
```
protected void defineGlobaBounds()
```
  - run
```
public int run(java.lang.String[] args)
        throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
  - main
```
public static void main(java.lang.String[] args)
                 throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
