K - The type of the record's key. Same as the key type used by internalInputFormat.
V - The type of the record's value. Same as the value type used by internalInputFormat.
public abstract class CompositeInputFormat<K,V> extends org.apache.hadoop.mapred.FileInputFormat<K,V> implements org.apache.hadoop.mapred.JobConfigurable, AbstractCompositeInputFormat<K,V>
Modifier and Type | Field and Description |
---|---|
protected org.apache.hadoop.mapred.InputFormat<K,V> | iInputFormat |
Constructor and Description |
---|
CompositeInputFormat() |
Modifier and Type | Method and Description |
---|---|
void | configure(org.apache.hadoop.mapred.JobConf conf) |
abstract org.apache.hadoop.mapred.RecordReader<K,V> | getDelegateRecordReader(org.apache.hadoop.mapred.FileSplit split, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter) - Creates a record reader instance. |
static org.apache.hadoop.mapred.InputSplit | getFittingInputSplit(org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.InputFormat<?,?> inputFormat, org.apache.hadoop.mapred.InputSplit split) - Returns an instance of an InputSplit subclass appropriate for the given InputFormat. |
org.apache.hadoop.mapred.InputFormat<K,V> | getInternalInputFormat(org.apache.hadoop.mapred.JobConf jobConf) - Gets an instance of the class set as the internal input format. |
static <K,V> java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat<K,V>> | getInternalInputFormatClass(org.apache.hadoop.mapred.JobConf jobConf) - Gets the internal input format type which is used to actually read the data. |
static <K,V> RecordInfoProvider<K,V> | getRecordInfoProvider(RecordInfoProvider<K,V> provider, org.apache.hadoop.mapred.JobConf conf) - Gets an instance of the specified RecordInfoProvider implementation. |
static <K,V> java.lang.Class<? extends RecordInfoProvider<K,V>> | getRecordInfoProviderClass(org.apache.hadoop.mapred.JobConf jobConf) - Gets the class of the RecordInfoProvider implementation. |
org.apache.hadoop.mapred.RecordReader<K,V> | getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter) |
org.apache.hadoop.mapred.InputSplit[] | getSplits(org.apache.hadoop.mapred.JobConf conf, int splitCount) |
static <K,V> void | setInternalInputFormatClass(org.apache.hadoop.mapred.JobConf jobConf, java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat<K,V>> iInputFormat) - Sets the internal input format type which is used to actually read the data. |
static <K,V> void | setRecordInfoProviderClass(org.apache.hadoop.mapred.JobConf jobConf, java.lang.Class<? extends RecordInfoProvider<K,V>> provider) - Sets the class of the RecordInfoProvider implementation. |
public static <K,V> void setInternalInputFormatClass(org.apache.hadoop.mapred.JobConf jobConf, java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat<K,V>> iInputFormat)
Parameters:
jobConf - the job configuration
iInputFormat - a class which is a subclass of FileInputFormat or CombineFileInputFormat
public static <K,V> java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat<K,V>> getInternalInputFormatClass(org.apache.hadoop.mapred.JobConf jobConf)
Parameters:
jobConf - the job configuration
public org.apache.hadoop.mapred.InputFormat<K,V> getInternalInputFormat(org.apache.hadoop.mapred.JobConf jobConf)
Specified by:
getInternalInputFormat in interface AbstractCompositeInputFormat<K,V>
Parameters:
jobConf - the job configuration
public static <K,V> void setRecordInfoProviderClass(org.apache.hadoop.mapred.JobConf jobConf, java.lang.Class<? extends RecordInfoProvider<K,V>> provider)
Parameters:
jobConf - the job configuration
provider - a class that extends RecordInfoProvider
public static <K,V> java.lang.Class<? extends RecordInfoProvider<K,V>> getRecordInfoProviderClass(org.apache.hadoop.mapred.JobConf jobConf)
Parameters:
jobConf - the job configuration
public static <K,V> RecordInfoProvider<K,V> getRecordInfoProvider(RecordInfoProvider<K,V> provider, org.apache.hadoop.mapred.JobConf conf)
Parameters:
provider - if null, a new instance is returned; otherwise, the same instance is returned. This spares the caller from checking whether the instance is null.
conf - the job configuration
public void configure(org.apache.hadoop.mapred.JobConf conf)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
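The static accessors above follow a common pattern: a class is recorded in the job configuration, read back later, and instantiated only when the caller does not already hold an instance (the null handling described for the provider parameter of getRecordInfoProvider). The following is a minimal sketch of that pattern using only the JDK. Every name in it is a hypothetical stand-in: ProviderSketch replaces RecordInfoProvider, a plain Map replaces JobConf, and the configuration key is invented. It does not reproduce the actual implementation, and the exception thrown when no class is configured is an assumption, since that case is not documented here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RecordInfoProvider; the real interface is not shown on this page.
interface ProviderSketch {
    String describe();
}

// Hypothetical implementation, used only to have something to instantiate.
class DefaultProviderSketch implements ProviderSketch {
    @Override
    public String describe() {
        return "default";
    }
}

class ProviderLookupSketch {
    // Assumed configuration key; the key used by the real class is not documented here.
    static final String PROVIDER_KEY = "sketch.recordinfoprovider.class";

    // Stores the implementation class name in the configuration (a Map stands in for JobConf).
    static void setProviderClass(Map<String, String> conf, Class<? extends ProviderSketch> cls) {
        conf.put(PROVIDER_KEY, cls.getName());
    }

    // Reads the configured class back, or returns null when none was set.
    static Class<? extends ProviderSketch> getProviderClass(Map<String, String> conf) {
        String name = conf.get(PROVIDER_KEY);
        if (name == null) {
            return null;
        }
        try {
            return Class.forName(name).asSubclass(ProviderSketch.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Configured provider class not found: " + name, e);
        }
    }

    // Returns the given instance when it is non-null; otherwise creates one from the configuration.
    static ProviderSketch getProvider(ProviderSketch provider, Map<String, String> conf) {
        if (provider != null) {
            return provider;
        }
        Class<? extends ProviderSketch> cls = getProviderClass(conf);
        if (cls == null) {
            // Assumption: the behavior of the real method in this case is not documented.
            throw new IllegalStateException("No provider class configured");
        }
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setProviderClass(conf, DefaultProviderSketch.class);
        ProviderSketch created = getProvider(null, conf);   // null in: a new instance is created
        ProviderSketch reused = getProvider(created, conf); // instance in: the same instance comes back
        System.out.println(created == reused);              // prints true
    }
}
```

With this shape, a caller can keep a possibly-null field and refresh it with a single call such as `provider = getProvider(provider, conf)`, without a separate null check.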
public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf conf, int splitCount) throws java.io.IOException
public org.apache.hadoop.mapred.RecordReader<K,V> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter) throws java.io.IOException
public abstract org.apache.hadoop.mapred.RecordReader<K,V> getDelegateRecordReader(org.apache.hadoop.mapred.FileSplit split, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter) throws java.io.IOException
Parameters:
split - a FileSplit instance
conf - the job configuration
reporter - a Reporter instance
Throws:
java.io.IOException
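Because getDelegateRecordReader is abstract while getRecordReader is concrete, the class appears to follow a template-method arrangement in which subclasses supply the reader and the base class exposes it to the framework. The sketch below illustrates only that general arrangement with JDK types. All names are hypothetical stand-ins (SplitSketch for a file split, Iterator for RecordReader), and the assumption that the concrete method forwards to the abstract one is not confirmed by this page.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a file split: simply a list of text records.
class SplitSketch {
    final List<String> lines;

    SplitSketch(List<String> lines) {
        this.lines = lines;
    }
}

// Hypothetical base class mirroring the abstract/concrete method pair.
abstract class CompositeFormatSketch<V> {
    // Subclasses decide how records are actually read from a split.
    abstract Iterator<V> getDelegateReader(SplitSketch split);

    // Entry point for callers; assumed here to validate and then forward to the delegate.
    Iterator<V> getReader(SplitSketch split) {
        if (split == null) {
            throw new IllegalArgumentException("split must not be null");
        }
        return getDelegateReader(split);
    }
}

// Example subclass: reads each line and upper-cases it.
class UpperCaseFormatSketch extends CompositeFormatSketch<String> {
    @Override
    Iterator<String> getDelegateReader(SplitSketch split) {
        final Iterator<String> source = split.lines.iterator();
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public String next() {
                return source.next().toUpperCase(Locale.ROOT);
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> reader =
                new UpperCaseFormatSketch().getReader(new SplitSketch(Arrays.asList("a", "b")));
        while (reader.hasNext()) {
            System.out.println(reader.next());
        }
    }
}
```

A subclass only has to implement the delegate method; shared concerns such as argument validation stay in one place in the base class.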
public static org.apache.hadoop.mapred.InputSplit getFittingInputSplit(org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.InputFormat<?,?> inputFormat, org.apache.hadoop.mapred.InputSplit split) throws java.io.IOException
Parameters:
conf - the job configuration
inputFormat - an InputFormat instance
split - an InputSplit instance
Throws:
java.io.IOException
Copyright © 2016 Oracle and/or its affiliates. All Rights Reserved.