public class TableInputFormat extends InputFormat<K,V>
For information on the parameters that may be passed to this class, refer to the javadoc for the parent class of this class, TableInputFormatBase.
A simple example demonstrating the Oracle NoSQL DB Hadoop oracle.kv.hadoop.table.TableInputFormat class can be found in the KVHOME/example/table/hadoop directory. It demonstrates how a MapReduce job can read records that were written to an Oracle NoSQL Database using the Table API. The javadoc for that program describes the simple Map/Reduce processing as well as how to invoke the program in Hadoop.
Constructor and Description |
---|
TableInputFormat() |
Modifier and Type | Method and Description |
---|---|
RecordReader<PrimaryKey,Row> | createRecordReader(InputSplit split, TaskAttemptContext context): Returns the RecordReader for the given InputSplit. |
static void | setBatchSize(int batchSize): Specifies the suggested number of keys to fetch during each network round trip by the InputFormat. |
static void | setConsistency(Consistency consistency): Specifies the read consistency associated with the lookup of the child KV pairs. |
static void | setDirection(Direction newDirection): Specifies the order in which records are returned by the InputFormat. |
static void | setFieldRangeProperty(String newProperty): Sets the String to use for the property value whose contents are used to construct the field range to employ when iterating the table. |
static void | setKVHadoopHosts(String[] newHadoopHosts): Set the KV Hadoop data node host name(s) for this InputFormat to operate on. |
static void | setKVHelperHosts(String[] newHelperHosts): Set the KV Helper host:port pair(s) for this InputFormat to operate on. |
static void | setKVSecurity(String loginFile, PasswordCredentials userPasswordCredentials, String trustFile): Sets the login properties file and the public trust file (keys and/or certificates), as well as the PasswordCredentials for authentication. |
static void | setKVStoreName(String newStoreName): Set the KV Store name for this InputFormat to operate on. |
static void | setMaxBatches(int newMaxBatches): Specifies the maximum number of result batches that can be held in memory on the client side before processing on the server side pauses. |
static void | setMaxRequests(int newMaxRequests): Specifies the maximum number of client side threads to use when running an iteration; where a value of 1 causes the iteration to be performed using only the current thread, and a value of 0 causes the client to base the number of threads to employ on the current store topology. |
static void | setPrimaryKeyProperty(String newProperty): Sets the String to use for the property value whose contents are used to construct the primary key to employ when iterating the table. |
void | setQueryInfo(int newQueryBy, String newWhereClause, Integer newPartitionId) |
static void | setTableName(String newTableName): Set the name of the table in the KV store that this InputFormat will operate on. |
static void | setTimeout(long timeout): Specifies an upper bound on the time interval for processing a particular KV retrieval. |
static void | setTimeoutUnit(TimeUnit timeoutUnit): Specifies the unit of the timeout parameter. |
getSplits
public RecordReader<PrimaryKey,Row> createRecordReader(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException
Returns the RecordReader for the given InputSplit.
Specified by: createRecordReader in class InputFormat<PrimaryKey,Row>
Throws: IOException, InterruptedException
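public static void setKVStoreName(String newStoreName)
Set the KV Store name for this InputFormat to operate on. This is equivalent to passing the oracle.kv.kvstore Hadoop property.
Parameters: newStoreName - the new KV Store name to set
public static void setKVHelperHosts(String[] newHelperHosts)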
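Set the KV Helper host:port pair(s) for this InputFormat to operate on. This is equivalent to passing the oracle.kv.hosts Hadoop property.
Parameters: newHelperHosts - array of hostname:port strings of any hosts in the KV Store.
public static void setKVHadoopHosts(String[] newHadoopHosts)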
Set the KV Hadoop data node host name(s) for this InputFormat to operate on. This is equivalent to passing the oracle.kv.hadoop.hosts property.
Parameters: newHadoopHosts - array of hostname strings corresponding to the names of the Hadoop data node hosts in the Hadoop cluster that this InputFormat will use to support MapReduce jobs and/or service Hive queries.
public static void setTableName(String newTableName)
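Set the name of the table in the KV store that this InputFormat will operate on. This is equivalent to passing the oracle.kv.tableName property.
Parameters: newTableName - the new table name to set.
public static void setPrimaryKeyProperty(String newProperty)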
Sets the String to use for the property value whose contents are used to construct the primary key to employ when iterating the table. The String must be a comma-separated list of the form:
fieldName,fieldValue,fieldType,fieldName,fieldValue,fieldType,..
where the number of elements separated by commas must be a multiple of 3, and each fieldType must be 'STRING', 'INTEGER', 'LONG', 'FLOAT', 'DOUBLE', or 'BOOLEAN'. Additionally, the values referenced by the various fieldType and fieldValue components of this String must satisfy the semantics of PrimaryKey for the given table; that is, they must represent a first-to-last subset of the table's primary key fields, and they must be specified in the same order as those primary key fields. If the String referenced by this property does not satisfy these requirements, a full primary key wildcard will be used when iterating the table.
This is equivalent to passing the oracle.kv.primaryKey Hadoop property.
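The format rules above can be checked before the value is handed to the job. The following is a minimal sketch only: PrimaryKeyProperty, isWellFormed and build are hypothetical names that are not part of the Oracle NoSQL API, and the field names in main are invented for illustration. It validates the comma-separated shape and the fieldType names, but it cannot verify that the fields match the primary key of a particular table.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Oracle NoSQL API. */
final class PrimaryKeyProperty {

    /** The fieldType names listed in the description above. */
    static final List<String> FIELD_TYPES = Arrays.asList(
        "STRING", "INTEGER", "LONG", "FLOAT", "DOUBLE", "BOOLEAN");

    /** Returns true if the value is a list of fieldName,fieldValue,fieldType triples. */
    static boolean isWellFormed(String property) {
        if (property == null || property.isEmpty()) {
            return false;
        }
        // The -1 limit keeps trailing empty elements, so "a,b," is rejected below.
        String[] parts = property.split(",", -1);
        if (parts.length % 3 != 0) {
            return false;
        }
        for (int i = 0; i < parts.length; i += 3) {
            // Every triple needs a non-empty name and a recognized type.
            if (parts[i].isEmpty() || !FIELD_TYPES.contains(parts[i + 2])) {
                return false;
            }
        }
        return true;
    }

    /** Joins arguments given as name, value, type, name, value, type, ... */
    static String build(String... triples) {
        String joined = String.join(",", triples);
        if (!isWellFormed(joined)) {
            throw new IllegalArgumentException(
                "not a list of name,value,type triples: " + joined);
        }
        return joined;
    }

    public static void main(String[] args) {
        // Prints: lastName,Smith,STRING,age,42,INTEGER
        System.out.println(
            build("lastName", "Smith", "STRING", "age", "42", "INTEGER"));
    }
}
```

Because the format uses a bare comma as the separator, this sketch assumes that field values do not themselves contain commas.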
Parameters: newProperty - the new primary key property to set
public static void setFieldRangeProperty(String newProperty)
Sets the String to use for the property value whose contents are used to construct the field range to employ when iterating the table. The String must be in JSON format; for example, when passed on the command line:
-Doracle.kv.fieldRange="{\"name\":\"fieldName\",
\"start\":\"startVal\",[\"startInclusive\":true|false],
\"end\":\"endVal\",[\"endInclusive\":true|false]}"
where for the given field over which to range, the 'start' and 'end' components are required, and the 'startInclusive' and 'endInclusive' components are optional, defaulting to 'true' if not included. Note that the list itself is enclosed in un-escaped double quotes and corresponding curly braces, and each name component and string type value component is enclosed in ESCAPED double quotes.
In addition to the JSON format requirement above, the values referenced by the components of this Property's value must also satisfy the semantics of FieldRange.
This is equivalent to passing the oracle.kv.fieldRange Hadoop property.
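When the value is assembled in Java rather than typed on a command line, the JSON described above can be built with plain string concatenation. The following is a minimal sketch only: FieldRangeProperty, quote and toJson are hypothetical names that are not part of the Oracle NoSQL API, and the field name and bounds in main are invented. It assumes a string-typed field, so both bounds are quoted, and it always writes the two optional inclusive flags. The backslashes in the command-line form appear to be shell escaping, so the sketch emits ordinary JSON quotes.

```java
/** Hypothetical helper for illustration; not part of the Oracle NoSQL API. */
final class FieldRangeProperty {

    /** Wraps s in double quotes, escaping backslashes and double quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the JSON value for a range over a string-typed field. */
    static String toJson(String name, String start, boolean startInclusive,
                         String end, boolean endInclusive) {
        return "{" + quote("name") + ":" + quote(name) + ","
            + quote("start") + ":" + quote(start) + ","
            + quote("startInclusive") + ":" + startInclusive + ","
            + quote("end") + ":" + quote(end) + ","
            + quote("endInclusive") + ":" + endInclusive + "}";
    }

    public static void main(String[] args) {
        // Prints: {"name":"lastName","start":"A","startInclusive":true,"end":"M","endInclusive":false}
        System.out.println(toJson("lastName", "A", true, "M", false));
    }
}
```

The quote helper covers only backslashes and double quotes; values containing control characters would need a full JSON encoder.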
Parameters: newProperty - the new field range property to set
public static void setDirection(Direction newDirection)
Specifies the order in which records are returned by the InputFormat.
Parameters: newDirection - the direction to retrieve data
public static void setConsistency(Consistency consistency)
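Specifies the read consistency associated with the lookup of the child KV pairs. This is equivalent to passing the oracle.kv.consistency Hadoop property.
Parameters: consistency - the consistency
public static void setTimeout(long timeout)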
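Specifies an upper bound on the time interval for processing a particular KV retrieval. This is equivalent to passing the oracle.kv.timeout Hadoop property.
Parameters: timeout - the timeout
public static void setTimeoutUnit(TimeUnit timeoutUnit)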
Specifies the unit of the timeout parameter. This is equivalent to passing the oracle.kv.timeout Hadoop property.
Parameters: timeoutUnit - the timeout unit
public static void setMaxRequests(int newMaxRequests)
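Specifies the maximum number of client side threads to use when running an iteration; where a value of 1 causes the iteration to be performed using only the current thread, and a value of 0 causes the client to base the number of threads to employ on the current store topology.
This is equivalent to passing the oracle.kv.maxRequests Hadoop property.
Parameters: newMaxRequests - the suggested number of threads to employ when running an iteration.
public static void setBatchSize(int batchSize)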
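Specifies the suggested number of keys to fetch during each network round trip by the InputFormat. This is equivalent to passing the oracle.kv.batchSize Hadoop property.
Parameters: batchSize - the suggested number of keys to fetch during each network round trip.
public static void setMaxBatches(int newMaxBatches)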
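Specifies the maximum number of result batches that can be held in memory on the client side before processing on the server side pauses.
This is equivalent to passing the oracle.kv.maxBatches Hadoop property.
Parameters: newMaxBatches - the maximum number of result batches that can be held in memory on the client side.
public static void setKVSecurity(String loginFile, PasswordCredentials userPasswordCredentials, String trustFile) throws IOException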
Sets the login properties file and the public trust file (keys and/or certificates), as well as the PasswordCredentials for authentication. The value of the loginFile and trustFile parameters must be either a fully qualified path referencing a file located on the local file system, or the name of a file (no path) whose contents can be retrieved as a resource from the current VM's classpath.
Note that this class provides the getSplits method, which must be able to contact a secure store, and so will need access to local copies of the login properties and trust files. As a result, if the values input for the loginFile and trustFile parameters are simple file names rather than fully qualified paths, this method will retrieve the contents of each from the classpath and generate private, local copies of the associated file for availability to the getSplits method.
Throws: IOException
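The distinction drawn above between a fully qualified path and a bare file name on the classpath can be illustrated as follows. This is a sketch of the rule as described, not the library's implementation: SecurityFileResolver, isSimpleFileName and resolve are hypothetical names, and copying the resource to a temporary file is only one assumed way of producing a private, local copy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not part of the Oracle NoSQL API. */
final class SecurityFileResolver {

    /** True when the value is a bare file name with no directory part. */
    static boolean isSimpleFileName(String value) {
        if (value == null || value.isEmpty()) {
            return false;
        }
        Path p = Paths.get(value);
        return !p.isAbsolute() && p.getNameCount() == 1;
    }

    /**
     * Returns a local path for the value: the path itself when it is
     * absolute, or a temporary copy of the classpath resource when it
     * is a bare file name.
     */
    static Path resolve(String value) throws IOException {
        if (!isSimpleFileName(value)) {
            Path p = Paths.get(value);
            if (p.isAbsolute()) {
                return p;
            }
            throw new IOException(
                "expected an absolute path or a bare file name: " + value);
        }
        ClassLoader loader = SecurityFileResolver.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(value)) {
            if (in == null) {
                throw new IOException("not found on the classpath: " + value);
            }
            // Copy the resource so that later code can open it as a file.
            Path copy = Files.createTempFile("kv-", "-" + value);
            Files.copy(in, copy, StandardCopyOption.REPLACE_EXISTING);
            return copy;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSimpleFileName("login.properties"));
        System.out.println(isSimpleFileName("conf/login.properties"));
    }
}
```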
Copyright (c) 2011, 2017 Oracle and/or its affiliates. All rights reserved.