Class V1V2TableUtil
- java.lang.Object
  - oracle.kv.hadoop.hive.table.V1V2TableUtil
public final class V1V2TableUtil extends Object
Utility class that provides static convenience methods for managing the interactions between version 1 and version 2 (YARN) MapReduce classes.

Note on Logging

Two loggers are currently employed by this class:
- One logger based on Log4j version 1, accessed via the org.apache.commons.logging wrapper.
- One logger based on the Log4j2 API.
Method Summary
Modifier and Type, Method, Description
static InputFormat<PrimaryKey,Row>
getInputFormat(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId)
static InputFormat<PrimaryKey,Row>
getInputFormat(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator)
static InputFormat<PrimaryKey,Row>
getInputFormat(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId)
static InputFormat<PrimaryKey,Row>
getInputFormat(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator)
For the current Hive query, constructs and returns a YARN-based InputFormat class that will be used when processing the query.
static Map<TableHiveInputSplit,TableInputSplit>
getSplitMap(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId)
static Map<TableHiveInputSplit,TableInputSplit>
getSplitMap(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator)
static Map<TableHiveInputSplit,TableInputSplit>
getSplitMap(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId)
static Map<TableHiveInputSplit,TableInputSplit>
getSplitMap(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator)
For the current Hive query, returns a singleton collection that maps each version 1 InputSplit for the query to its corresponding version 2 InputSplit.
static void
resetInputJobInfoForNewQuery()
Convenience method that can be called to reset the state at the completion of the previous query.
-
Method Detail
-
getSplitMap
public static Map<TableHiveInputSplit,TableInputSplit> getSplitMap(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator) throws IOException
For the current Hive query, returns a singleton collection that maps each version 1 InputSplit for the query to its corresponding version 2 InputSplit. If this is the first call after the query was entered on the command line, or the first call after the query state was reset by resetInputJobInfoForNewQuery, this method constructs and populates the returned Map; otherwise, it returns the previously constructed Map.
- Throws:
IOException
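The build-once, reuse-until-reset behavior described above can be sketched roughly as follows. This is an illustration only, not the actual V1V2TableUtil implementation: the class SplitMapCacheSketch, the String keys and values standing in for TableHiveInputSplit and TableInputSplit, the buildCount helper, and the use of synchronized methods are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the caching behavior described above.
// Strings replace the real version 1 and version 2 split types.
final class SplitMapCacheSketch {

    // Null until the first call for a query, and again after a reset.
    private static Map<String, String> splitMap;

    // Counts how many times the map was actually constructed (for the demo).
    private static int buildCount;

    private SplitMapCacheSketch() { }

    static synchronized Map<String, String> getSplitMap(String whereClause) {
        if (splitMap == null) {
            // First call for this query: construct and populate the map.
            Map<String, String> m = new LinkedHashMap<>();
            m.put("v1-split-0[" + whereClause + "]",
                  "v2-split-0[" + whereClause + "]");
            splitMap = Collections.unmodifiableMap(m);
            buildCount++;
        }
        // Later calls return the previously constructed map.
        return splitMap;
    }

    static synchronized void resetInputJobInfoForNewQuery() {
        // Discard the cached state so the next call rebuilds it.
        splitMap = null;
    }

    static synchronized int buildCount() {
        return buildCount;
    }
}
```

In this sketch, two consecutive calls return the same Map instance, and a call made after the reset constructs a new one.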
-
getSplitMap
public static Map<TableHiveInputSplit,TableInputSplit> getSplitMap(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId) throws IOException
- Throws:
IOException
-
getSplitMap
public static Map<TableHiveInputSplit,TableInputSplit> getSplitMap(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator) throws IOException
- Throws:
IOException
-
getSplitMap
public static Map<TableHiveInputSplit,TableInputSplit> getSplitMap(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId) throws IOException
- Throws:
IOException
-
getInputFormat
public static InputFormat<PrimaryKey,Row> getInputFormat(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator) throws IOException
For the current Hive query, constructs and returns a YARN-based InputFormat class that will be used when processing the query. This method also constructs and populates a singleton Map whose elements are key/value pairs in which each key is a version 1 split for the returned InputFormat, and each value is the key's corresponding version 2 split.
- Throws:
IOException
-
getInputFormat
public static InputFormat<PrimaryKey,Row> getInputFormat(JobConf jobConf, TableHiveInputSplit inputSplit, int queryBy, String whereClause, Integer shardKeyPartitionId) throws IOException
- Throws:
IOException
-
getInputFormat
public static InputFormat<PrimaryKey,Row> getInputFormat(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId) throws IOException
- Throws:
IOException
-
getInputFormat
public static InputFormat<PrimaryKey,Row> getInputFormat(JobConf jobConf, int queryBy, String whereClause, Integer shardKeyPartitionId, TableInputFormatBase.TopologyLocatorWrapper topologyLocator) throws IOException
- Throws:
IOException
-
resetInputJobInfoForNewQuery
public static void resetInputJobInfoForNewQuery()
Convenience method that can be called to reset the state at the completion of the previous query. Currently, this method is called only by the close method of the TableHiveRecordReader class. Because of the way Hive and Big Data SQL query processing is implemented, that close method is not always called when a query completes.
For example, if a Hive query uses MapReduce to satisfy the query, TableHiveRecordReader.close is never called, whereas it is called at the completion of all other Hive queries. For Big Data SQL queries, TableHiveRecordReader.close is always called when the query specifies a predicate, but it is not called when the query specifies no predicate.
To compensate for this inconsistency between the code paths employed by Hive and Big Data SQL, the state is reset in two places: by this method whenever TableHiveRecordReader.close is called at the completion of a Hive or Big Data SQL query, and by the getInputFormat method at the beginning of query processing.
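The reasoning above, resetting both when the reader is closed and again at the start of the next query, can be illustrated with a minimal sketch. Everything here is hypothetical and is not taken from the real classes: QueryStateResetSketch, its nested RecordReader, the String-based state, and the cachedSplitCount helper are assumptions chosen only to show why the second reset point makes a skipped close harmless.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of resetting per-query state in two places.
final class QueryStateResetSketch {

    // Per-query state that must not leak from one query into the next.
    private static final List<String> cachedSplits = new ArrayList<>();

    private QueryStateResetSketch() { }

    static synchronized void resetInputJobInfoForNewQuery() {
        cachedSplits.clear();
    }

    // Stand-in for getInputFormat: resets first, then builds new state.
    static synchronized List<String> getInputFormat(String query) {
        // The previous query's close() may never have run, so reset here too.
        resetInputJobInfoForNewQuery();
        cachedSplits.add("split-for:" + query);
        return new ArrayList<>(cachedSplits);
    }

    static synchronized int cachedSplitCount() {
        return cachedSplits.size();
    }

    // Stand-in for a record reader whose close() resets the state
    // on the code paths where close() is actually invoked.
    static final class RecordReader implements AutoCloseable {
        @Override
        public void close() {
            resetInputJobInfoForNewQuery();
        }
    }
}
```

In this sketch, starting a second query without ever closing a reader for the first still yields state for the second query only, because the reset at the top of getInputFormat covers the path on which close is skipped.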