Class TableStorageHandler

java.lang.Object
org.apache.hadoop.hive.ql.metadata.DefaultStorageHandler
oracle.kv.hadoop.hive.table.TableStorageHandler
All Implemented Interfaces:
Configurable, HiveStorageHandler, HiveStoragePredicateHandler

public class TableStorageHandler extends DefaultStorageHandler
Concrete implementation of TableStorageHandlerBase that assumes the data in the target Oracle NoSQL Database has been stored, and will be accessed, via the Table API.
Since:
3.1
  • Field Details

    • kvStoreName

      protected String kvStoreName
    • kvHelperHosts

      protected String[] kvHelperHosts
    • kvHadoopHosts

      protected String[] kvHadoopHosts
    • tableName

      protected String tableName
    • primaryKeyProperty

      protected String primaryKeyProperty
    • fieldRangeProperty

      protected String fieldRangeProperty
    • direction

      protected Direction direction
    • consistency

      protected Consistency consistency
    • timeout

      protected long timeout
    • timeoutUnit

      protected TimeUnit timeoutUnit
    • maxRequests

      protected int maxRequests
    • batchSize

      protected int batchSize
    • maxBatches

      protected int maxBatches
    • jobConf

      protected JobConf jobConf
  • Constructor Details

    • TableStorageHandler

      public TableStorageHandler()
  • Method Details

    • getInputFormatClass

      public Class<? extends InputFormat> getInputFormatClass()
      Specified by:
      getInputFormatClass in interface HiveStorageHandler
      Overrides:
      getInputFormatClass in class DefaultStorageHandler
    • getOutputFormatClass

      public Class<? extends OutputFormat> getOutputFormatClass()
      Specified by:
      getOutputFormatClass in interface HiveStorageHandler
      Overrides:
      getOutputFormatClass in class DefaultStorageHandler
    • getSerDeClass

      public Class<? extends AbstractSerDe> getSerDeClass()
      Specified by:
      getSerDeClass in interface HiveStorageHandler
      Overrides:
      getSerDeClass in class DefaultStorageHandler
    • configureInputJobProperties

      public void configureInputJobProperties(TableDesc tableDesc, Map<String,String> jobProperties)
      Creates a configuration for job input. This method populates this StorageHandler's configuration (returned by JobContext.getConfiguration) with the properties that may be needed by the handler's bundled artifacts, such as the InputFormat class and the SerDe class returned by this handler. Any key-value pairs set in the jobProperties argument are guaranteed to be set in the job's configuration object, and any "context" information associated with the job can be retrieved from the given TableDesc parameter.

      Note that implementations of this method must be idempotent. That is, when this method is invoked more than once with the same tableDesc values for a given job, the key-value pairs in jobProperties, as well as any external state set by this method, should be the same after each invocation. How this guarantee is achieved is left as an implementation detail. To support it, however, changes should be made only to the contents of jobProperties, never to tableDesc.
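
      The idempotency requirement can be illustrated with a minimal, self-contained sketch. It does not use the Hive TableDesc class or this handler's real property names: the class IdempotentJobPropertiesSketch, the method configureInput, and the "example.*" keys are all hypothetical stand-ins, and a plain read-only Map plays the role of the table-level properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a handler's input configuration step.
// The "example.*" keys are illustrative only, not the handler's real keys.
final class IdempotentJobPropertiesSketch {

    private static final String[] KEYS = {
        "example.kvstore.name", "example.kvstore.hosts", "example.table.name"
    };

    // Copies selected table-level properties into jobProperties.
    // The table-level map is only read, never modified.
    static void configureInput(Map<String, String> tableProperties,
                               Map<String, String> jobProperties) {
        for (String key : KEYS) {
            String value = tableProperties.get(key);
            if (value != null) {
                // put() overwrites instead of appending, so calling this
                // method again with the same input leaves the map unchanged.
                jobProperties.put(key, value);
            }
        }
    }

    public static void main(String[] args) {
        // Map.of returns an immutable map, which guards against accidental
        // writes to the table-level properties.
        Map<String, String> tableProperties = Map.of(
            "example.kvstore.name", "kvstore",
            "example.table.name", "vehicleTable");

        Map<String, String> jobProperties = new HashMap<>();
        configureInput(tableProperties, jobProperties);
        Map<String, String> afterFirstCall = new HashMap<>(jobProperties);

        configureInput(tableProperties, jobProperties);
        System.out.println("Same after second call: "
            + jobProperties.equals(afterFirstCall));
    }
}
```

      The design choice that matters here is that every write is an overwrite keyed by a fixed name; accumulating values (for example, appending to a comma-separated list on each call) would break the guarantee.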

      Specified by:
      configureInputJobProperties in interface HiveStorageHandler
      Overrides:
      configureInputJobProperties in class DefaultStorageHandler
    • configureOutputJobProperties

      public void configureOutputJobProperties(TableDesc tableDesc, Map<String,String> jobProperties)
      Creates a configuration for job output, using the same semantics as the configureInputJobProperties method. For more detail, refer to the description of that method.
      Specified by:
      configureOutputJobProperties in interface HiveStorageHandler
      Overrides:
      configureOutputJobProperties in class DefaultStorageHandler
    • configureTableJobProperties

      public void configureTableJobProperties(TableDesc tableDesc, Map<String,String> jobProperties)
      This method was originally intended to configure properties for a job based on the definition of the source or target table the job accesses, but it is now deprecated in Hive. Use the configureInputJobProperties and configureOutputJobProperties methods instead.
      Specified by:
      configureTableJobProperties in interface HiveStorageHandler
      Overrides:
      configureTableJobProperties in class DefaultStorageHandler
    • decomposePredicate

      public HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate(JobConf jobConfig, Deserializer deserializer, ExprNodeDesc predicate)
      Method required by the HiveStoragePredicateHandler interface.

      This method validates the components of the given predicate and ultimately produces the following artifacts:

      • a Hive object representing the predicate that will be pushed to the backend for server-side filtering
      • the String form of the computed predicate to push, which can be passed to the server via the ONSQL query mechanism
      • a Hive object consisting of the remaining components of the original predicate input to this method, referred to as the 'residual' predicate, which represents the criteria the Hive infrastructure will apply (on the client side) to the results returned after server-side filtering has been performed
      The predicate analysis model that Hive employs is a two-step process. First, an instance of the Hive IndexPredicateAnalyzer class is created and its analyzePredicate method is invoked. That method returns a Hive object representing the residual predicate and also populates a Collection whose contents depend on the particular implementation of IndexPredicateAnalyzer that is used. Second, the analyzer's translateSearchConditions method is invoked to convert the contents of the populated Collection to a Hive object representing the predicate that can be pushed to the server side. The object finally returned by decomposePredicate is an instance of the Hive DecomposedPredicate class, which contains the computed predicate to push and the residual predicate.
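
      The split into a pushed predicate and a residual predicate can be pictured with a toy model. The sketch below does not use any Hive classes: Condition, Decomposed, and decompose are hypothetical names, a predicate is reduced to a flat list of AND-ed comparisons, and the rule for what is pushable (a key column combined with a simple comparison operator) is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of predicate decomposition; not Hive's classes or logic.
final class DecomposeSketch {

    // One conjunct of an AND-only predicate, e.g. type = 'auto'.
    static final class Condition {
        final String column;
        final String op;
        final String literal;

        Condition(String column, String op, String literal) {
            this.column = column;
            this.op = op;
            this.literal = literal;
        }

        @Override
        public String toString() {
            return column + " " + op + " " + literal;
        }
    }

    // Hypothetical counterpart of a decomposed predicate.
    static final class Decomposed {
        final List<Condition> pushed = new ArrayList<>();
        final List<Condition> residual = new ArrayList<>();
    }

    // Assumed rule: only simple comparisons on key columns are pushed.
    private static final Set<String> PUSHABLE_OPS =
        new HashSet<>(Arrays.asList("=", "<", "<=", ">", ">="));

    static Decomposed decompose(List<Condition> conjuncts, Set<String> keyColumns) {
        Decomposed result = new Decomposed();
        for (Condition c : conjuncts) {
            if (keyColumns.contains(c.column) && PUSHABLE_OPS.contains(c.op)) {
                result.pushed.add(c);    // evaluated on the server side
            } else {
                result.residual.add(c);  // left for client-side evaluation
            }
        }
        return result;
    }

    // Renders a list of conjuncts in String form, joined with AND.
    static String render(List<Condition> conditions) {
        List<String> parts = new ArrayList<>();
        for (Condition c : conditions) {
            parts.add(c.toString());
        }
        return String.join(" AND ", parts);
    }

    public static void main(String[] args) {
        List<Condition> predicate = Arrays.asList(
            new Condition("type", "=", "'auto'"),
            new Condition("price", ">", "20000"),
            new Condition("make", "LIKE", "'F%'"));
        Set<String> keyColumns = new HashSet<>(Arrays.asList("type", "make"));

        Decomposed d = decompose(predicate, keyColumns);
        System.out.println("pushed:   " + render(d.pushed));
        System.out.println("residual: " + render(d.residual));
    }
}
```

      In this model every conjunct lands in exactly one of the two lists, so applying the residual on the client to the rows that survive the pushed filter yields the same result as evaluating the whole predicate.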

      Note that the Hive built-in IndexPredicateAnalyzer produces only predicates that consist of 'AND' statements and that correspond to PrimaryKey-based or IndexKey-based predicates. If the built-in analyzer does not produce a predicate to push, a custom analyzer that extends the capabilities of the built-in analyzer is employed to handle the statements that the built-in analyzer does not. The two analyzers also populate different collections. The built-in analyzer populates a List of Hive IndexSearchConditions corresponding to the filtering criteria of the predicate to push. The extended analyzer populates an ArrayDeque whose top (first element) is a Hive object consisting of all the components of the original input predicate, but with 'invalid' operators replaced with 'valid' operators; for example, with 'IN list' replaced with 'OR' statements.
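
      The 'IN list' to 'OR' replacement mentioned above amounts to expanding one membership test into a chain of equality tests. The following sketch shows that expansion on plain strings; InListRewriteSketch and rewriteInAsOr are hypothetical names, and the real extended analyzer operates on Hive expression objects rather than text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// String-level illustration of rewriting "col IN (a, b)" as "(col = a OR col = b)".
final class InListRewriteSketch {

    static String rewriteInAsOr(String column, List<String> literals) {
        if (literals.isEmpty()) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        // Each literal becomes one equality test; the tests are OR-ed together
        // and wrapped in parentheses so the result can sit inside a larger AND.
        return literals.stream()
            .map(literal -> column + " = " + literal)
            .collect(Collectors.joining(" OR ", "(", ")"));
    }

    public static void main(String[] args) {
        // Prints: (make = 'Ford' OR make = 'GM')
        System.out.println(rewriteInAsOr("make", Arrays.asList("'Ford'", "'GM'")));
    }
}
```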

      In each case, translateSearchConditions constructs the appropriate Hive predicate to push from the contents of the given Collection, which is either a List of IndexSearchCondition or an ArrayDeque.

      Specified by:
      decomposePredicate in interface HiveStoragePredicateHandler