Class TableStorageHandler

    • Field Detail

      • kvStoreName

        protected String kvStoreName
      • kvHelperHosts

        protected String[] kvHelperHosts
      • kvHadoopHosts

        protected String[] kvHadoopHosts
      • tableName

        protected String tableName
      • primaryKeyProperty

        protected String primaryKeyProperty
      • fieldRangeProperty

        protected String fieldRangeProperty
      • timeout

        protected long timeout
      • timeoutUnit

        protected TimeUnit timeoutUnit
      • maxRequests

        protected int maxRequests
      • batchSize

        protected int batchSize
      • maxBatches

        protected int maxBatches
      • jobConf

        protected JobConf jobConf
    • Constructor Detail

      • TableStorageHandler

        public TableStorageHandler()
    • Method Detail

      • configureInputJobProperties

        public void configureInputJobProperties(TableDesc tableDesc,
                                                Map<String,String> jobProperties)
        Creates a configuration for job input. This method provides the mechanism for populating this StorageHandler's configuration (returned by JobContext.getConfiguration) with the properties that may be needed by the handler's bundled artifacts; for example, the InputFormat class and the SerDe class returned by this handler. Any key/value pairs set in the jobProperties argument are guaranteed to be set in the job's configuration object, and any "context" information associated with the job can be retrieved from the given TableDesc parameter.

        Note that implementations of this method must be idempotent. That is, when this method is invoked more than once with the same tableDesc value for a given job, the key/value pairs in jobProperties, as well as any external state set by this method, should be the same after each invocation. How this invariant is guaranteed is left as an implementation detail, although to support it, changes should be made only to the contents of jobProperties, never to tableDesc.

        Specified by:
        configureInputJobProperties in interface HiveStorageHandler
        Overrides:
        configureInputJobProperties in class DefaultStorageHandler
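
        The idempotency requirement can be illustrated with a minimal sketch of a handler that derives every job property deterministically from the given TableDesc. The property keys used below (oracle.kv.kvstore, oracle.kv.hosts, oracle.kv.tableName) are illustrative assumptions, not necessarily the keys this handler actually writes:

        import java.util.Map;
        import java.util.Properties;

        import org.apache.hadoop.hive.ql.metadata.DefaultStorageHandler;
        import org.apache.hadoop.hive.ql.plan.TableDesc;

        public class ExampleStorageHandler extends DefaultStorageHandler {

            /* Illustrative property keys; the keys actually consumed by this
             * handler's bundled InputFormat and SerDe may differ. */
            private static final String STORE_KEY = "oracle.kv.kvstore";
            private static final String HOSTS_KEY = "oracle.kv.hosts";
            private static final String TABLE_KEY = "oracle.kv.tableName";

            @Override
            public void configureInputJobProperties(TableDesc tableDesc,
                                                    Map<String, String> jobProperties) {
                // Treat tableDesc as read-only "context": every value written below
                // is derived deterministically from it, so invoking this method more
                // than once for the same job yields the same key/value pairs.
                final Properties tableProps = tableDesc.getProperties();

                copyIfPresent(tableProps, jobProperties, STORE_KEY);
                copyIfPresent(tableProps, jobProperties, HOSTS_KEY);
                copyIfPresent(tableProps, jobProperties, TABLE_KEY);
            }

            private static void copyIfPresent(Properties src,
                                              Map<String, String> dst,
                                              String key) {
                final String value = src.getProperty(key);
                if (value != null) {
                    dst.put(key, value);
                }
            }
        }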
      • decomposePredicate

        public HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate(JobConf jobConfig,
                                                                                  Deserializer deserializer,
                                                                                  ExprNodeDesc predicate)
        Method required by the HiveStoragePredicateHandler interface.

        This method validates the components of the given predicate and ultimately produces the following artifacts:

        • a Hive object representing the predicate that will be pushed to the backend for server-side filtering
        • the String form of the computed predicate to push, which can be passed to the server via the ONSQL query mechanism
        • a Hive object consisting of the remaining components of the original predicate input to this method, referred to as the 'residual' predicate, which represents the criteria the Hive infrastructure will apply (on the client side) to the results returned after server-side filtering has been performed
        The predicate analysis model that Hive employs is basically a two-step process. First, an instance of the Hive IndexPredicateAnalyzer class is created and its analyzePredicate method is invoked, which returns a Hive object representing the residual predicate and populates a Collection whose contents depend on the particular implementation of IndexPredicateAnalyzer that is used. After analyzePredicate is invoked, the analyzer's translateSearchConditions method is invoked to convert the contents of the populated Collection to a Hive object representing the predicate that can be pushed to the server side. Finally, the object that is returned is an instance of the Hive DecomposedPredicate class, which contains the computed predicate to push and the residual predicate.

        Note that the Hive built-in IndexPredicateAnalyzer produces only predicates that consist of 'AND' statements and that correspond to PrimaryKey-based or IndexKey-based predicates. Thus, if the built-in analyzer does not produce a predicate to push, a custom analyzer that extends the capabilities of the built-in analyzer is employed; this extended analyzer handles statements that the built-in analyzer does not. Additionally, whereas the built-in analyzer populates a List of Hive IndexSearchConditions corresponding to the filtering criteria of the predicate to push, the extended analyzer populates an ArrayDeque in which the top (first) element of the Deque is a Hive object consisting of all the components of the original input predicate, but with 'invalid' operators replaced with 'valid' operators; for example, with 'IN list' replaced with 'OR' statements.

        In each case, translateSearchConditions constructs the appropriate Hive predicate to push from the contents of the given Collection, whether a List of IndexSearchCondition or an ArrayDeque.

        Specified by:
        decomposePredicate in interface HiveStoragePredicateHandler
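
        As an illustration of the two-step analyzePredicate/translateSearchConditions flow described above (not this handler's actual implementation, and with signatures that may vary across Hive versions), the built-in analyzer path can be sketched as follows; the extended analyzer follows the same two steps but populates an ArrayDeque instead of a List:

        import java.util.ArrayList;
        import java.util.List;

        import org.apache.hadoop.hive.ql.index.IndexPredicateAnalyzer;
        import org.apache.hadoop.hive.ql.index.IndexSearchCondition;
        import org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate;
        import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
        import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;

        public final class PredicateDecompositionSketch {

            /* Decompose a predicate over the given key columns using the Hive
             * built-in IndexPredicateAnalyzer (the first of the two cases above). */
            public static DecomposedPredicate decompose(ExprNodeDesc predicate,
                                                        List<String> keyColumns) {
                final IndexPredicateAnalyzer analyzer = new IndexPredicateAnalyzer();

                // Restrict analysis to AND-connected equality comparisons on the
                // key columns; anything else ends up in the residual predicate.
                analyzer.addComparisonOp(
                    "org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual");
                for (String column : keyColumns) {
                    analyzer.allowColumnName(column);
                }

                // Step 1: analyzePredicate populates the search-condition Collection
                // and returns the residual predicate (client-side filtering criteria).
                final List<IndexSearchCondition> conditions = new ArrayList<>();
                final ExprNodeDesc residual =
                    analyzer.analyzePredicate(predicate, conditions);

                // Step 2: translateSearchConditions converts the populated Collection
                // into the predicate that will be pushed for server-side filtering.
                final DecomposedPredicate decomposed = new DecomposedPredicate();
                decomposed.pushedPredicate =
                    analyzer.translateSearchConditions(conditions);
                decomposed.residualPredicate = (ExprNodeGenericFuncDesc) residual;
                return decomposed;
            }
        }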