Oracle XML Functions for Hive Reference

This section describes the Oracle XML Extensions for Hive. It describes the following commands and functions:

Data Type Conversions

Table 7-1 shows the conversions that occur automatically between Hive primitives and XML schema types.

Table 7-1 Data Type Equivalents

Hive        XML schema
TINYINT     xs:byte
SMALLINT    xs:short
INT         xs:int
BIGINT      xs:long
BOOLEAN     xs:boolean
FLOAT       xs:float
DOUBLE      xs:double
STRING      xs:string
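As an illustration of one row of the table, the following sketch returns a query result as a Hive INT. It assumes that the xml_query_as_int function from the same set of extensions is installed, and that src is any existing table with at least one row; the table name and the XML literal are hypothetical.

```sql
-- A minimal sketch, not a definitive usage pattern.
-- Assumption: xml_query_as_int is available in this Hive session.
-- Assumption: src is any existing table with at least one row.
-- The query selects the text of <y>; the result is converted
-- to the Hive type that corresponds to xs:int, which is INT.
SELECT xml_query_as_int(
  "x/y",
  "<x><y>123</y></x>"
) FROM src LIMIT 1;
```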


Hive Access to External Files

The Hive functions have access to the following external file resources:

You can address these files by URI, either over HTTP (by using the http://... syntax) or on the local file system (by using the file://... syntax). Relative file locations are resolved against the local working directory of the task, so a relative URI such as bar.xsd can be used to access a file that was added to the distributed cache, as in this example:

xml_query("
   import schema namespace tns='http://example.org' at 'bar.xsd';
   validate { ... }
        ",
   .
   .
   .

To access a local file, first add it to the Hadoop distributed cache using the Hive ADD FILE command. For example:

ADD FILE /local/mydir/thisfile.xsd;

Otherwise, you must ensure that the file is available on all nodes of the cluster, such as by mounting the same network drive or simply copying the file to every node. The default base URI is set to the local working directory.
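The two steps can be combined in one Hive session, as in the following sketch. It assumes that the XQuery fn:doc function is available to read an XML document by URI; the file path, the file name lookup.xml, its element names, and the table src are all hypothetical.

```sql
-- A minimal sketch under the assumptions stated above.
-- Step 1: add a local XML document to the Hadoop distributed cache.
-- The path and file name are hypothetical.
ADD FILE /local/mydir/lookup.xml;

-- Step 2: refer to the cached file by a relative URI. Because the
-- default base URI is the local working directory of the task,
-- 'lookup.xml' resolves to the copy placed there by the cache.
-- Assumption: fn:doc is supported for reading the document.
-- The second argument is a placeholder context item that the
-- query does not use.
SELECT xml_query(
  "fn:doc('lookup.xml')/codes/code/text()",
  "<unused/>"
) FROM src LIMIT 1;
```

If the file is not added with ADD FILE, the same relative URI works only when a copy of the file is already present in the working directory on every node.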

See Also: