Oracle NoSQL Database
version 11gR2.2.0.26

Package oracle.kv.exttab

This package contains the public API and Utilities for accessing Oracle NoSQL Database data through Oracle Database External Tables.

See:
          Description

Interface Summary
Formatter Formatter is an interface to a user-specified class used for formatting NoSQL Database records into Strings which are appropriate for importing into a specific Oracle External Table.
 

Class Summary
Preproc The Preproc utility is invoked by Oracle Database whenever a SELECT is performed on an External Table that has been associated with an Oracle NoSQL Database using the Publish utility.
Publish The Publish utility writes information about how to connect to an Oracle NoSQL Database from an Oracle Database External Table.
 

Exception Summary
ExternalTableException Runtime Exception for anything related to ExternalTables.
 

Package oracle.kv.exttab Description

This package contains the public API and Utilities for accessing Oracle NoSQL Database data through Oracle Database External Tables. This capability allows NoSQL Database data to be read into Oracle Database using Oracle External Tables. NoSQL Database data can not be modified using External Table access.

Generally, the Oracle Database External Table feature allows a table to be backed by one or more flat files (e.g. text or CSV), also known as "location" files. When a SQL SELECT statement is executed against an External Table, the Oracle Database server reads the location files, parses them according to the ACCESS PARAMETERS specified in the External Table CREATE TABLE definition, and materializes them as rows in the External Table.

Any External Table definition can specify a preprocessor program which is invoked by the Oracle Database server during the read process. The preprocessor is passed the name of the location file as a command line argument. Its contract is to read the data relevant to the location file and write the records to stdout where the Oracle Database reads, parses, and materializes the records as rows in the table.

To use the Oracle Database External Table feature to read Oracle NoSQL Database data, you must use the

<KVHOME>/exttab/bin/nosql_stream
preprocessor to populate your Oracle tables with the data. The remainder of this document describes how to configure an Oracle database with External Tables so that you can read data from an Oracle NoSQL Database.

Using External Tables to perform write operations against Oracle NoSQL Database is not supported.

Usage Overview

To read data from Oracle NoSQL Database into an Oracle Database External Table, you define an External Table with one or more location files; the location files will not actually contain any data, but instead contain configuration information related to connecting to the Oracle Database and Oracle NoSQL Database, query restrictions, and formatting. Specifying multiple location files in the External Table definition allows for a degree of parallelism during data reading.

Once the External Table is defined, you run the Publish utility to "publish" information about the External Table and how to access the data in Oracle NoSQL Database. This information is written as an XML document into each of the location file(s).

Note that you only have to run the publish utility once. After that, you only need to run the publish utility again if you want to change the way the NoSQL Database is accessed (for example, using a different key prefix, or if you change the port or the rep nodes where you access the database).

When the publish step is completed, a SQL SELECT may be executed against the External Table. This will cause the <KVHOME>/exttab/bin/nosql_stream script to be invoked, which in turn causes the Preproc class to be invoked. The preprocessor

  1. Reads the configuration and NoSQL Database access information from the location file,
  2. reads the data from the NoSQL Database,
  3. formats it using either a user defined formatter or a default format, and
  4. writes it to stdout.

An External Table "cookbook" can be found in the <KVHOME>/examples/externaltables directory. See the External Tables Cookbook for more information.

External Table Configuration

Before you can run the publish utility, you must configure the Oracle Database External Table. You must also grant various permissions to any Oracle Database user who needs to access it. For more information on External Tables and the parameters, see External Tables Access Parameters.

Publish Utility Configuration

Before you can run the publish utility, you must create a configuration file that contains parameters regarding: The config file is an XML document and consists of two sections: parameters related to the publish step, and parameters related to the preprocessor step. In general, the file looks like this:
<config version="1">
  <component name="publish" type="params">
    <property name="..."
              value="..."
              type="..."/>
    <property name="..."
              value="..."
              type="..."/>
    ...
  </component>
  <component name="nosql_stream" type="params">
    <property name="..."
              value="..."
              type="..."/>
    <property name="..."
              value="..."
              type="..."/>
    ...
  </component>
</config>
Notice that there are two major sections in the file: the publish and the nosql_stream section. In the publish section, the following parameters are required: In the publish section, the following parameters are optional: In the nosql_stream section, the following parameters are required: In the nosql_stream section, the following parameters are optional:

The Publish Utility

The Publish Utility:
  1. Accepts the parameters described above in an XML config file,
  2. opens a connection to the Oracle Database instance,
  3. reads the schema information for the specified External Table and extracts the names of External Table files from the schema using the ALL_EXTERNAL_LOCATIONS view,
  4. opens a connection to the NoSQL Database to verify correctness of the parameters,
  5. writes the relative number of the External Table file into the file,
  6. writes information about how to contact the Oracle NoSQL DB Rep Nodes (the "publish" component parameters) into the External Table files,
  7. writes information about how to restrict the query on each of the Oracle NoSQL DB Rep Nodes (the "nosql_stream" component parameters) into the External Table files.

The Publish Utility accepts the following command line parameters:

Since the Publish Utility connects to the Oracle Database using JDBC, when you invoke it, you must include $ORACLE_HOME/jdbc/lib/ojdbc6.jar in the classpath. If you are using the Oracle wallet as an external password store, then you must also include $ORACLE_HOME/jlib/oraclepki.jar.

The Preprocessor (nosql_stream)

The preprocessor is invoked by the Oracle Database instance when a SQL SELECT statement is executed on the Oracle Database server. The Oracle Database server invokes the preprocessor once for each External Table location file and therefore the number of location files determines the level of parallelism when retrieving records. The preprocessor(s) are invoked on the same machine that the Oracle Database instance is running.

In general, the user does not invoke this utility directly (except perhaps for debugging) as that will be done by Oracle Database when it invokes the nosql_stream script which is specified as the PREPROCESSOR program in the External Table declaration. The nosql_stream Preprocessor performs the following operations:

When configuring the preprocessor, consider the following:

Formatters

For many users, it may not be possible to use the External Table ACCESS PARAMETERS to sufficiently map NoSQL Database records to Oracle Database records. Formatters allow you to provide a class which accepts a record and returns a String which can be mapped to the External Table fields. The Formatter.toOracleLoaderFormat(oracle.kv.KeyValueVersion, oracle.kv.KVStore) class is invoked once per record by the preprocessor. If toOracleLoaderFormat() returns null, then that record is ignored and not imported to the External Table. This provides a user-programmable filtering mechanism.

Exceptions

Every SELECT statement against an External Table writes a log to the file specified by the LOGFILE clause of the External Table definition (the default logfile is <table_name>_<pid>.log in the External Table location file directory). The BADFILE clause names the file to which records are written when they cannot be loaded because of errors. For example, a record is written to the bad file when a field in the datafile could not be converted to the datatype of a column in the External Table.

If the preprocessor writes anything to stderr, the SELECT fails and the output of the exception is written to the .log file. For example, if your formatter throws a RuntimeException("Couldn't parse field"), you would see something like this:

SQL> SELECT * FROM nosql_data;
SELECT * FROM nosql_data
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04095: preprocessor command /export/home/kvhome/exttab/bin/nosql_stream
encountered error "Exception in thread "main" java.lang.RuntimeException: Couldn't parse field
at Preproc.main(Preproc.java:12)
"


SQL> quit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64
bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
bash-3.00$ ls -l ../data
total 4
-rw-r--r--   1 oracle   oinstall     597 May 10 09:57 EMP_LOAD_13150.log
-rw-rw-rw-   1 me       other         27 May  2 14:49 mylocation.dat
bash-3.00$ more ../data/EMP_LOAD_13150.log


 LOG file opened at 05/10/12 09:57:21

Field Definitions for table NOSQL_DATA
  Record format DELIMITED BY NEWLINE
  Data in file has same endianness as the platform
  Rows with all null fields are accepted

  Fields in Data Source:

    F1                              CHAR (2)
      Trim whitespace same as SQL Loader
    F2                              CHAR (10)
      Trim whitespace same as SQL Loader
KUP-04095: preprocessor command /export/home/kvhome/exttab/bin/nosql_stream encountered error
"Exception in thread "main" java.lang.RuntimeException: Couldn't parse field
        at Preproc.main(Preproc.java:12)
"

Parallelism

If you specify multiple Location files, be sure to enable parallel query processing using these three commands:
ALTER SESSION ENABLE PARALLEL DDL;
ALTER SESSION ENABLE PARALLEL DML;
ALTER SESSION ENABLE PARALLEL QUERY;
as well as adding a PARALLEL nn directive in the External Table declaration, where nn is the number of LOCATION files you specified.

Additional External Tables

Additional External Tables can be added with a CREATE TABLE command for each one. You should change the Oracle Database directory objects (ext_tab and nosql_bin_dir as necessary). If you have a different classpath for your Formatter, then be sure to add it to the nosql_stream script.


Oracle NoSQL Database
version 11gR2.2.0.26

Copyright (c) 2011, 2013 Oracle and/or its affiliates. All rights reserved.