Oracle NoSQL Database
|PREV PACKAGE NEXT PACKAGE||FRAMES NO FRAMES|
|Formatter||Formatter is an interface to a user-specified class used for formatting NoSQL Database records into Strings which are appropriate for importing into a specific Oracle External Table.|
|Preproc||The Preproc utility is invoked by Oracle Database whenever a
|Publish||The Publish utility writes information about how to connect to an Oracle NoSQL Database from an Oracle Database External Table.|
|ExternalTableException||Runtime Exception for anything related to ExternalTables.|
This package contains the public API and Utilities for accessing Oracle NoSQL Database data through Oracle Database External Tables. This capability allows NoSQL Database data to be read into Oracle Database using Oracle External Tables. NoSQL Database data can not be modified using External Table access.
Generally, the Oracle Database External Table feature allows a table to be
backed by one or more flat files (e.g. text or CSV), also known as
"location" files. When a SQL
SELECT statement is executed
against an External Table, the Oracle Database server reads the location
files, parses them according to the
specified in the External Table
CREATE TABLE definition, and
materializes them as rows in the External Table.
Any External Table definition can specify a preprocessor program which is invoked by the Oracle Database server during the read process. The preprocessor is passed the name of the location file as a command line argument. Its contract is to read the data relevant to the location file and write the records to stdout where the Oracle Database reads, parses, and materializes the records as rows in the table.
To use the Oracle Database External Table feature to read Oracle NoSQL Database data, you must use the
<KVHOME>/exttab/bin/nosql_streampreprocessor to populate your Oracle tables with the data. The remainder of this document describes how to configure an Oracle database with External Tables so that you can read data from an Oracle NoSQL Database.
Using External Tables to perform write operations against Oracle NoSQL Database is not supported.
Once the External Table is defined, you run the
Publish utility to "publish" information about the
External Table and how to access the data in Oracle NoSQL Database. This
information is written as an XML document into each of the location
Note that you only have to run the publish utility once. After that, you only need to run the publish utility again if you want to change the way the NoSQL Database is accessed (for example, using a different key prefix, or if you change the port or the rep nodes where you access the database).
When the publish step is completed, a SQL
SELECT may be
executed against the External Table. This will cause
<KVHOME>/exttab/bin/nosql_stream script to be invoked,
which in turn causes the
Preproc class to be
invoked. The preprocessor
formatteror a default format, and
An External Table "cookbook" can be found in
See the External Tables Cookbook
for more information.
<KVHOME>/exttab/bin/directory where the
nosql_streamscript resides. For example, the following commands might be used to specify these two directories to Oracle Database:
SQL> CONNECT / AS sysdba; SQL> CREATE DIRECTORY ext_tab AS '/data'; SQL> CREATE DIRECTORY nosql_bin_dir AS '/export/home/kvhome/exttab/bin';
NOTE: Consult with your Oracle DBA regarding the appropriate
locations and permissions for these directories. It may be
desirable from a security point of view to copy the
nosql_stream script from the
<KVHOME>/exttab/bin/ directory to some other
SQL> CONNECT / AS sysdba; SQL> CREATE USER nosqluser IDENTIFIED BY password; SQL> GRANT CREATE SESSION TO nosqluser; SQL> GRANT EXECUTE ON SYS.UTL_FILE TO nosqluser; SQL> GRANT READ, WRITE ON DIRECTORY ext_tab TO nosqluser; SQL> GRANT READ, EXECUTE ON DIRECTORY nosql_bin_dir TO nosqluser; SQL> GRANT CREATE TABLE TO nosqluser;
nosql_streampreprocessor as you do so. For example:
SQL> CONNECT nosqluser/password SQL> CREATE TABLE [schema.]tableName (column datatype, ...)) 2 ORGANIZATION EXTERNAL 3 (TYPE ORACLE_LOADER 4 DEFAULT DIRECTORY ext_tab 5 ACCESS PARAMETERS (RECORDS DELIMITED BY NEWLINE 6 PREPROCESSOR nosql_bin_dir:'nosql_stream' 7 FIELDS TERMINATED BY 'delimiter') 8 LOCATION ('filename','filename', ...)) 9 PARALLEL nn;
where nn is the number of files you specified for LOCATION on line 8.
publishutility, you must create a configuration file that contains parameters regarding:
<config version="1"> <component name="publish" type="params"> <property name="..." value="..." type="..."/> <property name="..." value="..." type="..."/> ... </component> <component name="nosql_stream" type="params"> <property name="..." value="..." type="..."/> <property name="..." value="..." type="..."/> ... </component> </config>Notice that there are two major sections in the file: the
nosql_streamsection. In the
publishsection, the following parameters are required:
oracle.kv.exttab.connection.url: The connection URL used by the Publish utility to connect to the Oracle instance containing the External Table. This property overrides all other connection properties. The connecting database user must have the privileges described above.
If you are using a Wallet as an external password store, then the property value must have this form:
db_connect_stringmust exactly-match the credential in the wallet. This example uses Oracle Net Services syntax:
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=myhost)(PORT=1521))) (CONNECT_DATA=(SERVICE_NAME=my_db_service_name)))This example uses a TNSNAMES entry:
oracle.kv.exttab.connection.wallet_location, which is described later in this page.
If you are not using an Oracle wallet, then one of the following URL connection styles is used:
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=myhost)(PORT=1521))) (CONNECT_DATA=(SERVICE_NAME=my_db_service_name)))
oracle.kv.exttab.connection.user: required when not using a wallet for connections. In this case, the user will be prompted for a password.
oracle.kv.exttab.tableName: The schema-qualified table name of the external table in the format:
publishsection, the following parameters are optional:
oracle.kv.exttab.connection.user: An Oracle user with appropriate privileges. Optional only when you are using a wallet for a connection.
oracle.kv.exttab.connection.tns_admin: File path to a directory containing SQL*Net configuration files, such as sqlnet.ora and tnsnames.ora. The value of the TNS_ADMIN environment variable is used for this property by default. Define this property to use TNS entry names in database connect strings.
oracle.kv.exttab.connection.tnsEntryName: A TNS entry name defined in the tnsnames.ora file. This property is used with
oracle.kv.exttab.connection.wallet_location: Pathname of an Oracle wallet where the connection information is stored. When using Oracle wallet as an external password store, set the following additional properties.
nosql_streamsection, the following parameters are required:
oracle.kv.kvstore: The name of the NoSQL Database.
oracle.kv.hosts: One or more
hostname:portpairs, separated by commas. The utility uses these
hostname:portpairs to contact Rep Nodes and obtain information about the organization of the NoSQL Database. Note that there are no restrictions on which online Rep Nodes in the store can be identified here; all online Rep Nodes are equally valid candidates.
nosql_streamsection, the following parameters are optional:
oracle.kv.direction: The direction to read each of the partitions. The value may be only be "forward". "reverse" and "unordered" may be supported in the future. Presently this parameter is not meaningful, but is included for completeness.
oracle.kv.batchSize: The number of records to fetch during each round trip.
oracle.kv.parentKey: The parent key whose child KV pairs are to be fetched, specified in the canonical key format is described in
oracle.kv.subRange: Used to further restrict the range under the
parentKey. This parameter is specified in the canonical
oracle.kv.depth: One of
oracle.kv.consistency: One of
TIME(permissibleLag, timeout), where
timeoutare a value and a
TimeUnitpair (e.g. "10 secs").
oracle.kv.timeout: The timeout to use for querying the NoSQL Database. This is in timeout format (e.g. "10 seconds").
oracle.kv.exttbl.stream.keyDelimiter: If present, the value is a single character and indicates:
oracle.kv.exttbl.stream.variableSizeBytes: If this parameter is 0 or not specified, or
oracle.kv.formatterClassis not specified, then records are output in Oracle Data Loader "fixed" format. If this parameter is non-0, it specifies the number of prefix bytes per the Oracle Data Loader format described in External Tables Access Parameters. If both this and
oracle.kv.formatterClassare specified, then an exception is thrown when the Publish utility is run.
oracle.kv.formatterClass: the name of a class which can format key/value pairs to an appropriate format for the Oracle Loader. This parameter may not be specified if
oracle.kv.exttbl.stream.variableSizeBytesis also specified.
The Publish Utility accepts the following command line parameters:
-noexecute: Validate the parameters and test the connections to Oracle Database and Oracle NoSQL DB, but do not publish (write) to the External Table files. Either
-publishmust be specified.
-publish: The opposite of
-noexecute. Publish (write) the relevant information to the External Table files. Either
-publishmust be specified.
-verbose: Be verbose during execution. Useful for debugging.
-config <file>: Specifies an XML config file in the format shown above.
Since the Publish Utility connects to the Oracle Database using JDBC,
when you invoke it, you must include
$ORACLE_HOME/jdbc/lib/ojdbc6.jar in the classpath. If you are using
the Oracle wallet as an external password store, then you must also include
SELECTstatement is executed on the Oracle Database server. The Oracle Database server invokes the preprocessor once for each External Table location file and therefore the number of location files determines the level of parallelism when retrieving records. The preprocessor(s) are invoked on the same machine that the Oracle Database instance is running.
In general, the user does not invoke this utility directly (except
perhaps for debugging) as that will be done by Oracle Database when it
nosql_stream script which is specified as the
PREPROCESSOR program in the External Table declaration.
Preprocessor performs the following operations:
oracle.kv.exttbl.stream.keyDelimiteris specified, the key is written to stdout using the canonical Key format, followed by the keyDelimiter, followed by the value, followed by newline.
nosql_streamis a shell script which requires that the
CLASSPATHbe set correctly. You should edit this script to include appropriate settings. Be sure that it has execute permission when you are finished editing.
ACCESS PARAMETERSto sufficiently map NoSQL Database records to Oracle Database records.
Formattersallow you to provide a class which accepts a
recordand returns a String which can be mapped to the External Table fields. The
Formatter.toOracleLoaderFormat(oracle.kv.KeyValueVersion, oracle.kv.KVStore)class is invoked once per record by the preprocessor. If
toOracleLoaderFormat()returns null, then that record is ignored and not imported to the External Table. This provides a user-programmable filtering mechanism.
SELECTstatement against an External Table writes a log to the file specified by the LOGFILE clause of the External Table definition (the default logfile is <table_name>_<pid>.log in the External Table location file directory). The
BADFILEclause names the file to which records are written when they cannot be loaded because of errors. For example, a record is written to the bad file when a field in the datafile could not be converted to the datatype of a column in the External Table.
If the preprocessor writes anything to stderr, the
fails and the output of the exception is written to the .log file.
For example, if your formatter throws a
RuntimeException("Couldn't parse field"), you would see something
SQL> SELECT * FROM nosql_data; SELECT * FROM nosql_data * ERROR at line 1: ORA-29913: error in executing ODCIEXTTABLEFETCH callout ORA-29400: data cartridge error KUP-04095: preprocessor command /export/home/kvhome/exttab/bin/nosql_stream encountered error "Exception in thread "main" java.lang.RuntimeException: Couldn't parse field at Preproc.main(Preproc.java:12) " SQL> quit Disconnected from Oracle Database 11g Enterprise Edition Release 126.96.36.199.0 - 64 bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options bash-3.00$ ls -l ../data total 4 -rw-r--r-- 1 oracle oinstall 597 May 10 09:57 EMP_LOAD_13150.log -rw-rw-rw- 1 me other 27 May 2 14:49 mylocation.dat bash-3.00$ more ../data/EMP_LOAD_13150.log LOG file opened at 05/10/12 09:57:21 Field Definitions for table NOSQL_DATA Record format DELIMITED BY NEWLINE Data in file has same endianness as the platform Rows with all null fields are accepted Fields in Data Source: F1 CHAR (2) Trim whitespace same as SQL Loader F2 CHAR (10) Trim whitespace same as SQL Loader KUP-04095: preprocessor command /export/home/kvhome/exttab/bin/nosql_stream encountered error "Exception in thread "main" java.lang.RuntimeException: Couldn't parse field at Preproc.main(Preproc.java:12) "
ALTER SESSION ENABLE PARALLEL DDL; ALTER SESSION ENABLE PARALLEL DML; ALTER SESSION ENABLE PARALLEL QUERY;as well as adding a
PARALLEL nndirective in the External Table declaration, where
nnis the number of LOCATION files you specified.
CREATE TABLEcommand for each one. You should change the Oracle Database directory objects (ext_tab and nosql_bin_dir as necessary). If you have a different classpath for your Formatter, then be sure to add it to the
Oracle NoSQL Database
|PREV PACKAGE NEXT PACKAGE||FRAMES NO FRAMES|