Known issues with Oracle NoSQL Database Analytics Integrator

Possible Loss of Precision with Some Data Types:

The Oracle NoSQL Database Analytics Integrator retrieves data from tables in the Oracle NoSQL Database Cloud Service, converts that data to Parquet format, stores the Parquet data in Object Storage, and finally transfers that data to a table in an ADW database. To perform the conversion to Parquet format, the NoSQL Analytics Integrator employs facilities provided by the Oracle NoSQL Database Migrator, which maps Oracle NoSQL data types to comparable types defined by the Parquet type system. The mapping between the NoSQL Database type system and the Parquet type system is not a complete one-to-one mapping. See Oracle NoSQL to Parquet Data Type Mapping for more details. In particular, the Parquet type system does not currently define a numeric data type analogous to the Oracle NoSQL NUMBER type; where the largest type defined by Parquet is the Parquet DOUBLE type. Thus, if a NoSQL table to be processed by the Oracle NoSQL Database Analytics Integrator consists of a field of type NUMBER that contains a value so large that it cannot be represented as a Parquet DOUBLE, then a loss of precision is possible when that value is converted to the Parquet DOUBLE type; as that value will be represented in Parquet format as either +Infinity or -Infinity.

ADW Database Does Not Currently Handle JSON Field Types of Length Larger 4000 Bytes:

If the table you create in Oracle NoSQL Database Cloud Service contains a field (column) of type JSON, and if the value written to that field is a JSON document with length greater than 4000 bytes in at least one row of the table, then although the Oracle NoSQL Database Analytics Integrator has no problem writing such values to the Object Storage (in Parquet format), the ADW database does not process the JSON document correctly; displaying null instead of the contents of the document. Although the max_string_size initialization parameter of the ADW database is set to EXTENDED by default, the mechanism used by the ADW database to retrieve and display the corresponding Parquet value currently ignores the EXTENDED settings and attempts to store the value in a VARCHAR2(4000) type instead of VARCHAR2(32767); which causes the value to be truncated and null to be displayed. See Oracle Database Reference - Datatype Limits for more details.

Example: Create a table myJsonTable with two fields, an INTEGER and a JSON. Suppose you populate the row with id=1 with a JSON document consisting of more than 4000 bytes.
CREATE TABLE IF NOT EXISTS myJsonTable (id INTEGER, 
jsonField JSON, PRIMARY KEY (idField)) USING TTL 1 days;
When you fetch the contents of the row with id=1, you should see output such as the following:

SELECT * FROM myJsonTable WHERE id = '1';

id jsonField
1 (null)
Work Around: Until ADW fixes this bug, you can manually work around the issue by doing the following from the Database Actions SQL Interface.
  • Verify that the max_string_size initialization parameter is set to EXTENDED in the database.
    SELECT name,value FROM v$parameter WHERE name = 'max_string_size';
    If the value of the max_string_size is set to STANDARD, then increase the size from STANDARD to EXTENDED.
  • Drop the table
    DROP TABLE myJsonTable;
  • Manually recreate the table and specify enough bytes to hold the JSON document.
    begin
    DBMS_CLOUD.CREATE_EXTERNAL_TABLE (
    table_name =>'myJsonTable',
    credential_name =>'OCI$RESOURCE_PRINCIPAL' or
    'NOSQLADWDB001_OBJ_STORE_CREDENTIAL',
    file_uri_list =>'https://objectstorage.us-ashburn-
    1.oraclecloud.com/n/nosqldev/b/nosql-to-adw/o/myJsonTable*',
    format => '{"type":"parquet", "schema": "first"}',
    column_list =>'ID NUMBER (10), JSONFIELD VARCHAR2(32767)'
    );
    end;
  • You should now be able to see the actual contents of the JSON document in the row with id=1 .
    SELECT * FROM myJsonTable WHERE id = '1';

    Note:

    Rather than declaring the JSONFIELD as VARCHAR2(32767) you can also work around this issue by declaring that column as type CLOB.
    begin
    DBMS_CLOUD.CREATE_EXTERNAL_TABLE (
    table_name =>'myJsonTable',
    credential_name =>'OCI$RESOURCE_PRINCIPAL' or
    'NOSQLADWDB001_OBJ_STORE_CREDENTIAL',
    file_uri_list =>'https://objectstorage.us-ashburn-
    1.oraclecloud.com/n/nosqldev/b/nosql-to-adw/o/myJsonTable*',
    format => '{"type":"parquet", "schema": "first"}',
    column_list =>'ID NUMBER (10), JSONFIELD CLOB'
    );
    end;

Some Clients Do Not Handle Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, and DOUBLE.NaN Correctly:

If the table you create in Oracle NoSQL Database Cloud Service contains a field (column) of type DOUBLE, and if the value written to that field is Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, or Double.NaN (Not-a-Number) in at least one row of the table, then although the Oracle NoSQL Database Analytics Integrator has no problem writing such values to Object Storage (in Parquet format), and although the ADW database has no problem retrieving and storing those values, some of the clients used to analyze those values may have trouble handling and/or displaying such non-numeric values. For example, when you attempt to use either the Oracle Cloud Database Actions SQL Interface or Oracle Analytics (Desktop or Cloud) to query the ADW database table, this issue manifests itself in two ways.

When you use the Run Statement button on the Database Actions SQL Interface (represented by a green circle containing a white arrow) to execute a single SELECT query on the table, although the query actually completes, the results of the query are never displayed and the command appears to hang.

Note:

One can tell that the query completes rather than hangs when using the Run Statement option in the Database Actions SQL Interface when the Query Result window of that interface eventually displays a dropdown menu labeled Download and displays the Execution time (even though the spinning wheel appears to indicate the query is hanging).

There are two ways you can work around this issue. First, you can simply execute the query as a script. To do this, you would select the query in the [Worksheet]* window of the tool and then click on the Run Script button. This will display the results of the query in the Script Output window of the tool; displaying any Double.POSITIVE_INFINITY values as the string 'Infinity', Double.NEGATIVE_INFINITY values as the string '-Infinity', and any Double.NaN values as the string 'NaN'.

Another way to work around the issue in the Database Actions SQL Interface is to use the Run Statement to execute the query, and when the Download dropdown menu appears in the Query Result window (indicating the query has completed), click on the Download dropdown menu and click on the menu item labeled JSON to export the output of the query as a JSON document. Once you have exported the query results, you can use your browser or editor of choice to examine the query results.

On the other hand, if you use Oracle Analytics (desktop tool or cloud service) to query the table, then the following error trace occurs:
Odbc driver returned an error (SQLExecDirectW).
State: HY000. Code.10058. [NQODBC][SQL_STATE:HY000]
[nQSError:10058] A general error has occurred.
State: HY000. Code: 43113. [nQSError: 43113] Message returned from OBIS.
State: HY000. Code: 43119. [nQSError: 43119] Query Failed.
State: HY000. Code: 17001. [nQSError: 17001] Oracle Error code: 1722,
message: ORA-01722: invalid number at OCI call OCIStmtFetch.
State: HY000. Code: 17012. [nQSError: 17012] Bulk fetch failed. (HY000)
SQL Issued:
SET VARIABLE DISABLE_CACHE_SEED=1,
DISABLE_XSA_CACHE_SEED=1,
ENABLE_DIMENSIONALITY=1;
SELECT 0 s_0, XSA('weblogic'.'1cdbf90a-570e-4ebb-946b-5510da1b5f76').
"input"."Data"."XD" s_1,
XSA('weblogic'.'1cdbf90a-570e-4ebb-946b-5510da1b5f76').
"input"."Data"."XTCTYPE" s_2,
XSA('weblogic'.'1cdbf90a-570e-4ebb-946b-5510da1b5f76').
"input"."Data"."XTESTCASE" s_3,
FROM XSA('weblogic'.'1cdbf90a-570e-4ebb-946b-5510da1b5f76').input."Data"
There is no work around for this issue in Oracle Analytics.

Thus, until the Database Actions SQL Interface and Oracle Analytics addresses how they handle Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, and Double.NaN, you should always make note of whether or not the table you wish to analyze consists of any rows with one or more of these values.