Changes in 12cR2.4.4.6

The following changes were made in Oracle NoSQL Database 12cR2.4.4.6.

New Features

  1. Added indexing for JSON data.

    As of the current release, SQL for Oracle NoSQL supports typed JSON indexes. As their name implies, such indexes place some type-related restrictions on the JSON data that is being indexed. Creation of a typed JSON index will fail if the associated table contains any rows with data that violate the restrictions imposed by the index. Similarly, an insert/update operation will be rejected if the new row does not conform to the restrictions imposed by one or more existing JSON indexes. However, as long as the type constraints are satisfied, typed JSON indexes are very similar to indexes on strongly typed data, both in the way their contents are evaluated as well as in the ways they are used by queries.

    Syntactically, the only difference between JSON and non-JSON indexes is that a type must be specified in the CREATE INDEX statement for each index path that indexes a JSON field. The type must be one of the JSON atomic types (a numeric type, or string, or boolean). Such a type declaration implies that the items returned by the index path expression, when evaluated on a table row, must be of the specified type, or SQL NULL, or JSON null. It is also allowed for the indexed path to not exist in the data; in this case the path expression returns an empty result and a special (internal) value EMPTY is placed in the index.

    Like non-JSON indexes, JSON indexes may be "simple" or "multikey". The later are used to index all the elements of an array, or all the entries of a JSON object. An example for each kind of index is shown below. They examples use the following table:

    create table foo (id integer,
                      firstName string,
                      lastName string,
                      address json,
                      primary key(id));

    A sample row of the above table may look like this (shown in JSON format):

    {
      "id":0,
      "firstName" : "John",
      "lastName" : "Doe",
      "address": {
        "street" : "somewhere",
        "city": "San Francisco",
        "state"  : "CA",
        "phones" : [ { "area":408,  "number":5039567, "kind":"work" },
                     { "area":415,  "number":2854026, "kind":"work" },
                     { "area":null, "number":8390129, "kind":"home" },
                   ]
      }
    }
    
    create index idx1 on foo (address.state as string,
                              address.city as string)

    In the above index, both the address.state and address.city paths must return strings, or NULL, or JSON null, or EMPTY (an SQL NULL may be returned only if the top-level address column is NULL). Furthermore, since the paths do not contain any multikey steps ([], or .values(), or .keys()), they must return at most one item, which implies that none of the fields in the paths (address, or address.state, or address.city) may be arrays.

    create index idx2 on foo (address.state as string,
                              address.phones[].area as integer,
                              address.phones[].kind as string)

    In the above index, the paths address.phones[].area and address.phones[].kind are multikey, because they contain the [] step. This implies that address.phones is expected to be an array (and none of the other fields in these 2 paths can be arrays). However, it is also allowed for address.phones to be a single JSON object, or even an atomic value. In general, the index paths are evaluated the same way as path expressions in DML queries. So, if address.phones is a single object having area and kind fields, the values of these fields will be indexed. If address.phones is an atomic value, the result of the path expression is empty, and EMPTY will be placed in the index. [#25509]

  2. An EMPTY value is used internally to represent cases where an expression returns an empty result. Applications do not normally have to deal with EMPTY values. The only exception is in IndexKey instances: when a TableIterator is used to scan an index and return index keys, the EMPTY value may appear in the returned IndexKey instances. The FieldValue.isEMPTY() method can be used to check if the value of an IndexKey field is EMPTY. Applications may also use the IndexKey.putEMPTY() method to search an index for entries containing the EMPTY value in one or more of their fields. [#25509]

  3. Added NUMBER data type. The NUMBER data type represents the java.math.BigDecimal value. It can be used as a universal numeric type that can handle numeric values of any size and precision. A field of type NUMBER can be used as a field of a primary key or an index key. All query expression that work on numbers have been extended to handle NUMBER values as well. [#25447]

  4. Added operators IS NULL and IS NOT NULL in SQL for NoSQL. As in standard SQL, the operators test whether the result of an expression is (or is not) the SQL NULL. The operators are "sargable", that is, they can be evaluated as conditions on index keys, if appropriate indexes exist. [#25809]

  5. Indexing of NULL values was introduced in 4.2. However, in 4.2 and 4.3 NULL was also used to index "missing" values, or more precisely, in cases where the evaluation of an index path on a row would return an empty result (EMPTY). In the current release, NULL and EMPTY are indexed separately. This is required to make the IS (NOT) NULL operators sargable. To see why, consider the following example:

    create table foo (id integer, map MAP(RECORD(f1 integer, f2 integer)))
    
    create index idx on foo (map.somekey.f1)
    
    select * from foo where map.somekey.f1 IS NULL
    
    Row R1: { "id" : 1, "map" : null }
    
    Row R2 : { "id : 2, "map" : { } }
    
    Row R3 : { "id : 3, "map" : { "somekey" : {"f2" : 10} } }

    Rows R1 and R3 are in the result set of the query, but not R2 (because on R2, map.key.f1 returns EMPTY).

    In 4.2, the index will contain NULL for all of the above rows. In the current release, it will contain NULL for R1 and R3, and EMPTY for R2. So, in the current release, the index can be used to optimize the query. [#25509]

  6. Allow quoted strings as steps in index paths. This is needed to be able to index a specific entry of a map, when the key of that entry in not a simple identifier. For example:

    CREATE TABLE Foo(id INTEGER, map MAP(INTEGER)), primary key(id))
    
    create index idx on Foo (map."@abc")

    Prior to this release, creating the above index was not possible, because quoted strings were not allowed as steps. An attempt to use the path map.@abc would fail, because @abc is not a valid identifier (due to the '@' char). [#25854], [#25958], [#25963], [#25974]

  7. Any escape sequences that appear in string literals inside a query are converted (inlined) to their corresponding unicode character when the query is parsed. This is required, because the JSON data model applies the same conversion. So, for example, when JSON text is loaded into Oracle NoSQL, any escape sequences in the JSON text are inlined. As a result, the string literal appearing in queries must have the same "format" internally, as the one used by the stored data. [#25767]

  8. Added a new plan plan network-restore to perform network restore from one RepNode to another within the same shard. This plan can be used to restore data from a secondary node to a primary RepNode; restore a replica from another one if the shard lose quorum and no master available and also could be used to restore a RepNode having corrupted data. [#25834]

  9. Since this release, configuration files of storage node, Admin, RepNode and security are set to owner read only by default. For example, the "config.xml" in root directory and "security.xml" in security configuration directory. [#25835]

  10. Oracle NoSQL Database store passwords of users after hashing in the security metadata database, the hashing algorithm was using PBKDF2WithHmacSHA1. Since this release, replace with stronger algorithm PBKDF2WithHmacSHA256. Note that this change won't update hashes of existing user passwords. It's recommended to update user passwords to make use of the new algorithm after upgrade. [#26014]

  11. Supports the "describe table" and "describe index" output in tabular format. [#25720]

  12. Displays the "describe table" and "describe index" output in tabular format. [#25720]

  13. The progress of a plan will be available via JMX. A JMX notification is generated to report on the status of a plan and its associated tasks for both serial and parallel plans. It reports on both the general tasks as well as tasks associated with elasticity operations such as migrate partition as shown in the sample out below. [#25200]

    {
      "planId": 17,
      "planName": "Deploy Topo (17)",
      "reportTime": 1483445963460,
      "state": "RUNNING",
      "attemptNumber": 1,
      "migratePartition_Total": 150,
      "generalTask_Total": 14,
      "migratePartition_Successful": 15,
      "generalTask_Successful": 13,
      "migratePartition_Running": 135,
      "generalTask_NotStarted": 1
    }

Bug and Performance Fixes

  1. In previous releases, the "covering index" optimization was not being considered for queries that did not have a WHERE clause. This has been fixed in the current release. As a result, a query like select id1 from foo, where "id1" is a primary key column of table "foo", will be evaluated by a keys-only scan over the primary index of the table, instead of retrieving all the table rows. [#25509]

  2. A bug has been fixed for order-by queries where the order-by clause includes primary-key columns. Because sorting is indexed-based, all of the indexed fields must appear in the order-by clause, before the primary-key columns. Otherwise, the query should be rejected. This restriction was not enforced in prior releases, which could lead to wrongly sorted results. [#25509]

  3. Certain combinations of queries and indexes on maps would cause an error during query execution. The most important case is when an index indexes more than 2 specific keys of a map and a query is using the index to evaluate one or more of its expressions during the index scan using only the index fields (without the need to retrieve the full table row). Here is an example that demonstrates such a case:

    CREATE TABLE Foo(
       id INTEGER,
       g  LONG,
       rec RECORD(a INTEGER,
                  b ARRAY(INTEGER),
                  c MAP(RECORD(ca INTEGER, cb INTEGER, cc INTEGER, cd INTEGER))),
    primary key(id))
    
    create index idx on Foo (rec.c.key1.ca, rec.c.key2.ca, rec.c.key3.cb)

    Query:

    select id from Foo f where f.rec.c.key1.ca >= 3 and f.rec.c.key2.ca = 20

    In this query, the 1st predicate in the WHERE clause is used as the start condition for the index scan. This does not cause any problems. The problem is caused by the second predicate, which is used as a "filtering" predicate: it is evaluated as the index is being scanned, using the 2nd field ("rec.c.key2.ca") of the current index entry. Furthermore, the index indexes 3 specific map keys ("key1", "key2", and "key3"). If the path "rec.c.key3.cb" was not part of the index definition, the bug would not show up. [#25822]

  4. The fact that map keys are case-sensitive was not taken into account when the query processor would consider using an index on a specific map key. This could lead to the wrong index being used by a query. For example:

    CREATE TABLE Foo (id INTEGER, map MAP(INTEGER), primary key(id))
    
    CREATE INDEX idx ON Foo (map.SomeKey)
    
    select id
    from Foo f
    where f.map.somekey = 3

    The above query would use the index. But this would be wrong because the map keys "SomeKey" and "somekey" are not the same.

    The case-sensitivity of the map keys was also not taken into account when, during and INDEX CREATE statement, a check is made to see if an index with the same fields exists already. [#25959]

  5. Elimination of duplicate query results was not done in some cases. Specifically, if a query used a multikey index and no predicates were pushed to that index, duplicate elimination was not done. An index will be used by a query, even if no predicates are pushed into ti, if the index is used for sorting (order-by queries) or if its use is forced by an index hint. [#26008]

  6. Fixed several bugs in comparison operators. The bugs can cause a query to return wrong results or raise an NPE. Fortunately the bugs show up in corner cases only. [#25832]

    • A bug can occur when long or double literal is compared with an expression whose type is integer or float, respectively, and the long/double literal is outside the range of integers/floats. Furthermore, the expression must return one or more NULLs, or an empty result, or more than one items and the operator is a value comparison operator.

    • A bug can occur when long or double literal is compared via != or !=any with an expression whose type is integer or float, respectively, and the long/double literal is outside the range of integers/floats.

    • A bug occurs when JSON null is compared with an expression whose type does not contain JSON null, and either (a) the expression returns SQL NULL, or (b) the expression returns EMPTY (the empty sequence) and the comparison operator is !=any.

    • A bug occurs when a multi-valued path expression is compared with a const, the path expression matched a multi-key path of an index, the query will use that index, and the comparison predicate is converted to a start/stop condition for the index scan. For example, t.array[] = 5 where "array" is a column of table t with type array(integer). Normally, this predicate will raise an error if there is any row where the array contains more than 1 elements. However, if there is also an index on "array" and the predicate is pushed to that index, the error could be lost, and the query would return a result.

  7. Fixed a compilation bug in array filter expression, when the input expression has EMPTY type. For example, the following query will raise an EmptyStackException: select f.info.children.keys().age[$element > 4] from foo f In the path expression appearing in the SELECT clause, the keys() step will always return a bunch of strings. As a result, the .age step on those strings will always an empty result. So, the compiler would assign the type EMPTY to the f.info.children.keys().age expression. This was not handled correctly by the following array-filter step ([$element > 4]). [#25877]

  8. The code that parses JSON text and maps it to a strongly-typed table schema was not handling invalid JSON documents correctly. This has been fixed in this release. [#25842]

  9. Clarified in the documentation that the values returned by toByteArray (and related) methods are not guaranteed to work with older releases. Byte array values created with either the current or earlier releases can passed to the associated methods that accept byte array values for the current or later releases, but values created by later releases are not guaranteed to be compatible when passed to methods for earlier releases.

    These are the affected methods:

    • Consistency.toByteArray and fromByteArray

    • Durability.toByteArray and fromByteArray

    • ExecutionFuture.toByteArray and KVStore.getFuture

    • Key.toByteArray and fromByteArray

    • KeyRange.toByteArray and fromByteArray

    • Value.toByteArray and fromByteArray

    • Version.toByteArray and fromByteArray

  10. Fixed the RMI registry filter issue occurs while running a secure store with Java SE Development Kit 8, Update 121 (JDK 8u121), which is caused by the Java new feature called RMI better constraint checking. The fix is to add patterns oracle.kv.**;java.lang.Enum to RMI registry filter sun.rmi.registry.registryFilter automatically if they are not present while starting a Storage Node Agent for both secure and non-secure stores. [#25923]

  11. Changed the SSL cipher suites preference order to favor suites using GCM cipher algorithm mode if no user-specified cipher suite is configured. In our experimental performance test, it is tracked that suites using GCM provide better performance, particularly on hardware with limited sources of secure randomness. [#25949]

  12. Changed the error message when issue a query on a closed store handle to be more understandable. [#25883]

  13. Fixed the issue that failover operation may fail if a secondary zone has more recent data than existing alive primary zones. The fix added potential data loss verification to detect if there is a data loss after failover. To retain data in secondary zones, users need to perform network restore first from leading secondary nodes to primaries. If not, specify force and re-execute the failover plan. [#24772]

  14. Fixed the issue that the '0.0' in JSON string is failed to be parsed as a float value. [#25983]

  15. Fixed the issue that restarting a secured KVLite in EE version may fail with IllegalStateException "Unable to access the configured wallet store". This is due to concurrent attempts to open the wallet file. [#25990]

  16. Upgraded the Elasticsearch library to v2.4.4. [#25943]

  17. Fixed the issue where the Admin failed to restart on a IllegalCommandException. When an Admin node restarts and becomes a master, it restarts all the plans that were running at the time of failover. Prior to this fix If IllegalCommandException occurred during the restarting of a plan, the Admin may fail to restart. [#26022]

  18. Fixed the issue for bulk put API where the entries with duplicate key supplied by a single stream are marked as existed before the key is actually found in store. [#25903]

Utility Changes

  1. Change field name from "datetime" to "reportTime" in following type of SN JMX notification JSON format data. [#25979]

    • oracle.kv.repnode.opmetric

    • oracle.kv.repnode.envmetric

    • oracle.kv.repnode.replicationstate

    • oracle.kv.repnode.status

    • oracle.kv.plan.status

  2. Added "put" command to sql shell to put row(s) into the named table: put -table <name> [-json <string> | -file <file> [JSON | CSV]]