Changes in 12cR1.4.3.10

The following changes were made in Oracle NoSQL Database 12cR1.4.3.10.

New Features in SQL for Oracle NoSQL

  1. JSON data type and querying of JSON data.

    JSON can be used as the data type of a top-level table column, of a nested record field, or as the element type of an array or map. JSON is an abstract type: the actual value stored in a field of type JSON can be a valid json atomic value (integer, long, double, string, boolean, or the json null value), or an array, or a map. If the value is an array/map, the element type of the array/map will also be JSON. This implies that, unlike their fixed-schema counterparts, arrays and maps of JSON can contain heterogeneous elements.

    On input, JSON data is passed to Oracle NoSQL as a string or a text stream. In both cases, the text is parsed internally, and its constituent parts are converted to a tree of maps, array, and atomic value instances. Direct construction of JSON trees is also possible via APIs that construct and connect maps, array, and atomic values. For persistent storage, the tree is serialized into a binary format.

    JSON data is schema-less, in the sense that a field of type JSON can have very different kind of values in different table rows. For example, if "info" is a top-level table column of type JSON, in one row the value of "info" may be an integer, in another row an array containing a mix of doubles and strings, and in a third row a map containing a mix of other maps, arrays, and atomic values. Furthermore, the data stored in a JSON column or field, can be updated in any way that produces a still valid JSON instance. As a result, each JSON tree (either in main memory or as a serialized byte array on disk) is self-describing with regard to its contents.

    SQL for Oracle NoSQL can operate seamlessly on both data models: the strongly typed data supported in prior releases, as well as the schema-less JSON data. Tables can mix data from both models and all of the DML expressions can operate on both kinds of data. [#25455]

    Indexing JSON data is not supported yet. [#25455], [#25436], [#25437], [#25727], [#25197], [#25731], [#25681], [#25683], [#25688], [#25559], [#25740], [#25629], [#25669], [#25671]

  2. New kinds of expressions in SQL for Oracle NoSQL.

    • NOT operator. Negates a boolean value. [#25455]

    • EXISTS operator. Checks whether an expression returns at least one value. [#25455]

    • IS OF TYPE expression. Checks whether a value or a set of values has a specified data type. [#25437]

    • CAST expression. Casts, if possible, a value or set of values to a specified data type. [#25436]

    • Map constructor. Constructs a new map value. Both the field names and the field values may be computed by other expressions. Field values may be heterogeneous. [#25455], [#25629]

    • CASE expression. Implements if-then-else semantics. [#25197]

    • json null value and true/false literals. The (case-insensitive) keywords null, true, and false can be used as literals in a query. These are the only reserved keywords in SQL for NoSQL. DML operators and expressions were extended, if needed, to handle the new json null value. [#25455], [#25731], [#25683]

  3. TIMESTAMP data type and values. A timestamp represents a point in time. It contains a date and a time. The date part has 3 components: year, month, and day of month. The time part has 4 components: hour, minute, second, and a decimal fraction of the second. The number of digits used to represent the fraction of a second is called the precision of the timestamp, and it can be a number between 0 (no fractional second) to 9 (nano second precision). Fields of timestamp types can be indexed and accessed in SQL queries. The SQL comparison operators have been extended to compare timestamps, and the cast operator can cast strings of appropriate formats to timestamp values. [#24775], [#25707]

Other Changes in SQL for Oracle NoSQL

Several of the pre-existing expressions in SQL for Oracle NoSQL have changed in this release, mostly to make such expressions more "JSON friendly". Some of these changes are backwards incompatible and applications may have to change their queries as a result.

  1. Implicit array construction in SELECT clause If any expression in the SELECT list returns more than one values, an array is constructed to contain the values. [#25455], [#25629]

  2. Several changes in path expressions, both syntactic and semantic.

    • Field steps accept atomic items as input. A field step returns an empty result if its input is an atomic item. Previously, an error would be raised in this case. [#25455], [#25688]

    • New syntax for filtering the entries of maps and records. [#25455], [#25575], [#25740], [#25724]

      Previously, square brackets ([<cond>?]) were used to filter the values of both maps and arrays. Furthermore, there was no way to filter map entries and return the qualifying map keys (instead of the values). Finally, there was no way to filter the entries of records. This release introduces the .keys(<cond>?) and .values(<cond>?) path steps. They are both supposed to work primarily with maps and records. .keys(<cond>?) selects the map/record entries that satisfy the condition (if any) and returns the keys (field names) of these entries. .values(<cond>?) selects the map/record entries that satisfy the condition (if any) and returns the field values of these entries. If the input is an array, .keys(<cond>?) and .values(<cond>?) are applied recursively on the array elements. If the input is an atomic item, the result is empty.

      This change also affects DDL statements that use paths, because the syntax for DDL paths is a subset of the DML syntax for paths. Most importantly, CREATE INDEX statements must use the new syntax to index maps. Furthermore, applications that access multi-key map indexes directly, via the TableIterator APIs, must be updated to use the new path syntax to name the fields of the IndexKey instances they build to probe the indexes (alternatively, the IndexKey fields can be accessed by their position as well).

    • Filtering and slicing arrays. The square brackets are still used to filter or slice arrays. They work the same way as before on arrays, except that the variable $elementPos has been renamed to $pos. However, in 4.3, the behavior is different if the input is not an array: in this case, an array is created on the fly, and the input item is inserted into that array; then, the filtering/slicing step operates on that array the same way as with any other array. [#25455], [#25575], [#25740]

  3. Change in how the value comparison operators handle empty operands. Previously, if an operand returned the empty result, the result of the comparison would also be empty. In 4.3, if both operands return the empty sequence, the operands are considered equal (so true will be returned if the operator is =, <=, or >=). If only one of the operands returns empty, the result of the comparison is false unless the operator is !=. [#25763]

  4. Change in how the value comparison operators handle incomparable values. Previously, an error would be returned in this case. In 4.3, false is returned as the result of the comparison. [#25455], [#25681]

  5. Change in how the sequence-comparison operators handle the SQL NULL Previously, if any of the operand sequences contained NULLs, those NULLs would just be ignored. In 4.3, if a matching pair of items is not found, and either of the sequences contains a NULL, the result of the comparison is NULL (instead of false). [#25455]

  6. Changes in array constructors. In 4.3, queries can construct arrays containing heterogeneous items. Furthermore, if an input expression returns an SQL NULL, the NULL is ignored (not inserted in the constructed array). Previously, both heterogeneous items and NULLs would cause an error to be raised. [#25455], [#25730], [#25629]

Other New Features

  1. The information that is published to the <kvroot>/log/<storename>.stats and <kvroot>/log/<storename>.perf files is also available through the NoSQL JMX agent, via the standard mechanism. See the javadoc for oracle.kv.mgmt.jmx.StorageNodeMXBean for more information. [#24979]

  2. The "makebootconfig" command now enables security by default. The -store-security flag is now optional, and has the value enable if not specified. You can still configure a non-secure store by specifying none as the value of this flag. In addition, the "kvlite" command now creates a secure store by default. You can create a non-secure store by specifying -secure-config disable. [#25440]

Bug and Performance Fixes

  1. Strings used as keys in the MapValue type are now case-sensitive. Previously they were case-insensitive, which was a bug. Applications that assume case-insensitivity in MapValue keys may be affected. Field names in Record values remain case-insensitive, as documented. [#25598]

  2. Tables created with map of json and array of json types will now work properly. Previously rows of these types would be created but could not be deserialized and retrieved. [#25559]

  3. Removed the no-longer needed MapValue.putNull method and added methods on MapValue, ArrayValue, RecordValue and FieldDef, to allow construction of JSON null values for insertion into JSON maps and arrays. [#25671]

  4. Fixed a problem where, if a table was created with fields other than primary key fields and it was evolved such that it contained only primary key fields, queries on the table could fail to return data. [#25766]

  5. Fixed a problem that could result in a NullPointerException on a RepNode processing a query if a different RepNode were to become unavailable. [#25792]

  6. Fixed a problem that could result in an ArrayIndexOutOfBoundsException when specifying a ReturnRow on a table where the primary key fields are not declared as the first fields of the table. [#25819]

  7. Implemented compatibility with pre-4.3 DDL syntax for creating map indexes. Syntax allowed now includes:

    • path-to-map.keys() and path-to-map.values(), which are the current/new syntax.

    • KEYOF(path-to-map)

    • KEYS(path-to-map)

    • ELEMENTOF(path-to-map), path-to-map[]

    In addition, code has been added to handle pre-4.2 clients operating against 4.2 and later indexes that support null values. [#25864]

  8. Added kvstore-ee.jar to the kvclient.jar manifest so that the Enterprise Edition works without the need to fix the classpath. This works in the Community Edition as well, where kvstore-ee.jar does not exist at all.

  9. Fixed a bug that "create fulltext index if not exists" throws exception if the index exists, it should complete successfully but not require execution. [#25664]

  10. Fixed a problem that could result in an ArrayIndexOutOfBoundsException when specifying a ReturnRow on a table where the primary key fields are not declared as the first fields of the table. [#25819]

  11. Fixed a query bug where an "always false" filtering step, like t.array[false].field, would actually be evaluated as an "always true" step (t.array[false].field). [#25708]

  12. Fixed a query bug that would remove the entire FROM clause if the table name or table alias was not used anywhere in the rest of the query. This would cause the query compiler to throw an ArrayIndexOutOfBoundsException. [#25768]

  13. Fixed a query bug that would cause a NPE when the table queried is a child table and is used without a table alias. [#25645]

  14. Fixed a query bug that could cause an ArrayIndexOutOfBoundsException if the compiler could deduce that an expression would always return an empty result. For example, an ArrayIndexOutOfBoundsException would be thrown by this query: select id[10:0] from foo f [#25673]

  15. Fixed a query bug: The fact the table names are case insensitive was not always taken into account in resolving column reference in a query. [#25747]

  16. Fixed a bug that would cause an IllegalCommandException to be raise for a valid ALTER TABLE statement, when an index on an array or map existed on the same table. [#25782]

  17. Fixed a query bug where a NULLS LAST directive in the order-by was being ignored, if the sort direction was DESC. Instead, an error should be thrown in this case. [#25785]

  18. Added new methods to the oracle.kv.table.FieldValueFactory class for creating a TimestampValue from a Timestamp, String, or components: [#25654]

    public static TimestampValue createTimestamp(Timestamp v, int precision);
    public static TimestampValue createTimestamp(String s, int precision);
    public static TimestampValue createTimestamp(int year, int month, int day,
                                                 int hour, int minute,
                                                 int second, int nano,
                                                 int precision);

    Added new methods to the oracle.kv.table.TimestampValue interface to return the components of a Timestamp value:

    public int getYear();
    public int getMonth();
    public int getDay();
    public int getHour();
    public int getMinute();
    public int getSecond();
    public int getNano();
  19. Fixed a bug that could result in failure in reading records from a table which contains a map/array field with min/max contraints on its element after upgrade from KV 3.0 to KV 4.3. [#25799]

  20. Fixed a problem that could result in failure in table evolving when adding a new field that contains a nested fixed_binary, enum or record field. [#25793]

  21. Changed query shell output default from COLUMN to JSON. [#25700]

  22. Restricted numeric type mapping in JSON to integer, long, and double. This eliminates float as an option in JSON. [#25699]

  23. Improved the error checking performed when processing DDL commands to provide clearer feedback when the requested command uses a feature that is not supported on all of the nodes in the store because of the nodes still need to be upgraded. [#25712]

  24. Added checks to confirm that all shards have quorum and a majority of nodes in each zone being changed are online before deploying a topology that changes the type of one or more zones. This change makes it more likely that the topology change will detect and report, prior to making any changes, that, given the current lack of availability of certain replication nodes, the command will fail. [#24503]

  25. Fixed upgrade problem where primary key fields marked as not nullable would cause serialization issues in queries causing them to fail. Primary keys can no longer be explicitly marked a not nullable or with a default because they are implicitly not nullable and cannot have default values. [#25533]

  26. Fixed a memory leak that could, in some cases, cause an OutOfMemoryError to be thrown when a replication node needs to do a Network Restore when performing replication. [#25649]

  27. Improved the ability of the Admin CLI to validate new SN parameters when performing 'plan change-parameters' and 'change-policy' commands. With this change, commands that specify invalid SN parameters will fail without modifying the parameters, rather than having the incorrect parameters prevent SNs from being restarted. [#25636]

  28. Fixed a problem in the Load utility that prevented it from restoring security information for a secure store. [#24642]

  29. Fixed a problem that could prevent master rebalancing from working correctly because of excessive logging by reducing redundant debug logging entries. [#25625]

  30. Fixed a problem in Hive and Big Data SQL query processing where a the SELECT clause from one query might affect the behavior of the next query if that query did not specify a SELECT clause. [#25626]

Storage Engine Changes (JE 7.3)

  1. JE's low-level operation throughput statistics have been simplified and improved. Previously, these statistics represented API calls rather than CRUD operations and this caused confusion when a single API call performed multiple CRUD operations. Also, some CRUD operations (key search operations on any RN in a shard, all operations on a replica RN) were missing, and no operation statistics were included in the admin .stat files. The JE throughput stats were previously only visible via the env/je.stat.csv file.

    Now, the following statistics representing CRUD operations are output in the admin .stat files as well as env/je.stats.cvs, and are included for all RNs, including replicas. The new statistics should be considered internal operations or internal units of work. This approach is used to allow correlating internal operations to performance measurements. [#23792]

  2. Data corruption is now detected as soon as possible by a new internal JE background task. This detects data corruption caused by media/disk failure by reading the log sequentially and verifying checksums. This is the equivalent of running the JE DbVerifyLog utility, but it is performed automatically and periodically on every RN.

    By default, verification is on and occurs once a day at midnight, local time. Although this should not normally be necessary, verification can be disabled or the verification schedule can be changed using the JE EnvironmentConfig.ENV_RUN_VERIFIER, VERIFY_SCHEDULE and VERIFY_LOG parameters (je.env.runVerifier, je.env.verifySchedule and je.env.verifyLog).

    When corruption is detected, the RN will be shut down and will not be restarted automatically by the SN. A SEVERE level exception will be logged in the RN's log and the exception message will contain the LOG_CHECKSUM token. Manual intervention by an administrator is necessary in this situation. We do not recommend restarting the RN without first replacing the media and/or forcing a network restore from another node in the shard.

    The advantage of performing verification frequently is that a media/disk problem may be detected sooner than it would be otherwise. This means that the network restore can be done while the other RNs in the shard are up, minimizing exposure to additional failures. [#25221]

  3. Repeat-fault reads have been eliminated for LNs (Btree leaf nodes, which represent record data on disk.) Previously, if an LN's on-disk size was greater than 4kB, two reads would be required to fetch the LN from disk. The first read would always include the log entry header, which contains the exact entry size, and the second read (repeat-read) was needed to read the entire entry. The second read includes the entire entry, although normally it will be cached by the file system.

    Now, only a single read is needed because the last logged size for LNs is now stored in the Btree, and this can be used to determine the exact size needed for the read buffer. The benefits of this change are:

    1. the amount of IO is reduced (although the repeat-read normally reads data that is cached by the file system)

    2. the statistics describing IO activity are simpler to analyze without the repeat-reads in the picture.

    Note that INs (Btree internal nodes) can still cause repeat-reads when they are fetched, because the last logged size for INs is not stored in the Btree. However, in most applications all INs are cached and therefore INs are rarely read from disk (except during a cache warm-up period). The nRepeatFaultReads statistic (EnvironmentStats.getNRepeatFaultReads) indicates the number of repeat-reads. [#25387]

  4. A network restore (data copy from one RN to another RN in a shard) may be instigated for various reasons, for example, when a replica has been down for a long time period and is then brought up again. Previously, if a network restore was interrupted and the RN was then restarted, the RN could have attempted to operate with a partial data set, causing unpredictable results. Now, the use of a partially restored RN is not possible, and the network restore will automatically be restarted. If a network restore is interrupted, the RN's env directory will contain a marker file named 0x7fffffff.jdb, which is recognized and managed by JE. [#25369]

  5. Fixed a bug that prevented the request timeout for a write operation from being honored. Under certain circumstances, the request took longer than the specified timeout. [#25692]

Deprecated Features

  1. The NONE_REQUIRED_NO_MASTER (oracle.kv.Consistency.NONE_REQUIRED_NO_MASTER) consistency policy that requires that a read operation be serviced on a replica, never the Master, is deprecated in this release. Applications should use the NONE_REQUIRED consistency policy or KVStoreConfig.setReadZones instead.