Changes in 12cR2.4.5.12

The following changes were made in Oracle NoSQL Database 12cR2.4.5.12 Enterprise Edition.

New Features

  1. Introduced a Streams API that allows users to subscribe to all the changes made by write operations to an Oracle NoSQL Database table. The changes are streamed to your application as a series of discrete StreamOperation class objects. These APIs are an implementation of the Reactive Streams interfaces. The Oracle NoSQL versions of the streams APIs are prefixed with NoSQL to differentiate them from the APIs described by the Reactive Streams standard. For example, Reactive Streams describes a Publisher class; Oracle NoSQL Database's implementation of that class is NoSQLPublisher. For more information, see the "Getting Started with Oracle NoSQL Database Streams API" guide. [#24871]
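
    As an illustration, the sketch below implements a Reactive Streams subscriber that logs each change. The org.reactivestreams interfaces and the oracle.kv.pubsub package name are assumptions here; the product's NoSQLSubscriber interface adds further callbacks, and constructing the NoSQLPublisher to which this object is subscribed is covered in the guide.

    import org.reactivestreams.Subscriber;
    import org.reactivestreams.Subscription;

    import oracle.kv.pubsub.StreamOperation; // package name is an assumption

    public class ChangeLogger implements Subscriber<StreamOperation> {
        private Subscription subscription;

        @Override
        public void onSubscribe(Subscription s) {
            subscription = s;
            s.request(100); // request the first batch of changes
        }

        @Override
        public void onNext(StreamOperation op) {
            // Each StreamOperation describes one write (a put or a delete)
            // applied to the subscribed table.
            System.out.println(op);
            subscription.request(1); // pull the next change
        }

        @Override
        public void onError(Throwable t) { t.printStackTrace(); }

        @Override
        public void onComplete() { System.out.println("stream closed"); }
    }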

  2. Added UPDATE statement to perform single-row, server-side updates with SQL.

    The UPDATE statement in Oracle NoSQL Database is an extended version of the one in standard SQL. In addition to the SET clause, which is used to replace the value of existing data fields, the ADD, PUT, and REMOVE clauses have been added to support adding and removing elements and fields to and from arrays and maps. Furthermore, the already powerful path expressions of Oracle NoSQL can be used to identify multiple target fields to update (whether those fields are inside JSON or strongly typed data) and to compute new values, potentially based on the current values. However, in the current implementation, only a single row may be updated, which implies that the WHERE clause must specify a complete primary key.

    As an example, assume a table called "People" with only two columns: an integer "id" column and an "info" column of type JSON. Furthermore, assume that the following row is to be updated:

    { 
      "id" : 0,
      "info" :
      {
        "firstName" : "John",
        "lastName" : "Doe",
        "profession" : "software engineer",
        "income" : 200000,
        "address" :
        {
          "city" : "San Fransisco",
          "state" : "CA",
          "phones" : [ { "areacode":415, "number":2840060, "kind":"office" },
                       { "areacode":650, "number":3789021, "kind":"mobile" },
                       { "areacode":415, "number":6096010, "kind":"home" }
                     ]
        },
        "children":
        {
          "Anna" : { "age" : 10, "school" : "ABC", "friends" : ["John", "Maria"] },
          "Ron"  : { "age" : 2 },
          "Mary" : { "age" : 7, "school" : "XYZ", "friends" : ["Mark"] }
        }
      }
    }

    The following UPDATE statement updates various fields in the above row:

    update People p
    set p.info.profession = "surfing instructor",
    set p.info.address.city = "Santa Cruz",
    set p.info.income = $ / 10,
    set p.info.children.values().age = $ + 1,
    add p.info.address.phones 0 { "areacode":831, "number":5294368, "kind":"mobile" },
    remove p.info.address.phones[$element.kind = "office"],
    put p.info.children.Ron { "friends" : ["Julie"] }
    where id = 0
    returning *

    The RETURNING clause at the end of the above statement specifies that the whole row, after the update, must be returned. So, the result of the statement looks like this:

    {
      "id" : 0,
      "info" :
      {
        "firstName" : "John",
        "lastName" : "Doe",
        "profession" : "surfing instructor",
        "income" : 20000,
        "address" :
        {
          "city" : "Santa Cruz",
          "state" : "CA",
          "phones" : [ { "areacode":831, "number":5294368, "kind":"mobile" },
                       { "areacode":650, "number":3789021, "kind":"mobile" },
                       { "areacode":415, "number":6096010, "kind":"home" }
                     ]
        },
        "children":
        {
          "Anna" : { "age" : 11, "school" : "ABC", "friends" : ["John", "Maria"] },
          "Ron"  : { "age" : 3, "friends" : ["Julie"] },
          "Mary" : { "age" : 8, "school" : "XYZ", "friends" : ["Mark"] }
        }
      }
    }

    The first two SET clauses change the profession and city of John Doe. The third SET reduces his income to one tenth. In this SET, the implicitly declared $ variable is bound to the target item of the SET, i.e., to the result of the p.info.income expression. The fourth SET increases the age of John's children by 1. Notice again the use of the $ variable here: the expression p.info.children.values().age returns 3 ages; the SET iterates over these ages, binds the $ variable to each age in turn, computes the expression $ + 1 for each age, and updates the age with the new value.

    The ADD clause adds a new phone at position 0 inside the phones array. The REMOVE clause removes all the office phones (only one in this example). The PUT clause adds a friend for Ron. In this clause, the expression p.info.children.Ron returns the value associated with the Ron child. This value is a map (the JSON object { "age" : 3 }) and becomes the target of the update. The second expression in the PUT ({ "friends" : ["Julie"] }) constructs and returns a new map. The fields of this map are added to the target map. [#26161]
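
    Statements like the one above can also be executed from the Java driver. The following is a minimal sketch using KVStore.executeSync, assuming an open store handle; the loop consumes the rows produced by the RETURNING clause.

    import oracle.kv.KVStore;
    import oracle.kv.StatementResult;
    import oracle.kv.table.RecordValue;

    public class UpdateExample {
        // 'store' is assumed to be an open handle obtained from
        // KVStoreFactory.getStore(...).
        static void updateProfession(KVStore store) {
            StatementResult result = store.executeSync(
                "update People p " +
                "set p.info.profession = \"surfing instructor\" " +
                "where id = 0 " +
                "returning *");
            for (RecordValue row : result) {
                System.out.println(row.toJsonString(true)); // updated row
            }
        }
    }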

  3. Changes in the cast expression.

    • A record can now be cast to a map. This allows a user to cast a record to a JSON object.

    • A map can be cast to a record. This is needed for updates: because there is no record constructor, a user who wants to replace a whole record with a new one constructs a map, which is then cast to the record type by the UPDATE statement.

    • In the previous implementation, casting to JSON was a no-op if the input value was a subtype of JSON. However, this was actually a bug because, for example, ARRAY(INT) is a subtype of JSON, and casting an ARRAY(INT) value to JSON should create an array of type ARRAY(JSON). So, if the target type is JSON, the current implementation makes sure that any arrays/maps contained in the input value are cast to ARRAY(JSON)/MAP(JSON) in the output value. Furthermore, since casting records to maps is now allowed, casting a record to JSON is now treated as casting the record to MAP(JSON). Before, this would raise an error. [#26161]
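
    As an illustration of the record-to-JSON case, the hypothetical fragment below casts a strongly typed record column to JSON. The table and column names are invented, and 'store' is a KVStore handle as in the earlier sketch.

    // Hypothetical table "Emp" with a strongly typed RECORD column "addr".
    // Under the new rules, the cast produces a MAP(JSON) value.
    StatementResult result = store.executeSync(
        "select cast(e.addr as JSON) from Emp e");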

  4. Added the SQL function seq_concat. It returns the concatenation of the sequences returned by its argument expressions. [#26161]
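
    For example, the fragment below (a sketch against the People table used earlier, with 'store' again a KVStore handle) concatenates two sequences per row:

    // Concatenate the sequence of phone area codes with the sequence of
    // child names produced by the keys() step.
    StatementResult result = store.executeSync(
        "select seq_concat(p.info.address.phones[].areacode, " +
        "p.info.children.keys()) from People p");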

  5. Added SQL functions to extract temporal fields from a TIMESTAMP field. [#26046]
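
    A hypothetical example follows; the table and column are invented, and the specific function names (year, month) are assumptions, so consult the SQL reference for the full list:

    // Hypothetical table "events" with a TIMESTAMP column "ts".
    StatementResult result = store.executeSync(
        "select year(e.ts), month(e.ts) from events e");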

  6. Storage directory sizes are now used to enforce disk usage, and new statistics are provided for monitoring disk usage. We strongly recommend that all applications:

    • Specify storage directories and sizes for all RNs.

    • Monitor disk usage using the new availableLogSize statistic, and take corrective action well before this value reaches zero.

    Specifying a storage size is important because the storage engine reserves data files for potential replication to nodes that are out of contact, and more reserved files are retained for this purpose in this release. If a storage size is not specified, all of the free space on the volume (minus 5GB) will eventually be used.

    If monitoring is not performed or corrective actions are not taken, there is a danger that the storage size (or the volume size) will be exceeded as the data set grows. Although less likely, this could also occur because of a configuration error (for example, too small a cache size), neglecting to delete a snapshot, and so on. Disk usage is now monitored internally, and write operations are rejected when the storage size (or volume size) is in danger of being exceeded. In this situation, read operations are still allowed. Previously, the RN would cease to function when the volume was filled, i.e., no operations could be performed; allowing read operations now provides partial availability in this situation. In addition, it is now guaranteed that at least 5GB of free space on the volume is maintained, regardless of the specified storage size.

    The availableLogSize statistic represents the amount of space that can be used by write operations, taking into account that reserved data files will be deleted automatically when necessary to perform write operations. Note that monitoring disk usage in the file system is not meaningful, because of the presence of these reserved files that will be deleted automatically.

    The availableLogSize statistic is one of several new statistics:

    • activeLogSize -- Bytes used by all active data files: files required for basic operation.

    • reservedLogSize -- Bytes used by all reserved data files: files that have been cleaned and can be deleted if they are not protected.

    • protectedLogSize -- Bytes used by all protected data files: the subset of reserved files that are temporarily protected and cannot be deleted.

    • protectedLogSizeMap -- A breakdown of protectedLogSize as a map of protecting entity name to protected size in bytes.

    • availableLogSize -- Bytes available for write operations when unprotected reserved files are deleted: free space + reservedLogSize - protectedLogSize.

    • totalLogSize -- Total bytes used by data files on disk: activeLogSize + reservedLogSize.

    These statistics are included in the .stat files and can be monitored using the JMX oracle.kv.repnode.envmetric type. In the JMX output, these statistic names are prefixed by "Cleaning_" because they are in the log cleaning (disk garbage collection) statistics group, for example: "Cleaning_availableLogSize".
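
    For example, a standard JMX notification listener can watch for the oracle.kv.repnode.envmetric type. In the sketch below the service URL and the MBean ObjectName pattern are assumptions; consult the monitoring documentation for the actual names.

    import javax.management.MBeanServerConnection;
    import javax.management.Notification;
    import javax.management.NotificationListener;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class EnvMetricWatcher {
        public static void main(String[] args) throws Exception {
            // Hypothetical host and port of a Storage Node's JMX registry.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://snhost:5555/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbs =
                    connector.getMBeanServerConnection();
                NotificationListener listener = (Notification n, Object hb) -> {
                    if ("oracle.kv.repnode.envmetric".equals(n.getType())) {
                        // The payload includes Cleaning_availableLogSize.
                        System.out.println(n.getUserData());
                    }
                };
                // The ObjectName pattern below is an assumption.
                for (ObjectName name : mbs.queryNames(
                         new ObjectName("Oracle NoSQL Database:type=RepNode,*"),
                         null)) {
                    mbs.addNotificationListener(name, listener, null, null);
                }
                Thread.sleep(Long.MAX_VALUE); // keep the connection open
            }
        }
    }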

    For a new store, the storage size is specified using the makebootconfig -storagedirsize argument. For an existing store, the storage size can be added (or modified) using the change-storagedir plan. Note that in both cases -storagedir must be specified, even if the capacity is one.

    Because specifying a storage directory size is recommended, a warning is now printed by makebootconfig when -storagedir is specified but -storagedirsize is not specified. [#25220]

  7. The verify configuration command now issues a warning if the user does not specify -storagedirsize and -rootdirsize. We recommend that users set these parameters when installing the store to help manage disk space. [#26187]

  8. Increased the table name size from 32 characters to 256 characters. [#26021]

Bug and Performance Fixes

  1. Better index usage by queries.

    In prior releases, conditions that appear inside path filtering steps were not considered sargable; that is, they were never used in determining the start or stop points of an index scan. In the current release, this restriction is removed. As an example, consider the following query:

    select id
    from foo f
    where f.info.address.phones[408 <= $element.areacode and
                                $element.areacode <= 650].kind =any "work"

    If there is an index on (info.address.phones[].areacode, info.address.phones[].kind), all predicates in the query can now be used for the index scan. So, a scan on this index will start at areacode 408 and stop right after areacode 650. During the scan, only entries whose "kind" field is equal to "work" will be selected. Before this fix, if the index was chosen by the query, the full index would be scanned, and only the equality predicate on "kind" would be applied during the scan.

    In addition to path-filtering predicates, EXISTS predicates are also treated as potentially sargable predicates in this release. For example, consider the following query:

    select id
    from foo f
    where exists f.info.address.state and exists f.info.address.phones.areacode

    In this case, if we have an index on (info.address.state, info.address.phones[].areacode), both predicates in the query are sargable. Two ranges of the index will be scanned. The first range starts at the beginning of the index and finishes at the first entry having EMPTY as the value of the state field. The second range starts just after this EMPTY value and finishes at the end of the index. During each scan, entries having the EMPTY value for the areacode field are eliminated. [#26044]

  2. Fixed a bug in the case where a query uses a secondary index and also has a predicate on a primary-key column that is not part of the index definition. Under some conditions, the predicate on the primary-key column was used as a start/stop predicate for the index scan. This is not possible, and the query would raise an IndexOutOfBoundsException during compilation. An example is the following:

    create table t1(id integer, name string, primary key(id));
    
    create index idx1 on t1(name);
    
    select * from t1 where name = "alex" and id > 0

    With this bug fix, the condition on the "id" column is no longer used as a start/stop predicate; it is used as an index-filtering predicate instead. [#26358]

  3. Added functionality to the plan failover command to add, move, or remove Arbiters in the topology. [#25269]

  4. Fixed a performance bug in SQL query processing by removing an unnecessary remote call to fetch table metadata. This call added a one-time overhead of close to 2ms to each SQL query. For queries that access only a few rows this is a large overhead, especially when the rows are cached in memory. For example, the latency of a query that accesses and returns a single memory-cached row was five times that of an equivalent table iterator API call. [#26232]

  5. Fixed a query bug that occurs when a TableQuery operation is forwarded by a server to another server. For example, this may happen during partition migration. The bug was in the serialization method of TableQuery, because that method assumed that serialization occurs only when the TableQuery is sent from a client to the servers. [#26385]

  6. Users can now set the PASSWORD LIFETIME to zero; a value of zero days means the password never expires. Prior to this fix, issuing "ALTER USER admin PASSWORD LIFETIME 0 DAYS" would fail with the message "Time value must not be zero or negative". [#26040]

  7. Changed the way the Java driver configures Java loggers to make it easier to configure additional log handlers. Applications can now be configured to send logging output to a file by specifying a file logging handler for the "oracle.kv" logger. [#26134]
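
    For example, the following sketch (with an arbitrary file name) attaches a file handler using the standard JDK logging API:

    import java.io.IOException;
    import java.util.logging.FileHandler;
    import java.util.logging.Logger;
    import java.util.logging.SimpleFormatter;

    public class DriverLogSetup {
        public static void main(String[] args) throws IOException {
            // Attach a file handler to the driver's logger so that
            // client-side logging output is written to kvclient.log.
            Logger driverLogger = Logger.getLogger("oracle.kv");
            FileHandler handler = new FileHandler("kvclient.log");
            handler.setFormatter(new SimpleFormatter());
            driverLogger.addHandler(handler);
        }
    }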

  8. Fixed bugs that allowed queries to execute against a secure kvstore after a session timeout without returning AuthenticationRequiredException. [#26250], [#26249]

  9. The method oracle.kv.table.Index.createIndexKey(RecordValue value) has been deprecated in this release. This method is not able to construct index keys for multi-key indexes, which include elements of a map or array and can therefore result in multiple index entries, or distinct IndexKey values, for a single row. [#26211]
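
    A minimal sketch of the no-argument alternative, Index.createIndexKey(), is shown below; the index handle and field name are hypothetical.

    import oracle.kv.table.Index;
    import oracle.kv.table.IndexKey;

    // 'index' is assumed to be an Index handle obtained from a Table.
    // Populate the key field-by-field rather than from a whole row.
    IndexKey key = index.createIndexKey();
    key.put("areacode", 415); // hypothetical indexed field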

  10. Fixed a bug where altering a key-only table by adding a non-primary-key column would cause subsequent queries and index creation to fail. This bug exists in R4.4 as well. [#26354]

  11. Fixed the output for SHOW AS JSON USERS and SHOW AS JSON ROLE role to produce results in the correct JSON format. The incorrect output format was causing parsing issues in applications that consume it. [#26355]

  12. When upgrading from a pre-4.2 release to 4.2, 4.3, or 4.4, the presence of multiple outstanding plans in the ERROR state could cause the Admin to fail when starting up. That bug has been fixed in this release. [#26303]

Utility Changes

  1. Introduced two new arguments, -registry-open-timeout and -registry-read-timeout, for the Admin CLI and ping utility. They configure the open and read timeouts, in milliseconds, associated with the sockets used to make registry requests. [#24164]

Storage Engine Changes (JE 7.5)

  1. Fixed a compatibility problem with the Azul Zulu JVM. Previously the following exception would occur when using Zulu: [#26163]

    The database environment could not be opened: java.lang.IllegalStateException:
    Could not access Zing management bean. Make sure -XX:+UseZingMXBeans was
    specified.

  2. Fixed a bug that could cause OutOfMemoryError when performing a network restore. This could cause an RN to unnecessarily restart and retry the network restore. (A network restore is used when an RN has been down for some period of time, and is lagging the master node in its shard.) [#26305]

  3. Fixed a bug that could prevent performing a network restore, after a prior network restore was aborted or incomplete for any reason. (A network restore is used when an RN has been down for some period of time, and is lagging the master node in its shard.)

    For example, this could occur if the RN process is killed during the first network restore, and then another network restore is attempted. The problem could occur only in an environment with a relatively large data set, specifically where at least one billion write transactions had been performed. An example stack trace is below.

    java.lang.NumberFormatException: For input string: "7473413648"
    at java.lang.NumberFormatException.forInputString(
            NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:583)
    at java.lang.Integer.parseInt(Integer.java:615)
    at com.sleepycat.je.rep.InsufficientLogException.init(
            InsufficientLogException.java:218)
    at com.sleepycat.je.rep.impl.RepImpl.handleRestoreRequired(
            RepImpl.java:2296)
    at com.sleepycat.je.recovery.RecoveryManager.findEndOfLog(
            RecoveryManager.java:543)
    at com.sleepycat.je.recovery.RecoveryManager.recover(
            RecoveryManager.java:339)
    at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(
            EnvironmentImpl.java:841)
    at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:222)
    at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:267)
    at com.sleepycat.je.Environment.init(Environment.java:252)
    at com.sleepycat.je.rep.ReplicatedEnvironment.init(
            ReplicatedEnvironment.java:607)
    at com.sleepycat.je.rep.ReplicatedEnvironment.init(
            ReplicatedEnvironment.java:466)
    at oracle.kv.impl.rep.RepEnvHandleManager.openEnv(
            RepEnvHandleManager.java:628)
    at oracle.kv.impl.rep.RepEnvHandleManager.renewRepEnv(
            RepEnvHandleManager.java:465)
    at oracle.kv.impl.rep.RepNode.startup(RepNode.java:913)

    This has been fixed. Without the fix, a workaround for the problem is to remove all of the .jdb files on the RN node and restart the RN. [#26311]