Changes in 21.1.12

Topics

New Features

Added support for indexing nested arrays and map, and using such indexes in queries.

Previously, Oracle NoSQL could index only one array or map per table row, meaning that an indexed array or map could not be nested inside another array or map. This release adds support for indexing nested arrays and maps. For example, consider the following table that records information about air travel baggage:

create table baggage (
  ticketNo string,
  passengerName string,
  bagInfo json,
  primary key(ticketNo))

A sample row for this table is shown below. In this row, there is only one piece of baggage for ticketNo "1762352483606", but in general, passengers may have more than one piece of baggage, and as a result, "bagInfo" is an array of documents.

{
  "ticketNo" : "1762352483606",
  "passengerName" : "Willie Hernandez",
  "bagInfo" : [
    {
      "tagNum" : "17657806243915",
      "routing" : "MIA/LAX/MEL",
      "lastActionCode" : "OFFLOAD",
      "lastSeenStation" : "MEL",
      "lastSeenTimeGmt" : "2019.03.13 at 22:10:00 EDT",
      "flightLegs" : [
        {
          "flightNo" : "BM604",
          "flightDate" : "2019.03.12 at 20:00:00 EDT",
          "fltRouteSrc" : "MIA",
          "fltRouteDest" : "LAX",
          "estimatedArrival" : "2019.03.12 at 23:00:00 PDT",
          "actions" : [
            { "actionCode" : "ONLOAD to LAX",
              "actionTime" : "2019.03.12 at 21:14:00 EDT" },
            { "actionCode" : "BagTag Scan at MIA",
              "actionTime" : "2019.03.12 at 20:48:00 EDT"
            },
            { "actionCode" : "Checkin at MIA",
              "actionTime" : "2019.03.12 at 19:50:00 EDT"
            }
          ]
        },
        {
          "flightNo" : "BM667",
          "flightDate" : "2019.03.12 at 23:14:00 PDT",
          "fltRouteSrc" : "LAX",
          "fltRouteDest" : "MEL",
          "estimatedArrival" : "2019.03.13 at 22:16:00 EDT",
          "actions" : [
            { "actionCode" : "Offload to Carousel at MEL",
              "actionTime" : "2019.03.13 at 22:14:00 EDT"
            },
            { "actionCode" : "ONLOAD to MEL",
              "actionTime" : "2019.03.13 at 03:35:00 PDT"
            },
            { "actionCode" : "OFFLOAD from LAX",
              "actionTime" : "2019.03.13 at 03:19:00 PDT"
            }
          ]
        }
      ]
    }
  ]
}

The following index can be created on the baggage table.

create index idx on baggage (
  bagInfo[].lastSeenStation as string,
  bagInfo.flightLegs[].fltRouteSrc as string,
  bagInfo.flightLegs[].flightDate as string)

The index indexes 3 fields of the baggage table: one is the lastSeenStation field of the outer bagInfo array and the other two are the fltRouteSrc and flightDate fields of the nested flightLegs array. The contents of this index for the sample row in baggage are:

  [ "MEL", "MIA", "2019.03.12 at 20:00:00 EDT", "1762352483606" ]
  [ "MEL", "LAX", "2019.03.12 at 23:14:00 PDT", "1762352483606" ]

The following query returns the number of bags that traveled from "MIA" to "MEL" during the whole year 2019. All of the conditions in the WHERE clause are pushed to the above index, as start/stop predicates.

select count(b.info.bagInfo.flightLegs[$element.fltRouteSrc ="MIA"])
    from baggage b
    where exists b.info.bagInfo[$element.lastSeenStation = "MEL"].
        flightLegs[$element.fltRouteSrc = "MIA" and
        $element.flightDate >= "2019.01.01" and
        $element.flightDate <= "2019.12.31"]

[KVSTORE-271]

Added a parameter to control the maximum number of index keys generated per row. The parameter is called rnMaxIndexKeysPerRow and its default value is 10,000. When a row is being indexed (either for the first time or after an update), if the number of index keys extracted from the row exceeds the maximum, an IllegalArgumentException will be thrown.
[KVSTORE-978]
Added the following five new SQL functions to extract row properties that are not stored as table columns. Although the signature of these functions specifies AnyRecord as the type of the input parameter, the functions actually require a row as input. The only expression that returns a row is a row variable, that is, a table alias whose name starts with "$".

integer partition(AnyRecord)

Returns the partition the row belongs to.

integer shard(AnyRecord)

Returns the shard that contains the row currently.

integer row_storage_size(AnyRecord)

Returns the storage size (in bytes) of the most recent version of the row. The returned size includes any overhead (bookkeeping) bytes. It does not include the storage size of any older versions of the row (which are now obsolete and subject to removal by the storage engine cleaner module).

integer index_storage_size(AnyRecord, string)

The second argument to this function is supposed to be the name of an index on the table containing the input row. The functions returns the storage size (in bytes) of the index entry or entries that "point" back to the input row from the given index. The returned size includes any overhead (bookkeeping) bytes associated with these entries.

Notice: for performance reasons it is recommended that queries do not contain multiple calls to the index_storage_size function, for different indexes.

timestamp(3) modification_time(AnyRecord)

Returns the most recent modification time of the row, as a timestamp value of precision 3 (milliseconds). If the row has never been modified since its insertion, it returns the insertion time.

Here is an example query that returns, for each partition, the total number of bytes used to store all the rows of table "foo" contained in that partition:
```
select partition($f), sum(row_storage_size($f))
from foo $f
group by partition($f)
```
[KVSTORE-127]
Added SQL function parse_json(string). It converts a string argument, which is supposed to be JSON text, to an Oracle NoSQL value that represents the given JSON. Here is a usage example:
```
create table foo (id integer, jcol json, primary key(id))

insert into foo values (10, parse_json("{ \"name\" : \"john\", \"age\" : 30 }"))
```
Support LEFT OUTER JOIN SQL syntax.

The NESTED TABLES clause is equivalent to a number of left-outer-join operations "centered" around the target table, this feature supports LEFT OUTER JOIN(LOJ) syntax to be compatible with standard ANSI SQL. The functionality of LOJ is subset of that of NESTED TABLES, it supports to query linear nested tables only. Here is an example:
```
create table A (ida integer, a1 string, primary key(ida));
create table A.B (idb integer, b1 string, primary key(idb));
create table A.B.C (idc integer, c1 integer, primary key(idc));

select * from A a LEFT OUTER JOIN A.B b ON a.ida = b.ida
                  LEFT OUTER JOIN A.B.C c ON b.ida = c.ida and b.idb = c.idb
```
[KVSTORE-265]
Support MRCounter for integer, long and number value types in multi-region tables. It's a data structure that can be replicated across regions, where replicas are updated independently and can always converge to a correct common state.

The syntax to create a MRCounter column is:
```
AS MR_COUNTER
```
Such columns can only be updated using + or - operators in UPDATE DML.
[KVSTORE-803]

Bug and Performance Fixes

Changed the Streams API such that by default it would reconnect infinitely when the connection to any shard in source store is dropped, till the connection is back or the user shuts down the stream. Today the Streams API would reconnect up to a limit before shutting down the stream. The previous behavior favors consistency while the new behavior favors availability over consistency. The Streams API user can override the default behavior by the public API NoSQLSubscriptionConfig.Builder.setMaxReconnect().
[KVSTORE-726]
Changed the Streams API that only durable write operations will be streamed. By durable write it means only the writes satisfying the durability setting. For example, if a write operation comes with durability setting COMMIT_SYNC, it will be stream only after it has been acknowledged by majority of replicas. Previously, such write may be streamed before it is acknowledged by majority of replicas, and when master migrates, the streamed operation may be missing from the new master.
[KVSTORE-686]
Fixed a bug that when the user drops a region with pending DDLs to create, alter or drop a multi-region tables from that remote region, the region agent serving that remote region may not shut down correctly. Consequently if the user creates multi-region tables again immediately after the region is dropped, the XRegion service may not be able to work properly to serve the newly created multi-region tables.
[KVSTORE-901]
Fixed a bug that when all subscribed tables are unsubscribed from a stream to create an empty stream, and later the user subscribes tables to the empty stream, the Stream client may throw NullPointerException.
[KVSTORE-904]
Improved javadoc for asynchronous execution methods. We added the "Thread Model for Asynchronous Execution" section in the KVStore interface to describe our thread model as well as the requirement for user-supplied actions triggered after the asynchronous execution.
[KVSTORE-938]
GC tuning for more predictable latencies. The following changes are made to the RN's GC parameters:
- -XX:G1MixedGCCountTarget to 12 from 32.
- -XX:ConcGCThreads is set to the same values as -XX:ParallelGCThreads.
These changes should make full GC pauses less frequent. There may be a throughput degradation of around 5%, depending on the workload.
[KVSTORE-888]
Fixed a bug where an httpproxy instance running on a 19.3-based client library communicating with a 20.2-based server might fail to execute queries.
[KVSTORE-833]
Query processor will now always raise an error if the search-geometry argument to the geo search functions is not a valid geometry. Until now, an error would be raised only if it could be detected at compile time that the argument was invalid. If not, then during runtime the geo search functions would return false in this case.
[KVSTORE-805]
Fixed a bug that would cause an IllegalArgumentException to be raised due to imprecise computation of the return type of conditional array constructor expression. For example, IAE would be raised for this query: select arr[] from Foo where "arr" is a column of table Foo with type ARRAY(ARRAY(integer)). This is because the arr[] expression in the SELECT is wrapped into a conditional array constructor whose type would be ARRAY(JSON) and it is not allowed to insert a value of type ARRAY(integer) into an array of type ARRAY(JSON). Now, the type of the conditional array constructor is set to ARRAY(ANY).
[KVSTORE-968]
Because support for the oracle.kv.Consistency.NONE_REQUIRED_NO_MASTER consistency policy has been deprecated, the Oracle NoSQL Hadoop/Hive/BigDataSQL integration no longer supports that policy. Applications that employ those integrations should use the oracle.kv.Consistency.NONE_REQUIRED consistency policy instead.
[KVSTORE-837]
The socket read timeout, which can be changed by calling KVStoreConfig.setSocketReadTimeout, no longer needs to be greater than the request timeout when using the async network protocol, which is enabled by default.
[KVSTORE-776]
Fixed a bug in GROUP BY query based on full table scan that returns unexpected results with specified size limit. The bug exists only on cloud environment.
[NOSQL-338]
Fixed a bug that would cause an exception if a bind variable were used in an ADD clause of an UPDATE statement. This is when the bind variable is the new element to add to the target array and there is no position expression. For example:
```
declare $userid integer;
update teams t
add t.info.teams[1].userids $userid
where id = 1
returning *
```
[KVSTORE-963]
Increased the sizes of the default values of several debug and GC log parameters so that more information is retained, to help when debugging problems. Also, to save space, disabled JE debug logging by default, since the entries in those files are present in the associated service debug log files.

Note that default parameter values are only used for new stores and new services. If you are upgrading an existing store, we recommend changing these parameters for existing services.

Parameters that were changed:

rnGCLogFileSize

Changed the default file size limit for GC log files on RNs from 1048576 bytes (1 MB) to 5242880 bytes (5 MB)

adminLogFileLimit

Changed the default file size limit for debug log files on admins from 4000000 bytes to 5242880 bytes (5 MB)

GCLogFileSize

Changed the default file size limit for GC log files on admins from 1048576 bytes (1 MB) to 5242880 bytes (5 MB)

serviceLogFileLimit

Changed the default file size limit for other debug log files, including the store wide log, stat file, perf file, and RN debug log files, from 2000000 bytes to 5242880 bytes (5 MB)

loggingConfigProps

Set the default value of the logging configuration for RNs and admins to com.sleepycat.je.util.FileHandler.level=OFF, which disables logging to JE debug log files

[KVSTORE-878]
Fixed a bug in master rebalancing to avoid futile retries when the capacity of the SN is not a multiple of the RF.
[KVSTORE-869]
Fixed a bug that the final "Z" was missing from the timestamp string in ISO8601 format that indicates UTC.
[KVSTORE-801]
If a prepared proxy-based query was executed after its table was drooped, an NPE would be raised. Now, a QueryException is raised.
[KVSTORE-920]
Fixed a bug that would cause an exception if a query compared a timestamp field with a bind variable and the timestamp field is indexed.

Also allow a timestamp index to be used by a query that compares a timestamp field with a timestamp value when the precisions of the timestamp field and the value are different.
[KVSTORE-950]
Fixed a rare problem where write requests could be sent to a RepNode that was in the UNKNOWN state and was not the master. In this case, the request would fail, but would have succeeded if it had been sent to the actual master. The client now gets the correct node status information and sends the request to the master.
[KVSTORE-681]
Modified the rnRHAsyncMaxConcurrentRequests RepNode parameter to require that RNs be restarted when these parameters are changed. The facilities needed to support changing this value without requiring a restart had a negative impact on performance.
[KVSTORE-941]
Both the JSON and text output of the ping command contain a new boolean attribute isMasterBalanced. If the value is false, that means there is an excess of RepNodes in the Master state and one or more of the nodes in the Master state will transition to a Replica at a suitable time in the future.
[KVSTORE-799]
The ping command now exits with exit code 103 when the store is operational but some RepNodes are in the UNKNOWN state.
[KVSTORE-756]
The output of the ping command now includes information about the storage type for RepNodes and Admins. The possible storage types reported are:

HD

Hard Disk

SSD

Solid-State Drive, flash-based

NVME

NVM Express, non-volatile memory

HD (default for UNKNOWN)

the storage type is assumed to be Hard Disk because it was not possible to determine the actual storage type.

[KVSTORE-802]

Fixed a bug where, in rare cases, operations may return with the following error when a client using 20.3.17 release is connecting to a store with 20.3.17 release or after:

java.lang.IllegalStateException: Expected state BEFORE_READ, was DONE
    at oracle.kv.impl.async.AbstractResultHandler.checkCallOnResult(AbstractResultHandler.java:71)
    at oracle.kv.impl.async.AbstractResultHandler.checkCallOnResult(AbstractResultHandler.java:49)
    at oracle.kv.impl.async.AbstractDelegatingResultHandler.onResult(AbstractDelegatingResultHandler.java:49)
    at oracle.kv.impl.async.AsyncVersionedRemoteDialogInitiator.onCanRead(AsyncVersionedRemoteDialogInitiator.java:157)
    at oracle.kv.impl.async.dialog.DialogContextImpl$OnCanReadTask.doWork(DialogContextImpl.java:1028)
    at oracle.kv.impl.async.dialog.DialogContextImpl$OnCanReadTask.run(DialogContextImpl.java:1017)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at oracle.kv.impl.async.dialog.nio.NioChannelExecutor.runTasks(NioChannelExecutor.java:1493

[KVSTORE-880]

Fixed a bug where operations may return with the following error when there is a network-related problem:

java.lang.IllegalStateException: Unexpected null read
    at oracle.kv.impl.async.AsyncVersionedRemoteDialogInitiator.onCanRead(AsyncVersionedRemoteDialogInitiator.java:131)
    at oracle.kv.impl.async.dialog.DialogContextImpl$OnCanReadTask.doWork(DialogContextImpl.java:1078)
    at oracle.kv.impl.async.dialog.DialogContextImpl$OnCanReadTask.run(DialogContextImpl.java:1049)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at oracle.kv.impl.async.dialog.nio.NioChannelExecutor.runTasks(NioChannelExecutor.java:1655)

[KVSTORE-899]

Fixed a scalability problem for the storage nodes that when the storage nodes have a limited CPU resource and a large number of client connections, OutOfMemoryError may occur.
[KVSTORE-390]
In previous versions, the syntax of the CREATE INDEX and CREATE FULLTEXT INDEX statements allowed square brackets ([<cond>?]) when indexing the values of map fields. With this release, square brackets ([<cond>?]) are no longer allowed, and only .values(<cond>?) can be used. This new requirement exposed a bug in the integration with Elasticsearch fulltext indexing. This bug occurs when .values(<cond>?) rather than brackets is used in the CREATE FULLTEXT INDEX statement to specify that the values of a given map should be indexed. Without the fix delivered in this release, the mechanism that generates the document to index in Elasticsearch will employ the wrong mapping; which causes fulltext search queries on the values of the map to fail. This bug is now fixed; thus, when .values(<cond>?) is used in the CREATE FULLTEXT INDEX statement, such queries will now succeed.
[KVSTORE-959]