Changes in 18.3.9

The following changes were made in Oracle NoSQL Database Release 18.3.9 Enterprise Edition.

New Features

  1. Streams are now supported during elasticity operations such as redistribution, store expansion, and store contraction. The events applied to a particular key are streamed in the same order that they were applied at the store, regardless of the shard to which each event was applied. Prior to this feature, an AdminFaultException would occur if an elasticity operation was attempted on a store with active subscribers. [#26662]

  2. Added namespace component to table names.

    Queries use a namespace-qualified table name of the form ns:TableName. The DDL adds two new statements: CREATE NAMESPACE and DROP NAMESPACE.
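
    For example, the following statements create a namespace, create a table within it, and query the table using a qualified name (the names ns1 and Users are illustrative):

    CREATE NAMESPACE ns1

    CREATE TABLE ns1:Users (id INTEGER, name STRING, PRIMARY KEY(id))

    SELECT * FROM ns1:Users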

    API changes: TableAPI.listNamespaces() returns all known namespaces, and ExecuteOptions.setNamespace() and ExecuteOptions.getNamespace() set and get the namespace applied to unqualified table names used in queries.

    There are two new system privileges:

    • CREATE_ANY_NAMESPACE

    • DROP_ANY_NAMESPACE

    There are also new privileges with namespace scope:

    • CREATE_TABLE_IN_NAMESPACE

    • DROP_TABLE_IN_NAMESPACE

    • EVOLVE_TABLE_IN_NAMESPACE

    • CREATE_INDEX_IN_NAMESPACE

    • DROP_INDEX_IN_NAMESPACE

    • MODIFY_IN_NAMESPACE

    • READ_IN_NAMESPACE

    • INSERT_IN_NAMESPACE

    • DELETE_IN_NAMESPACE

    These privileges can be granted or revoked using:

    GRANT namespace_privilege ON NAMESPACE ns0 TO role_name
    REVOKE namespace_privilege ON NAMESPACE ns0 FROM role_name
    [#26036]
  3. Added support for querying GeoJSON data.

    Note:

    This feature is available only in the Enterprise Edition of Oracle NoSQL Database.

    The GeoJSON specification (https://tools.ietf.org/html/rfc7946) defines the structure and content of JSON objects that represent geographical shapes on earth (called geometries). Oracle NoSQL Database implements a number of functions that interpret such JSON objects as geometries and allow searching for rows containing geometries that satisfy certain conditions. Searches are made efficient by the use of special indexes on the GeoJSON data.

    As an example, consider a table whose rows store points of interest. The table has an id column as its primary key and a poi column of type JSON. Values of the poi column may look like these two JSON documents:

    {
      "kind" : "city hall",
      "address" : { "state" : "CA",
                    "city" : "Campbell",
                    "street" : "70 North 1st street"
                  },
      "location" : { "type" : "point", "coordinates" : [-121.94, 37.29] }
    } 
    
    {
      "kind" : "nature park",
      "name" : "castle rock state park",
      "address" : { "state" : "CA",
                    "city" : "Los Gatos",
                    "street" : "15000 Skyline Blvd"
                  },
      "location" : { "type" : "polygon",
                     "coordinates" : [
                                       [
                                         [-122.1301, 37.2330],
                                         [-122.1136, 37.2256],
                                         [-122.0920, 37.2291],
                                         [-122.1020, 37.2347],
                                         [-122.1217, 37.2380],
                                         [-122.1301, 37.2330]
                                       ]
                                     ]
                   }
    }

    Both of these documents have a "location" field whose value is a GeoJSON object. In the first document, the GeoJSON represents a single point defined by its coordinates: a longitude, latitude pair. In the second document, the GeoJSON represents a polygon defined by the coordinates of its vertices.

    The following query looks for nature parks in northern California (including parks that straddle the border with neighboring states). The query uses the geo_intersect function with a polygon representing northern California as its second argument.

    select t.poi as park
    from PointsOfInterest t
    where t.poi.kind = "nature park" and
          geo_intersect(t.poi.location,
                        { "type" : "polygon",
                          "coordinates" : [
                                            [
                                              [-121.94, 36.28],
                                              [-117.52, 37.38],
                                              [-119.99, 39.00],
                                              [-120.00, 41.97],
                                              [-124.21, 41.97],
                                              [-124.39, 40.42],
                                              [-121.94, 36.28]
                                            ]
                                          ]
                        })

    The following query looks for gas stations within a mile (1609 meters) of a given route, returning, for each qualifying gas station, its actual distance from the route. The query uses the geo_near() function with a LineString representing the route as its second argument. The geo_near function also adds an implicit order-by on distance, so the query returns the qualifying gas stations in ascending order of their distance from the route.

    select t.poi as gas_station,
           geo_distance(t.poi.location,
                        { "type" : "LineString",
                          "coordinates" : [
                                           [-121.9447, 37.2975],
                                           [-121.9500, 37.3171],
                                           [-121.9892, 37.3182],
                                           [-122.1554, 37.3882],
                                           [-122.2899, 37.4589],
                                           [-122.4273, 37.6032],
                                           [-122.4304, 37.6267],
                                           [-122.3975, 37.6144]
                                         ]
                        }) as distance
    from PointsOfInterest t
    where t.poi.kind = "gas station" and
          geo_near(t.poi.location,
                   { "type" : "LineString",
                     "coordinates" : [
                                       [-121.9447, 37.2975],
                                       [-121.9500, 37.3171],
                                       [-121.9892, 37.3182],
                                       [-122.1554, 37.3882],
                                       [-122.2899, 37.4589],
                                       [-122.4273, 37.6032],
                                       [-122.4304, 37.6267],
                                       [-122.3975, 37.6144]
                                      ]
                    },
                    1609)

    Both of the above queries can be executed efficiently by creating the following index:

    create index idx_kind_loc on PointsOfInterest(poi.kind as string,
                                                  poi.location as geometry)
    [#27078]
  4. Added sequence aggregation functions.

    The following functions were added: seq_count, seq_sum, seq_avg, seq_min, and seq_max. Unlike the corresponding SQL aggregate functions (count, sum, etc.), the sequence aggregate functions do not imply grouping of rows and do not aggregate values from different rows. Instead, they simply aggregate the items in their input sequence. In doing so, they use the same rules as their corresponding SQL aggregate functions. For example, seq_sum skips any non-numeric items in its input sequence.

    As an example, consider a Users table with an expenses column that is a map containing each user's expenses for various categories. The following query selects, for each user, their id, the sum of their expenses in all categories except housing, and the maximum of these expenses.

    select id,
           seq_sum(u.expenses.values($key != "housing")) as sum,
           seq_max(u.expenses.values($key != "housing")) as max
    from Users u
    [#27082]
  5. Implemented SQL INSERT/UPSERT statement.

    This statement is used to construct a new row and insert it into a specified table. If the INSERT keyword is used, the row will be inserted only if it does not already exist. If the UPSERT keyword is used, the row will be inserted if it does not already exist; otherwise, the new row will replace the existing one.

    As an example, consider the following CREATE TABLE statement:

    create table Users (
      id INTEGER,
      firstName STRING,
      lastName STRING,
      otherNames ARRAY(RECORD(first STRING, last STRING)),
      age INTEGER,
      income INTEGER,
      address JSON,
      expenses MAP(INTEGER),
      PRIMARY KEY (id)
    )
    

    Then the following INSERT statement constructs and inserts a new row into the Users table, if a row with id 10 does not already exist.

    insert into users values (
        10,
        "John",
        "Smith",
        [ {"first" : "Johny", "last" : "BeGood"} ],
        22,
        45000,
        { "street" : "Main", "number" : 10, "city" : "Reno", "state" : "NV"},
        { "travel" : 5000, "books" : 2000 }
    ) set ttl 3 days

    If the "upsert" keyword is used in place of "insert" in the above statement, the new row will be inserted if it does not exist already, otherwise it will replace the current version of the row. [#27006]

  6. It is now possible to cast an integer or long to a timestamp in SQL. The integer/long is interpreted as the number of milliseconds since the epoch. [#27006]
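
    For example, the following query (the table and column names are illustrative) converts a long column holding milliseconds since the epoch into a timestamp:

    select id, cast(t.lastLoginMillis as timestamp(3)) as lastLogin
    from Users t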

  7. Added parse_json SQL function to convert a string to a JSON instance. [#27006]
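
    For example, the following query (the table and column names are illustrative) converts a string column containing JSON text into a JSON value:

    select id, parse_json(t.profileText) as profile
    from Users t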

  8. Added support for IDENTITY columns in tables.

    Users can create a table with an IDENTITY column whose values are generated by an internal sequence generator (SG) attached to the column. An IDENTITY column must have one of the numeric datatypes INTEGER, LONG, or NUMBER, and the system automatically generates a unique number for it. Only one IDENTITY column is allowed per table. Users can specify the following attributes for the sequence generator: START WITH (default = 1), INCREMENT BY (default = 1), MINVALUE (default = the minimum value of the datatype), MAXVALUE (default = the maximum value of the datatype), CACHE (default = 1000), and CYCLE or NO CYCLE (default = NO CYCLE).

    The IDENTITY column can be defined as either GENERATED ALWAYS or GENERATED BY DEFAULT. For the GENERATED ALWAYS option, the system will always generate a value for the IDENTITY column and will return an error if the user specifies a value for it. For the GENERATED BY DEFAULT option, the system will only generate a value for the IDENTITY column if the user did not supply a value.

    Users can also alter the IDENTITY column property and the SG attributes using ALTER TABLE. New syntax is introduced for defining and modifying the IDENTITY column in the CREATE TABLE and ALTER TABLE statements, respectively. For example:

    CREATE Table t1 (id INTEGER GENERATED ALWAYS AS IDENTITY 
      (START WITH 1 INCREMENT BY 2 MAXVALUE 100),
    name STRING, PRIMARY KEY (id))

    This creates a table t1 for which the system will generate values 1, 3, 5, ..., up to 99. If you want to add CACHE and CYCLE attributes to the SG, you can issue the following ALTER TABLE statement:

    ALTER Table t1 (MODIFY id GENERATED ALWAYS AS IDENTITY
      (CACHE 3 CYCLE))

    To drop the IDENTITY property of the column id, issue the following statement:

    ALTER Table t1 (MODIFY id DROP IDENTITY)
    

    The system generates IDENTITY column values during the SQL INSERT, UPSERT, and UPDATE statements, and during the TableAPI put, putIfPresent, and putIfAbsent operations. Refer to the Oracle NoSQL Database documentation for the semantics of these operations when an IDENTITY column is involved.
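
    For example, with the GENERATED ALWAYS column defined above, an INSERT statement supplies no value for id and the system generates one (a minimal sketch):

    INSERT INTO t1 (name) VALUES ("John")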

    [#25154]
  9. Added a new user-specifiable policy parameter for JVM overhead percentage.

    Added a new policy parameter named jvmOverheadPercent, which can be set using the command change-policy -params [name=value]. If not specified, the default value is 25.
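
    For example, to set the policy value to 30 percent (the value is illustrative):

    change-policy -params jvmOverheadPercent=30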

    [#27089]
  10. Added a new Migrator utility. The Migrator utility allows users to import MongoDB JSON entries in strict mode representation (exported using the mongoexport utility), or normal JSON entries, into an Oracle NoSQL Database store. This is a preview version. A general availability version that is integrated with IMPORT/EXPORT will be available in a future release.

  11. Added support for administrative commands through a REST API. The Admin REST API allows users to run admin commands through HTTP or HTTPS requests to an Oracle NoSQL Database store. The request and response payloads are in JSON format, so users can run admin commands against the store with a utility such as curl.

    [#26596]
  12. There is a new configuration option in KVStoreConfig that permits specifying a local address on a client machine when connecting to a KVStore. This provides an extra level of network traffic control when running on client machines with multiple NICs. Please review the javadoc for oracle.kv.KVStoreConfig.setLocalAddress(InetSocketAddress) for details.
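
    A minimal sketch (the store name, helper host, and local address are illustrative):

    import java.net.InetSocketAddress;

    import oracle.kv.KVStore;
    import oracle.kv.KVStoreConfig;
    import oracle.kv.KVStoreFactory;

    KVStoreConfig config = new KVStoreConfig("mystore", "kvhost1:5000");
    // Bind outgoing client connections to a specific local NIC;
    // port 0 lets the system choose an ephemeral local port.
    config.setLocalAddress(new InetSocketAddress("192.0.2.10", 0));
    KVStore store = KVStoreFactory.getStore(config);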

    [#26879]

Bug and Performance Fixes

  1. Fixed a number of casting bugs:

    1. Casting a string to a number produced incorrect results.

    2. Any kind of value should be castable to a string, but that was not the case.

    3. When casting a map to a record, casting of JSON null was not supported.

    4. Trying to cast an empty string to a timestamp would raise StringIndexOutOfBoundsException. Now it raises IllegalArgumentException with an appropriate message.

    [#27006]
  2. Fixed a bug in the UPDATE statement: the new version of the row was not set when the statement had a RETURNING clause.

    [#27006]
  3. Made the server return the security check failure response to the Streams API client.

    [#26091]
  4. Fixed a problem where ALTER TABLE would incorrectly remove a table's Time To Live (TTL) setting.

    [#26983]
  5. The following changes to the key statistics parameters were made:

    • The default value for rnStatisticsSleepWaitDuration has been changed from 1 second to 60 seconds

    • rnStatisticsGatherInterval must have a value greater than or equal to 60 seconds. If rnStatisticsGatherInterval is set to a value less than 60 seconds then 60 seconds is used for its value.

    • rnStatisticsLeaseDuration must have a value less than or equal to one day (24 hours). If rnStatisticsLeaseDuration is set to a value greater than 1 day the value of 1 day is used.

    • rnStatisticsSleepWaitDuration must have a value greater than or equal to 10 seconds and less than or equal to rnStatisticsGatherInterval. If rnStatisticsSleepWaitDuration is set to less than 10 seconds the value of 10 seconds is used. If rnStatisticsSleepWaitDuration is set to greater than rnStatisticsGatherInterval the value of rnStatisticsGatherInterval is used.

    The parameter values are checked on each RN when the RN starts or when the parameter values are changed. If any of the key statistics parameters are overridden, one or more warning messages will appear in the RN log.

    [#26939]
  6. Corrected the spelling of the versionCheckInterval admin service parameter introduced in the 18.1 release. Any values specified for the incorrectly spelled verionCheckInterval parameter will be ignored starting with this release and will need to be specified again using the correct spelling.

    [#27000]
  7. When upgrading to this release, some elasticity operations, such as redistribute, rebalance and contraction, will require that all nodes be upgraded before they can start or resume.

    The elasticity operations that are affected are those that require data to be migrated from one shard to another.

    If one of these operations is running when the upgrade starts, or is started during the upgrade, the operation will pause until the upgrade is completed. Once the upgrade is complete, the elasticity operation will resume.

    [#26943]
  8. The DurabilityException class has a new method, getNoSideEffects, which returns whether it is known that the operation that produced the DurabilityException had no side effects. Applications that receive a DurabilityException when performing a modify operation and find that getNoSideEffects() returns true can safely assume that none of the changes requested by the operation were performed, and can then retry the operation. If it returns false, the operation may or may not have had side effects.
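
    A sketch of the intended retry pattern (the helper method and the key/value arguments are illustrative):

    import oracle.kv.DurabilityException;
    import oracle.kv.KVStore;
    import oracle.kv.Key;
    import oracle.kv.Value;

    static void putWithRetry(KVStore store, Key key, Value value) {
        try {
            store.put(key, value);
        } catch (DurabilityException de) {
            if (de.getNoSideEffects()) {
                // None of the requested changes were performed,
                // so the operation can safely be retried.
                store.put(key, value);
            } else {
                // The operation may or may not have had side effects;
                // the application must decide how to proceed.
                throw de;
            }
        }
    }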

    [#27073]
  9. The memory allocation calculation now takes the JVM overhead into account. The rnHeapPercent parameter now represents the percentage of SN memory reserved for the requested Java heap size, and a new user-specifiable parameter, jvmOverheadPercent, represents the additional memory used by the JVM as a percentage of the requested Java heap size. The default value for jvmOverheadPercent is 25.

    The default SN memory allocation is:

    • 85% for Java heap and overhead

      • 68% for requested Java heap size (rnHeapPercent)

      • 25% (jvmOverheadPercent) * 68% (rnHeapPercent) = 17% for JVM overhead

    • 10% for the operating system (systemPercent)

    • 5% (the remainder) for the off-heap cache

    [#27089]
  10. Fixed a bug where a query executed using a NullValue external variable would not raise UnsupportedOperationException.

    [#27177]
  11. There have been several bug fixes in elasticity operations (topology redistribute, rebalance, and contraction) when indexes are present in the store. Included with these fixes is the new ability to rebuild an index if the index becomes corrupted.

    [#26856] [#27024] [#27189]
  12. Fixed an issue with exception handling during a query operation that incorrectly caused an RN to restart.

    [#27183]
  13. Fixed a bug that could cause an exception such as the following, making the RN unavailable. The bug was present in versions of 18.1 prior to 18.1.20. It can occur under certain conditions when 336 or more tables have been created over the lifetime of the store.

    2018-08-29 14:16:12.965 UTC SEVERE [rg1-rn1] Process exiting
    oracle.kv.impl.rep.EnvironmentFailureRetryException: com.sleepycat.je.EnvironmentFailureException: (JE +18.1.xx)
    Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException:
    Environment invalid because of previous exception: (JE 18.1.xx)
    rg1-rn1(1):/DATA00/rg1-rn1/env fetchLN of 0x6030/0x95727f6 parent IN=130 IN class=com.sleepycat.je.tree.BIN
    lastFullLsn=0x6039/0x1e1138 lastLoggedLsn=0x603a/0x60ac3e parent.getDirty()=false state=0
    expires=2018-09-02.00 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid.
    Environment is invalid and must be closed.
            at oracle.kv.impl.api.RequestHandlerImpl.executeInternal(RequestHandlerImpl.java:934)
            at oracle.kv.impl.api.RequestHandlerImpl.executeRequest(RequestHandlerImpl.java:682)
            at oracle.kv.impl.api.RequestHandlerImpl.trackExecuteRequest(RequestHandlerImpl.java:649)
            at oracle.kv.impl.api.RequestHandlerImpl.access$100(RequestHandlerImpl.java:153)
            at oracle.kv.impl.api.RequestHandlerImpl$2$1.run(RequestHandlerImpl.java:523)
            at oracle.kv.impl.api.RequestHandlerImpl$2$1.run(RequestHandlerImpl.java:520)
            at oracle.kv.impl.security.ExecutionContext.runWithContext(ExecutionContext.java:192)
    ...
    [#27199]

Utility Changes

  1. Added the namespace [namespace] command to the Admin CLI and SQL shell to set or clear the namespace used for queries and table operations.
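
    For example, to set the current namespace (the name ns1 is illustrative):

    namespace ns1

    Running the command with no argument clears the current namespace.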

  2. Added the -namespace <namespaces> flag to the export and import utilities to export or import all the tables within the specified namespaces.

  3. Added storagedir and availableLogSize information to the verify configuration and ping command output. If an RN hits the out-of-disk limit exception, violations are raised in the verify configuration output. If availableLogSize drops below 5 GB, warning notes are issued in the verify configuration output.

    [#25458]
  4. Ping and verify will now display the read-only status of replication nodes.

    [#26469][#26711]