Running the Oracle NoSQL Database Analytics Integrator

Follow these steps to run the Oracle NoSQL Database Analytics Integrator.

Create a configuration file for the integrator

Before you can run the Oracle NoSQL Database Analytics Integrator, you must create a configuration file, which you pass to the utility when you invoke it. The entries in the file must be in JSON format, as shown in the two sample configuration files below. Not all of the parameters used in the samples are required. The tables that follow the examples explain every parameter used and indicate whether it is optional or required.

Example 1: You execute the utility from an Oracle Cloud Compute Instance and you wish to authenticate using an Instance Principal.
{
    "nosqlstore": {
        "type" : "nosqldb_cloud",
        "endpoint" : "us-ashburn-1",
        "useInstancePrincipal" : true,
        "compartment" : <ocid.of.compartment.containing.nosql.tables>,
        "table" : <tableName1,tableName2,tableName3>,
        "readUnitsPercent" : "90,90,90",
        "requestTimeoutMs" : "5000"
    },
    "objectstore" : {
        "type" : "object_storage_oci",
        "endpoint" : "us-ashburn-1",
        "useInstancePrincipal" : true,
        "compartment" : <ocid.of.compartment.containing.bucket>,
        "bucket" : <bucket-name-objectstorage>,
        "compression" : "snappy"
    },
    "database": {
        "type" : "database_cloud",
        "endpoint" : "us-ashburn-1",
        "credentials" : "/home/opc/.oci/config",
        "credentialsProfile" : <profile-for-adw-auth>,
        "databaseName" : <database-name>,
        "databaseUser" : "ADMIN",
        "databaseWallet" : <path-where-wallet-unzipped>
    }
}
Example 2: You prefer to authenticate using your own user credentials, or you are executing from outside of the Oracle Cloud and thus Instance Principal authentication is not available.
{
    "nosqlstore": {
        "type" : "nosqldb_cloud",
        "endpoint" : "us-ashburn-1",
        "credentials" : "/home/opc/.oci/config",
        "credentialsProfile" : <nosqldb-user-credentials>,
        "table" : <tableName1,tableName2,tableName3>,
        "readUnitsPercent" : "90,90,90",
        "requestTimeoutMs" : "5000"
    },
    "objectstore" : {
        "type" : "object_storage_oci",
        "endpoint" : "us-ashburn-1",
        "credentials" : "/home/opc/.oci/config",
        "credentialsProfile" : <objectstorage-user-credentials>,
        "bucket" : <bucket-name-objectstorage>,
        "compression" : "snappy"
    },
    "database": {
        "type" : "database_cloud",
        "endpoint" : "us-ashburn-1",
        "credentials" : "/home/opc/.oci/config",
        "credentialsProfile" : <adw-user-credentials>,
        "databaseName" : <database-name>,
        "databaseUser" : "ADMIN",
        "databaseWallet" : <path-where-wallet-unzipped>
    },
    "abortOnError" : false
}

The configuration is divided into three sections (nosqlstore, objectstore, and database) whose entries specify how the utility interacts with each respective cloud service: the NoSQL Database Cloud Service, Oracle Object Storage, and Oracle Autonomous Data Warehouse.

Some parameters are common to all three sections.

Table - Common Parameters for all sections

Parameter name Details of the parameter
type Currently, this parameter can take one of the three values: nosqldb_cloud (for the nosqlstore section), object_storage_oci (for the objectstore section), and database_cloud (for the database section).
endpoint The region in which the associated resource is located, specified as either the region identifier or the region's API endpoint for the service. For example, if each resource is located in the US East (Ashburn) region, then the endpoint entry in each section can be set to either the region identifier ("us-ashburn-1") or the region's API endpoint for the desired service.

Table - Parameters in the configuration file

Parameter name Specified section Details of the parameter
useInstancePrincipal

nosqlstore(Optional)

objectstore(Optional)

The useInstancePrincipal entry can be specified as the boolean value true if the following conditions are satisfied:
  • The utility will be executed from an Oracle Cloud Compute Instance.
  • The section being configured is not the database section.
  • The compute instance is authorized, as an Instance Principal, to perform actions on the resource referenced in the section being configured.
  • The credentials entry is not specified.
If true is specified for the useInstancePrincipal entry and the credentials entry is also specified, then the credentials entry takes precedence, and the user credentials referenced in that entry’s value will be used to interact with the associated resource.

Note:

User credentials must be specified in the database section because the Autonomous Database hosted in ADW requires them.
compartment

nosqlstore(Optional)

objectstore(Optional)

  • If true is specified for the useInstancePrincipal entry, then the OCID of the compartment containing that resource must also be specified.
  • If either false is specified for the useInstancePrincipal entry or the credentials entry is specified, then the compartment entry is optional in this section. In that case, the compartment must be specified in the file referenced by the credentials entry.
credentials

nosqlstore(Optional)

objectstore(Optional)

database(Required)

The credentials entry is required in the database section under all circumstances. It is required in the nosqlstore and objectstore sections in one or more of the following circumstances:
  • Either the utility will be executed from outside the Oracle Cloud, or it will be executed from an Oracle Cloud Compute Instance that is not an Instance Principal
  • The useInstancePrincipal entry is not specified or is set to false.

The value specified for this entry must reference a file on the local file system that specifies user credentials that can be used to securely interact with the associated resource.

credentialsProfile

nosqlstore(Optional)

objectstore(Optional)

database(Optional)

The credentialsProfile entry is optional in each section, and even if specified, applies only when a corresponding credentials entry is also specified.

table nosqlstore(Required)

The table entry is required and must be specified in the nosqlstore section. The value of this entry is a string consisting of a comma-separated list of names, where each name is the name of a table in the NoSQL Database Cloud Service whose contents should be retrieved and copied to the Autonomous Data Warehouse.

readUnitsPercent nosqlstore(Optional)

The readUnitsPercent entry is optional and is applicable only in the nosqlstore section. The value of this entry is a string consisting of a comma-separated list of integers between 1 and 100, each representing the percentage of read units that can be consumed when retrieving data from the corresponding table.

This entry allows you to specify a different read unit percentage for each of the tables referenced in the table entry: the first percentage in the list corresponds to the first table in the list of tables, the second percentage corresponds to the second table, and so on. The number of percentages in this list does not have to equal the number of tables in the list of tables. A default value of 90 percent is assigned to any table in the list of tables that does not have a corresponding percentage in this list.

For example, suppose four table names are specified in the table entry, but the readUnitsPercent entry is set to the value "50,80". In this case, data from the first table is retrieved using 50 percent of the available read units, and data from the second table is retrieved using 80 percent. For each of the remaining two tables, 90 percent of the read units (the default) is used.

requestTimeoutMs nosqlstore(Optional)

The requestTimeoutMs entry is optional and is applicable only in the nosqlstore section. The value of this entry is a string consisting of a comma-separated list of positive integers, where each integer represents the number of milliseconds allowed for each data retrieval request to complete for the corresponding table.

This entry allows you to specify different timeout values for each of the tables referenced in the table entry. If this entry is not specified, or if this entry specifies a timeout for only a subset of the tables, then the default value of 5000 will be assigned to the remaining tables.
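
The defaulting rule described for readUnitsPercent and requestTimeoutMs can be illustrated with a short Python sketch. This is not the utility's own code: the helper name per_table_values and the table names t1 through t4 are hypothetical, and ignoring surplus values is an assumption, since the behavior in that case is not documented here.

```python
def per_table_values(tables, values, default):
    """Pair each table name with its value, padding with the default.

    tables and values are comma-separated strings, as in the configuration
    file; values may be None when the entry is not specified.
    """
    names = [t.strip() for t in tables.split(",") if t.strip()]
    given = [int(v) for v in values.split(",") if v.strip()] if values else []
    # Assumption: values beyond the number of tables are simply ignored.
    given = given[:len(names)]
    # Tables without a corresponding value fall back to the default.
    padded = given + [default] * (len(names) - len(given))
    return dict(zip(names, padded))


# Four tables, two percentages: the last two tables get the default of 90.
read_units = per_table_values("t1,t2,t3,t4", "50,80", default=90)
# Entry not specified: every table gets the default timeout of 5000 ms.
timeouts = per_table_values("t1,t2,t3,t4", None, default=5000)
print(read_units)
print(timeouts)
```

With these inputs, the first two tables receive 50 and 80 percent, and every other value comes from the documented defaults.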

bucket objectstore(Required) The bucket entry is required and must be specified in the objectstore section. The value of this entry is a string representing the name of the OCI Object Storage bucket, into which the utility copies the data retrieved from the NoSQL tables.
compression objectstore(Optional)
The compression entry is optional and is applicable only in the objectstore section. The value of this entry is a string specifying how the data retrieved from the tables listed in the nosqlstore section is compressed when it is copied to Object Storage. The value must be one of the following:
  • snappy – for snappy compression
  • gzip – for gzip compression
  • none – do not compress the table data copied to Object Storage

Note:

If the compression entry is not specified, then snappy compression will be performed.
databaseName database(Required) The databaseName entry is required and must be specified in the database section. This entry is a string whose value is the name of the database created in the Oracle Autonomous Data Warehouse Cloud Service.
databaseUser database(Optional)

The databaseUser entry is optional and, if used, must be specified in the database section. This entry is a string whose value is the name of the user account in the Autonomous Database specified in the databaseName entry. If this entry is not specified, then you are prompted on the command line to provide the value.

databaseWallet database(Required) The databaseWallet entry is required and must be specified in the database section. This entry is a string whose value is the filesystem path to the directory containing the contents of the Oracle Wallet downloaded from the Autonomous Database user account specified in the databaseUser entry in the configuration file.
abortOnError Optional A boolean that specifies whether the utility aborts when it encounters an error. This entry is not located within any section. The default value is true.
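
As a rough illustration of the required and optional rules in the table above, the following Python sketch checks a parsed configuration before the utility is invoked. It is not part of the utility: the function name validate_config and the sample values my-bucket and mydb are hypothetical, and only the entries that the table marks as required, plus the credentials and compartment rules, are checked.

```python
import json


def validate_config(config):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    # Entries that the parameter table marks as Required, per section.
    required = {
        "nosqlstore": ["table"],
        "objectstore": ["bucket"],
        "database": ["credentials", "databaseName", "databaseWallet"],
    }
    for section, entries in required.items():
        body = config.get(section)
        if body is None:
            problems.append(f"missing section: {section}")
            continue
        for entry in entries:
            if entry not in body:
                problems.append(f"{section}.{entry} is required")
    # nosqlstore and objectstore need one authentication method.
    for section in ("nosqlstore", "objectstore"):
        body = config.get(section) or {}
        has_credentials = "credentials" in body
        uses_principal = body.get("useInstancePrincipal") is True
        if not has_credentials and not uses_principal:
            problems.append(f"{section} needs credentials or useInstancePrincipal")
        # With Instance Principal authentication, the compartment OCID is required.
        if uses_principal and not has_credentials and "compartment" not in body:
            problems.append(f"{section}.compartment is required with useInstancePrincipal")
    return problems


sample = json.loads("""
{
    "nosqlstore": {"useInstancePrincipal": true, "table": "tableName1"},
    "objectstore": {"credentials": "/home/opc/.oci/config", "bucket": "my-bucket"},
    "database": {"credentials": "/home/opc/.oci/config", "databaseName": "mydb"}
}
""")
problems = validate_config(sample)
for problem in problems:
    print(problem)
```

For this sample, the sketch reports that database.databaseWallet is missing and that nosqlstore needs a compartment entry because it relies on Instance Principal authentication.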

Note:

Each entry in the configuration file can be overridden on the command line by setting a system property whose name has the form section.entry; for example, -Dnosqlstore.table=tableName1,tableName3. If an entry is not located within a section, then the property name is simply the name of the entry itself; for example, -DabortOnError=false. This feature can be useful when testing, or when writing scripts that run the utility at regular intervals.
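
For scripts that run the utility at regular intervals, the override naming rule can be captured in a small helper. The following Python sketch only builds the -D arguments in the form described above. The function name override_flags and the use of None to mark an entry outside any section are illustrative choices, not part of the utility.

```python
def override_flags(overrides):
    """Turn {(section, entry): value} into -D system property arguments."""
    flags = []
    for (section, entry), value in overrides.items():
        # Entries outside any section (such as abortOnError) use the bare name.
        name = f"{section}.{entry}" if section else entry
        flags.append(f"-D{name}={value}")
    return flags


flags = override_flags({
    ("nosqlstore", "table"): "tableName1,tableName3",
    (None, "abortOnError"): "false",
})
print(" ".join(flags))
```

The resulting strings are meant to be placed on the java command line before the -jar option, as with any other system property.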

Specifying config information in the credentials file:

Oracle Cloud Infrastructure requires basic configuration information, such as user credentials and the tenancy OCID, which can be specified in a config file. By default, this config file is located in the ~/.oci directory. You can specify multiple sets of user credentials in this config file.

A sample credentials file is shown below.
[DEFAULT]
user=<ocid.of.default.user>
fingerprint=<fingerprint.of.default.user>
key_file=<path.to.default.user.oci.api.private.key.file.pem>
tenancy=<ocid.of.default.user.tenancy>
region=us-ashburn-1
compartment=<ocid.of.default.compartment>

[nosqldb-user-credentials]
user=<ocid.of.nosqldb.user>
fingerprint=<fingerprint.of.nosqldb.user>
key_file=<path.to.nosqldb.user.oci.api.private.key.file.pem>
tenancy=<ocid.of.nosqldb.user.tenancy>
region=us-ashburn-1
compartment=<ocid.of.nosqldb.compartment>

[objectstorage-user-credentials]
user=<ocid.of.objectstorage.user>
fingerprint=<fingerprint.of.objectstorage.user>
key_file=<path.to.objectstorage.user.oci.api.private.key.file.pem>
tenancy=<ocid.of.objectstorage.user.tenancy>
region=us-ashburn-1
compartment=<ocid.of.objectstorage.compartment>

[adw-user-credentials]
user=<ocid.of.adw.user>
fingerprint=<fingerprint.of.adw.user>
key_file=<path.to.adw.user.oci.api.private.key.file.pem>
tenancy=<ocid.of.adw.user.tenancy>
region=us-ashburn-1
compartment=<ocid.of.adw.compartment>
dbmsOcid=<ocid.of.autonomous.database.in.adw>
dbmsCredentialName=<OCI$RESOURCE_PRINCIPAL or NOSQLADWDB_OBJ_STORE_CREDENTIAL>

Note:

The sample credentials file above defines three separate profiles (nosqldb-user-credentials, objectstorage-user-credentials, and adw-user-credentials) in addition to DEFAULT. Separate profiles are not mandatory; a config file can contain only the DEFAULT profile. However, using separate profiles is better practice than combining all parameters in the DEFAULT profile.

Table - Parameters in credentials file

Parameter Name Details of the parameter
user The OCID of the user
fingerprint A short sequence of bytes used to identify a longer public key for the user
key_file The path to the file that contains the private key for the user
tenancy The OCID of the tenancy
region The endpoint of the region, for example us-ashburn-1
compartment The name or OCID of the compartment of the user
dbmsOcid OCID of the Autonomous Database
dbmsCredentialName

The name of the credential that the ADW database uses to authenticate with Object Storage. This is either OCI$RESOURCE_PRINCIPAL (if you choose to employ Resource Principal authentication) or the name of the AUTH_TOKEN credential that is created when the DBMS_CLOUD.CREATE_CREDENTIAL procedure is executed by either the user or the system administrator (for example, NOSQLADWDB_OBJ_STORE_CREDENTIAL).
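
To check which values a given profile provides, the credentials file can be read with Python's standard configparser module, as in the sketch below. The helper name load_profile and the shortened sample text are illustrative. Note that configparser lets every profile inherit values from [DEFAULT]; that is Python's behavior, and whether the utility resolves missing values in the same way is an assumption.

```python
import configparser

# Shortened version of the sample credentials file, with placeholder values.
SAMPLE = """\
[DEFAULT]
user=<ocid.of.default.user>
region=us-ashburn-1

[adw-user-credentials]
user=<ocid.of.adw.user>
dbmsOcid=<ocid.of.autonomous.database.in.adw>
dbmsCredentialName=OCI$RESOURCE_PRINCIPAL
"""


def load_profile(text, profile):
    """Return the key/value pairs visible in one profile of the file."""
    # Disable interpolation so that characters such as % are taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve key case (dbmsOcid, dbmsCredentialName) instead of lowercasing.
    parser.optionxform = str
    parser.read_string(text)
    if profile != "DEFAULT" and not parser.has_section(profile):
        raise KeyError(f"profile not found: {profile}")
    return dict(parser[profile])


profile = load_profile(SAMPLE, "adw-user-credentials")
for key, value in profile.items():
    print(f"{key} = {value}")
```

In this sketch the adw-user-credentials profile reports its own user value and picks up region from [DEFAULT].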

Running the tool

After you have completed all the requirements for using the necessary Oracle Cloud services (NoSQL Database, Object Storage, and Autonomous Data Warehouse) and have created a valid configuration file, you can run the Oracle NoSQL Database Analytics Integrator from the command line:
  • Navigate to the directory nosqlanalytics under the installation directory (/home/opc/nosqlanalytics-<version>).
    cd /home/opc/nosqlanalytics-1.0.1/nosqlanalytics
  • Invoke the utility using the following command. This example assumes that the configuration file oci-nosqlanalytics-config.json is located in the .oci directory under your home directory.
    java -Djava.util.logging.config.file=./src/main/resources/logging/java-util-logging.properties \
    -Dlog4j.configurationFile=file:./src/main/resources/logging/log4j2-analytics.properties \
    -jar ./lib/nosqlanalytics-1.0.1.jar \
    -config ~/.oci/oci-nosqlanalytics-config.json

Note:

The system properties that configure the loggers used during execution are optional. If those system properties are not specified, then the utility will produce no logging output.
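
When the utility is launched from a script, the command above can be assembled as an argument list. The following Python sketch mirrors the documented command. The function name build_command and its parameters are hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex


def build_command(config_path, version="1.0.1", with_logging=True):
    """Assemble the java invocation as an argument list."""
    argv = ["java"]
    if with_logging:
        # Optional logger configuration; without these system properties
        # the utility produces no logging output.
        argv += [
            "-Djava.util.logging.config.file="
            "./src/main/resources/logging/java-util-logging.properties",
            "-Dlog4j.configurationFile="
            "file:./src/main/resources/logging/log4j2-analytics.properties",
        ]
    argv += ["-jar", f"./lib/nosqlanalytics-{version}.jar", "-config", config_path]
    return argv


argv = build_command("/home/opc/.oci/oci-nosqlanalytics-config.json")
print(shlex.join(argv))
```

The relative paths assume that the command is run from the nosqlanalytics directory. A script could pass the list to subprocess.run with cwd set to that directory.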

Logging

The Oracle NoSQL Database Analytics Integrator executes software from multiple third-party libraries, where each library defines its own set of loggers with different namespaces. For convenience, the Oracle NoSQL Database Analytics Integrator provides two logging configuration files as part of the release: one to configure logging mechanisms based on java.util.logging, and one for loggers based on Log4j2.

Note:

By default, the logger configuration files provided with the utility are designed to produce minimal output as the utility executes. To see verbose output from the various components employed by the utility, increase the logging levels of the specific loggers whose behavior you wish to analyze.