CREATE TABLE Statement

The table is the basic structure to hold user data. You use the create table statement to create a new table in the Oracle NoSQL Database.

Syntax

create_table_statement ::= 
   CREATE TABLE [IF NOT EXISTS] table_name [comment] 
   "(" table_definition ")" [ttl_definition] [json_collection_definition]

table_name ::= [namespace_name ":"] name_path
name_path ::= field_name ("." field_name)*
field_name ::= id | DSTRING

table_definition ::= 
   (column_definition | key_definition) 
   ("," (column_definition | key_definition))*
column_definition ::= 
   id type_definition 
   [default_definition | identity_definition | 
uuid_definition | mr_counter_definition] 
   [comment]
key_definition ::= 
   PRIMARY KEY 
   "(" [shard_key_definition [","]] [id_list_with_size] ")" 
   [ttl_definition]
id_list_with_size ::= id_with_size ("," id_with_size)*
id_with_size ::= id [storage_size]
storage_size ::= "(" INT_CONSTANT ")"
shard_key_definition ::= SHARD "(" id_list_with_size ")"
ttl_definition ::= USING TTL INT_CONSTANT (HOURS | DAYS)
region_definition ::= IN REGIONS region-name-1,region-name-2 [,...]
json_collection_definition ::= AS JSON COLLECTION

Semantics

table_name

The table name is specified as an optional namespace_name and a local_name. The local name is a name_path because, in the case of child tables, it will consist of a list of dot-separated ids. Child tables are described in the Table Hierarchies section. A table_name that includes a namespace_name is called a qualified table name. When an SQL statement (DDL or DML) references a table by its local name only, the local name is resolved internally to a qualified name with a specific namespace name. See the Namespace Management chapter.

IF NOT EXISTS

This is an optional clause. If this clause is specified and if a table with the same qualified name exists (or is being created) and if that existing table has the same structure as in the statement, no error is generated. In all other cases and if a table with the same qualified name exists, the create table statement generates an error indicating that the table exists.

ttl_definition

The Time-To-Live (TTL) value is used in computing the expiration time of a row. Expired rows are not included in query results and are eventually removed from the table automatically by Oracle NoSQL Database. If you specify a TTL value while creating the table, it applies as the default TTL for every row inserted into this table. However, you can override the table level TTL by specifying a TTL value via the table insertion API.

The expiration time of a row is computed by adding the TTL value to the current timestamp. To be more specific, for a TTL value of N hours/days, the expiration time is the current time (in UTC) plus N hours/days, rounded up to the next full hour/day. For example, if the current timestamp is 2020-06-23T10:01:36.096 and the TTL is 4 days, the expiration time will be 2020-06-28T00:00:00.000. You can use zero as a special value to indicate that a rows should never expire. If the CREATE TABLE statement has no TTL specification, the default table TTL is zero.

In case of MR Tables with TTL value defined, the rows replicated to other regions carry the expiration time when the row was written. This can be either the default table level TTL value or a row level override that is set by your application. Therefore, this row will expire in all the regions at the same time, irrespective of when they were replicated. However, if a row is updated in one of the regions and it expires in the local region even before it is replicated to one of the remote region(s), then this row will expire as soon as it is replicated and committed in that remote region.

json_collection_definition

The json_collection_definition declares the table as a collection of documents. A JSON collection table is a convenient way to store, update, and query your documents. A JSON collection table must include a primary key while creating the table. You can create a JSON collection table with MR_COUNTERS if the table is intended to be a multi-region table. For more details on JSON collection table, see Using JSON Collection Tables.

region_definition

This is an optional clause. In case, the table being created is an MR Table, this parameter lists all the regions that the table should span. You must mention at least one remote region in this clause to create the table as an MR Table. For information on MR Tables, see Life Cycle of MR Tables.

Note:

Specifying this clause while creating a child table of a MR Table will result in an error.

table_definition

The table_definition part of the statement must include at least one field definition, and exactly one primary key definition (Although the syntax allows for multiple key_definitions, the query processor enforces the one key_definition rule. The syntax is this way to allow for the key definition to appear anywhere among the field definitions).

column_definition

The syntax for a column definition is similar to the field_definition grammar rule that defines the fields of a record type. See Data Type Definitions section. It specifies the name of the column, its data type, whether the column is nullable or not, an optional default value or whether the column is an IDENTITY column or not, and an optional comment. As mentioned in Table Management section, tables are containers of records, and the table_definitions acts as an implicit definition of a record type (the table schema), whose fields are defined by the listed column_definitions. However, when the type_definition grammar rule is used in any DDL statement, the only wildcard type that is allowed is the JSON type. So, for example, it is possible to create a table with a column whose type is JSON, but not a column whose type is ANY.

identity_definition

The identity_definition specifies the name of the identity column. There can only be one identity column per table. See Using the IDENTITY Column section.

uuid_definition

The uuid_definition declares the type of a column to be the UUID type. See Using the UUID data type section.

mr_counter_definition

The mr_counter_definition parameter declares the type of a column to be the MR_COUNTER datatype. This data type can be used only in a multi-region table. See Using CRDT datatype in a multi-region table.

key_definition

The syntax for the primary key specification (key_definition) specifies the primary key columns of the table as an ordered list of field names. The column names must be among the ones appearing in the field_definitions, and their associated type must be one of the following: a numeric type, string, enum, or timestamp. The usual definition of a primary key applies: two rows of the same table cannot have the same values on all of their primary key columns.

shard_key_definition

A key_definition specifies the table’s shard key columns as well, as the first N primary-key columns, where 0 < N <= M and M is the number of primary-key columns. Specification of a shard key is optional. By default, for a root table (a table without a parent) the shard key is the whole primary key. Semantically, the shard key is used to distribute table rows across the multiple servers and processes that comprise an Oracle NoSQL Database store. Briefly, two rows having the same shard key, i.e., the same values on their shardkey columns, will always be located in the same server and managed by the same process. Further details about the distribution of data in Oracle NoSQL Database can be found in the Primary and Shard Key Design section.

storage_size

An additional property of INTEGER-typed primary-key fields is their storage size. This is specified as an integer number between 1 and 5 (the syntax allows any integer, but the query processor enforces the restriction). The storage size specifies the maximum number of bytes that may be used to store in serialized form a value of the associated primary key column. If a value cannot be serialized into the specified number of bytes (or less), an error will be thrown. An internal encoding is used to store INTEGER (and LONG) primary-key values, so that such values are sortable as strings (this is because primary key values are always stored as keys of the "primary" Btree index). The following table shows the range of positive values that can be stored for each byte-size (the ranges are the same for negative values). Users can save storage space by specifying a storage size less than 5, if they know that the key values will be less or equal to the upper bound of the range associated with the chosen storage size.

comment

Comments are included at table-level and they become part of the table's metadata as uninterpreted text. Comments are displayed in the output of the describe statement.

Tables Used in the Examples

Example 5-1 User data application table

The following create table statement defines a users table that holds information about the users.

CREATE TABLE users( id INTEGER,
                     firstName STRING,
                     lastName STRING,
                     otherNames ARRAY(RECORD(first STRING, last STRING)),
                     age INTEGER,
                     income INTEGER,
                     address JSON,
                     connections ARRAY(INTEGER),
                     hobbies ARRAY(STRING),
PRIMARY KEY (id) )

The rows of the users table defined above represent information about users. For each such user, the otherNames field is an array of records, where each record in the array includes the first and last name fields. The connections field is an array of integers containing the ids of other users that this user is connected with. You can assume that the ids in the array are sorted by some measure of the strength of the connection. The hobbies field is an array of string containing the user's interests in various activities. The address column is a schema-less JSON field.

For example, a typical value for the address field can include the following attributes:

{
    "street" : "Pacific Ave",
    "number" : 101,
    "city"   : "Santa Cruz",
    "state"  : "CA",
    "zip"    : 95008,
    "phones" : [
            { "area" : 408, "number" : 4538955, "kind" : "work" },
            { "area" : 831, "number" : 7533341, "kind" : "home" }
    ]
}

You can add any number of attributes to the JSON field. The JSON field does not have a fixed data type.

Some addresses may have additional fields, or missing fields, or fields spelled differently. The phones field may not be an array of JSON objects but a single such object. The whole address may be just one string, number, or JNULL.

For more details on the supported data types, see Oracle NoSQL Database Data Model.

Example 5-2 Airline baggage tracking application

The following CREATE TABLE statement defines a BaggageInfo table that holds the checked baggage information of passengers in an airline system.

CREATE TABLE BaggageInfo (ticketNo LONG,
fullName STRING,
gender STRING,
contactPhone STRING,
confNo STRING,
bagInfo JSON,
PRIMARY KEY (ticketNo)
)USING TTL 5 DAYS

Using this schema you can handle a use case wherein passengers traveling on a flight can track where their checked-in bags or luggage are along the route to the final destination. This functionality can be offered as a part of the airline's mobile application. Once the passenger logs into the mobile application, the ticket number or reservation code of the current flight is displayed on the screen. Passengers can use this information to track their checked baggage. The mobile application uses Oracle NoSQL Database to store all the data related to the baggage. In the backend, the mobile application logic uses SQL queries to retrieve the required data.

This schema creates the BaggageInfo table with columns that include atomic data types like long and string to hold passenger information. The checked baggage information of the passengers is stored as a schema-less JSON. You can add any number of attributes to this schema-less JSON field. In contrast, the passenger's information like ticket number, full name, gender, and contact details are all part of a fixed schema.

You can create the table by specifying a TTL value, after which the rows expire automatically and are not available anymore. The TTL value is followed by either HOURS or DAYS. In this schema, the rows of the table expire after 5 days. You can check the hours remaining until a row expires using the remaining_days Built-in Functions.

Specifying the TTL value while creating a table is optional. If you don’t provide a TTL value, the rows of a table will have an infinite expiration time.

Example 5-3 Streaming media service

The following CREATE TABLE statement defines a stream_acct table that creates a TV streaming application.

CREATE TABLE stream_acct(
acct_id INTEGER,
profile_id INTEGER,
profile_name STRING,
acct_data JSON, 
PRIMARY KEY(acct_id)
)

The Streaming Media Service streams various shows that are watched by customers across the globe. Every show has several seasons and every season has multiple episodes. You need a persistent meta-data store that keeps track of the current activity of the customers using the TV streaming application. Using this schema you can provide useful information to the customer such as episodes they watched, the watch time per episode, the total number of seasons of the show they watched, and so on. The data is stored in the Oracle NoSQL Database and the application performs SQL queries to retrieve the required data and offer it to the user.

This schema has acct_id as the primary key column, which contains the user's account ID. The schema also includes other fields such as profile_id, profile_name, and a schema-less JSON column (acct_data). The schema-less JSON does not have a fixed data type. You can add any number of attributes to this JSON field.

Load Data to the Tables

Use the following statements to insert the data into the User data - users table.

INSERT INTO users VALUES ( 

   10,   
   "John",
   "Smith",
   [ {"first" : "Johny", "last" : "Good"}, {"first" : "Johny2", "last" : "Brave"},{"first" : "Johny3", "last" : "Kind"},{"first" : "Johny4", "last" : "Humble"} ],
   22,
   45000,
   {
      "street" : "Pacific Ave",
      "number" : 101,
      "city" : "Santa Cruz",
      "state" : "CA",
      "zip" : 95008,
      "phones" : [
         { "area" : 408, "number" : 4538955, "kind" : "work" },
         { "area" : 831, "number" : 7533341, "kind" : "home" },
         { "area" : 831, "number" : 7533382, "kind" : "mobile" }
      ]
   },
   [ 30, 55, 43 ],
   [ "Reading", "Soccer", "Hiking", "Gardening"]
)

INSERT INTO users VALUES ( 

   20,   
   "Jane",
   "Smith",
   [ {"first" : "Jane", "last" : "Charming"} ],
   22,
   55000,
   {
      "street" : "Atlantic Ave",
      "number" : 201,
      "city" : "San Jose",
      "state" : "CA",
      "zip" : 95005,
      "phones" : [
         { "area" : 608, "number" : 6538955, "kind" : "work" },
         { "area" : 931, "number" : 9533341, "kind" : "home" },
         { "area" : 931, "number" : 9533382, "kind" : "mobile" }
      ]
   },
   [ 40, 75, 63 ],
   [ "Knitting", "Hiking", "Baking", "BingeWatching"]
)

INSERT INTO users VALUES ( 

   30,   
   "Adam",
   "Smith",
   [ {"first" : "Adam", "last" : "BeGood"} ],
   45,
   75000,
   {
      "street" : "Indian Ave",
      "number" : 301,
      "city" : "Houston",
      "state" : "TX",
      "zip" : 95075,
      "phones" : [
         { "area" : 618, "number" : 6618955, "kind" : "work" },
         { "area" : 951, "number" : 9613341, "kind" : "home" },
         { "area" : 981, "number" : 9613382, "kind" : "mobile" }
      ]
   },
   [ 60, 45, 73 ],
   [ "Soccer", "Riding", "PianoForte", "RockClimbing", "Sketching" ]
)

INSERT INTO users VALUES ( 

   40,   
   "Joanna",
   "Smith",
   [ {"first" : "Joanna", "last" : "Smart"} ],
   NULL,
   75000,
   {
      "street" : "Tex Ave",
      "number" : 401,
      "city" : "Houston",
      "state" : "TX",
      "zip" : 95085,
      "phones" : [
         { "area" : NULL, "number" : 1618955, "kind" : "work" },
         { "area" : 451, "number" : 4613341, "kind" : "home" },
         { "area" : 481, "number" : 4613382, "kind" : "mobile" }
      ]
   },
   [ 70, 30, 40 ],
   ["Soccer", "Riding", "PianoForte", "RockClimbing", "Sketching"]
)

To create sample tables for the airline baggage tracking application and TV streaming application, download the scripts baggageschema_loaddata.sql and acctstream_loaddata.sql.

Start your KVSTORE or KVLite and open the SQL.shell.

java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore

Using the load command, run the required scripts.

load -file baggageschema_loaddata.sql
load -file acctstream_loaddata.sql

This creates the tables used in the examples and loads the data into the tables.

One sample row from the airline baggage tracking application - BaggageInfo table is shown below.

"ticketNo" : 1762344493810,
"fullName" : "Adam Phillips",
"gender" : "M",
"contactPhone" : "893-324-1064",
"confNo" : "LE6J4Z",
 [ {
    "id" : "79039899165297",
    "tagNum" : "17657806255240",
    "routing" : "MIA/LAX/MEL",
    "lastActionCode" : "OFFLOAD",
    "lastActionDesc" : "OFFLOAD",
    "lastSeenStation" : "MEL",
    "flightLegs" : [ {
      "flightNo" : "BM604",
      "flightDate" : "2019-02-01T01:00:00",
      "fltRouteSrc" : "MIA",
      "fltRouteDest" : "LAX",
      "estimatedArrival" : "2019-02-01T03:00:00",
      "actions" : [ {
        "actionAt" : "MIA",
        "actionCode" : "ONLOAD to LAX",
        "actionTime" : "2019-02-01T01:13:00"
      }, {
        "actionAt" : "MIA",
        "actionCode" : "BagTag Scan at MIA",
        "actionTime" : "2019-02-01T00:47:00"
      }, {
        "actionAt" : "MIA",
        "actionCode" : "Checkin at MIA",
        "actionTime" : "2019-02-01T23:38:00"
      } ]
    }, {
      "flightNo" : "BM667",
      "flightDate" : "2019-01-31T22:13:00",
      "fltRouteSrc" : "LAX",
      "fltRouteDest" : "MEL",
      "estimatedArrival" : "2019-02-02T03:15:00",
      "actions" : [ {
        "actionAt" : "MEL",
        "actionCode" : "Offload to Carousel at MEL",
        "actionTime" : "2019-02-02T03:15:00"
      }, {
        "actionAt" : "LAX",
        "actionCode" : "ONLOAD to MEL",
        "actionTime" : "2019-02-01T07:35:00"
      }, {
        "actionAt" : "LAX",
        "actionCode" : "OFFLOAD from LAX",
        "actionTime" : "2019-02-01T07:18:00"
      } ]
    } ],
    "lastSeenTimeGmt" : "2019-02-02T03:13:00",
    "bagArrivalDate" : "2019.02.02T03:13:00"
  } ]

One sample row from the Streaming Media Service - stream_acct table is shown below.

"acct_id" : 1,
"profile_id" : 101,
"profile_name" : "Adams",
[ {
   "firstName" : "Adam",
   "lastName" : "Phillips",
   "country" : "USA",
   "contentStreamed":  [
      {
        "showName" : "At the Ranch",
        "showId" : 26,
        "showtype" : "tvseries",
        "genres" : ["action", "crime", "spanish"], 
        "numSeasons" : 4,
        "seriesInfo": [
          {
            "seasonNum" : 1,
            "numEpisodes" : 2,
            "episodes": [
              {
                "episodeID": 20,
                "lengthMin": 85,
                "minWatched": 85,
                "date" : "2022-04-18"
              },
              {
                "episodeID": 30,
                "lengthMin": 60,
                "minWatched": 60,
                "date" : "2022-04-18"
              }
            ]
          },
          {
            "seasonNum": 2,
            "numEpisodes" : 4,
            "episodes": [
              {
                "episodeID": 40,
                "lengthMin": 50,
                "minWatched": 50,
                "date" : "2022-04-25"
              },
              {
                "episodeID": 50,
                "lengthMin": 45,
                "minWatched": 30,
                "date" : "2022-04-27"
              }
            ]
          }
        ]
      },
      {
        "showName": "Bienvenu",
        "showId": 15,
        "showtype": "tvseries",
        "genres" : ["comedy", "french"], 
        "numSeasons" : 2,
        "seriesInfo": [
          {
            "seasonNum" : 1,
            "numEpisodes" : 2,
            "episodes": [
              {
                "episodeID": 20,
                "lengthMin": 45,
                "minWatched": 45,
                "date" : "2022-03-07"
              },
              {
                "episodeID": 30,
                "lengthMin": 42,
                "minWatched": 42,
                "date" : "2022-03-08"
              }
            ]
          }
        ]
      }
    ]
  } ]