CREATE TABLE Statement

A table is the basic structure that holds user data. You use the CREATE TABLE statement to create a new table in Oracle NoSQL Database.

Syntax

create_table_statement ::= 
   CREATE TABLE [IF NOT EXISTS] table_name [comment] 
   "(" table_definition ")" [ttl_definition] [json_collection_definition]

table_name ::= [namespace_name ":"] name_path
name_path ::= field_name ("." field_name)*
field_name ::= id | DSTRING

table_definition ::= 
   (column_definition | key_definition) 
   ("," (column_definition | key_definition))*
column_definition ::= 
   id type_definition 
   [default_definition | identity_definition | uuid_definition | mr_counter_definition] 
   [comment]
key_definition ::= 
   PRIMARY KEY 
   "(" [shard_key_definition [","]] [id_list_with_size] ")" 
   [ttl_definition]
id_list_with_size ::= id_with_size ("," id_with_size)*
id_with_size ::= id [storage_size]
storage_size ::= "(" INT_CONSTANT ")"
shard_key_definition ::= SHARD "(" id_list_with_size ")"
ttl_definition ::= USING TTL INT_CONSTANT (HOURS | DAYS)
region_definition ::= IN REGIONS region_name ("," region_name)*
json_collection_definition ::= AS JSON COLLECTION

Semantics

table_name

The table name is specified as an optional namespace_name and a local_name. The local name is a name_path because, in the case of child tables, it will consist of a list of dot-separated ids. Child tables are described in the Table Hierarchies section. A table_name that includes a namespace_name is called a qualified table name. When an SQL statement (DDL or DML) references a table by its local name only, the local name is resolved internally to a qualified name with a specific namespace name. See the Namespace Management chapter.
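
For example, assuming a namespace named ns1 has already been created (the namespace, table, and column names here are illustrative), a table can be created with a qualified name:

CREATE TABLE ns1:accounts (id INTEGER, name STRING, PRIMARY KEY (id))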

IF NOT EXISTS

This is an optional clause. If this clause is specified and a table with the same qualified name already exists (or is being created), and that existing table has the same structure as in the statement, no error is generated. In all other cases, if a table with the same qualified name exists, the CREATE TABLE statement generates an error indicating that the table already exists.
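
For example, re-running the following statement is a no-op as long as an accounts table with the same structure already exists (the table and column names are illustrative):

CREATE TABLE IF NOT EXISTS accounts (id INTEGER, name STRING, PRIMARY KEY (id))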

ttl_definition

The Time-To-Live (TTL) value is used in computing the expiration time of a row. Expired rows are not included in query results and are eventually removed from the table automatically by Oracle NoSQL Database. If you specify a TTL value while creating the table, it applies as the default TTL for every row inserted into this table. However, you can override the table level TTL by specifying a TTL value via the table insertion API.

The expiration time of a row is computed by adding the TTL value to the current timestamp. To be more specific, for a TTL value of N hours/days, the expiration time is the current time (in UTC) plus N hours/days, rounded up to the next full hour/day. For example, if the current timestamp is 2020-06-23T10:01:36.096 and the TTL is 4 days, the expiration time will be 2020-06-28T00:00:00.000. You can use zero as a special value to indicate that a row should never expire. If the CREATE TABLE statement has no TTL specification, the default table TTL is zero.
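
For instance, a table whose rows expire 4 days after they are written (rounded up to the next full day) could be declared as follows (the table and column names are illustrative):

CREATE TABLE sessions (sessionId LONG, data JSON, PRIMARY KEY (sessionId)) USING TTL 4 DAYS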

For MR Tables with a TTL value defined, the rows replicated to other regions carry the expiration time that was set when the row was written. This can be either the default table-level TTL value or a row-level override set by your application. Therefore, the row will expire in all the regions at the same time, irrespective of when it was replicated. However, if a row is updated in one of the regions and it expires in the local region even before it is replicated to one of the remote regions, then this row will expire as soon as it is replicated and committed in that remote region.

json_collection_definition

The json_collection_definition declares the table as a collection of documents. A JSON collection table is a convenient way to store, update, and query your documents. You must specify a primary key when creating a JSON collection table. You can create a JSON collection table with MR_COUNTERS if the table is intended to be a multi-region table. For more details on JSON collection tables, see Using JSON Collection Tables.
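
A minimal sketch of a JSON collection table (the table and column names are illustrative):

CREATE TABLE products (sku STRING, PRIMARY KEY (sku)) AS JSON COLLECTION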

region_definition

This is an optional clause. If the table being created is an MR Table, this clause lists all the regions that the table should span. You must specify at least one remote region in this clause to create the table as an MR Table. For information on MR Tables, see Life Cycle of MR Tables.
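
For example, assuming remote regions named FRA and LON have already been defined (the region, table, and column names are illustrative), an MR Table spanning both regions could be created as follows:

CREATE TABLE orders (orderId LONG, details JSON, PRIMARY KEY (orderId)) IN REGIONS FRA, LON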

Note:

Specifying this clause while creating a child table of an MR Table results in an error.

table_definition

The table_definition part of the statement must include at least one column definition and exactly one primary key definition. (Although the syntax allows for multiple key_definitions, the query processor enforces the one-key_definition rule. The syntax is written this way to allow the key definition to appear anywhere among the column definitions.)

column_definition

The syntax for a column definition is similar to the field_definition grammar rule that defines the fields of a record type (see the Data Type Definitions section). It specifies the name of the column, its data type, whether the column is nullable or not, an optional default value or whether the column is an IDENTITY column, and an optional comment. As mentioned in the Table Management section, tables are containers of records, and the table_definition acts as an implicit definition of a record type (the table schema), whose fields are defined by the listed column_definitions. However, when the type_definition grammar rule is used in any DDL statement, the only wildcard type that is allowed is the JSON type. So, for example, it is possible to create a table with a column whose type is JSON, but not a column whose type is ANY.
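
For example, a column can be declared with a default value and a NOT NULL constraint (a sketch; the table and column names are illustrative, and the exact default_definition syntax is covered in the Data Type Definitions section):

CREATE TABLE employees (
   empId INTEGER,
   dept STRING DEFAULT "unassigned" NOT NULL,
   notes JSON,
   PRIMARY KEY (empId)
)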

identity_definition

The identity_definition specifies the name of the identity column. There can only be one identity column per table. See Using the IDENTITY Column section.
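
A sketch of an identity column declaration (the table and column names are illustrative; the full set of identity options is described in the Using the IDENTITY Column section):

CREATE TABLE tickets (
   ticketId LONG GENERATED ALWAYS AS IDENTITY,
   details JSON,
   PRIMARY KEY (ticketId)
)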

uuid_definition

The uuid_definition declares the type of a column to be the UUID type. See Using the UUID data type section.
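
A sketch of a UUID column declaration (the table and column names are illustrative):

CREATE TABLE devices (
   deviceId STRING AS UUID GENERATED BY DEFAULT,
   info JSON,
   PRIMARY KEY (deviceId)
)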

mr_counter_definition

The mr_counter_definition parameter declares the type of a column to be the MR_COUNTER datatype. This data type can be used only in a multi-region table. See Using CRDT datatype in a multi-region table.
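
A sketch of an MR_COUNTER column in a multi-region table (the table, column, and region names are illustrative):

CREATE TABLE pageStats (
   pageId LONG,
   views INTEGER AS MR_COUNTER,
   PRIMARY KEY (pageId)
) IN REGIONS FRA, LON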

key_definition

The syntax for the primary key specification (key_definition) specifies the primary key columns of the table as an ordered list of field names. The column names must be among the ones appearing in the field_definitions, and their associated type must be one of the following: a numeric type, string, enum, or timestamp. The usual definition of a primary key applies: two rows of the same table cannot have the same values on all of their primary key columns.
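
For example, a composite primary key over two columns (the table and column names are illustrative):

CREATE TABLE messages (
   userId LONG,
   msgTime TIMESTAMP(3),
   content JSON,
   PRIMARY KEY (userId, msgTime)
)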

shard_key_definition

A key_definition also specifies the table's shard key columns, as the first N primary-key columns, where 0 < N <= M and M is the number of primary-key columns. Specification of a shard key is optional. By default, for a root table (a table without a parent) the shard key is the whole primary key. Semantically, the shard key is used to distribute table rows across the multiple servers and processes that comprise an Oracle NoSQL Database store. Briefly, two rows having the same shard key, that is, the same values on their shard-key columns, will always be located on the same server and managed by the same process. Further details about the distribution of data in Oracle NoSQL Database can be found in the Primary and Shard Key Design section.
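
For example, the following sketch declares userId as the shard key, so that all rows for a given user reside in the same shard (the table and column names are illustrative):

CREATE TABLE messages (
   userId LONG,
   msgTime TIMESTAMP(3),
   content JSON,
   PRIMARY KEY (SHARD(userId), msgTime)
)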

storage_size

An additional property of INTEGER-typed primary-key fields is their storage size. This is specified as an integer number between 1 and 5 (the syntax allows any integer, but the query processor enforces the restriction). The storage size specifies the maximum number of bytes that may be used to store a value of the associated primary-key column in serialized form. If a value cannot be serialized into the specified number of bytes (or less), an error is thrown. An internal encoding is used to store INTEGER (and LONG) primary-key values, so that such values are sortable as strings (this is because primary key values are always stored as keys of the "primary" Btree index). The following table shows the range of positive values that can be stored for each byte size (the ranges are the same for negative values). You can save storage space by specifying a storage size less than 5, if you know that the key values will be less than or equal to the upper bound of the range associated with the chosen storage size.
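
For example, an INTEGER primary-key column can be limited to 3 storage bytes (a sketch; an inserted id value that does not fit in 3 bytes is rejected):

CREATE TABLE smallIds (
   id INTEGER,
   name STRING,
   PRIMARY KEY (id(3))
)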

comment

Comments are included at table level and become part of the table's metadata as uninterpreted text. Comments are displayed in the output of the describe statement.
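
For example, a table-level comment follows the table name, as shown in this sketch (the table and column names are illustrative):

CREATE TABLE auditLog COMMENT "Audit entries written by client applications" (
   entryId LONG,
   entry JSON,
   PRIMARY KEY (entryId)
)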

Tables used in the examples

This section shows how to create tables from the SQL shell for a few sample applications.

Start your KVStore or KVLite and open the SQL shell.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore

See Running the SQL Shell for the general usage and command options.

Example 5-1 User data application table

The following create table statement defines a users table that holds information about the users:

CREATE TABLE users( id INTEGER,
                     firstName STRING,
                     lastName STRING,
                     otherNames ARRAY(RECORD(first STRING, last STRING)),
                     age INTEGER,
                     income INTEGER,
                     address JSON,
                     connections ARRAY(INTEGER),
                     hobbies ARRAY(STRING),
                     PRIMARY KEY (id) )

The rows of the users table defined above represent information about users. For each such user, the otherNames field is an array of records, where each record in the array includes the first and last name fields. The connections field is an array of integers containing the ids of other users that this user is connected with. You can assume that the ids in the array are sorted by some measure of the strength of the connection. The hobbies field is an array of strings containing the user's interests in various activities. The address column is a schema-less JSON field.

For example, a typical value for the address field can include the following attributes:

{
    "street" : "Pacific Ave",
    "number" : 101,
    "city"   : "Santa Cruz",
    "state"  : "CA",
    "zip"    : 95008,
    "phones" : [
            { "area" : 408, "number" : 4538955, "kind" : "work" },
            { "area" : 831, "number" : 7533341, "kind" : "home" }
    ]
}

You can add any number of attributes to the JSON field. The JSON field does not have a fixed data type.

Some addresses may have additional fields, or missing fields, or fields spelled differently. The phones field may not be an array of JSON objects but a single such object. The whole address may be just one string, a number, or a JSON null.

For more details on the supported data types, see Oracle NoSQL Database Data Model.

Example 5-2 Airline baggage tracking application

The following CREATE TABLE statement defines a BaggageInfo table that holds the checked baggage information of passengers in an airline system:

CREATE TABLE BaggageInfo (ticketNo LONG,
fullName STRING,
gender STRING,
contactPhone STRING,
confNo STRING,
bagInfo JSON,
PRIMARY KEY (ticketNo)
) USING TTL 5 DAYS

You can use the baggage tracking schema in an airline's mobile application. The passengers traveling on a flight can track where their checked-in bags or luggage are, along the route to the destination. The mobile application uses the Oracle NoSQL Database to store all the data related to the baggage. In the backend, the mobile application logic uses SQL queries to retrieve the required data.

This baggage tracking schema creates a BaggageInfo table with columns that include atomic data types and JSON data type to hold passenger information. The BaggageInfo table has the passenger's ticket number as the primary key column. The fullName, gender, contactPhone, and confNo (reservation number) fields store the passenger's information, which is part of a fixed schema. This table also includes a bagInfo column as a schema-less JSON array, which represents the tracking information of a passenger's checked-in baggage. In contrast to the fixed schema fields, you can add any number of attributes to this schema-less JSON field.

See Example 5-5 for a sample row from the BaggageInfo table.

You can create the table by specifying a TTL value, after which the rows expire automatically and are no longer available. The TTL value is followed by either HOURS or DAYS. In this schema, the rows of the table expire after 5 days. You can check the number of days remaining until a row expires using the remaining_days built-in function.
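
A sketch of such a query, assuming the remaining_days function is applied to the row variable of the BaggageInfo table:

SELECT $bag.ticketNo, remaining_days($bag) AS days_left FROM BaggageInfo $bag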

Specifying the TTL value while creating a table is optional. If you don’t provide a TTL value, the rows of a table will have an infinite expiration time.

Example 5-3 Streaming media service

The following CREATE TABLE statement defines a stream_acct table for a TV streaming application:

CREATE TABLE stream_acct(
acct_id INTEGER,
profile_id INTEGER,
profile_name STRING,
acct_data JSON, 
PRIMARY KEY(acct_id)
)

The Streaming Media Service application streams various shows across the globe. Every show has several seasons, and every season has multiple episodes. You need a persistent metadata store that keeps track of the current activity of the subscriber using the TV streaming application.

You can use the TV streaming application to provide useful information to the subscriber such as episodes they watched, watch time per episode, total number of seasons of the show they watched, and so forth. The mobile application uses the Oracle NoSQL Database to store the data and performs SQL queries to retrieve the required data and offer it to the user.

The Streaming Media Service schema creates a stream_acct table with columns that include atomic data types and JSON data type to hold subscriber details.

The stream_acct table has the subscriber's account ID as the primary key column. The other fixed-schema fields, profile_id and profile_name, contain the subscriber's details. The stream_acct table includes an acct_data column as a schema-less JSON field to keep track of the subscriber's current activity. In contrast to the fixed-schema fields, you can add any number of attributes to this schema-less JSON field.

See Example 5-6 for a sample row from the stream_acct table.

Load data to the tables

You can load data into the tables using the INSERT statement from the SQL shell, either by supplying the statement directly at the SQL prompt or by using scripts.

Example 5-4 Insert the data into the User data application (users table)

The following example shows how to insert data into the user data application's users table using the INSERT statement directly at the SQL prompt:

INSERT INTO users VALUES ( 

   10,   
   "John",
   "Smith",
   [ {"first" : "Johny", "last" : "Good"}, {"first" : "Johny2", "last" : "Brave"},{"first" : "Johny3", "last" : "Kind"},{"first" : "Johny4", "last" : "Humble"} ],
   22,
   45000,
   {
      "street" : "Pacific Ave",
      "number" : 101,
      "city" : "Santa Cruz",
      "state" : "CA",
      "zip" : 95008,
      "phones" : [
         { "area" : 408, "number" : 4538955, "kind" : "work" },
         { "area" : 831, "number" : 7533341, "kind" : "home" },
         { "area" : 831, "number" : 7533382, "kind" : "mobile" }
      ]
   },
   [ 30, 55, 43 ],
   [ "Reading", "Soccer", "Hiking", "Gardening"]
)

INSERT INTO users VALUES ( 

   20,   
   "Jane",
   "Smith",
   [ {"first" : "Jane", "last" : "Charming"} ],
   22,
   55000,
   {
      "street" : "Atlantic Ave",
      "number" : 201,
      "city" : "San Jose",
      "state" : "CA",
      "zip" : 95005,
      "phones" : [
         { "area" : 608, "number" : 6538955, "kind" : "work" },
         { "area" : 931, "number" : 9533341, "kind" : "home" },
         { "area" : 931, "number" : 9533382, "kind" : "mobile" }
      ]
   },
   [ 40, 75, 63 ],
   [ "Knitting", "Hiking", "Baking", "BingeWatching"]
)

INSERT INTO users VALUES ( 

   30,   
   "Adam",
   "Smith",
   [ {"first" : "Adam", "last" : "BeGood"} ],
   45,
   75000,
   {
      "street" : "Indian Ave",
      "number" : 301,
      "city" : "Houston",
      "state" : "TX",
      "zip" : 95075,
      "phones" : [
         { "area" : 618, "number" : 6618955, "kind" : "work" },
         { "area" : 951, "number" : 9613341, "kind" : "home" },
         { "area" : 981, "number" : 9613382, "kind" : "mobile" }
      ]
   },
   [ 60, 45, 73 ],
   [ "Soccer", "Riding", "PianoForte", "RockClimbing", "Sketching" ]
)

INSERT INTO users VALUES ( 

   40,   
   "Joanna",
   "Smith",
   [ {"first" : "Joanna", "last" : "Smart"} ],
   NULL,
   75000,
   {
      "street" : "Tex Ave",
      "number" : 401,
      "city" : "Houston",
      "state" : "TX",
      "zip" : 95085,
      "phones" : [
         { "area" : NULL, "number" : 1618955, "kind" : "work" },
         { "area" : 451, "number" : 4613341, "kind" : "home" },
         { "area" : 481, "number" : 4613382, "kind" : "mobile" }
      ]
   },
   [ 70, 30, 40 ],
   ["Soccer", "Riding", "PianoForte", "RockClimbing", "Sketching"]
)

Example 5-5 Import data into airline baggage tracking application (BaggageInfo) table

The following example shows how to create a sample table for the airline baggage tracking application and import data using a script from the SQL prompt:

Download the script baggageschema_loaddata.sql.

Using the load command, run the required script.
load -file baggageschema_loaddata.sql

This creates the BaggageInfo table used in the examples and loads data into the table.

One sample row from the airline baggage tracking application - BaggageInfo table is shown below.

The passenger's ticket number, ticketNo is the primary key of the table. The fullName, gender, contactPhone, and confNo (reservation number) fields store the passenger's information, which is part of a fixed schema. The bagInfo column is a schema-less JSON array, which represents the tracking information of a passenger's checked-in baggage.

Each element of the bagInfo array corresponds to a single checked-in bag. The size of the bagInfo array gives the total number of bags checked in by a passenger. Each bag has an id and a tagNum field. The routing field includes the routing information from the passenger's travel itinerary. The lastActionCode and lastActionDesc fields hold the code and description of the latest action taken on the bag at its current station. The lastSeenStation field includes the airport code of the bag's current destination. The lastSeenTimeGmt field includes the latest action time. The bagArrivalDate field holds the expected arrival date at the destination airport. The bagInfo array further includes a nested flightLegs array with fields to track the source and transit details.

Each element of the flightLegs array corresponds to a travel leg. For each travel leg, the flightNo field holds the flight number, flightDate holds the departure date, fltRouteSrc holds the originating airport code, and fltRouteDest holds the destination airport code. The flightLegs array further includes a nested actions array with fields to track the activities performed on the checked bag at each travel leg.

Each element of the actions array includes the fields actionAt, actionCode, and actionTime to track the tasks at source and destination airports in each travel leg.

"ticketNo" : 1762344493810,
"fullName" : "Adam Phillips",
"gender" : "M",
"contactPhone" : "893-324-1064",
"confNo" : "LE6J4Z",
"bagInfo" : [ {
    "id" : "79039899165297",
    "tagNum" : "17657806255240",
    "routing" : "MIA/LAX/MEL",
    "lastActionCode" : "OFFLOAD",
    "lastActionDesc" : "OFFLOAD",
    "lastSeenStation" : "MEL",
    "flightLegs" : [ {
      "flightNo" : "BM604",
      "flightDate" : "2019-02-01T01:00:00",
      "fltRouteSrc" : "MIA",
      "fltRouteDest" : "LAX",
      "estimatedArrival" : "2019-02-01T03:00:00",
      "actions" : [ {
        "actionAt" : "MIA",
        "actionCode" : "ONLOAD to LAX",
        "actionTime" : "2019-02-01T01:13:00"
      }, {
        "actionAt" : "MIA",
        "actionCode" : "BagTag Scan at MIA",
        "actionTime" : "2019-02-01T00:47:00"
      }, {
        "actionAt" : "MIA",
        "actionCode" : "Checkin at MIA",
        "actionTime" : "2019-02-01T23:38:00"
      } ]
    }, {
      "flightNo" : "BM667",
      "flightDate" : "2019-01-31T22:13:00",
      "fltRouteSrc" : "LAX",
      "fltRouteDest" : "MEL",
      "estimatedArrival" : "2019-02-02T03:15:00",
      "actions" : [ {
        "actionAt" : "MEL",
        "actionCode" : "Offload to Carousel at MEL",
        "actionTime" : "2019-02-02T03:15:00"
      }, {
        "actionAt" : "LAX",
        "actionCode" : "ONLOAD to MEL",
        "actionTime" : "2019-02-01T07:35:00"
      }, {
        "actionAt" : "LAX",
        "actionCode" : "OFFLOAD from LAX",
        "actionTime" : "2019-02-01T07:18:00"
      } ]
    } ],
    "lastSeenTimeGmt" : "2019-02-02T03:13:00",
    "bagArrivalDate" : "2019.02.02T03:13:00"
  } ]

Example 5-6 Import data into TV streaming application (stream_acct) table

The following example shows how to create a sample table for the TV streaming application and import data using a script from the SQL prompt:

Download the script acctstream_loaddata.sql.

Using the load command, run the required script.
load -file acctstream_loaddata.sql

This creates the stream_acct table used in the examples and loads data into the table.

One sample row from the Streaming Media Service - stream_acct table is shown below.

The subscriber's account ID, acct_id, is the primary key of the table. The fields profile_id and profile_name contain the subscriber's details. The acct_data column is a schema-less JSON field, which keeps track of the subscriber's current activity.

Each element of the acct_data JSON represents a user with the given subscriber’s profile name. User data contains the fields firstName, lastName, and country to hold user information. The acct_data JSON field further includes a nested contentStreamed JSON array to track the shows watched by the user.

Each element of the contentStreamed array contains the showName field to store the name of the show. The showId field includes the identifier of the show. The showtype field indicates the type such as tvseries, sitcom, and so forth. The genres array lists the show’s categorization. The numSeasons field contains the total number of seasons streamed for the show. The contentStreamed JSON array also includes a nested seriesInfo JSON array to track the watched episodes.

Each element of the seriesInfo array contains a seasonNum field to identify the season. The numEpisodes field indicates the total number of episodes streamed in the given season. The seriesInfo array further includes an episodes array to track the details of each watched episode.

Each element of the episodes array contains the episodeID field to identify the episode. The episodeName field includes the episode's name. The lengthMin field includes the show’s telecast duration in minutes. The minWatched field includes the duration for which a user has watched the episode. The date field includes the date on which the user watched the given episode.

"acct_id" : 1,
"profile_id" : 101,
"profile_name" : "Adams",
"acct_data" : [ {
   "firstName" : "Adam",
   "lastName" : "Phillips",
   "country" : "USA",
   "contentStreamed":  [
      {
        "showName" : "At the Ranch",
        "showId" : 26,
        "showtype" : "tvseries",
        "genres" : ["action", "crime", "spanish"], 
        "numSeasons" : 4,
        "seriesInfo": [
          {
            "seasonNum" : 1,
            "numEpisodes" : 2,
            "episodes": [
              {
                "episodeID": 20,
                "lengthMin": 85,
                "minWatched": 85,
                "date" : "2022-04-18"
              },
              {
                "episodeID": 30,
                "lengthMin": 60,
                "minWatched": 60,
                "date" : "2022-04-18"
              }
            ]
          },
          {
            "seasonNum": 2,
            "numEpisodes" : 4,
            "episodes": [
              {
                "episodeID": 40,
                "lengthMin": 50,
                "minWatched": 50,
                "date" : "2022-04-25"
              },
              {
                "episodeID": 50,
                "lengthMin": 45,
                "minWatched": 30,
                "date" : "2022-04-27"
              }
            ]
          }
        ]
      },
      {
        "showName": "Bienvenu",
        "showId": 15,
        "showtype": "tvseries",
        "genres" : ["comedy", "french"], 
        "numSeasons" : 2,
        "seriesInfo": [
          {
            "seasonNum" : 1,
            "numEpisodes" : 2,
            "episodes": [
              {
                "episodeID": 20,
                "lengthMin": 45,
                "minWatched": 45,
                "date" : "2022-03-07"
              },
              {
                "episodeID": 30,
                "lengthMin": 42,
                "minWatched": 42,
                "date" : "2022-03-08"
              }
            ]
          }
        ]
      }
    ]
  } ]