CREATE TABLE Statement

The table is the basic structure to hold user data. You use the create table statement to create a new table in the Oracle NoSQL Database.

Syntax

create_table_statement ::= 
   CREATE TABLE [IF NOT EXISTS] table_name [comment] 
   "(" table_definition ")" [ttl_definition]

table_name ::= [namespace_name ":"] name_path
name_path ::= field_name ("." field_name)*
field_name ::= id | DSTRING

table_definition ::= 
   (column_definition | key_definition) 
   ("," (column_definition | key_definition))*
column_definition ::= 
   id type_definition 
   [default_definition | identity_definition | 
uuid_definition | mr_counter_definition] 
   [comment]
key_definition ::= 
   PRIMARY KEY 
   "(" [shard_key_definition [","]] [id_list_with_size] ")" 
   [ttl_definition]
id_list_with_size ::= id_with_size ("," id_with_size)*
id_with_size ::= id [storage_size]
storage_size ::= "(" INT_CONSTANT ")"
shard_key_definition ::= SHARD "(" id_list_with_size ")"
ttl_definition ::= USING TTL INT_CONSTANT (HOURS | DAYS)
region_definition ::= IN REGIONS region-name-1,region-name-2 [,...]

Semantics

table_name

The table name is specified as an optional namespace_name and a local_name. The local name is a name_path because, in the case of child tables, it will consist of a list of dot-separated ids. Child tables are described in the Table Hierarchies section. A table_name that includes a namespace_name is called a qualified table name. When an SQL statement (DDL or DML) references a table by its local name only, the local name is resolved internally to a qualified name with a specific namespace name. See the Namespace Management chapter.

IF NOT EXISTS

This is an optional clause. If this clause is specified and if a table with the same qualified name exists (or is being created) and if that existing table has the same structure as in the statement, no error is generated. In all other cases and if a table with the same qualified name exists, the create table statement generates an error indicating that the table exists.

ttl_definition

The Time-To-Live (TTL) value is used in computing the expiration time of a row. Expired rows are not included in query results and are eventually removed from the table automatically by Oracle NoSQL Database. If you specify a TTL value while creating the table, it applies as the default TTL for every row inserted into this table. However, you can override the table level TTL by specifying a TTL value via the table insertion API.

The expiration time of a row is computed by adding the TTL value to the current timestamp. To be more specific, for a TTL value of N hours/days, the expiration time is the current time (in UTC) plus N hours/days, rounded up to the next full hour/day. For example, if the current timestamp is 2020-06-23T10:01:36.096 and the TTL is 4 days, the expiration time will be 2020-06-28T00:00:00.000. You can use zero as a special value to indicate that a rows should never expire. If the CREATE TABLE statement has no TTL specification, the default table TTL is zero.

In case of MR Tables with TTL value defined, the rows replicated to other regions carry the expiration time when the row was written. This can be either the default table level TTL value or a row level override that is set by your application. Therefore, this row will expire in all the regions at the same time, irrespective of when they were replicated. However, if a row is updated in one of the regions and it expires in the local region even before it is replicated to one of the remote region(s), then this row will expire as soon as it is replicated and committed in that remote region.

region_definition

This is an optional clause. In case, the table being created is an MR Table, this parameter lists all the regions that the table should span. You must mention at least one remote region in this clause to create the table as an MR Table. For information on MR Tables, see Life Cycle of MR Tables.

Note:

Specifying this clause while creating a child table of a MR Table will result in an error.

table_definition

The table_definition part of the statement must include at least one field definition, and exactly one primary key definition (Although the syntax allows for multiple key_definitions, the query processor enforces the one key_definition rule. The syntax is this way to allow for the key definition to appear anywhere among the field definitions).

column_definition

The syntax for a column definition is similar to the field_definition grammar rule that defines the fields of a record type. See Data Type Definitions section. It specifies the name of the column, its data type, whether the column is nullable or not, an optional default value or whether the column is an IDENTITY column or not, and an optional comment. As mentioned in Table Management section, tables are containers of records, and the table_definitions acts as an implicit definition of a record type (the table schema), whose fields are defined by the listed column_definitions. However, when the type_definition grammar rule is used in any DDL statement, the only wildcard type that is allowed is the JSON type. So, for example, it is possible to create a table with a column whose type is JSON, but not a column whose type is ANY.

identity_definition

The identity_definition specifies the name of the identity column. There can only be one identity column per table. See Using the IDENTITY Column section.

uuid_definition

The uuid_definition declares the type of a column to be the UUID type. See Using the UUID data type section.

mr_counter_definition

The mr_counter_definition parameter declares the type of a column to be the MR_COUNTER datatype. This data type can be used only in a multi-region table. See Using CRDT datatype in a multi-region table.

key_definition

The syntax for the primary key specification (key_definition) specifies the primary key columns of the table as an ordered list of field names. The column names must be among the ones appearing in the field_definitions, and their associated type must be one of the following: a numeric type, string, enum, or timestamp. The usual definition of a primary key applies: two rows of the same table cannot have the same values on all of their primary key columns.

shard_key_definition

A key_definition specifies the table’s shard key columns as well, as the first N primary-key columns, where 0 < N <= M and M is the number of primary-key columns. Specification of a shard key is optional. By default, for a root table (a table without a parent) the shard key is the whole primary key. Semantically, the shard key is used to distribute table rows across the multiple servers and processes that comprise an Oracle NoSQL Database store. Briefly, two rows having the same shard key, i.e., the same values on their shardkey columns, will always be located in the same server and managed by the same process. Further details about the distribution of data in Oracle NoSQL Database can be found in the Primary and Shard Key Design section.

storage_size

An additional property of INTEGER-typed primary-key fields is their storage size. This is specified as an integer number between 1 and 5 (the syntax allows any integer, but the query processor enforces the restriction). The storage size specifies the maximum number of bytes that may be used to store in serialized form a value of the associated primary key column. If a value cannot be serialized into the specified number of bytes (or less), an error will be thrown. An internal encoding is used to store INTEGER (and LONG) primary-key values, so that such values are sortable as strings (this is because primary key values are always stored as keys of the "primary" Btree index). The following table shows the range of positive values that can be stored for each byte-size (the ranges are the same for negative values). Users can save storage space by specifying a storage size less than 5, if they know that the key values will be less or equal to the upper bound of the range associated with the chosen storage size.

comment

Comments are included at table-level and they become part of the table's metadata as uninterpreted text. Comments are displayed in the output of the describe statement.

The rows of the Users table defined above represent information about users. For each such user, the "connections" field is an array containing ids of other users that this user is connected with. We assume that the ids in the array are sorted by some measure of the strength of the connection. The "expenses" column is a map mapping expense categories (like "housing", "clothes", "books", etc) to the amount spent in the associated category. The set of categories may not be known in advance, or it may differ significantly from user to user or may need to be frequently updated by adding or removing categories for each user. As a result, using a map type for "expenses", instead of a record type, is the right choice. Finally, the "address" column has type JSON. A typical value for "address" will be a map representing a JSON document that looks like this:

{
    "street" : "Pacific Ave",
    "number" : 101,
    "city"   : "Santa Cruz",
    "state"  : "CA",
    "zip"    : 95008,
    "phones" : [
            { "area" : 408, "number" : 4538955, "kind" : "work" },
            { "area" : 831, "number" : 7533341, "kind" : "home" }
    ]
}

However, any other valid JSON value may be stored there. For example, some addresses may have additional fields, or missing fields, or fields spelled differently. Or, the "phones" field may not be an array of JSON objects but a single such object. Or the whole address maybe just one string, or number, or JNULL.