INSERT Statement

The INSERT statement is used to construct a new row and add it in a specified table.

Syntax

insert_statement ::= 
   [variable_declaration] 
   INSERT  INTO table_name 
   [[AS] table_alias]
   ["(" id ("," id)* ")"] 
   VALUES "(" insert_clause ("," insert_clause)* ")"
   [SET TTL ttl_clause ] 
   [returning_clause]

insert_clause ::= DEFAULT | expression

returning_clause ::= RETURNING select_list

Semantics

The row will be inserted only if it does not exist already.

Insert statements may start with declarations of external variables that are used in the rest of the statement. See Variable Declaration. However, contrary to queries, external variables can be used in inserts without being declared. This is because inserts do not query any data, and as result, knowing the type of external variables in advance is not important as there isn’t any query optimization to be done.

Optional column(s) may be specified after the table name. This list contains the column names for a subset of the table’s columns. The subset must include all the primary key columns. If no columns list is present, the default columns list is the one containing all the columns of the table, in the order they are specified in the CREATE TABLE Statement section.

The columns in the columns list correspond one-to-one to the expressions (or DEFAULT keywords) listed after the VALUES clause (an error is raised if the number of expressions/DEFAULTs is not the same as the number of columns). These expressions/DEFAULTs compute the value for their associated column in the new row. Specifically, each expression is evaluated and its returned value is cast to the type of its associated column. The cast behaves like the cast expression as described in the Cast Expression section. An error is raised if an expression returns more than one item. If an expression returns no result, NULL is used as the result of that expression. If instead of an expression, the DEFAULT keyword appears in the VALUES list, the default value of the associated column is used as the value of that column in the new row. The default value is also used for any missing columns, when the number of columns in the columns list is less than the total number of columns in the table.

The expressions in the VALUES list may reference external variables, which unlike query statements, do not need to be declared in a declarations section.

Following the VALUES list a SET TTL clause may be used to set the expiration time of the new row. As described in the CREATE TABLE Statement section, every row has an expiration time, which may be infinite, and which is computed in terms of a Time-To-Live (TTL) value that is specified as a number of days or hours. Specifically, for a TTL value of N hours/days, where N is greater than zero, the expiration time is computed as the current time (in UTC) plus N hours/days, rounded up to the next full hour/day. For example, if the current time is 2017-06-01T10:05:30.0 and N is 3 hours, the expiration time will be 2017-06-01T14:00:00.0. If N is 0, the expiration time is infinite.

As shown in the syntax, the SET TTL clause comes in two flavors. When the USING TABLE DEFAULT syntax is used, the TTL value is set to the table default TTL that was specified in the CREATE TABLE statement. Otherwise, the SET TTL contains an expression, which computes a new TTL value. If the result of this expression is empty, the default TTL of the table is used. Otherwise, the expression must return a single numeric item, which is cast to an integer N. If N is negative, it is set to 0. To the right of the TTL expression, the keyword HOURS or DAYS must be used to specify whether N is a number of hours or days, respectively.

If the insert statement contains a SET TTL clause, an expiration time is computed as described above and applied to the row being inserted. If no SET TTL clause is used, the default table TTL is used to compute the expiration time of the inserted row. In case of MR Tables, when this row is replicated to other regions, its expiration time is also replicated as an absolute timestamp value. Therefore, the replicated rows will expire along with the original row, irrespective of when they were replicated. If the same row is inserted with a TTL value in multiple regions, then the TTL value will be set in all regions to the value held in the row with the greatest write timestamp.

The last part of the insert statement is the RETURNING clause. If not present, the result of the update statement is a record with a single field whose name is "NumRowsInserted" and whose value is the number of rows inserted: 0 if the row existed already, or 1 otherwise. Otherwise, if there is a RETURNING clause, it acts the same way as the SELECT clause: it can be a "*", in which case, a full row will be returned, or it can have a list of expressions specifying what needs to be returned. In the case of an INSERT where no insertion is actually done (because the row exists already), the RETURNING clause acts on the existing row. Otherwise, the RETURNING clause acts on the new row.

Example 1: Using DEFAULT values while inserting data

The following statement inserts a row to the users table from CREATE TABLE Statement. Notice that the value for the expenses column will be set to NULL, because the DEFAULT clause is used for that column.

INSERT INTO users VALUES (
    10,
    "John",
    "Smith",
    [ {"first" : "Johny", "last" : "BeGood"} ],
    22,
    45000,
    { "street" : "Main", "number" : 10, "city" : "Reno", "state" : "NV"},
    [ 30, 55, 43 ],
    DEFAULT
);

Example 2: Using a RETURNING clause in an INSERT statement.

A RETURNING clause acts the same way as the SELECT clause. A "*", in the example below returns the full row that got inserted.


INSERT INTO users VALUES (
    20,
    "Mary",
    "Ann",
    null,
    43,
    90000,
    { "street" : "Main", "number" : 89, "city" : "San Jose", "state" : "CA"},
    null,
    DEFAULT
) RETURNING *;

Result:

{"id":20,"firstName":"Mary","lastName":"Ann","otherNames":null,
"age":43,"income":90000,"address":{"city":"San Jose","number":89,
"state":"CA","street":"Main"},"connections":null,"expenses":null}

Example 3: Set a TTL for a row while inserting a row

In the example below, the expiration of the row that is inserted is set to 2 days.


INSERT INTO users VALUES (
    30,
    "Peter",
    "Paul",
    null,
    25,
    53000,
    { "street" : "Main", "number" : 3, "city" : "Fresno", "state" : "CA"},
    null,
    DEFAULT
)SET TTL 2 days;
{"NumRowsInserted":1}

Insert data into a child table:

A child table inherits the primary key columns of its parent table. This is done implicitly, without including the parent columns in the CREATE TABLE statement of the child. So the child table has additional columns pertaining to the primary key of the parent table. While inserting data into a child table, the value for the primary key of the parent table should also be inserted.

Example: The description of the child table userdet is shown below.

sql-> desc as json table users.userdet;
{
  "json_version" : 1,
  "type" : "table",
  "name" : "userdet",
  "parent" : "users",
  "regions" : {
    "2" : "FRA",
    "1" : "LON"
  },
  "fields" : [{
    "name" : "id",
    "type" : "INTEGER",
    "nullable" : false
  }, {
    "name" : "pan",
    "type" : "INTEGER",
    "nullable" : false
  }, {
    "name" : "address",
    "type" : "STRING",
    "nullable" : true
  }, {
    "name" : "email",
    "type" : "STRING",
    "nullable" : true
  }],
  "primaryKey" : ["id", "pan"],
  "shardKey" : ["id"]
}

Here "id" is the primary key of the parent table. While inserting data, provide the value for the "id" column. Else an error is thrown.

insert into users.userdet values(1,100,"bangalore","test@one.com");
{"NumRowsInserted":1}

When you retrieve the data from the child table, the primary key of the parent table is also retrieved.

SELECT * FROM users.userdet;
{"id":1,"pan":100,"address":"bangalore","email":"test@one.com"}