Upsert statement

The word UPSERT combines UPDATE and INSERT, describing the statement's function.

Syntax:

upsert_statement ::= 
   [variable_declaration] 
   UPSERT INTO table_name 
   [[AS] table_alias]
   ["(" id ("," id)* ")"] 
   VALUES "(" insert_clause ("," insert_clause)* ")"
   [SET TTL ttl_clause ] 
   [returning_clause]

insert_clause ::= DEFAULT | expression

returning_clause ::= RETURNING select_list
Use an UPSERT statement to insert a row where it does not exist, or to update the row with new values when it does.
  • Optional column(s) may be specified after the table name. This list contains the column names for a subset of the table’s columns. The subset must include all the primary key columns. If no columns list is present, the default columns list is the one containing all the columns of the table, in the order they are specified in the CREATE TABLE Statement.
  • The columns in the columns list correspond one-to-one to the expressions (or DEFAULT keywords) listed after the VALUES clause (an error is raised if the number of expressions/DEFAULTs is not the same as the number of columns).
  • Following the VALUES list a SET TTL clause may be used to set the expiration time of an upserted(inserted/updated) row.
  • If there is no RETURNING clause, the result of the UPSERT statement is a record with a single field whose name is NumRowsInserted and whose value is the number of rows inserted: 0 if the row existed already and an update happened, or 1 otherwise. If there is a RETURNING clause, it acts the same way as the SELECT clause: it can be a *, in which case, a full row will be returned, or it can have a list of expressions specifying what needs to be returned.
The users table has three rows as shown below:
SELECT count(*) FROM users
{"Column_1":3}

Example 1: Updating data in the users table using UPSERT command.

The existing value for an user with id 10 is shown below.
SELECT * FROM users WHERE id=10;
{
  "id" : 10,
  "firstName" : "John",
  "lastName" : "Smith",
  "otherNames" : [{
    "first" : "Johny",
    "last" : "BeGood"
  }],
  "age" : 22,
  "income" : 45000,
  "address" : {
    "city" : "Reno",
    "number" : 10,
    "state" : "NV",
    "street" : "Main"
  },
  "connections" : [30, 55, 43],
  "expenses" : null
}
You modify the existing row using the UPSERT command. The otherNames array and income field is modified.
 UPSERT INTO users VALUES (
     10,
     "John",
     "Smith",
     [ {"first" : "Johny", "last" : "AlwaysGood"} ],
     22,
     80000,
     { "street" : "Main", "number" : 10, "city" : "Reno", "state" : "NV"},
     [ 30, 55, 43 ],
    DEFAULT
 );
{"NumRowsInserted":0}

1 row returned
You get the result as {"NumRowsInserted":0} which implies that the row has been updated. The updated value for an user with id 10 can be verified with a SELECT statement as shown below.
SELECT * FROM users WHERE id=10;
{
  "id" : 10,
  "firstName" : "John",
  "lastName" : "Smith",
  "otherNames" : [{
    "first" : "Johny",
    "last" : "AlwaysGood"
  }],
  "age" : 22,
  "income" : 80000,
  "address" : {
    "city" : "Reno",
    "number" : 10,
    "state" : "NV",
    "street" : "Main"
  },
  "connections" : [30, 55, 43],
  "expenses" : null
}

Example 2: Updating only some columns using UPSERT statement

If you use the UPSERT statement and specify the values of only few columns in the VALUES clause but do not specify the corresponding column names in the INTO clause, you get an error stating the number of columns in the table do not match with the number of values in the VALUES clause as shown below.
UPSERT INTO users VALUES ( 11,  "John", "Smith" );
Error handling command UPSERT INTO users VALUES ( 11,  "John", "Smith" ):
Error: at (1, 0) The number of VALUES expressions is not equal to the number of table columns
You can avoid this error by specifying the column list after the table name. Here if you do not supply values for all the columns in a UPSERT statement, then those columns get a DEFAULT value if such an option is specified in the corresponding CREATE TABLE statement or those columns are assigned NULL values as shown below.
UPSERT INTO users(id,firstName,lastName) VALUES (11,"John","Smith");

{"NumRowsInserted":1}
1 row returned
You get the result as {"NumRowsInserted":1} which implies that one new row has been inserted. The updated value for an user with id 11 can be verified with a SELECT statement as shown below.
SELECT * FROM users WHERE id=11;
{
  "id" : 11,
  "firstName" : "John",
  "lastName" : "Smith",
  "otherNames" : null,
  "age" : null,
  "income" : null,
  "address" : null,
  "connections" : null,
  "expenses" : null
}
All the fields which were not part of the UPSERT statement has a NULL value.

Note:

Even if a column has a non NULL value ( for example lastNames in the query above), it can becomes NULL if it is omitted in a subsequent UPSERT statement as shown below. Here you are using an optional RETURNING statement to fetch the data after the UPSERT is performed.
UPSERT INTO users(id,firstName) VALUES (11,"Joseph") returning *;
{
  "id" : 11,
  "firstName" : "Joseph",
  "lastName" : null,
  "otherNames" : null,
  "age" : null,
  "income" : null,
  "address" : null,
  "connections" : null,
  "expenses" : null
}
The column lastNames has become NULL because of the UPSERT statement.