Primary Keys

Every table must have one or more fields designated as the primary key. You make this designation when you create the table, and you cannot change it afterward. A table's primary key uniquely identifies every row in the table. In the simplest case, it is used to retrieve a specific row so that it can be examined and/or modified.

For example, a table might have five fields: productName, productType, color, size, and inventoryCount. To retrieve individual rows from the table, it might be enough to know just the product's name. In this case, you would designate productName as the primary key field and then retrieve rows based on the name of the product that you want to examine or modify.

The CREATE TABLE statement that defines this table is:

CREATE TABLE myProducts (
    productName STRING,
    productType STRING,
    color ENUM (blue,green,red),
    size ENUM (small,medium,large),
    inventoryCount INTEGER,
    // Define the primary key. Every table must have one.
    PRIMARY KEY (productName)
) 

However, a primary key can consist of multiple fields. For example:

CREATE TABLE myProducts (
    productName STRING,
    productType STRING,
    color ENUM (blue,green,red),
    size ENUM (small,medium,large),
    inventoryCount INTEGER,
    // Define the primary key. Every table must have one.
    PRIMARY KEY (productName, productType)
) 

On a functional level, a primary key made up of multiple fields allows you to delete multiple rows in your table in a single atomic operation. It also allows you to retrieve a subset of the rows in your table in a single atomic operation.

We describe how to retrieve multiple rows from your table in Reading Table Rows. We show how to delete multiple rows at a time in Using multiDelete().
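As a rough illustration of why a multi-field key makes multi-row operations possible, the following sketch models the myProducts table as an in-memory map keyed by (productName, productType). It is not the store's client API: the class CompositeKeyDemo, its methods, and the sample rows are all hypothetical, and the sketch does not model atomicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory model only (hypothetical names); not the store's client API.
final class CompositeKeyDemo {

    // Rows keyed by the full primary key (productName, productType).
    // The Integer value stands in for inventoryCount.
    static Map<List<String>, Integer> sampleTable() {
        Map<List<String>, Integer> rows = new LinkedHashMap<>();
        rows.put(Arrays.asList("Hammer", "claw"), 12);
        rows.put(Arrays.asList("Hammer", "sledge"), 3);
        rows.put(Arrays.asList("Wrench", "socket"), 7);
        return rows;
    }

    // Returns the full keys of every row whose first key field equals productName.
    static List<List<String>> keysMatching(Map<List<String>, Integer> rows, String productName) {
        List<List<String>> matches = new ArrayList<>();
        for (List<String> key : rows.keySet()) {
            if (key.get(0).equals(productName)) {
                matches.add(key);
            }
        }
        return matches;
    }

    // Removes every row whose first key field equals productName; returns the number removed.
    static int deleteMatching(Map<List<String>, Integer> rows, String productName) {
        List<List<String>> matches = keysMatching(rows, productName);
        for (List<String> key : matches) {
            rows.remove(key);
        }
        return matches.size();
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> rows = sampleTable();
        System.out.println(keysMatching(rows, "Hammer"));
        System.out.println(deleteMatching(rows, "Hammer"));
        System.out.println(rows.keySet());
    }
}
```

With productName alone as the primary key, the two Hammer rows could not coexist, because the key must identify each row uniquely. Adding productType to the key lets them coexist and lets a single request address both of them through the shared productName value.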

Note:

If the primary key field is an INTEGER data type, you can apply a serialized size constraint to it. See Integer Serialized Constraints.

Data Type Limitations

Fields can be designated as primary keys only if they are declared to be one of the following types:

  • Integer

  • Long

  • Number

  • Float

  • Double

  • String

  • Timestamp

  • Enum

Partial Primary Keys

Some of the methods you use to perform multi-row operations allow, or even require, a partial primary key. A partial primary key is, simply, a key where only some of the fields comprising the row's primary key are specified.

For example, the following statement specifies three fields for the table's primary key:

CREATE TABLE myProducts (
    productName STRING,
    productType STRING,
    productClass STRING,
    color ENUM (blue,green,red),
    size ENUM (small,medium,large),
    inventoryCount INTEGER,
    // Define the primary key. Every table must have one.
    PRIMARY KEY (productName, productType, productClass)
) 

In this case, a full primary key would be one where you provide values for all three primary key fields: productName, productType, and productClass. A partial primary key would be one where you provide values for only one or two of those fields.

Note that order matters when specifying a partial key. The partial key must be a subset of the full key, starting with the first field specified and then adding fields in order, without skipping any. So, for the primary key defined above, the following partial keys are valid:

  • productName

  • productName, productType
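The ordering rule can be expressed as a simple prefix check. The sketch below is a minimal illustration under the definition given above (a partial key names at least one field, but fewer than all of them, in declaration order). The class PartialKeyDemo and the method isValidPartialKey are hypothetical helpers, not part of any client library.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative helper only (hypothetical names); not part of any client library.
final class PartialKeyDemo {

    // True when `fields` is a non-empty leading run of `primaryKey`
    // that is shorter than the full key.
    static boolean isValidPartialKey(List<String> primaryKey, List<String> fields) {
        if (fields.isEmpty() || fields.size() >= primaryKey.size()) {
            return false;
        }
        return primaryKey.subList(0, fields.size()).equals(fields);
    }

    public static void main(String[] args) {
        List<String> pk = Arrays.asList("productName", "productType", "productClass");

        // Valid: starts at the first field and adds fields in order.
        System.out.println(isValidPartialKey(pk, Arrays.asList("productName")));
        System.out.println(isValidPartialKey(pk, Arrays.asList("productName", "productType")));

        // Not valid: skips the first field, or leaves a gap.
        System.out.println(isValidPartialKey(pk, Arrays.asList("productType")));
        System.out.println(isValidPartialKey(pk, Arrays.asList("productName", "productClass")));
    }
}
```

By the same rule, productType on its own, or productName combined with productClass, does not form a valid partial key for this table.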

Shard Keys

Shard keys identify which primary key fields are meaningful in terms of shard storage. That is, rows that contain the same values for all the shard key fields are guaranteed to be stored on the same shard. This matters for some operations that promise atomicity of the results. (See Executing a Sequence of Operations for more information.)

For example, suppose you define the following primary key:

PRIMARY KEY (productType, productName, productClass)

You can guarantee that rows are placed on the same shard using the values set for the productType and productName fields like this:

PRIMARY KEY (SHARD(productType, productName), productClass)

Note that order matters when it comes to shard keys. The keys must be specified in the order that they are defined as primary keys, with no gaps in the key list. In other words, given the above example, it is impossible to set productType and productClass as shard keys without also specifying productName as a shard key.
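To make the co-location guarantee concrete, the following sketch routes a row to a shard using only its shard key values, so two rows that differ only in productClass land on the same shard. The hashing scheme, the shard count, and the names ShardKeyDemo, isValidShardKey, and shardForRow are assumptions made for illustration; the store's actual placement algorithm is not described here and may differ. The sketch also assumes that a shard key may cover the entire primary key.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only (hypothetical names and hashing); not the store's placement algorithm.
final class ShardKeyDemo {

    // A shard key is valid only if it is a non-empty leading run of the primary key fields.
    static boolean isValidShardKey(List<String> primaryKey, List<String> shardKey) {
        return !shardKey.isEmpty()
                && shardKey.size() <= primaryKey.size()
                && primaryKey.subList(0, shardKey.size()).equals(shardKey);
    }

    // Chooses a shard from the first shardKeyLength values of the row's primary key.
    // Fields after the shard key play no part in the choice.
    static int shardForRow(List<String> keyValues, int shardKeyLength, int shardCount) {
        List<String> shardKeyValues = keyValues.subList(0, shardKeyLength);
        return Math.floorMod(shardKeyValues.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        List<String> pk = Arrays.asList("productType", "productName", "productClass");

        // SHARD(productType, productName) is valid; (productType, productClass) leaves a gap.
        System.out.println(isValidShardKey(pk, Arrays.asList("productType", "productName")));
        System.out.println(isValidShardKey(pk, Arrays.asList("productType", "productClass")));

        // Same productType and productName, different productClass: same shard.
        int a = shardForRow(Arrays.asList("hardware", "Hammer", "consumer"), 2, 8);
        int b = shardForRow(Arrays.asList("hardware", "Hammer", "professional"), 2, 8);
        System.out.println(a == b);
    }
}
```

Because productClass is excluded from the routing decision, every row for a given productType and productName pair resolves to the same shard in this model, which is the property that multi-row atomic operations rely on.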