Primary keys and shard keys are important concepts for your table design. Your choice of primary and shard keys affects your ability to read multiple rows at a time. Beyond that, your key design has important performance implications.
Every table must have one or more fields designated as the primary key. This designation occurs at the time that the table is created, and cannot be changed after the fact. A table's primary key uniquely identifies every row in the table. In the simplest case, it is used to retrieve a specific row so that it can be examined and/or modified.
For example, a table might have five fields: productName, productType, color, size, and inventoryCount. To retrieve individual rows from the table, it might be enough to just know the product's name. In this case, you would set the primary key field as productName and then retrieve rows based on the product name that you want to examine/manipulate.
The statement you use to define this table is:
CREATE TABLE myProducts (
    productName STRING,
    productType STRING,
    color ENUM (blue,green,red),
    size ENUM (small,medium,large),
    inventoryCount INTEGER,
    // Define the primary key. Every table must have one.
    PRIMARY KEY (productName)
)
However, you can use multiple fields for your primary key. For example:
CREATE TABLE myProducts (
    productName STRING,
    productType STRING,
    color ENUM (blue,green,red),
    size ENUM (small,medium,large),
    inventoryCount INTEGER,
    // Define the primary key. Every table must have one.
    PRIMARY KEY (productName, productType)
)
On a functional level, doing this allows you to delete multiple rows in your table in a single atomic operation. In addition, a primary key made up of multiple fields allows you to retrieve a subset of the rows in your table in a single atomic operation.
We describe how to retrieve multiple rows from your table in Reading Table Rows. We show how to delete multiple rows at a time in Using multiDelete().
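To make the idea of a multi-row operation concrete, here is a minimal in-memory sketch in Python. It is not the store's API: the dictionary `rows` and the helpers `rows_matching` and `delete_matching` are hypothetical names used only for illustration, the sample data is invented, and atomicity is not modeled. It only shows how a two-field primary key lets one leading value select several rows.

```python
# Hypothetical in-memory stand-in for the myProducts table.
# Each key is a (productName, productType) tuple, mirroring
# PRIMARY KEY (productName, productType). Sample data is invented.
rows = {
    ("hammer", "hardware"): {"color": "red", "size": "small", "inventoryCount": 12},
    ("hammer", "toy"): {"color": "blue", "size": "small", "inventoryCount": 40},
    ("wrench", "hardware"): {"color": "green", "size": "large", "inventoryCount": 7},
}


def rows_matching(table, leading_values):
    """Return the rows whose primary key starts with leading_values."""
    prefix = tuple(leading_values)
    return {key: row for key, row in table.items() if key[:len(prefix)] == prefix}


def delete_matching(table, leading_values):
    """Delete every row whose key starts with leading_values; return the count."""
    doomed = list(rows_matching(table, leading_values))
    for key in doomed:
        del table[key]
    return len(doomed)
```

Calling `rows_matching(rows, ["hammer"])` selects both hammer rows, and `delete_matching(rows, ["hammer"])` removes them together. With a single-field key on productName there could be only one such row.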
Fields can be designated as primary keys only if they are declared to be one of the following types:
Integer
Long
Float
Double
String
Enum
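The restriction above can be checked before a table is defined. The following Python sketch assumes a schema is represented as a plain dictionary of field name to type name. That representation, the helper `invalid_key_fields`, and the `discontinued` field with its type are all hypothetical.

```python
# Types listed above as valid for primary key fields, written in upper case.
ALLOWED_KEY_TYPES = {"INTEGER", "LONG", "FLOAT", "DOUBLE", "STRING", "ENUM"}


def invalid_key_fields(field_types, primary_key):
    """Return the primary key fields whose declared type is not allowed."""
    return [
        name for name in primary_key
        if field_types[name].upper() not in ALLOWED_KEY_TYPES
    ]


# Hypothetical schema; "discontinued" and its type are invented for the example.
field_types = {
    "productName": "STRING",
    "productType": "STRING",
    "color": "ENUM",
    "inventoryCount": "INTEGER",
    "discontinued": "BOOLEAN",
}
```

An empty result means every field in the proposed key has one of the listed types.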
Some of the methods you use to perform multi-row operations allow, or even require, a partial primary key. A partial primary key is, simply, a key where only some of the fields comprising the row's primary key are specified.
For example, the following statement specifies three fields for the table's primary key:
CREATE TABLE myProducts (
    productName STRING,
    productType STRING,
    productClass STRING,
    color ENUM (blue,green,red),
    size ENUM (small,medium,large),
    inventoryCount INTEGER,
    // Define the primary key. Every table must have one.
    PRIMARY KEY (productName, productType, productClass)
)
In this case, a full primary key would be one where you provide values for all three primary key fields: productName, productType, and productClass. A partial primary key would be one where you provide values for only one or two of those fields.
Note that order matters when specifying a partial key. The partial key must be a subset of the full key, starting with the first field specified and then adding fields in order. So the following partial keys are valid:
productName
productName, productType
But a partial key comprised of productType and productClass is not.
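The ordering rule amounts to a prefix check: the fields you supply must match the start of the primary key field list, in order. A minimal Python sketch follows. The function name `is_valid_partial_key` is hypothetical and not part of any store API.

```python
def is_valid_partial_key(primary_key, supplied):
    """Return True if supplied is a non-empty, proper leading prefix of primary_key.

    A full key (all fields supplied) is not a partial key, so it returns False.
    """
    supplied = list(supplied)
    n = len(supplied)
    return 0 < n < len(primary_key) and list(primary_key[:n]) == supplied


primary_key = ["productName", "productType", "productClass"]
```

With this key, `["productName"]` and `["productName", "productType"]` pass the check, while `["productType", "productClass"]` fails because it skips the first field.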
Shard keys identify which primary key fields are meaningful in terms of shard storage. That is, rows which contain the same values for all the shard key fields are guaranteed to be stored on the same shard. This matters for some operations that promise atomicity of the results. (See Executing a Sequence of Operations for more information.)
For example, suppose you define the following primary key:
PRIMARY KEY (productType, productName, productClass)
You can guarantee that rows are placed on the same shard using the values set for the productType and productName fields like this:
PRIMARY KEY (SHARD(productType, productName), productClass)
Note that order matters when it comes to shard keys. The keys must be specified in the order that they are defined as primary keys, with no gaps in the key list. In other words, given the above example, it is impossible to set productType and productClass as shard keys without also specifying productName as a shard key.
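The following Python sketch illustrates both points: the shard key must be a leading run of the primary key fields, and rows that agree on every shard key field land on the same shard. The hashing scheme is purely an assumption made for the illustration; the store's actual placement algorithm is not described here. The names `is_valid_shard_key`, `shard_for`, and `num_shards`, and the sample rows, are hypothetical.

```python
import hashlib


def is_valid_shard_key(primary_key, shard_key):
    """Return True if shard_key is a non-empty leading run of primary_key, with no gaps."""
    n = len(shard_key)
    return 0 < n <= len(primary_key) and list(primary_key[:n]) == list(shard_key)


def shard_for(row, shard_key, num_shards):
    """Pick a shard from the shard key values only (illustrative hash, not the store's)."""
    material = "\x00".join(str(row[field]) for field in shard_key).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


primary_key = ["productType", "productName", "productClass"]
shard_key = ["productType", "productName"]

# Two invented rows that differ only in productClass, which is not a shard key field.
row_a = {"productType": "hardware", "productName": "hammer", "productClass": "basic"}
row_b = {"productType": "hardware", "productName": "hammer", "productClass": "pro"}
```

Because `shard_for` reads only productType and productName, `row_a` and `row_b` always map to the same shard, whatever value `num_shards` takes. A shard key of `["productType", "productClass"]` fails `is_valid_shard_key` because it skips productName.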