Iterating over Table Rows

Store.table_iterator() provides non-atomic table iteration.

Store.table_iterator() does not return the entire set of rows at once. Instead, the iterator fetches rows in batches, which keeps the number of network round trips low without monopolizing the available bandwidth. The rows returned by this method are in unsorted order.

Note that this method does not perform a single atomic operation. Because the retrieval is batched, the result set can change while the retrieval is in progress, for example if rows are updated between batches.
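To make the effect of batching concrete, here is a minimal sketch in plain Python. It does not use the driver at all: the dictionary `table` stands in for the store, and `batched_iterator` is a hypothetical helper, not part of the nosqldb API. It only illustrates how a write that happens between batches becomes visible to an iteration that is already under way.

```python
def batched_iterator(table, batch_size):
    """Yield rows from `table` (a dict of key -> row) one batch at a time.

    Each batch is read from the live table only when the consumer reaches
    it, so writes made between batches are visible to later batches.
    """
    keys = list(table.keys())  # keys known when the iteration starts
    for start in range(0, len(keys), batch_size):
        # One simulated round trip: read the current state of this batch.
        batch = [table[k] for k in keys[start:start + batch_size] if k in table]
        for row in batch:
            yield row


# Toy data standing in for a table; the values are made up.
table = {i: {'id': i, 'price': 10.0} for i in range(6)}

seen = []
for row in batched_iterator(table, batch_size=2):
    seen.append((row['id'], row['price']))
    if row['id'] == 0:
        # Simulate a concurrent writer: change one row that has not been
        # fetched yet and delete another.
        table[4] = {'id': 4, 'price': 99.0}
        del table[5]

print(seen)
```

In this toy model the iteration reports the new price for row 4 and never sees row 5, even though both changes happened after the iteration began. A single atomic read would have returned the original six rows.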

This method provides an unsorted traversal of the rows in your table. If you do not provide a key, the method iterates over all of the table's rows.

When using this method, you can optionally specify:

For example, suppose you have a table that stores information about products, and that it is defined like this:

CREATE TABLE myTable (
    itemType STRING,
    itemCategory STRING,
    itemClass STRING,
    itemColor STRING,
    itemSize STRING,
    price FLOAT,
    inventoryCount INTEGER,
    PRIMARY KEY (SHARD(itemType, itemCategory, itemClass), itemColor, itemSize)
)

With tables containing data like this:

Then, in the simplest case, you can retrieve all of the rows related to 'Hats' using Store.table_iterator().

A simple retrieval like this can also be accomplished using the Store.multi_get() method. If you have a complete shard key, and if the entire result set fits in memory, then multi_get() performs much better than table_iterator(). However, if the result set cannot fit entirely in memory, or if you do not have a complete shard key, then table_iterator() is the better choice.

Note that reads performed using table_iterator() are non-atomic, which may have ramifications if you are performing a long-running iteration over records that are being updated.

The following example uses table_iterator():

import logging

# Assumption: the nosqldb driver package exports IllegalArgumentException,
# which the except clause in do_store_ops() requires.
from nosqldb import IllegalArgumentException


def display_row(row):
    # Print each field of a retrieved row.
    try:
        print("Retrieved row:")
        print("\tType: %s" % row['itemType'])
        print("\tCategory: %s" % row['itemCategory'])
        print("\tClass: %s" % row['itemClass'])
        print("\tColor: %s" % row['itemColor'])
        print("\tSize: %s" % row['itemSize'])
        print("\tPrice: %s" % row['price'])
        print("\tInventory Count: %s" % row['inventoryCount'])
        print("\n")
    except KeyError as ke:
        # args[0] holds the name of the missing field.
        logging.error("Row display failed. Bad key: %s" % ke.args[0])


def do_store_ops(store):
    # Partial key: select every row whose itemType is 'Hats'.
    key_d = {'itemType': 'Hats'}

    try:
        row_list = store.table_iterator("myTable", key_d, False)
        if not row_list:
            logging.debug("Table retrieval failed")
        else:
            logging.debug("Table retrieval succeeded.")
            for r in row_list:
                display_row(r)
    except IllegalArgumentException as iae:
        logging.error("Table retrieval failed.")
        logging.error(iae.message)
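
The guidance on choosing between multi_get() and table_iterator() can be expressed as a small check. The sketch below is illustrative only: `has_complete_shard_key` and `choose_method` are hypothetical helpers, not part of the driver, and the category and class values are made up. The shard key fields are taken from the SHARD(...) clause in the table definition above.

```python
# Shard key fields, as listed in SHARD(...) in the CREATE TABLE statement.
SHARD_KEY_FIELDS = ('itemType', 'itemCategory', 'itemClass')


def has_complete_shard_key(key, shard_fields=SHARD_KEY_FIELDS):
    """Return True if `key` supplies a value for every shard key field."""
    return all(key.get(field) is not None for field in shard_fields)


def choose_method(key, fits_in_memory):
    """Apply the rule of thumb from the text; returns a method name only."""
    if has_complete_shard_key(key) and fits_in_memory:
        return 'multi_get'
    return 'table_iterator'


# The key used in do_store_ops() supplies only one shard key field.
partial_key = {'itemType': 'Hats'}

# Hypothetical values for the remaining shard key fields.
full_shard_key = {
    'itemType': 'Hats',
    'itemCategory': 'baseball',
    'itemClass': 'longbill',
}

print(choose_method(partial_key, fits_in_memory=True))      # table_iterator
print(choose_method(full_shard_key, fits_in_memory=True))   # multi_get
print(choose_method(full_shard_key, fits_in_memory=False))  # table_iterator
```

Whether a result set fits in memory is something you have to estimate for your own data; the sketch simply takes that estimate as a parameter.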