Bulk Get Operations

Bulk get operations allow you to retrieve and process records from each shard in parallel, like a parallel scan, but using a set of keys instead of a single key as retrieval criteria.

A bulk get operation does not return the entire set of rows at once. Instead, the iterator fetches rows in batches, which minimizes the number of network round trips without monopolizing the available bandwidth. Batches are fetched in parallel across multiple Replication Nodes. Specifying more threads on the client side generally improves retrieval performance, until processor or network resources are saturated.
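The batching idea can be pictured with a small, self-contained sketch. This is not how the library is implemented; BatchedIterator and its fetchBatch function are hypothetical names, with fetchBatch standing in for one network round trip. Only the Java standard library is used.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

/**
 * Hypothetical illustration only: an iterator that pulls results in
 * batches instead of all at once. fetchBatch stands in for one network
 * round trip and receives the batch number (0, 1, 2, ...). An empty
 * batch signals that no more results are available.
 */
final class BatchedIterator<T> implements Iterator<T> {
    private final IntFunction<List<T>> fetchBatch;
    private final Deque<T> buffer = new ArrayDeque<>();
    private int roundTrips = 0;
    private boolean exhausted = false;

    BatchedIterator(IntFunction<List<T>> fetchBatch) {
        this.fetchBatch = fetchBatch;
    }

    @Override
    public boolean hasNext() {
        // Go back to the source only when the local buffer is empty.
        while (buffer.isEmpty() && !exhausted) {
            List<T> batch = fetchBatch.apply(roundTrips++);
            if (batch.isEmpty()) {
                exhausted = true;
            } else {
                buffer.addAll(batch);
            }
        }
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.removeFirst();
    }

    /** Number of fetches issued so far, including the final empty one. */
    int roundTrips() {
        return roundTrips;
    }
}
```

The caller sees an ordinary iterator, while the number of fetches grows with the number of batches rather than the number of rows.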

To use bulk get, call one of the TableAPI.tableIterator() or TableAPI.tableKeysIterator() methods that provide bulk retrieval. These accept a set of keys, instead of a single key, as the retrieval criteria. The set is provided as either an Iterator<PrimaryKey> or a List<Iterator<PrimaryKey>> value.

The methods retrieve the rows or primary keys matching the keys supplied by the iterator(s).
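When the keys already sit in one in-memory list, one possible way to produce the list-of-iterators form is to split that list into slices and hand over one iterator per slice. The helper below is a minimal, generic sketch; KeyPartitioner and split are hypothetical names and not part of the API, and with the Table API the element type K would be the primary key type.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not part of the Oracle NoSQL API. */
final class KeyPartitioner {
    private KeyPartitioner() {}

    /**
     * Distributes keys round-robin over at most `parts` slices and
     * returns one iterator per non-empty slice.
     */
    static <K> List<Iterator<K>> split(List<K> keys, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        // Never create more slices than there are keys.
        int n = Math.min(parts, keys.size());
        List<List<K>> slices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            slices.add(new ArrayList<>());
        }
        for (int i = 0; i < keys.size(); i++) {
            slices.get(i % n).add(keys.get(i));
        }
        List<Iterator<K>> iterators = new ArrayList<>();
        for (List<K> slice : slices) {
            iterators.add(slice.iterator());
        }
        return iterators;
    }
}
```

For example, KeyPartitioner.split(keys, 3) on a list of seven keys yields three iterators covering three, two, and two keys respectively.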

Note:

If the iterator yields duplicate keys, the row associated with a duplicated key is returned at least once and possibly multiple times.
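If repeated rows are a problem, one option is to filter duplicates out of the key iterator before handing it to the bulk get method. The wrapper below is a minimal sketch; DistinctIterator is a hypothetical name, and it assumes the key type implements equals() and hashCode() consistently. It remembers every key it has seen, so memory use grows with the number of distinct keys.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

/**
 * Hypothetical wrapper, not part of the Oracle NoSQL API: passes each
 * distinct key through once and silently drops repeats.
 */
final class DistinctIterator<K> implements Iterator<K> {
    private final Iterator<K> source;
    private final Set<K> seen = new HashSet<>();
    private K nextKey;
    private boolean ready = false;

    DistinctIterator(Iterator<K> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Advance the source until a key not seen before is found.
        while (!ready && source.hasNext()) {
            K candidate = source.next();
            if (seen.add(candidate)) {
                nextKey = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public K next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        K result = nextKey;
        nextKey = null;
        return result;
    }
}
```

A caller would wrap the original iterator, for example new DistinctIterator<>(keys.iterator()), and pass the wrapper wherever the key iterator is expected.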

The supplied keys should follow these rules:

  1. All supplied primary keys should belong to the same table.

  2. Each supplied key must contain a complete shard key.

  3. If a field range is specified, then the partial primary keys should be uniform. That is, they should have the same number of components. Also, the field range must apply to the first unspecified field of the supplied keys.
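Rules 2 and 3 can be checked on the client before any request is issued. The sketch below is a self-contained illustration that models each partial key as a Map from field name to value; BulkKeyRules and check are hypothetical names, the map model is an assumption made for the example, and rule 1 (same table) is not modeled.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical client-side check, not part of the Oracle NoSQL API. */
final class BulkKeyRules {
    private BulkKeyRules() {}

    /**
     * @param pkFields     primary key fields in declaration order; the
     *                     first shardKeySize of them form the shard key
     * @param keys         partial keys, modeled as field name to value
     * @param rangeField   field the range applies to, or null for none
     * @return a description of the first violation found, or empty
     */
    static Optional<String> check(List<String> pkFields, int shardKeySize,
                                  List<Map<String, String>> keys,
                                  String rangeField) {
        int components = -1;
        for (Map<String, String> key : keys) {
            int n = key.size();
            // The specified fields must be a leading run of the primary key.
            if (n > pkFields.size()
                    || !key.keySet().equals(new HashSet<>(pkFields.subList(0, n)))) {
                return Optional.of("key fields are not a prefix of the primary key: " + key);
            }
            // Rule 2: the shard key must be complete.
            if (n < shardKeySize) {
                return Optional.of("incomplete shard key: " + key);
            }
            // Rule 3a: with a field range, all keys have the same size.
            if (rangeField != null) {
                if (components == -1) {
                    components = n;
                } else if (n != components) {
                    return Optional.of("keys are not uniform: " + key);
                }
            }
        }
        // Rule 3b: the range applies to the first unspecified field.
        if (rangeField != null && components != -1
                && (components >= pkFields.size()
                    || !pkFields.get(components).equals(rangeField))) {
            return Optional.of("field range is not on the first unspecified field");
        }
        return Optional.empty();
    }
}
```

For a table whose primary key is (itemType, itemCategory, itemClass, itemColor, itemSize) with a two-field shard key, keys that specify only itemType and itemCategory pass the check with a range on itemClass and fail it with a range on itemColor.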

When using these methods, you can also optionally specify:

  • A MultiRowOptions class instance which allows you to specify a field range, as well as the ancestor and parent tables you want to include in the iteration.

  • The number of keys to fetch during each network round trip using a TableIteratorOptions class instance. If you provide a value of 0, an internally determined default is used. You can also specify the traversal order (UNORDERED is supported).

    You can control how many threads are used to perform the store read using the MaxConcurrentRequests parameter.

    Finally, you can specify a consistency policy. See Consistency Guarantees for more information.

For example, suppose you have a table that stores information about products, which is designed like this:

CREATE TABLE myTable (
    itemType STRING,
    itemCategory STRING,
    itemClass STRING,
    itemColor STRING,
    itemSize STRING,
    price FLOAT,
    inventoryCount INTEGER,
    PRIMARY KEY (SHARD(itemType, itemCategory), itemClass, itemColor,
                 itemSize))

With the table containing data like this:

  • Row 1:

    • itemType: Hats

    • itemCategory: baseball

    • itemClass: longbill

    • itemColor: red

    • itemSize: small

    • price: 12.07

    • inventoryCount: 127

  • Row 2:

    • itemType: Hats

    • itemCategory: baseball

    • itemClass: longbill

    • itemColor: red

    • itemSize: medium

    • price: 13.07

    • inventoryCount: 201

  • Row 3:

    • itemType: Pants

    • itemCategory: baseball

    • itemClass: Summer

    • itemColor: red

    • itemSize: large

    • price: 14.07

    • inventoryCount: 39

  • Row 4:

    • itemType: Pants

    • itemCategory: baseball

    • itemClass: Winter

    • itemColor: white

    • itemSize: large

    • price: 16.99

    • inventoryCount: 9

  • Row n:

    • itemType: Coats

    • itemCategory: Casual

    • itemClass: Winter

    • itemColor: red

    • itemSize: large

    • price: 247.99

    • inventoryCount: 13

If you want to locate all the Hats and Pants used for baseball, using nine threads in parallel, you can retrieve all of the records as follows:

package kvstore.basicExample;

...
import java.util.ArrayList;
import java.util.List;
import oracle.kv.Consistency;
import oracle.kv.Direction;
import oracle.kv.table.PrimaryKey;
import oracle.kv.table.Row;
import oracle.kv.table.Table;
import oracle.kv.table.TableAPI;
import oracle.kv.table.TableIterator;
import oracle.kv.table.TableIteratorOptions;

...

// KVStore handle creation is omitted for brevity

...

// Construct the Table Handle
TableAPI tableH = store.getTableAPI();
Table table = tableH.getTable("myTable");


// Use multi-threading for this store iteration and limit the number
// of threads (degree of parallelism) to 9.
final int maxConcurrentRequests = 9;
// A batch size of 0 selects the internally determined default.
final int batchResultsSize = 0;
final TableIteratorOptions tio =
   new TableIteratorOptions(Direction.UNORDERED,
                  Consistency.NONE_REQUIRED,
                  0, null,    // timeout and its unit: use the defaults
                  maxConcurrentRequests,
                  batchResultsSize);

// Create retrieval keys
PrimaryKey myKey = table.createPrimaryKey();
myKey.put("itemType", "Hats");
myKey.put("itemCategory", "baseball");
PrimaryKey otherKey = table.createPrimaryKey();
otherKey.put("itemType", "Pants");
otherKey.put("itemCategory", "baseball");

List<PrimaryKey> searchKeys = new ArrayList<>();

// Add the retrieval keys to the list.
searchKeys.add(myKey);
searchKeys.add(otherKey);


final TableIterator<Row> iterator = tableH.tableIterator(
                                     searchKeys.iterator(), null, tio);

// Now retrieve the records.
try {
    while (iterator.hasNext()) {
        Row row = iterator.next();
        // Do some work with the Row here
    }
} finally {
    // Always close the iterator to release its resources.
    iterator.close();
}