Estimate Capacity

Learn how to estimate throughput and storage capacities for your Oracle NoSQL Database Cloud Service.

Basics Behind the Calculation

Before you learn how to estimate throughput and storage for the service, let us revisit the basics.

  • Write Unit (WU): One Write Unit is defined as the throughput of up to 1 kilobyte (KB) of data per second. A write operation is any Oracle NoSQL Database Cloud Service API call that results in insertion, update, or deletion of a record. Your NoSQL table has a write limit value which specifies the number of write units that may be used each second. Any index that needs to be updated also consumes write units.

    For example, a record size of less than 1 KB requires one WU for a write operation. A record size of 1.5 KB requires two WUs for the write operation.

  • Read Unit (RU): One Read Unit is defined as the throughput of up to 1 KB of data per second for an eventually consistent read operation. Your NoSQL table has a read limit value which specifies the number of read units that may be used each second.

    For example, a record size of less than 1 KB requires one RU for an eventually consistent read operation. A record size of 1.5 KB requires two RUs for an eventually consistent read operation and four RUs for an absolutely consistent read operation.

  • Storage Capacity: It is a gigabyte (GB) of disk space used by your data.

  • Absolute Consistency: The data returned is expected to be the most recently written data to the database.

  • Eventual Consistency: The data returned may not be the most recently written data to the database; if no new updates are made to the data, eventually all accesses to that data return the latest updated value.

Factors that Impact the Capacity Unit

Before you provision the capacity units, it is important to consider the following factors that impact the read, write, and storage capacities.

  • Record size: As the record size increases, the number of capacity units consumed to write or read data also increases.

  • Data consistency: Absolute consistent reads are twice the cost of eventual consistent reads.

  • Secondary Indexes: In a table, when an existing record is modified (added, updated, or deleted), updating secondary indexes consumes Write Units. The total provisioned throughput cost for a write operation is the sum of write units consumed by writing to the table and updating the local secondary indexes.

  • Data modeling choice: With schema-less JSON, each document is self-describing which adds metadata overhead to the overall size of the record. With fixed schema tables, the overhead for each record is exactly 1 byte.

  • Query pattern: The cost of a query operation depends on the number of rows retrieved, number of predicates, the size of the source data, projections, and the presence of indexes. The least expensive queries specify a shard key or index key (with an associated index) to allow the system to take advantage of primary and secondary indexes. An application can try different queries and examine the consumed throughput to help tune the operations.

Real World Example: How to Estimate your Application Workload

Consider a real world example of an E-commerce application to learn how to estimate reads and writes per second. In this example, Oracle NoSQL Database Cloud Service is used to store the product catalog information of the application.

  1. Identify the data model (JSON or fixed-table), record size, and key size for the application.

    Assume that the E-commerce application follows the JSON data model and the developer has created a simple table with two columns. A record identifier (primary key) and a JSON document for the product features and attributes. The JSON document, which is under 1 KB is as follows:

    {
    	"additionalFeatures": "Front Facing 1.3MP Camera",
    	"os": "Macintosh OS X 10.7", 
    	"battery": {      
    		"type": "Lithium Ion (Li-Ion) (7000 mAH)",
         "standbytime" : "24 hours"
    	},
    	"camera": {
    		"features": ["Flash","Video"],
    		"primary": "5.0 megapixels"
    	},
    	"connectivity": {
    		"bluetooth": "Bluetooth 2.1",
    		"cell": "T-mobile HSPA+ @ 2100/1900/AWS/850 MHz",
     		"gps": true,
    		"infrared": false,
    		"wifi": "802.11 b/g"
    	},
    	"description": "Apple iBook is the best in class computer
    		for your professional and personal work.",
    	"display": {
    		"screenResolution": "WVGA (1280 x 968)",
    		"screenSize": "13.0 inches"
    	},
    	"hardware": {
    		"accelerometer": true,
    		"audioJack": "3.5mm",
    		"cpu": "Intel i7 2.5 GHz",
    		"fmRadio": false,
    		"physicalKeyboard": false,
    		"usb": "USB 3.0"
    	},
    	"id": "appleproduct_1",
    	"images": ["img/apple-laptop.jpg"],
    	"name": "Myshop.com : Apple iBook",
    	"sizeAndWeight": {
    		"dimensions": [
    			"300 mm (w)",
    			"300 mm (h)",
    			"12.4 mm (d)"
    		],
    		"weight": "1250.0 grams"
    	},
     	"storage": {
    		"hdd": "750GB",
    		"ram": "8GB"
    	}
    }

    Let us assume that the application has 100,000 such records and the primary key is about 20 bytes in size.

    Also, let us assume that there are queries that would read records via secondary index. For example, to find all the records that have screen size of 13 inches. So, an index is created on the screenSize field.

    The information in summarized as follows:

    Tables Rows per Table Columns per Table Key Size in Bytes Value Size in Bytes (sum of all columns) Indexes Index Key Size in Bytes

    1

    100000

    2

    20

    2 KB

    1

    20

  2. Identify the list of operations (typically CRUD operations and Index reads) on the table and at what rate (per second) they are expected.

    Operation Number of Operations (per second) Example

    Create Records.

    3

    To create a product.

    Read records using the primary key.

    200

    To read product details using the product ID.

    Read records using the secondary index.

    1

    To fetch all products that have a screen size of 13 inches.

    Update or add an attribute to a record.

    5

    To update the product description of a camera

    or

    To add information about the weight of a camera.

    Delete record.

    5

    To delete an existing product.

  3. Identify the read and write consumption in KB.

    Operation Assumptions (If any) Formula Read Consumption in KB Write Consumption in KB

    Create Records.

    Let us assume that the records are created without performing any condition checks (if they exist).

    Record size (rounded to next KB) + 1 KB(index) * (number of indexes)

    0

    1 KB + 1 KB (1 ) = 2 KB

    Read records using the primary key.

     

    Minimum Read (1 KB) + Record size round up to KB + Overhead of 2 KB

    Record size = 1 KB

    0

    Read records using the secondary index.

    Let us assume that 100 records are returned.

    [1 KB(index) + record_size] * number_of_records_matched

    Some variable overhead may occur depending on how many batches are returned and the size limit that is set for the query.

    Let us say this variable consumption is 10 KB. Therefore the total consumption would be 210 KB.

    100 * (1 KB + 1 KB) = 200 KB

    : 200 KB + 10 KB = 210 KB

    0

    Update existing records

    Let us assume that the updated record size is the same as the old record size (1 KB).

    Read consumption = [1 KB(index) * 2] + [record_size * 2]

    Write consumption = original_record_size + new_record_size + 1 KB (index)(number of writes)

    2 KB + 2 KB = 4 KB

    1 KB + 1 KB + 1 KB = 3 KB

    Delete record

     

    Read consumption = 1 KB (index) * 2

    Write consumption = record_size + 1KB (index) * (number_of_indexes)

    1 KB + 2 KB = 2 KB

    1 KB + 1 KB (1 index) = 2 KB

    Using steps 2 and 3, let us determine read and write units for our application workload.

    Operations Rate of Operations Reads per Second Writes per Second

    Create records

    3

    0

    6

    Read records using the primary Key

    300

    300

    0

    Read records using the secondary index

    10

    2100

    0

    Update existing record

    5

    20

    15

    Delete record

    1

    2

    2

    Total Read Units: 2422

    Total Write Units: 23

    Therefore, the E-commerce application is estimated to have a workload of 2422 reads per second and 23 writes per second. Use the Capacity Estimator on Oracle Technology Network to input these values and estimate throughput and storage of your application.

Note:

The preceding calculations assume eventual consistent read requests. For an absolutely consistent read request, the operation consumes double the capacity units. Therefore, the read capacity units would be 4844 Read Units.