Using the MR_COUNTER datatype

Using CRDT datatype in a multi-region table

Overview of the MR_COUNTER data type

MR_Counter data type is a counter CRDT. CRDT stands for Conflict-free Replicated Data Type. In a multi-region setup of an Oracle NoSQL Database, a CRDT is a data type that can be replicated across servers where regions can be updated independently and it converges on a correct common state. Changes in the regions are concurrent and not synchronized with one another. In short, CRDTs provide a way for concurrent modifications to be merged across regions without user intervention. Oracle NoSQL Database currently supports the counter CRDT which is called MR_Counter. The MR_COUNTER datatype is a subtype of the INTEGER or LONG or NUMBER data type. You can also use the MR_COUNTER data type in a schema-less JSON field, which means one or more fields in a JSON document can be of MR_COUNTER data type.

Why do you need MR_Counter in a multi-region table?

In a multi-region database configuration, copies of the same data need to be stored in multiple regions. This configuration needs to deal with the fact that the data may be concurrently modified in different regions.

Take an example of a multi-region table in three different regions (where data is stored in three different Oracle NoSQL Database stores). Concurrent updates of the same data in multiple regions, without coordination between the machines hosting the regions, can result in inconsistencies between the regions, which in the general case may not be resolvable. Restoring consistency and data integrity when there are conflicts between updates may require some or all of the updates to be entirely or partially dropped. For example, in the current configuration of a multi-region table in the Oracle NoSQL Database, if the same column (a counter) of a multi-region table is updated across two regions at the same time with different values, a conflict arises.

Currently, the conflict resolution is that the latest write overwrites the value across regions. For example, Region 1 updates column1 with a value R1, and region2 updates column1 with a value R2, and if the region2 update happens after region1, the value of the column (counter) in both the regions becomes R2. This is not what is actually desired. Rather every region should update the column (a counter) at their end and also the system internally needs to determine the sum of the column across regions.

One way to handle this conflict is making serializable/linearizable transactions (one transaction is completed and changes are synchronized in all regions and only then the next transaction happens). A significant problem of having serializable transactions is performance. This is where MR_COUNTER datatype comes in handy. With MR_COUNTER datatype, we don't need serializable transactions and the conflict resolution is taken care of. That is, MR_COUNTER datatype ensures that though data modifications can happen simultaneously on different regions, the data can always be merged into a consistent state. This merge is performed automatically by MR_COUNTER datatype, without requiring any special conflict resolution code or user intervention.

Use-case for MR_COUNTER datatype

Consider a Telecom provider providing different services and packages to its customers. One such service is a "Family Plan" option where a customer and their family share the Data Usage plan. The customer is allocated a free data usage limit for a month which your the customer's entire family collectively uses. When the total usage of customer's family reaches 90 percent of the data limit, the telecom provider sends the customer an alert. Say there are four members in customer's family plan who are spread across different physical regions. The customer needs to get an alert from the telecom provider once the total consumption of their family reaches 90 percent of the free usage. The data is replicated in different regions to cater to latency, throughput, and better performance. That means there are four regions and each has a kvstore containing the details of the customer's data usage. The usage of their family members needs to be updated in different regions and at any point in time, the total usage should be monitored and an alert should be sent if the data usage reaches the limit.

An MR_COUNTER data type is ideal in such a situation to do conflict-free tracking of the data usage across different regions. In the above example, an increment counter in every data region's data store will track the data usage in that region. The consolidated data usage for all regions can be determined by the system at any point without any user intervention. That is the total data usage at any point in time can be easily determined by the system using an MR_COUNTER datatype.

Types of MR_COUNTER Datatype

Currently, Oracle NoSQL Database supports only one type of MR_COUNTER data type. which is Positive-Negative (PN) counter.

Positive-Negative (PN) Counter

A PN counter can be incremented or decremented. Therefore, these can serve as a general-purpose counter. For example, you can use these counters to count the number of users active on a social media website at any point. When the users go offline you need to decrement the counter.

To create a multi-region table with an MR_COUNTER column, See Create multi-region table with an MR_COUNTER column section in the Administrator's Guide.

Create table using MR_COUNTER datatype

You can declare a table column of MR_Counter data type in a CREATE TABLE statement. You can do this only in a multi-region table.

Declare a column with MR_COUNTER data type. MR_COUNTER is a subtype of the INTEGER or LONG or NUMBER data type. You can also declare a field in a JSON column as MR_COUNTER data type.

Syntax:
mr_counter_defintion := AS MR_COUNTER

Semantics:

Declares the type of a column to be the MR_COUNTER type.

MR_COUNTER column Characteristics:

  • MR_COUNTER data type can be used for a multi-region table only. It cannot be used in regular tables.
  • One table can have multiple columns defined as "MR_COUNTER".
  • You create an MR_COUNTER column as part of a CREATE TABLE DDL statement or add an MR_COUNTER column to an existing table with an ALTER TABLE DDL statement.
  • You can define any field in a JSON column as an MR_COUNTER.
  • The default value of an MR_COUNTER data type is always 0.
  • MR_COUNTER cannot be the element of an ARRAY.

Note:

MR_COUNTER cannot be a primary key or be part of a secondary index.

Example using MR_COUNTER data type - Create a PN counter data type in a multi-region table

CREATE Table myTable( name STRING,
                      count INTEGER AS MR_COUNTER,
                      PRIMARY KEY(name)) IN REGIONS DEN,LON;

In the above example, you create a PN counter data type in two regions DEN and LON.

While inserting data into the table, the system inserts the default value of 0 in the following two cases.
  • If you specify the "DEFAULT" keyword in the INSERT clause:

    INSERT INTO myTable VALUES ("Bob", DEFAULT);

  • If you skip the column in the INSERT clause:

    INSERT INTO myTable(name) VALUES ("Bob");

Note:

For MR_COUNTER, the count contributed by a single region cannot overflow but the MR_COUNTER value, which is the sum of counts for all regions can overflow.

For example, the above table myTable has an integer MR_COUNTER and there are two regions DEN and LON. The region DEN cannot increment the count by a total value greater than INTEGER.MAX. This is also applicable for the region LON.

However, if the region DEN increments the count by INTEGER.MAX and the region LON increments it by 2, the value of count becomes (INTEGER.MAX.+2) which has overflowed, but this is allowed.

Example: Create a multi-region table and declare fields in a JSON column as MR_COUNTER data type
CREATE TABLE demoJSONMR(name STRING,
  jsonWithCounter JSON(counter as INTEGER MR_COUNTER,
                       person.count as LONG MR_COUNTER),
  PRIMARY KEY(name)) IN REGIONS FRA,LON;
In the statement above, you create a multi-region table with a STRING column and a column(JSON documents). You are identifying two of the fields in the JSON document as MR_COUNTER data type.. The first field is counter, which is an INTEGER MR_COUNTER data type. The second field is count within an embedded JSON document (person). The count field is of LONG MR_COUNTER data type.

Note:

There is no limit on the number of MR_COUNTER data types created inside a JSON document.

While inserting data into the table, the system always initially inserts the default value of 0 for all MR_COUNTER data types.

Case 1: Skip the value for MR_COUNTER data type:

When you skip providing values for an MR_COUNTER data type, the value of 0 is automatically assigned to it.
INSERT INTO demoJSONMR VALUES ("Anna",
    {
        "id" : 1,
        "person" : {
            "age" : 10,
            "number" : 100
        }
    }
);

SELECT * FROM demoJSONMR;
{"name":"Anna",
 "jsonWithCounter":{"id" : 1,"counter":0,
      "person":{"age":10,"count":0,"number":100}
 }
}

Case 2: Provide a value for the MR_COUNTER data type:

When inserting a row into the multi-region table with a JSON MR_COUNTER, a value of 0 is initially assigned to all MR_COUNTER data types even if you explicitly assign a non-zero value. This also holds good when you try to provide a value that is not an INTEGER or LONG or NUMBER.
INSERT INTO demoJSONMR VALUES ("Anna",
    {
        "id" : 1,
        "counter" : 5,
        "person" : {
            "age" : 10,
            "count" : NULL,
            "number" : 100
        }
    }
);

SELECT * FROM demoJSONMR;
{"name":"Anna",
 "jsonWithCounter":{"id" : 1,"counter":0,
                    "person":{"age":10,"count":0,"number":100}
  }
}