Chapter 5. Determining Your Store's Configuration

Table of Contents

Steps for Changing the Store's Topology
Make the Topology Candidate
Transform the Topology Candidate
View the Topology Candidate
Validate the Topology Candidate
Preview the Topology Candidate
Deploy the Topology Candidate
Verify the Store's Current Topology

A store consists of a number of Storage Nodes. Each Storage Node can host one or more Replication Nodes, based on its capacity value. The term topology is used to describe the distribution of Replication Nodes. A topology is derived from the number and capacity of available Storage Nodes, the number of partitions in the store, and the replication factor of the store's datacenter. Topology layouts are also governed by a set of rules that maximize the availability of the store.

The initial configuration, or topology, of the store is set when the store is created. Over time, it may be necessary to change the topology of the store. There are several reasons for such a change:

  1. You need to replace or upgrade an existing Storage Node.

  2. You need to increase read throughput. This is done by increasing the replication factor and creating more copies of the store's data, which can be used to service read-only requests.

  3. You need to increase write throughput. Since each shard has a single master node, distributing the data in the store over a larger number of shards provides the store with more nodes that can execute write operations.

You change the store's configuration by changing the number or capacity of Storage Nodes available, or the replication factor of a datacenter. To change from one configuration to another, you either create a new initial topology, or you clone an existing topology and modify it into your target topology. You then deploy this target topology.

Note

The deployment of the target topology is potentially a long-running operation and the time required scales with the amount of data that must be moved. During the deployment, the system updates the topology at each step. Because of that, the store passes through intermediate topologies which were not explicitly created by the user.

This chapter discusses how configuration, or topological, changes are made in a store.

Note

Configuration changes should not be made while a snapshot is being taken and vice versa. When making configuration changes it is safest to first create a snapshot as a backup and then make the changes. For additional information on creating snapshots, see Taking a Snapshot.

Steps for Changing the Store's Topology

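When you change your topology, you should go through these steps:

  1. Make the Topology Candidate

  2. Transform the Topology Candidate

  3. View the Topology Candidate

  4. Validate the Topology Candidate

  5. Preview the Topology Candidate

  6. Deploy the Topology Candidate

  7. Verify the Store's Current Topology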

Creating a new topology may be an iterative process. You may want to try different options to see what may be best before the changes are deployed. In the end, examine the topology candidate and decide if it is satisfactory. If not, apply more transformations, or start over with different parameters. You can view and validate topology candidates to decide if they are appropriate.

The possible transformations include redistributing data, increasing replication factor, and rebalancing. These are described in Transform the Topology Candidate.

The following sections walk you through the process of changing the configuration for your store using the Administration Command Line Interface.

Make the Topology Candidate

To create the first topology candidate for an initial deployment, before any Replication Nodes exist, you use the topology create command. The topology create command takes a topology name, a pool name and the number of partitions as arguments.

For example:

kv-> topology create -name NewTopo -pool BostonPool -partitions 300 

This initial topology candidate can be deployed, without any further transformations, using the plan deploy-topology command.

After the store is deployed, topology candidates are created with the topology clone command. A clone's source can be another topology candidate, or the current, deployed topology. The topology clone command takes the following arguments:

  • -from <from topology>

    The name of the source topology candidate.

  • -name <to topology>

    The name of the clone.

For example:

kv-> topology clone -from NewTopo -name CloneTopo 

Also, there is a variant of the topology clone command that takes the following arguments:

  • -current

    If specified, use the current, deployed topology as a source.

  • -name <to topology>

    The name of the clone.

For example:

kv-> topology clone -current -name ClonedTopo 

Transform the Topology Candidate

After the initial deployment, the store is changed by deploying a topology candidate that differs from the topology currently in effect. This target topology is generated by transforming a topology candidate using the topology redistribute, rebalance, or change-repfactor command.

All topologies must obey the following rules:

  1. Each Replication Node from the same shard must reside on a different Storage Node. This rule prevents a single Storage Node failure from causing multiple points of failure for a single shard.

  2. The number of Replication Nodes assigned to a Storage Node must be less than or equal to the capacity of that Storage Node.

  3. A datacenter must have one or more Replication Nodes from each shard.
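The three rules above can be expressed as simple checks over a candidate layout. The following is a minimal sketch in Python, not the store's own validation code. The function name, the data shapes, and the identifiers (rg1, sn1, dc1) are illustrative assumptions.

```python
from collections import Counter


def find_violations(placements, capacity, datacenter_of):
    """Return a list of rule violations for a candidate layout.

    placements:    list of (shard, storage_node) pairs, one per Replication Node
    capacity:      dict mapping storage_node -> capacity value
    datacenter_of: dict mapping storage_node -> datacenter name
    All three shapes are assumptions made for this sketch.
    """
    violations = []

    # Rule 1: a Storage Node may not host two Replication Nodes of one shard.
    for (shard, sn), count in sorted(Counter(placements).items()):
        if count > 1:
            violations.append(f"rule 1: {sn} hosts {count} nodes of {shard}")

    # Rule 2: Replication Nodes on a Storage Node may not exceed its capacity.
    per_sn = Counter(sn for _, sn in placements)
    for sn, count in sorted(per_sn.items()):
        if count > capacity[sn]:
            violations.append(
                f"rule 2: {sn} hosts {count} nodes, capacity {capacity[sn]}"
            )

    # Rule 3: every datacenter holds at least one node of every shard.
    shards = {shard for shard, _ in placements}
    covered = {(datacenter_of[sn], shard) for shard, sn in placements}
    for dc in sorted(set(datacenter_of.values())):
        for shard in sorted(shards):
            if (dc, shard) not in covered:
                violations.append(f"rule 3: {dc} has no node of {shard}")

    return violations


capacity = {"sn1": 1, "sn2": 1, "sn3": 1}
datacenter_of = {"sn1": "dc1", "sn2": "dc1", "sn3": "dc1"}
layout = [("rg1", "sn1"), ("rg1", "sn2"), ("rg1", "sn3")]
print(find_violations(layout, capacity, datacenter_of))
```

A layout that places two Replication Nodes of rg1 on sn1 would be reported twice in this sketch: once under rule 1 and once under rule 2, because sn1 has a capacity of 1.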

The topology rebalance, redistribute, and change-repfactor commands can only make changes to the topology candidate if there are additional, or changed, Storage Nodes available. They use the new resources to rearrange Replication Nodes and partitions so that the topology complies with the topology rules and the store's read or write throughput improves.

The following are scenarios for how you might expand the store.

Increase Data Distribution

You can increase data distribution in order to enhance write throughput by using the topology redistribute command. The redistribute command only works if new Storage Nodes are added to permit the creation of new shards. Partitions are distributed across the new shards, resulting in more Replication Nodes to service write operations.

The following example demonstrates adding a set of Storage Nodes and redistributing the data to those nodes. In this example, four nodes are added because the datacenter's replication factor is four, and each new shard requires four nodes to satisfy the replication requirements:

kv-> plan deploy-sn -dc dc1 -host node05 -port 5008 -wait
Executed plan 7, waiting for completion...
Plan 7 ended successfully
kv-> plan deploy-sn -dc dc1 -host node06 -port 5010 -wait
Executed plan 8, waiting for completion...
Plan 8 ended successfully
kv-> plan deploy-sn -dc dc1 -host node07 -port 5012 -wait
Executed plan 9, waiting for completion...
Plan 9 ended successfully
kv-> plan deploy-sn -dc dc1 -host node08 -port 5014 -wait
Executed plan 10, waiting for completion...
Plan 10 ended successfully
kv-> pool join -name BostonPool -sn sn5
kv-> pool join -name BostonPool -sn sn6
kv-> pool join -name BostonPool -sn sn7
kv-> pool join -name BostonPool -sn sn8
kv-> topology redistribute -name NewTopo -pool BostonPool

The redistribute command uses added capacity to create new shards and to migrate partitions to those shards. The command fails if the new number of shards is not greater than the current number of shards.
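The relationship between capacity, replication factor, and shard count can be sketched as simple arithmetic. The Python below is an illustration only. It assumes that each Replication Node consumes one unit of Storage Node capacity and that every shard needs as many Replication Nodes as the replication factor. It also ignores the rule that nodes of one shard must sit on different Storage Nodes, so the result is an upper bound. Both function names are hypothetical.

```python
def shards_supported(capacities, replication_factor):
    """Upper bound on the number of shards a pool can host.

    capacities: iterable of Storage Node capacity values (assumed input shape).
    """
    return sum(capacities) // replication_factor


def can_redistribute(current_shards, capacities, replication_factor):
    """True if the pool has room for at least one shard beyond the current count."""
    return shards_supported(capacities, replication_factor) > current_shards


# Four Storage Nodes of capacity 1 with replication factor 4 support one shard.
print(shards_supported([1, 1, 1, 1], 4))
# After four more Storage Nodes join the pool, a second shard fits.
print(can_redistribute(1, [1] * 8, 4))
```

Under these assumptions, adding only three Storage Nodes of capacity 1 would leave the bound at one shard, which is why the example above adds four.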

Note

Redistribute commands should not be issued against a mixed shard store. A mixed shard store has shards whose Replication Nodes are operating with different software versions of Oracle NoSQL Database.

The system goes through these steps when it is redistributing a topology candidate:

  1. New Replication Nodes are created for each shard and are assigned to Storage Nodes following the topology rules described earlier. It may be necessary to move existing Replication Nodes to different Storage Nodes to best use available resources while still complying with the topology rules.

  2. Partitions are distributed evenly among all shards. Partitions in overpopulated shards move to the shards with the fewest partitions.

  3. You do not specify which partitions are moved.
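The partition step can be pictured as repeatedly moving one partition from the most populated shard to the least populated one until the counts differ by at most one. The Python below is a sketch of that idea under stated assumptions. It is not the algorithm the store actually uses, and the function name, shard names, and data shapes are hypothetical.

```python
def redistribute_partitions(partitions_per_shard, new_shards):
    """Even out partitions across existing and newly created shards.

    partitions_per_shard: dict mapping shard -> list of partition ids
    new_shards:           names of new, empty shards
    Returns (layout, moves); each move is (partition, from_shard, to_shard).
    """
    layout = {shard: list(parts) for shard, parts in partitions_per_shard.items()}
    for shard in new_shards:
        layout.setdefault(shard, [])

    moves = []
    while True:
        fullest = max(layout, key=lambda s: len(layout[s]))
        emptiest = min(layout, key=lambda s: len(layout[s]))
        # Stop once no shard holds more than one partition above any other.
        if len(layout[fullest]) - len(layout[emptiest]) <= 1:
            break
        partition = layout[fullest].pop()
        layout[emptiest].append(partition)
        moves.append((partition, fullest, emptiest))
    return layout, moves


layout, moves = redistribute_partitions({"rg1": list(range(300))}, ["rg2"])
print({shard: len(parts) for shard, parts in layout.items()}, len(moves))
```

The caller never names a partition to move. The sketch picks them itself, which mirrors the last point in the list above.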

Increase Replication Factor

You can increase the replication factor and create more copies of the data to improve read throughput and availability by using the topology change-repfactor command. More Replication Nodes are added to each shard so that it has the requisite number of nodes. The new Replication Nodes are populated from existing nodes in the shard. Since every shard in a datacenter has the same replication factor, if there is a large number of shards, this command may require a significant number of new Storage Nodes to be successful.

For additional information on how to identify your replication factor and its implications, see Replication Factor.

The following example increases the replication factor of the store to 4. The administrator deploys a new Storage Node and adds it to the Storage Node pool. She then clones the existing topology and transforms it to use a new replication factor of 4.

kv-> plan deploy-sn -dc dc1 -host node09 -port 5016 -wait
Executed plan 11, waiting for completion...
Plan 11 ended successfully
kv-> pool join -name BostonPool -sn sn9
kv-> topology clone -current -name NewTopo
kv-> topology change-repfactor -name NewTopo -pool BostonPool -rf 4 -dc dc1
kv-> plan deploy-topology -name NewTopo -wait
Executed plan 12, waiting for completion...
Plan 12 ended successfully

The change-repfactor command fails if:

  1. The new replication factor is less than or equal to the current replication factor.

  2. The Storage Nodes specified by the storage node pool do not have enough capacity to host the required new Replication Nodes.
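The two failure conditions can be checked ahead of time with simple arithmetic. The following Python sketch assumes that raising the replication factor adds one Replication Node per shard for each step of the increase, and that each new node needs one free unit of capacity somewhere in the pool. It does not check the rule that nodes of one shard must be on different Storage Nodes. The function name and parameters are hypothetical.

```python
def check_change_repfactor(current_rf, new_rf, num_shards, capacity, hosted):
    """Return a reason the change would fail, or None if it looks feasible.

    capacity: dict mapping storage_node -> capacity value for the pool
    hosted:   dict mapping storage_node -> Replication Nodes already hosted
    """
    if new_rf <= current_rf:
        return "new replication factor must be greater than the current one"

    # Every shard gains (new_rf - current_rf) Replication Nodes.
    needed = (new_rf - current_rf) * num_shards
    free = sum(max(0, capacity[sn] - hosted.get(sn, 0)) for sn in capacity)
    if free < needed:
        return f"pool offers {free} free slots but {needed} are required"
    return None


pool = {"sn1": 1, "sn2": 1, "sn3": 1, "sn9": 1}
hosted = {"sn1": 1, "sn2": 1, "sn3": 1}
print(check_change_repfactor(3, 4, 1, pool, hosted))
```

With a single shard, one new Storage Node of capacity 1 is enough to go from a replication factor of 3 to 4. With ten shards, the same change would need ten free slots, which is why stores with many shards may need many new Storage Nodes.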

Balance a Non-Compliant Topology

Topologies must obey the rules described in Transform the Topology Candidate. Changes to the physical characteristics of the store can make the current topology of the store violate those rules. For example, after performance tuning, you may want to decrease the capacity of a Storage Node. If that node was already hosting the maximum permissible number of Replication Nodes, reducing the capacity will put the store out of compliance with the capacity rules.

You can balance a non-compliant configuration by using the topology rebalance command. This command requires a topology candidate name and a Storage Node pool name.

The following example examines the topology candidate named NewTopo for any violations of the topology rules. If no improvements are needed as a result of this examination, the topology candidate is unchanged. However, if improvements are needed, then the topology rebalance command will move or create Replication Nodes, using the Storage Nodes in the BostonPool pool, in order to correct any violations. The command does not under any circumstances create additional shards.

kv-> topology rebalance -name NewTopo -pool BostonPool 

If there is an insufficient number of Storage Nodes, the topology rebalance command may not be able to correct all violations. In that case, the command makes as much progress as possible, and warns of remaining issues.
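The "as much progress as possible" behavior can be illustrated with a small Python sketch. It handles only capacity violations: Replication Nodes are moved off over-capacity Storage Nodes onto nodes with free capacity that do not already host the same shard, and anything that cannot be fixed is reported. This is an assumption-laden illustration, not the command's real algorithm, and all names are hypothetical.

```python
def rebalance(placements, capacity):
    """Move Replication Nodes off over-capacity Storage Nodes where possible.

    placements: dict mapping storage_node -> list of shards it hosts
    capacity:   dict mapping storage_node -> capacity (every pool member)
    Returns (layout, remaining), where remaining lists unresolved issues.
    No shards are created; nodes are only relocated.
    """
    layout = {sn: list(placements.get(sn, [])) for sn in capacity}
    remaining = []

    for sn in sorted(layout):
        while len(layout[sn]) > capacity[sn]:
            moved = False
            for idx, shard in enumerate(layout[sn]):
                # A target needs free capacity and must not host this shard.
                target = next(
                    (t for t in sorted(layout)
                     if t != sn
                     and len(layout[t]) < capacity[t]
                     and shard not in layout[t]),
                    None,
                )
                if target is not None:
                    layout[target].append(layout[sn].pop(idx))
                    moved = True
                    break
            if not moved:
                excess = len(layout[sn]) - capacity[sn]
                remaining.append(f"{sn} is over capacity by {excess}")
                break
    return layout, remaining


# sn1's capacity was reduced from 2 to 1 while it hosts nodes of two shards.
hosting = {"sn1": ["rg1", "rg2"], "sn2": ["rg1"], "sn3": ["rg2"]}
print(rebalance(hosting, {"sn1": 1, "sn2": 1, "sn3": 1, "sn4": 1}))
print(rebalance(hosting, {"sn1": 1, "sn2": 1, "sn3": 1}))
```

In the first call, the empty sn4 absorbs one Replication Node and no issues remain. In the second call there is no spare Storage Node, so the layout is left as it is and the remaining capacity issue is reported.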

View the Topology Candidate

You can view details of the topology candidate or a deployed topology by using the topology view command. The command takes a topology name as an argument. The topology view command shows the store name, number of partitions, shards, replication factor, host names, and capacities in the specified topology all at once.

Validate the Topology Candidate

You can validate the topology candidate or a deployed topology by using the topology validate command. The topology validate command takes a topology name as an argument. If no topology is specified, the current topology is validated. Validation makes sure that the topology candidate obeys the topology rules described in Transform the Topology Candidate. Validation generates "violations" and "notes".

Violations are issues that can cause problems and should be investigated.

Notes are informational and highlight configuration oddities that may be potential issues, but may be expected.

Preview the Topology Candidate

You should preview the changes that would be made for the specified topology candidate relative to a starting topology. You use the topology preview command to do this. This command takes the following arguments:

  • -name <topology name>

    A string to identify the topology.

  • -start <from topology>

    The starting topology to compare against. If -start is not specified, the current topology is used.

This command should be used before deploying a new topology.
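Conceptually, a preview is a difference between a starting topology and a candidate. The Python sketch below illustrates that idea for Replication Node placement only. It does not reproduce the output of the topology preview command, and the function name and data shapes are assumptions.

```python
def preview_changes(start, candidate):
    """List the placement changes needed to go from start to candidate.

    start, candidate: dict mapping shard -> set of Storage Nodes hosting
    that shard's Replication Nodes (an assumed representation).
    """
    changes = []
    for shard in sorted(candidate):
        before = start.get(shard, set())
        after = candidate[shard]
        if shard not in start:
            changes.append(f"create shard {shard}")
        for sn in sorted(after - before):
            changes.append(f"add Replication Node for {shard} on {sn}")
        for sn in sorted(before - after):
            changes.append(f"remove Replication Node for {shard} from {sn}")
    return changes


current = {"rg1": {"sn1", "sn2", "sn3"}}
target = {"rg1": {"sn1", "sn2", "sn3", "sn9"}}
for line in preview_changes(current, target):
    print(line)
```

An empty result means the candidate places every Replication Node where the starting topology already has it, so deploying it would not relocate any nodes in this simplified model.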

Deploy the Topology Candidate

With a satisfactory topology candidate, you can use the admin service to generate and execute a plan which migrates the store to the new topology.

You can deploy the topology candidate by using the plan deploy-topology command. This command takes a topology name as an argument.

While the plan is executing, you can monitor the plan's progress. You have several options:

  • The plan can be interrupted then retried, or canceled.

  • Other, limited plans may be executed while a transformation plan is in progress to deal with ongoing problems or failures.

By default, the plan deploy-topology command refuses to deploy a topology candidate if it introduces new violations of the topology rules. This behavior can be overridden by using the optional -force plan flag on that command.

Verify the Store's Current Topology

You can verify the store's current topology by using the verify command. The verify command checks the current, deployed topology to make sure it obeys the topology rules described in Transform the Topology Candidate.

You should examine the new topology and decide if it is satisfactory. If it is not, apply more transformations, or start over with different parameters.