Chapter 5. Determining Your Store's Configuration

Table of Contents

Steps for Changing the Store's Topology
Make the Topology Candidate
Transform the Topology Candidate
View the Topology Candidate
Validate the Topology Candidate
Preview the Topology Candidate
Deploy the Topology Candidate
Verify the Store's Current Topology

A store consists of a number of Storage Nodes. Each Storage Node can host one or more Replication Nodes, based on its capacity value. The term topology is used to describe the distribution of Replication Nodes. A topology is derived from the number and capacity of available Storage Nodes, the number of partitions in the store, and the replication factor of the store's datacenter. Topology layouts are also governed by a set of rules that maximize the availability of the store.

The initial configuration, or topology, of the store is set when the store is created. Over time, it may become necessary to change the store's topology. There are several reasons for such a change:

  1. You need to replace or upgrade an existing Storage Node.

  2. You need to increase read throughput. This is done by increasing the replication factor, which creates more copies of the store's data that can be used to service read-only requests.

  3. You need to increase write throughput. Since each shard has a single master node, distributing the data in the store over a larger number of shards provides the store with more nodes that can execute write operations.

You change the store's configuration by changing the number or capacity of available Storage Nodes, or the replication factor of a datacenter. To change from one configuration to another, you either create a new initial topology, or you clone an existing topology and modify it into your target topology. You then deploy this target topology.


The deployment of the target topology is potentially a long-running operation and the time required scales with the amount of data that must be moved. During the deployment, the system updates the topology at each step. Because of that, the store passes through intermediate topologies which were not explicitly created by the user.

This chapter discusses how configuration, or topological, changes are made in a store.


Note: Configuration changes should not be made while a snapshot is being taken, and vice versa. When making configuration changes, it is safest to first create a snapshot as a backup and then make the changes. For additional information on creating snapshots, see Taking a Snapshot.

Steps for Changing the Store's Topology

When you change your topology, you should go through these steps:

  1. Make the Topology Candidate

  2. Transform the Topology Candidate

  3. View the Topology Candidate

  4. Validate the Topology Candidate

  5. Preview the Topology Candidate

  6. Deploy the Topology Candidate

  7. Verify the Store's Current Topology

Creating a new topology may be an iterative process. You may want to try different options to see which works best before deploying the changes. When you are done, examine the topology candidate and decide whether it is satisfactory. If it is not, apply further transformations, or start over with different parameters. You can view and validate topology candidates to decide if they are appropriate.

The possible transformations include redistributing data, increasing replication factor, and rebalancing. These are described in Transform the Topology Candidate.

The following sections walk you through the process of changing the configuration for your store using the Administration Command Line Interface.

Make the Topology Candidate

To create the first topology candidate for an initial deployment, before any Replication Nodes exist, you use the topology create command. The topology create command takes a topology name, a pool name and the number of partitions as arguments.

For example:

kv-> topology create -name NewTopo -pool BostonPool -partitions 300 

This initial topology candidate can be deployed, without any further transformations, using the plan deploy-topology command.
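
For example, assuming the NewTopo candidate created above (the plan number shown is illustrative and will differ in your store):

kv-> plan deploy-topology -name NewTopo -wait
Executed plan 6, waiting for completion...
Plan 6 ended successfully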

After the store is deployed, topology candidates are created with the topology clone command. A clone's source can be another topology candidate, or the current, deployed topology. The topology clone command takes the following arguments:

  • -from <from topology>

    The name of the source topology candidate.

  • -name <to topology>

    The name of the clone.

For example:

kv-> topology clone -from NewTopo -name CloneTopo 

Also, there is a variant of the topology clone command that takes the following arguments:

  • -current

    If specified, use the current, deployed topology as a source.

  • -name <to topology>

    The name of the clone.

For example:

kv-> topology clone -current -name ClonedTopo 

Transform the Topology Candidate

After the initial deployment, the store is changed by deploying a topology candidate that differs from the topology currently in effect. This target topology is generated by transforming a topology candidate using the topology redistribute, rebalance, or change-repfactor command.

All topologies must obey the following rules:

  1. Each Replication Node from the same shard must reside on a different Storage Node. This rule prevents a single Storage Node failure from causing multiple points of failure for a single shard.

  2. The number of Replication Nodes assigned to a Storage Node must be less than or equal to that Storage Node's capacity.

  3. A datacenter must have one or more Replication Nodes from each shard.

The topology rebalance, redistribute, and change-repfactor commands can only make changes to the topology candidate if additional, or changed, Storage Nodes are available. They use the new resources to rearrange Replication Nodes and partitions so that the topology complies with the topology rules and the store improves its read or write throughput.

The following scenarios illustrate how you might expand the store.

Increase Data Distribution

You can increase data distribution in order to enhance write throughput by using the topology redistribute command. The redistribute command only works if new Storage Nodes are added to permit the creation of new shards. Partitions are distributed across the new shards, resulting in more Replication Nodes to service write operations.

The following example demonstrates adding a set of Storage Nodes and redistributing the data to those nodes. In this example, four Storage Nodes are added because the datacenter's replication factor is four, and each new shard requires four Replication Nodes to satisfy the replication requirements:

kv-> plan deploy-sn -dc dc1 -host node05 -port 5008 -wait
Executed plan 7, waiting for completion...
Plan 7 ended successfully
kv-> plan deploy-sn -dc dc1 -host node06 -port 5010 -wait
Executed plan 8, waiting for completion...
Plan 8 ended successfully
kv-> plan deploy-sn -dc dc1 -host node07 -port 5012 -wait
Executed plan 9, waiting for completion...
Plan 9 ended successfully
kv-> plan deploy-sn -dc dc1 -host node08 -port 5014 -wait
Executed plan 10, waiting for completion...
Plan 10 ended successfully
kv-> pool join -name BostonPool -sn sn5
kv-> pool join -name BostonPool -sn sn6
kv-> pool join -name BostonPool -sn sn7
kv-> pool join -name BostonPool -sn sn8
kv-> topology redistribute -name NewTopo -pool BostonPool

The redistribute command uses the added capacity to create new shards and to migrate partitions to those shards. The command fails if the added Storage Nodes do not permit the creation of at least one new shard.


Note: Redistribute commands should not be issued against a mixed shard store. A mixed shard store has shards whose Replication Nodes are operating with different software versions of Oracle NoSQL Database.

The system goes through these steps when it is redistributing a topology candidate:

  1. New Replication Nodes are created for each shard and are assigned to Storage Nodes following the topology rules described earlier. It may be necessary to move existing Replication Nodes to different Storage Nodes to best use available resources while still complying with the topology rules.

  2. Partitions are distributed evenly across all shards. Partitions in over-populated shards are moved to the shards with the fewest partitions. You cannot specify which partitions are moved.

Increase Replication Factor

You can increase the replication factor and create more copies of the data to improve read throughput and availability by using the topology change-repfactor command. More Replication Nodes are added to each shard so that it has the requisite number of nodes. The new Replication Nodes are populated from existing nodes in the shard. Since every shard in a datacenter has the same replication factor, if there are a large number of shards, this command may require a significant number of new Storage Nodes to be successful.

For additional information on how to identify your replication factor and its implications, see Replication Factor.

The following example increases the replication factor of the store to 4. The administrator deploys a new Storage Node and adds it to the Storage Node pool. She then clones the existing topology and transforms it to use a new replication factor of 4.

kv-> plan deploy-sn -dc dc1 -host node09 -port 5016 -wait
Executed plan 11, waiting for completion...
Plan 11 ended successfully
kv-> pool join -name BostonPool -sn sn9
kv-> topology clone -current -name NewTopo
kv-> topology change-repfactor -name NewTopo -pool BostonPool -rf 4 -dc dc1
kv-> plan deploy-topology -name NewTopo -wait
Executed plan 12, waiting for completion...
Plan 12 ended successfully

The change-repfactor command fails if:

  1. The new replication factor is less than or equal to the current replication factor.

  2. The Storage Nodes specified by the storage node pool do not have enough capacity to host the required new Replication Nodes.

Balance a Non-Compliant Topology

Topologies must obey the rules described in Transform the Topology Candidate. Changes to the physical characteristics of the store can make the current topology of the store violate those rules. For example, after performance tuning, you may want to decrease the capacity of a Storage Node. If that node was already hosting the maximum permissible number of Replication Nodes, reducing the capacity will put the store out of compliance with the capacity rules.

You can balance a non-compliant configuration by using the topology rebalance command. This command requires a topology candidate name and a Storage Node pool name.

The following example examines the topology candidate named NewTopo for any violations to the topology rules. If no improvements are needed as a result of this examination, the topology candidate is unchanged. However, if improvements are needed, then the topology rebalance command will move or create Replication Nodes, using the Storage Nodes in the BostonPool pool, in order to correct any violations. The command does not under any circumstances create additional shards.

kv-> topology rebalance -name NewTopo -pool BostonPool 

If there are too few Storage Nodes, the topology rebalance command may not be able to correct all violations. In that case, the command makes as much progress as possible and warns of remaining issues.

View the Topology Candidate

You can view details of a topology candidate or a deployed topology by using the topology view command. The command takes a topology name as an argument. The topology view command displays, all at once: the store name, number of partitions, shards, replication factor, host names, and the capacity of each Storage Node in the specified topology.
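
For example, to display the NewTopo candidate created earlier (substitute the name of your own topology candidate):

kv-> topology view -name NewTopo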

Validate the Topology Candidate

You can validate the topology candidate or a deployed topology by using the topology validate command. The topology validate command takes a topology name as an argument. If no topology is specified, the current topology is validated. Validation makes sure that the topology candidate obeys the topology rules described in Transform the Topology Candidate. Validation generates "violations" and "notes".

Violations are issues that can cause problems and should be investigated.

Notes are informational and highlight configuration oddities that may be potential issues, but may be expected.
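
For example, to validate the NewTopo candidate from the earlier examples:

kv-> topology validate -name NewTopo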

Preview the Topology Candidate

You should preview the changes that would be made for the specified topology candidate relative to a starting topology. You use the topology preview command to do this. This command takes the following arguments:

  • -name <topology name>

    A string to identify the topology candidate to preview.

  • -start <from topology>

    The name of the starting topology against which the comparison is made. If -start is not specified, the current topology is used. This command should be used before deploying a new topology.
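
For example, to preview the changes the NewTopo candidate would make relative to the currently deployed topology:

kv-> topology preview -name NewTopo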

Deploy the Topology Candidate

With a satisfactory topology candidate, you can use the admin service to generate and execute a plan which migrates the store to the new topology.

You can deploy the topology candidate by using the plan deploy-topology command. This command takes a topology name as an argument.

While the plan is executing, you can monitor the plan's progress. You have several options:

  • The plan can be interrupted then retried, or canceled.

  • Other, limited plans may be executed while a transformation plan is in progress to deal with ongoing problems or failures.
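
For example, you can list plans and their status with the show plans command, and interrupt or cancel a running plan by its id (the plan id shown is illustrative):

kv-> show plans
kv-> plan interrupt -id 12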

By default, the plan deploy-topology command refuses to deploy a topology candidate if it introduces new violations of the topology rules. This behavior can be overridden by using the -force optional plan flag on that command.
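
For example, to force deployment of the NewTopo candidate despite new rule violations (use -force with caution, and only after reviewing the reported violations):

kv-> plan deploy-topology -name NewTopo -force -wait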

Verify the Store's Current Topology

You can verify the store's current topology by using the verify command. The verify command checks the current, deployed topology to make sure it obeys the topology rules described in Transform the Topology Candidate.
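
For example, to check the current, deployed topology against the topology rules:

kv-> verify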

You should examine the new topology and decide whether it is satisfactory. If it is not, apply further transformations, or start over with different parameters.