Deploying your store across multiple zones makes optimal use of your available physical facilities and provides fault isolation, keeping your data available if a single zone fails. Each zone holds a complete copy of your store, including a copy of all the shards. Because at least one replica is located in every zone, reads are always possible so long as your data's consistency guarantees can be met. Writes can also continue in the event of a zone loss so long as quorum can be maintained.
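For example, with a Primary Replication Factor of 3, a simple majority is two replicas, so each shard can still acknowledge writes after losing the replica in any one zone.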
You can specify a different replication factor for each zone. A replication factor can then be quantified as one of the following:
Zone Replication Factor
The number of copies, or replicas, maintained in a zone.
Primary Replication Factor
The total number of replicas in all Primary zones. This replication factor controls the number of replicas that participate in elections and acknowledgments. For additional information on how to identify the Primary Replication Factor and its implications, see Replication Factor.
Secondary Replication Factor
The total number of replicas in all Secondary zones. Secondary replicas provide additional read-only copies of the data.
Store Replication Factor
The total number of replicas across all zones in the store.
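For example, a store with three primary zones, each maintaining one replica, and one secondary zone, also maintaining one replica, has a Primary Replication Factor of 3, a Secondary Replication Factor of 1, and a Store Replication Factor of 4.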
Zones located near each other avoid bottlenecks due to throughput limitations, and reduce latency during elections and commits.
Zones come in two types. Primary zones contain nodes which can serve as masters or replicas. Zones are created as primary zones by default. For good performance, primary zones should be connected by low latency networks so that they can participate efficiently in master elections and commit acknowledgments.
Secondary zones contain nodes that can serve only as replicas. Secondary zones can be used to provide low latency read access to data at a distant location, or to maintain an extra copy of the data for greater redundancy or read capacity. Because the nodes in secondary zones do not participate in master elections or commit acknowledgments, secondary zones can be connected to other zones by higher latency networks; the additional latency does not interfere with those time-critical operations.
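For illustration, the commands below sketch how such a zone could be added; the zone name Boston and port 5070 are hypothetical and not part of the deployment that follows. The deploy-zone plan accepts a -type flag, and Storage Nodes can then be deployed into the new zone by name:

kv-> plan deploy-zone -name Boston -rf 1 -type secondary -wait
kv-> plan deploy-sn -znname Boston -host localhost -port 5070 -wait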
Connecting primary zones with high throughput, low latency networks yields the best performance. You can use networks with higher latency to connect secondary zones, so long as the connections provide sufficient throughput to support replication and are reliable enough that temporary interruptions do not disrupt that throughput.
Because any primary zone can host master nodes, write performance may be reduced if primary zones are connected through a limited throughput and/or high latency network link.
The following steps walk you through the process of deploying six Storage Nodes across three primary zones. You can then verify that each shard has a replica in every zone, so that service can continue in the event of a zone failure.
In the following example, six Storage Node Agents are started on the same host, but in a production environment each Storage Node Agent should be hosted on its own physical machine.
For a new store, create the initial "boot config" configuration files using the makebootconfig utility:
> java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar makebootconfig \
-root KVROOT -host localhost -config config1.xml \
-port 5010 -admin 5011 -harange 5012,5015 \
-memory_mb 0 -store-security none
java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar makebootconfig \
-root KVROOT -host localhost -config config2.xml \
-port 5020 -harange 5022,5025 \
-memory_mb 0 -store-security none
java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar makebootconfig \
-root KVROOT -host localhost -config config3.xml \
-port 5030 -admin 5031 -harange 5032,5035 \
-memory_mb 0 -store-security none
java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar makebootconfig \
-root KVROOT -host localhost -config config4.xml \
-port 5040 -harange 5042,5045 \
-memory_mb 0 -store-security none
java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar makebootconfig \
-root KVROOT -host localhost -config config5.xml \
-port 5050 -admin 5051 -harange 5052,5055 \
-memory_mb 0 -store-security none
java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar makebootconfig \
-root KVROOT -host localhost -config config6.xml \
-port 5060 -harange 5062,5065 \
-memory_mb 0 -store-security none
Using each of the configuration files, start all of the Storage Node Agents:
> nohup java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar \
start -root KVROOT -config config1.xml &
[1] 12019
> nohup java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar \
start -root KVROOT -config config2.xml &
[2] 12020
> nohup java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar \
start -root KVROOT -config config3.xml &
[3] 12021
> nohup java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar \
start -root KVROOT -config config4.xml &
[4] 12022
> nohup java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar \
start -root KVROOT -config config5.xml &
[5] 12023
> nohup java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar \
start -root KVROOT -config config6.xml &
[6] 12024
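Before configuring the store, you can confirm that all six agents are running by listing the Java processes with the JDK's jps utility; the output shown here is illustrative:

> jps -m
12019 kvstore.jar start -root KVROOT -config config1.xml
12020 kvstore.jar start -root KVROOT -config config2.xml
...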
Start the CLI:
> java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar runadmin -host \
localhost -port 5010
kv->
Name your store:
kv-> configure -name MetroArea
Store configured: MetroArea
kv->
Deploy the first Storage Node with administration process in the Manhattan Zone:
kv-> plan deploy-zone -name Manhattan -rf 1 -wait
Executed plan 1, waiting for completion...
Plan 1 ended successfully
kv-> plan deploy-sn -zn 1 -host localhost -port 5010 -wait
Executed plan 2, waiting for completion...
Plan 2 ended successfully
kv-> plan deploy-admin -sn sn1 -port 5011 -wait
Executed plan 3, waiting for completion...
Plan 3 ended successfully
kv-> pool create -name SNs
kv-> pool join -name SNs -sn sn1
Added Storage Node(s) [sn1] to pool SNs
Deploy a second Storage Node in the Manhattan Zone:
kv-> plan deploy-sn -znname Manhattan -host localhost \
-port 5020 -wait
Executed plan 4, waiting for completion...
Plan 4 ended successfully
kv-> pool join -name SNs -sn sn2
Added Storage Node(s) [sn2] to pool SNs
Deploy the first Storage Node with administration process in the Jersey City Zone:
kv-> plan deploy-zone -name JerseyCity -rf 1 -wait
Executed plan 5, waiting for completion...
Plan 5 ended successfully
kv-> plan deploy-sn -znname JerseyCity -host localhost \
-port 5030 -wait
Executed plan 6, waiting for completion...
Plan 6 ended successfully
kv-> plan deploy-admin -sn sn3 -port 5031 -wait
Executed plan 7, waiting for completion...
Plan 7 ended successfully
kv-> pool join -name SNs -sn sn3
Added Storage Node(s) [sn3] to pool SNs
Deploy a second Storage Node in the Jersey City Zone:
kv-> plan deploy-sn -znname JerseyCity -host localhost \
-port 5040 -wait
Executed plan 8, waiting for completion...
Plan 8 ended successfully
kv-> pool join -name SNs -sn sn4
Added Storage Node(s) [sn4] to pool SNs
Deploy the first Storage Node with administration process in the Queens Zone:
kv-> plan deploy-zone -name Queens -rf 1 -wait
Executed plan 9, waiting for completion...
Plan 9 ended successfully
kv-> plan deploy-sn -znname Queens -host localhost -port 5050 -wait
Executed plan 10, waiting for completion...
Plan 10 ended successfully
kv-> plan deploy-admin -sn sn5 -port 5051 -wait
Executed plan 11, waiting for completion...
Plan 11 ended successfully
kv-> pool join -name SNs -sn sn5
Added Storage Node(s) [sn5] to pool SNs
Deploy a second Storage Node in the Queens Zone:
kv-> plan deploy-sn -znname Queens -host localhost \
-port 5060 -wait
Executed plan 12, waiting for completion...
Plan 12 ended successfully
kv-> pool join -name SNs -sn sn6
Added Storage Node(s) [sn6] to pool SNs
Create and deploy a topology:
kv-> topology create -name Topo1 -pool SNs -partitions 100
Created: Topo1
kv-> plan deploy-topology -name Topo1 -wait
Executed plan 13, waiting for completion...
Plan 13 ended successfully
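If you want to examine a candidate topology before running plan deploy-topology, you can preview it with the topology preview command, which summarizes the shards, Replication Nodes, and partitions the deployment would create:

kv-> topology preview -name Topo1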
Check service status with the show topology command:
kv-> show topology
store=MetroArea numPartitions=100 sequence=117
  zn: id=zn1 name=Manhattan repFactor=1 type=PRIMARY
  zn: id=zn2 name=JerseyCity repFactor=1 type=PRIMARY
  zn: id=zn3 name=Queens repFactor=1 type=PRIMARY

  sn=[sn1] zn=[id=zn1 name=Manhattan] localhost:5010 capacity=1 RUNNING
    [rg1-rn2] RUNNING
       No performance info available
  sn=[sn2] zn=[id=zn1 name=Manhattan] localhost:5020 capacity=1 RUNNING
    [rg2-rn2] RUNNING
       No performance info available
  sn=[sn3] zn=[id=zn2 name=JerseyCity] localhost:5030 capacity=1 RUNNING
    [rg1-rn3] RUNNING
       No performance info available
  sn=[sn4] zn=[id=zn2 name=JerseyCity] localhost:5040 capacity=1 RUNNING
    [rg2-rn3] RUNNING
       No performance info available
  sn=[sn5] zn=[id=zn3 name=Queens] localhost:5050 capacity=1 RUNNING
    [rg1-rn1] RUNNING
       No performance info available
  sn=[sn6] zn=[id=zn3 name=Queens] localhost:5060 capacity=1 RUNNING
    [rg2-rn1] RUNNING
       No performance info available

  shard=[rg1] num partitions=50
    [rg1-rn1] sn=sn5
    [rg1-rn2] sn=sn1
    [rg1-rn3] sn=sn3
  shard=[rg2] num partitions=50
    [rg2-rn1] sn=sn6
    [rg2-rn2] sn=sn2
    [rg2-rn3] sn=sn4
Verify that each shard has a replica in every zone:
kv-> verify configuration
Verify: starting verification of store MetroArea based upon
    topology sequence #117
100 partitions and 6 storage nodes
Time: 2015-03-04 20:10:45 UTC   Ver: 12.1.3.2.15
See localhost:/tm/kvroot/MetroArea/log/MetroArea_{0..N}.log
    for progress messages
Verify: Shard Status: total:2 healthy:2 degraded:0 noQuorum:0 offline:0
Verify: Zone [name=Manhattan id=zn1 type=PRIMARY]
    RN Status: total:2 online:2 maxDelayMillis:5 maxCatchupTimeSecs:0
Verify: Zone [name=JerseyCity id=zn2 type=PRIMARY]
    RN Status: total:2 online:2 maxDelayMillis:5 maxCatchupTimeSecs:0
Verify: Zone [name=Queens id=zn3 type=PRIMARY]
    RN Status: total:2 online:2 maxDelayMillis:5 maxCatchupTimeSecs:0
Verify: == checking storage node sn1 ==
Verify: Storage Node [sn1] on localhost:5010
    Zone: [name=Manhattan id=zn1 type=PRIMARY]
    Status: RUNNING Ver: 12cR1.3.2.15 2015-03-04 06:35:02 UTC
    Build id: 8e70b50c0b0e
Verify:     Admin [admin] Status: RUNNING
Verify:     Rep Node [rg1-rn2] Status: RUNNING,REPLICA
    sequenceNumber:121 haPort:5013 delayMillis:5 catchupTimeSecs:0
Verify: == checking storage node sn2 ==
Verify: Storage Node [sn2] on localhost:5020
    Zone: [name=Manhattan id=zn1 type=PRIMARY]
    Status: RUNNING Ver: 12cR1.3.2.15 2015-03-04 06:35:02 UTC
    Build id: 8e70b50c0b0e
Verify:     Rep Node [rg2-rn2] Status: RUNNING,REPLICA
    sequenceNumber:121 haPort:5022 delayMillis:5 catchupTimeSecs:0
Verify: == checking storage node sn3 ==
Verify: Storage Node [sn3] on localhost:5030
    Zone: [name=JerseyCity id=zn2 type=PRIMARY]
    Status: RUNNING Ver: 12cR1.3.2.15 2015-03-04 06:35:02 UTC
    Build id: 8e70b50c0b0e
Verify:     Admin [admin2] Status: RUNNING,MASTER
Verify:     Rep Node [rg1-rn3] Status: RUNNING,REPLICA
    sequenceNumber:121 haPort:5033 delayMillis:5 catchupTimeSecs:0
Verify: == checking storage node sn4 ==
Verify: Storage Node [sn4] on localhost:5040
    Zone: [name=JerseyCity id=zn2 type=PRIMARY]
    Status: RUNNING Ver: 12cR1.3.2.15 2015-03-04 06:35:02 UTC
    Build id: 8e70b50c0b0e
Verify:     Rep Node [rg2-rn3] Status: RUNNING,REPLICA
    sequenceNumber:121 haPort:5042 delayMillis:5 catchupTimeSecs:0
Verify: == checking storage node sn5 ==
Verify: Storage Node [sn5] on localhost:5050
    Zone: [name=Queens id=zn3 type=PRIMARY]
    Status: RUNNING Ver: 12cR1.3.2.15 2015-03-04 06:35:02 UTC
    Build id: 8e70b50c0b0e
Verify:     Admin [admin3] Status: RUNNING
Verify:     Rep Node [rg1-rn1] Status: RUNNING,MASTER
    sequenceNumber:121 haPort:5053
Verify: == checking storage node sn6 ==
Verify: Storage Node [sn6] on localhost:5060
    Zone: [name=Queens id=zn3 type=PRIMARY]
    Status: RUNNING Ver: 12cR1.3.2.15 2015-03-04 06:35:02 UTC
    Build id: 8e70b50c0b0e
Verify:     Rep Node [rg2-rn1] Status: RUNNING,MASTER
    sequenceNumber:121 haPort:5062
Verification complete, no violations.
In the previous example there are three zones (zn1 = Manhattan, zn2 = JerseyCity, zn3 = Queens) with six Replication Nodes (two masters and four replicas) in the cluster. This topology is not only highly available, because each shard has three replicas, but it can also recover from a single zone failure: if any one zone fails, the remaining two zones are enough to elect a new master, so service continues without interruption.
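You can observe this in the test deployment by simulating the loss of a zone: stop both Manhattan Storage Node Agents, then re-run verify configuration from an admin in a surviving zone. This is a sketch; expect the two Manhattan Replication Nodes to be reported as unreachable while both shards keep quorum:

> java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar stop \
-root KVROOT -config config1.xml
> java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar stop \
-root KVROOT -config config2.xml
> java -Xmx256m -Xms256m \
-jar KVHOME/lib/kvstore.jar runadmin -host \
localhost -port 5030
kv-> verify configuration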