Why do I need at least three state database replicas?
Three or more replicas are required. You want a majority of replicas to survive a single component failure. If you lose a replica (for example, due to a device failure), it may cause problems running DiskSuite or when rebooting the system.
How does DiskSuite handle failed replicas?
The system will stay running with exactly half or more replicas. The system will panic when fewer than half the replicas are available to prevent data corruption.
The system will not reboot without one more than half the total replicas. In this case, you must reboot single-user and delete the bad replicas (using the metadb command).
As an example, assume you have four replicas. The system will stay running as long as two replicas (half the total number) are available. However, in order for the system to reboot, three replicas (half the total plus 1) must be available.
In a two-disk configuration, you should always create two replicas on each disk. For example, assume you have a configuration with two disks and you only created three replicas (two on the first disk and one on the second disk). If the disk with two replicas fails, DiskSuite will stop functioning because the remaining disk only has one replica and this is less than half the total number of replicas.
If you created two replicas on each disk in a two-disk configuration, DiskSuite will still function if one disk fails. But because you must have one more than half of the total replicas available in order for the system to reboot, you will be unable to reboot in this state.
Where should I place replicas?
If multiple controllers exist, replicas should be distributed as evenly as possible across all controllers. This provides redundancy in case a controller fails and also helps balance the load. If multiple disks exist on a controller, at least two of the disks on each controller should store a replica.
Replicated databases have an inherent problem in determining which database has valid and correct data. To solve this problem, DiskSuite uses a majority consensus algorithm. This algorithm requires that a majority of the database replicas agree with each other before any of them are declared valid. This algorithm requires the presence of at least three initial replicas which you create. A consensus can then be reached as long as at least two of the three replicas are available. If there is only one replica and the system crashes, it is possible that all metadevice configuration data may be lost.
The majority consensus algorithm is conservative in the sense that it will fail if a majority consensus cannot be reached, even if one replica actually does contain the most up-to-date data. This approach guarantees that stale data will not be accidentally used, regardless of the failure scenario. The majority consensus algorithm accounts for the following: the system will stay running with exactly half or more replicas; the system will panic when fewer than half the replicas are available; the system will not reboot without one more than half the total replicas.