Sun Cluster Concepts Guide for Solaris OS

Disk Device Groups

In the SunPlex system, all multihost devices must be under control of the Sun Cluster software. You first create volume manager disk groups—either Solaris Volume Manager disk sets or VERITAS Volume Manager disk groups (available only in SPARC based clusters)—on the multihost disks. Then, you register the volume manager disk groups as disk device groups. A disk device group is a type of global device. In addition, the Sun Cluster software automatically creates a raw disk device group for each disk and tape device in the cluster. However, these cluster device groups remain in an offline state until you access them as global devices.

Registration provides the SunPlex system information about which nodes have a path to what volume manager disk groups. At this point, the volume manager disk groups become globally accessible within the cluster. If more than one node can write to (master) a disk device group, the data stored in that disk device group becomes highly available. The highly available disk device group can be used to house cluster file systems.
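
For example, a Solaris Volume Manager disk set might be created on the multihost disks with commands similar to the following, where the disk set name, node names, and DID device name are hypothetical. Under Sun Cluster, creating the disk set in this way also makes it available as a disk device group.

# metaset -s nfsds -a -h phys-schost-1 phys-schost-2
# metaset -s nfsds -a /dev/did/rdsk/d3

A VERITAS Volume Manager disk group, by contrast, is registered after it is created, either through the scsetup(1M) utility or with a command similar to the following. See the metaset(1M) and scconf(1M) man pages for the exact syntax.

# scconf -a -D type=vxvm,name=nfsdg,nodelist=phys-schost-1:phys-schost-2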


Note –

Disk device groups are independent of resource groups. One node can master a resource group (representing a group of data service processes) while another can master the disk group(s) being accessed by the data services. However, the best practice is to keep the disk device group that stores a particular application's data and the resource group that contains the application's resources (the application daemon) on the same node. Refer to “Relationship Between Resource Groups and Disk Device Groups” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about the association between disk device groups and resource groups.


With a disk device group, the volume manager disk group becomes “global” because it provides multipath support to the underlying disks. Each cluster node physically attached to the multihost disks provides a path to the disk device group.
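
To see which node currently masters a disk device group, and to compare it with the node that masters a related resource group, you can use status commands similar to the following. Output formats vary by release.

# scstat -D
# scstat -g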

Disk Device Group Failover

Because a disk enclosure is connected to more than one node, all disk device groups in that enclosure are accessible through an alternate path if the node currently mastering the device group fails. The failure of the node mastering the device group does not affect access to the device group except for the time it takes to perform the recovery and consistency checks. During this time, all requests are blocked (transparently to the application) until the system makes the device group available.
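
In addition to automatic failover, an administrator can switch a disk device group to another connected node. A command similar to the following, with a hypothetical device group name and node name, might be used for such a switchover.

# scswitch -z -D nfsds -h phys-schost-2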

Figure 3–1 Disk Device Group Failover


Multiported Disk Device Groups

This section describes disk device group properties that enable you to balance performance and availability in a multiported disk configuration. Sun Cluster software provides two properties used to configure a multiported disk configuration: preferenced and numsecondaries. You can control the order in which nodes attempt to assume control if a failover occurs by using the preferenced property. Use the numsecondaries property to set a desired number of secondary nodes for a device group.

A highly available service is considered down when the primary goes down and when no eligible secondary nodes can be promoted to primary. If service failover occurs and the preferenced property is true, then the nodes follow the order in the nodelist to select a secondary. The nodelist defines the order in which nodes attempt to assume primary control or transition from spare to secondary. You can dynamically change the preference of a device service by using the scsetup(1M) utility. The preference that is associated with dependent service providers, for example a global file system, is that of the device service.
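
For example, the preferenced and numsecondaries properties of an existing device group might be changed with a command similar to the following, where the device group name and values are hypothetical. The scsetup(1M) utility provides a menu-driven way to make the same change.

# scconf -c -D name=nfsds,preferenced=true,numsecondaries=2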

Secondary nodes are checkpointed by the primary node during normal operation. In a multiported disk configuration, checkpointing each secondary node causes cluster performance degradation and memory overhead. Spare node support was implemented to minimize the performance degradation and memory overhead that checkpointing causes. By default, your disk device group has one primary and one secondary. The remaining available provider nodes come online in the spare state. If failover occurs, the secondary becomes primary and the node highest in priority on the nodelist becomes secondary.
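
The provider nodes and properties of a particular device group can be reviewed in the scconf(1M) configuration listing, for example with a command similar to the following, where the device group name is hypothetical.

# scconf -pvv | grep nfsds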

The desired number of secondary nodes can be set to any integer between one and the number of operational non-primary provider nodes in the device group.


Note –

If you are using Solaris Volume Manager, you must create the disk device group before you can set the numsecondaries property to a number other than the default.


The default desired number of secondaries for device services is one. The actual number of secondary providers that is maintained by the replica framework is the desired number, unless the number of operational non-primary providers is less than the desired number. If you are adding nodes to or removing nodes from your configuration, alter the numsecondaries property and double-check the nodelist. Maintaining the nodelist and desired number of secondaries prevents conflict between the configured number of secondaries and the actual number allowed by the framework. Use the metaset(1M) command for Solaris Volume Manager device groups, or the scconf(1M) command for VERITAS Volume Manager disk device groups, in conjunction with the preferenced and numsecondaries property settings to manage the addition and removal of nodes from your configuration. Refer to “Administering Cluster File Systems Overview” in Sun Cluster System Administration Guide for Solaris OS for procedural information about changing disk device group properties.
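
For example, after adding a node to a Solaris Volume Manager disk set, you might raise the desired number of secondaries with commands similar to the following, where the disk set name, node name, and value are hypothetical.

# metaset -s nfsds -a -h phys-schost-3
# scconf -c -D name=nfsds,numsecondaries=2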