Sun Cluster 3.1 Concepts Guide

Dynamic Reconfiguration Support

Sun Cluster 3.1 support for the dynamic reconfiguration (DR) software feature is being developed in incremental phases. This section describes concepts and considerations for Sun Cluster 3.1 support of the DR feature.

All of the requirements, procedures, and restrictions that are documented for the Solaris DR feature also apply to Sun Cluster DR support, except for the operating environment quiescence operation. Therefore, review the documentation for the Solaris DR feature before using the DR feature with Sun Cluster software. In particular, review the issues that affect non-network I/O devices during a DR detach operation. The Sun Enterprise 10000 Dynamic Reconfiguration User Guide and the Sun Enterprise 10000 Dynamic Reconfiguration Reference Manual (from the Solaris 8 on Sun Hardware or Solaris 9 on Sun Hardware collections) are both available for download from http://docs.sun.com.

Dynamic Reconfiguration General Description

The DR feature allows operations, such as the removal of system hardware, on running systems. The DR processes are designed to ensure continuous system operation with no need to halt the system or interrupt cluster availability.

DR operates at the board level. Therefore, a DR operation affects all of the components on a board. Each board can contain multiple components, including CPUs, memory, and peripheral interfaces for disk drives, tape drives, and network connections.

Removing a board that contains active components would result in system errors. Before removing a board, the DR subsystem queries other subsystems, such as Sun Cluster, to determine whether the components on the board are being used. If the DR subsystem finds that a board is in use, the DR remove-board operation is not performed. Therefore, it is always safe to issue a DR remove-board operation, because the DR subsystem rejects operations on boards that contain active components.
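The query-then-reject behavior described above can be pictured with a small model. This is only an illustrative sketch, not the actual DR or Sun Cluster interface: the names Board, cluster_check, and request_remove_board, and the board and device names in the usage note, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Board:
    """Hypothetical model of a system board and the components in use on it."""
    name: str
    active_components: List[str] = field(default_factory=list)


# A subsystem check returns the reasons it objects to a removal.
# An empty list means the subsystem has no objection.
Check = Callable[[Board], List[str]]


def cluster_check(board: Board) -> List[str]:
    """Stand-in for a subsystem such as Sun Cluster reporting components in use."""
    return [f"{component} is in use" for component in board.active_components]


def request_remove_board(board: Board, checks: List[Check]) -> Tuple[bool, List[str]]:
    """Ask every subsystem first; proceed only if none of them objects."""
    objections = [reason for check in checks for reason in check(board)]
    if objections:
        return False, objections  # request rejected, nothing is removed
    return True, []               # no active components, removal may proceed
```

In this model, request_remove_board(Board("SB0", ["quorum disk d4"]), [cluster_check]) is rejected with the reason "quorum disk d4 is in use", while the same request for a board with no active components is accepted. Issuing the request is harmless in both cases, which is the sense in which the operation is always safe.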

The DR add-board operation is also always safe. CPUs and memory on a newly added board are automatically brought into service by the system. However, the system administrator must manually configure the cluster to actively use the other components on the newly added board.


Note –

The DR subsystem has several levels. If a lower level reports an error, the upper level also reports an error. However, only the lower level reports the specific error; the upper level reports “Unknown error.” System administrators should ignore the “Unknown error” reported by the upper level and rely on the specific error from the lower level.


The following sections describe DR considerations for the different device types.

DR Clustering Considerations for CPU Devices

Sun Cluster software will not reject a DR remove-board operation due to the presence of CPU devices.

When a DR add-board operation succeeds, CPU devices on the added board are automatically incorporated in system operation.

DR Clustering Considerations for Memory

For the purposes of DR, there are two types of memory to consider. These two types differ only in usage. The actual hardware is the same for both types.

The memory used by the operating system is called the kernel memory cage. Sun Cluster software does not support remove-board operations on a board that contains the kernel memory cage and will reject any such operation. When a DR remove-board operation pertains to memory other than the kernel memory cage, Sun Cluster will not reject the operation.

When a DR add-board operation that pertains to memory succeeds, memory on the added board is automatically incorporated in system operation.

DR Clustering Considerations for Disk and Tape Drives

Sun Cluster rejects DR remove-board operations on active drives on the primary node. DR remove-board operations can be performed on non-active drives on the primary node and on any drives on the secondary node. After the DR operation, cluster data access continues as before.
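The drive rule above reduces to a two-input decision. The following is a minimal sketch of that rule only; the function name is hypothetical, and the model deliberately ignores quorum devices, which are subject to a separate restriction.

```python
def cluster_rejects_drive_removal(node_role: str, drive_active: bool) -> bool:
    """Return True if, under the rule stated above, the removal would be rejected.

    node_role is "primary" or "secondary" (hypothetical labels for this sketch).
    Quorum devices are not modeled here.
    """
    if node_role not in ("primary", "secondary"):
        raise ValueError(f"unknown node role: {node_role!r}")
    # Only an active drive on the primary node blocks the operation.
    return node_role == "primary" and drive_active
```

Of the four combinations, only an active drive on the primary node causes a rejection.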


Note –

Sun Cluster rejects DR operations that impact the availability of quorum devices. For considerations about quorum devices and the procedure for performing DR operations on them, see DR Clustering Considerations for Quorum Devices.


See the Sun Cluster 3.1 System Administration Guide for detailed instructions on how to perform these actions.

DR Clustering Considerations for Quorum Devices

If the DR remove-board operation pertains to a board containing an interface to a device configured for quorum, Sun Cluster rejects the operation and identifies the quorum device that would be affected by the operation. You must disable the device as a quorum device before you can perform a DR remove-board operation.

See the Sun Cluster 3.1 System Administration Guide for detailed instructions on how to perform these actions.

DR Clustering Considerations for Cluster Interconnect Interfaces

If the DR remove-board operation pertains to a board containing an active cluster interconnect interface, Sun Cluster rejects the operation and identifies the interface that would be affected by the operation. You must use a Sun Cluster administrative tool to disable the active interface before the DR operation can succeed (also see the caution below).

See the Sun Cluster 3.1 System Administration Guide for detailed instructions on how to perform these actions.


Caution –

Sun Cluster requires that each cluster node have at least one functioning path to every other cluster node. Do not disable a private interconnect interface that supports the last path to any cluster node.
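The last-path condition in this caution can be checked mechanically before an interface is disabled. The sketch below assumes a simple, hypothetical data model: for the local node, a mapping from each peer node to the set of local interconnect interfaces that currently provide a functioning path to that peer. The function name and the node and interface names in the usage note are illustrative and are not part of any Sun Cluster tool.

```python
from typing import Dict, List, Set


def peers_losing_last_path(interface: str, paths: Dict[str, Set[str]]) -> List[str]:
    """List the peer nodes whose only functioning path uses the given interface.

    paths maps a peer node name to the local interfaces that reach it.
    An empty result means that disabling the interface leaves every peer reachable.
    """
    return sorted(
        peer
        for peer, interfaces in paths.items()
        if interface in interfaces and len(interfaces) == 1
    )
```

For example, with paths = {"node2": {"qfe0", "qfe1"}, "node3": {"qfe1"}}, disabling qfe0 affects no peer, whereas disabling qfe1 would cut the last path to node3 and therefore must not be done.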


DR Clustering Considerations for Public Network Interfaces

If the DR remove-board operation pertains to a board containing an active public network interface, Sun Cluster rejects the operation and identifies the interface that would be affected by the operation. Before you remove a board that has an active network interface, use the if_mpadm(1M) command to switch all traffic on that interface over to another functional interface in the multipathing group.


Caution –

If the remaining network adapter fails while you are performing the DR remove operation on the disabled network adapter, availability is impacted. For the duration of the DR operation, the remaining adapter has no other adapter to fail over to.
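The exposure described in this caution depends on how many functional adapters remain in the multipathing group. The following sketch models a group as a mapping from adapter name to a flag that says whether the adapter is functional. The function names and adapter names are hypothetical, and the code does not invoke if_mpadm or any other system command.

```python
from typing import Dict, Optional


def pick_failover_target(group: Dict[str, bool], leaving: str) -> Optional[str]:
    """Return another functional adapter in the group, or None if there is none."""
    candidates = sorted(
        name for name, functional in group.items() if functional and name != leaving
    )
    return candidates[0] if candidates else None


def has_redundancy_during_dr(group: Dict[str, bool], leaving: str) -> bool:
    """True only if at least two functional adapters remain besides the one leaving."""
    remaining = [
        name for name, functional in group.items() if functional and name != leaving
    ]
    return len(remaining) >= 2
```

In a two-adapter group such as {"qfe0": True, "qfe1": True}, traffic from qfe0 can move to qfe1, but has_redundancy_during_dr returns False: qfe1 carries all traffic alone until the DR operation completes.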


See the Sun Cluster 3.1 System Administration Guide for detailed instructions on how to perform a DR remove operation on a public network interface.