Sun Enterprise 10000 Dynamic Reconfiguration User Guide

Detach-Related Error Messages

The following table contains detach-related error messages that are sent to the system logs and/or to the SSP applications.

Table A-3 Detach-Related Failure Error Messages

Error Message 

Probable Cause 

Suggested Action 

NGDR Error: Cannot detach board board_number. It has interface_name interfaces configured.

The board is not eligible to be detached because it has one or more network interfaces attached to it that are critical to the operation of the domain. The network interfaces can be any mix of primary, SSP, AP, or PBF interfaces. 

Use the ifconfig(1M) command to determine the role of the interface(s). If the configured interface is the primary network or the SSP, manually switch the interface to the alternate interface if one exists. For an interface other than the primary and the SSP, unplumbing it may enable the detach operation to succeed. Otherwise, the domain must be shut down, and the interfaces must be moved to another board.
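
As a starting point for the check described above, the following sketch lists the interfaces that are currently plumbed so that each one can be compared against the domain's primary and SSP network configuration. The helper name list_plumbed_interfaces and the interface name hme1 are hypothetical and do not come from the DR software; the helper only parses text in the usual ifconfig -a layout.

```shell
# Print the name of every plumbed interface, given `ifconfig -a` output on
# standard input. Interface lines start in column one; address lines are
# indented, so they are skipped.
list_plumbed_interfaces() {
    awk '/^[A-Za-z]/ { name = $1; sub(/:$/, "", name); print name }'
}

# On the domain (as superuser), the manual steps might look like this:
#   ifconfig -a | list_plumbed_interfaces
#   ifconfig hme1 unplumb    # hme1: hypothetical interface that is neither
#                            # the primary nor the SSP interface
```

The helper does not decide which interface is critical. That judgment still depends on the addresses that ifconfig(1M) reports for each name.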

NGDR Error: cpu0_move_finished: invalid board state

Communication protocol has been breached over the eligibility of a CPU. To the SSP, the CPU has been moved off the board. To the DR driver, the move operation is an invalid operation for that board.

None 

ifconfig down failed.

The ifconfig(1M) command failed to bring down the network interfaces. The ifconfig(1M) command unplumbs and brings down the network interfaces before the board is detached. One of the network interfaces on the board could be busy, so manual intervention may be needed.

Log in to the domain, and, if possible, bring down the network interfaces on the board manually by using the ifconfig(1M) command with the down option. The manual execution of the command may yield more detailed information about the failure.

ifconfig unplumb failed.

The ifconfig(1M) command failed to unplumb the network interfaces. The ifconfig(1M) command unplumbs and brings down the network interfaces before the board is detached. One of the network interfaces on the board could be busy, so manual intervention may be needed.

Log in to the domain, and, if possible, unplumb the network interfaces manually by using the ifconfig(1M) command with the unplumb option. The manual execution of the command may yield more detailed information about the failure.
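
The manual recovery for the two ifconfig failures above can be sketched as a small wrapper that brings an interface down and then unplumbs it. The function name release_interface, the DRY_RUN variable, and the interface name hme1 are assumptions made for illustration. By default the sketch only prints the commands, so nothing changes on the domain until DRY_RUN is set to 0.

```shell
# Bring an interface down, then unplumb it.
# With DRY_RUN=1 (the default) the ifconfig commands are printed, not run.
release_interface() {
    ifname=$1
    for action in down unplumb; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ifconfig $ifname $action"
        else
            # Stop at the first failure so that its message stays visible.
            ifconfig "$ifname" "$action" || return 1
        fi
    done
}

# Example (hme1 is a hypothetical interface on the board being detached):
#   release_interface hme1              # show what would be run
#   DRY_RUN=0 release_interface hme1    # actually run the commands
```

Running the commands by hand in this way keeps any error text from ifconfig(1M) on the terminal, which is the extra detail the suggested actions refer to.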

Warning: Error return from /opt/SUNWconn/bin/nf_snmd_kill (return_value)

The kill(1) command failed. Certain daemons keep network interfaces open continuously. Those daemons must be stopped before the devices they control can be detached.

Analyze the return_value to determine why the kill(1) command failed, and try to correct the problem. If necessary, use the ps(1) command to obtain the PID number for the daemons, and use the kill(1) command to stop the daemons manually.

Warning: Error return from /opt/SUNWconn/bin/pf_snmd_kill (return_value)

The kill(1) command failed. The daemons that are used to control certain network devices must be stopped before the devices can be detached because the daemons keep the interfaces open continually.

Analyze the return_value to determine why the kill(1) command failed, and try to correct the problem. If necessary, use the ps(1) command to obtain the PID number for the daemons, and use the kill(1) command to stop the daemons manually.
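
The manual ps(1) and kill(1) steps for both warnings can be sketched as follows. The daemon names nf_snmd and pf_snmd are assumptions inferred from the script names in the messages, and the helper pids_for_daemon is hypothetical. The helper only parses ps output; it does not send any signal itself.

```shell
# Print the PIDs whose command name equals $1, reading the output of
# `ps -e -o pid,comm` from standard input. Comparing the basename of the
# command avoids matching unrelated processes that merely mention the name.
pids_for_daemon() {
    awk -v name="$1" '
        NR > 1 {                          # skip the header line
            n = split($2, parts, "/")
            if (parts[n] == name) print $1
        }'
}

# Example (daemon name is an assumption; confirm it with ps first):
#   ps -e -o pid,comm | pids_for_daemon nf_snmd
#   kill PID              # PID: a value printed by the previous command
```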

NGDR Error: abort_detach: board already drained

The CANCEL ioctl() failed while the DR daemon was trying to abort the detach operation. The failure caused the board to be reported as being in the UNREFERENCED state, indicating that the memory has already been drained.

The board must be completely detached before you can recover from this error. Retry the DR operation after the board has been successfully detached. 

NGDR Error: abort_detach_board: invalid board state

Communication protocol has been breached over the eligibility of a board. To the SSP, the board is part of the domain and has been, or is being, drained of its resources. The SSP, therefore, issues the abort command to stop the detach operation. However, to the DR driver and daemon, the board is not part of the domain. 

Exit and restart the DR application. 

NGDR Error: board configuration query failed.

The DR daemon failed to ascertain the eligibility of the configuration of the board. 

Stop and start the DR daemon and/or the DR driver. If this error persists, use the modinfo(1M), modload(1M), and modunload(1M) commands to work with the driver after you have stopped the DR daemon. Also, check the size of the DR daemon with the ps(1) command. If it is not between 300 and 400 Kbytes, report this error, providing as much information from the system logs as possible.
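
The size check above can be sketched as a simple range test. The function name size_in_expected_range is hypothetical, the bounds are treated as inclusive, and the choice of the ps vsz column as the measure of size is an assumption, because the text does not say which size figure is meant.

```shell
# Succeed if the given size, in Kbytes, lies between 300 and 400 inclusive.
size_in_expected_range() {
    size_kb=$1
    [ "$size_kb" -ge 300 ] && [ "$size_kb" -le 400 ]
}

# Example: read a size from ps for the DR daemon's PID, then test it.
#   ps -o vsz= -p PID            # PID: the process ID of the DR daemon
#   size_in_expected_range 352 && echo "size looks normal"
```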

NGDR Error: Cannot abort detach. Board detached from OS (detach completed).

This message indicates that the detach operation has completed. It is displayed after the NGDR Error: abort_detach: board already drained message.

See the NGDR Error: abort_detach: board already drained message.

NGDR Error: couldn't query cpu configuration

The complete_detach operation has failed because the DR daemon could not ascertain the CPU configuration just before the complete_detach operation began. After a board is detached, the DR daemon uses the information about the CPU configuration to update the utmp and wtmp entries for each CPU on the board. Although the complete_detach operation does not depend on these updates, a failure in the mechanisms through which the CPU configuration is queried indicates serious problems, so the detach operation should not be completed.

Stop and start the DR daemon and/or the DR driver. Also, check the size of the DR daemon with the ps(1) command. If it is not between 300 and 400 Kbytes, report this error, providing as much information from the system logs as possible.

NGDR Error: detach_board: invalid board state

Communication protocol has been breached over the eligibility of a board. To the SSP, the board is part of the domain, and its resources have been drained, causing the SSP to attempt to complete the detach operation. However, to the DR driver and daemon, the board is not part of the domain. 

Examine the state of the board by using the showdevices(1M) command, and determine the cause of the problem. Retry the drain and/or complete_detach operations to determine if the error is recoverable. Stop and start the DR daemon and driver.

NGDR Error: detach_board: invalid board state

The proper sequence of board states has not been followed, meaning that the board went into the error state or that an earlier failure in the drain-detach sequence of events was not properly reported. 

Examine the state of the board by using the showdevices(1M) command, and determine the cause of the problem. Retry the drain and/or complete_detach operations to determine if the error is recoverable. Stop and start the DR daemon and driver.

NGDR Error: detach_finished: invalid board state

Communication protocol has been breached over the eligibility of a board. To the SSP, the board has been detached. However, to the DR driver and daemon, the board has not been detached from the domain. 

Examine the state of the board by using the showdevices(1M) command, and determine the cause of the problem. Retry the drain and/or complete_detach operations to determine if the error is recoverable. Stop and start the DR daemon and driver.

NGDR Error: detachable_board: invalid board state

Communication protocol has been breached over the eligibility of a board. To the SSP, the board is part of the domain, so the SSP attempts to drain the resources. However, to the DR driver and daemon, the board is not part of the domain. 

Examine the state of the board by using the showdevices(1M) command, and determine the cause of the problem. Retry the drain and/or complete_detach operations to determine if the error is recoverable. Stop and start the DR daemon and driver.

NGDR Error: detaching board would leave no online CPUs

The detach operation failed because no CPUs would be left online after the board is detached. 

Bring more CPUs online on other boards in the domain, or add more boards with online CPUs to the domain, so that the domain will have enough online CPUs after the board is detached. 
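
One way to see how many CPUs are online before retrying the detach is to count the on-line entries in psrinfo(1M) output, as in the sketch below. The text does not name psrinfo or psradm, so their use here is a suggestion and not part of the documented procedure. The helper name count_online_cpus and the processor ID 5 are hypothetical.

```shell
# Count the processors reported as on-line, given `psrinfo` output on
# standard input (one processor per line, status in the second field).
count_online_cpus() {
    awk '$2 == "on-line" { n++ } END { print n + 0 }'
}

# Example:
#   psrinfo | count_online_cpus
#   psradm -n 5        # bring hypothetical processor 5 online
```

The count covers the whole domain, so the CPUs on the board being detached must be subtracted by hand before deciding whether enough would remain.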

NGDR Error: drain_board_resources: invalid board state

Communication protocol has been breached over the eligibility of a board. To the SSP, the board is part of the domain, so the SSP attempts to drain the resources. However, to the DR driver and daemon, the board is not part of the domain. 

Examine the state of the board by using the showdevices(1M) command, and determine the cause of the problem. Retry the drain and/or complete_detach operations to determine if the error is recoverable. Stop and start the DR daemon and driver.

NGDR Error: Remaining system memory (memory_size mb) below minimum threshold (minimum_memory_size mb) . . . .Not enough space

The domain must have enough memory to accommodate the memory of the board that is being detached. The detach operation failed because the domain does not have enough memory to detach the board. 

Attach as many boards as necessary so that the memory remaining in the domain after the detach operation can hold the contents of the memory on the board being detached.
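
The comparison reported in the message can be illustrated with a small arithmetic check. This is a sketch under the assumption that the remaining memory is the domain total minus the memory on the board being detached; the DR daemon's actual calculation is not described here. The function name and the sample values are hypothetical.

```shell
# Succeed if the memory left after removing the board meets the minimum.
# Arguments: total domain memory, memory on the board, minimum threshold
# (all in Mbytes).
remaining_memory_ok() {
    total_mb=$1
    board_mb=$2
    min_mb=$3
    remaining_mb=$((total_mb - board_mb))
    [ "$remaining_mb" -ge "$min_mb" ]
}

# Example with made-up values: 4096 Mbytes total, 1024 on the board,
# and a 2048-Mbyte threshold.
#   remaining_memory_ok 4096 1024 2048 && echo "detach can proceed"
```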

NGDR Error: Some devices not re-attached. Examine the host syslog for details . . . errno_description

Devices could not be reattached to the operating environment during an abort detach operation. Errors were encountered while the DR daemon tried to communicate with the device drivers for one or more devices on the board. 

Examine the system logs to determine which devices were not reattached. If possible, fix the problem, then issue the complete_attach(1M) command again to fully configure the board. If this action fails, the failure may be caused by an unsupported device for which a state cannot be resolved until the domain is rebooted.

NGDR Error: sysconf failed (_SC_NPROCESSORS_ONLN) . . . errno_description

The sysconf(3C) function failed to return the total number of online CPUs in the domain. Thus, the DR daemon cannot determine if the domain would be left with any online CPUs after the board is detached.

See the sysconf(3C) man page for more details about this error. Use those details and the errno_description to diagnose and solve the error. Retry the DR operation after you have solved the error. If no fix is apparent, stop and restart the DR daemon, then retry the DR operation.