This section describes how to connect to the controller Service Processor (SP) and how to configure the appliance for maximum serviceability. In rare cases, faults associated with uncorrectable CPU errors cannot be diagnosed by, or are not displayed in, the controller. These faults are preserved by the ILOM and can be observed there. The following sections describe how to connect to the ILOM and manage faults in these cases.
Connect to the server ILOM (Service Processor) on the server platform to diagnose hardware faults that do not appear in the BUI.
In a cluster environment, an ILOM connection should be made to each controller.
The server ILOM provides options for (i) network and (ii) serial port connectivity. A network connection is preferred, because the ILOM serial port does not always provide adequate means of collecting platform data.
WARNING: Failure to configure ILOM connectivity may lead to longer than necessary hardware fault diagnosis and resolution times.
All standalone controllers should have at least one NIC port configured as a management interface. Select the Allow Admin option in the BUI to enable BUI connections on port 215 and CLI connections on SSH port 22.
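Once a management interface is configured, it can be useful to confirm from an administration host that both ports answer. The following Python sketch is one possible way to do that and is not part of the appliance software; the function name and the example address are hypothetical. It only attempts a TCP connection to ports 215 and 22 and does not log in.

```python
import socket

# Ports named in the text: 215 for the BUI, 22 for the CLI over SSH.
MANAGEMENT_PORTS = (215, 22)


def check_management_ports(host, ports=MANAGEMENT_PORTS, timeout=3.0):
    """Return {port: True/False}, True when a TCP connection succeeds."""
    results = {}
    for port in ports:
        try:
            # A plain TCP connect; no authentication is attempted.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            results[port] = False
    return results


# Example with a hypothetical management address:
# check_management_ports("192.0.2.10")
```

A result of False for a port indicates only that the TCP connection failed from the host running the check. It does not identify whether the cause is the Allow Admin setting, cabling, or an intermediate firewall.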
All cluster installations should have at least one NIC port on each controller configured as a management interface, as described above. In addition, the NIC instance number must be unique on each controller. For example, nodeA uses nge0 and nodeB uses nge1, and neither of these ports may then be used as a cluster data interface. These interfaces must also be locked to the controller using the Configuration -> Cluster option in the BUI. In some cases, this may require installing an additional network interface card on each controller in the cluster.
If access to the appliance data interfaces is impossible for any reason, the management network interface maintains BUI and CLI access. During a cluster takeover, interfaces are taken down on the failed controller. A locked interface configuration is therefore required to gather diagnostic information from a failed controller.
WARNING: Failure to configure locked management interfaces on a cluster may lead to longer than necessary fault diagnosis and resolution times.
Log in to the server as root using the ILOM CLI. To view server faults, type the following command to list all known faults on the system:
-> show /SP/faultmgmt
The server lists all known faults, for example:
/SP/faultmgmt
  Targets:
    0 (/SYS/MB/P0)
  Properties:
  Commands:
    cd
    show
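When the listing is captured to a log or text file, the faulted component paths can be pulled out of it programmatically. The following Python sketch is an illustration only. It assumes that each target appears as an index followed by a path in parentheses, as in the example above; the function name is hypothetical and the pattern has not been checked against other ILOM releases.

```python
import re

# Matches a target entry such as "0 (/SYS/MB/P0)" and captures the path.
_TARGET_RE = re.compile(r"\b\d+\s+\((/[^)\s]+)\)")


def parse_fault_targets(listing):
    """Return the component paths listed as fault targets, in order."""
    return _TARGET_RE.findall(listing)
```

An empty list means that no target entries matching this pattern were found in the captured text.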
To clear a CPU fault, type the following command, where n is the number of the faulted CPU as shown in the fault listing:
-> set /SYS/MB/Pn clear_fault_action=true
For example, to clear a fault on CPU 0 (P0):
-> set /SYS/MB/P0 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P0 (y/n)? y
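For sites that generate service runbooks with scripts, the substitution of the CPU number into the command can be sketched as follows. This Python helper is hypothetical and only builds the command text in the form shown above. It does not connect to the ILOM, and the confirmation prompt must still be answered by the operator.

```python
def clear_fault_command(cpu_number):
    """Build the ILOM command that clears a fault on CPU `cpu_number`."""
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(cpu_number, bool) or not isinstance(cpu_number, int):
        raise ValueError("cpu_number must be an integer")
    if cpu_number < 0:
        raise ValueError("cpu_number must not be negative")
    return f"set /SYS/MB/P{cpu_number} clear_fault_action=true"
```

For example, clear_fault_command(0) returns the same command text that is typed at the -> prompt in the example above.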