CHAPTER 2

IPMI Server Management

Server manufacturers today have to re-invent how each new server manages itself. The hardware and software design for one server does not necessarily work with another. Every server supplier provides basic monitoring and data collection functions but no two do it exactly the same. These proprietary implementations for manageability only complicate the problem.


Intelligent Platform Management Interface

The standardization of server-based management, called Intelligent Platform Management Interface (IPMI), provides a solution. IPMI enables you to interconnect the CPU and devices being managed. It allows for:

IPMI is an industry-standard, hardware-manageability interface specification that provides an architecture defining how unique devices can all communicate with the CPU in a standard way. It facilitates platform-side server management and remote server-management frameworks, by providing a standard set of interfaces for monitoring and managing servers.

With IPMI, the software becomes less dependent on hardware because the management intelligence resides in the IPMI firmware layer, thereby creating a more intelligently managed server. The IPMI solution increases server scalability by distributing the management intelligence closer to the devices that are being managed.

Baseboard Management Controller

To perform autonomous platform-management functions, a dedicated processor runs embedded software or firmware. Together, the processor and its controlling firmware are referred to as the Baseboard Management Controller (BMC), which is the core of the IPMI structure. Tightly integrating an IPMI BMC and management software with platform firmware provides a total management solution.



Note - Another way to perform IPMI queries and actions on the BMC is through the IPMI client utility IPMItool, which is used extensively in the testing process. For more information, see Lights Out Management.



The BMC is a service processor integrated into the motherboard design, providing a management solution independent of the main processor. The monitored server can communicate with the BMC through one of three defined interfaces, which are based on a set of registers shared between the platform and the BMC.



Note - In the Sun Fire V20z and Sun Fire V40z servers, the SP has software that emulates a BMC.



The BMC is responsible for:

The BMC provides the intelligence behind IPMI. In the Sun Fire V20z and Sun Fire V40z servers, the SP serves as the BMC, providing access to sensor data and events through the standard IPMI interfaces.

Manageability

IPMI defines a mechanism for server monitoring and recovery implemented directly in hardware and firmware. IPMI functions are available independent of the main processors, BIOS, and operating system.

IPMI monitoring, logging, and access functions add a built-in level of manageability to the platform hardware. IPMI can be used in conjunction with server-management software running under the OS, which provides an enhanced level of manageability.

IPMI provides the foundation for smarter management of servers by providing a methodology for maintaining and improving the reliability, availability, and serviceability of expensive server hardware.

Functional Overview

The following list details the main features of IPMI in the servers:

The BMC owns all sensors within the repository. The SDRR features include:

IPMI Compliance and LAN Channel Access

The servers support IPMI with both SMS and LAN channels through the SP software version 2.2 and later. These servers meet compliance standards for IPMI version 2.

The SMS is implemented as a Keyboard Controller Style (KCS) interface.

The IPMI implementation on these servers also supports LAN channel access. (Refer to the IPMI specification v2 for details.) By default, the LAN channel access is disabled. To enable it, use the ipmi enable channel command and specify the ID of the channel to enable for the LAN Interface, as follows.



Note - This ID is case-sensitive and must be lowercase.



# ssh spipaddr -l spuser ipmi enable channel {sms | lan}

As part of this command, you also specify the password for the default null user. The null user can then use IPMI over the LAN interface. For more information, see User Names and Passwords.

For more information about enabling or disabling the IPMI channel, refer to Appendix E.

User Names and Passwords

Operator-level and admin-level access over the LAN channel requires a valid user name and password. These servers are not preconfigured with user accounts enabled. When you initially enable the LAN channel through the command ipmi enable channel, you are required to provide the password for the null user. See IPMI Compliance and LAN Channel Access.



Note - For security reasons, the LAN channel access is disabled by default.





Note - IPMI user identities are in no way associated with user accounts defined for server-management capabilities. Refer to Initial Setup of the SP for more information about these server-management user accounts.



Server Boot-Option Support

IPMI enables you to set a number of boot options for interpretation by the BIOS. TABLE 2-1 describes important information about the server boot options and parameters that the BIOS supports.


TABLE 2-1 Server Boot Options Supported by the BIOS

Parameter                         Number  Details
Set In Progress                   0       This parameter is fully supported except for the rollback functionality.
BMC boot flag valid bit clearing  3       Fully supported.
Boot info ack                     4       BIOS supports indicating that it has handled boot information.
Boot Flags                        5       • Data byte 1 is supported for the boot flags valid bit.
                                          • Data byte 2 (CMOS Clear) is supported; however, when this bit is set, all other bits in this byte are ignored.
                                          • Lock keyboard is fully supported.
                                          • Boot device selector is supported except for booting to BIOS Setup.
                                          • Data byte 3 is supported for user password bypass.
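The Boot Flags rules in TABLE 2-1 can be sketched as a small decoder. This is only an illustration: the bit positions and the BIOS Setup selector value are assumptions based on a reading of the IPMI boot-option parameter 5 layout and should be checked against the specification, and the function name is hypothetical.

```python
# Bit positions are assumptions; verify them against the IPMI specification.
BOOT_FLAGS_VALID = 0x80   # data byte 1, bit 7
CMOS_CLEAR = 0x80         # data byte 2, bit 7
LOCK_KEYBOARD = 0x40      # data byte 2, bit 6
DEVICE_SHIFT = 2          # data byte 2, bits 5:2 hold the boot device selector
DEVICE_MASK = 0x0F
DEVICE_BIOS_SETUP = 0x6   # assumed selector value for "boot to BIOS Setup"
PASSWORD_BYPASS = 0x08    # data byte 3, bit 3


def interpret_boot_flags(byte1, byte2, byte3):
    """Return the settings this BIOS would act on, following TABLE 2-1."""
    result = {"valid": bool(byte1 & BOOT_FLAGS_VALID)}
    if not result["valid"]:
        return result
    if byte2 & CMOS_CLEAR:
        # When CMOS Clear is set, all other bits in data byte 2 are ignored.
        result["cmos_clear"] = True
    else:
        result["cmos_clear"] = False
        result["lock_keyboard"] = bool(byte2 & LOCK_KEYBOARD)
        device = (byte2 >> DEVICE_SHIFT) & DEVICE_MASK
        # Booting to BIOS Setup is the one selector that is not supported.
        result["boot_device"] = None if device == DEVICE_BIOS_SETUP else device
    result["password_bypass"] = bool(byte3 & PASSWORD_BYPASS)
    return result
```

For example, `interpret_boot_flags(0x80, 0xC0, 0x08)` reports only the CMOS clear and password bypass settings, because the lock-keyboard bit in data byte 2 is ignored once CMOS Clear is set.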

System Event Log

The IPMI system event log (SEL) is part of the BMC. Several types of information are logged to the SEL, from administrative messages to indications of important events, such as sensor-threshold crossings.

The size of the log is 16K, which allows for 1024 records.
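These two figures imply a fixed record size. A quick arithmetic check, assuming that 16K means 16 x 1024 bytes:

```python
# Assumption: "16K" means 16 * 1024 bytes.
SEL_SIZE_BYTES = 16 * 1024
SEL_MAX_RECORDS = 1024

# Dividing the log size by the record count gives the size of one record slot.
record_size = SEL_SIZE_BYTES // SEL_MAX_RECORDS
print(record_size)  # 16
```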

Sensors

Sensors generate events, obtain readings, and set thresholds. The Sensor Data Record Repository (SDRR) contains several types of sensors.

You access all sensors through the BMC. Many sensors represent physical sensors that are distributed on the motherboard and contained within FRUs. These sensors are polled; when a reading crosses a threshold, an entry is logged in the SEL.

For more information on sensor commands, see Appendix G.

Determine Sensor Presence

To determine the presence of a sensor, run the subcommand sensor get.

A sensor that is offline (not reporting) or physically not present in the system is indicated by state unavailable in the command response data.

Sensor Thresholds

To retrieve sensor thresholds, run the subcommand sensor get.

To set sensor thresholds, run the subcommand sensor set.

If you specify no thresholds, the result is no change and the return code is success.

TABLE 2-2 lists the completion codes that are returned by the subcommand sensor set.


TABLE 2-2 Completion Codes for Sensor Thresholds

Code                    Cause
0x00 (success)          Sensor thresholds set as requested.
0xCD (illegal command)  Sensor thresholds are unchangeable.
0xCC (invalid request)  Attempting to set an unsettable threshold or attempting to set thresholds in an improper order (for example, the upper critical threshold is set lower than the upper noncritical threshold).
0xC0 (node busy)        Processing resources are temporarily unavailable.
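The codes in TABLE 2-2 can be kept in a lookup table, and the ordering rule can be checked before a request is sent. The following is a minimal sketch of such a client-side check, not the BMC's own validation; the threshold names and the function name are hypothetical.

```python
# Completion codes from TABLE 2-2.
SUCCESS = 0x00
NODE_BUSY = 0xC0
INVALID_REQUEST = 0xCC
ILLEGAL_COMMAND = 0xCD

COMPLETION_CODES = {
    SUCCESS: "Sensor thresholds set as requested.",
    ILLEGAL_COMMAND: "Sensor thresholds are unchangeable.",
    INVALID_REQUEST: "Unsettable threshold or thresholds in an improper order.",
    NODE_BUSY: "Processing resources are temporarily unavailable.",
}

# Hypothetical threshold names, listed from lowest to highest.
ORDER = ["lower_critical", "lower_noncritical",
         "upper_noncritical", "upper_critical"]


def check_thresholds(requested):
    """Predict the completion code for a dict of threshold name -> value."""
    if not requested:
        return SUCCESS  # no thresholds specified: no change, still success
    values = [requested[name] for name in ORDER if name in requested]
    if values != sorted(values):
        return INVALID_REQUEST  # for example, upper critical below upper noncritical
    return SUCCESS
```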


Temperature Sensors

Temperature sensor readings are defined within a range of 0° C to 150° C, a span of 150° C. The CPU die temperature thermal trip occurs at approximately 140° C.

Temperature sensors can generate the following SEL events:

Memory Sensors for DIMMs

Each DIMM has its own record, which is used only to log IPMI events.

For more information, refer to the section "Analyzing Events" in the Sun Fire V20z and Sun Fire V40z Servers--Troubleshooting Techniques and Diagnostics Guide.

Voltage Sensors

All voltage sensor readings are indicated in volts (V). The largest voltage swing that is measured is 15V (the bulk voltage sensor ranges from 0V to 15V). Many of the voltage sensors have much lower maximums and smaller ranges. Voltage sensors can generate the following SEL events:

Fan Sensors

The values reported for all fan-speed sensor readings are indicated in revolutions per minute (RPM). The sensors have an upper bound of 15,000 RPM.

Fan sensors can generate the following SEL events:

Power-Supply Sensors

All power sensor readings are indicated in watts (W) and are defined within a range of 0W to 600W.

Management Controllers

One management-controller sensor represents the BMC. The management controller has the following capabilities:

Miscellaneous Sensors

The following additional sensors also are supported:

System Event

The system-event sensor indicates a variety of system events. However, no event conditions are reflected from the subcommand sensor get.

PEF actions: Pending actions matched against a platform event filter (PEF) are logged if the event sensor has been configured to do so. Only assertions of pending PEF action conditions are logged.

Sensor Type Code: 0x12 [System Event] Sensor Specific Offset: 0x04 [PEF Action]

Time sync: Time-sync events occur in pairs, one before and one after a SEL time sync.

Sensor Type Code: 0x12 [System Event] Sensor Specific Offset: 0x05 [Time sync]

Event Logging Disabled

The event-logging-disabled sensor indicates certain SEL-related events. This sensor is represented as a "type 2" SDR record.

SEL Full: When the SEL reaches the "maximum-1" number of records, a record is logged and any subsequent add SEL commands return a limit-exceeded code. This record becomes the last record in the SEL when the log is filled to capacity.

Sensor Type Code: 0x10 Sensor Specific Offset: 0x04 [Log Full] 

SEL Clear: A record is written to the SEL whenever the command Clear SEL is executed. This occurs only on the command Clear SEL; it does not occur if you delete the last SEL entry with the command Delete SEL Entry.

Sensor Type Code: 0x10 Sensor Specific Offset: 0x02 [Log Area Reset/Cleared]
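
The SEL Full behavior described in this section can be pictured with a toy model. This is a sketch under stated assumptions, not the BMC's implementation: the class name, the return values, and the record text are hypothetical.

```python
class SelModel:
    """Toy model of the SEL Full behavior; not the BMC's implementation."""

    LIMIT_EXCEEDED = "limit-exceeded"  # hypothetical stand-in for the real code

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.records = []

    def add(self, record):
        if len(self.records) >= self.capacity:
            # The log is full: subsequent adds are rejected.
            return self.LIMIT_EXCEEDED
        self.records.append(record)
        # On reaching "maximum-1" records, the SEL Full record is logged
        # and becomes the last record in the log.
        if len(self.records) == self.capacity - 1:
            self.records.append("SEL Full (type 0x10, offset 0x04)")
        return "ok"
```

With a capacity of 4, the third add fills the log (three records plus the SEL Full record), and a fourth add returns the limit-exceeded value.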

System Firmware Progress

The system-firmware progress sensor is an event-only sensor. The BIOS Boot Success SEL entry can be logged against this sensor when the BIOS has successfully booted and has attempted to return control to the OS, or if the BIOS has been booted and you enter a BIOS Setup screen.

Sensor Type Code: 0x0F Sensor Specific Offset: 0x02 [Firmware Progress] Event Data 2: 0x13 [Starting operating system boot process]

Watchdog

The Watchdog 2 sensor is used to log watchdog timer expirations. These events are generated only for timers that do not have the "do not log" bit set. A timer-expiration event is logged when a watchdog timer expires.

Sensor Type Code: 0x23 Sensor Specific Offset: * all supported actions 
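
The sensor type codes and offsets quoted for these miscellaneous sensors can be collected into one lookup table, which is convenient when reading raw SEL records. The function name and the label strings are illustrative only.

```python
# (sensor type code, sensor-specific offset) pairs quoted in this section.
EVENT_NAMES = {
    (0x12, 0x04): "System Event: PEF Action",
    (0x12, 0x05): "System Event: Time sync",
    (0x10, 0x04): "Event Logging Disabled: Log Full",
    (0x10, 0x02): "Event Logging Disabled: Log Area Reset/Cleared",
    (0x0F, 0x02): "System Firmware Progress: Firmware Progress",
}
WATCHDOG_TYPE = 0x23  # logged for all supported watchdog actions


def describe_event(sensor_type, offset):
    """Map a sensor type code and offset to a readable label."""
    if sensor_type == WATCHDOG_TYPE:
        return "Watchdog: timer expiration (offset 0x%02X)" % offset
    return EVENT_NAMES.get((sensor_type, offset), "unknown")
```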

Event Filters



Note - To ensure a graceful shutdown, the correct platform drivers must be installed on the server.



Platform Event Filtering (PEF) provides policy management that enables the BMC to act on particular events. The supported actions through PEF include:

TABLE 2-3 lists the event filters that are enabled by default.


TABLE 2-3 Event Filters Enabled by Default

Filter Match                                   Action
ambienttemp asserts upper critical threshold   Power down
cpu0.dietemp asserts upper critical threshold  Graceful power down
cpu1.dietemp asserts upper critical threshold  Graceful power down
cpu2.dietemp asserts upper critical threshold  Graceful power down
  Note: This filter is ignored on systems with two CPUs.
cpu3.dietemp asserts upper critical threshold  Graceful power down
  Note: This filter is ignored on systems with two CPUs.
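
The default filters in TABLE 2-3 amount to a mapping from a sensor that asserts its upper critical threshold to an action. A minimal sketch of that lookup follows, including the two-CPU exception; the function name is hypothetical, and the sketch does not model how the BMC evaluates filters internally.

```python
# Default filters from TABLE 2-3: sensor name -> action taken when the
# sensor asserts its upper critical threshold.
DEFAULT_FILTERS = {
    "ambienttemp": "Power down",
    "cpu0.dietemp": "Graceful power down",
    "cpu1.dietemp": "Graceful power down",
    "cpu2.dietemp": "Graceful power down",
    "cpu3.dietemp": "Graceful power down",
}
# Filters that are ignored on systems with two CPUs.
IGNORED_WITH_TWO_CPUS = {"cpu2.dietemp", "cpu3.dietemp"}


def match_filter(sensor, cpu_count):
    """Return the default action for a sensor, or None if no filter applies."""
    if sensor in IGNORED_WITH_TWO_CPUS and cpu_count == 2:
        return None
    return DEFAULT_FILTERS.get(sensor)
```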


Watchdog Timers

A watchdog timer allows a selected action to occur when the timer expires.

For timer actions, pre-timeout interrupts are currently not supported. The following actions are supported:

Alerting

When you use platform event trap (PET) LAN alerts, the number of alert destinations is limited to 16 (1 nonvolatile, 15 volatile). The number of alert policies is limited to 32.



Note - Acknowledgement of PET LAN alerts and alert strings are unsupported.



Alert Policy Set Determination

When event filters are matched, the following occurs:

You can configure policies so that, if the previous alert was successful, an alert is not sent as a result of the execution of the alert policy.

Lights Out Management

On these servers, Lights Out Management (LOM) is performed through IPMItool, a utility for controlling IPMI-enabled devices.

Description

IPMItool is a simple command-line interface (CLI) to servers that support the Intelligent Platform Management Interface (IPMI) v1.5 specification. It provides the ability to:

Originally written to take advantage of IPMI-over-LAN interfaces, IPMItool is also capable of using a system interface, as provided by a kernel device driver such as OpenIPMI.

Further Information

http://ipmitool.sourceforge.net/

http://www.intel.com/design/servers/ipmi/spec.htm

http://openipmi.sourceforge.net/

Syntax

The syntax used by IPMItool is as follows:

ipmitool [-ghcvV] -I lan -H address [-P password] expression 
ipmitool [-ghcvV] -I open expression 

IPMItool Options

TABLE 2-4 lists the options available for IPMItool.


TABLE 2-4 Options for IPMItool

Option        Description
-h            Provides help on basic usage from the command line.
-c            Makes the output suitable for parsing, where possible, by separating fields with commas instead of spaces.
-g            Attempts to make IPMI-over-LAN communication more robust.
-V            Displays the version information.
-v            Increases the amount of text output. This option can be specified more than once to increase the level of debug output. If given three times, you receive hexdumps of all incoming and outgoing packets.
-I interface  Selects the IPMI interface to use. The possible interfaces are lan and open.
-H address    Specifies the address of the remote server; it can be an IP address or host name. This option is required for the LAN interface connection.
-P password   Specifies the password for the remote server. The password is limited to a maximum of 16 characters. The password is optional for the LAN interface; if a password is not provided, the session is not authenticated.
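
When IPMItool is driven from a script, the two syntax forms and the options in TABLE 2-4 can be assembled by a small helper. The sketch below only builds the argument list; the helper name is hypothetical, and spipaddr and sppasswd are the placeholders used elsewhere in this chapter.

```python
def build_ipmitool_args(expression, address=None, password=None):
    """Return an argument list for the open or lan form of the syntax."""
    if address is None:
        # In-band form: ipmitool -I open expression
        return ["ipmitool", "-I", "open"] + list(expression)
    # Out-of-band form: ipmitool -I lan -H address [-P password] expression
    args = ["ipmitool", "-I", "lan", "-H", address]
    if password is not None:
        if len(password) > 16:
            raise ValueError("password is limited to a maximum of 16 characters")
        args += ["-P", password]
    return args + list(expression)


print(build_ipmitool_args(["chassis", "power", "status"],
                          address="spipaddr", password="sppasswd"))
```

The resulting list can be passed to `subprocess.run` on a machine where IPMItool is installed.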


IPMItool Expressions

TABLE 2-5 lists the expressions and parameters available for IPMItool.



Note - For each of these expressions, the beginning command is always ipmitool, followed by the expression and parameter(s).





Note - The sol command is not supported in these servers, but you can enable a serial-over-LAN feature. See Serial-Over-LAN.




TABLE 2-5 Expressions and Parameters for IPMItool

Expression

Parameter

Subparameter

Description and examples

help

 

 

Can be used to get command-line help on IPMItool commands. You can also place this expression at the end of commands to get help on the use of options.

EXAMPLES:
ipmitool -I open help
Commands: chassis, fru, lan, sdr, sel

ipmitool -I open chassis help

Chassis Commands: status, power, identify, policy, restart_cause

ipmitool -I open chassis power help

Chassis Power Commands: status, on, off, cycle, reset, diag, soft

raw

netfn

cmd data

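Enables you to execute raw IPMI commands by specifying the network function, the command, and any data bytes (for example, to query the POH counter with a raw command). The example below sends command 0x1 of network function 0x0 (chassis).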

EXAMPLE:
ipmitool -I open raw 0x0 0x1

RAW REQ (netfn=0x0 cmd=0x1 data_len=0)
RAW RSP (3 bytes)
60 00 00

chaninfo

channel

 

Displays information about the selected channel. If no channel is specified, the command displays information about the channel currently being used.

EXAMPLES:

ipmitool -I open chaninfo

Channel 0xf info:

Channel Medium Type: System Interface

Channel Protocol Type: KCS

Session Support: session-less

Active Session Count: 0

Protocol Vendor ID: 7154


ipmitool -I open chaninfo 7

Channel 0x7 info:

Channel Medium Type: 802.3 LAN

Channel Protocol Type: IPMB-1.0

Session Support: multi-session

Active Session Count: 1

Protocol Vendor ID: 7154

Alerting: enabled

Per-message Auth: enabled

User Level Auth: enabled

Access Mode: always available

userinfo

channel

Note: Channels 6 and 7 are not supported on Sun Fire V20z servers.

 

Displays information about the users configured on a specific LAN channel.


EXAMPLE:

ipmitool -I open userinfo 6

Maximum User IDs : 4

Enabled User IDs : 1

Fixed Name User IDs : 1

Access Available : call-in / callback

Link Authentication : disabled

IPMI Messaging : enabled

chassis

status

 

Returns information about the high-level status of the server chassis and main power subsystem.

identify

interval

Controls the front panel identification light. The default value is 15 seconds. Enter "0" to turn the light off.

restart_cause

 

Queries the chassis for the cause of the last server restart.

power

 

 

Performs a chassis control command to view and change the power state.

status

 

Shows the current status of the chassis power.

on

 

Powers on the chassis.

off

 

Powers off the chassis into the soft off state (S4/S5 state).

Note: This command does not initiate a clean shutdown of the operating system prior to powering off the server.

cycle

 

Provides a power-off interval of at least 1 second.

No action should occur if chassis power is in S4/S5 state, but it is recommended to check the power state first and then only issue a power-cycle command if the server power is on or in a lower sleep state than S4/S5.

reset

 

Performs a hard reset.

lan

print

channel

Prints the current configuration for the given channel.

set

channel
parameter

Sets the given parameter on the given channel.

ipaddr x.x.x.x

Sets the IP address for this channel.

netmask x.x.x.x

Sets the netmask for this channel.

macaddr xx:xx:xx:xx:xx:xx

Sets the MAC address for this channel.

defgw ipaddr x.x.x.x

Sets the default gateway IP address.

defgw macaddr xx:xx:xx:xx:xx:xx

Sets the default gateway MAC address.

bakgw ipaddr x.x.x.x

Sets the backup gateway IP address.

bakgw macaddr xx:xx:xx:xx:xx:xx

Sets the backup gateway MAC address.

password pass

Sets the null user password.

user

Enables the user-access mode.

access [on|off]

Sets the LAN-channel-access mode.

ipsrc source

Sets the IP address source. For source, you can indicate:

none = unspecified

static = manually configured static IP address

dhcp = address obtained by BMC running DHCP

bios = address loaded by BIOS or system software

arp respond [on|off]

Sets the BMC-generated ARP responses.

arp generate [on|off]

Sets the BMC-generated gratuitous ARPs.

arp interval [seconds] s

Sets the interval for the BMC-generated gratuitous ARPs.

auth level,...
type,...

Sets the valid authtypes for a given auth level.

Levels can be: callback, user, operator, admin

Types can be: none, md2, md5

fru

print

 

Reads all inventory data for the customer-replaceable units (CRUs) and extracts such information as serial number, part number, asset tags, and short strings describing the chassis, board, or product.

sdr

list

 

Reads the Sensor Data Record (SDR) and extracts sensor information, then queries each sensor and prints its name, reading and status.

sel

info

 

Queries the BMC for information about the system event log (SEL) and its contents.

clear

 

Clears the contents of the SEL.

The clear command cannot be undone.

list

 

Lists the contents of the SEL.
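
The chaninfo example output in TABLE 2-5 consists of a header line followed by "Key: Value" lines, so it is straightforward to parse in a script. The following is a minimal sketch that assumes the output keeps the layout shown in the examples; the function name is hypothetical.

```python
def parse_chaninfo(text):
    """Parse chaninfo output into a dict; the header line gives the channel."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Channel 0x") and line.endswith("info:"):
            # Header such as "Channel 0x7 info:" carries the channel number.
            info["channel"] = int(line.split()[1], 16)
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


SAMPLE = """\
Channel 0x7 info:
Channel Medium Type: 802.3 LAN
Channel Protocol Type: IPMB-1.0
Session Support: multi-session
Active Session Count: 1
"""
print(parse_chaninfo(SAMPLE)["Session Support"])  # multi-session
```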


IPMI Linux Kernel Device Driver

The IPMItool application utilizes a modified MontaVista OpenIPMI kernel device driver that is provided on the Sun Fire V20z and Sun Fire V40z Servers Documentation and Support Files CD. The driver has been modified to use an alternate base hardware address and modified device I/O registration.

This driver must be compiled and installed from the Documentation and Support Files CD.

The following kernel modules must be loaded in order for IPMItool to work:

1. ipmi_msghandler

The message handler for incoming and outgoing messages for the IPMI interfaces.

2. ipmi_kcs_drv

An IPMI Keyboard Controller Style (KCS) interface driver for the message handler.

3. ipmi_devintf

Linux character device interface for the message handler.
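
A script can confirm that all three modules are loaded by examining the text printed by the lsmod command. This is a minimal sketch that assumes the usual lsmod layout (a header row, then one module per line with the module name in the first column); the function name is hypothetical.

```python
REQUIRED_MODULES = ("ipmi_msghandler", "ipmi_kcs_drv", "ipmi_devintf")


def missing_modules(lsmod_output):
    """Return the required modules that are absent from lsmod output."""
    loaded = set()
    for line in lsmod_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            loaded.add(fields[0])
    return [name for name in REQUIRED_MODULES if name not in loaded]
```

In practice, the input could come from `subprocess.run(["lsmod"], capture_output=True, text=True).stdout`; an empty result means that all three modules are present.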

To force IPMItool to use the device interface, you can specify it on the command line:

# ipmitool -I open [option...] 

To install and compile this kernel device driver, see Initial Setup of the SP.

LAN Interface for the BMC



Note - In the Sun Fire V20z and Sun Fire V40z servers, the SP has software that emulates a BMC.



The IPMItool LAN interface communicates with the BMC over an Ethernet LAN connection using User Datagram Protocol (UDP) under IPv4. UDP datagrams are formatted to contain IPMI request/response messages with IPMI session headers and Remote Management Control Protocol (RMCP) headers.

Remote Management Control Protocol is a request-response protocol delivered using UDP datagrams to port 623. IPMI-over-LAN uses version 1 of the RMCP to support management either before the OS is installed on the server or when the server will not have an OS installed.
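
To make the framing concrete, the sketch below packs the 4-byte RMCP header that precedes the IPMI session header in each datagram. The port number comes from the text above; the field values are assumptions based on a reading of the RMCP message format and should be verified against the specification.

```python
import struct

RMCP_PORT = 623          # UDP port given above
RMCP_VERSION_1 = 0x06    # assumption: header value that denotes RMCP version 1
RMCP_NO_ACK_SEQ = 0xFF   # assumption: sequence number meaning "no RMCP ACK"
RMCP_CLASS_IPMI = 0x07   # assumption: message class for IPMI


def rmcp_header():
    """Pack the RMCP header fields: version, reserved, sequence, class."""
    return struct.pack("BBBB", RMCP_VERSION_1, 0x00,
                       RMCP_NO_ACK_SEQ, RMCP_CLASS_IPMI)


print(rmcp_header().hex())  # 0600ff07
```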

The LAN interface is an authenticated, multisession connection; messages delivered to the BMC can (and should) be authenticated with a challenge/response protocol with either a straight password/key or an MD5 message-digest algorithm. IPMItool attempts to connect with administrator privilege level as this is required to perform chassis power functions.

With the -I option, you can direct IPMItool to use the LAN interface:

# ipmitool -I lan -H address [-P password] expression

To use the LAN interface with IPMItool, you must provide the host name or IP address of the remote server on the command line.

The password field is optional. If you do not provide a password on the command line, IPMItool attempts to connect without authentication. If you specify a password, it uses MD5 authentication, if supported by the BMC; otherwise, it will use straight password/key.
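
The selection rule in the preceding paragraph can be written as a short function. This is a sketch of the described behavior, not IPMItool's source code; the function name and the returned labels are hypothetical.

```python
def choose_auth(password, bmc_supports_md5):
    """Pick the authentication type as described for the LAN interface."""
    if not password:
        return "none"      # no password given: connect without authentication
    if bmc_supports_md5:
        return "md5"       # preferred when the BMC supports it
    return "password"      # otherwise fall back to straight password/key
```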

Files

The file /dev/ipmi0 is a character-device file used by the OpenIPMI kernel driver.

Examples

If you want to remotely control the power of an IPMI-over-LAN-enabled server, you can use the following commands:

# ipmitool -I lan -H spipaddr -P sppasswd chassis power on 

The result returned is:

Chassis Power Control: Up/On 
 
# ipmitool -I lan -H spipaddr -P sppasswd chassis power status 

The result returned is:

Chassis Power is on 
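
Because it is recommended to check the power state before issuing a power-cycle command, a script can test the status output first. The sketch below assumes that the status line has the form shown above and that the corresponding off state is reported as "Chassis Power is off"; the function name is hypothetical.

```python
def may_power_cycle(status_output):
    """Return True only when the reported chassis power state is on."""
    # Assumption: the output is "Chassis Power is on" or "Chassis Power is off".
    return status_output.strip().lower() == "chassis power is on"
```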

Viewing the IPMI System Event Log

To view the system event log (SEL), use IPMItool.

The out-of-band command is:

# ipmitool -I lan -H spipaddr -P ipmipasswd sel list 

The in-band command (using OpenIPMI on a Linux software-based server or LIPMI on a Solaris software-based server) is:

# ipmitool -I open sel list 


Note - To receive more verbose logging messages, you can run the following command:
# ssh -l spuser spipaddr sp get events



Clearing the IPMI System Event Log

You can use commands to clear the contents of the IPMI SEL.

Use one of the following commands, depending on your OS:


IPMI Troubleshooting

TABLE 2-6 describes some potential issues with IPMI and provides solutions.


TABLE 2-6 IPMI Troubleshooting

Issue

Solution

You cannot connect to the management controller using IPMItool over LAN.

Verify the network connection to the management controller and its IP address, and verify that the channel is enabled by using the ipmi get channels command.

You cannot authenticate to the management controller using IPMItool over LAN.

Ensure that you are using the password assigned when you enabled IPMI LAN access from the management-controller shell prompt.

You have forgotten the password for IPMI access over LAN.

  1. Reset the IPMI settings, reset the SDRR, and purge the SEL from the management-controller shell by running the command:

# ssh spipaddr -l spuser ipmi reset -a

  2. Re-enable IPMI on LAN with the following commands:

# ssh spipaddr -l spuser

# ipmi enable channel lan

# exit

IPMItool fails when using the "open" interface.

Ensure that the Linux kernel module ipmi_kcs_drv is loaded by running the lsmod command.