CHAPTER 3

Administering Your System

You administer your system through the alarm card command-line interface (CLI) and the MOH application.

The alarm card CLI works with the MOH and PMS applications, and supports Simple Network Management Protocol (SNMP) and Remote Method Invocation (RMI) interfaces. MOH provides the SNMP and RMI interfaces to manage the system and send out events and alerts. The CLI provides a subset of commands that overlaps with MOH, and also provides commands for the alarm card itself; the CLI does not send out events and alerts.

This chapter contains the following sections:

Using the Alarm Card Command-Line Interface

Updating the Alarm Card Flash Images

Setting the Date and Time on the Alarm Card

Using Remote Shell With the Alarm Card

Viewing Alarm Card Logs

Booting CPU Boards

Connecting to CPU Board Consoles From the Alarm Card

Using the PMS Application for Recovery and Control of CPU Boards

Using the Netra High Availability Suite With the Netra CT Server Applications

Monitoring Your System

Hot-Swap on the Netra CT Server

Using the Alarm Card Command-Line Interface

The alarm card command-line interface provides commands to control power of the system, control the CPU nodes, administer the system, show status, and set configuration variables. See Accessing the Alarm Card for information on how to access the alarm card.

CLI Commands

TABLE 3-1 lists the alarm card command-line interface commands by type, command name, default permission required to use the command, and command description. A -h option with a command indicates that help is available for that command.

Default permission levels are:

c (console permission; authorized to connect to the console of another server)

u (user administration permission; authorized to use commands that can add, delete, and change permission of users)

a (administration permission; authorized to change the state of the CLI configuration variables)

r (reset/poweron/poweroff permissions; authorized to reset, poweron, and poweroff any of the CPU boards)

blank (permission not required).

The permission level for a user can be changed with the userperm command.
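
The permission model above can be pictured as a simple lookup. The following Python sketch is illustrative only and is not part of the alarm card software: the REQUIRED dictionary copies a handful of rows from TABLE 3-1, and the function name may_run is hypothetical.

```python
# Illustrative model of the CLI permission check; not alarm card code.
# Required permission per command, copied from a few rows of TABLE 3-1.
# An empty string means that no permission is required.
REQUIRED = {
    "showdate": "",
    "console": "c",
    "useradd": "u",
    "setdate": "a",
    "poweroff": "r",
}


def may_run(user_perms: str, command: str) -> bool:
    """Return True if a user holding `user_perms` (for example "cr")
    has the permission that `command` requires."""
    needed = REQUIRED[command]
    return needed == "" or needed in user_perms


print(may_run("cr", "console"))   # the user holds c
print(may_run("cr", "setdate"))   # the user lacks a
```

A user whose permissions were set with userperm to c and r could, under this model, open consoles and power boards on or off, but could not change configuration variables.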

TABLE 3-1 Alarm Card Command-Line Interface Commands
Command Type	Command	Permission	Description
Status	`showenvironment`		Display a summary of current environmental information, such as fan and power supply status.
	`shownetwork`		Display the current network configuration of the alarm card.
	`showserialmode` `-b` port_num		Display the value of `serial_mode` for the specified port number.
	`showserialbaud` `-b` port_num		Display the value of `serial_baud` for the specified port number.
	`showserialparity` `-b` port_num		Display the value of `serial_parity` for the specified port number.
	`showserialstop` `-b` port_num		Display the value of `serial_stop` for the specified port number.
	`showserialdata` `-b` port_num		Display the value of `serial_data` for the specified port number.
	`showserialhwhandshake` `-b` port_num		Display the value of `serial_hwhandshake` for the specified port number.
	`showipmode` `-b` port_num		Display the value of `ip_mode` for the specified port number.
	`showipaddr` `-b` port_num		Display the value of `ip_addr` for the specified port number.
	`showipnetmask` `-b` port_num		Display the value of `ip_netmask` for the specified port number.
	`showipgateway` `-b` port_num		Display the value of `ip_gateway` for the specified port number.
	`showdate`		Display the system date.
	`showntpserver`		Display the IP address of the NTP server.
	`showfru` target instance field		Display FRU ID information. Refer to Displaying Netra CT Server FRU ID Information for more information.
	`showhostname`		Display the value of the hostname used in the CLI prompt.
	`showservicemode`		Display the value of the alarm card flash update service mode.
	`showcpustate`		Display the board type, power state, and boot state for each CPU board in the system.
	`showmohsecurity`		Display the value of the alarm card MOH security mode.
Power control	`poweroff` [cpu_node]	r	Power off the specified CPU node slot, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410; if no node is specified, power off the whole system.
	`poweron` [cpu_node]	r	Power on the specified CPU node slot, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410; if no node is specified, power on the whole system.
	`powersupply` n `on\|off`	r	Switch on or off the specified power supply unit.
CPU control	`console` cpu_node	c	Enter console mode and connect to the specified CPU node, where cpu_node can be 1 to 7 on a Netra CT 810 or 3 to 5 on a Netra CT 410.
	`consolehistory` [`-h`] `[run\|orun\|init][index` `[+\|-]`n`][pause` x`]`	c	Display the contents of the alarm card console `run` log or `orun` log. The `init` option clears both the `run` and `orun` logs. The `consolehistory` command may be abbreviated to `chist`.
	`consolerestart`	a	Copy the alarm card console `run` log (`run` buffer) into the old log (`orun` buffer), overwriting the previous contents; then clear the `run` buffer.
	`break` cpu_node	c	Put the server in debug mode, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410.
	`reset` [`-h`] [cpu_node] `[-x` cpu_node`\|ac\|host]`	r	Reset (reboot) a specified server, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410; `ac` is the alarm card; `host` is the host CPU board. `reset` cpu_node produces a soft reset (reboots the operating system); `reset` `-x` produces a hard reset (reboots the board).
	`setpanicdump` cpu_node `[true\|false]`	a	Set whether a panic dump is generated when a CPU node is reset.
	`showpanicdump` cpu_node		Show whether or not a panic dump has been set for a specific CPU node.
	`setescapechar` value	a	Set the escape character to end a console session. The default is a ~ (tilde).
	`showhealth` [-b cpu_node]		Show the health information of a CPU node, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410.
	`pmsd` `help`	u	Display help information on starting, stopping, and controlling the PMS daemon on the alarm card. Refer to Enabling the Processor Management Service Application and to Using the PMS Application for Recovery and Control of CPU Boards for more information.
Administration	`useradd` [`-h`] username	u	Add a user account. The default user account is `netract`. The alarm card supports 16 accounts.
	`userdel` [`-h`] username	u	Delete a user account.
	`usershow` [`-h`] [username]	u	Show user accounts.
	`userpassword` [`-h`] username	u	Set or change the password of a specified user account.
	`userperm` [`-h`] username `[c\|u\|a\|r]`	u	Set or change the permission levels for a specified user account.
	`mohuseradd` [`-h`] username	a	Add an MOH user account. The alarm card supports five MOH user accounts.
	`mohuserdel` [`-h`] username	a	Delete an MOH user account.
	`mohusershow` [`-h`] [username]		Show MOH user accounts.
	`mohuserpassword` [`-h`] username	a	Set or change the password of a specified MOH user account.
	`mohuserperm` [`-h`] username `[r\|rw]`	a	Set or change the permission levels for a specified MOH user account.
	`logout`		Log out of the current session.
	`password` [`-h`]	u	Change the existing password.
	`flashupdate` `-d` `cmsw`\|`bcfw\|bmcfw\|rpdf\|scdf` `-f` path	a	Flash update the alarm card software, where `cmsw` represents the chassis management software; `bcfw` represents the boot control firmware; `bmcfw` represents the BMC firmware; `rpdf` represents the system configuration repository; and `scdf` initializes the system configuration variables to their defaults. Refer to Updating the Alarm Card Flash Images for more information.
	`help`		Display a list of supported commands.
	`version`	u	Display the versions of various software and firmware.
	`setdate` [`-h`] mmddHHMMccyy	a	Set the current date.
	`setsecondaryboot` [`-h`] `rarp`	a	The primary boot device for the alarm card is always the flash. In case of flash failure, the secondary boot device is used. The default is `rarp`.
	`showsecondaryboot`		Display the secondary boot mode.
	`setntpserver` addr\|`none`	a	Configure the alarm card to be an NTP client. The NTP server IP address must be on the same subnet as the alarm card. The default is none.
	`setfru` [`-h`] target instance field value	a	Set FRU ID information. Refer to Specifying Netra CT Server FRU ID Information for more information.
	`showescapechar`	a	Show the escape character used to end a console session.
	`setrecovery` `pmsd`\|`moh true`\|`false`	a	Set whether the alarm card will reset itself if the PMS daemon and/or the MOH application exit. The default is `false`, that is, the alarm card will not reset itself.
	`showrecovery`	a	Show the value of the `setrecovery` action the alarm card takes if the PMS daemon and/or the MOH application exit.
	`loghistory` `[index` `[+\|-]`n`][pause` n`]`	a	Display the contents of the alarm card event log.
	`snmpconfig add\|del\|show access\|trap` community `[readonly\|readwrite] [`ip_addr`]`	a	Configure the alarm card SNMP interface for the MOH application. The default is `readonly`. Refer to MOH Configuration and SNMP for more information.
	`setmohsecurity true\|false`	a	Configure the alarm card RMI interface for the MOH application. The default is `false`. Refer to MOH Configuration and RMI for more information.
	`debuglog` [`-h`] [`-reset`]	a	Display the name of the process that exited and caused an alarm card reset.
Configuration (serial ports)	`setserialmode` `-b` port_num `tty\|none`	a	Set the mode of the specified serial port to `tty` or `none`. The default for COM2 is `none`, that is, no services are available on this port.
	`setserialbaud` `-b` port_num baudrate	a	Set the baud rate of the specified serial port. The default is 9600. Valid values are: 1200, 4800, 9600, 19200, 38400, 56000.
	`setserialparity` `-b` port_num `none\|odd\|even`	a	Set the parity bit of the specified serial port. Valid values are none, odd, or even. The default is `odd`.
	`setserialstop` `-b` port_num `1\|2`	a	Set the stop bit of the specified serial port. Valid values are 1 or 2. The default is 1.
	`setserialdata` `-b` port_num `7\|8`	a	Set the number of data bits of the specified serial port. Valid values are 7 or 8. The default is 7.
	`setserialhwhandshake` `-b` port_num `true\|false`	a	Set the hardware handshake of the specified serial port. Valid values are `true` or `false`. The default is `false`.
Configuration (Ethernet ports)	`setipmode` `-b` port_num `rarp\|config\|standby\|none`	a	Set the IP mode of the specified Ethernet port. Choose the IP mode according to the services available in the network (`rarp`, `config`) or to configure the port for failover (`standby`). The default for ENET1 is `rarp`; the default for ENET2 is `none`, that is, no services are available on this port. You must reset the server for the changes to take effect.
	`setipaddr` `-b` port_num addr	a	Set the IP address of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the `ipmode` is set to `config`. You must reset the server for the changes to take effect.
	`setipnetmask` `-b` port_num mask	a	Set the IP netmask of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the `ipmode` is set to `config`. You must reset the server for the changes to take effect.
	`setipgateway` `-b` port_num addr	a	Set the IP gateway of Ethernet port 1. The default is 0.0.0.0. You must reset the server for the changes to take effect.
Configuration (Other)	`sethostname` hostname	a	Set the hostname to be used in the CLI prompt. The default is `netract`. The maximum length is 32 characters.
	`setservicemode` `true`\|`false`	a	When the `servicemode` is set to `true`, MOH and PMS services are stopped for the alarm card flash update. Refer to Updating the Alarm Card Flash Images for more information.
PMS daemon control	`pmsd` `start` [`-p` port_num] [`-e` server_admin_state] [`-t` tick_interval][`-d`]	a	Start PMS on the alarm card or a CPU board. The `-t` option can only be used on a CPU board.
	`pmsd` `stop` [`-p` port_num]	a	Stop PMS on the alarm card or a CPU board.
	`pmsd` `slotaddressset` `-s` slot_num `-i` ip_addr	a	Set the IP address for the alarm card to control and monitor a CPU board.
	`pmsd` `slotaddressshow` `-s` slot_num\|`all`	a	Print the IP address set with the `pmsd` `slotaddressset` command.
	`pmsd` `slotrndaddressadd` `-s` slot_num`\|all` `-n` ip_addr `-d` ip_addr `-r` slot_num	a	Add address information for a CPU board to control other CPU boards.
	`pmsd` `slotrndaddressdelete` `-s` slot_num`\|all` `-i` index_num`\|all`	a	Delete address information added with the `pmsd` `slotrndaddressadd` command.
	`pmsd` `slotrndaddressshow` `-s` slot_num`\|all` `-i` index_num`\|all`	a	Print address information added with the `pmsd` `slotrndaddressadd` command.
	`pmsd` `operset` `-s` slot_num\|`all` `-o` `maint_config`\| `oper_config`\| `none_config\| graceful_reboot`	a	Enable automatic recovery of a CPU board.
	`pmsd` `infoshow` `-s` slot_num\|`all`	a	Print PMS system information.
	`pmsd` `historyshow` `-s` slot_num\|`all`	a	Print a log of PMS system events and time stamps.
	`pmsd` `recoveryoperset` `-s` slot_num\|`all` `-o` `pc`\|`rst`\|`rstpc`\|`pd`\|`rb`	a	Manually recover a board in case of fault.
	`pmsd` `recoveryautooperset` `-s` slot_num\|`all` `-o` `pc`\|`rst`\|`rstpc`\|`pd`\|`rb\| rbpc`\|`none`\|`trg` [`-d` startup_delay] [`-f` `failure power` `on`\|`off`] [`-r` retries] [`-n` `inter-operation` delay] [`-p` `reset` `power` `cycle` delay]	a	Automatically recover a board in case of fault.
	`pmsd` `recoveryautoinfoshow` `-s` slot_num\|`all`	a	Print the configuration information affected by the `recoveryautooperset` command.
	`pmsd` `hwoperset` `-s` slot_num\|`all` `-o` `powerdown`\|`powerup`\| `reset`\|`mon_enable\| mon_disable` [`-f`]	a	Perform operations on CPU board hardware.
	`pmsd` `hwinfoshow` `-s` slot_num\|`all`	a	Print PMS system information on the hardware.
	`pmsd` `hwhistoryshow` `-s` slot_num\|`all`	a	Print a log of PMS hardware events and time stamps.
	`pmsd` `osoperset` `-s` slot_num\|`all` `-o` `reboot`\|`mon_enable`\| `mon_disable` [`-f`]	a	Perform operations on the CPU board operating system.
	`pmsd` `osinfoshow` `-s` slot_num\|`all`	a	Print PMS system information on the operating system.
	`pmsd` `oshistoryshow` `-s` slot_num\|`all`	a	Print a log of PMS operating system events and time stamps.
	`pmsd` `appoperset` `-s` slot_num\|`all` `-o` `force_offline`\| `vote_active`\| `force_active`	a	Perform operations on CPU board applications.
	`pmsd` `appinfoshow` `-s` slot_num\|`all`	a	Print PMS system information on the applications.
	`pmsd` `apphistoryshow` `-s` slot_num\|`all`	a	Print a log of PMS application events and time stamps.
	`pmsd` `version`	a	Print the PMS version.
	`pmsd` `usage`	a	Print a synopsis of the `pmsd` commands.

Information on configuring alarm card ports, setting up user accounts, specifying FRU ID information, and starting the PMS daemon using the alarm card CLI is provided in Chapter 2. The PMS daemon commands are described in Using the PMS Application for Recovery and Control of CPU Boards.

Security Provided

A remote command-line session or a console session automatically disconnects after 10 minutes of inactivity.

Security is also provided through the permission levels and passwords set for each account.

Updating the Alarm Card Flash Images

You can update the alarm card flash images over the network. TABLE 3-2 shows the alarm card flash options.

TABLE 3-2 Alarm Card Flash Options
Option	Description
`cmsw`	Updates the chassis management software, which includes the embedded firmware, the MOH application, and the PMS application.
`bcfw`	Updates the boot control firmware.
`bmcfw`	Updates the BMC firmware.
`rpdf`	Updates the system configuration repository, which contains information used internally by the CLI in the flash, reinitializes it to a default minimum, and resets the alarm card.
`scdf`	(Optional) Initializes the system configuration variables, for example, the serial port variables, to the defaults.

There is no required sequence for flashing the alarm card; the following is a typical sequence: cmsw, bcfw, bmcfw, and rpdf.

You can update individual images if you want.

To Update All the Alarm Card Flash Images

1. Log in to the alarm card.

2. Set the servicemode to true by entering the following command:

hostname cli> setservicemode true

Setting the servicemode to true allows the alarm card to be flash updated; it also stops the MOH and PMS services on the alarm card.

Note - In Step 3, the scdf option is not mandatory. Use it only if you want to initialize the system configuration variables to the defaults.

3. Flash update all the alarm card images, and complete the process by entering the following commands:

hostname cli> flashupdate -d cmsw -f path

hostname cli> flashupdate -d bcfw -f path

hostname cli> flashupdate -d bmcfw -f path

hostname cli> flashupdate -d scdf

hostname cli> setservicemode false

hostname cli> flashupdate -d rpdf -f path

where path is nfs://nfs.server.ip.address/directory/filename, the location of the software to use in the flash.

After you update rpdf, the alarm card resets itself. If you do not update rpdf, you must reset the alarm card manually.
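
The ordering in Step 3 can be summarized in a short Python sketch that only builds the command strings; it does not talk to an alarm card. The function name flash_all_commands and the example NFS paths are hypothetical, and the sketch assumes one image file per option.

```python
from typing import Dict, List


def flash_all_commands(paths: Dict[str, str],
                       reset_defaults: bool = False) -> List[str]:
    """Return the CLI commands for a full flash update, in the order
    used by the procedure above. `paths` maps each flash option
    (cmsw, bcfw, bmcfw, rpdf) to its nfs:// path."""
    cmds = ["setservicemode true"]
    for image in ("cmsw", "bcfw", "bmcfw"):
        cmds.append(f"flashupdate -d {image} -f {paths[image]}")
    if reset_defaults:
        # scdf is optional and takes no -f argument in the procedure.
        cmds.append("flashupdate -d scdf")
    # rpdf is flashed only after servicemode is set back to false.
    cmds.append("setservicemode false")
    cmds.append(f"flashupdate -d rpdf -f {paths['rpdf']}")
    return cmds


# Hypothetical paths, for illustration only.
example = {
    "cmsw": "nfs://192.0.2.10/flash/cmsw_image",
    "bcfw": "nfs://192.0.2.10/flash/bcfw_image",
    "bmcfw": "nfs://192.0.2.10/flash/bmcfw_image",
    "rpdf": "nfs://192.0.2.10/flash/rpdf_image",
}
for line in flash_all_commands(example):
    print("hostname cli> " + line)
```

Keeping rpdf last reflects the fact that the alarm card resets itself once that update finishes.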

To Update an Individual Alarm Card Flash Image

1. Log in to the alarm card.

2. Set the servicemode to true by entering the following command:

hostname cli> setservicemode true

Setting the servicemode to true allows the alarm card to be flash updated; it also stops the MOH and PMS services on the alarm card.

3. Flash update an alarm card image, and complete the process by entering the following commands:

hostname cli> flashupdate -d option

hostname cli> setservicemode false

hostname cli> reset ac

where option can be cmsw -f path, bcfw -f path, bmcfw -f path, or scdf, and path is nfs://nfs.server.ip.address/directory/filename, the location of the software to use in the flash. To update rpdf, you must set the servicemode to false before using the flashupdate command; the alarm card resets itself after the rpdf update finishes.

Setting the Date and Time on the Alarm Card

The alarm card does not support battery backup time-of-day because battery life cannot be monitored to predict end of life, and drift in system clocks can be common. To provide a consistent system time, set the date and time on the alarm card using one of these methods:

Manually, using the CLI setdate command. The date and time must be reset after any power cycle.

Configuring the alarm card to be an NTP client, using the CLI setntpserver command. The Network Time Protocol (NTP) provides the correct timestamp for all systems on a network by synchronizing the clocks of all the systems. A Solaris daemon, xntpd, sets and maintains the timestamp. The NTP server must be on the same subnet as the alarm card. Refer to the online man pages for the xntpd, ntpq, and ntpdate commands for more information about NTP.

To Set the Alarm Card Date and Time Manually

1. Log in to the alarm card.

2. Set the date and time manually:

hostname cli> setdate mmddHHMMccyy

where mm is the current month; dd is the current day of the month; HH is the current hour of the day; MM is the current minutes past the hour; cc is the current century minus one (the first two digits of the four-digit year); and yy is the last two digits of the current year.
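
As a worked example of the mmddHHMMccyy format, the following Python sketch builds the argument from a date and time. The function name setdate_argument is hypothetical, and the sketch assumes a 24-hour value for HH.

```python
from datetime import datetime


def setdate_argument(when: datetime) -> str:
    """Format a date and time as mmddHHMMccyy."""
    cc = when.year // 100   # first two digits of the year, 20 for 2003
    yy = when.year % 100    # last two digits of the year, 3 for 2003
    return (f"{when.month:02d}{when.day:02d}"
            f"{when.hour:02d}{when.minute:02d}"
            f"{cc:02d}{yy:02d}")


# February 3, 2003 at 14:05
print("setdate " + setdate_argument(datetime(2003, 2, 3, 14, 5)))
# prints: setdate 020314052003
```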

To Set the Alarm Card Date and Time as an NTP Client

1. Log in to the alarm card.

2. Set the date and time as an NTP client:

hostname cli> setntpserver addr

where addr is the IP address of the NTP server. The NTP server must be on the same subnet as the alarm card.

Using Remote Shell With the Alarm Card

This section describes how to use a remote shell to execute CLI commands on the alarm card in batch mode, and how to use the rsh command interactively.

Running Scripts on the Alarm Card

Normally, the alarm card cannot execute batch commands. The alarm card scripting feature enables you to write scripts to execute alarm card CLI commands in batch mode on the alarm card, similar to using scripting in the Solaris OS. You run the scripts from a host or satellite CPU board in the same system as the alarm card.

As an example, using the scripting feature, you can write a script to configure an Ethernet port on the alarm card, and then check to make sure it is configured the way you want. This sample script runs the version command, and the setipmode, setipaddr, showipmode, and showipaddr commands for Ethernet port 2 on the alarm card:

rsh alarm_card_MCNet_ipaddress version

rsh alarm_card_MCNet_ipaddress setipmode -b 2 config

rsh alarm_card_MCNet_ipaddress setipaddr -b 2 addr

rsh alarm_card_MCNet_ipaddress showipmode -b 2

rsh alarm_card_MCNet_ipaddress showipaddr -b 2

The script includes the rsh command, the alarm card MCNet IP address, and the CLI command(s) to run. For information on the MCNet IP address, refer to Configuring the MCNet Interface; for information on the CLI commands, refer to TABLE 3-1.

Scripting Limitations

All the alarm card CLI commands in TABLE 3-1 are supported in a script except for the following interactive commands: userpassword, mohuserpassword, password, console, and break.

For security reasons, you must be superuser on a host or satellite CPU board in the same system as the alarm card. The commands can be run only over the MCNet interface.

To Run a Script on the Alarm Card

1. Log in to the server.

2. Create a script:

rsh alarm_card_MCNet_ipaddress CLI_command

rsh alarm_card_MCNet_ipaddress CLI_command

rsh alarm_card_MCNet_ipaddress CLI_command

rsh alarm_card_MCNet_ipaddress CLI_command

...

where alarm_card_MCNet_ipaddress is the MCNet IP address of the alarm card, and CLI_command is the CLI command you want to run.

3. Save the script to a file.

4. As superuser, run the script:

# /path/filename

where path is the path to the script and filename is the name of the script.

Before executing the commands in the script, the alarm card verifies that the commands are being run by a root user on a host or satellite CPU board in the same system as the alarm card, and that the commands have been received over the MCNet.
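
A script of this form can also be generated. The Python sketch below builds the script text and rejects the interactive commands listed under Scripting Limitations; it does not run rsh itself. The function name build_script and the 192.0.2.x addresses are hypothetical placeholders.

```python
# Interactive CLI commands that cannot be used in a script,
# as listed under Scripting Limitations.
INTERACTIVE = {"userpassword", "mohuserpassword", "password",
               "console", "break"}


def build_script(alarm_card_ip: str, cli_commands: list) -> str:
    """Return script text with one rsh line per CLI command."""
    lines = []
    for cmd in cli_commands:
        words = cmd.split()
        if not words:
            continue  # skip blank entries
        if words[0] in INTERACTIVE:
            raise ValueError(f"{words[0]} is interactive and cannot be scripted")
        lines.append(f"rsh {alarm_card_ip} {cmd}")
    return "\n".join(lines) + "\n"


# Placeholder addresses, for illustration only.
print(build_script("192.0.2.1", [
    "version",
    "setipmode -b 2 config",
    "setipaddr -b 2 192.0.2.20",
    "showipmode -b 2",
    "showipaddr -b 2",
]), end="")
```

The generated text still has to be saved to a file and run as superuser over the MCNet interface, as described in the steps above.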

Using the `rsh` Command Interactively

A root user on a host or satellite CPU board in the same system as the alarm card can use the rsh command interactively with the CLI commands userpassword or mohuserpassword.

For example:

# rsh alarm_card_MCNet_ipaddress -l userpassword

where alarm_card_MCNet_ipaddress is the MCNet IP address of the alarm card. After the CLI command is accepted, you are prompted for a username and password.

Viewing Alarm Card Logs

The alarm card keeps console logs, event logs, and debugging logs.

Console Logs

The alarm card console logs contain messages received from the host CPU board. There are two types of console logs:

The run log contains the most recent data received from the host CPU operating system. The alarm card always writes to this log. When the run log is full, the alarm card overwrites old data in the run log.

The orun log contains messages printed to the console (1) prior to a host CPU reboot or (2) when the consolerestart command is issued. When either of these events occurs, the alarm card stores the contents of the run log in the orun log, and then clears the run log to store further host CPU operating system messages.

The run and orun logs together can contain up to 16 Kbytes of data.
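
The relationship between the two logs can be modeled in a few lines of Python. This is a sketch of the behavior described above, not the alarm card implementation; the class name ConsoleLogs is hypothetical, and the even split of the 16 Kbytes between the two buffers is an assumption that the text does not confirm.

```python
class ConsoleLogs:
    """Toy model of the run and orun console logs."""

    def __init__(self, run_capacity: int = 8 * 1024):
        # Assumption: half of the 16 Kbytes is available to the run log.
        self.run_capacity = run_capacity
        self.run = bytearray()
        self.orun = bytearray()

    def write(self, data: bytes) -> None:
        """Append console output; when full, the oldest data is lost."""
        self.run += data
        overflow = len(self.run) - self.run_capacity
        if overflow > 0:
            del self.run[:overflow]

    def restart(self) -> None:
        """Model a host CPU reboot or the consolerestart command:
        run replaces orun, then run is cleared."""
        self.orun = self.run
        self.run = bytearray()

    def init(self) -> None:
        """Model the init option of consolehistory: clear both logs."""
        self.run = bytearray()
        self.orun = bytearray()


logs = ConsoleLogs(run_capacity=8)   # tiny capacity to show overwriting
logs.write(b"abcdefghij")
print(bytes(logs.run))               # b'cdefghij'
logs.restart()
print(bytes(logs.orun), bytes(logs.run))   # b'cdefghij' b''
```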

To View Console Logs

1. Log in to the alarm card.

2. View a console log with the consolehistory command:

hostname cli> consolehistory [run|orun|init] [index [+|-] n] [pause x]

where index n is the number of lines to display from either the oldest log entry forward (positive index) or the most recent log entry back (negative index); and pause x is the number of lines to display before pausing (default pause value is 10 lines). For example, to display the contents of the run log, pausing after every 20 lines, enter the following:

hostname cli> consolehistory run pause 20

If no options are specified, the consolehistory command prints out the entire contents of all nonempty console logs.

You can use the consolehistory init command to clear both the run and orun logs.

Event Logs

The alarm card event log contains event history, that is, all events that change the state of the system. The log entries are stored in a circular buffer in the alarm card RAM. The buffer holds up to 2,048 log entries and is cleared if the alarm card is reset.

A log entry includes the time of the event, a hostname, a unique event ID, and a description of the event. For example:

hostname cli> loghistory

Feb 3 02:38:10 netract: 0009: Alarm Card Booted

Feb 3 02:38:11 netract: 0004: ENET2 now DOWN

Feb 3 02:39:57 netract: 0022: User netract Logged on

...
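
If you capture loghistory output, for example through a script, the four fields of each entry can be separated with a regular expression. The Python sketch below is based only on the sample lines above; the names LogEntry and parse_entry are hypothetical, and entries that use a different layout would not match.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    hostname: str
    event_id: str     # kept as text to preserve leading zeros
    description: str


# Layout taken from the sample output: "Feb 3 02:38:10 netract: 0009: text"
_ENTRY = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(\S+?):\s+(\d+):\s+(.*)$"
)


def parse_entry(line: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None if the line does not match."""
    match = _ENTRY.match(line.strip())
    return LogEntry(*match.groups()) if match else None


entry = parse_entry("Feb 3 02:38:11 netract: 0004: ENET2 now DOWN")
print(entry.event_id, entry.description)   # 0004 ENET2 now DOWN
```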

To View the Event Log

1. Log in to the alarm card.

2. View an event log with the loghistory command:

hostname cli> loghistory [index [+|-] n] [pause n]

where index n is the number of lines to display from either the oldest log entry forward (positive index) or the most recent log entry back (negative index); and pause n is the number of lines to display before pausing (the default is to display the entire log without pausing). For example, to display the last 30 lines of the event log, enter the following:

hostname cli> loghistory index -30
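
The effect of a positive or negative index can be illustrated with list slicing. This Python sketch mirrors the description above and is not the CLI implementation; the function name select_entries is hypothetical, and the behavior for an index of 0 is not documented, so the sketch simply returns nothing in that case.

```python
from typing import List, Optional


def select_entries(entries: List[str], index: Optional[int] = None) -> List[str]:
    """Return the entries that an index value would display.
    `entries` is ordered from oldest to most recent."""
    if index is None:
        return list(entries)        # no index: the entire log
    if index >= 0:
        return entries[:index]      # from the oldest entry forward
    return entries[index:]          # from the most recent entry back


log = [f"event {n}" for n in range(1, 101)]   # 100 sample entries
print(select_entries(log, -30)[0])   # event 71
print(select_entries(log, 5)[-1])    # event 5
```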

Debugging Log

The alarm card debugging log contains the name of the last key process that exited and caused a reset of the alarm card, or it is empty. For example:

hostname cli> debuglog

===== Alarm Card Debug Log =====

Jan  1 21:18:42 'telnetd' exited

Debugging log data remains in the alarm card flash until:

It is overwritten by another key process exiting, or

You clear the log using the debuglog reset command.

To View the Debugging Log

1. Log in to the alarm card.

2. View the debugging log with the debuglog command:

hostname cli> debuglog

If there is no process information output from this command, the debugging log has been cleared. Otherwise, information on the last process that caused the alarm card reset is displayed.

To Clear the Debugging Log

1. Log in to the alarm card.

2. Clear the debugging log with the debuglog reset command:

hostname cli> debuglog reset

Booting CPU Boards

Host and satellite CPU boards can boot from a local disk or over the network.

Boot Device Variables

By default, the OpenBoot PROM NVRAM boot-device configuration variable is set to disk net, disk being an alias for the path to the local disk, and net being an alias for the path of the primary network. You can set the boot device for CPU boards through the alarm card CLI setfru command. Refer to Configuring a Chassis Slot for a Board for more information on using the setfru command to specify a boot device for a board.

When the alarm card powers on a board in a slot, the OpenBoot PROM firmware checks with the alarm card for a boot device for that slot. The alarm card sends the value from the Boot_Devices field in FRU ID to the OpenBoot PROM firmware; the value is either the boot device list for that slot you set using the setfru command or a null string if you did not set a boot device list for that slot. The value overwrites the NVRAM boot-device value.

In the event of an alarm card fault, a CPU board hot-swap, power cycle, reboot, or reset causes the OpenBoot PROM firmware to default to the value set in the NVRAM boot-device variable.

Booting With a DHCP Server

You can configure Netra CT CPU boards to boot over DHCP. This process includes setting the CPU board boot device for DHCP, forming the CPU board DHCP client ID, and configuring the DHCP server.

On the Netra CT system, the DHCP client ID is a combination of the system's midplane Sun part number (7 bytes), the system's midplane Sun serial number (6 bytes), and the board's geographical address (slot number) (2 bytes). The parts are separated by a : (colon).

To Configure a CPU Board to Boot Over DHCP

1. Log in to the alarm card.

2. Set the boot device for the board to dhcp with the setfru command:

hostname cli> setfru slot fru_instance Boot_Devices network_devicename:dhcp

where fru_instance is the slot number of the board to be configured for DHCP and network_devicename is a path or alias to a network device. For example, to set the boot device to dhcp for the CPU board in slot 4, enter the following:

hostname cli> setfru slot 4 Boot_Devices net:dhcp

3. Get the Netra CT system part number and the system serial number with the showfru command:

hostname cli> showfru midplane 1 Sun_Part_No

...

hostname cli> showfru midplane 1 Sun_Serial_No

...

4. Form the three-part client ID by using the system part number, the system serial number, and the slot number, separated by colons.

For example, if the output from the showfru commands in Step 3 is 375-4335 (Sun part number) and 000001 (Sun serial number), and you want to form the client ID for the CPU board in slot 4, the client ID is: 3754335:000001:04.

5. Translate the client ID to its ASCII equivalent, that is, the hexadecimal ASCII code of each character. For example:

Client ID part	ASCII Representation
3754335	33 37 35 34 33 33 35
:	3A
000001	30 30 30 30 30 31
:	3A
04	30 34

Thus, the example client ID in ASCII is:

33 37 35 34 33 33 35 3A 30 30 30 30 30 31 3A 30 34.

6. Configure the DHCP server.

Refer to the Solaris DHCP Administration Guide for information on how to configure the DHCP server for remote boot and diskless boot clients.

The client ID is retained across a CPU board power cycle, reboot, or reset; the alarm card updates the client ID during a first-time power on or a hot-swap of a CPU board. In the event of an alarm card fault, a CPU board reboot or reset will retrieve the previously written client ID.
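
Steps 4 and 5 of the procedure above can be checked with a short Python sketch. The function names are hypothetical; the sketch assumes that the hyphen in the part number is dropped and that the slot number is padded to two digits, as in the example.

```python
def dhcp_client_id(part_no: str, serial_no: str, slot: int) -> str:
    """Join part number, serial number, and slot number with colons."""
    part = part_no.replace("-", "")        # 375-4335 becomes 3754335
    return f"{part}:{serial_no}:{slot:02d}"


def to_ascii_hex(client_id: str) -> str:
    """Return the hexadecimal ASCII code of each character."""
    return " ".join(f"{byte:02X}" for byte in client_id.encode("ascii"))


cid = dhcp_client_id("375-4335", "000001", 4)
print(cid)                 # 3754335:000001:04
print(to_ascii_hex(cid))   # 33 37 35 34 33 33 35 3A 30 30 30 30 30 31 3A 30 34
```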

Connecting to CPU Board Consoles From the Alarm Card

The Netra CT system provides the capability to connect to CPU boards and open console sessions from the alarm card.

You begin by logging in to the alarm card through either the serial port or the Ethernet port. Once a console session with a CPU board is established, you can run Solaris system administration commands, such as passwd, read status and error messages, or halt the board in that particular slot.

Configuring Your System for Multiple Console Use

To enable your system to use multiple consoles, you set several variables, either at the Solaris level or at the OpenBoot PROM level. Set these variables on each CPU board to enable console use.

To Configure Your System for Multiple Consoles

1. Log in as superuser to the CPU board, using the on-board console port ttya.

2. Enter either of the following sets of commands to enable multiple consoles:

From the Solaris level:

# eeprom "multiplexer-output-devices=ttya ssp-serial"

# eeprom "multiplexer-input-devices=ttya ssp-serial"

# eeprom input-device=input-mux

# eeprom output-device=output-mux

# reboot

From the OpenBoot PROM level:

ok setenv multiplexer-output-devices ttya ssp-serial

ok setenv multiplexer-input-devices ttya ssp-serial

ok setenv input-device input-mux

ok setenv output-device output-mux

ok reset-all
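
Both command sets above assign the same four variables. The Python sketch below makes that explicit by generating each set from one table of settings; it only builds command strings, and the function names are hypothetical.

```python
# The four variables set by both procedures above.
SETTINGS = [
    ("multiplexer-output-devices", "ttya ssp-serial"),
    ("multiplexer-input-devices", "ttya ssp-serial"),
    ("input-device", "input-mux"),
    ("output-device", "output-mux"),
]


def solaris_commands() -> list:
    """Commands entered at the Solaris superuser prompt."""
    cmds = []
    for name, value in SETTINGS:
        arg = f"{name}={value}"
        if " " in value:
            arg = f'"{arg}"'   # quote values that contain a space
        cmds.append(f"eeprom {arg}")
    cmds.append("reboot")
    return cmds


def openboot_commands() -> list:
    """Commands entered at the OpenBoot PROM ok prompt."""
    return [f"setenv {name} {value}" for name, value in SETTINGS] + ["reset-all"]


for cmd in solaris_commands():
    print("# " + cmd)
for cmd in openboot_commands():
    print("ok " + cmd)
```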

Establishing Console Sessions Between the Alarm Card and CPU Boards

Once you have configured your system for multiple console use, you can log in to the alarm card and open a console for a slot. The Netra CT system allows four console users per slot.

TABLE 3-3 shows the alarm card CLI console-related commands that can be executed from the current login session on the alarm card.

TABLE 3-3 Alarm Card CLI Console-Related Commands
Command	Description
`console` cpu_node	Enter console mode and connect to a specified CPU board, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410 server. If cpu_node is not specified, connect to the host CPU board.
`break` cpu_node	Put the specified CPU board in debug mode, where cpu_node can be 1 to 8 on a Netra CT 810 or 1 to 5 on a Netra CT 410 server. Debug mode can use OpenBoot PROM or `kadb`, depending on server configuration.
`setescapechar` value	Set the escape character to be used in all future console sessions. The default is ~ (tilde). Refer to TABLE 3-4 for escape character use.
`showescapechar`	Show the current escape character.

Most CPU board consoles use the MCNet bus, but a board at the OpenBoot PROM level connects over the IPMI bus. There can be only one console user on the IPMI bus at any one time.

For example, if the board in slot 4 is at the OpenBoot PROM level, a user opening a console session connects to it over the IPMI bus. That session fully occupies the IPMI bus, so no other user can connect over that bus; if another user tries, an error message displays. However, other users can connect to boards in other slots over the MCNet bus. The MCNet bus is faster than the IPMI bus, while the IPMI bus is typically a more stable communication channel than the MCNet bus.
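The bus rule described above can be modeled in a few lines. The following Python sketch is a toy model only, not alarm card software: the class name ConsoleBuses and the error text are hypothetical, and the model assumes that a board at the Solaris level always uses MCNet.

```python
class ConsoleBuses:
    """Toy model: MCNet is shared, IPMI carries one console at a time."""

    def __init__(self):
        self.ipmi_slot = None  # slot currently using the IPMI bus, if any

    def connect(self, slot, at_obp):
        """Return the bus used for a new console session to `slot`."""
        if not at_obp:
            return "MCNET"
        if self.ipmi_slot is not None:
            # Hypothetical message; the real error text is not given in this guide.
            raise RuntimeError(f"IPMI bus is in use by slot {self.ipmi_slot}")
        self.ipmi_slot = slot
        return "IPMI"

    def disconnect(self, slot):
        """Release the IPMI bus if `slot` was holding it."""
        if self.ipmi_slot == slot:
            self.ipmi_slot = None
```

In this model, a session to slot 4 at the OpenBoot PROM level takes the IPMI bus, a second OpenBoot PROM session to another slot is refused until the first one disconnects, and Solaris-level sessions are unaffected.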

Once you have a console connection with a CPU board, you can issue normal Solaris commands. There are several escape character sequences to control the current session. TABLE 3-4 shows these sequences.

TABLE 3-4 CPU Board Console-Related Escape Character Sequences
Sequence	Description
`~b`	Break from the Solaris level and enter the OpenBoot PROM (debug) level.
`~.`	End the console session.
`~g`	Determine the status (MCNet or IPMI) of the current console.
`~t`	Toggle between MCNet and IPMI.

To Start a Console Session From the Alarm Card

1. Log in to the alarm card.

You can log in to the alarm card through a terminal attached to either the serial port connection or the Ethernet port connection.

2. Open a console session to a board in a slot:

hostname cli> console cpu_node

where cpu_node is 1 to 8 on a Netra CT 810 system or 1 to 5 on a Netra CT 410 system. For example, to open a console to the board in slot 4, enter the following:

hostname cli> console 4

You now have access to the board in slot 4. Depending on the state of the board in that particular slot, and whether the previous user logged out of the shell, you see one of several prompts:

console login: - Solaris level

# - Solaris level, previous user logged in as superuser, and did not log out before disconnecting from the console

ok - OpenBoot PROM level, previous user did not log out before disconnecting from the console

To Determine the Status of the Current Console

Enter the escape sequence ~g at the start of a new line:

~g

A message displays, indicating the current state of the console connection. The message is either:

Console mode is IPMI

This means the console is in Solaris mode or OpenBoot PROM mode.

Or the message might be:

Console mode is MCNET

This means the console is in Solaris mode.

To Toggle Between MCNet and IPMI

Toggling between MCNet and IPMI could be useful for troubleshooting. For example, if the console stops working for some reason, you could try toggling to IPMI (the more reliable communication channel).

1. If the CPU board is in Solaris mode, enter the escape sequence ~t:

# ~t

New console mode is IPMI

The console switches between MCNet and IPMI mode. The console now fully occupies the IPMI bus. No other console may be at the OpenBoot PROM level at the same time. If another user attempts to access a board that is occupying the IPMI bus, the console connection will fail.

2. To return to MCNet mode, enter ~t again and press enter:

# ~t

New console mode is MCNET

To Break into OpenBoot PROM from the Console

At the Solaris prompt, enter the escape sequence ~b:

# ~b

The console mode switches to IPMI:

New console mode is IPMI

Type `go' to resume

ok

You can now debug from the OpenBoot PROM level.

To End the Console Session

1. (Optional) Log out of the Solaris shell.

2. At the prompt, disconnect from the console by entering the escape sequence ~. (tilde period):

prompt ~.

hostname cli>

Disconnecting from the console does not automatically log you out from the remote host. Unless you log out from the remote host, the next console user who connects to that board sees the shell prompt of your previous session.

To Show the Current Escape Character

At the alarm card prompt, enter the following command:

hostname cli> showescapechar

The current escape character is displayed:

hostname cli> escape_char: value

To Change the Default Escape Character

At the alarm card prompt, enter the following command:

hostname cli> setescapechar value

where value is any printable character. For example, to change the default escape character from ~ (tilde) to # (pound sign), enter the following:

hostname cli> setescapechar #

The pound sign is now the escape character for all future console sessions.
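As an illustration of how the escape character and the sequences in TABLE 3-4 fit together, the following Python sketch maps a two-character sequence to its action. It assumes that changing the escape character changes only the first character of each sequence; the function name interpret and the action strings are hypothetical.

```python
# Actions from TABLE 3-4, keyed by the character that follows the escape character.
ACTIONS = {
    "b": "break to OpenBoot PROM",
    ".": "end console session",
    "g": "show console mode",
    "t": "toggle MCNet/IPMI",
}


def interpret(sequence, escape_char="~"):
    """Return the action for a two-character sequence, or None for ordinary input."""
    if len(sequence) != 2 or sequence[0] != escape_char:
        return None
    return ACTIONS.get(sequence[1])
```

For example, interpret("~.") names the end-of-session action with the default escape character, while interpret("#t", escape_char="#") names the toggle action after the escape character has been changed to the pound sign.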

Using the PMS Application for Recovery and Control of CPU Boards

This section describes specifying recovery operations and controlling CPU boards through the alarm card PMS CLI commands.

Recovery Configuration of a CPU Board From the Alarm Card

You specify the recovery configuration of a CPU board by using the command pmsd operset -s slot_num|all (a single slot number or all slots in the Netra CT system containing a CPU board) and the recovery mode for the specified slot(s).

The recovery configuration can be maintenance mode, operational mode, or none mode. Maintenance mode means the alarm card's automatic recovery of a CPU board is disabled, and PMS applications are started in an offline state, so that you can use manual maintenance operations. Operational mode means the alarm card's automatic recovery of a CPU board is enabled; the alarm card will recover the CPU board in the event of a monitoring fault, and start PMS applications in an active state. None mode means the alarm card's automatic recovery mode may be manually enabled or disabled; PMS application states are not enforced.

The mode is stored in persistent storage. You specify the operation to be performed on the specified slot by using the option -o with the parameter maint_config (set the hardware, operating system, and applications into maintenance mode), oper_config (set the hardware, operating system, and applications into operational mode), none_config (set the hardware, operating system, and applications into no enforcement mode), or graceful_reboot (bring the applications offline if needed and then reboot the operating system).

To Specify the Recovery Configuration of a CPU Board

1. Log in to the alarm card.

2. Configure the automatic recovery mode with the operset command:

hostname cli> pmsd operset -s slot_num|all -o maint_config|oper_config|none_config|graceful_reboot

where slot_num can be a slot number from 1 to 8, and all specifies all slots containing CPU boards. For example, to make PMS' recovery operational for the entire Netra CT server, enter:

hostname cli> pmsd operset -s all -o oper_config
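If you generate these commands from a script, it can help to validate the slot and operation before sending anything to the alarm card. The following Python sketch only builds the command text; the function name operset_command is hypothetical, and the operation names are those given in the description of the -o option above.

```python
OPERSET_OPERATIONS = ("maint_config", "oper_config", "none_config", "graceful_reboot")


def operset_command(slot, operation):
    """Build a 'pmsd operset' command line for a slot number (1 to 8) or 'all'."""
    valid_slot = slot == "all" or (isinstance(slot, int) and 1 <= slot <= 8)
    if not valid_slot:
        raise ValueError("slot must be an integer from 1 to 8 or 'all'")
    if operation not in OPERSET_OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return f"pmsd operset -s {slot} -o {operation}"


if __name__ == "__main__":
    print(operset_command("all", "oper_config"))
```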

Printing PMS Recovery Configuration Information

The pmsd infoshow -s slot_num|all command prints the recovery configuration and the alarm status for that configuration.

The pmsd historyshow -s slot_num|all command can be used to print a recovery configuration and runtime message log. The log is printed to the terminal performing the operation.

Detailed Recovery of a Board in Case of Fault

You can perform detailed, manual recovery operations on a board or instruct PMS to perform detailed, automatic recovery operations on a board using the CLI. The operations are performed across the hardware, the operating system, and the applications.

For manual recovery, use the pmsd recoveryoperset -s slot_num|all command. This command can only be run when the board is in maintenance mode or none mode (PMS applications are offline). You specify the recovery operation to be performed on the specified slot by using the option -o with the parameters: pc (power cycle), rst (reset), rstpc (reset, then power cycle), pd (power down), or rb (reboot).

For automatic recovery, use the pmsd recoveryautooperset -s slot_num|all command. This command instructs PMS what to do in response to a fault when the board is in operational mode (PMS applications are active).

You specify the automatic recovery operation to be performed on the specified slot by using the option -o with the parameters: pc (power cycle), rst (reset), rstpc (reset, then power cycle), pd (power down), rb (reboot), rbpc (reboot, then power cycle), none (no recovery), or trg (manually simulate a fault to trigger a recovery).

Optional parameters for automatic recovery are:

-d startup delay (the time in deciseconds between a fault occurrence and the start of a recovery operation; the default is 0 deciseconds)

-f failure power off|on (whether a power down operation will occur if the recovery operation fails; on specifies that power down will occur and off specifies that power down will not occur; the default is off)

-r retries (the number of times a recovery operation can occur and fail before it is terminated; the default is one try)

-n inter-operation delay (the time in deciseconds between one operation and the next for an operation with multiple retries; the default is 0 deciseconds)

-p reset power-cycle delay (the time in deciseconds to be waited between the reset and power cycle portions of the recovery operation before a failed reset is declared and the power cycle portion of the operation starts; the default is 0 deciseconds)

To Manually Recover a Board

1. Log in to the alarm card.

2. Perform manual recovery operations on a board with the recoveryoperset command:

hostname cli> pmsd recoveryoperset -s slot_num|all -o pc|rst|rstpc|pd|rb

where slot_num can be a slot number from 1 to 8, and all specifies all slots containing CPU boards. For example, to instruct PMS to reboot slot 5 after a fault, enter the following:

hostname cli> pmsd recoveryoperset -s 5 -o rb

To Automatically Recover a Board

1. Log in to the alarm card.

2. Configure automatic recovery operations on a board with the recoveryautooperset command:

hostname cli> pmsd recoveryautooperset -s slot_num|all -o pc|rst|rstpc|pd|rb|rbpc|none|trg [-d startup delay][-f failure power on|off][-r retries][-n inter-operation delay][-p reset power cycle delay]

where slot_num can be a slot number from 1 to 8, and all specifies all slots containing CPU boards. For example, to instruct PMS to automatically reboot slot 5 after a fault, with the default delays, retries, and failure power state, enter the following:

hostname cli> pmsd recoveryautooperset -s 5 -o rb
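The delay options are given in deciseconds (tenths of a second), which is easy to get wrong when you think in seconds. The following Python sketch converts seconds to deciseconds and assembles a recoveryautooperset command line. It assumes each optional flag is followed by a single plain value, as the synopsis above suggests; the function and parameter names are hypothetical.

```python
AUTO_OPERATIONS = ("pc", "rst", "rstpc", "pd", "rb", "rbpc", "none", "trg")


def to_deciseconds(seconds):
    """Convert seconds to deciseconds (1 second = 10 deciseconds)."""
    return round(seconds * 10)


def recoveryautooperset_command(slot, operation, startup_delay_s=None,
                                failure_power=None, retries=None,
                                inter_op_delay_s=None, reset_pc_delay_s=None):
    """Build a 'pmsd recoveryautooperset' command line; unset options are omitted."""
    if operation not in AUTO_OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    if failure_power not in (None, "on", "off"):
        raise ValueError("failure_power must be 'on' or 'off'")
    parts = ["pmsd", "recoveryautooperset", "-s", str(slot), "-o", operation]
    if startup_delay_s is not None:
        parts += ["-d", str(to_deciseconds(startup_delay_s))]
    if failure_power is not None:
        parts += ["-f", failure_power]
    if retries is not None:
        parts += ["-r", str(retries)]
    if inter_op_delay_s is not None:
        parts += ["-n", str(to_deciseconds(inter_op_delay_s))]
    if reset_pc_delay_s is not None:
        parts += ["-p", str(to_deciseconds(reset_pc_delay_s))]
    return " ".join(parts)
```

With no optional arguments the sketch reproduces the example above for slot 5; a startup delay of 2.5 seconds is emitted as -d 25.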

Printing PMS Automatic Recovery Information

The pmsd recoveryautoinfoshow -s slot_num|all command prints the configuration set by the recoveryautooperset command.

Monitoring and Controlling a CPU Board's Resources From the Alarm Card

PMS can perform operations on a board's hardware, the operating system, and applications. You can specify that PMS performs operations on one of these, rather than all.

Hardware Operations

The pmsd hwoperset -s slot_num|all command performs operations on the hardware. The operations can only be performed in maintenance or none mode unless the optional -f parameter is used. You specify the operation to be performed on the specified slot by using the option -o with the parameters: powerdown (set the hardware to the power-off state), powerup (set the hardware to the power-on state), reset (reset the hardware), mon_enable (enable health monitoring of the hardware), or mon_disable (disable health monitoring of the hardware). The optional -f parameter can be used to perform the operation even if applications are in the active state, and the slot is in operational mode.

The pmsd hwinfoshow -s slot_num|all command can be used to print PMS system information on the hardware state, monitoring status, and alarm status (whether an alarm was generated).

The pmsd hwhistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the hardware's operation. The log is printed to the terminal performing the operation.

Operating System Operations

The pmsd osoperset -s slot_num|all command performs operations on the operating system. The operations can only be performed in maintenance or none mode unless the optional -f parameter is used. You specify the operation to be performed on the specified slot by using the option -o with the parameters: reboot (reboot the operating system), mon_enable (enable health monitoring of the operating system), or mon_disable (disable health monitoring of the operating system). The optional -f parameter can be used to perform the operation even if applications are in the active state, and the slot is in operational mode.

The pmsd osinfoshow -s slot_num|all command can be used to print PMS system information on the operating system state, monitoring status, and alarm status (whether an alarm was generated).

The pmsd oshistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the operating system's operation. The log is printed to the terminal performing the operation.

Application Operations

The pmsd appoperset -s slot_num|all command performs operations on the applications. You specify the operation to be performed on the specified slot by using the option -o with the parameters: force_offline (force the applications to an offline state), vote_active (move the group of applications to the active state only if all of the applications agree to be moved), or force_active (force the applications to the active state).

The pmsd appinfoshow -s slot_num|all command can be used to print PMS system information on the applications' state and alarm status (whether an alarm was generated).

The pmsd apphistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the applications' operation. The log is printed to the terminal performing the operation.
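The rule that hardware and operating system operations need -f when the slot is in operational mode can be sketched as follows. This Python example is illustrative only: the mode labels (maintenance, operational, none), the function names, and the position of -f at the end of the command line are assumptions, because this section does not give a full synopsis for these commands.

```python
# Operations listed in this section for each pmsd command.
OPERATIONS = {
    "hwoperset": {"powerdown", "powerup", "reset", "mon_enable", "mon_disable"},
    "osoperset": {"reboot", "mon_enable", "mon_disable"},
    "appoperset": {"force_offline", "vote_active", "force_active"},
}


def needs_force(command, mode):
    """Return True when -f is needed: a hardware or OS operation in operational mode."""
    if command not in OPERATIONS:
        raise ValueError(f"unknown command: {command}")
    return command in ("hwoperset", "osoperset") and mode == "operational"


def resource_command(command, slot, operation, mode):
    """Build the pmsd command line, appending -f only when the slot mode requires it."""
    force = needs_force(command, mode)  # also validates the command name
    if operation not in OPERATIONS[command]:
        raise ValueError(f"{operation} is not a {command} operation")
    line = f"pmsd {command} -s {slot} -o {operation}"
    # Assumption: -f is accepted after the -o option.
    return line + " -f" if force else line
```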

Printing Other PMS Information

The pmsd version command prints the current version of pmsd.

The pmsd usage command prints a synopsis of the pmsd commands.

Using the Netra High Availability Suite With the Netra CT Server Applications

The Netra High Availability (HA) Suite software provides enhanced services for customer high-availability applications. When installed, it runs on the host and satellite CPU boards. The Netra HA Suite provides reliable (redundant) services across CPU boards; you can fail over from one CPU board in one Netra CT system to another CPU board in another Netra CT system.

The MOH and PMS applications integrate with these Netra HA Suite foundation services: reliable NFS, reliable DHCP/boot server, and CGTP (Carrier-Grade Transport Protocol, providing IP packet services).

The MOH application has to manage these services, for example, monitoring the nfs and tftp daemons. It does this through the node manager agent (NMA). For example, if there is an NFS failure, the MOH application will detect this failure.

The points of interaction between the Netra CT server software and the Netra HA Suite are:

MOH software modules interact with the Netra HA Suite Process Monitor Daemon (PMD) and NMA

MOH I/O interfaces interact with the Netra HA Suite NMA and CGTP

PMS interacts with the Netra HA Suite probe

The Netra HA Suite starts RNFS, RDHCP, and CGTP by default. If you want to change the Netra HA Suite services that are started by default, configure the Process Monitor Daemon (PMD). Refer to the Netra HA Suite documentation for more information on how to do this.

The Netra CT PMS probe brings together the PMS partner list and the Netra HA Suite master and vice-master cluster. Refer to the pms API man pages for more information on partner lists; the man pages are installed by default in /opt/SUNWnetract/mgmt2.0/man.

Refer to the Netra HA Suite documentation for more information on this application.

Monitoring Your System

This section describes various ways to monitor your system.

Command-Line Interface Information

The alarm card CLI provides many commands to display system status. Refer to the alarm card CLI commands in the section, Using the Alarm Card Command-Line Interface, in particular the show commands, to view system status. The alarm card also keeps several logs; refer to Viewing Alarm Card Logs for more information.

LED Information

The system status panel is a module designed to give feedback on the status of the key components within the Netra CT server. The system status panel has one set of LEDs for each component within that particular server. FIGURE 3-1 shows the LEDs on the system status panel for the Netra CT 810 server, and FIGURE 3-2 shows the LEDs on the system status panel for the Netra CT 410 server.

FIGURE 3-1 System Status Panel - Netra CT 810 Server

Picture showing LEDs on the Netra CT 810 server system status panel.

TABLE 3-5 describes the system status panel LEDs for the Netra CT 810 server.

TABLE 3-5 System Status Panel LEDs for the Netra CT 810 Server
LED	LEDs Available	Component
HDD 0	Power and Okay to Remove	Upper hard drive
HDD 1	Power and Okay to Remove	Lower hard drive
Slot 1	Power and Okay to Remove	Host CPU board installed in slot 1
Slots 2-7	Power and Okay to Remove	I/O boards or satellite CPU boards installed in slots 2-7
Slot 8	Power and Okay to Remove	Alarm card installed in slot 8
SCB	Power and Fault	System controller board (behind the system status panel)
FAN 1	Power and Fault	Upper fan tray (behind the system status panel)
FAN 2	Power and Fault	Lower fan tray (behind the system status panel)
RMM	Power and Okay to Remove	Removable media module
PDU 1	Power and Fault (DC only)	Left power distribution unit (behind the server)
PDU 2	Power and Fault (DC only)	Right power distribution unit (behind the server)
PSU 1	Power and Okay to Remove	Left power supply unit
PSU 2	Power and Okay to Remove	Right power supply unit

FIGURE 3-2 System Status Panel - Netra CT 410 Server

Picture showing LEDs on the Netra CT 410 server system status panel.

TABLE 3-6 describes the system status panel LEDs for the Netra CT 410 server.

TABLE 3-6 System Status Panel LEDs for the Netra CT 410 Server
LED	LEDs Available	Component
Slot 1	Power and Okay to Remove	Alarm card installed in slot 1
Slot 2	Power and Okay to Remove	I/O board or satellite CPU board installed in slot 2
Slot 3	Power and Okay to Remove	Host CPU board installed in slot 3
Slots 4 and 5	Power and Okay to Remove	I/O boards or satellite CPU boards installed in slots 4 and 5
HDD 0	Power and Okay to Remove	Hard drive
SCB	Power and Fault	System controller board (behind the system status panel)
FAN 1	Power and Fault	Upper fan tray (behind the system status panel)
FAN 2	Power and Fault	Lower fan tray (behind the system status panel)
FTC	Power and Fault	Host CPU front transition card or host CPU front termination board
PDU 1	Power and Fault (DC only)	Power distribution unit (behind the server)
PSU 1	Power and Okay to Remove	Power supply

Each major component in the Netra CT 810 server or Netra CT 410 server has a set of LEDs on the system status panel that gives the status on that particular component. Each component has either the green Power and the amber Okay to Remove LEDs (FIGURE 3-3), or the green Power and amber Fault LEDs (FIGURE 3-4). Note that the components in the Netra CT servers all have the green Power LED, and they have either the amber Okay to Remove LED or the amber Fault LED, but not both.

FIGURE 3-3 Power and Okay to Remove LEDs

Picture showing Power and Okay to Remove LEDs

FIGURE 3-4 Power and Fault LEDs

Picture showing Power and Fault LEDs

TABLE 3-7 gives the LED states and meanings for any CompactPCI board installed in a slot in the Netra CT 810 server or Netra CT 410 server.

TABLE 3-8 gives the LED states and meanings for any component other than a CompactPCI board that has the green Power and amber Okay to Remove LEDs.

TABLE 3-9 gives the LED states and meanings for any component other than a CompactPCI board that has the green Power and amber Fault LEDs.

TABLE 3-7 CompactPCI Board LED States and Meanings
Green Power LED state	Amber Okay to Remove LED state	Meaning	Action
Off	Off	The slot is empty or the system thinks that the slot is empty because the system didn't detect the board when it was inserted.	If there is a board installed in this slot, then one of the following components is faulty: the board installed in the slot, the alarm card, or the system controller board. Remove and replace the failed component to clear this state.
Blinking	Off	The board is powering on or off.	Do not remove the board in this state.
On	Off	The board is up and running.	Do not remove the board in this state.
Off	On	The board is powered off.	You can remove the board in this state.
Blinking	On	The board is powered on, but it is offline for some reason (for example, a fault was detected on the board).	Wait several seconds to see if the green Power LED stops blinking. If it does not stop blinking after several seconds, enter `cfgadm -al` and verify that the board is in the unconfigured and disconnected state. Power off the slot through the alarm card software, then remove the board.
On	On	The board is powered on and is in use, but a fault has been detected on the board.	Deactivate the board using one of the following methods: use the `cfgadm -f -c unconfigure` command to deactivate the board (note that in some cases this might cause the system to panic, depending on the nature of the board hardware or software); or halt the system and power off the slot through the alarm card software, then remove the board. The green Power LED will then give status information: if the green Power LED goes off, you can remove the board; if the green Power LED remains on, you must halt the system and power off the slot through the alarm card software.
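TABLE 3-7 can be read as a lookup keyed on the two LED states. The following Python sketch encodes the table with shortened meanings and flags the one state in which the table says the board can be removed; the dictionary name, the function name, and the wording of the short meanings are illustrative only.

```python
# (green Power LED, amber Okay to Remove LED) -> short meaning from TABLE 3-7
CPCI_LED_STATES = {
    ("off", "off"): "slot empty or board not detected",
    ("blinking", "off"): "board powering on or off",
    ("on", "off"): "board up and running",
    ("off", "on"): "board powered off",
    ("blinking", "on"): "board powered on but offline",
    ("on", "on"): "board in use, fault detected",
}


def safe_to_remove(power, okay_to_remove):
    """True only for the state TABLE 3-7 marks as removable (Power off, Okay to Remove on)."""
    if (power, okay_to_remove) not in CPCI_LED_STATES:
        raise ValueError("LED states must be 'off', 'blinking', or 'on' for Power and 'off' or 'on' for Okay to Remove")
    return (power, okay_to_remove) == ("off", "on")
```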

TABLE 3-8 Meanings of Power and Okay to Remove LEDs
LED State	Power LED	Okay to Remove LED
On, Solid	Component is installed and configured.	Component is Okay to Remove. You can remove the component from the system, if necessary.
On, Flashing	Component is installed but is unconfigured or is going through the configuration process.	Not applicable.
Off	Component was not recognized by the system or is not installed in the slot.	Component is not Okay to Remove. Do not remove the component while the system is running.

TABLE 3-9 Meanings of Power and Fault LEDs
LED State	Power LED	Fault LED
On, Solid	Component is installed and configured.	Component has failed. Replace the component.
On, Flashing	Component is installed but is unconfigured or is going through the configuration process.	Not applicable.
Off	Component was not recognized by the system or is not installed in the slot.	Component is functioning properly.

A green system power LED and system power button are also located on the system status panel. When the system is off, the system power LED is unlit. Pressing the system power button when the system is off will start the power-on sequence. Once the system is completely powered on, the system power LED remains on.

When the system is powered on, pressing the system power button for less than 4 seconds starts the orderly power-off sequence, which is carried out so that no persistent operating system data structures are corrupted and is indicated by a blinking system power LED. In the orderly power-off, applications in service may be abnormally terminated and no further services will be invoked by the CPU. Once the CPU has reached a quiescent state (run level 0, as if init 0 had been invoked), the power supply or supplies turn off, indicated by the LED changing from a blinking state to the off state.

If the button is held down for 4 seconds or longer, the power supply(s) are turned off without any intervention of the CPU; that is, the "emergency" power-off sequence occurs.
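The power button behavior described above reduces to a threshold on how long the button is held. A minimal Python sketch, with a hypothetical function name and hypothetical return strings:

```python
EMERGENCY_HOLD_SECONDS = 4  # threshold stated in this section


def power_button_action(system_on, held_seconds):
    """Return the sequence started by pressing the system power button."""
    if not system_on:
        return "power-on"
    if held_seconds < EMERGENCY_HOLD_SECONDS:
        return "orderly power-off"
    return "emergency power-off"
```

A press of exactly 4 seconds falls on the emergency side of the threshold, matching the "4 seconds or longer" wording.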

The MOH Application

The MOH collects information about individual field replaceable units (FRUs) in your system and monitors their operational status. MOH can also monitor certain daemons; for example, if you installed the Netra High Availability Suite, MOH monitors daemons through that application.

Starting and Stopping MOH

You must start the MOH application as superuser. The following commands use the default installation directory; if you installed the Solaris patches for MOH in a directory other than the default directory, specify that path instead. To start MOH, enter:

# cd /opt/SUNWnetract/mgmt2.0/bin

# ./ctmgx start [option]

Refer to TABLE 2-6 for the options available with ctmgx start.

To stop MOH, enter:

# cd /opt/SUNWnetract/mgmt2.0/bin

# ./ctmgx stop

Once MOH is running, it interfaces with your SNMP or RMI application to discover network elements, monitor the system, and provide status messages. Refer to the Netra CT Server Software Developer's Guide for information on writing applications to interface with the MOH application.

SNMP Notification of Memory Errors

MOH software generates an SNMP trap if a memory error occurs in a memory module on a CPU board. The trap includes information such as the time stamp, the alarm severity, the specific problem, and a possible response to the particular memory error.

Information for the trap is generated by the cediag tool (/opt/SUNWcest/bin/cediag), which interacts with the dual inline memory modules (DIMMs). The cediag tool provides trap information to the ctmgx agent, which regularly polls the cediag tool for status. The polling period is configurable using the ctmgx.cediag.period parameter in the /opt/SUNWnetract/mgmt2.0/etc/ctmgx.conf file; the default is 1800000 milliseconds (30 minutes). Setting this parameter too low could result in too many processes running.
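Because the period is expressed in milliseconds, a quick conversion helps when choosing a value. The following Python sketch converts milliseconds to minutes and reads the parameter from configuration text. The exact syntax of ctmgx.conf is not described in this guide, so the key=value layout with # comment lines is an assumption, and both function names are hypothetical.

```python
DEFAULT_PERIOD_MS = 1800000  # default stated in this section


def period_minutes(milliseconds):
    """Convert a polling period from milliseconds to minutes."""
    return milliseconds / 60000


def parse_cediag_period(conf_text, default_ms=DEFAULT_PERIOD_MS):
    """Return ctmgx.cediag.period in milliseconds, or the default if it is not set.

    Assumes one 'key=value' setting per line and '#' comment lines.
    """
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "ctmgx.cediag.period":
            return int(value.strip())
    return default_ms
```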

Additional Troubleshooting Information

For additional troubleshooting information, refer to the Netra CT Server Service Manual.

Hot-Swap on the Netra CT Server

Most FRUs in the Netra CT system are hot-swappable.^[1] Hot-swap, a key feature of the PICMG standard, means that a CompactPCI board that meets the PICMG standard can be reliably inserted into or extracted from a powered and operating CompactPCI platform without affecting the other functions of the platform.

The Netra CT system hot-swap modes are shown in TABLE 3-10.

TABLE 3-10 Netra CT System Hot-Swap Modes
Type of Hot-Swap	Description
Basic	The hardware connection and disconnection process is performed automatically by the hardware, while the software connection process requires user assistance through the `cfgadm`(1M) command.
Full	Both the hardware and the software connection process are performed automatically.
High Availability	High availability hot-swap provides the ability to control the hardware connection process. This provides a higher degree of control than just indicating insertion and extraction of a board. The hardware connection process is controlled by software on high availability systems, such as the Netra CT server.

The Netra CT system is configured for full hot-swap by default. You can change the mode of the slot for the CPU boards and I/O boards to basic or full hot-swap using the cfgadm(1M) command. You might want to change the hot-swap state of a slot to basic, for example, if you need to insert or remove a third-party I/O board that does not have full hot-swap support.

Note that whenever you reboot or power your system on and off, the hot-swap states revert to the default full hot-swap state for all I/O slots. If you configure the alarm card or the CPU boards for basic hot-swap, after a host CPU reboot, alarm card reset, or system power off, the alarm card comes up in a disconnected or unconfigured state; you must reconfigure it with the cfgadm command from the host CPU board.

Complete information on hot-swapping FRUs is contained in the Netra CT Server Service Manual.

How High Availability Hot-Swap Works

By default, the Netra CT server is configured to accept any cPCI FRU unless you specifically set an allowable plug-in for a specific slot. Refer to Configuring a Chassis Slot for a Board for more information.

When a board is inserted into the Netra CT server, the alarm card checks the midplane FRU ID information for allowable FRUs for that slot, then checks the inserted board's FRU ID to make sure the board is allowed in the particular slot. If the board is allowed in the slot, the alarm card powers on the board. If the board is not allowed in the slot, the alarm card does not enable power to the slot.
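The admission check described above can be sketched as a simple membership test. In the following Python example the FRU ID strings and the shape of the allowed-FRU data are hypothetical; the guide does not describe how the midplane FRU ID information is encoded. A slot with no configured list accepts any board, matching the default behavior.

```python
def should_power_on(slot, board_fru_id, allowed_by_slot):
    """Decide whether the alarm card would enable power to the slot.

    allowed_by_slot maps a slot number to the set of FRU IDs allowed there;
    a slot that is missing from the mapping accepts any cPCI FRU.
    """
    allowed = allowed_by_slot.get(slot)
    if allowed is None:
        return True
    return board_fru_id in allowed
```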

If a host or satellite CPU board is in use, that is, has applications currently running, the alarm card CLI power commands, such as poweron or poweroff, will not work for that CPU board.

Hot-Swap With Boards That Don't Support Full Hot-Swap

You might want to change the hot-swap state of a slot from full to basic if you need to insert or remove a third-party I/O board that does not have full hot-swap support.

To determine the current hot-swap state of a slot, use the prtconf(1M) command. To enable or disable a type of hot-swap on a slot, use the cfgadm(1M) command. For many cfgadm commands, you must know the attachment point ID for the I/O slot that you will be working on.

To Determine the Current Hot-Swap State of a Slot

As superuser on the server, enter the command:

# prtconf -v -P

For a Netra CT 810 server, the output is similar to the following:

cphsc, instance #0

            System properties:

                name='instance' type=int items=1

                    value=00000000

                name='default-hotswap-mode' type=string items=1

                    value='full'

            Driver properties:

                name='AL-8-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='IO-7-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='IO-6-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='IO-5-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='IO-4-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='IO-3-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='IO-2-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='CPU-autoconfig' type=string items=1 dev=none

                    value='enabled'

                name='hotswap-mode' type=string items=1 dev=none

                    value='full'

If you see value 'basic' underneath the default-hotswap-mode line, then all of the I/O slots in the Netra CT server have been set to basic hot-swap. You should see value 'disabled' for every I/O slot in the system in this situation.

If you see value 'full' underneath the default-hotswap-mode line, then at least one of the I/O slots in the Netra CT server has been set to full hot-swap. You must look at the entries for individual I/O slots to determine if they have been set to basic or full hot-swap mode in this situation:

If you see value 'enabled' underneath an autoconfig line, then that slot is set to full hot-swap.

If you see value 'disabled' underneath an autoconfig line, then that slot is set to basic hot-swap.
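On a system with many slots, reading the autoconfig entries by eye is error-prone. The following Python sketch applies the rule above to saved prtconf output. It assumes the output has the layout shown in this section (a name= line followed by a value= line); the function name hotswap_modes is hypothetical.

```python
import re


def hotswap_modes(prtconf_text):
    """Map each '<slot>-autoconfig' property to 'full' or 'basic'."""
    modes = {}
    pending = None  # slot name whose value line is expected next
    for raw in prtconf_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines do not break a name/value pair
        name = re.match(r"name='(.+)-autoconfig'", line)
        if name:
            pending = name.group(1)
            continue
        value = re.match(r"value='(enabled|disabled)'", line)
        if value and pending:
            # enabled means full hot-swap, disabled means basic hot-swap
            modes[pending] = "full" if value.group(1) == "enabled" else "basic"
        pending = None
    return modes
```

Run against the sample output above, every slot would map to full because each autoconfig value is enabled.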

To List Attachment Point IDs for I/O Slots

As superuser on the server, enter the command:

# cfgadm

For a Netra CT 810 server, the output is similar to the following:

Ap_Id     Type            Receptacle          Occupant          Condition

AL-8      mcd/fhs         connected           configured        ok

CPU       bridge/fhs      connected           configured        ok

IO-2      stpcipci/fhs    connected           configured        ok

IO-3      unknown         empty               unconfigured      unknown

IO-4      stpcipci/fhs    connected           configured        ok

IO-5      unknown         empty               unconfigured      unknown

IO-6      unknown         empty               unconfigured      unknown

IO-7      unknown         empty               unconfigured      unknown

where the attachment point ID is shown in the first column of the readout; for example, the attachment point ID for I/O slot 2 in a Netra CT 810 server would be IO-2.

For a Netra CT 410 server, the output is similar to the following:

Ap_Id     Type            Receptacle          Occupant          Condition

AL-1      mcd/fhs         connected           configured        ok

CPU       bridge/fhs      connected           configured        ok

IO-2      unknown         empty               unconfigured      unknown

IO-4      stpcipci/fhs    connected           configured        ok

IO-5      stpcipci/fhs    connected           configured        ok

The attachment point ID is shown in the first column (Ap_Id) of the output. For example, the attachment point ID for I/O slot 4 in a Netra CT 410 server is IO-4.
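When only the populated I/O slots are of interest, the listing can be filtered by its columns. The following is a minimal sketch that assumes the five-column layout shown above; the function name occupied_io_slots is hypothetical and is not a command provided with the server.

```shell
# occupied_io_slots (hypothetical helper): read a cfgadm listing on stdin and
# print the Ap_Id of every I/O slot whose Receptacle column reads "connected".
# Assumes the column order shown above: Ap_Id Type Receptacle Occupant Condition.
occupied_io_slots() {
    awk '$1 ~ /^IO-[0-9]+$/ && $3 == "connected" { print $1 }'
}

# Example on a live system (as superuser):
#   cfgadm | occupied_io_slots
```

The header line, the alarm card entry, and the CPU entry are skipped because their first column does not begin with IO-.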

To Disable Full Hot-Swap and Enable Basic Hot-Swap

As root on the server, enter the command:

# cfgadm -x disable_autoconfig ap_id

where ap_id is the attachment point ID of the slot on which you want to enable basic hot-swap.

To Re-Enable Full Hot-Swap

As root on the server, enter the command:

# cfgadm -x enable_autoconfig ap_id

where ap_id is the attachment point ID of the slot on which you want to enable full hot-swap.
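If several slots must be switched at once, the two cfgadm commands above can be generated for a list of attachment point IDs. The following is a sketch under stated assumptions: the function name set_hotswap_cmd is hypothetical, and the function only prints the commands so that they can be reviewed before being run as root.

```shell
# set_hotswap_cmd (hypothetical helper): print, but do not run, the cfgadm
# command that puts each given attachment point into the requested mode.
#   basic -> cfgadm -x disable_autoconfig ap_id
#   full  -> cfgadm -x enable_autoconfig ap_id
set_hotswap_cmd() {
    if [ $# -lt 2 ]; then
        echo "usage: set_hotswap_cmd basic|full ap_id..." >&2
        return 1
    fi
    mode=$1
    shift
    case $mode in
        basic) action=disable_autoconfig ;;
        full)  action=enable_autoconfig ;;
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    for ap_id in "$@"; do
        echo "cfgadm -x $action $ap_id"
    done
}
```

For example, set_hotswap_cmd basic IO-2 IO-4 prints one disable_autoconfig command for each of the two slots. Printing rather than executing is a deliberate choice here, because changing the hot-swap mode affects how the system responds when a board is inserted or removed.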

Footnote 1: Exceptions include the single power supply and the single hard drive in the Netra CT 410 server; a single or lone remaining power supply and a single or lone remaining hard drive in the Netra CT 810 server; and the power distribution units.
