CHAPTER 1

Introduction to System Management Services

This manual describes the System Management Services (SMS) 1.6 software that is available with the Sun Fire high-end server system.

This chapter includes the following sections:


Sun Fire High-End Systems

The system controller (SC) in Sun Fire high-end systems is a multifunction, CP1500- or CP2140-based printed circuit board (PCB) that provides critical services and resources required for the operation and control of the Sun Fire system.

A Sun Fire high-end system is often referred to as the platform. System boards within the platform can be logically grouped together into separately bootable systems called dynamic system domains, or simply domains.

Up to 18 domains can exist simultaneously on a single Sun Fire E25K/15K, and up to 9 domains on the Sun Fire E20K/12K. (Domains are introduced in this chapter and are described in more detail in Chapter 5.) The SMS software lets you control and monitor domains, as well as the platform itself.

The SC provides the following services for the Sun Fire system:

Redundant SCs

There are two SCs within a Sun Fire platform. The SC that controls the platform is referred to as the main SC, while the other SC acts as a backup and is called the spare SC. The software running on the main SC monitors both SCs to determine when an automatic failover should be performed.

Configure the two SCs with the same configuration. This duplication includes the Solaris Operating System (OS), SMS software, security modifications, patch installations, and all other system configurations.



Note - For failover to be supported, both SCs must be configured with identical versions of the Solaris OS and SMS software.
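The version requirement in the preceding note can be checked mechanically before relying on failover. The following Python sketch is illustrative only and is not part of SMS. The ScVersions record and the failover_mismatches function are hypothetical names, and the version strings are assumed to have been collected from each SC by some other means.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScVersions:
    """Software versions recorded for one system controller (hypothetical record)."""
    name: str
    solaris: str
    sms: str


def failover_mismatches(main: ScVersions, spare: ScVersions) -> List[str]:
    """Return one message per version difference; an empty list means the SCs match."""
    problems = []
    if main.solaris != spare.solaris:
        problems.append(
            f"Solaris OS differs: {main.name}={main.solaris}, {spare.name}={spare.solaris}"
        )
    if main.sms != spare.sms:
        problems.append(
            f"SMS differs: {main.name}={main.sms}, {spare.name}={spare.sms}"
        )
    return problems


if __name__ == "__main__":
    sc0 = ScVersions("sc0", "Solaris 10 6/06", "1.6")
    sc1 = ScVersions("sc1", "Solaris 10 6/06", "1.4.1")
    for message in failover_mismatches(sc0, sc1):
        print(message)
```

An empty result indicates only that the two version strings agree. It does not verify the remaining configuration items (security modifications, patches, and so on) that should also be duplicated.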



The failover functionality between the SCs is controlled by daemons running on the main and spare SCs. These daemons communicate across private communication paths built into the Sun Fire platform. Other than the communication between these daemons, there is no special trust relationship between the two SCs.

SMS software packages are installed on the SC. In addition, SMS communicates with the Sun Fire high-end system over an Ethernet connection. See Management Network Services.



Note - SMS 1.6 cannot communicate with SMS 1.4.1 across the I2 network. If one of the SCs is running SMS 1.4.1 and the other is running SMS 1.6, the I2 network tests will fail, and the SCs will communicate instead through high-availability SRAM (HASRAM). For information about the I2 network, see I2 Network.




SMS Features

SMS 1.6 supports Sun Fire high-end domains running the Solaris 8 2/04, Solaris 9 4/04, Solaris 10 3/05, Solaris 10 1/06, and Solaris 10 6/06 OSs. SMS 1.6 supports the Solaris 10 1/06, Solaris 10 6/06, Solaris 9 4/04, Solaris 9 9/04, and Solaris 9 9/05 OSs on the system controllers. The commands provided with the SMS software can be used remotely.



Note - The supported firmware version for SMS 1.6 is 5.2.0.





Note - Graphical user interfaces for many of the commands in SMS are provided by the Sun™ Management Center. For more information, see Sun Management Center.



SMS enables the platform administrator to perform the following tasks:

In addition, SMS:

SMS enables the domain administrator to perform the following tasks:

Features Provided in Previous Releases of SMS

Previous SMS releases provided the following:

New Features Provided in SMS 1.6 Release

SMS 1.6 provides the following new features:

VCMON

A voltage core monitoring parameter (VCMON) was added to the SMS software. When VCMON is enabled, it monitors any voltage changes or drifts on the processors. If VCMON detects an upward change in voltage (which usually indicates a socket attach issue), it notifies the user with an FMA event and marks the component health status (CHS) of that processor as faulty.
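The idea behind VCMON can be illustrated with a few lines of Python. This is a conceptual sketch only, not the SMS implementation. The function name, the nominal voltage, and the tolerance are all hypothetical values chosen for the example; this manual does not state the thresholds that VCMON actually uses.

```python
from typing import Optional, Sequence


def first_upward_drift(samples: Sequence[float],
                       nominal: float,
                       tolerance: float) -> Optional[int]:
    """Return the index of the first sample above nominal + tolerance, else None.

    Only upward changes are reported, mirroring the description of VCMON above.
    """
    limit = nominal + tolerance
    for index, voltage in enumerate(samples):
        if voltage > limit:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical core-voltage readings for one processor, in volts.
    readings = [1.10, 1.11, 1.12, 1.25]
    hit = first_upward_drift(readings, nominal=1.10, tolerance=0.05)
    if hit is not None:
        print(f"upward drift at sample {hit}: report event and mark processor faulty")
```

In SMS itself, the corresponding reaction is the FMA event and the faulty CHS marking described above.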


System Architecture

SMS uses a distributed client-server architecture. init(1M) starts, and restarts as necessary, one process: ssd(1M). ssd is responsible for monitoring all other SMS processes and restarting them as necessary. See FIGURE 4-1.
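The monitor-and-restart role that ssd plays for the other SMS processes follows a common supervision pattern. The Python sketch below shows one pass of such a loop under stated assumptions: supervise_once and spawn_sleeper are hypothetical names, the children are stand-in Python processes rather than SMS daemons, and nothing here reflects how ssd is actually implemented.

```python
import subprocess
import sys
from typing import Callable, Dict, List


def supervise_once(children: Dict[str, subprocess.Popen],
                   spawn: Callable[[str], subprocess.Popen]) -> List[str]:
    """Restart every child that has exited; return the names that were restarted."""
    restarted = []
    for name, proc in list(children.items()):
        # poll() returns None while the child is still running.
        if proc.poll() is not None:
            children[name] = spawn(name)
            restarted.append(name)
    return restarted


def spawn_sleeper(name: str) -> subprocess.Popen:
    """Stand-in for starting a daemon: a child process that sleeps briefly."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(5)"])


if __name__ == "__main__":
    children = {name: spawn_sleeper(name) for name in ("daemon-a", "daemon-b")}
    children["daemon-a"].kill()          # simulate a crash
    children["daemon-a"].wait()
    print("restarted:", supervise_once(children, spawn_sleeper))
    for proc in children.values():       # clean up the stand-in children
        proc.kill()
        proc.wait()
```

A real supervisor would call the pass repeatedly and would usually limit how often a failing process is restarted. Both refinements are omitted here for brevity.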

The Sun Fire high-end systems platform, the SC, and other workstations communicate over Ethernet. You perform SMS operations by entering commands on the SC console after remotely logging in to the SC from another workstation on the local area network (LAN). You must log in as a user with the appropriate platform or domain privileges if you want to perform SMS operations, such as monitoring and controlling the platform.



Note - If SMS is stopped on the main SC and the spare SC is powered off, the domains shut down gracefully and the platform is powered down. If the spare SC is simply powered off without a shutdown of SMS, SMS will not have time to power off the platform and the domains will crash.



Dual-system controllers are supported within the Sun Fire high-end systems platform. One SC is designated as the primary or main system controller, and the other is designated as the spare system controller. If the main SC fails, the failover capability automatically switches to the spare SC as described in Chapter 12.

Most domain-configurable units (DCUs) are active components. This means that you must check the system state before powering off any DCU.




Caution - Circuit breakers must be on whenever a board is present, including expander boards, whether or not the board is powered on.



For details, see Power Control.


SMS Administration Environment

Administration tasks on the Sun Fire high-end system are secured by group privilege requirements. SMS installs the following 39 UNIX groups to the /etc/group file.

The smsconfig(1M) command enables an administrator to add, remove, and list members of platform and domain groups, as well as set platform and domain directory privileges using the -a, -r, and -l options.

smsconfig can also configure SMS to use alternate group names, including NIS (Network Information Service) managed groups, using the -g option. Group information entries can come from any of the sources for groups specified in the /etc/nsswitch.conf file (refer to nsswitch.conf(4)). For instance, if domain A were known by its domain tag as the Production Domain, an administrator could create an NIS group with the same name and configure SMS to use this group as the domain A administrator group instead of using the default, dmnaadmn. For more information, see Chapter 3, and refer to the smsconfig man page.
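The relationship between default and alternate group names can be pictured as a simple lookup. In the Python sketch below, only dmnaadmn is taken from this chapter. The other names (platadmn, platoper, platsvc, and the dmn[a-r]admn and dmn[a-r]rcfg pattern) are assumptions made for illustration, chosen so that the total comes to 39; consult Chapter 3 for the authoritative list. The function names and the alternate group name are hypothetical.

```python
import string
from typing import Dict, List, Optional

# Assumed platform group names; not listed in this chapter.
PLATFORM_GROUPS = ["platadmn", "platoper", "platsvc"]
DOMAIN_LETTERS = string.ascii_lowercase[:18]  # 'a' through 'r'


def default_sms_groups() -> List[str]:
    """Build the assumed default group names: 3 platform + 18 x 2 domain groups."""
    groups = list(PLATFORM_GROUPS)
    for letter in DOMAIN_LETTERS:
        groups.append(f"dmn{letter}admn")  # domain administrator group
        groups.append(f"dmn{letter}rcfg")  # assumed domain reconfiguration group
    return groups


def domain_admin_group(domain_id: str,
                       alternates: Optional[Dict[str, str]] = None) -> str:
    """Return the administrator group for a domain, honoring alternate names."""
    letter = domain_id.lower()
    if len(letter) != 1 or letter not in DOMAIN_LETTERS:
        raise ValueError(f"invalid domain-id: {domain_id!r}")
    default = f"dmn{letter}admn"
    return (alternates or {}).get(default, default)


if __name__ == "__main__":
    print(len(default_sms_groups()))
    # "production" is a hypothetical alternate group name.
    print(domain_admin_group("A", {"dmnaadmn": "production"}))
```

The mapping here lives only in the script. On a real SC, alternate names are configured with smsconfig -g as described above.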

Network Connections for Administrators

Because the system controller is embedded in the Sun Fire high-end system, and because the administrative model supports multiple administrative privileges (and thus multiple administrators), an administrator must use a remote network connection from a workstation to access the SMS command interfaces that manage the Sun Fire high-end system.




Caution - Shutting down a remote workstation while a tip session to a Sun Fire high-end system SC is active will bring both SCs down to the OpenBoot™ ok prompt. This will not affect the domains, and after powering the remote system back on you can restore the SCs by typing go at the ok prompt. However, you should end all tip sessions before shutting down a remote workstation.



Since the administrators provide information to verify their identity (passwords) and might need to display sensitive data, it is important that the remote network connection be secure. Physical separation of the administrative networks provides some security on the Sun Fire high-end system. Multiple external physical network connections are available on each SC. SMS software supports up to two external network communities.

For more information on Sun Fire high-end system networks, see Management Network Services. For more information on securing the Sun Fire high-end system, see Chapter 2, Using Solaris Security Toolkit to Secure the System Controller.

SMS Operating System

SMS provides a command-line interface (CLI) to the various functions and features the program contains. You can interact with the SC and the domains on a system by using the CLI commands.

For the examples in this guide, the sc-name is sc0 and sms-user is the user-name of the administrator, operator, configurator, or service personnel logged in to the system.

The privileges allotted to the user are determined by the platform or domain groups to which the user belongs. In these examples, the sms-user is assumed to have both platform and domain administrator privileges, unless otherwise noted.

For more information on the function and creation of SMS user groups, see Chapter 3 and refer to the System Management Services (SMS) 1.6 Installation Guide.


To Begin Using the SC

1. Boot the SC.



Note - This procedure assumes that smsconfig -m has already been run. If smsconfig -m has not been run, you will receive the following error when SMS attempts to start, and SMS will exit:




sms: smsconfig(1M) has not been run. Unable to start sms services.

2. Log in to the SC and verify that SMS software startup has completed. Type:


sc0:sms-user:> showplatform

Output similar to the following is displayed if you have platform privileges.


sc0:sms-user:>  showplatform
 
PLATFORM:
========
Platform Type: Sun Fire 15000
 
CSN:
====
Chassis Serial Number: 353A00053
 
COD:
====
Chassis HostID : 5014936C37048
PROC RTUs installed : 8
PROC Headroom Quantity : 0
PROC RTUs reserved for domain A : 4
PROC RTUs reserved for domain B : 0
PROC RTUs reserved for domain C : 0
PROC RTUs reserved for domain D : 0
PROC RTUs reserved for domain E : 0
PROC RTUs reserved for domain F : 0
PROC RTUs reserved for domain G : 0
PROC RTUs reserved for domain H : 0
PROC RTUs reserved for domain I : 0
PROC RTUs reserved for domain J : 0
PROC RTUs reserved for domain K : 0
PROC RTUs reserved for domain L : 0
PROC RTUs reserved for domain M : 0
PROC RTUs reserved for domain N : 0
PROC RTUs reserved for domain O : 0
PROC RTUs reserved for domain P : 0
PROC RTUs reserved for domain Q : 0
PROC RTUs reserved for domain R : 0
 
 
Available Component List for Domains:
=====================================
Available for domain newA:
          SB0 SB1 SB2 SB7
          IO1 IO3 IO6
Available for domain engB:
          No System boards
          No IO boards
Available  for domain domainC:
          No System boards
          IO0 IO1 IO2 IO3 IO4
Available  for domain eng1:
          No System boards
          No IO boards
Available  for domain E:
          No System boards
          No IO boards
Available  for domain domainF:
          No System boards
          No IO boards
Available  for domain dmnG:
          No System boards
          No IO boards
Available  for domain domain H:
          No System boards
          No IO boards
Available  for domain I:
          No System boards
          No IO boards
Available  for domain dmnJ:
          No System boards
          No IO boards
Available  for domain K:
          No System boards
          No IO boards
Available  for domain L:
          No System boards
          No IO boards
Available  for domain M:
          No System boards
          No IO boards
Available  for domain N:
          No System boards
          No IO boards
Available  for domain O:
          No System boards
          No IO boards
Available  for domain P:
          No System boards
          No IO boards
Available  for domain Q:
          No System boards
          No IO boards
Available  for domain dmnR:
          No System boards
          No IO boards
 
 
Domain Ethernet Addresses:
=============================
Domain ID   Domain Tag        Ethernet Address
A           newA              8:0:20:b8:79:e4
B           engB              8:0:20:b4:30:8c
C           domainC           8:0:20:b7:30:b0
D               -             8:0:20:b8:2d:b0
E           eng1              8:0:20:f1:b7:0
F           domainF           8:0:20:be:f8:a4
G           dmnG              8:0:20:b8:29:c8
H               -             8:0:20:f3:5f:14
I               -             8:0:20:be:f5:d0
J           dmnJ              UNKNOWN
K               -             8:0:20:f1:ae:88
L               -             8:0:20:b7:5d:30
M               -             8:0:20:f1:b8:8
N               -             8:0:20:f3:5f:74
O               -             8:0:20:f1:b8:8
P               -             8:0:20:b8:58:64
Q               -             8:0:20:f1:b7:ec
R           dmnR              8:0:20:f1:b7:10
 
 
Domain Configurations:
======================
DomainID    Domain Tag     Solaris Nodename     Domain Status
A           newA           -                    Powered Off
B           engB           sun15-b              Keyswitch Standby
C           domainC        sun15-c              Running OBP
D           -              sun15-d              Running Solaris
E           eng1           sun15-e              Running Solaris
F           domainF        sun15-f              Running Solaris
G           dmnG           sun15-g              Running Solaris
H           -              sun15-g              Solaris Quiesced
I           -              -                    Powered Off
J           dmnJ           -                    Powered Off
K           -              sun15-k              Booting Solaris
L           -              -                    Powered Off
M           -              -                    Powered Off
N           -              sun15-n              Keyswitch Standby
O           -              -                    Powered Off
P           -              sun15-p              Running Solaris
Q           -              sun15-q              Running Solaris
R           dmnR           sun15-r              Running Solaris

At this point, you can begin using SMS programs.
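The Domain Configurations section of the showplatform output shown above lends itself to scripted checks. The following Python sketch is illustrative only and is not part of SMS. The function name is hypothetical, and the parser assumes that the columns are separated by whitespace and that domain tags and node names contain no spaces.

```python
from typing import Dict, List, Optional


def parse_domain_configurations(output: str) -> List[Dict[str, Optional[str]]]:
    """Extract the rows of the 'Domain Configurations' section of showplatform output."""
    rows: List[Dict[str, Optional[str]]] = []
    in_section = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped == "Domain Configurations:":
            in_section = True
            continue
        if not in_section:
            continue
        if not stripped:
            if rows:          # a blank line after the data ends the section
                break
            continue
        if set(stripped) == {"="} or stripped.startswith("DomainID"):
            continue          # skip the underline and the column header
        parts = stripped.split(None, 3)   # the status column may contain spaces
        if len(parts) != 4:
            continue
        domain_id, tag, nodename, status = parts
        rows.append({
            "id": domain_id,
            "tag": None if tag == "-" else tag,
            "nodename": None if nodename == "-" else nodename,
            "status": status,
        })
    return rows


if __name__ == "__main__":
    sample = """Domain Configurations:
======================
DomainID    Domain Tag     Solaris Nodename     Domain Status
A           newA           -                    Powered Off
D           -              sun15-d              Running Solaris
"""
    for row in parse_domain_configurations(sample):
        print(row["id"], row["status"])
```

The text to parse would typically be captured from a showplatform session on the SC. How it is captured is outside the scope of this sketch.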

SMS Console Window

An SMS console window provides a command-line interface from the SC to the Solaris OS on the domains.


To Display a Console Window Locally

1. Log in to the SC, if you have not already done so.



Note - You must have domain privileges for the domain on which you want to run console.



2. Type:


sc0:sms-user:> console -d domain-indicator option

where:


-d    Specifies the domain using a domain-indicator:

      domain-id - ID for a domain. Valid domain-ids are 'A'...'R' and are case insensitive.

      domain-tag - Name assigned to a domain using addtag(1M).

-f    Force. Opens a domain console window with locked write permission, terminates all other open sessions, and prevents new ones from being opened. This constitutes an exclusive session. Use it only when you need exclusive use of the console (for example, for private debugging). To restore multiple-session mode, either release the lock (~^) or terminate the console session (~.).

-g    Grab. Opens a console window with unlocked write permission. If another session has unlocked write permission, the new console window takes it away. If another session has locked permission, this request is denied and a read-only session is started.

-l    Lock. Opens a console window with locked write permission. If another session has unlocked write permission, the new console window takes it away. If another session has locked permission, the request is denied and a read-only session is started.

-r    Read Only. Opens a console window in read-only mode.


The console command creates a remote connection to the domain's virtual console driver, making the window in which the command is executed a console window for the specified domain (domain-id or domain-tag).
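When console is invoked from a script, it helps to validate the domain indicator before running the command. The Python sketch below assembles an argument list using only the options described above. The function and mode names are hypothetical, and treating every single-character indicator as a domain-id (rather than as a one-letter domain tag) is an assumption of the sketch.

```python
import string
from typing import List, Optional

VALID_DOMAIN_IDS = set(string.ascii_uppercase[:18])  # 'A' through 'R'

# Session options as described in this section.
MODE_FLAGS = {"force": "-f", "grab": "-g", "lock": "-l", "read-only": "-r"}


def build_console_argv(domain_indicator: str, mode: Optional[str] = None) -> List[str]:
    """Return the argument list for console, validating single-letter domain-ids."""
    indicator = domain_indicator.strip()
    if not indicator:
        raise ValueError("domain indicator must not be empty")
    if len(indicator) == 1:
        # Assumption: a single character is a domain-id, which is case insensitive.
        if indicator.upper() not in VALID_DOMAIN_IDS:
            raise ValueError(f"domain-id must be A through R, got {indicator!r}")
        indicator = indicator.upper()
    argv = ["console", "-d", indicator]
    if mode is not None:
        if mode not in MODE_FLAGS:
            raise ValueError(f"unknown mode: {mode!r}")
        argv.append(MODE_FLAGS[mode])
    return argv


if __name__ == "__main__":
    print(build_console_argv("a"))
    print(build_console_argv("newA", "grab"))
```

The resulting list could be passed to a process-launching call on the SC. Whether that is appropriate depends on the privileges of the invoking user.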

If console is invoked without any options when no other console windows are running for that domain, it comes up in an exclusive locked write mode session.

If console is invoked without any options when one or more nonexclusive console windows are running for that domain, it will appear in read-only mode.

Locked write permission is more secure. It can only be removed if another console is opened using console -f or if ~* (tilde-asterisk) is entered from another running console window. In both cases, the new console session is an exclusive session, and all other sessions are forcibly detached from the domain virtual console.

The console command can use either Input Output Static Random Access Memory (IOSRAM) or the internal management network for domain console communication. You can manually toggle the communication path by using the ~= (tilde-equal sign) command. Doing so is useful if the network becomes inoperable, in which case the console session appears to be hung.

Many console sessions can be attached simultaneously to a domain, but only one console will have write permissions; all others will have read-only permissions. Write permissions are in either locked or unlocked mode.
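The write-permission rules described in this section can be summarized as a small state model. The Python sketch below is one reading of those rules, not the behavior of the console command itself. The class and method names are hypothetical. Two points are assumptions: a console opened with no options on an idle domain is modeled as a plain locked session so that later sessions can still attach read-only, and an exclusive session refuses every new session, including one requested with -f.

```python
from typing import Dict, Optional


class DomainConsole:
    """Tracks the sessions attached to one domain console (illustrative model)."""

    def __init__(self) -> None:
        # session name -> "read-only", "unlocked", or "locked"
        self.sessions: Dict[str, str] = {}
        self.exclusive: Optional[str] = None

    def _holder(self, mode: str) -> Optional[str]:
        """Return the session that currently holds the given mode, if any."""
        for name, held in self.sessions.items():
            if held == mode:
                return name
        return None

    def open(self, name: str, option: Optional[str] = None) -> Optional[str]:
        """Attach a session; return the mode granted, or None if refused."""
        if self.exclusive is not None:
            return None                      # assumption: exclusive blocks all newcomers
        if option == "-f":
            self.sessions = {name: "locked"}  # all other sessions are terminated
            self.exclusive = name
            return "locked"
        if option == "-r":
            granted = "read-only"
        elif option in ("-g", "-l"):
            if self._holder("locked") is not None:
                granted = "read-only"        # request denied, read-only session started
            else:
                unlocked = self._holder("unlocked")
                if unlocked is not None:
                    self.sessions[unlocked] = "read-only"  # write permission taken away
                granted = "unlocked" if option == "-g" else "locked"
        elif option is None:
            # Assumption: modeled as locked, not exclusive (see lead-in).
            granted = "locked" if not self.sessions else "read-only"
        else:
            raise ValueError(f"unknown option: {option!r}")
        self.sessions[name] = granted
        return granted


if __name__ == "__main__":
    console = DomainConsole()
    for session, option in (("s1", "-g"), ("s2", "-l"), ("s3", "-g"), ("s4", "-f")):
        print(session, option, "->", console.open(session, option))
```

In every state the model holds at most one session with write permission, which matches the rule that only one console has write permission at a time.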

Tilde Escape Sequences

In a domain console window, a tilde ( ~ ) that appears as the first character of a line is interpreted as an escape signal that directs console to perform some special action, as shown in the following table:

TABLE 1-1   Tilde Usage

Character   Description
~?          Status message.
~.          Disconnects console session.
~#          Breaks to OpenBoot PROM or kadb.
~@          Acquires unlocked write permission. See option -g.
~^          Releases write permission.
~=          Toggles the communication path between the network and IOSRAM interfaces. You can use ~= only in private mode (see ~*).
~&          Acquires locked write permission; see option -l. You can issue this signal during a read-only or unlocked write session.
~*          Acquires locked write permission, terminates all other open sessions, and prevents new sessions from being opened; see option -f. To restore multiple-session mode, either release the lock or terminate this session.


The rlogin command also processes tilde-escape sequences whenever a tilde is seen at the beginning of a new line. If you must send a tilde sequence at the beginning of a line and you are connected using rlogin, use two tildes (the first escapes the second for rlogin). Alternatively, do not enter a tilde at the beginning of a line when running inside of an rlogin window.
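The tilde sequences and the rlogin doubling rule can be expressed as a lookup. The Python sketch below is a reading aid under stated assumptions: the function names are hypothetical, the action strings paraphrase the descriptions in this section, and the code does not interact with a real console session.

```python
from typing import Optional

# Second character of each escape sequence -> paraphrased action.
TILDE_ACTIONS = {
    "?": "status message",
    ".": "disconnect console session",
    "#": "break to OpenBoot PROM or kadb",
    "@": "acquire unlocked write permission",
    "^": "release write permission",
    "=": "toggle communication path between network and IOSRAM",
    "&": "acquire locked write permission",
    "*": "acquire locked write permission and terminate other sessions",
}


def console_escape(line: str) -> Optional[str]:
    """Return the action if the line begins with a known tilde sequence, else None."""
    if len(line) >= 2 and line[0] == "~":
        return TILDE_ACTIONS.get(line[1])
    return None


def via_rlogin(sequence: str) -> str:
    """Prefix a second tilde so that rlogin passes the sequence on to console."""
    if not sequence.startswith("~"):
        raise ValueError("not a tilde sequence")
    return "~" + sequence


if __name__ == "__main__":
    print(console_escape("~."))
    print(via_rlogin("~#"))
```

Because a tilde is special only as the first character of a line, a line such as "ls ~." yields no action in this model.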

If you use a kill -9 command to terminate a console session, the window or terminal in which the console command was executed goes into raw mode, and appears hung. Press CTRL-J, then type stty sane, then press CTRL-J to escape this condition.

In the domain console window, vi(1) runs properly and the escape sequences (tilde commands) work as intended only if the environment variable TERM has the same setting as that of the console window.

For example:


sc0:sms-user:> setenv TERM xterm 

To resize the window, type:


sc0:sms-user:> stty rows 20 cols 80

For more information on the domain console, see Chapter 9 and refer to the console man page.

Remote Console Session

If a system controller hangs and its console cannot be reached directly, SMS provides the smsconnectsc command to remotely connect to the hung SC. This command works from either the main or spare SC. For more information and examples, refer to the smsconnectsc man page.

You may also connect to the hung SC using an external console connection, but you cannot run smsconnectsc and use an external console at the same time.


Sun Management Center

Sun Management Center for Sun Fire high-end systems is an extensible monitoring and management tool that integrates standard Simple Network Management Protocol (SNMP)-based management structures with new intelligent and autonomous agent and management technology based on the client-server paradigm.

Sun Management Center is used as the graphical user interface (GUI) and SNMP manager-agent infrastructure for the Sun Fire system. The features and functions of Sun Management Center are not covered in this manual. For more information, refer to the latest Sun Management Center documentation available at: www.docs.sun.com