Solaris Resource Manager 1.3 System Administration Guide

Chapter 3 Configuration

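Solaris Resource Manager provides a great deal of flexibility in its configuration to the central system administrator (for example, the root user). This chapter describes the following configuration areas: kernel boot parameters, multi-user startup configuration, global parameters set with srmadm, disabling Solaris Resource Manager, limdaemon, and the PAM subsystem.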

Kernel Boot Parameters

The kernel has certain Solaris Resource Manager parameters that can be set by the central administrator when the kernel is booted. The Solaris system reads the /etc/system file at boot time and uses it to configure kernel modules (see system(4) for details). The parameters that can be set in the SHR module (all are 32-bit integers) to override the Solaris Resource Manager default behavior are:

SRMLnodes

The number of lnodes to cache in the kernel. On Solaris systems, each kernel lnode requires about 3 Kbytes. A value of zero (the default) means that the kernel will determine the value. The heuristic then used is:

(nproc / SRMProcsPerUid ) + SRMLnodesExtra 

where nproc is the maximum number of simultaneous processes allowed in the system. A minimum value of 6 overrides this calculation. The maximum specified by SRMMemoryMax will also override this calculation.

SRMProcsPerUid

The anticipated average number of processes used by each user. The default is 4.

SRMLnodesExtra

A bias used in the heuristic to determine the size of the in-memory lnode array. The default is 20.

SRMNhash

The number of entries in the hash table that is used to map UID values to lnodes in the kernel. On Solaris systems, each entry is 4 bytes long. The default is zero, which means to use the same value as for the number of lnodes.

SRMMemoryMax

The reciprocal of this value is a fraction that specifies the maximum percentage of real memory to use for the Solaris Resource Manager lnode and hash tables combined. The default is 20, which means that a maximum of 5 percent of real memory will be used for Solaris Resource Manager data structures.

SRMMemWarnFreq

The minimum interval, in seconds, between "memory exceeded" notification warnings for a single lnode. The default value is 4.

For example, in the /etc/system file the line:

set srmlim:SRMMemWarnFreq=10 

will ensure that "memory exceeded" messages are sent no more often than once every 10 seconds for any single user.
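As a worked illustration of the sizing parameters above, the following shell sketch computes the default lnode count and the memory ceiling implied by SRMMemoryMax. The function names srm_lnode_estimate and srm_memory_cap_kb are hypothetical and are not part of Solaris Resource Manager. Integer division is an assumption, and the sketch does not apply the SRMMemoryMax ceiling to the lnode count.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Solaris Resource Manager.

# Default lnode count: (nproc / SRMProcsPerUid) + SRMLnodesExtra, floor of 6.
# Integer division is an assumption; the SRMMemoryMax ceiling is not applied.
srm_lnode_estimate() {
    nproc=$1                  # maximum simultaneous processes in the system
    procs_per_uid=${2:-4}     # SRMProcsPerUid, default 4
    extra=${3:-20}            # SRMLnodesExtra, default 20
    n=$(( nproc / procs_per_uid + extra ))
    if [ "$n" -lt 6 ]; then
        n=6
    fi
    echo "$n"
}

# Ceiling on lnode and hash table memory: real memory divided by SRMMemoryMax.
srm_memory_cap_kb() {
    real_kb=$1                # real memory, in kilobytes
    mem_max=${2:-20}          # SRMMemoryMax, default 20 (5 percent)
    echo $(( real_kb / mem_max ))
}

srm_lnode_estimate 4000       # (4000 / 4) + 20 = 1020
srm_memory_cap_kb 1048576     # 1 GB of real memory: 52428 KB
```

With the defaults, a system that allows 4000 processes would cache about 1020 lnodes, and a system with 1 Gbyte of real memory would allow roughly 51 Mbytes for the lnode and hash tables combined.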

There are also some parameters not in Solaris Resource Manager that affect its behavior. These include:

initclass

This is the name of the scheduling class in which the init(1M) process is started. Under Solaris Resource Manager this should be given as the string "SHR" (including the double-quote characters). The default Solaris value is "TS". To use Solaris Resource Manager for CPU resource control, the following line should be included in the /etc/system file:

set initclass="SHR"

to override the default.

extraclass

This is the name of a scheduling class module to load, without necessarily using it as the default scheduling class. To use Solaris Resource Manager with only non-CPU resource control, the following line should be included in the /etc/system file:

set extraclass="SHR"

To boot a system that does not have Solaris Resource Manager loaded, an alternate /etc/system file named /etc/system.noshrload is used. See Booting Without Solaris Resource Manager for instructions on this process.
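To check which of the two settings above a given file uses, a small shell sketch such as the following may help. The function name srm_boot_mode is hypothetical. The patterns assume each setting starts at the beginning of a line, with only spaces around the keyword and the equals sign.

```shell
#!/bin/sh
# Hypothetical helper: report how an /etc/system-style file loads the SHR class.
# Prints "cpu" if initclass is "SHR", "non-cpu" if only extraclass is "SHR",
# and "none" otherwise. Comment lines (starting with '*') never match because
# the patterns are anchored at the start of the line.
srm_boot_mode() {
    file=${1:-/etc/system}
    if grep '^set[ ][ ]*initclass[ ]*=[ ]*"SHR"' "$file" >/dev/null 2>&1; then
        echo cpu
    elif grep '^set[ ][ ]*extraclass[ ]*=[ ]*"SHR"' "$file" >/dev/null 2>&1; then
        echo non-cpu
    else
        echo none
    fi
}
```

For example, srm_boot_mode /etc/system.noshrload would be expected to print none on a correctly prepared alternate file.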

Multi-User Startup Configuration

During a normal system boot, when the system changes from single-user to multi-user mode, a Solaris Resource Manager initialization script (see Appendix A, Solaris Resource Manager Code Examples) is run to set various Solaris Resource Manager parameters. Details of what this script does are given in Chapter 4, Boot Procedure.

If the initialization script itself (/etc/init.d/init.srm) is modified, copies of both the original and modified versions should be kept separately. Applying Solaris Resource Manager updates will not necessarily preserve existing initialization scripts.
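One way to keep the two versions apart is sketched below. The function name keep_copy, the backup directory, and the labels are all hypothetical choices made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: save a labeled copy of a file in a separate directory,
# preserving its mode and timestamps.
keep_copy() {
    src=$1          # file to copy, for example /etc/init.d/init.srm
    dest_dir=$2     # directory for saved copies (an arbitrary choice)
    label=$3        # suffix such as "orig" or "local"
    mkdir -p "$dest_dir" && cp -p "$src" "$dest_dir/$(basename "$src").$label"
}

# Before editing:  keep_copy /etc/init.d/init.srm /var/tmp/srm-backup orig
# After editing:   keep_copy /etc/init.d/init.srm /var/tmp/srm-backup local
```

Storing the copies outside /etc/init.d keeps them out of reach of an update that replaces the script.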

Global Solaris Resource Manager Parameters via srmadm

The srmadm command allows an administrator to set, modify, or display the global Solaris Resource Manager parameters. Refer to the srmadm(1MSRM) man page for details of all parameters.

The srmadm command can be called any number of times to set various parameters. It is not necessary to include all settings on a single invocation. This also means that srmadm can be used to change the operational parameters of a running Solaris Resource Manager system on the fly, although some caution should be taken.

Of particular importance to administrators are the srmadm options that enable or disable the main features of Solaris Resource Manager. These are:

fileopen[={y|n}]

The default database is /var/srm/srmDB; it can be overridden with the -f option. Note that closing the Solaris Resource Manager database file in mid-operation should be regarded as an emergency action. It has several undesirable consequences: all processes will continue running on the surrogate root lnode, which may give them more privilege than normal; the SHR scheduler is disabled; and Solaris Resource Manager limit enforcement ceases. When disabled, Solaris Resource Manager has no limits database open, and its cache contains only the surrogate root lnode to which all processes are attached.

share[={y|n}]

When enabled, the Solaris Resource Manager SHR scheduler is used and CPU scheduling takes place according to the Solaris Resource Manager dynamic usage and decay algorithm. This mode cannot be set unless fileopen mode is enabled. When disabled, the SHR scheduler's usage calculations are frozen, and processes are scheduled "round-robin" with fixed equal priorities.

limits[={y|n}]

When enabled, Solaris Resource Manager enforces the virtual memory and process limits. This mode cannot be set unless fileopen mode is enabled. When disabled, Solaris Resource Manager will keep usage attributes up to date, but will not enforce limits.

adjgroups[={y|n}]

When enabled, the Solaris Resource Manager SHR scheduler's global group effective share adjustment is used. The enabled state is recommended in most circumstances. Every run interval, the normalized usages of all limits entries are recalculated. If the adjgroups scheduling mode is enabled, then extra processing of normalized usages is performed as follows. The scheduler makes a pass over the scheduling tree, comparing each group's recently received effective share with its entitlement. Groups that have received less than their group entitlement are biased to receive a greater effective share in the next run interval. This ensures that groups receive their entitlements of CPU service whenever possible, regardless of the actions of their members.

limshare[={y|n}]

When enabled, the SHR scheduler applies its priority ceiling feature to limit all users' effective shares, which prevents extremely low-usage users from briefly acquiring almost 100 percent of CPU. The enabled state is recommended.

The rate of CPU service for a user is roughly inversely proportional to the user's usage. If not active for a long time, a user's usage decays to near-zero. When such a user logs in (or the lnode becomes active in any way), then for the duration of the next run interval, the user's processes could have such high priority that they monopolize the CPU.

Enabling the limshare scheduling flag causes the scheduler to estimate the effective share that an lnode will receive before the next run interval. If the result exceeds the user's assigned entitlement by a given factor (see maxushare), the user's normalized usage is readjusted to prevent this.

There are two optional parameters to srmadm that are also useful to an administrator:

The following examples illustrate typical srmadm commands.

To turn on Solaris Resource Manager, enabling the SHR scheduler and resource limits:

# srmadm set -f /var/srm/srmDB fileopen=y:share=y:limits=y 

To set the CPU usage decay rate to have a half-life of 5 minutes:

# srmadm set usagedecay=300s 

To display the current flag settings and charges:

% srmadm 

To show all the default settings:

% srmadm show -dv 
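To make the half-life in the usagedecay example concrete, the sketch below computes how much of a usage value would remain after a period with no new charges. It assumes pure exponential decay, which is what the term half-life suggests but which this chapter does not spell out. The function name srm_decayed_usage is hypothetical, and the sketch relies on an awk that accepts the -v option (nawk on Solaris).

```shell
#!/bin/sh
# Hypothetical helper: usage remaining after a period with no new charges,
# assuming pure exponential decay with the given half-life.
# Arguments: initial usage, elapsed seconds, half-life in seconds (default 300).
srm_decayed_usage() {
    awk -v u="$1" -v t="$2" -v h="${3:-300}" \
        'BEGIN { printf "%.2f\n", u * exp(-log(2) * t / h) }'
}

srm_decayed_usage 1000 300    # one half-life: 500.00
srm_decayed_usage 1000 600    # two half-lives: 250.00
```

Under this assumption, a user who stops consuming CPU sees accumulated usage fall to a quarter of its value after 10 minutes with usagedecay=300s.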

Disabling Solaris Resource Manager

The srmadm(1MSRM) command can disable Solaris Resource Manager by clearing the fileopen flag: all processes are moved onto the surrogate root lnode, other changed lnodes in the cache are flushed to disk, and the limits database is closed. This automatically forces the share and limits flags off, disabling the SHR scheduler and limit enforcement, respectively. The share and limits flags may be turned off independently if required while leaving the limits database open. This is better than closing the file, because processes can stay attached to their correct lnodes.

Note that if the Solaris Resource Manager scheduler alone is disabled in mid-operation, the only effect is to suspend the usage and decay algorithm. The scheduler still continues handling processes in the SHR scheduling class, but as each is assigned an updated priority, the same value is used, resulting in simple "round-robin" scheduling.

Re-enabling Solaris Resource Manager by opening the file and setting the share and/or limits flags after the file has been closed will not cause existing processes to move off the root lnode. It is best not to close the Solaris Resource Manager database during normal operation. If it is closed, the system should be rebooted in order to ensure correct attachment of processes to lnodes.
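The preferred alternative described above, turning off the share and limits flags while leaving the limits database open, might look like the following sketch. The flag syntax is modeled on the srmadm set example earlier in this chapter. The function name srm_soft_disable and the SRMADM variable are hypothetical; SRMADM exists only so that the command can be previewed without running it.

```shell
#!/bin/sh
# Sketch: disable the SHR scheduler and limit enforcement but leave the limits
# database open, so that processes stay attached to their own lnodes.
# SRMADM is a hypothetical override that allows a dry run.
srm_soft_disable() {
    ${SRMADM:-srmadm} set share=n:limits=n
}

# Dry run:  SRMADM="echo srmadm" srm_soft_disable
```

Because fileopen is not changed, re-enabling the two flags later does not require a reboot to restore correct lnode attachment.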

Using limdaemon

By default, limdaemon logs messages using syslog(3C). A timestamp is included on messages.

limdaemon has several options that can be configured when it is started:

In the following example, the limdaemon command:

% limdaemon -g300 

starts the daemon and sets the grace time to 5 minutes. Note that it is not necessary to follow the command with a shell '&' character. When limdaemon is started, it makes itself into a daemon. That is, a child process is forked that detaches itself from the controlling terminal, placing itself in a process group of its own.

The administrator should determine the balance needed between the additional overhead incurred for rapid updating of connect-time usage attributes, and the coarser granularity that results from less frequent updating. See the limdaemon(1MSRM) man page for more information on these and other options.

PAM Subsystem

Beginning with the Solaris 2.6 release, Solaris systems support Pluggable Authentication Modules (PAM). Whenever a user requests an operation that involves changing or setting the user's identity (such as logging in to the system, invoking an 'r' command such as rcp or rsh, using ftp, or using su), a set of configurable modules is used to provide authentication, account management, credentials management, and session management. Solaris Resource Manager provides a PAM module for login accounting and for modifying the behavior of su.

The program used to request the operation is called a service.

The PAM system as a whole is documented in the man pages pam(3), pam.conf(4), pam_unix(5), and pam_srm(5SRM).

In Solaris Resource Manager, the PAM module provides account management and session management functions. The behavior of PAM can be controlled by editing the file /etc/pam.conf. For normal Solaris Resource Manager behavior, the Solaris Resource Manager PAM module should be configured as requisite for all login-like services for session management, and as requisite for account management for all PAM services. Usually, the Solaris Resource Manager module should be placed after all other required and requisite modules, and before any other sufficient or optional modules.

Upon installation, Solaris Resource Manager edits /etc/pam.conf to provide a suitable behavior. It inserts lines such as these for each service (including other) that already has session or account management configured:

login account requisite pam_srm.so.1 nolnode=/etc/srm/nolnode 
other session requisite pam_srm.so.1
other account requisite pam_srm.so.1 nolnode=/etc/srm/nolnode

The first line specifies that for service login, the module pam_srm.so.1 is to be used to provide account management functionality, that it must allow login if login is to succeed, and that it is to be given the argument nolnode=/etc/srm/nolnode. See pam.conf(4) for a full explanation of the various control flags (required, requisite, optional, and sufficient).

The second line says that the other service (the entry that applies to any service without lines of its own) will use the pam_srm.so.1 module for session management.

The full list of supported arguments for Solaris Resource Manager account and session management modules is found in pam_srm(5SRM).
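To review which services are currently configured to use the module, a sketch along these lines can list the relevant entries. The function name srm_pam_entries is hypothetical. It assumes the usual pam.conf field order of service, module type, control flag, and module path.

```shell
#!/bin/sh
# Hypothetical helper: print the service, module type, and control flag of
# every pam.conf entry whose module path names the pam_srm module.
# Lines whose first field starts with '#' are skipped as comments.
srm_pam_entries() {
    file=${1:-/etc/pam.conf}
    awk '$1 !~ /^#/ && $4 ~ /pam_srm\.so/ { print $1, $2, $3 }' "$file"
}
```

Run against the three sample lines above, this would print login account requisite, other session requisite, and other account requisite.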

Account Management

When the Solaris Resource Manager account management PAM module gets control:

  1. It determines if Solaris Resource Manager is installed and enabled, and tells the PAM system to ignore this module if it is not.

  2. It determines whether the user has an lnode, and calls an administrator-configurable 'no lnode' script if not.

  3. It determines whether the user has permission to use the requested service and device.

  4. It determines whether the user has exceeded the warnings limit, and refuses permission to log in if this is the case.

  5. It calls an administrator-configurable 'every login' script.

If any of these steps fail, the remainder are not performed, and the Solaris Resource Manager account management PAM module denies use of the service. An explanatory message is passed to the user through the service where possible.
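The order of the five steps can be sketched as follows. This illustrates the decision sequence only and is not the module's actual code. Every step_* function is a hypothetical stand-in that returns 0 on success; here they all succeed so that the sketch runs as written.

```shell
#!/bin/sh
# Hypothetical stand-ins for the five checks; each returns 0 on success.
step_srm_enabled()        { return 0; }   # step 1: installed and enabled?
step_has_lnode()          { return 0; }   # step 2: user has an lnode?
step_run_nolnode_script() { return 0; }   # step 2: 'no lnode' script result
step_service_allowed()    { return 0; }   # step 3: service and device permitted?
step_under_warn_limit()   { return 0; }   # step 4: warnings limit not exceeded?
step_run_everylogin()     { return 0; }   # step 5: 'every login' script result

# Prints "ignore", "deny", or "allow". The first failing step ends the
# sequence, so later steps are not performed.
srm_account_decision() {
    step_srm_enabled || { echo ignore; return 0; }
    step_has_lnode || step_run_nolnode_script || { echo deny; return 0; }
    step_service_allowed  || { echo deny; return 0; }
    step_under_warn_limit || { echo deny; return 0; }
    step_run_everylogin   || { echo deny; return 0; }
    echo allow
}

srm_account_decision      # prints "allow" with the stand-ins above
```

Note that a failure in step 1 is treated differently from the others: the module asks PAM to ignore it rather than denying the service.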

Scripts

The default 'no lnode' script will create an lnode for the user and send mail notifying the system administrator that this has been done. The default script is /etc/srm/nolnode, but this can be changed by editing the file /etc/pam.conf and changing the value of the nolnode option on Solaris Resource Manager account management module lines. The 'every login' script is not usually configured. However, it can be configured by adding an everylogin=pathname option to any Solaris Resource Manager account management module line in /etc/pam.conf.

Scripts are invoked as the root user. Standard input, output, and error are closed. If a script exits with a non-zero status, access will be denied. All information is passed as environment variables, which are derived directly from information passed to PAM from the service.

USER

The login name supplied to the program. It has been authenticated by looking it up in the password map; if not present, the account management module will already have returned an error code to PAM.

UID

The UID of the user being authenticated. For services that change UID (such as su), this is the UID of the user invoking the service; for services that set UID (such as login), this is the target UID (that of USER).

RHOST

For access attempts across a network, this variable contains the name of the host where the attempt originated. Its value is otherwise implementation dependent.

SERVICE

The name of the access service, for example, rsh, login, or ftp.

TTY

The name of the TTY on which the service is being invoked. Some services that do not (strictly speaking) have a controlling terminal (such as ftp) will fill this variable with process information (for example, ftp12345, where 12345 is the process identifier (PID) of ftpd); others leave it empty or replace it with the service name.

DEBUG

If debug was specified in the pam.conf file, DEBUG is set to true; otherwise it is set to false.

No other environment variables are set, so any script must set its own PATH variable if required.

The default 'no lnode' script creates the lnode in the default scheduling group (other if such a user exists in the password map, otherwise root) and mails the system administrator a reminder to move the new lnode into the appropriate place in the scheduling hierarchy. For a sample script, see Default 'no lnode' Script.
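A minimal sketch of an 'every login' style check is shown below to illustrate the script conventions: an explicit PATH, input taken only from the environment variables, no output, and the exit status as the result. It is written as a function with a subshell body so that the logic can be exercised outside PAM; a real script would contain the body directly. The function name everylogin_check and the policy it applies (refusing remote ftp access for a user named guest) are invented for illustration.

```shell
#!/bin/sh
# Sketch of an 'every login' style check. The body runs in a subshell, so
# "exit" behaves as it would in a standalone script.
everylogin_check() (
    # No PATH is inherited from the PAM module, so set one explicitly.
    PATH=/usr/bin:/usr/sbin
    export PATH

    # USER, SERVICE, and RHOST are supplied in the environment.
    [ -n "$USER" ] || exit 1              # non-zero status denies access

    # Invented policy: refuse ftp from a remote host for user "guest".
    if [ "$USER" = "guest" ] && [ "$SERVICE" = "ftp" ] && [ -n "$RHOST" ]; then
        exit 1
    fi

    exit 0                                # zero status permits access
)
```

Because standard output and standard error are closed when the script runs, the sketch reports its decision through the exit status alone.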

PAM Interaction With Device Groups

The Solaris Resource Manager PAM module looks up the terminal and service names in the device hierarchy, and returns a 'permission denied' message to its invoker if limits are exceeded or if a device flag evaluates to set.

The device categories examined are terminal for the terminal name, and services for the kind of service requested. For example, an rlogin attempt may try to use a file in the network device group, so the flags tested for the user (assuming all flags are set to group) are as shown below. These flags are checked in order:

Access is permitted only if they all evaluate to set. In addition, limits will be checked for the corresponding categories (terminal and services).

Session Management

For login-like services (those that create an entry in the utmp file), the session management facilities of PAM are invoked as well as the account management facilities if both are configured in /etc/pam.conf.

Solaris Resource Manager session management handles charging for devices. It checks whether the user has exceeded the connect-time limit, or whether the user's onelogin flag evaluates to set and the user is already logged in. In either case, it prevents the login.

Otherwise, it generates a message to the limdaemon process to inform it of the login and the configured cost for the terminal being used. It then informs the kernel that the current process is a login header process, and that the limdaemon process must be notified when it expires.

The limdaemon process then tracks connect-time limits and issues warnings if they are about to be exceeded.