Solaris Resource Manager provides a great deal of flexibility in its configuration to the central system administrator (for example, the root user). This chapter describes the following configuration areas:
Kernel boot parameters used when the kernel is first started (see Kernel Boot Parameters).
Global Solaris Resource Manager parameters supplied through the srmadm(1MSRM) command (see Global Solaris Resource Manager Parameters via srmadm).
PAM subsystem, including account and session management (see PAM Subsystem).
The kernel has certain Solaris Resource Manager parameters that can be set by the central administrator when the kernel is booted. The Solaris system reads the /etc/system file at boot time and uses it to configure kernel modules (see system(4) for details). The parameters that can be set in the SHR module (all are 32-bit integers) to override the Solaris Resource Manager default behavior are:
The number of lnodes to cache in the kernel. On Solaris systems, each kernel lnode requires about 3 KB. A value of zero (the default) means that the kernel determines the value itself, using the following heuristic:
(nproc / SRMProcsPerUid) + SRMLnodesExtra
where nproc is the maximum number of simultaneous processes allowed in the system. A minimum value of 6 overrides this calculation. The maximum specified by SRMMemoryMax will also override this calculation.
SRMProcsPerUid is the anticipated average number of processes used by each user. The default is 4.
SRMLnodesExtra is a bias used in the heuristic to determine the size of the in-memory lnode array. The default is 20.
The number of entries in the hash table that is used to map UID values to lnodes in the kernel. On Solaris systems, each entry is 4 bytes long. The default is zero, which means to use the same value as for the number of lnodes.
The reciprocal of SRMMemoryMax is the maximum fraction of real memory to use for the Solaris Resource Manager lnode and hash tables combined. The default is 20, which means that at most 1/20 (5 percent) of real memory will be used for Solaris Resource Manager data structures.
The minimum interval, in seconds, between "memory exceeded" notification warnings for a single lnode. The default value is 4.
For example, in the /etc/system file the line:
will ensure that memory exceeded messages are not sent more frequently than once every 10 seconds for any single user.
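The lnode cache sizing heuristic described above can be sketched as follows. This is a minimal illustration, not the kernel's implementation: the function name, the use of integer division, and the way the SRMMemoryMax cap is converted into an lnode count (one 3 KB lnode plus one 4-byte hash entry per lnode) are assumptions made for the example.

```python
LNODE_BYTES = 3 * 1024   # approximate size of one kernel lnode (from the text)
HASH_ENTRY_BYTES = 4     # size of one UID hash table entry (from the text)
MIN_LNODES = 6           # floor that overrides the heuristic (from the text)


def default_lnode_count(nproc, procs_per_uid=4, lnodes_extra=20,
                        memory_max=20, real_memory_bytes=None):
    """Estimate the lnode cache size used when the configured value is zero."""
    # Heuristic from the text: (nproc / SRMProcsPerUid) + SRMLnodesExtra.
    # Integer division is an assumption; the text does not state the rounding.
    count = nproc // procs_per_uid + lnodes_extra
    count = max(count, MIN_LNODES)

    if real_memory_bytes is not None:
        # Assumed cap: the lnodes, plus one hash entry each, must fit in
        # 1/SRMMemoryMax of real memory.
        budget = real_memory_bytes // memory_max
        count = min(count, budget // (LNODE_BYTES + HASH_ENTRY_BYTES))
    return count
```

With the default values and nproc set to 4000, the sketch gives 4000 / 4 + 20 = 1020 lnodes, or roughly 3 MB of kernel memory at 3 KB per lnode.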
There are also some parameters not in Solaris Resource Manager that affect its behavior. These include:
This is the name of the scheduling class in which the init(1M) process is started. Under Solaris Resource Manager this should be given as the string "SHR" (including the double-quote characters). The default Solaris value is "TS". To use Solaris Resource Manager for CPU resource control, the following line should be included in the /etc/system file:
to override the default.
This is a name of a scheduling class module to load, without necessarily using it as the default scheduling class. To use Solaris Resource Manager with only non-CPU resource control, the following line should be included in the /etc/system file:
To boot a system that does not have Solaris Resource Manager loaded, an alternate /etc/system file named /etc/system.noshrload is used. See Booting Without Solaris Resource Manager for instructions on this process.
During a normal system boot, when the system changes from single-user to multi-user mode, a Solaris Resource Manager initialization script (see Appendix A, Solaris Resource Manager Code Examples) is run to set various Solaris Resource Manager parameters. Details of what this script does are given in Chapter 4, Boot Procedure.
If the initialization script itself (/etc/init.d/init.srm) is modified, copies of both the original and modified versions should be kept separately. Applying Solaris Resource Manager updates will not necessarily preserve existing initialization scripts.
The srmadm command allows an administrator to set, modify, or display the global Solaris Resource Manager parameters. Refer to the srmadm(1MSRM) man page for details of all parameters.
The srmadm command can be called any number of times to set various parameters. It is not necessary to include all settings on a single invocation. This also means that srmadm can be used to change the operational parameters of a running Solaris Resource Manager system on the fly, although some caution should be taken.
Of particular importance to administrators are the srmadm options that enable or disable the main features of Solaris Resource Manager. These are:
The fileopen flag controls whether the Solaris Resource Manager limits database is open. The default database is /var/srm/srmDB; it can be overridden with the -f option. Note that closing the Solaris Resource Manager database file in mid-operation should be regarded as an emergency action. It has several undesirable consequences: all processes will continue running on the surrogate root lnode, which may give them more privilege than normal; the SHR scheduler is disabled; and Solaris Resource Manager limit enforcement ceases. When fileopen is disabled, Solaris Resource Manager has no limits database open, and its cache contains only the surrogate root lnode to which all processes are attached.
When the share flag is enabled, the Solaris Resource Manager SHR scheduler is used and CPU scheduling takes place according to the Solaris Resource Manager dynamic usage and decay algorithm. This mode cannot be set unless fileopen mode is enabled. When share is disabled, the SHR scheduler's usage calculations are frozen, and processes are scheduled "round-robin" with fixed equal priorities.
When the limits flag is enabled, Solaris Resource Manager enforces the virtual memory and process limits. This mode cannot be set unless fileopen mode is enabled. When limits is disabled, Solaris Resource Manager keeps usage attributes up to date but does not enforce limits.
When the adjgroups flag is enabled, the Solaris Resource Manager SHR scheduler's global group effective share adjustment is used. The enabled state is recommended in most circumstances. Every run interval, the normalized usages of all limits entries are recalculated. If the adjgroups scheduling mode is enabled, extra processing of normalized usages is performed as follows. The scheduler makes a pass over the scheduling tree, comparing each group's recently received effective share with its entitlement. Groups that have received less than their group entitlement are biased to receive a greater effective share in the next run interval. This ensures that groups receive their entitlements of CPU service whenever possible, regardless of the actions of their members.
When the limshare flag is enabled, the SHR scheduler applies its priority ceiling feature to limit all users' effective shares, which prevents extremely low-usage users from briefly acquiring almost 100 percent of CPU. The enabled state is recommended.
The rate of CPU service for a user is roughly inversely proportional to the user's usage. If not active for a long time, a user's usage decays to near-zero. When such a user logs in (or the lnode becomes active in any way), then for the duration of the next run interval, the user's processes could have such high priority that they monopolize the CPU.
Enabling the limshare scheduling flag causes the scheduler to estimate the effective share that an lnode will receive before the next run interval. If the result exceeds the user's assigned entitlement by a given factor (see maxushare), the user's normalized usage is readjusted to prevent this.
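The effect of the limshare ceiling can be pictured with a small sketch. It is only an illustration of the idea: the function name is hypothetical, the maxushare value used in the example is arbitrary, and the real scheduler achieves the limit by readjusting the user's normalized usage rather than by clamping the share directly as is done here.

```python
def limited_share(estimated_share, entitlement, maxushare):
    """Cap an estimated effective share at maxushare times the entitlement."""
    # Ceiling described in the text: the entitlement multiplied by a factor.
    ceiling = maxushare * entitlement
    # Shares at or below the ceiling are left alone; larger ones are pulled
    # back down to it.
    return min(estimated_share, ceiling)
```

For instance, a long-idle user entitled to 10 percent of the CPU whose estimated share for the next run interval is 95 percent would, with an illustrative factor of 2, be held to 20 percent.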
The -v parameter prints a formatted report of all current parameter settings on standard output. Supplying more -v parameters (-V 1, -V 2, and -V 3) results in a more verbose report. Invoking srmadm with no arguments is equivalent to supplying a single -v option.
The -d parameter initializes the Solaris Resource Manager system structure with default values instead of reading the current kernel settings. The default values, which mainly give control over scheduling behavior, are built into srmadm and provide a good starting point from which to customize Solaris Resource Manager. The kernel begins with the same values preset.
The following examples illustrate typical srmadm commands.
To turn on Solaris Resource Manager, enabling the SHR scheduler and resource limits:
# srmadm set -f /var/srm/srmDB fileopen=y:share=y:limits=y
To set the CPU usage decay rate to have a half-life of 5 minutes:
# srmadm set usagedecay=300s
To display the current flag settings and charges:
To show all the default settings:
% srmadm show -dv
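To make the usagedecay=300s example above more concrete, the sketch below shows what a half-life of 5 minutes means numerically. It assumes a plain exponential decay model; the function names and the 4-second interval used in the note that follows are illustrative assumptions, not values taken from srmadm.

```python
def decayed_usage(usage, elapsed_seconds, half_life_seconds=300.0):
    """Usage left after elapsed_seconds with no new charges, assuming exponential decay."""
    return usage * 0.5 ** (elapsed_seconds / half_life_seconds)


def per_interval_factor(interval_seconds, half_life_seconds=300.0):
    """Multiplier that, applied once per interval, yields the given half-life."""
    return 0.5 ** (interval_seconds / half_life_seconds)
```

Under this model an idle user's usage of 100 falls to 50 after 300 seconds and to 25 after 600 seconds. If the decay were applied once every 4 seconds, 75 applications of per_interval_factor(4) would halve the usage.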
The srmadm(1MSRM) command can disable Solaris Resource Manager by clearing the fileopen flag: all processes are moved onto the surrogate root lnode, other changed lnodes in the cache are flushed to disk, and the limits database is closed. This automatically forces the share and limits flags off, disabling the SHR scheduler and limit enforcement, respectively. The share and limits flags may be turned off independently if required while leaving the limits database open. This is better than closing the file, because processes can stay attached to their correct lnodes.
Note that if the Solaris Resource Manager scheduler alone is disabled in mid-operation, all this does is suspend the usage and decay algorithm. The scheduler still continues handling processes in the SHR scheduling class, but as each is assigned an updated priority, the same value is used, resulting in simple "round-robin" scheduling.
Re-enabling Solaris Resource Manager by opening the file and setting the share and/or limits flags after the file has been closed will not cause existing processes to move off the root lnode. It is best not to close the Solaris Resource Manager database during normal operation. If it is closed, the system should be rebooted in order to ensure correct attachment of processes to lnodes.
By default, limdaemon logs messages using syslog(3C). A timestamp is included on messages.
limdaemon has several options that can be configured when it is started:
The -m tag and -p priority options are used to tag the messages and control message routing according to the syslogd(1M) configuration.
The -c option causes limdaemon to suppress the updating of terminal connect-time usages.
The -d option causes limdaemon to decay connect-time usages for all terminals of logged in users, with the interval between decays being the argument of the -t option (the default is 1 minute).
The -Dn option causes limdaemon to decay connect-time usages for the terminals of all users once every n minutes.
The -k option terminates the currently running limdaemon command.
The -t option is used to set the time period (in minutes) between updates to the connect-time usage attribute in the terminal device category. The default is 1 minute.
The -e option can be used to suppress the logging off of users who have reached their connect-time limit. This option is implied by the use of the -c option.
The -w option sets the number of minutes before expiration of connect-time that the warning message is given. The default warning interval is 5 minutes.
The -g option is used to set the grace time (in seconds). The default grace time is 30 seconds.
In the following example, the limdaemon command:
% limdaemon -g300
starts the daemon and sets the grace time to 5 minutes. Note that it is not necessary to follow the command with a shell '&' character. When limdaemon is started, it makes itself into a daemon. That is, a child process is forked that detaches itself from the controlling terminal, placing itself in a process group of its own.
The administrator should determine the balance needed between the additional overhead incurred for rapid updating of connect-time usage attributes, and the greater granularity that will appear with less frequent updating. See the limdaemon(1MSRM) man page for more information on these and other options.
Beginning with the Solaris 2.6 release, Solaris systems support the Pluggable Authentication Module (PAM) framework. Whenever a user requests an operation that involves changing or setting the user's identity (such as logging in to the system, invoking an 'r' command such as rcp or rsh, using ftp, or using su), a set of configurable modules is used to provide authentication, account management, credentials management, and session management. Solaris Resource Manager provides a PAM module for login accounting and for modifying the behavior of su.
The program used to request the operation is called a service.
The PAM system as a whole is documented in the man pages pam(3), pam.conf(4), pam_unix(5), and pam_srm(5SRM).
In Solaris Resource Manager, the PAM module provides account management and session management functions. The behavior of PAM can be controlled by editing the file /etc/pam.conf. For normal Solaris Resource Manager behavior, the Solaris Resource Manager PAM module should be configured as requisite for all login-like services for session management, and as requisite for account management for all PAM services. Usually, the Solaris Resource Manager module should be placed after all other required and requisite modules, and before any other sufficient or optional modules.
Upon installation, Solaris Resource Manager edits /etc/pam.conf to provide a suitable behavior. It inserts lines such as these for each service (including other) that already has session or account management configured:
login account requisite pam_srm.so.1 nolnode=/etc/srm/nolnode
other session requisite pam_srm.so.1
other account requisite pam_srm.so.1 nolnode=/etc/srm/nolnode
The first line specifies that for the login service, the module pam_srm.so.1 is to be used to provide account management functionality, that it must allow login if login is to succeed, and that it is to be given the argument nolnode=/etc/srm/nolnode. See pam.conf(4) for a full explanation of the various control flags (required, requisite, optional, and sufficient).
The second line specifies that the other service, which pam.conf(4) applies to any service without an entry of its own, uses the pam_srm.so.1 module for session management.
The full list of supported arguments for Solaris Resource Manager account and session management modules is found in pam_srm(5SRM).
When the Solaris Resource Manager account management PAM module gets control:
It determines if Solaris Resource Manager is installed and enabled, and tells the PAM system to ignore this module if it is not.
It determines whether the user has an lnode, and calls an administrator-configurable 'no lnode' script if not.
It determines whether the user has permission to use the requested service and device.
It determines whether the user has exceeded the warnings limit, and refuses permission to log in if this is the case.
It calls an administrator-configurable 'every login' script.
If any of these steps fail, the remainder are not performed, and the Solaris Resource Manager account management PAM module denies use of the service. An explanatory message is passed to the user through the service where possible.
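The control flow of the steps above (ignore the module if Solaris Resource Manager is not enabled, otherwise run each check in order and stop at the first failure) can be sketched as follows. This is a schematic model only: the function name, the result constants, and the representation of each step as a callable are assumptions for illustration and do not reflect the actual PAM module interface.

```python
from typing import Callable, Sequence

# Hypothetical result values standing in for the module's return codes.
IGNORE, ALLOW, DENY = "ignore", "allow", "deny"


def account_management(srm_enabled: bool,
                       checks: Sequence[Callable[[], bool]]) -> str:
    """Run the account management checks in order, stopping at the first failure."""
    if not srm_enabled:
        # Step 1: PAM is told to ignore this module entirely.
        return IGNORE
    for check in checks:
        # Remaining steps: lnode lookup, service and device permission,
        # warnings limit, 'every login' script. Later steps are skipped
        # as soon as one fails.
        if not check():
            return DENY
    return ALLOW
```

Modeling each step as a function that returns True or False keeps the short-circuit behavior explicit: a failed lnode check, for example, means the 'every login' script is never run.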
The default 'no lnode' script will create an lnode for the user and send mail notifying the system administrator that this has been done. The default script is /etc/srm/nolnode, but this can be changed by editing the file /etc/pam.conf and changing the value of the nolnode option on Solaris Resource Manager account management module lines. The 'every login' script is not usually configured. However, it can be configured by adding an everylogin=pathname option to any Solaris Resource Manager account management module line in /etc/pam.conf. Scripts are invoked as the root user. Standard input, output, and error are closed. If a script exits non-zero, access will be denied. All information is passed as environment variables, which are derived directly from information passed to PAM from the service.
USER is the login name supplied to the program. It has been authenticated by looking it up in the password map; if not present, the account management module will already have returned an error code to PAM.
The UID of the user being authenticated. For services that change UID (such as su), this is the UID of the user invoking the service; for services that set UID (such as login), this is the target UID (that of USER).
For access attempts across a network, this variable contains the name of the host where the attempt originated. Its value is otherwise implementation dependent.
The name of the access service, for example, rsh, login, and ftp.
The name of the TTY on which the service is being invoked. Some services that do not (strictly speaking) have a controlling terminal (such as ftp) will fill this variable with process information (for example, ftp12345, where 12345 is the process identifier (PID) of ftpd); others leave it empty or replace it with the service name.
If debug was specified in the pam.conf file, DEBUG is set to true; otherwise it is set to false. No other environment variables are set, so any script must set its own PATH variable if required.
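The invocation rules described above (a minimal environment, no usable standard streams, and a non-zero exit status meaning denial) can be approximated with the sketch below. It is an assumption-laden model, not the module's code: the function name is hypothetical, only the USER and DEBUG variables named in this section are passed, the streams are redirected to the null device rather than truly closed, and running as root is not simulated.

```python
import subprocess
import sys


def run_login_script(argv, user, debug=False):
    """Run a hook script with a minimal environment; True means access is allowed."""
    # Only the variables passed here are visible to the script, so it must
    # set its own PATH if it needs one.
    env = {"USER": user, "DEBUG": "true" if debug else "false"}
    result = subprocess.run(
        argv,
        env=env,
        stdin=subprocess.DEVNULL,   # approximation of "closed"
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    # A non-zero exit status denies access.
    return result.returncode == 0


# Hypothetical stand-in for a site script that admits only the user "alice".
EXAMPLE_SCRIPT = [
    sys.executable, "-c",
    "import os, sys; sys.exit(0 if os.environ.get('USER') == 'alice' else 1)",
]
```

Calling run_login_script(EXAMPLE_SCRIPT, "alice") allows access, while the same call for any other user name is denied because the script exits with status 1.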
The default 'no lnode' script creates the lnode in the default scheduling group (other if such a user exists in the password map, otherwise root) and mails the system administrator a reminder to move the new lnode into the appropriate place in the scheduling hierarchy. For a sample script, see Default 'no lnode' Script.
The Solaris Resource Manager PAM module looks up the terminal and service names in the device hierarchy, and returns a 'permission denied' message to its invoker if limits are exceeded or if a device flag evaluates to set.
The device categories examined are terminal for the terminal name, and services for the kind of service requested. For example, an rlogin attempt may try to use a file in the network device group, so the flags tested for the user (assuming all flags are set to group) are as shown below. These flags are checked in order:
Access is permitted only if they all evaluate to set. In addition, limits will be checked for the corresponding categories (terminal and services).
For login-like services (those that create an entry in the utmp file), the session management facilities of PAM are invoked as well as the account management facilities if both are configured in /etc/pam.conf.
Solaris Resource Manager session management handles charging for devices. It checks whether the user has exceeded the connect-time limit, or whether the user's onelogin flag evaluates to set while the user is already logged in. In either case, it prevents the login.
Otherwise, it generates a message to the limdaemon process to inform it of the login and the configured cost for the terminal being used. It then informs the kernel that the current process is a login header process, and that the limdaemon process must be notified when it expires.
The limdaemon process then tracks connect-time limits and issues warnings if they are about to be exceeded.