Solaris Resource Manager 1.3 System Administration Guide

Chapter 11 Troubleshooting

This chapter provides pointers for diagnosing problems in the operation of Solaris Resource Manager.

If you require additional assistance, contact your Sun Software Support Provider.

User-Related Problems

User Cannot Log In

None of the Solaris Resource Manager restrictions that can prevent a user from logging in apply to the superuser.


Note -

If a user is able to log in to the system but there is no lnode corresponding to the UID of the user (that is, an lnode has not been set up for that user's account), the problem is identified by a message indicating that no limits information is available. Refer to Orphaned Lnodes.


User Not Informed of Reaching Limits

During normal operation of Solaris Resource Manager, a logged-in user receives a notification message whenever a limit is reached. Users sometimes miss these messages and are unaware of the cause of the problems they encounter, so the system appears to behave mysteriously. The system administrator, however, will have been notified.

Notification messages are delivered by the Solaris Resource Manager daemon program, limdaemon. If notification messages are not reaching users, there are a number of possibilities the administrator can investigate; in particular, check whether limdaemon itself is running (see Terminal Connect-Time Not Updated).

Unable to Change User's Group

The sgroup attribute determines the lnode's parent in the scheduling tree. This hierarchy is used to regulate resource usage and to schedule the CPU. For this reason several security precautions are placed on modifying the sgroup attribute, both to avoid inadvertent errors when changing it and to prevent circumvention of Solaris Resource Manager.

To modify the sgroup attribute, a user needs one of the following privileges:

Orphaned lnodes cannot be made parents of other lnodes. See Orphaned Lnodes.

Users Frequently Exceeding Limits

Check whether any of these conditions are causing the problem:

Unexpected Notification Messages

Notification messages are received by every user affected by a limit being reached. Therefore, if a group limit is reached, the group header and all users below it in the scheduling hierarchy receive a notification message.

If a user is attached to another lnode when a limit is reached, that user does not receive a notification message, but all the other affected users do. The cause of the messages may therefore not be apparent to the affected group.

Terminal Connect-Time Not Updated

The most likely cause of this problem is that the limdaemon program is not running. limdaemon periodically updates the usage and accrue attributes in the terminal device category for all currently logged in users. Typically, it would be started from the Solaris Resource Manager init.d script.
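To confirm whether limdaemon is running, check the process table. This is a minimal sketch using standard Solaris commands:

% ps -e | grep limdaemon

If no limdaemon process is listed, restart the daemon by the method used at your site, typically the Solaris Resource Manager init.d script mentioned above.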

Performance Issues

Processes Attached to the root Lnode

For reasons of system management, processes attached to the root lnode are given almost all the CPU resources they demand. Therefore, if a CPU-bound process is attached to the root lnode, it will tie up a CPU, causing processes on other lnodes to slow or stop.

Precautions can be taken to prevent this from occurring; one example is sketched below.

A program that runs as setuid-root does not automatically attach to the root lnode. Normally, the process remains attached to the lnode of the parent that created it, and only the effective UID is changed.
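One such precaution is to run system daemons attached to a dedicated lnode rather than the root lnode. As a sketch, assuming an account named daemonsrv with its own lnode and a daemon installed as /opt/myapp/bin/mydaemon (both hypothetical names), the srmuser command can start the daemon attached to that account's lnode:

# srmuser daemonsrv /opt/myapp/bin/mydaemon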

CPU Resources Not Controlled by Solaris Resource Manager

Solaris Resource Manager only controls CPU use by processes in the SHR scheduling class. If excessive demands are made at higher priority by other scheduling classes, especially real-time (RT) and system (SYS), then SHR can only schedule with the residual CPU resource.

The use of the RT class conflicts with the Solaris Resource Manager software's ability to control the system. Real-time processes get complete access to the system, specifically so that they can deliver real-time response (generally on the order of a few hundred microseconds). Processes running in the SHR class by definition have lower priority than anything running in real time, and Solaris Resource Manager has no control over RT processes. Real-time processes can easily consume all available resources, leaving Solaris Resource Manager with nothing to allocate to the remaining processes.

One notable system service that runs entirely in the SYS class is the NFS server. Solaris Resource Manager cannot control the NFS daemons, because they run in SYS. The Solaris Resource Manager product's ability to allocate processor resources may be reduced on systems offering extensive NFS service.

While processes are executing kernel code (inside a system call), the usual time-slice preemption rules do not apply. Most system calls will only do a reasonable amount of work before they reach a preemption point. However, if the system is under high load from more intensive system calls, this can result in reduced overall responsiveness and is outside the control of a scheduling class.

If the system is short of available real memory, the page-fault rate and process swapping increase, and the resulting I/O bottleneck leads to increased kernel consumption of CPU. Large amounts of time spent waiting on I/O may indicate lost CPU capacity. Again, this is outside the scope of a scheduling class to control.

The SHR scheduling class is a time-sharing (TS) scheduler. It uses the same global priority range as the TS and the interactive (IA) schedulers. It is not appropriate to mix the use of SHR with TS and IA except during the transition of moving all processes into or out of the SHR class. Operating the system with a mix of processes in the SHR and TS classes will reduce the quality of scheduling behavior in both classes. For this reason, Solaris Resource Manager prevents non-root processes from moving themselves or others to the TS or IA classes. The RT class uses an alternate priority range, so unlike the TS and IA classes it can be used alongside the SHR class.

If processes run by the superuser contain code that uses the priocntl(2) system call directly, instead of the setpriority(3C) library routine, to adjust process priorities, the target processes may be moved into another scheduling class (typically TS). The setpriority library routine accounts for the fact that the priocntl interface to SHR is binary compatible with that of TS, and thus avoids the problem. The -c option of ps(1) or the -d option of priocntl(1) can be used to display the scheduling class of processes.
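For example, to display the scheduling class of the process with ID 1234 (an illustrative PID), either of the following can be used:

% ps -cp 1234
% priocntl -d -i pid 1234

A process under Solaris Resource Manager control shows SHR in the CLS column of the ps output.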

The same difficulty arises with superuser-privileged processes that explicitly use priocntl(2) to manage the scheduling class membership of processes.

Orphaned Lnodes

An orphaned lnode is one that has a nonexistent parent lnode. This is of concern to an administrator because Solaris Resource Manager prevents processes from attaching to any lnode that is orphaned or has an orphaned ancestor in the scheduling tree.

The kernel checks changes to the sgroup attribute in order to prevent the creation of orphans by invalid alterations to the scheduling group parent.

The major effect of an lnode being orphaned is that it can no longer have processes attached to it. Since no process can connect to it, the lnode cannot be used for logging in. Any attempts to log in using the corresponding account will fail.

The easiest way for an administrator to detect orphaned lnodes is to use the limreport command with the built-in orphan identifier. The command:

% limreport orphan - uid sgroup lname

will list the UID, scheduling group parent, and login name of users who have orphaned lnodes. The sgroup attribute can be used to determine which of the lnodes is at the top of an orphaned section of the tree.

The first step an administrator should take when an orphaned lnode is discovered is to find the top of the orphaned section of the scheduling tree, since this is the lnode that needs to be reattached. If the top of the orphaned section is not correctly identified, only part of the orphaned section will be reattached to the tree.

When the top of the orphaned section has been determined, an administrator with sufficient privilege can use limadm to set the sgroup attribute of the topmost orphaned lnode to a valid lnode within the scheduling tree. This causes the orphaned lnode to be reattached to the tree as a member of the group that the valid lnode heads. limadm verifies that the new scheduling group parent can be activated, thus ensuring that the lnode being changed is no longer orphaned.
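For example, assuming the topmost orphaned lnode belongs to the login orphan1 and that dept1 heads a valid group in the scheduling tree (both hypothetical names), the orphaned section could be reattached with:

# limadm set sgroup=dept1 orphan1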

Alternatively, the administrator can create a new user whose UID is equal to the UID in the sgroup attribute of the orphaned lnode. This will cause the automatic reattachment of the orphaned section of the tree.
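As a sketch of this alternative, if the sgroup attribute of the orphaned lnode contains the UID 2001 (an illustrative value), creating an account with that UID triggers the reattachment:

# useradd -u 2001 newhead

The login name newhead is hypothetical; choose one appropriate to the site.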

Group Loops

When an lnode is made active, all of its parents up to the root lnode are also activated. If, during this process, the kernel encounters an lnode whose parent has already been visited, it has discovered a group loop.

If the limits database is corrupted, a group loop can occur, in which one of the ancestors of an lnode is also one of its children. When the kernel discovers a group loop, it silently and automatically repairs it: the loop is broken at an arbitrary point and connected as a group beneath the root lnode. The kernel cannot determine which lnode is uppermost, since a loop has no beginning or end. The lnode at the point where the loop is connected to the scheduling tree therefore becomes the header of a topmost group, and it is possible that members of this group will inherit privileges or higher limits than they would otherwise have.

Cause

Group loops are prevented by limadm when it sets scheduling group parents, so a group loop can only occur through corruption of the limits database. This is a serious problem, and it may cause many other difficulties in Solaris Resource Manager, since the limits database is fundamental to its operation.

Correction

The problem is self-correcting with respect to the structure of the scheduling tree since the kernel attaches the lnode to the root lnode. Because the attachment is from an arbitrary point in the loop, the administrator has to determine where the lnode should be attached and also check the point of attachment for every other member in the loop.

The result of automatic group loop repair can be seen by listing the lnodes that are children of the root lnode. The command:

% limreport 'sgroup==0' - uid lname

will list all lnodes that have the root lnode as their parent. If any lnodes are listed that should not be children of the root lnode, they are possibly the top of a group loop that has been attached beneath the root lnode.

The major concern for the administrator when a group loop is detected is that, since the cause of the group loop was corruption to the limits database, many more serious problems could arise. If the administrator suspects corruption in the limits database, it is best to carry out some validation checks against the file to determine if it has been corrupted and then take remedial action. Refer to Crash Recovery for details on detecting and correcting a corrupt limits database.

Resolving UID Conflicts

To verify that the UIDs assigned by the system for srmidle, srmlost, and srmother are not in conflict with any existing UIDs, type:

# /usr/bin/egrep '41|42|43' /etc/passwd
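Note that this pattern matches the digits anywhere on a line, so it can produce false positives (for example, UID 141, or a matching GID or home directory). As a sketch, a more precise check tests the UID field exactly:

# /usr/bin/awk -F: '$3 == 41 || $3 == 42 || $3 == 43' /etc/passwd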

If a conflict exists, you can change the UIDs by editing the password and shadow files, /etc/passwd and /etc/shadow.

Crash Recovery

There are many concerns for an administrator when a Solaris system has a failure, but there are two additional considerations when a Solaris Resource Manager system is being used: corruption of the limits database, and loss of connect-time information by limdaemon.

The following sections discuss these in detail and offer suggestions for handling the situation, where appropriate.

Corruption of the Limits Database

Solaris Resource Manager maintenance of the limits database is robust and corruption is unlikely. However, if corruption does occur, it is of major concern since this database is basic to the operation of Solaris Resource Manager. Any potential corruption should be investigated and, if detected, corrected.

Symptoms

No single symptom can reliably be used to determine whether the limits database has been corrupted, but a number of indicators potentially reflect a corrupted database.

If an administrator suspects corruption in the limits database, the best way to detect it is to use limreport to request a list of lnodes whose attributes should have values within a known range. If values outside that range are reported, corruption has taken place. limreport can also be used to list lnodes that have flag.real clear; this indicates accounts in the password map for which no lnode exists.
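As a sketch, assuming the C-style selection syntax shown elsewhere in this chapter also applies to flag attributes, such a list could be requested with:

% limreport 'flag.real==0' - uid lname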

Correction

When corruption is detected, the administrator should revert to an uncorrupted version of the limits database. If the corruption is limited to a small section of the database, the administrator may be able to save the contents of all other lnodes and reload them into a fresh limits database using the limreport and limadm commands. This is preferable when no recent copy of the limits database is available, since the new database will then contain the most recent usage and accrue attributes. The procedure for saving and restoring the limits database is documented in Chapter 5, Managing Lnodes. For simple cases of missing lnodes, it may be sufficient to simply recreate them with the limadm command.
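For example, to recreate a single missing lnode for the login lost1 beneath the group header dept1 (both hypothetical names), an administrator with sufficient privilege could use:

# limadm set sgroup=dept1 lost1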

Connect-Time Loss by limdaemon

If limdaemon terminates for any reason, all users currently logged in cease to be charged for connect-time usage. Furthermore, when limdaemon is restarted, any users still logged in continue to use those terminals free of charge. This is because the daemon relies on notifications from login to establish a Solaris Resource Manager login session record within the internal structures it uses to calculate connect-time usage. Whenever the daemon starts, therefore, no Solaris Resource Manager login sessions exist until the first notification is received.

Typically this is not a problem when limdaemon terminated because of a system crash, since the crash also terminates all other processes, and login sessions cannot recommence until the system is restarted.

If limdaemon terminates for some other reason, the administrator has two choices:

  1. Restart the daemon immediately, and ignore the lost charging of terminal connect-time for users who are already logged in. This could mean that a user has free use of a terminal indefinitely unless identified and logged out.

  2. Bring the system back to single-user mode then return to multi-user mode, thus ensuring that all current login sessions are terminated and users can only log in again after the daemon has been restarted.
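As a sketch, the cycle described in option 2 can be performed with init(1M) using the standard Solaris run levels:

# init s

and then, once the system has reached single-user mode:

# init 3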