Solaris Resource Manager 1.3 System Administration Guide

Examples

The examples in this section demonstrate Solaris Resource Manager functions used to control system resources and allocation, and to display information.

Server Consolidation

The first example illustrates these commands:

liminfo

Prints user attributes and limits information for one or more users to a terminal window

limadm

Changes limit attributes or deletes limits database entries for a list of users

srmadm

Displays or sets operation modes and system-wide Solaris Resource Manager tunable parameters

srmstat

Displays lnode activity information

Consider the case of consolidating two servers, each running a database application, onto a single machine. Simply running both applications on the single machine results in a working system. Without Solaris Resource Manager, the Solaris system allocates resources to the applications on an equal-use basis and does not protect one application from the competing demands of the other. Solaris Resource Manager, however, provides mechanisms that keep the applications from suffering resource starvation.

With Solaris Resource Manager, this is accomplished by starting each database attached to its own lnode, db1 and db2. To do this, three new administrative placeholder users must be created, for example, databases, db1, and db2. These are added to the limits database; because lnodes correspond to UNIX UIDs, the users must also be added to the passwd file (or password map, if the system is using a name service such as NIS or NIS+). Assuming that the UIDs have been added to the passwd file or password map, the placeholder users db1 and db2 are assigned to the databases lnode group with these commands:

# limadm set sgroup=0 databases
# limadm set sgroup=databases db1 db2

These commands assume that /usr/srm/bin is in the user's path.
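
If the placeholder UIDs for databases, db1, and db2 do not already exist, they must be created before the limadm commands above are run. On a system with local accounts, one way to do this is with useradd(1M); the UID values shown are arbitrary examples, not values required by Solaris Resource Manager:

# useradd -u 5001 databases
# useradd -u 5002 db1
# useradd -u 5003 db2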

Figure 10-1 Server Consolidation

Diagram illustrates the consolidation of two servers, each running a database application, onto a single machine.

Because there are no other defined groups, the databases group currently has full use of the machine. Two lnodes associated with the databases are running, and the processes that run the database applications are attached to the appropriate lnodes with the srmuser command in the startup script for the database instances.

# srmuser db1 /usr/bin/database1/init.db1
# srmuser db2 /usr/bin/database2/init.db2

When either database, db1 or db2, is started up, use the srmuser command to ensure that the database is attached to the correct lnode and charged correctly (srmuser attaches the process to the lnode without changing the ownership of the process). To run the above command, a user must have the UNIX permissions required to run init.db1 and the administrative permission to attach processes to the lnode db1. As users log in and use the databases, the activities performed by the databases are accrued to the lnodes db1 and db2.

By using the default allocation of one share to each lnode, the usage within the databases group averages out over time so that the databases, db1 and db2, each receive an equal allocation of the machine. Specifically, there is one share outstanding at the level of the databases group, and databases owns it. Each of the lnodes db1 and db2 is also granted the default allocation of one share. Within the databases group there are two shares outstanding, so db1 and db2 get equal allocations of the databases group's resources (in this simple example, there are no competing allocations, so databases has access to the entire system).

If it turns out that activity on Database1 requires 60 percent of the machine's CPU capacity and Database2 requires 20 percent of the capacity, the administrator can specify that the system provide at least this much (assuming that the application demands it) by increasing the number of cpu.shares allocated to db1:

# limadm set cpu.shares=3 db1

There are now four shares outstanding in the databases group; db1 has three, and db2 has one. This change takes effect immediately when the command is run. There will be a period of settling during which the lnode db1 (Database1) actually receives more than its 75 percent entitlement (3 of the 4 outstanding shares), because Solaris Resource Manager averages the usage over time. However, depending on the global decay parameter, this period will not last long.

To monitor this activity at any point, use the commands liminfo (see A Typical Application Server) and srmstat, in separate windows. Note that srmstat provides a regularly updating display. For additional information on srmstat, see srmstat(1SRM).
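
For example, running liminfo on each database lnode in one window and srmstat in another gives a quick view of how the new share allocation is being applied (a minimal check; the exact output depends on the system):

% liminfo db1
% liminfo db2
% srmstat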

You now have a machine running with two database applications, one receiving 75 percent of the resource and the other receiving 25 percent. Remember that root is the top-level group header user. Processes running as root thus have access to the entire system, if they so request. Accordingly, additional lnodes should be created for running backups, daemons, and other scripts so that the root processes cannot possibly take over the whole machine, as they might if run in the traditional manner.
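
As a sketch, such a housekeeping lnode might be set up as follows. The lnode name daemons, its share count, and the backup script path are illustrative assumptions only, and the corresponding placeholder UID must exist just as for the database lnodes:

# limadm set sgroup=0 daemons
# limadm set cpu.shares=1 daemons
# srmuser daemons /path/to/backup_script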

Adding a Computational Batch Application User

This example introduces the following command:

srmkill

Kills all the active processes attached to an lnode

The Finance department owns the database system, but Joe, a user from Engineering, has to run a computational batch job and would like to use Finance's machine during off hours when the system is generally idle. The Finance department dictates that Joe's job is less important than the databases, and agrees to run his work only if it will not interfere with the system's primary job. To enforce this policy, add a new group (batch) to the lnode database, and add Joe to the new batch group of the server's lnode hierarchy:

# limadm set cpu.shares=20 databases
# limadm set cpu.shares=1 batch
# limadm set cpu.shares=1 joe
# limadm set sgroup=batch joe

Figure 10-2 Adding a Computation Batch Application

Diagram shows addition of a new group called batch to the lnode database and server hierarchy, and addition of user Joe to the new batch group.

This command sequence changes the allocation of shares so that the databases group has 20 shares, while the batch group has just one. This specifies that members of the batch group (only Joe) will use at most 1/21 of the machine if the databases group is active. The databases group receives 20/21 of the machine, or 95.2 percent, which is more than the 60 percent + 20 percent = 80 percent previously determined to be sufficient to handle the database work. If the databases are not requesting their full allocation, Joe will receive more than his 4.8 percent allocation. If the databases are completely inactive, Joe's allocation might reach 100 percent. When the number of outstanding shares allocated to databases is increased from 1 to 20, there is no need to make any changes to the allocation of shares for db1 and db2. Within the databases group, there are still four shares outstanding, allocated in the 3:1 ratio. Different levels of the scheduling tree are totally independent; what matters is the ratio of shares between peer groups.

Despite these assurances, the Finance department further wants to ensure that Joe is not even able to log in during prime daytime hours. This can be accomplished by putting some login controls on the batch group. Since the controls are sensitive to time of day, run a script that only permits the batch group to log in at specific times. For example, this could be implemented with crontab entries, such as:

0 6 * * * /usr/srm/bin/limadm set flag.nologin=set batch 
0 18 * * * /usr/srm/bin/limadm set flag.nologin=clear batch

At 6:00 a.m., the batch group loses permission to log in; at 18:00 (6:00 p.m.), the restriction is removed.
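
One way to install these entries is to append them to root's existing crontab (a sketch; the temporary file name is arbitrary):

# crontab -l > /tmp/root.cron
# echo '0 6 * * * /usr/srm/bin/limadm set flag.nologin=set batch' >> /tmp/root.cron
# echo '0 18 * * * /usr/srm/bin/limadm set flag.nologin=clear batch' >> /tmp/root.cron
# crontab /tmp/root.cron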

An even stricter policy can be implemented by adding another line to the crontab entry:

01 6 * * * /usr/srm/bin/srmkill joe

This uses the srmkill(1MSRM) command to kill any processes attached to the lnode joe at 6:01 a.m. This is not necessary if the only resources that the job requires are those controlled by Solaris Resource Manager. The action could be useful, however, if Joe's job could reasonably tie up other resources that would interfere with normal work. An example would be a job that holds a key database lock or dominates an I/O channel.

Joe can now log in and run his job only at night. Because Joe (and the entire batch group) has significantly fewer shares than the other applications, his application will run with less than 5 percent of the machine. Similarly, nice(1) can be used to reduce the priority of processes attached to this job, so it runs at lower priority than other jobs running with equal Solaris Resource Manager shares.
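
For example, Joe's job could be started through srmuser at reduced priority; the job path below is a hypothetical placeholder:

# srmuser joe nice -10 /path/to/joes_batch_job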

At this point, the Finance department has ensured that its database applications have sufficient access to this system and will not interfere with each other's work. The department has also accommodated Joe's overnight batch processing loads, while ensuring that his work also will not interfere with the department's mission-critical processing.

Putting on a Web Front-end Process

Assume a decision has been made to put a web front-end on Database1, but limit this application to no more than 10 users at a time. Use the process limits function to do this.

First, create a new lnode called ws1. By starting the Webserver application under the ws1 lnode, you can control the number of processes that are available to it, and hence the number of active http sessions.

Figure 10-3 Adding a Web Front-end Process

Diagram shows adding a web front-end process under the db1 lnode.

Since Webserver is part of the Database1 application, you might want to give it a share of the db1 lnode and allow it to compete with Database1 for resources. Allocate 60 percent of compute resources to the Webserver and 40 percent to the Database1 application itself:

# limadm set cpu.shares=6 ws1
# limadm set sgroup=db1 ws1
# limadm set cpu.myshares=4 db1
# srmuser ws1 /etc/bin/Webserver1/init.webserver 

The last line starts up the Webserver and charges the application to the ws1 lnode. Note that for Database1, cpu.myshares has been set to 4. This sets the ratio of shares at which db1 competes with its child lnode ws1 (the Webserver) to 4:6.


Note -

cpu.shares shows the ratio for resource allocation at the peer level in a hierarchy, while cpu.myshares shows the ratio for resource allocation at the parent:children level when the parent is actively running applications. Solaris Resource Manager allocates resources based on the ratio of outstanding shares of all active lnodes at their respective levels, where "respective level" includes the cpu.myshares of the group parent and all children.


To control the number of processes that Webserver can run, put a process limit on the ws1 lnode. The example uses 20 since a Webserver query will typically spawn 2 processes, so this in fact limits the number of active Webserver queries to 10:

# limadm set process.limit=20 ws1

Another application has now been added to the scheduling tree, as a leaf node under an active lnode. To distribute the CPU resource between the active parent and child, use cpu.myshares to allocate some portion of the available resource to the parent and some to the child. Process limits are used to limit the number of active sessions on an lnode.

Adding More Users Who Have Special Memory Requirements

In addition to the resource control mechanisms already shown (CPU sharing, process limits, and login controls), this example adds virtual memory limits and introduces the following commands for administering Solaris Resource Manager, reporting on lnodes, and notifying users when limits are reached:

srmadm

Administers Solaris Resource Manager

limreport

Outputs information on selected users

limdaemon

Directs daemon to send messages when any limits are reached

Another user, Sally, has also asked to use the machine at night for her application. Because her application is memory-intensive, put a limit on Sally's usage of virtual memory, in terms of both her total usage and her per-process usage, to ensure that Joe's application does not suffer:

# limadm set memory.limit=50M sally
# limadm set memory.plimit=25M sally

Figure 10-4 Adding More Users

Diagram shows adding more users with specific memory limits.

If Sally's application tries to exceed either her total virtual memory limit or her per-process memory limit, the limdaemon command notifies Sally and the system administrator, through the console, that the limit has been exceeded.
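
Sally's current limits and usage can be checked at any time with liminfo:

% liminfo sally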

Use the limreport command to generate a report of who is on the system and their usages to date. A typical use of limreport is to see who is using the machine at any time and how they fit within the hierarchy of users:

% limreport 'flag.real' - uid sgroup lname cpu.shares cpu.usage |sort +1n +0n


Note -

limreport has several parameters. In this example, a check is made on flag.real (looking only for "real" lnodes/UIDs); the dash (-) indicates that the default best guess for the output format should be used; and the list "uid sgroup lname cpu.shares cpu.usage" indicates that limreport should output these five parameters for each lnode that has flag.real set to TRUE. The output is piped to the UNIX sort command, which sorts primarily on the second column and secondarily on the first column, to provide a simple report of who is using the server.


Anyone with the correct path and permissions can check the status of Solaris Resource Manager at any time by using the srmadm show command. This outputs a formatted report of the current operating state of Solaris Resource Manager and its main configuration parameters. The report is useful for verifying that Solaris Resource Manager is active and that the controlling parameters are set as intended. It also shows the values of global parameters such as the decay rate and the location of the Solaris Resource Manager data store.
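
For example:

# srmadm show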

It is possible to run Solaris Resource Manager without limits active and without CPU scheduling active, which can be useful at startup for debugging and for initially configuring the Solaris Resource Manager product:

# srmadm set share=n:limits=n
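
When debugging and initial configuration are complete, both subsystems can be turned back on by setting the same flags to y (this assumes the flags take the y/n values shown in the command above):

# srmadm set share=y:limits=y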

Sharing a Machine Across Departments

A different development group would like to purchase an upgrade for this machine (more processors and memory) in exchange for gaining access to the system when it is idle. Both groups should benefit. To set this up, establish a new group called development at the same level as databases and batch. Allocate development 33 percent of the machine since they have added 50 percent more CPU power and memory to the original system.

Figure 10-5 Sharing a Machine, Step 1

Diagram illustrates sharing a machine. Context provided in surrounding text.

The Development group has hundreds of users. To avoid being involved in the distribution of that group's resources, use the administration flag capability of Solaris Resource Manager to enable the Development system administrator to allocate the group's resources. You and the Development administrator set up limits at the operations and development level as jointly agreed, and each of you then does the work required to control your own portion of the machine.

To add the new level into the hierarchy, add the group operations as a new lnode, and change the parent group of batch and databases to operations:

# limadm set sgroup=operations batch databases

To set the administration flag:

# limadm set flag.admin=set operations development
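
One way to realize the agreed 33 percent allocation for development is to place both groups at the top level and give operations twice as many shares as development. The share values here are illustrative, and the commands assume that placeholder UIDs for operations and development already exist:

# limadm set sgroup=0 operations development
# limadm set cpu.shares=2 operations
# limadm set cpu.shares=1 development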

Since under normal circumstances all servers have daemons and backup processes to be run, these should be added on a separate high-level lnode.


Note -

Do not use the root lnode, since it has no limits.


Figure 10-6 Sharing a Machine, Step 2

Diagram continues the example of sharing a machine. Context provided in surrounding paragraphs.

As seen in the examples, you can use Solaris Resource Manager to consolidate several different types of users and applications on the same machine. By judicious use of CPU share controls, virtual memory limits, process limits, and login controls, you can ensure that these diverse applications receive only the resources they need. The limits ensure that no application or user will adversely impact the applications of any other user or group of users. The Solaris Resource Manager product supports simple reporting tools that show users and system administrators exactly what is happening at any given moment and over the course of time. The report generation capability can be used to show the breakdown of resource usage across applications and groups for capacity planning and billing purposes.