System Administration Guide: Resource Management and Network Services

Chapter 9 Physical Memory Control Using the Resource Capping Daemon

The resource capping daemon rcapd regulates physical memory consumption by processes running in projects that have resource caps defined.

Resource Capping Daemon Overview

A resource cap is an upper bound placed on the consumption of a resource, such as physical memory. Per-project physical memory caps are supported.

The resource capping daemon and its associated utilities provide mechanisms for physical memory resource cap enforcement and administration.

Like the resource control, the resource cap can be defined by using attributes of project entries in the project database. However, while resource controls are synchronously enforced by the kernel, resource caps are asynchronously enforced at the user level by the resource capping daemon. With asynchronous enforcement, a small delay occurs as a result of the sampling interval used by the daemon.

For information about rcapd, see the rcapd(1M) man page. For information about projects and the project database, see Chapter 5, Projects and Tasks and the project(4) man page. For information about resource controls, see Chapter 7, Resource Controls.

How Resource Capping Works

The daemon repeatedly samples the resource utilization of projects that have physical memory caps. The sampling interval used by the daemon is specified by the administrator. See Determining Sample Intervals for additional information. When the system's physical memory utilization exceeds the threshold for cap enforcement, and other conditions are met, the daemon takes action to reduce the resource consumption of projects with memory caps to levels at or below the caps.

The virtual memory system divides physical memory into segments known as pages. Pages are the fundamental unit of physical memory in the Solaris memory management subsystem. To read data from a file into memory, the virtual memory system reads in one page at a time, or pages in a file. To reduce resource consumption, the daemon can page out, or relocate, infrequently used pages to a swap device, which is an area outside of physical memory.

The daemon manages physical memory by regulating the size of a project workload's resident set relative to the size of its working set. The resident set is the set of pages that are resident in physical memory. The working set is the set of pages that the workload actively uses during its processing cycle. The working set changes over time, depending on the process's mode of operation and the type of data being processed. Ideally, every workload has access to enough physical memory to enable its working set to remain resident. However, the working set can also include the use of secondary disk storage to hold the memory that does not fit in physical memory.

Only one instance of rcapd can run at any given time.

Attribute to Limit Physical Memory Usage

To define a physical memory resource cap for a project, establish a resident set size (RSS) cap by adding this attribute to the project database entry:

rcap.max-rss

The total amount of physical memory, in bytes, that is available to processes in the project.

For example, the following line in the /etc/project database sets an RSS cap of 10 gigabytes for a project named db.


db:100::db,root::rcap.max-rss=10737418240

Note –

The system might round the specified cap value to a page size.


rcapd Configuration

You use the rcapadm command to configure the resource capping daemon. You can perform the following actions:

To configure the daemon, you must have superuser privileges or have the Process Management profile in your list of profiles. The Process Management role and the System Administrator role both include the Process Management profile. See “RBAC Elements: Reference Information” in System Administration Guide: Security Services.

Configuration changes can be incorporated into rcapd according to the configuration interval (see rcapd Operation Intervals) or on demand by sending a SIGHUP (see the kill(1) man page).

If used without arguments, rcapadm displays the current status of the resource capping daemon if it has been configured.

The following subsections discuss cap enforcement, cap values, and rcapd operation intervals.

Memory Cap Enforcement Threshold

The memory cap enforcement threshold is the percentage of physical memory utilization on the system that triggers cap enforcement. When the system exceeds this utilization, caps are enforced. The physical memory used by applications and the kernel is included in this percentage. The percentage of utilization determines the way in which memory caps are enforced.

To enforce caps, memory can be paged out from project workloads.

A workload is permitted to use physical memory up to its cap. A workload can use additional memory as long as the system's memory utilization stays below the memory cap enforcement threshold.

To set the value for cap enforcement, see How to Set the Memory Cap Enforcement Threshold.

Determining Cap Values

If a project cap is set too low, there might not be enough memory for the workload to proceed effectively under normal conditions. The paging that occurs because the workload requires more memory has a negative effect on system performance.

Projects that have caps set too high can consume available physical memory before their caps are exceeded. In this case, physical memory is effectively managed by the kernel and not by rcapd.

In determining caps on projects, consider these factors.

Impact on I/O system

The daemon can attempt to reduce a project workload's physical memory usage whenever the sampled usage exceeds the project's cap. During cap enforcement, the swap devices and other devices that contain files that the workload has mapped are used. The performance of the swap devices is a critical factor in determining the performance of a workload that routinely exceeds its cap. The execution of the workload is similar to running it on a machine with the same amount of physical memory as the workload's cap.

Impact on CPU usage

The daemon's CPU usage varies with the number of processes in the project workloads it is capping and the sizes of the workloads' address spaces.

A small portion of the daemon's CPU time is spent sampling the usage of each workload. Adding processes to workloads increases the time spent sampling usage.

Another portion of the daemon's CPU time is spent enforcing caps when they are exceeded. The time spent is proportional to the amount of virtual memory involved. CPU time spent increases or decreases in response to corresponding changes in the total size of a workload's address space. This information is reported in the vm column of rcapstat output. See Monitoring Resource Utilization With rcapstat and the rcapstat(1) man page for more information.

Reporting on shared memory

The daemon cannot determine which pages of memory are shared with other processes or which are mapped multiple times within the same process. Since rcapd assumes that each page is unique, this results in a discrepancy between the reported (estimated) RSS and the actual RSS.

Certain workloads, such as databases, use shared memory extensively. For these workloads, you can sample a project's regular usage to determine a suitable initial cap value. Use output from the prstat command with the -J option. See the prstat(1M) man page.

rcapd Operation Intervals

You can tune the intervals for the periodic operations performed by rcapd.

All intervals are specified in seconds. The rcapd operations and their default interval values are described in the following table.

Operation 

Default Interval Value in Seconds 

Description 

scan

15 

Number of seconds between scans for processes that have joined or left a project workload. Minimum value is 1 second. 

sample

Number of seconds between samplings of resident set size and subsequent cap enforcements. Minimum value is 1 second. 

report

5  

Number of seconds between updates to paging statistics. If set to 0, statistics are not updated, and output from rcapstat is not current.

config

60 

Number of seconds between reconfigurations. In a reconfiguration event, rcapadm reads the configuration file for updates, and scans the project database for new or revised project caps. Sending a SIGHUP to rcapd causes an immediate reconfiguration.

To tune intervals, see How to Set Operation Intervals.

Determining rcapd Scan Intervals

The scan interval controls how often rcapd looks for new processes. On systems with many processes running, the scan through the list takes more time, so it might be preferable to lengthen the interval in order to reduce the overall CPU time spent. However, the scan interval also represents the minimum amount of time that a process must exist to be attributed to a capped workload. If there are workloads that run many short-lived processes, rcapd might not attribute the processes to a workload if the scan interval is lengthened.

Determining Sample Intervals

The sample interval configured with rcapadm is the shortest amount of time rcapd waits between sampling a workload's usage and enforcing the cap if it is exceeded. If you reduce this interval, rcapd will, under most conditions, enforce caps more frequently, possibly resulting in increased I/O due to paging. However, a shorter sample interval can also lessen the impact that a sudden increase in a particular workload's physical memory usage might have on other workloads. The window between samplings, in which the workload can consume memory unhindered and possibly take memory from other capped workloads, is narrowed.

If the sample interval specified to rcapstat is shorter than the interval specified to rcapd with rcapadm, the output for some intervals can be zero. This situation occurs because rcapd does not update statistics more frequently than the interval specified with rcapadm. The interval specified with rcapadm is independent of the sampling interval used by rcapstat.

Monitoring Resource Utilization With rcapstat

Use rcapstat to monitor the resource utilization of capped projects. To view an example rcapstat report, see Producing Reports With rcapstat.

You can set the sampling interval for the report and specify the number of times that statistics are repeated.

interval

Specifies the sampling interval in seconds. The default interval is 5 seconds.

count

Specifies the number of times that the statistics are repeated. By default, rcapstat reports statistics until a termination signal is received or until the rcapd process exits.

The paging statistics in the first report issued by rcapstat show the activity since the daemon was started. Subsequent reports reflect the activity since the last report was issued.

The following table defines the column headings in an rcapstat report.

rcapstat Column Headings

Description 

id

The project ID of the capped project. 

project

The project name. 

nproc

The number of processes in the project. 

vm

The total amount of virtual memory size used by processes in the project, in kilobytes (K), megabytes (M), or gigabytes (G). 

rss

The estimated amount of the total resident set size (RSS) of the processes in the project, in kilobytes (K), megabytes (M), or gigabytes (G), not accounting for pages that are shared. 

cap

The RSS cap defined for the project. See Attribute to Limit Physical Memory Usage or the rcapd(1M) man page for information about how to specify memory caps.

at

The total amount of memory that rcapd attempted to page out since the last rcapstat sample.

avgat

The average amount of memory that rcapd attempted to page out during each sample cycle that occurred since the last rcapstat sample. The rate at which rcapd samples collection RSS can be set with rcapadm. See rcapd Operation Intervals.

pg

The total amount of memory that rcapd successfully paged out since the last rcapstat sample.

avgpg

An estimate of the average amount of memory that rcapd successfully paged out during each sample cycle that occurred since the last rcapstat sample. The rate at which rcapd samples process RSS sizes can be set with rcapadm. See rcapd Operation Intervals.

Administering the Resource Capping Daemon With rcapadm

This section contains procedures for configuring the resource capping daemon with rcapadm. See rcapd Configuration and the rcapadm(1M) man page for more information.

If used without arguments, rcapadm displays the current status of the resource capping daemon if it has been configured.

How to Set the Memory Cap Enforcement Threshold

Caps can be configured so that they will not be enforced until the physical memory available to processes is low. See Memory Cap Enforcement Threshold for more information.

The minimum (and default) value is 0, which means that memory caps are always enforced. To set a different minimum, follow this procedure.

  1. Become superuser.

  2. Use the -c option of rcapadm to set a different physical memory utilization value for memory cap enforcement.


    # rcapadm -c percent
    

    percent is in the range 0 to 100. Higher values are less restrictive. A higher value means capped project workloads can execute without having caps enforced until the system's memory utilization exceeds this threshold.

To display the current physical memory utilization and the cap enforcement threshold, see Reporting Memory Utilization and the Memory Cap Enforcement Threshold.

How to Set Operation Intervals

rcapd Operation Intervals contains information about the intervals for the periodic operations performed by rcapd. To set operation intervals using rcapadm, follow this procedure.

  1. Become superuser.

  2. Use the -i option to set interval values.


    # rcapadm -i interval=value,...,interval=value 
    

All interval values are specified in seconds.

How to Enable Resource Capping

There are two ways to enable resource capping on your system.

  1. Become superuser.

  2. Enable the resource capping daemon in one of the following ways:

    • To enable the resource capping daemon so that it will be started now and also be started each time the system is booted, type:


      # rcapadm -E
      
    • To enable the resource capping daemon at boot without starting it now, also specify the -n option:


      # rcapadm -n -E
      

How to Disable Resource Capping

There are two ways to disable resource capping on your system.

  1. Become superuser.

  2. Disable the resource capping daemon in one of the following ways:

    • To disable the resource capping daemon so that it will be stopped now and not be started when the system is booted, type:


      # rcapadm -D
      
    • To disable the resource capping daemon without stopping it, also specify the -n option:


      # rcapadm -n -D
      

Note –

Use rcapadm -D to safely disable rcapd. If the daemon is killed (see the kill(1) man page), processes might be left in a stopped state and need to be manually restarted. To resume a process running, use the prun command. See the prun(1) man page for more information.


Producing Reports With rcapstat

Use rcapstat to report resource capping statistics. Monitoring Resource Utilization With rcapstat explains how to use the rcapstat command to generate reports. That section also describes the column headings in the report. The rcapstat(1) man page also contains this information.

The following subsections use examples to illustrate how to produce reports for specific purposes.

Reporting Cap and Project Information

In this example, caps are defined for two projects associated with two users. user1 has a cap of 50 megabytes, and user2 has a cap of 10 megabytes.

The following command produces five reports at 5-second sampling intervals.


user1machine% rcapstat 5 5
    id project  nproc     vm    rss   cap    at avgat    pg avgpg
112270   user1     24   123M    35M   50M   50M    0K 3312K    0K
 78194   user2      1  2368K  1856K   10M    0K    0K    0K    0K
    id project  nproc     vm    rss   cap    at avgat    pg avgpg
112270   user1     24   123M    35M   50M    0K    0K    0K    0K
 78194   user2      1  2368K  1856K   10M    0K    0K    0K    0K
    id project  nproc     vm    rss   cap    at avgat    pg avgpg
112270   user1     24   123M    35M   50M    0K    0K    0K    0K
 78194   user2      1  2368K  1928K   10M    0K    0K    0K    0K
    id project  nproc     vm    rss   cap    at avgat    pg avgpg
112270   user1     24   123M    35M   50M    0K    0K    0K    0K
 78194   user2      1  2368K  1928K   10M    0K    0K    0K    0K
    id project  nproc     vm    rss   cap    at avgat    pg avgpg
112270   user1     24   123M    35M   50M    0K    0K    0K    0K
 78194   user2      1  2368K  1928K   10M    0K    0K    0K    0K 

The first three lines of output constitute the first report, which contains the cap and project information for the two projects and paging statistics since rcapd was started. The at and pg columns are a number greater than zero for user1 and zero for user2, which indicates that at some time in the daemon's history, user1 exceeded its cap but user2 did not.

The subsequent reports show no significant activity.

Monitoring the RSS of a Project

The following example shows project user1, which has an RSS in excess of its RSS cap.

The following command produces five reports at 5-second sampling intervals.


user1machine% rcapstat 5 5

    id project  nproc    vm   rss   cap    at avgat     pg  avgpg
376565   user1      3 6249M 6144M 6144M  690M  220M  5528K  2764K
376565   user1      3 6249M 6144M 6144M    0M  131M  4912K  1637K
376565   user1      3 6249M 6171M 6144M   27M  147M  6048K  2016K
376565   user1      3 6249M 6146M 6144M 4872M  174M  4368K  1456K
376565   user1      3 6249M 6156M 6144M   12M  161M  3376K  1125K

The user1 project has three processes that are actively using physical memory. The positive values in the pg column indicate that rcapd is consistently paging out memory as it attempts to meet the cap by lowering the physical memory utilization of the project's processes. However, rcapd does not succeed in keeping the RSS below the cap value. This is indicated by the varying rss values that do not show a corresponding decrease. As soon as memory is paged out, the workload uses it again and the RSS count goes back up. This means that all of the project's resident memory is being actively used and the working set size (WSS) is greater than the cap. Thus, rcapd is forced to page out some of the working set to meet the cap. Under this condition, the system will continue to experience high page fault rates, and associated I/O, until one of the following occurs:

In this situation, shortening the sample interval might reduce the discrepancy between the RSS value and the cap value by causing rcapd to sample the workload and enforce caps more frequently.


Note –

A page fault occurs when either a new page must be created or the system must copy in a page from a swap device.


Determining the Working Set Size of a Project

The following example is a continuation of the previous example, and it uses the same project.

The previous example shows that the user1 project is using more physical memory than its cap allows. This example shows how much memory the project workload requires.


user1machine% rcapstat 5 5
    id project  nproc    vm   rss   cap    at avgat     pg  avgpg
376565   user1      3 6249M 6144M 6144M  690M    0K   689M     0K
376565   user1      3 6249M 6144M 6144M    0K    0K     0K     0K
376565   user1      3 6249M 6171M 6144M   27M    0K    27M     0K
376565   user1      3 6249M 6146M 6144M 4872K    0K  4816K     0K
376565   user1      3 6249M 6156M 6144M   12M    0K    12M     0K
376565   user1      3 6249M 6150M 6144M 5848K    0K  5816K     0K
376565   user1      3 6249M 6155M 6144M   11M    0K    11M     0K
376565   user1      3 6249M 6150M   10G   32K    0K    32K     0K
376565   user1      3 6249M 6214M   10G    0K    0K     0K     0K
376565   user1      3 6249M 6247M   10G    0K    0K     0K     0K
376565   user1      3 6249M 6247M   10G    0K    0K     0K     0K
376565   user1      3 6249M 6247M   10G    0K    0K     0K     0K
376565   user1      3 6249M 6247M   10G    0K    0K     0K     0K
376565   user1      3 6249M 6247M   10G    0K    0K     0K     0K
376565   user1      3 6249M 6247M   10G    0K    0K     0K     0K

Halfway through the cycle, the cap on the user1 project was increased from 6 gigabytes to 10 gigabytes. This increase stops cap enforcement and allows the resident set size to grow, limited only by other processes and the amount of memory in the machine. The rss column might stabilize to reflect the project working set size (WSS), 6247M in this example. This is the minimum cap value that allows the project's processes to operate without continually incurring page faults.

The following two figures graphically show the effect rcapd has on user1 while the cap is 6 gigabytes and 10 gigabytes. Every 5 seconds, corresponding to the sample interval, the RSS decreases and I/O increases as rcapd pages out some of the workload's memory. Shortly after the page out completes, the workload, needing those pages, pages them back in as it continues running. This cycle repeats until the cap is raised to 10 gigabytes approximately halfway through the example, and the RSS stabilizes at 6.1 gigabytes. Since the workload's RSS is now below the cap, no more paging occurs. The I/O associated with paging stops as well, as the vmstat (see vmstat(1M)) or iostat (see iostat(1M)) commands would show. Thus, you can infer that the project required 6.1 gigabytes to perform the work it was doing at the time it was being observed.

Figure 9–1 Stabilizing RSS Values After Raising the Cap of user1 Higher Than user1's WSS

Graph shows stabilizing values after cap of user1 is raised higher than user1's WSS.

Figure 9–2 Relationship Between Page Ins and Page Outs, and the Stabilization of I/O After user1's Cap Is Raised

Graph shows relationship between page ins and page outs, and I/O stabilization after user1's cap is raised.

Reporting Memory Utilization and the Memory Cap Enforcement Threshold

You can use the -g option of rcapstat to report the following:

The -g option causes a memory utilization and cap enforcement line to be printed at the end of the report for each interval.


# rcapstat -g
    id project   nproc    vm   rss   cap    at avgat   pg  avgpg
376565    rcap       0    0K    0K   10G    0K    0K   0K     0K
physical memory utilization: 55%   cap enforcement threshold: 0%
    id project   nproc    vm   rss   cap    at avgat   pg  avgpg
376565    rcap       0    0K    0K   10G    0K    0K   0K     0K
physical memory utilization: 55%   cap enforcement threshold: 0%