System Administration Guide: Virtualization Using the Solaris Operating System

Chapter 6 Resource Controls (Overview)

After you determine the resource consumption of workloads on your system as described in Chapter 4, Extended Accounting (Overview), you can place boundaries on resource usage. Boundaries prevent workloads from over-consuming resources. The resource controls facility is the constraint mechanism that is used for this purpose.

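This chapter covers resource controls concepts, configuring resource controls and attributes, applying resource controls, temporarily updating resource control values on a running system, and the commands used with resource controls.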

For information about how to administer resource controls, see Chapter 7, Administering Resource Controls (Tasks).

Resource Controls Concepts

In the Solaris Operating System, the concept of a per-process resource limit has been extended to the task and project entities described in Chapter 2, Projects and Tasks (Overview). These enhancements are provided by the resource controls (rctls) facility. In addition, allocations that were set through the /etc/system tunables are now automatic or configured through the resource controls mechanism as well.

A resource control is identified by the prefix zone, project, task, or process. Resource controls can be observed on a system-wide basis. It is possible to update resource control values on a running system.

For a list of the standard resource controls that are available in this release, see Available Resource Controls. For information on available zone-wide resource controls, see Resource Type Properties.

Resource Limits and Resource Controls

UNIX systems have traditionally provided a resource limit facility (rlimit). The rlimit facility allows administrators to set one or more numerical limits on the amount of resources a process can consume. These limits include per-process CPU time used, per-process core file size, and per-process maximum heap size. Heap size is the amount of scratch memory that is allocated for the process data segment.

The resource controls facility provides compatibility interfaces for the resource limits facility. Existing applications that use resource limits continue to run unchanged. These applications can be observed in the same way as applications that are modified to take advantage of the resource controls facility.

Interprocess Communication and Resource Controls

Processes can communicate with each other by using one of several types of interprocess communication (IPC). IPC allows information transfer or synchronization to occur between processes. Prior to the Solaris 10 release, IPC tunable parameters were set by adding an entry to the /etc/system file. The resource controls facility now provides resource controls that define the behavior of the kernel's IPC facilities. These resource controls replace the /etc/system tunables.

Obsolete parameters might be included in the /etc/system file on this Solaris system. If so, the parameters are used to initialize the default resource control values as in previous Solaris releases. However, using the obsolete parameters is not recommended.

To observe which IPC objects are contributing to a project's usage, use the ipcs command with the -J option. See How to Use ipcs to view an example display. For more information about the ipcs command, see ipcs(1).

For information about Solaris system tuning, see the Solaris Tunable Parameters Reference Manual.

Resource Control Constraint Mechanisms

Resource controls provide a mechanism for the constraint of system resources. Processes, tasks, projects, and zones can be prevented from consuming amounts of specified system resources. This mechanism leads to a more manageable system by preventing over-consumption of resources.

Constraint mechanisms can be used to support capacity-planning processes. An encountered constraint can provide information about application resource needs without necessarily denying the resource to the application.

Project Attribute Mechanisms

Resource controls can also serve as a simple attribute mechanism for resource management facilities. For example, the number of CPU shares made available to a project in the fair share scheduler (FSS) scheduling class is defined by the project.cpu-shares resource control. Because the project is assigned a fixed number of shares by the control, the various actions associated with exceeding a control are not relevant. In this context, the current value for the project.cpu-shares control is considered an attribute on the specified project.

Another type of project attribute is used to regulate the resource consumption of physical memory by collections of processes attached to a project. These attributes have the prefix rcap, for example, rcap.max-rss. Like a resource control, this type of attribute is configured in the project database. However, while resource controls are synchronously enforced by the kernel, resource caps are asynchronously enforced at the user level by the resource cap enforcement daemon, rcapd. For information on rcapd, see Chapter 10, Physical Memory Control Using the Resource Capping Daemon (Overview) and rcapd(1M).

The project.pool attribute is used to specify a pool binding for a project. For more information on resource pools, see Chapter 12, Resource Pools (Overview).

Configuring Resource Controls and Attributes

The resource controls facility is configured through the project database. See Chapter 2, Projects and Tasks (Overview). Resource controls and other attributes are set in the final field of the project database entry. The values associated with each resource control are enclosed in parentheses and appear as plain text separated by commas. The values in parentheses constitute an “action clause.” Each action clause is composed of a privilege level, a threshold value, and an action that is associated with the particular threshold. Each resource control can have multiple action clauses, which are also separated by commas.

The following entry defines a per-task lightweight process limit and a per-process maximum CPU time limit on a project entity. The process.max-cpu-time control sends a process a SIGTERM after the process has run for 1 hour, and a SIGKILL if the process continues to run for a total of 1 hour and 1 minute. See Table 6–3.


development:101:Developers:::task.max-lwps=(privileged,10,deny);
  process.max-cpu-time=(basic,3600,signal=TERM),(priv,3660,signal=KILL)

(The entry is shown on two lines here but is typed as one line.)
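The structure of the entry can be made concrete with a minimal Python sketch that splits the final field into resource controls and their action clauses. The function name parse_project_entry is hypothetical, and the sketch assumes the six colon-separated fields of the project database format and integer threshold values. It is an illustration of the layout only, not how the system itself reads the database.

```python
import re

# Matches one action clause such as "(privileged,10,deny)".
CLAUSE_RE = re.compile(r"\(([^()]*)\)")


def parse_project_entry(line):
    """Split a project database entry into its name and resource controls.

    Assumption: the entry has six colon-separated fields, and the
    attributes are in the final field, separated by semicolons.
    """
    fields = line.strip().split(":", 5)
    if len(fields) != 6:
        raise ValueError("expected 6 colon-separated fields")
    name, attributes = fields[0], fields[5]
    controls = {}
    for attribute in filter(None, attributes.split(";")):
        key, _, value = attribute.partition("=")
        clauses = []
        for body in CLAUSE_RE.findall(value):
            # Each clause is: privilege level, threshold value, action(s).
            privilege, threshold, *actions = [p.strip() for p in body.split(",")]
            clauses.append((privilege, int(threshold), actions))
        controls[key.strip()] = clauses
    return name, controls


# The sample entry from this section, joined into one line.
entry = ("development:101:Developers:::task.max-lwps=(privileged,10,deny);"
         "process.max-cpu-time=(basic,3600,signal=TERM),(priv,3660,signal=KILL)")
name, controls = parse_project_entry(entry)
for control, clauses in controls.items():
    print(name, control, clauses)
```

Each control maps to a list of (privilege, threshold, actions) tuples, one tuple per action clause.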

Note –

On systems that have zones enabled, zone-wide resource controls are specified in the zone configuration using a slightly different format. See Zone Configuration Data for more information.


The rctladm command allows you to make runtime interrogations of and modifications to the resource controls facility, with global scope. The prctl command allows you to make runtime interrogations of and modifications to the resource controls facility, with local scope.

For more information, see Global and Local Actions on Resource Control Values, rctladm(1M) and prctl(1).


Note –

On a system with zones installed, you cannot use rctladm in a non-global zone to modify settings. You can use rctladm in a non-global zone to view the global logging state of each resource control.


Available Resource Controls

A list of the standard resource controls that are available in this release is shown in the following table.

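The table describes the resource that is constrained by each control. The table also identifies the default units that are used by the project database for that resource. The default units are of two types: quantities, which represent a limited amount, and indexes, which represent a maximum valid identifier.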

Thus, project.cpu-shares specifies the number of shares to which the project is entitled. process.max-file-descriptor specifies the highest file number that can be assigned to a process by the open(2) system call.

Table 6–1 Standard Project, Task, and Process Resource Controls

Control Name 

Description 

Default Unit 

project.cpu-cap

Absolute limit on the amount of CPU resources that can be consumed by a project. The value is expressed as a percentage of a single CPU. A value of 100 means 100% of one CPU, and a value of 125 means 125%, or one and a quarter CPUs.

Quantity (number of CPUs) 

project.cpu-shares

Number of CPU shares granted to this project for use with the fair share scheduler (see FSS(7)).

Quantity (shares) 

project.max-crypto-memory

Total amount of kernel memory that can be used by libpkcs11 for hardware crypto acceleration. Allocations for kernel buffers and session-related structures are charged against this resource control.

Size (bytes) 

project.max-locked-memory

Total amount of physical locked memory allowed. 

If priv_proc_lock_memory is assigned to a user, consider setting this resource control as well to prevent that user from locking all memory.

Note that this resource control replaced project.max-device-locked-memory, which has been removed. This release control will be removed in a future release.

Size (bytes) 

project.max-msg-ids

Maximum number of message queue IDs allowed for this project. 

Quantity (message queue IDs) 

project.max-port-ids

Maximum allowable number of event ports. 

Quantity (number of event ports)  

project.max-sem-ids

Maximum number of semaphore IDs allowed for this project. 

Quantity (semaphore IDs) 

project.max-shm-ids

Maximum number of shared memory IDs allowed for this project. 

Quantity (shared memory IDs) 

project.max-shm-memory

Total amount of System V shared memory allowed for this project. 

Size (bytes) 

project.max-lwps

Maximum number of LWPs simultaneously available to this project. 

Quantity (LWPs) 

project.max-tasks

Maximum number of tasks allowable in this project. 

Quantity (number of tasks) 

project.max-contracts

Maximum number of contracts allowed in this project. 

Quantity (contracts) 

task.max-cpu-time

Maximum CPU time that is available to this task's processes. 

Time (seconds) 

task.max-lwps

Maximum number of LWPs simultaneously available to this task's processes. 

Quantity (LWPs) 

process.max-cpu-time

Maximum CPU time that is available to this process. 

Time (seconds) 

process.max-file-descriptor

Maximum file descriptor index available to this process. 

Index (maximum file descriptor) 

process.max-file-size

Maximum file offset available for writing by this process. 

Size (bytes) 

process.max-core-size

Maximum size of a core file created by this process. 

Size (bytes) 

process.max-data-size

Maximum heap memory available to this process. 

Size (bytes) 

process.max-stack-size

Maximum stack memory segment available to this process. 

Size (bytes) 

process.max-address-space

Maximum amount of address space, as summed over segment sizes, that is available to this process. 

Size (bytes) 

process.max-port-events

Maximum allowable number of events per event port. 

Quantity (number of events)  

process.max-sem-nsems

Maximum number of semaphores allowed per semaphore set. 

Quantity (semaphores per set) 

process.max-sem-ops

Maximum number of semaphore operations allowed per semop call (value copied from the resource control at semget() time).

Quantity (number of operations) 

process.max-msg-qbytes

Maximum number of bytes of messages on a message queue (value copied from the resource control at msgget() time).

Size (bytes) 

process.max-msg-messages

Maximum number of messages on a message queue (value copied from the resource control at msgget() time).

Quantity (number of messages) 

You can display the default values for resource controls on a system that does not have any resource controls set or changed. Such a system contains no non-default entries in /etc/system or the project database. To display values, use the prctl command.

Zone-Wide Resource Controls

Zone-wide resource controls limit the total resource usage of all process entities within a zone. Zone-wide resource controls can also be set using global property names as described in Setting Zone-Wide Resource Controls and How to Configure the Zone.

Table 6–2 Zones Resource Controls

Control Name 

Description 

Default Unit 

zone.cpu-cap

Absolute limit on the amount of CPU resources that can be consumed by a non-global zone. As with the project.cpu-cap setting, the value is expressed as a percentage of a single CPU. A value of 100 means 100% of one CPU, and a value of 125 means 125%, or one and a quarter CPUs.

Quantity (number of CPUs) 

zone.cpu-shares

Number of fair share scheduler (FSS) CPU shares for this zone.

Quantity (shares) 

zone.max-locked-memory

Total amount of physical locked memory available to a zone. 

When priv_proc_lock_memory is assigned to a zone, consider setting this resource control as well to prevent that zone from locking all memory.

Size (bytes) 

zone.max-lwps

Maximum number of LWPs simultaneously available to this zone.

Quantity (LWPs) 

zone.max-msg-ids

Maximum number of message queue IDs allowed for this zone.

Quantity (message queue IDs) 

zone.max-sem-ids

Maximum number of semaphore IDs allowed for this zone.

Quantity (semaphore IDs) 

zone.max-shm-ids

Maximum number of shared memory IDs allowed for this zone.

Quantity (shared memory IDs) 

zone.max-shm-memory

Total amount of System V shared memory allowed for this zone.

Size (bytes) 

zone.max-swap

Total amount of swap that can be consumed by user process address space mappings and tmpfs mounts for this zone.

Size (bytes) 

For information on configuring zone-wide resource controls, see Resource Type Properties and How to Configure the Zone. To use zone-wide resource controls in lx branded zones, see How to Configure, Verify, and Commit the lx Branded Zone.

Note that it is possible to apply a zone-wide resource control to the global zone. See Using the Fair Share Scheduler on a Solaris System With Zones Installed for additional information.
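The percentage interpretation of the project.cpu-cap and zone.cpu-cap values can be illustrated with a short Python sketch. The function name cpus_from_cap is hypothetical and is not part of any Solaris interface. The sketch only restates the arithmetic given in the table descriptions.

```python
def cpus_from_cap(cap_value):
    """Convert a cpu-cap value (percent of one CPU) into a number of CPUs."""
    if cap_value < 0:
        raise ValueError("cap must be non-negative")
    return cap_value / 100


# A cap of 100 corresponds to one full CPU, and 125 to one and a quarter.
for cap in (50, 100, 125):
    print(f"cap {cap} -> {cpus_from_cap(cap):.2f} CPUs")
```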

Units Support

Global flags that identify resource control types are defined for all resource controls. The flags are used by the system to communicate basic type information to applications such as the prctl command. Applications use the information to determine the unit strings that are appropriate for each resource control and the correct scale to use when interpreting scaled values.

The following global flags are available:

Global Flag            Resource Control Type String   Modifier   Scale
RCTL_GLOBAL_BYTES      bytes                          KB         2^10
                                                      MB         2^20
                                                      GB         2^30
                                                      TB         2^40
                                                      PB         2^50
                                                      EB         2^60
RCTL_GLOBAL_SECONDS    seconds                        Ks         10^3
                                                      Ms         10^6
                                                      Gs         10^9
                                                      Ts         10^12
                                                      Ps         10^15
                                                      Es         10^18
RCTL_GLOBAL_COUNT      count                          none
                                                      K          10^3
                                                      M          10^6
                                                      G          10^9
                                                      T          10^12
                                                      P          10^15
                                                      E          10^18

Scaled values can be used with resource controls. The following example shows a scaled threshold value:

task.max-lwps=(priv,1K,deny)

Note –

Unit modifiers are accepted by the prctl, projadd, and projmod commands. You cannot use unit modifiers in the project database itself.
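The scaling rules can be sketched in Python as follows. This is a minimal illustration, not the parser used by prctl, projadd, or projmod. The function name expand_scaled is hypothetical. The byte and second modifiers follow the table of global flags, and the single-letter count modifiers are assumed by analogy with the 1K example shown earlier in this section.

```python
import re
from decimal import Decimal

# Binary scale for bytes, decimal scale for seconds and counts.
PREFIXES = "KMGTPE"
SCALES = {
    "bytes": {p + "B": 2 ** (10 * (i + 1)) for i, p in enumerate(PREFIXES)},
    "seconds": {p + "s": 10 ** (3 * (i + 1)) for i, p in enumerate(PREFIXES)},
    # Assumption: counts use the bare prefix letter, as in "1K".
    "count": {p: 10 ** (3 * (i + 1)) for i, p in enumerate(PREFIXES)},
}


def expand_scaled(text, unit_type):
    """Expand a scaled value such as '1K' or '2MB' into base units."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", text.strip())
    if match is None:
        raise ValueError(f"not a scaled value: {text!r}")
    number, modifier = match.groups()
    if modifier == "":
        factor = 1
    else:
        factor = SCALES[unit_type][modifier]  # KeyError for unknown modifiers
    # Decimal avoids floating-point rounding for values such as 18.4Es.
    return int(Decimal(number) * factor)


print(expand_scaled("1K", "count"))     # threshold from the example above
print(expand_scaled("2MB", "bytes"))
print(expand_scaled("3660", "seconds"))
```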


Resource Control Values and Privilege Levels

A threshold value on a resource control constitutes an enforcement point where local actions can be triggered or global actions, such as logging, can occur.

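Each threshold value on a resource control must be associated with a privilege level. The privilege level must be one of the following three types: basic, which can be modified by the owner of the calling process; privileged, which can be modified only by privileged (superuser) callers; and system, which is fixed for the duration of the operating system instance.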

A resource control is guaranteed to have one system value, which is defined by the system, or resource provider. The system value represents how much of the resource the current implementation of the operating system is capable of providing.

Any number of privileged values can be defined, and only one basic value is allowed. Operations that are performed without specifying a privilege value are assigned a basic privilege by default.

The privilege level for a resource control value is defined in the privilege field of the resource control block as RCTL_BASIC, RCTL_PRIVILEGED, or RCTL_SYSTEM. See setrctl(2) for more information. You can use the prctl command to modify values that are associated with basic and privileged levels.
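The rules about how many values of each privilege level a resource control can carry can be expressed as a small consistency check. The following Python sketch is illustrative only. The function name check_privileges and the lowercase level names are assumptions made for this example and do not correspond to a Solaris interface.

```python
from collections import Counter

PRIVILEGE_LEVELS = ("basic", "privileged", "system")


def check_privileges(values):
    """Validate (privilege, threshold) pairs for a single resource control.

    A privilege of None stands for "not specified" and defaults to basic.
    """
    levels = [privilege or "basic" for privilege, _ in values]
    unknown = set(levels) - set(PRIVILEGE_LEVELS)
    if unknown:
        raise ValueError(f"unknown privilege level(s): {sorted(unknown)}")
    counts = Counter(levels)
    if counts["basic"] > 1:
        raise ValueError("only one basic value is allowed")
    if counts["system"] != 1:
        raise ValueError("exactly one system value is expected")
    return counts


# One defaulted basic value, two privileged values, and the system value.
counts = check_privileges(
    [(None, 100), ("privileged", 200), ("privileged", 300), ("system", 1000)]
)
print(dict(counts))
```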

Global and Local Actions on Resource Control Values

There are two categories of actions on resource control values: global and local.

Global Actions on Resource Control Values

Global actions apply to resource control values for every resource control on the system. You can use the rctladm command described in the rctladm(1M) man page to display the global state of active system resource controls and to set global logging actions.

You can disable or enable the global logging action on resource controls. You can set the syslog action to a specific degree by assigning a severity level, syslog=level. The possible settings for level are debug, info, notice, warning, err, crit, alert, and emerg.

By default, there is no global logging of resource control violations. The level n/a indicates resource controls on which no global action can be configured.

Local Actions on Resource Control Values

Local actions are taken on a process that attempts to exceed the control value. For each threshold value that is placed on a resource control, you can associate one or more actions. There are three types of local actions: none, deny, and signal=. These three actions are used as follows:

none

No action is taken on resource requests for an amount that is greater than the threshold. This action is useful for monitoring resource usage without affecting the progress of applications. You can also enable a global message that displays when the resource control is exceeded, although the process exceeding the threshold is not affected.

deny

You can deny resource requests for an amount that is greater than the threshold. For example, a task.max-lwps resource control with action deny causes a fork system call to fail if the new process would exceed the control value. See the fork(2) man page.

signal=

You can enable a global signal message action when the resource control is exceeded. A signal is sent to the process when the threshold value is exceeded. Additional signals are not sent if the process consumes additional resources. Available signals are listed in Table 6–3.

Not all of the actions can be applied to every resource control. For example, a process cannot exceed the number of CPU shares assigned to the project of which it is a member. Therefore, a deny action is not allowed on the project.cpu-shares resource control.
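The three local actions can be modeled with a short Python sketch. This is a simplified illustration of the behavior described above, not the kernel's enforcement logic. The function name evaluate_request and the tuple layout are hypothetical.

```python
def evaluate_request(requested, clauses):
    """Return (granted, signals) for a request checked against action clauses.

    clauses is a list of (threshold, actions) pairs, where actions contains
    "none", "deny", or strings of the form "signal=NAME".
    """
    granted = True
    signals = []
    for threshold, actions in sorted(clauses, key=lambda clause: clause[0]):
        if requested <= threshold:
            continue  # threshold not exceeded, nothing to do
        for action in actions:
            if action == "deny":
                granted = False
            elif action.startswith("signal="):
                signals.append(action.split("=", 1)[1])
            # "none": the event can be observed, but the request proceeds
    return granted, signals


# Thresholds taken from the project database example in this chapter.
cpu_time = [(3600, ["signal=TERM"]), (3660, ["signal=KILL"])]
lwps = [(10, ["deny"])]
print(evaluate_request(3650, cpu_time))
print(evaluate_request(3700, cpu_time))
print(evaluate_request(11, lwps))
```

With these inputs, a process that has used 3650 seconds of CPU time is past the first threshold only, while a request for an eleventh LWP is denied.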

Due to implementation restrictions, the global properties of each control can restrict the range of available actions that can be set on the threshold value. (See the rctladm(1M) man page.) A list of available signal actions is presented in the following table. For additional information about signals, see the signal(3HEAD) man page.

Table 6–3 Signals Available to Resource Control Values

Signal 

Description 

Notes 

SIGABRT 

Terminate the process. 

 

SIGHUP 

Send a hangup signal. Occurs when carrier drops on an open line. Signal sent to the process group that controls the terminal. 

 

SIGTERM 

Terminate the process. Termination signal sent by software. 

 

SIGKILL 

Terminate the process and kill the program. 

 

SIGSTOP 

Stop the process. Job control signal. 

 

SIGXRES 

Resource control limit exceeded. Generated by resource control facility. 

 

SIGXFSZ 

Terminate the process. File size limit exceeded. 

Available only to resource controls with the RCTL_GLOBAL_FILE_SIZE property (process.max-file-size). See rctlblk_set_value(3C) for more information.

SIGXCPU 

Terminate the process. CPU time limit exceeded. 

Available only to resource controls with the RCTL_GLOBAL_CPUTIME property (process.max-cpu-time). See rctlblk_set_value(3C) for more information.

Resource Control Flags and Properties

Each resource control on the system has a certain set of associated properties. This set of properties is defined as a set of flags, which are associated with all controlled instances of that resource. Global flags cannot be modified, but the flags can be retrieved by using either rctladm or the getrctl system call.

Local flags define the default behavior and configuration for a specific threshold value of that resource control on a specific process or process collective. The local flags for one threshold value do not affect the behavior of other defined threshold values for the same resource control. However, the global flags affect the behavior for every value associated with a particular control. Local flags can be modified, within the constraints supplied by their corresponding global flags, by the prctl command or the setrctl system call. See setrctl(2).

For the complete list of local flags, global flags, and their definitions, see rctlblk_set_value(3C).

To determine system behavior when a threshold value for a particular resource control is reached, use rctladm to display the global flags for the resource control. For example, to display the values for process.max-cpu-time, type the following:


$ rctladm process.max-cpu-time
	process.max-cpu-time  syslog=off  [ lowerable no-deny cpu-time inf seconds ]

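The global flags have the following meanings. The list also includes flags that do not appear in this output but that are set on other resource controls.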

lowerable

Superuser privileges are not required to lower the privileged values for this control.

no-deny

Even when threshold values are exceeded, access to the resource is never denied.

cpu-time

SIGXCPU is available to be sent when threshold values of this resource are reached.

seconds

The time value for the resource control.

no-basic

Resource control values with the privilege type basic cannot be set. Only privileged resource control values are allowed.

no-signal

A local signal action cannot be set on resource control values.

no-syslog

The global syslog message action may not be set for this resource control.

deny

Always deny request for resource when threshold values are exceeded.

count

A count (integer) value for the resource control.

bytes

Unit of size for the resource control.
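A script that needs the global flags of a control could split a line of rctladm output as in the following Python sketch. The function name parse_rctladm_line is hypothetical, and the sketch assumes that the output has the same layout as the sample line shown above: control name, syslog setting, and a bracketed list of flags.

```python
import re


def parse_rctladm_line(line):
    """Split one line of rctladm output into name, syslog setting, and flags."""
    match = re.fullmatch(r"\s*(\S+)\s+syslog=(\S+)\s+\[\s*(.*?)\s*\]\s*", line)
    if match is None:
        raise ValueError(f"unrecognized rctladm line: {line!r}")
    name, syslog, flags = match.groups()
    return {"name": name, "syslog": syslog, "flags": flags.split()}


# Sample line copied from the rctladm output in this section.
sample = "\tprocess.max-cpu-time  syslog=off  [ lowerable no-deny cpu-time inf seconds ]"
info = parse_rctladm_line(sample)
print(info["name"], info["syslog"], info["flags"])
print("deny can be refused:", "no-deny" not in info["flags"])
```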

Use the prctl command to display local values and actions for the resource control.


$ prctl -n process.max-cpu-time $$
	process 353939: -ksh
	NAME    PRIVILEGE    VALUE    FLAG   ACTION              RECIPIENT
	process.max-cpu-time
	        privileged   18.4Es    inf   signal=XCPU                 -
	        system       18.4Es    inf   none

The max (RCTL_LOCAL_MAXIMAL) flag is set for both threshold values, and the inf (RCTL_GLOBAL_INFINITE) flag is defined for this resource control. An inf value has an infinite quantity. The value is never enforced. Hence, as configured, both threshold quantities represent infinite values that are never exceeded.

Resource Control Enforcement

More than one resource control can exist on a resource. A resource control can exist at each containment level in the process model. If resource controls are active on the same resource at different container levels, the smallest container's control is enforced first. Thus, action is taken on process.max-cpu-time before task.max-cpu-time if both controls are encountered simultaneously.

Figure 6–1 Process Collectives, Container Relationships, and Their Resource Control Sets

[Figure: diagram showing enforcement of each resource control at its containment level.]
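The enforcement order can be sketched in Python by ranking controls according to the prefix of the control name. The function name enforcement_order is hypothetical. The text above states only that the smallest container is enforced first and gives the process and task case, so the full ordering process, task, project, zone is an assumption based on the containment relationships.

```python
# Containment levels, smallest container first (assumed ordering).
CONTAINMENT_ORDER = ("process", "task", "project", "zone")


def enforcement_order(control_names):
    """Sort control names so that the smallest container's control comes first."""
    def level(name):
        prefix = name.split(".", 1)[0]
        return CONTAINMENT_ORDER.index(prefix)  # ValueError for unknown prefixes
    return sorted(control_names, key=level)


print(enforcement_order(["task.max-cpu-time", "process.max-cpu-time"]))
print(enforcement_order(["zone.max-lwps", "project.max-lwps", "task.max-lwps"]))
```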

Global Monitoring of Resource Control Events

Often, the resource consumption of processes is unknown. To get more information, try using the global resource control actions that are available with the rctladm command. Use rctladm to establish a syslog action on a resource control. Then, if any entity managed by that resource control encounters a threshold value, a system message is logged at the configured logging level. See Chapter 7, Administering Resource Controls (Tasks) and the rctladm(1M) man page for more information.

Applying Resource Controls

Each resource control listed in Table 6–1 can be assigned to a project at login or when newtask, su, or the other project-aware launchers at, batch, or cron are invoked. Each command that is initiated is launched in a separate task with the invoking user's default project. See the man pages login(1), newtask(1), at(1), cron(1M), and su(1M) for more information.

Updates to entries in the project database, whether to the /etc/project file or to a representation of the database in a network name service, are not applied to currently active projects. The updates are applied when a new task joins the project through login or newtask.

Temporarily Updating Resource Control Values on a Running System

Values changed in the project database only become effective for new tasks that are started in a project. However, you can use the rctladm and prctl commands to update resource controls on a running system.

Updating Logging Status

The rctladm command affects the global logging state of each resource control on a system-wide basis. This command can be used to view the global state and to set up the level of syslog logging when controls are exceeded.

Updating Resource Controls

You can view and temporarily alter resource control values and actions on a per-process, per-task, or per-project basis by using the prctl command. A project, task, or process ID is given as input, and the command operates on the resource control at the level where the control is defined.

Any modifications to values and actions take effect immediately. However, these modifications apply to the current process, task, or project only. The changes are not recorded in the project database. If the system is restarted, the modifications are lost. Permanent changes to resource controls must be made in the project database.

All resource control settings that can be modified in the project database can also be modified with the prctl command. Both basic and privileged values can be added or deleted, and their actions can also be modified. By default, the basic type is assumed for all set operations, but processes and users with superuser privileges can also modify privileged resource controls. System resource controls cannot be altered.

Commands Used With Resource Controls

The commands that are used with resource controls are shown in the following table.

Command Reference 

Description 

ipcs(1)

Allows you to observe which IPC objects are contributing to a project's usage 

prctl(1)

Allows you to make runtime interrogations of and modifications to the resource controls facility, with local scope 

rctladm(1M)

Allows you to make runtime interrogations of and modifications to the resource controls facility, with global scope 

The resource_controls(5) man page describes resource controls available through the project database, including units and scaling factors.