This section contains the following chapters on resource management in the Solaris operating environment.
- Provides an overview of resource management and discusses why you would want to use the functionality on your system
- Covers the project and task facilities and describes how they are used to label and separate workloads
- Describes the extended accounting functionality that is used to capture detailed resource consumption statistics for capacity planning or billing purposes
- Discusses resource controls, which are used to place bounds on resource usage by applications that run on your system
- Describes the fair share scheduler, which uses shares to specify the amounts of CPU time that is allocated to processes that run on your system
- Describes the resource capping daemon rcapd(1M), which regulates the consumption of physical memory by processes running in projects that have resource caps
- Describes resource pools, which are used to partition system resources and guarantee that a known amount of resources is always available to a specified workload that runs on your system
- Describes a hypothetical server consolidation project
- Describes the resource management functionality available in the Solaris Management Console tool
Resource management functionality enables you to control how applications use available system resources. You can do the following:
- Allocate computing resources, such as processor time
- Monitor how the allocations are being used, then adjust the allocations as necessary
- Generate extended accounting information for analysis, billing, and capacity planning
Modern computing environments have to provide a flexible response to the varying workloads that are generated by different applications on a system. If resource management features are not used, the Solaris operating environment responds to workload demands by adapting to new application requests dynamically. This default response generally means that all activity on the system is given equal access to resources. Solaris resource management features enable you to treat workloads individually. You can do the following:
- Restrict access to a specific resource
- Offer resources to workloads on a preferential basis
- Isolate workloads from one another
The ability to minimize cross-workload performance compromises, along with the facilities that monitor resource usage and utilization, is referred to as resource management. Resource management is implemented through a collection of algorithms. The algorithms handle the series of capability requests that an application presents in the course of its execution.
Resource management facilities permit you to modify the default behavior of the operating system with respect to different workloads. Behavior primarily refers to the set of decisions that are made by operating system algorithms when an application presents one or more resource requests to the system. You can use resource management facilities to do the following:
- Deny resources to one application, or prefer one application over another, for a larger set of allocations than is otherwise permitted
- Treat certain allocations collectively instead of through isolated mechanisms
The implementation of a system configuration that uses the resource management facilities can serve several purposes. You can do the following:
- Prevent an application from consuming resources indiscriminately
- Change an application's priority based on external events
- Balance resource guarantees to a set of applications against the goal of maximizing system utilization
When planning a resource-managed configuration, key requirements include the following:
- Identifying the competing workloads on the system
- Distinguishing workloads that do not conflict with one another from those whose performance requirements compromise the primary workloads
After you identify cooperating and conflicting workloads, you can create a resource configuration that presents the least compromise to the service goals of the business, within the limitations of the system's capabilities.
Effective resource management is enabled in the Solaris environment by offering control mechanisms, notification mechanisms, and monitoring mechanisms. Many of these capabilities are provided through enhancements to existing mechanisms such as the proc(4) file system, processor sets, and scheduling classes. Other capabilities are specific to resource management. These capabilities are described in subsequent chapters.
A resource is any aspect of the computing system that can be manipulated with the intent to change application behavior. Thus, a resource is a capability that an application implicitly or explicitly requests. If the capability is denied or constrained, the execution of a robustly written application proceeds more slowly.
Classification of resources, as opposed to identification of resources, can be made along a number of axes. Possible axes include implicitly requested versus explicitly requested resources, time-based resources (such as CPU time) versus time-independent resources (such as assigned CPU shares), and so forth.
Generally, scheduler-based resource management is applied to resources that the application can implicitly request. For example, to continue execution, an application implicitly requests additional CPU time. To write data to a network socket, an application implicitly requests bandwidth. Constraints can be placed on the aggregate total use of an implicitly requested resource.
Additional interfaces can be presented so that bandwidth or CPU service levels can be explicitly negotiated. Resources that are explicitly requested, such as a request for an additional thread, can be managed by constraint.
The three types of control mechanisms that are available in the Solaris operating environment are constraints, scheduling, and partitioning.
Constraints allow the administrator or application developer to set bounds on the consumption of specific resources for a workload. With known bounds, modeling resource consumption scenarios becomes a simpler process. Bounds can also be used to control ill-behaved applications that would otherwise compromise system performance or availability through unregulated resource requests.
Constraints do present complications for the application. The relationship between the application and the system can be modified to the point that the application is no longer able to function. One approach that can mitigate this risk is to gradually narrow the constraints on applications with unknown resource behavior. The resource controls feature discussed in Chapter 7, Resource Controls provides a constraint mechanism. Newer applications can be written to be aware of their resource constraints, but not all application writers will choose to do this.
Scheduling refers to making a sequence of allocation decisions at specific intervals. The decision that is made is based on a predictable algorithm. An application that does not need its current allocation leaves the resource available for another application's use. Scheduling-based resource management enables full utilization of an undercommitted configuration, while providing controlled allocations in a critically committed or overcommitted scenario. The underlying algorithm defines how the term “controlled” is interpreted. In some instances, the scheduling algorithm might guarantee that all applications have some access to the resource. The fair share scheduler (FSS) described in Chapter 8, Fair Share Scheduler manages application access to CPU resources in a controlled way.
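As a rough illustration of share-based scheduling, each actively competing project's CPU entitlement is proportional to its assigned shares. The sketch below is a conceptual model only, with invented project names and share counts; the real FSS makes fine-grained, time-sliced decisions rather than a single static division.

```python
def cpu_entitlements(shares):
    """Compute each project's CPU entitlement as its fraction of the
    total shares held by projects that are actively competing."""
    total = sum(shares.values())
    return {proj: n / total for proj, n in shares.items()}

# Three hypothetical projects competing for CPU.
active = {"batch": 1, "database": 2, "webserver": 1}
print(cpu_entitlements(active))

# If "batch" goes idle, its shares drop out and the others expand,
# which is how scheduling keeps an undercommitted system fully used.
print(cpu_entitlements({"database": 2, "webserver": 1}))
```

Note how entitlements are relative, not absolute: shares only matter in proportion to the shares of the other projects that are currently running.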
Partitioning is used to bind a workload to a subset of the system's available resources. This binding guarantees that a known amount of resources is always available to the workload. The resource pools functionality that is described in Chapter 10, Resource Pools enables you to limit workloads to specific subsets of the machine. Configurations that use partitioning can avoid system-wide overcommitment. However, in avoiding this overcommitment, the possibility of achieving high utilizations is less likely. A reserved group of resources, such as processors, is not available for use by another workload when the workload bound to them is idle.
Portions of the resource management configuration can be placed in a network name service. This feature allows the administrator to apply resource management constraints across a collection of machines, rather than on an exclusively per-machine basis. Related work can share a common identifier, and the aggregate usage of that work can be tabulated from accounting data.
Resource management configuration and workload-oriented identifiers are described more fully in Chapter 5, Projects and Tasks. The extended accounting facility that links these identifiers with application resource usage is described in Chapter 6, Extended Accounting.
Use resource management to ensure that your applications have the required response times.
Resource management can also increase resource utilization. By categorizing and prioritizing usage, you can effectively use reserve capacity during off-peak periods, often eliminating the need for additional processing power. You can also ensure that resources are not wasted because of load variability.
Resource management is ideal for environments that consolidate a number of applications on a single server.
The cost and complexity of managing numerous machines encourages the consolidation of several applications on larger, more scalable servers. Instead of running each workload on a separate system, with full access to that system's resources, you can use resource management software to segregate workloads within the system. Resource management enables you to lower the total cost of ownership by running and controlling several dissimilar applications on a single Solaris system.
If you are providing Internet and application services, you can use resource management to do the following:
- Host multiple web servers on a single machine. You can control the resource consumption for each web site, and you can protect each site from the potential excesses of other sites.
- Prevent a faulty common gateway interface (CGI) script from exhausting CPU resources.
- Stop an incorrectly behaving application from consuming all available virtual memory.
- Ensure that one customer's applications are not affected by another customer's applications that run at the same site.
- Provide differentiated levels or classes of service on the same machine.
- Obtain accounting information for billing purposes.
Use resource management features in any system that has a large, diverse user base, such as an educational institution. If you have a mix of workloads, the software can be configured to give priority to specific projects.
For example, in large brokerage firms, traders intermittently require fast access to execute a query or to perform a calculation. Other system users, however, have more consistent workloads. If you allocate a proportionately larger amount of processing power to the traders' projects, the traders have the responsiveness that they need.
Resource management is also ideal for supporting thin-client systems. These platforms provide stateless consoles with frame buffers and input devices, such as smart cards. The actual computation is done on a shared server, resulting in a timesharing type of environment. Use resource management features to isolate the users on the server. Then, a user who generates excess load does not monopolize hardware resources and significantly impact others who use the system.
The following task map gives a basic overview of the steps that are involved in setting up resource management on your system.
| Task | Description | For Instructions |
|---|---|---|
| Identify the workloads on your system. | Review project entries in either the /etc/project database file or in the NIS map or LDAP directory service. | |
| Prioritize the workloads on your system. | Determine which applications are critical. These workloads might require preferential access to resources. | Refer to your business service goals. |
| Monitor real-time activity on your system. | Use performance tools to view the current resource consumption of workloads that are running on your system. You can then evaluate whether you must restrict access to a given resource or isolate particular workloads from other workloads. | Monitoring by System, cpustat(1M), iostat(1M), mpstat(1M), prstat(1M), sar(1), and vmstat(1M) |
| Make temporary modifications to the workloads that are running on your system. | To determine which values can be altered, refer to the resource controls that are available in the Solaris environment. You can update the values from the command line while the task or process is running. | Available Resource Controls, Actions on Resource Control Values, Temporarily Updating Resource Control Values on a Running System, rctladm(1M), and prctl(1) |
| Set resource control attributes for every project entry in the project database or name service project table. | Each project entry in the /etc/project database or the name service project table can contain one or more resource controls. The resource controls constrain tasks and processes attached to that project. For each threshold value that is placed on a resource control, you can associate one or more actions to be taken when that value is reached. You can set resource controls by using the command-line interface or the Solaris Management Console. If you are setting configuration parameters across a large number of systems, use the console for this task. | project Database, Local project File Format, Available Resource Controls, Actions on Resource Control Values, and Chapter 8, Fair Share Scheduler |
| Place an upper bound on the consumption of physical memory by processes in projects. | The resource cap enforcement daemon rcapd enforces the physical memory resource cap defined in the /etc/project database with the rcap.max-rss attribute. | project Database, Chapter 9, Physical Memory Control Using the Resource Capping Daemon, and rcapd(1M) |
| Create resource pool configurations. | Resource pools provide a way to partition system resources, such as processors, and maintain those partitions across reboots. You can add a project.pool attribute to each entry in the /etc/project database. | |
| Make the fair share scheduler (FSS) your default system scheduler. | Ensure that all user processes in either a single CPU system or a processor set belong to the same scheduling class. | |
| Activate the extended accounting facility to monitor and record resource consumption on a task or process basis. | Use extended accounting data to assess current resource controls and plan capacity requirements for future workloads. Aggregate usage can be tracked on a system-wide basis. To obtain complete usage statistics for related workloads that span more than one system, the project name can be shared across several machines. | How to Activate Extended Accounting for Processes, Tasks, and Flows and acctadm(1M) |
| (Optional) Make additional adjustments to your configuration from the command line. You can alter the values while the task or process is running. | Modifications to existing tasks can be applied on a temporary basis without restarting the project. Tune the values until you are satisfied with the performance, then update the current values in the /etc/project database or the name service project table. | Temporarily Updating Resource Control Values on a Running System, rctladm(1M), and prctl(1) |
| (Optional) Capture extended accounting data. | Write extended accounting records for active processes and active tasks. The files that are produced can be used for planning, chargeback, and billing purposes. | |
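The pairing of threshold values with actions described in the task map can be modeled as follows. This is an illustrative sketch only: the control name, threshold values, and action strings are invented and do not reflect the actual rctl syntax or enforcement mechanism.

```python
def check_rctl(thresholds, usage):
    """Return the actions triggered by the given usage level.
    `thresholds` maps a limit value to the list of actions associated
    with that threshold; every threshold at or below `usage` fires."""
    fired = []
    for limit, actions in sorted(thresholds.items()):
        if usage >= limit:
            fired.extend(actions)
    return fired

# A hypothetical CPU-time control: warn the process at 3600 seconds,
# deny further requests at 7200 seconds.
cpu_time_rctl = {3600: ["signal=SIGXCPU"], 7200: ["deny"]}
print(check_rctl(cpu_time_rctl, 4000))   # only the warning threshold fires
```

The point of the model is that a single resource control can carry several thresholds, each with its own set of actions, so escalating responses are possible as consumption grows.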
This chapter discusses the project and task facilities of Solaris resource management. Projects and tasks are used to label workloads and separate them from one another. The project provides a network-wide administrative identifier for related work. The task collects a group of processes into a manageable entity that represents a workload component.
To optimize workload response, you must first be able to identify the workloads that are running on the system you are analyzing. This information can be difficult to obtain by using either a purely process-oriented or a user-oriented method alone. In the Solaris environment, you have two additional facilities that can be used to separate and identify workloads: the project and the task.
Based on their project or task membership, running processes can be manipulated with standard Solaris commands. The extended accounting facility can report on both process usage and task usage, and tag each record with the governing project identifier. This process enables offline workload analysis to be correlated with online monitoring. The project identifier can be shared across multiple machines through the project name service database. Thus, the resource consumption of related workloads that run on (or span) multiple machines can ultimately be analyzed across all of the machines.
The project identifier is an administrative identifier that is used to identify related work. The project identifier can be thought of as a workload tag equivalent to the user and group identifiers. A user or group can belong to one or more projects. These projects can be used to represent the workloads in which the user or group of users is allowed to participate. This membership can then be the basis of chargeback that is based on, for example, usage or initial resource allocations. Although a user must have a default project assigned, the processes that the user launches can be associated with any of the projects of which that user is a member.
To log in to the system, a user must be assigned a default project.
Because each process on the system possesses project membership, an algorithm to assign a default project to the login or other initial process is necessary. The algorithm to determine a default project consists of four steps. If no default project is found, the user's login, or request to start a process, is denied.
The system sequentially follows these steps to determine a user's default project:
1. If the user has an entry with a project attribute defined in the /etc/user_attr extended user attributes database, then the value of the project attribute is the default project (see user_attr(4)).
2. If a project with the name user.user-id is present in the project(4) database, then that project is the default project.
3. If a project with the name group.group-name is present in the project database, where group-name is the name of the default group for the user (as specified in passwd(4)), then that project is the default project.
4. If the special project default is present in the project database, then that project is the default project.
This logic is provided by the getdefaultproj() library function (see getprojent(3PROJECT)).
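The four-step resolution performed by getdefaultproj() can be simulated as follows. This is a Python sketch over in-memory tables, not the actual library implementation; the table shapes and helper names are assumptions made for illustration.

```python
def default_project(user, user_attr, projects, default_group):
    """Resolve a user's default project by the four-step algorithm.
    `user_attr` maps user names to attribute dicts (standing in for
    /etc/user_attr), `projects` is the set of project names known to
    the project database, and `default_group` returns the user's
    default group name from the passwd entry."""
    # 1. A project attribute in the extended user attributes database wins.
    if "project" in user_attr.get(user, {}):
        return user_attr[user]["project"]
    # 2. A project named user.<name> in the project database is next.
    if f"user.{user}" in projects:
        return f"user.{user}"
    # 3. Then group.<default-group-name> from the passwd entry.
    if f"group.{default_group(user)}" in projects:
        return f"group.{default_group(user)}"
    # 4. Finally the special project "default".
    if "default" in projects:
        return "default"
    return None   # no default project: the login or request is denied

projects = {"user.root", "group.staff", "default"}
print(default_project("root", {}, projects, lambda u: "other"))   # user.root
print(default_project("mark", {}, projects, lambda u: "staff"))   # group.staff
```

The `None` return corresponds to the documented behavior that a user without a default project cannot log in or start a process.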
You can store project data in a local file, in a Network Information Service (NIS) project map, or in a Lightweight Directory Access Protocol (LDAP) directory service. The /etc/project database or name service is used at login and by all requests for account management by the pluggable authentication module (PAM) to bind a user to a default project.
Updates to entries in the project database, whether to the /etc/project file or to a representation of the database in a network name service, are not applied to currently active projects. The updates are applied to new tasks that join the project when login(1) or newtask(1) is used.
Operations that change or set identity include logging in to the system, invoking an rcp or rsh command, using ftp, or using su. When an operation involves changing or setting identity, a set of configurable modules is used to provide authentication, account management, credentials management, and session management.
The account management PAM module for projects is documented in the pam_projects(5) man page. The PAM system is documented in the man pages pam(3PAM), pam.conf(4), and pam_unix(5).
Resource management supports the name service project database. The location where the project database is stored is defined in /etc/nsswitch.conf. By default, files is listed first, but the sources can be listed in any order.
```
project: files [nis] [ldap]
```
If more than one source for project information is listed, the nsswitch.conf file directs the routine to start searching for the information in the first source listed. The routine then searches subsequent databases.
For more information on /etc/nsswitch.conf, see “The Name Service Switch (Overview)” in System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP) and nsswitch.conf(4).
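The fall-through search order can be sketched as follows. This is a simplified model, not the actual name service switch code; the dictionaries stand in for the files, NIS, and LDAP sources.

```python
def lookup(key, sources):
    """Try each configured source in the order that nsswitch.conf
    directs, returning the first hit.  `sources` is an ordered list of
    dicts standing in for files, NIS, and LDAP."""
    for source in sources:
        if key in source:
            return source[key]
    return None   # not found in any configured source

files = {"system": 0}          # stand-in for the local /etc/project file
nis   = {"booksite": 4113}     # stand-in for the NIS project map

# "project: files nis" - the local file is consulted first.
print(lookup("booksite", [files, nis]))   # falls through to NIS -> 4113
```

Reordering the `sources` list is the analogue of reordering the entries on the `project:` line in nsswitch.conf.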
If you select files as your project database in nsswitch.conf, the login process searches the /etc/project file for project information (see projects(1) and project(4)). The project file contains a one-line entry for each project recognized by the system, of the following form:
```
projname:projid:comment:user-list:group-list:attributes
```
The fields are defined as follows:

- projname: The name of the project. The name must be a string that consists of alphanumeric characters, the underline (_) character, and the hyphen (-). The name must begin with an alphabetic character. projname cannot contain periods (.), colons (:), or newline characters.
- projid: The project's unique numerical ID (PROJID) within the system. The maximum value of the projid field is UID_MAX (2147483647).
- comment: The project's description.
- user-list: A comma-separated list of users who are allowed in the project. Wildcards can be used in this field. The asterisk (*) allows all users to join the project. The exclamation point followed by the asterisk (!*) excludes all users from the project. The exclamation point followed by a user name excludes the specified user from the project.
- group-list: A comma-separated list of groups of users who are allowed in the project. Wildcards can be used in this field. The asterisk (*) allows all groups to join the project. The exclamation point followed by the asterisk (!*) excludes all groups from the project. The exclamation point followed by a group name excludes the specified group from the project.
- attributes: A semicolon-separated list of name-value pairs of the form `name[=value]` (see Chapter 7, Resource Controls). name is an arbitrary string that specifies the object-related attribute, and value is the optional value for that attribute. Names are restricted to letters, digits, underscores, and the period. The period is conventionally used as a separator between the categories and subcategories of the rctl. The first character of an attribute name must be a letter. The name is case sensitive. Values can be structured by using commas and parentheses to establish precedence. The semicolon separates name-value pairs and therefore cannot be used in a value definition. The colon separates project fields and therefore also cannot be used in a value definition.
Routines that read this file halt when they encounter a malformed entry. Any project assignments that are specified after the incorrect entry are not made.
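The format rules above can be illustrated with a hypothetical minimal parser. Real software should use the getprojent(3PROJECT) routines instead; in addition, the precedence of wildcard exclusions over inclusions in the membership check below is an assumption made for illustration, since the text does not state one.

```python
import re

UID_MAX = 2147483647  # maximum projid, per the field description above

def parse_project_line(line):
    """Parse one /etc/project entry of the form
    projname:projid:comment:user-list:group-list:attributes.
    Raises ValueError on a malformed entry, mirroring the documented
    behavior that reading routines halt at the first bad line."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 6:
        raise ValueError(f"malformed entry: {line!r}")
    name, projid, comment, users, groups, attrs = fields
    if not re.match(r"[A-Za-z]", name):
        raise ValueError(f"project name must begin with a letter: {name!r}")
    projid = int(projid)
    if not 0 <= projid <= UID_MAX:
        raise ValueError(f"projid out of range: {projid}")
    attributes = {}
    for pair in filter(None, attrs.split(";")):   # name[=value] pairs
        key, _, value = pair.partition("=")
        attributes[key] = value or None
    return {"name": name, "projid": projid, "comment": comment,
            "users": users.split(",") if users else [],
            "groups": groups.split(",") if groups else [],
            "attributes": attributes}

def user_allowed(user, user_list):
    """Apply the wildcard rules from the user-list field.  Exclusions
    taking precedence over inclusions is an assumption, not documented."""
    if "!" + user in user_list or "!*" in user_list:
        return False
    return user in user_list or "*" in user_list

entry = parse_project_line("booksite:4113:Book Auction Project:ml,mp,jtd,kjh::")
print(entry["projid"], entry["users"])    # 4113 ['ml', 'mp', 'jtd', 'kjh']
print(user_allowed("ml", entry["users"])) # True
```

Raising on the first malformed line reproduces the documented consequence that project assignments after an incorrect entry are not made.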
This example shows the default /etc/project file:
```
system:0:System:::
user.root:1:Super-User:::
noproject:2:No Project:::
default:3::::
group.staff:10::::
```
This example shows the default /etc/project file with project entries added at the end:
```
system:0:System:::
user.root:1:Super-User:::
noproject:2:No Project:::
default:3::::
group.staff:10::::
user.ml:2424:Lyle Personal:::
booksite:4113:Book Auction Project:ml,mp,jtd,kjh::
```
To add resource controls to the /etc/project file, see Using Resource Controls.
If you are using NIS, you can specify in the /etc/nsswitch.conf file to search the NIS maps for projects:
```
project: nis files
```
The NIS map, either project.byname or project.bynumber, has the same form as the /etc/project file:
```
projname:projid:comment:user-list:group-list:attributes
```
For more information, see System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP).
If you are using LDAP, you can specify in the /etc/nsswitch.conf file to search the LDAP entries for projects.
```
project: ldap files
```
For more information on the schema for project entries in an LDAP database, see “Solaris Schemas” in System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP).
With each successful login into a project, a new task that contains the login process is created. The task is a process collective that represents a set of work over time. A task can also be viewed as a workload component.
Each process is a member of one task, and each task is associated with one project.
All operations on sessions, such as signal delivery, are also supported on tasks. You can also bind tasks to processor sets and set their scheduling priorities and classes, which modifies all current and subsequent processes in the task.
Tasks are created at login (see login(1)), by cron(1M), by newtask(1), and by setproject(3PROJECT).
The extended accounting facility can provide accounting data for processes that is aggregated at the task level.
| Command | Description |
|---|---|
| projects(1) | Prints the project membership of a user. |
| newtask(1) | Executes the user's default shell or a specified command, placing the execution command in a new task that is owned by the specified project. newtask can also be used to modify the task and the project binding for a running process. |
| projadd(1M) | Adds a new project entry to the /etc/project file. projadd creates a project entry only on the local system. projadd cannot change information that is supplied by the network name service. |
| projmod(1M) | Modifies a project's information on the local system. projmod cannot change information that is supplied by the network name service. However, the command does verify the uniqueness of the project name and project ID against the external name service. |
| projdel(1M) | Deletes a project from the local system. projdel cannot change information that is supplied by the network name service. |
Use ps -o to display task and project IDs. For example, to view the project ID, type the following:
```
# ps -o user,pid,uid,projid
  USER   PID  UID PROJID
   jtd 89430  124   4113
```
Use id -p to print the current project ID in addition to the user and group IDs. If the user operand is provided, the project associated with that user's normal login is printed:
```
# id -p
uid=124(jtd) gid=10(staff) projid=4113(booksite)
```
To match only processes with a project ID in a specific list, type the following:
```
# pgrep -J projidlist
# pkill -J projidlist
```
To match only processes with a task ID in a specific list, type the following:
```
# pgrep -T taskidlist
# pkill -T taskidlist
```
To display various statistics for processes and projects that are currently running on your system, type the following:
```
% prstat -J
   PID USERNAME  SIZE   RSS STATE  PRI NICE     TIME  CPU PROCESS/NLWP
 21634 jtd      5512K 4848K cpu0    44    0  0:00.00 0.3% prstat/1
   324 root       29M   75M sleep   59    0  0:08.27 0.2% Xsun/1
 15497 jtd        48M   41M sleep   49    0  0:08.26 0.1% adeptedit/1
   328 root     2856K 2600K sleep   58    0  0:00.00 0.0% mibiisa/11
  1979 jtd      1568K 1352K sleep   49    0  0:00.00 0.0% csh/1
  1977 jtd      7256K 5512K sleep   49    0  0:00.00 0.0% dtterm/1
   192 root     3680K 2856K sleep   58    0  0:00.36 0.0% automountd/5
  1845 jtd        24M   22M sleep   49    0  0:00.29 0.0% dtmail/11
  1009 jtd      9864K 8384K sleep   49    0  0:00.59 0.0% dtwm/8
   114 root     1640K  704K sleep   58    0  0:01.16 0.0% in.routed/1
   180 daemon   2704K 1944K sleep   58    0  0:00.00 0.0% statd/4
   145 root     2120K 1520K sleep   58    0  0:00.00 0.0% ypbind/1
   181 root     1864K 1336K sleep   51    0  0:00.00 0.0% lockd/1
   173 root     2584K 2136K sleep   58    0  0:00.00 0.0% inetd/1
   135 root     2960K 1424K sleep    0    0  0:00.00 0.0% keyserv/4
PROJID    NPROC  SIZE   RSS MEMORY     TIME  CPU PROJECT
    10       52  400M  271M    68%  0:11.45 0.4% booksite
     0       35  113M  129M    32%  0:10.46 0.2% system
Total: 87 processes, 205 lwps, load averages: 0.05, 0.02, 0.02
```
To display various statistics for processes and tasks that are currently running on your system, type the following:
```
% prstat -T
   PID USERNAME  SIZE   RSS STATE  PRI NICE     TIME  CPU PROCESS/NLWP
 23023 root       26M   20M sleep   59    0  0:03:18 0.6% Xsun/1
 23476 jtd        51M   45M sleep   49    0  0:04:31 0.5% adeptedit/1
 23432 jtd      6928K 5064K sleep   59    0  0:00:00 0.1% dtterm/1
 28959 jtd        26M   18M sleep   49    0  0:00:18 0.0% .netscape.bin/1
 23116 jtd      9232K 8104K sleep   59    0  0:00:27 0.0% dtwm/5
 29010 jtd      5144K 4664K cpu0    59    0  0:00:00 0.0% prstat/1
   200 root     3096K 1024K sleep   59    0  0:00:00 0.0% lpsched/1
   161 root     2120K 1600K sleep   59    0  0:00:00 0.0% lockd/2
   170 root     5888K 4248K sleep   59    0  0:03:10 0.0% automountd/3
   132 root     2120K 1408K sleep   59    0  0:00:00 0.0% ypbind/1
   162 daemon   2504K 1936K sleep   59    0  0:00:00 0.0% statd/2
   146 root     2560K 2008K sleep   59    0  0:00:00 0.0% inetd/1
   122 root     2336K 1264K sleep   59    0  0:00:00 0.0% keyserv/2
   119 root     2336K 1496K sleep   59    0  0:00:02 0.0% rpcbind/1
   104 root     1664K  672K sleep   59    0  0:00:03 0.0% in.rdisc/1
TASKID    NPROC  SIZE   RSS MEMORY     TIME  CPU PROJECT
   222       30  229M  161M    44%  0:05:54 0.6% group.staff
   223        1   26M   20M   5.3%  0:03:18 0.6% group.staff
    12        1   61M   33M   8.9%  0:00:31 0.0% group.staff
     1       33   85M   53M    14%  0:03:33 0.0% system
Total: 65 processes, 154 lwps, load averages: 0.04, 0.05, 0.06
```
The -J and -T options cannot be used together.
The cron command issues a settaskid to ensure that each cron, at, and batch job executes in a separate task, with the appropriate default project for the submitting user. Also, the at and batch commands capture the current project ID and ensure that the project ID is restored when running an at job.
To switch the user's default project, and thus create a new task (as part of simulating a login) type the following:
```
# su - user
```
To retain the project ID of the invoker, issue su without the - flag.
```
# su user
```
This example shows how to use the projadd and projmod commands.
Become superuser.
View the default /etc/project file on your system.
```
# cat /etc/project
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
```
Add a project called booksite and assign it to a user named mark with project ID number 4113.
```
# projadd -U mark -p 4113 booksite
```
View the /etc/project file again to see the project addition.
```
# cat /etc/project
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
booksite:4113::mark::
```
Add a comment that describes the project in the comment field.
```
# projmod -c 'Book Auction Project' booksite
```
View the changes in the /etc/project file.
```
# cat /etc/project
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
booksite:4113:Book Auction Project:mark::
```
This example shows how to use the projdel command to delete a project.
Become superuser.
Remove the project booksite by using the projdel command.
```
# projdel booksite
```
Display the /etc/project file.
```
# cat /etc/project
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
```
Log in as user mark and type projects to view the projects assigned.
```
# su - mark
# projects
default
```
Use the id command with the -p flag to view the current project membership of the invoking process.
```
$ id -p
uid=100(mark) gid=1(other) projid=3(default)
```
Become superuser.
Create a new task in the booksite project by using the newtask command with the -v (verbose) option to obtain the system task ID.
```
# newtask -v -p booksite
16
```
The execution of newtask creates a new task in the specified project, and places the user's default shell in this task.
View the current project membership of the invoking process.
```
# id -p
uid=100(mark) gid=1(other) projid=4113(booksite)
```
The process is now a member of the new project.
This example shows how to associate a running process with a different task and project. To perform this task, you must either be superuser, or be the owner of the process and be a member of the new project.
Become superuser.
Obtain the process ID of the book_catalog process.
```
# pgrep book_catalog
8100
```
Associate process 8100 with a new task ID in the booksite project.
# newtask -v -p booksite -c 8100
17
The -c option specifies that newtask operate on the existing named process.
Confirm the task to process ID mapping.
# pgrep -T 17
8100
By using the project and task facilities that are described in Chapter 5, Projects and Tasks, to label and separate workloads, you can monitor resource consumption by each workload. You can use the extended accounting subsystem to capture a detailed set of resource consumption statistics on both running processes and tasks. The extended accounting subsystem labels the usage records with the project for which the work was done. You can also use extended accounting, in conjunction with the Internet Protocol Quality of Service (IPQoS) flow accounting module described in “Using Flow Accounting and Statistics Gathering (Tasks)” in IPQoS Administration Guide, to capture network flow information on a system.
To begin using extended accounting, see How to Activate Extended Accounting for Processes, Tasks, and Flows.
Before you can apply resource management mechanisms, you must first be able to characterize the resource consumption demands that various workloads place on a system. The extended accounting facility in the Solaris operating environment provides a flexible way to record system and network resource consumption on a task or process basis, or on the basis of selectors provided by IPQoS (see ipqos(7IPP)). Unlike online monitoring tools, which measure system usage in real time, extended accounting enables you to examine historical usage. You can then make assessments of capacity requirements for future workloads.
With extended accounting data available, you can develop or purchase software for resource chargeback, workload monitoring, or capacity planning.
The extended accounting facility in the Solaris environment uses a versioned, extensible file format to contain accounting data. Files that use this data format can be accessed or be created by using the API provided in the included library, libexacct(3LIB). These files can then be analyzed on any platform with extended accounting enabled, and their data can be used for capacity planning and chargeback.
If extended accounting is active, statistics are gathered that can be examined by the libexacct API. libexacct allows examination of the exacct files either forward or backward. The API supports third-party files that are generated by libexacct as well as those files that are created by the kernel. There is a Practical Extraction and Report Language (Perl) interface to libexacct that enables you to develop customized reporting and extraction scripts. See Perl Interface to libexacct.
With extended accounting enabled, the task tracks the aggregate resource usage of its member processes. A task accounting record is written at task completion. Interim records can also be written. For more information on tasks, see Chapter 5, Projects and Tasks.
The extended accounting format is substantially more extensible than the SunOS™ legacy system accounting software format (see “What is System Accounting?” in System Administration Guide: Advanced Administration). Extended accounting permits accounting metrics to be added and removed from the system between releases, and even during system operation.
Both extended accounting and legacy system accounting software can be active on your system at the same time.
Routines that allow exacct records to be created serve two purposes.
To enable third-party exacct files to be created
To enable the creation of tagging records to be embedded in the kernel accounting file by using the putacct system call (see getacct(2))
The putacct system call is also available from the Perl interface.
The format permits different forms of accounting records to be captured without requiring that every change be an explicit version change. Well-written applications that consume accounting data must ignore records they do not understand.
The libexacct library converts and produces files in the exacct format. This library is the only supported interface to exacct format files.
The getacct, putacct, and wracct system calls do not apply to flows. The kernel creates flow records and writes them to the file when IPQoS flow accounting is configured.
The /etc/acctadm.conf file contains the current extended accounting configuration. The file is edited through the acctadm interface, not by the user.
The directory /var/adm/exacct is the standard location for placing extended accounting data. You can use the acctadm(1M) command to specify a different location for the process and task accounting-data files.
Command | Description
---|---
acctadm(1M) | Modifies various attributes of the extended accounting facility, stops and starts extended accounting, and is used to select accounting attributes to track for processes, tasks, and flows.
wracct(1M) | Writes extended accounting records for active processes and active tasks.
lastcomm(1) | Displays previously invoked commands. lastcomm can consume either standard accounting-process data or extended-accounting process data.
For information on commands that are associated with tasks and projects, see Commands Used to Administer Projects and Tasks. For information on IPQoS flow accounting, see ipqosconf(1M).
The Perl interface allows you to create Perl scripts that can read the accounting files produced by the exacct framework. You can also create Perl scripts that write exacct files.
The interface is functionally equivalent to the underlying C API. When possible, the data obtained from the underlying C API is presented as Perl data types. This feature makes accessing the data easier and it removes the need for buffer pack and unpack operations. Moreover, all memory management is performed by the Perl library.
The various project, task, and exacct-related functions are separated into groups. Each group of functions is located in a separate Perl module. Each module begins with the Sun standard Sun::Solaris:: Perl package prefix. All of the classes provided by the Perl exacct library are found under the Sun::Solaris::Exacct module.
The underlying libexacct(3LIB) library provides operations on exacct format files, catalog tags, and exacct objects. exacct objects are subdivided into two types:
Items, which are single-data values (scalars)
Groups, which are lists of Items
The following table summarizes each of the modules.
Module | Description | For More Information
---|---|---
Sun::Solaris::Project | This module provides functions to access the project manipulation functions getprojid(2), endprojent(3PROJECT), fgetprojent(3PROJECT), getdefaultproj(3PROJECT), getprojbyid(3PROJECT), getprojbyname(3PROJECT), getprojent(3PROJECT), getprojidbyname(3PROJECT), inproj(3PROJECT), project_walk(3PROJECT), setproject(3PROJECT), and setprojent(3PROJECT). | Project(3PERL)
Sun::Solaris::Task | This module provides functions to access the task manipulation functions gettaskid(2) and settaskid(2). | Task(3PERL)
Sun::Solaris::Exacct | This module is the top-level exacct module. This module provides functions to access the exacct-related system calls getacct(2), putacct(2), and wracct(2). This module also provides functions to access the libexacct(3LIB) library function ea_error(3EXACCT). Constants for all of the exacct EO_*, EW_*, EXR_*, P_*, and TASK_* macros are also provided in this module. | Exacct(3PERL)
Sun::Solaris::Exacct::Catalog | This module provides object-oriented methods to access the bitfields in an exacct catalog tag. This module also provides access to the constants for the EXC_*, EXT_*, and EXD_* macros. | Exacct::Catalog(3PERL)
Sun::Solaris::Exacct::File | This module provides object-oriented methods to access the libexacct accounting file functions ea_open(3EXACCT), ea_close(3EXACCT), ea_get_creator(3EXACCT), ea_get_hostname(3EXACCT), ea_next_object(3EXACCT), ea_previous_object(3EXACCT), and ea_write_object(3EXACCT). | Exacct::File(3PERL)
Sun::Solaris::Exacct::Object | This module provides object-oriented methods to access an individual exacct accounting file object. An exacct object is represented as an opaque reference blessed into the appropriate Sun::Solaris::Exacct::Object subclass. This module is further subdivided into the object types Item and Group. At this level, there are methods to access the ea_match_object_catalog(3EXACCT) and ea_attach_to_object(3EXACCT) functions. | Exacct::Object(3PERL)
Sun::Solaris::Exacct::Object::Item | This module provides object-oriented methods to access an individual exacct accounting file Item. Objects of this type inherit from Sun::Solaris::Exacct::Object. | Exacct::Object::Item(3PERL)
Sun::Solaris::Exacct::Object::Group | This module provides object-oriented methods to access an individual exacct accounting file Group. Objects of this type inherit from Sun::Solaris::Exacct::Object. These objects provide access to the ea_attach_to_group(3EXACCT) function. The Items contained within the Group are presented as a Perl array. | Exacct::Object::Group(3PERL)
For examples that show how to use the modules described in the previous table, see Using the Perl Interface to libexacct.
To activate the extended accounting facility for tasks, processes, and flows, use the acctadm(1M) command. The optional final parameter to acctadm indicates whether the command should act on the process, system task, or flow accounting components of the extended accounting facility.
Become superuser.
Activate extended accounting for processes.
# acctadm -e extended -f /var/adm/exacct/proc process
Activate extended accounting for tasks.
# acctadm -e extended,mstate -f /var/adm/exacct/task task
Activate extended accounting for flows.
# acctadm -e extended -f /var/adm/exacct/flow flow
Activate extended accounting on an ongoing basis by linking the /etc/init.d/acctadm script into /etc/rc2.d.
# ln -s /etc/init.d/acctadm /etc/rc2.d/Snacctadm
# ln -s /etc/init.d/acctadm /etc/rc2.d/Knacctadm
The n variable is replaced by a number.
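For example (the sequence number 90 below is purely illustrative; choose a number that orders the script correctly among the other rc2.d scripts on your system), the links might be created as follows. A scratch directory stands in for /etc/rc2.d so this sketch can be tried safely on any system:

```shell
# Illustrative S/K link naming with n replaced by 90; mktemp provides
# a scratch directory in place of the live /etc/rc2.d.
rc2=$(mktemp -d)
ln -s /etc/init.d/acctadm "$rc2/S90acctadm"
ln -s /etc/init.d/acctadm "$rc2/K90acctadm"
ls "$rc2"
```

The S-prefixed link starts the service on entry to run level 2, and the K-prefixed link stops it on exit.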
See Extended Accounting Configuration for information on accounting configuration.
Type acctadm without arguments to display the current status of the extended accounting facility.
# acctadm
            Task accounting: active
       Task accounting file: /var/adm/exacct/task
     Tracked task resources: extended
   Untracked task resources: none
         Process accounting: active
    Process accounting file: /var/adm/exacct/proc
  Tracked process resources: extended
Untracked process resources: host,mstate
            Flow accounting: active
       Flow accounting file: /var/adm/exacct/flow
     Tracked flow resources: extended
   Untracked flow resources: none
In the previous example, system task accounting is active in extended mode and mstate mode. Process and flow accounting are active in extended mode.
In the context of extended accounting, microstate (mstate) refers to the extended data, associated with microstate process transitions, that is available in the process usage file (see proc(4)). This data provides much more detail about the activities of the process than basic or extended records.
Available resources can vary from system to system, and from platform to platform. Use the -r option to view the available accounting resources on the system.
# acctadm -r
process:
extended pid,uid,gid,cpu,time,command,tty,projid,taskid,ancpid,wait-status,flag
basic    pid,uid,gid,cpu,time,command,tty,flag
task:
extended taskid,projid,cpu,time,host,mstate,anctaskid
basic    taskid,projid,cpu,time
flow:
extended saddr,daddr,sport,dport,proto,dsfield,nbytes,npkts,action,ctime,lseen,projid,uid
basic    saddr,daddr,sport,dport,proto,nbytes,npkts,action
To deactivate process, task, and flow accounting, turn off each of them individually.
Become superuser.
Turn off process accounting.
# acctadm -x process
Turn off task accounting.
# acctadm -x task
Turn off flow accounting.
# acctadm -x flow
Verify that task accounting, process accounting, and flow accounting have been turned off.
# acctadm
            Task accounting: inactive
       Task accounting file: none
     Tracked task resources: extended
   Untracked task resources: none
         Process accounting: inactive
    Process accounting file: none
  Tracked process resources: extended
Untracked process resources: host,mstate
            Flow accounting: inactive
       Flow accounting file: none
     Tracked flow resources: extended
   Untracked flow resources: none
Use the following code to recursively print the contents of an exacct object. Note that this capability is provided by the library as the Sun::Solaris::Exacct::Object::dump() function. This capability is also available through the ea_dump_object() convenience function.
sub dump_object {
    my ($obj, $indent) = @_;
    my $istr = '  ' x $indent;

    #
    # Retrieve the catalog tag.  Because we are doing this in an
    # array context, the catalog tag will be returned as a
    # (type, catalog, id) triplet, where each member of the triplet
    # will behave as an integer or a string, depending on context.
    # If instead this next line provided a scalar context, e.g.
    #     my $cat = $obj->catalog()->value();
    # then $cat would be set to the integer value of the catalog tag.
    #
    my @cat = $obj->catalog()->value();

    #
    # If the object is a plain item
    #
    if ($obj->type() == &EO_ITEM) {
        #
        # Note: The '%s' formats provide a string context, so the
        # components of the catalog tag will be displayed as the
        # symbolic values.  If we changed the '%s' formats to '%d',
        # the numeric value of the components would be displayed.
        #
        printf("%sITEM\n%s  Catalog = %s|%s|%s\n", $istr, $istr, @cat);
        $indent++;

        #
        # Retrieve the value of the item.  If the item contains in
        # turn a nested exacct object (i.e., an item or group), then
        # the value method will return a reference to the appropriate
        # sort of perl object (Exacct::Object::Item or
        # Exacct::Object::Group).  We could of course figure out that
        # the item contained a nested item or group by examining the
        # catalog tag in @cat and looking for a type of
        # EXT_EXACCT_OBJECT or EXT_GROUP.
        #
        my $val = $obj->value();
        if (ref($val)) {
            # If it is a nested object, recurse to dump it.
            dump_object($val, $indent);
        } else {
            # Otherwise it is just a 'plain' value, so display it.
            printf("%s  Value = %s\n", $istr, $val);
        }

    #
    # Otherwise we know we are dealing with a group.  Groups represent
    # their contents as a perl list or array (depending on context), so
    # we can process the contents of the group with a 'foreach' loop,
    # which provides a list context.  In a list context the value
    # method returns the contents of the group as a perl list, which
    # is the quickest mechanism, but doesn't allow the group to be
    # modified.  If we wanted to modify the contents of the group we
    # could do so like this:
    #     my $grp = $obj->value();   # Returns an array reference
    #     $grp->[0] = $newitem;
    # but accessing the group elements this way is much slower.
    #
    } else {
        printf("%sGROUP\n%s  Catalog = %s|%s|%s\n", $istr, $istr, @cat);
        $indent++;
        # 'foreach' provides a list context.
        foreach my $val ($obj->value()) {
            dump_object($val, $indent);
        }
        printf("%sENDGROUP\n", $istr);
    }
}
Use this script to create a new group record and write it to a file named /tmp/exacct.
#!/usr/perl5/5.6.1/bin/perl

use strict;
use warnings;
use Sun::Solaris::Exacct qw(:EXACCT_ALL);

# Prototype list of catalog tags and values.
my @items = (
    [ &EXT_STRING | &EXC_DEFAULT | &EXD_CREATOR      => "me"       ],
    [ &EXT_UINT32 | &EXC_DEFAULT | &EXD_PROC_PID     => $$         ],
    [ &EXT_UINT32 | &EXC_DEFAULT | &EXD_PROC_UID     => $<         ],
    [ &EXT_UINT32 | &EXC_DEFAULT | &EXD_PROC_GID     => $(         ],
    [ &EXT_STRING | &EXC_DEFAULT | &EXD_PROC_COMMAND => "/bin/rec" ],
);

# Create a new group catalog object.
my $cat = ea_new_catalog(&EXT_GROUP | &EXC_DEFAULT | &EXD_NONE);

# Create a new Group object and retrieve its data array.
my $group = ea_new_group($cat);
my $ary = $group->value();

# Push the new Items onto the Group array.
foreach my $v (@items) {
    push(@$ary, ea_new_item(ea_new_catalog($v->[0]), $v->[1]));
}

# Open the exacct file, write the record & close.
my $f = ea_new_file('/tmp/exacct', &O_RDWR | &O_CREAT | &O_TRUNC)
    || die("create /tmp/exacct failed: ", ea_error_str(), "\n");
$f->write($group);
$f = undef;
Use the following Perl script to print the contents of an exacct file.
#!/usr/perl5/5.6.1/bin/perl

use strict;
use warnings;
use Sun::Solaris::Exacct qw(:EXACCT_ALL);

die("Usage is dumpexacct <exacct file>\n") unless (@ARGV == 1);

# Open the exacct file and display the header information.
my $ef = ea_new_file($ARGV[0], &O_RDONLY) || die(ea_error_str());
printf("Creator:  %s\n", $ef->creator());
printf("Hostname: %s\n\n", $ef->hostname());

# Dump the file contents.
while (my $obj = $ef->get()) {
    ea_dump_object($obj);
}

# Report any errors.
if (ea_error() != EXR_OK && ea_error() != EXR_EOF) {
    printf("\nERROR: %s\n", ea_error_str());
    exit(1);
}
exit(0);
Here is example output produced by running Sun::Solaris::Exacct::Object->dump() on the file created in How to Create a New Group Record and Write It to a File.
Creator:  root
Hostname: localhost
GROUP
  Catalog = EXT_GROUP|EXC_DEFAULT|EXD_NONE
  ITEM
    Catalog = EXT_STRING|EXC_DEFAULT|EXD_CREATOR
    Value = me
  ITEM
    Catalog = EXT_UINT32|EXC_DEFAULT|EXD_PROC_PID
    Value = 845523
  ITEM
    Catalog = EXT_UINT32|EXC_DEFAULT|EXD_PROC_UID
    Value = 37845
  ITEM
    Catalog = EXT_UINT32|EXC_DEFAULT|EXD_PROC_GID
    Value = 10
  ITEM
    Catalog = EXT_STRING|EXC_DEFAULT|EXD_PROC_COMMAND
    Value = /bin/rec
ENDGROUP
After you determine the resource consumption of workloads on your system as described in Chapter 6, Extended Accounting, you can place bounds on resource usage. Bounds prevent workloads from over-consuming resources. The resource controls facility, which extends the UNIX resource limit concept, is one constraint mechanism that is used for this purpose.
UNIX systems have traditionally provided a resource limits facility (rlimits). The rlimits facility allows administrators to set one or more numerical limits on the amount of resources a process can consume. These limits include per-process CPU time used, per-process core file size, and per-process maximum heap size. Heap size is the amount of memory that is allocated for the process data segment.
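The traditional administrative interface to rlimits is the shell's ulimit built-in. As a small sketch (runnable in any POSIX shell, not specific to Solaris), the per-process core file limit can be lowered and read back like this:

```shell
# Cap the core file size for this shell and its children at 0 blocks,
# then display the resulting soft limit.
ulimit -c 0
ulimit -c
```

A process can lower its own soft limits freely, but raising a limit above the hard limit requires privilege; that asymmetry carries over into the privilege levels of the resource controls facility described below.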
In the Solaris operating environment, the concept of a per-process resource limit has been extended to the task and project entities described in Chapter 5, Projects and Tasks. These enhancements are provided by the resource controls (rctls) facility. A resource control is identified by the prefix project, task, or process. Resource controls can be observed on a system-wide basis.
The resource controls facility provides compatibility interfaces for the resource limits facility. Existing applications that use resource limits continue to run unchanged. These applications can be observed in the same way as applications that are modified to take advantage of the resource controls facility.
Resource controls provide a mechanism for constraint on system resources. Processes, tasks, and projects can be prevented from consuming amounts of specified system resources. This mechanism leads to a more manageable system by preventing over-consumption of resources.
Constraint mechanisms can be used to support capacity-planning processes. An encountered constraint can provide information about application resource needs without necessarily denying the resource to the application.
Resource controls can also serve as a simple attribute mechanism for resource management facilities. For example, the number of CPU shares made available to a project in the fair share scheduler (FSS) scheduling class is defined by the project.cpu-shares resource control. Because a project is assigned a fixed number of shares by the control, the various actions associated with exceeding a control are not relevant. In this context, the current value for the project.cpu-shares control is considered an attribute on the specified project.
Another type of project attribute is used to regulate the resource consumption of physical memory by collections of processes attached to a project. These attributes have the prefix rcap, for example, rcap.max-rss. Like a resource control, this type of attribute is configured in the project database. However, while resource controls enforce limits from the kernel, rcap project attributes are enforced at the user level by the rcapd(1M) resource cap enforcement daemon. For information on rcapd, see Chapter 9, Physical Memory Control Using the Resource Capping Daemon.
The resource controls facility is configured through the project database (see Chapter 5, Projects and Tasks). Resource control attributes are set in the final field of the project database entry. The values associated with each resource control are enclosed in parentheses, and appear as plain text separated by commas. The values in parentheses constitute an “action clause.” Each action clause is composed of a privilege level, a threshold value, and an action that is associated with the particular threshold. Each resource control can have multiple action clauses, which are also separated by commas. The following entry defines a per-process address-space limit and a per-task lightweight process limit on a project entity.
development:101:Developers:::task.max-lwps=(privileged,10,deny);process.max-address-space=(privileged,209715200,deny)
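The attribute field packs several levels of structure into one string: resource controls are separated by semicolons, and each control's action clauses sit in parentheses. As a sketch of that syntax (plain awk and tr, illustrative rather than a Solaris interface), the entry above can be split into its individual resource controls:

```shell
# Extract the attribute field (field 6) of the project entry and
# print one resource control per line.
entry='development:101:Developers:::task.max-lwps=(privileged,10,deny);process.max-address-space=(privileged,209715200,deny)'
echo "$entry" | awk -F: '{ print $6 }' | tr ';' '\n'
```

Each printed line is one control name followed by its action clause: a privilege level, a threshold value, and the action taken at that threshold.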
The rctladm(1M) command allows you to make runtime interrogations of and modifications to the resource controls facility with global scope, while the prctl(1) command does the same with local scope.
A list of the standard resource controls that are available in this release is shown in the following table.
The table describes the resource that is constrained by each control. The table also identifies the default units that are used by the project database for that resource. The default units are of two types:
Quantities represent a limited amount.
Indexes represent a maximum valid identifier.
Thus, project.cpu-shares specifies the number of shares to which the project is entitled. process.max-file-descriptor specifies the highest file number that can be assigned to a process by the open(2) system call.
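The distinction matters when reading limit values: an index-type control bounds the highest valid identifier, not a count of resources. The same semantics can be sketched with the shell's descriptor limit (a portable ulimit illustration, not the Solaris process.max-file-descriptor control itself): with the limit set to 5, descriptors 0 through 4 remain valid identifiers, while descriptor 5 is refused.

```shell
# With the file descriptor limit set to 5, descriptors 0-4 are valid
# identifiers; attempting to open descriptor 5 fails.
(
        ulimit -n 5
        ( exec 4>/dev/null ) && echo "fd 4 ok"
        ( exec 5>/dev/null ) 2>/dev/null || echo "fd 5 denied"
)
```

The outer subshell keeps the lowered limit from affecting the invoking shell.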
Table 7–1 Standard Resource Controls
Control Name | Description | Default Unit
---|---|---
project.cpu-shares | The number of CPU shares that are granted to this project for use with FSS(7) | Quantity (shares)
task.max-cpu-time | Maximum CPU time that is available to this task's processes | Time (seconds)
task.max-lwps | Maximum number of LWPs simultaneously available to this task's processes | Quantity (LWPs)
process.max-cpu-time | Maximum CPU time that is available to this process | Time (seconds)
process.max-file-descriptor | Maximum file descriptor index that is available to this process | Index (maximum file descriptor)
process.max-file-size | Maximum file offset that is available for writing by this process | Size (bytes)
process.max-core-size | Maximum size of a core file that is created by this process | Size (bytes)
process.max-data-size | Maximum heap memory that is available to this process | Size (bytes)
process.max-stack-size | Maximum stack memory segment that is available to this process | Size (bytes)
process.max-address-space | Maximum amount of address space, as summed over segment sizes, that is available to this process | Size (bytes)
A threshold value on a resource control constitutes an enforcement point where local actions can be triggered or global actions, such as logging, can occur.
Each threshold value must be associated with a privilege level of one of the following three types.
Basic, which can be modified by the owner of the calling process
Privileged, which can be modified only by privileged (superuser) callers
System, which is fixed for the duration of the operating system instance
A resource control is guaranteed to have one system value, which is defined by the system, or resource provider. The system value represents how much of the resource the current implementation of the operating system is capable of providing.
Any number of privileged values can be defined, and only one basic value is allowed. Operations that are performed without specifying a privilege value are assigned a basic privilege by default.
The privilege level for a resource control value is defined in the privilege field of the resource control block as RCTL_BASIC, RCTL_PRIVILEGED, or RCTL_SYSTEM. See getrctl(2) for more information. You can use the prctl command to modify values that are associated with basic and privileged levels.
For each threshold value that is placed on a resource control, you can associate one or more actions.
You can choose to deny the resource requests for an amount that is greater than the threshold.
You can choose to send a signal to the violating or observing process if the threshold value is reached.
Due to implementation restrictions, the global properties of each control can restrict the set of available actions that can be set on the threshold value. A list of available signal actions is presented in the following table. For additional information on signals, see signal(3HEAD).
Table 7–2 Signals Available to Resource Control Values
Signal | Notes
---|---
SIGABRT | 
SIGHUP | 
SIGTERM | 
SIGKILL | 
SIGSTOP | 
SIGXRES | 
SIGXFSZ | Available only to resource controls with the RCTL_GLOBAL_FILE_SIZE property (process.max-file-size). See rctlblk_set_value(3C) for more information.
SIGXCPU | Available only to resource controls with the RCTL_GLOBAL_CPUTIME property (process.max-cpu-time). See rctlblk_set_value(3C) for more information.
Each resource control on the system has a certain set of associated properties. This set of properties is defined as a set of global flags, which are associated with all controlled instances of that resource. Global flags cannot be modified, but the flags can be retrieved by using either rctladm or the getrctl system call.
Local flags define the default behavior and configuration for a specific threshold value of that resource control on a specific process or process collective. The local flags for one threshold value do not affect the behavior of other defined threshold values for the same resource control. However, the global flags affect the behavior for every value associated with a particular control. Local flags can be modified, within the constraints supplied by their corresponding global flags, by the prctl command or the setrctl system call (see setrctl(2)).
For the complete list of local flags, global flags, and their definitions, see rctlblk_set_value(3C).
To determine system behavior when a threshold value for a particular resource control is reached, use rctladm to display the global flags for the resource control. For example, to display the values for process.max-cpu-time, type the following:
$ rctladm process.max-cpu-time
process.max-cpu-time  syslog=off  [ lowerable no-deny cpu-time inf ]
The global flags indicate the following.
lowerable: Superuser privileges are not required to lower the privileged values for this control.
no-deny: Even when threshold values are exceeded, access to the resource is never denied.
cpu-time: SIGXCPU is available to be sent when threshold values of this resource are reached.
inf: Any value with RCTL_LOCAL_MAXIMAL defined actually represents an infinite quantity, and the value is never enforced.
Use prctl to display local values and actions for the resource control.
$ prctl -n process.max-cpu-time $$
353939: -ksh
process.max-cpu-time  [ lowerable no-deny cpu-time inf ]
 18446744073709551615 privileged signal=XCPU [ max ]
 18446744073709551615 system     deny        [ max ]
The max (RCTL_LOCAL_MAXIMAL) flag is set for both threshold values, and the inf (RCTL_GLOBAL_INFINITE) flag is defined for this resource control. Hence, as configured, both threshold quantities represent infinite values and they will never be exceeded.
More than one resource control can exist on a resource. A resource control can exist at each containment level in the process model. If resource controls are active on the same resource at different container levels, the smallest container's control is enforced first. Thus, action is taken on process.max-cpu-time before task.max-cpu-time if both controls are encountered simultaneously.
Often, the resource consumption of processes is unknown. To get more information, try using the global resource control actions that are available with rctladm(1M). Use rctladm to establish a syslog action on a resource control. Then, if any entity managed by that resource control encounters a threshold value, a system message is logged at the configured logging level.
Each resource control listed in Table 7–1 can be assigned to a project at login, or when newtask(1) or another project-aware launcher, such as at(1), batch (see at(1)), or cron(1M), is invoked. Each command that is initiated is launched in a separate task in the invoking user's default project.
Updates to entries in the project database, whether to the /etc/project file or to a representation of the database in a network name service, are not applied to currently active projects. The updates are applied when a new task joins the project through login(1) or newtask.
Values changed in the project database only become effective for new tasks that are started in a project. However, you can use the rctladm and prctl commands to update resource controls on a running system.
The rctladm command affects the global logging state of each resource control on a system-wide basis. This command can be used to view the global state and to set up the level of syslog logging when controls are exceeded.
You can view and temporarily alter resource control values and actions on a per-process, per-task, or per-project basis by using prctl. A project, task, or process ID is given as input, and the command operates on the resource control at the level where it is defined.
Any modifications to values and actions take effect immediately. However, these modifications apply to the current session only. The changes are not recorded in the project database. If the system is restarted, the modifications are lost. Permanent changes to resource controls must be made in the project database.
All resource control settings that can be modified in the project database can also be modified with the prctl command. Both basic and privileged values can be added or be deleted and their actions can be modified. By default, the basic type is assumed for all set operations, but processes and users with superuser privileges can also modify privileged resource controls. System resource controls cannot be altered.
Type this entry in the /etc/project database to set the maximum number of LWPs in each task in project x-files to 3.
x-files:100::root::task.max-lwps=(privileged,3,deny)
When the superuser creates a new task in project x-files by joining it with newtask, the superuser cannot create more than three LWPs while running in this task, as the following annotated sample session shows.
# newtask -p x-files csh
# prctl -n task.max-lwps $$
688: csh
task.max-lwps
          3 privileged deny
 2147483647 system     deny
# id -p
uid=0(root) gid=1(other) projid=100(x-files)
# ps -o project,taskid -p $$
 PROJECT TASKID
 x-files    236
# csh        /* creates second LWP */
# csh        /* creates third LWP */
# csh        /* cannot create more LWPs */
Vfork failed
#
The /etc/project file can contain settings for multiple resource controls for each project as well as multiple threshold values for each control. Threshold values are defined in action clauses, which are comma-separated for multiple values.
The following line in the file sets a basic control with no action on the maximum LWPs per task for project x-files. The line also sets a privileged deny control on the maximum LWPs per task. This control causes any LWP creation that exceeds the maximum to fail, as shown in the previous example. Finally, the maximum file descriptors per process are limited at the basic level, which forces failure of any open call that exceeds the maximum.
x-files:101::root::task.max-lwps=(basic,10,none),(privileged,500,deny); process.max-file-descriptor=(basic,128,deny)
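The entry format above is mechanical enough to parse. The following Python sketch is illustrative only; the function name parse_project_entry and the simplified grammar are assumptions, not part of any Solaris interface. It splits a project entry into its named fields and its resource controls with their comma-separated action clauses:

```python
import re

def parse_project_entry(line):
    """Split an /etc/project entry into its named fields and a map of
    resource controls, each with a list of (privilege, threshold, action)
    clauses.  Illustrative sketch only; real parsing rules are in project(4)."""
    name, projid, comment, users, groups, attrs = line.split(":", 5)
    controls = {}
    for attr in attrs.split(";"):
        ctrl, _, value = attr.strip().partition("=")
        # Each control may carry several comma-separated action clauses
        # of the form (privilege,threshold,action).
        clauses = re.findall(r"\(([^)]*)\)", value)
        controls[ctrl] = [tuple(c.split(",")) for c in clauses]
    return name, int(projid), controls

name, projid, controls = parse_project_entry(
    "x-files:101::root::task.max-lwps=(basic,10,none),(privileged,500,deny);"
    "process.max-file-descriptor=(basic,128,deny)"
)
```

Applied to the sample entry, this yields two action clauses for task.max-lwps and one for process.max-file-descriptor, mirroring the description above.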
As superuser, type prctl to display the maximum file descriptor value for the currently running shell:
# prctl -n process.max-file-descriptor $$
8437:   sh
process.max-file-descriptor [ lowerable deny ]
        256             basic           deny
        65536           privileged      deny
        2147483647      system          deny
Use the prctl command to temporarily add a new privileged value to deny the use of more than three LWPs per task for the x-files project. The result is identical to the result in How to Set the Maximum Number of LWPs for Each Task in a Project, as shown in the following annotated sample session:
# newtask -p x-files
# id -p
uid=0(root) gid=1(other) projid=101(x-files)
# prctl -n task.max-lwps -t privileged -v 3 -e deny -i project x-files
# prctl -n task.max-lwps -i project x-files
670:    sh
task.max-lwps
        3               privileged      deny
        2147483647      system          deny
You can also use prctl -r to replace the current value of a resource control. For example, to lower the basic value of process.max-file-descriptor:
# prctl -n process.max-file-descriptor -r -v 128 $$
You can use rctladm to enable the global syslog attribute of a resource control. When the control is exceeded, notification is logged at the specified syslog level. Type the following:
# rctladm -e syslog process.max-file-descriptor
A global action on a resource control enables you to receive notice of any entity that is tripping over a resource control value.
For example, assume you want to determine whether a web server has sufficient CPU capacity for its typical workload. You could analyze sar(1) data for idle CPU time and load average. You could also examine extended accounting data to determine the number of simultaneous processes that are running for the web server process.
However, an easier approach is to place the web server in a task. You can then set a global action, using syslog, to notify you whenever a task exceeds the number of LWPs appropriate for the machine's capabilities.
Use the prctl command to place a privileged (superuser-owned) resource control on the tasks that contain an httpd process. Limit each task's total number of LWPs to 40, and disable all local actions.
# prctl -n task.max-lwps -v 40 -t privileged -d all `pgrep httpd`
Enable a system log global action on the task.max-lwps resource control.
# rctladm -e syslog task.max-lwps
Observe whether the workload trips the resource control.
If it does, messages such as the following appear in /var/adm/messages:
Jan 8 10:15:15 testmachine unix: [ID 859581 kern.notice] NOTICE: privileged rctl task.max-lwps exceeded by task 19
An analysis of workload data can indicate that a particular workload or group of workloads is monopolizing CPU resources. If these workloads are not violating resource constraints on CPU usage, you can modify the allocation policy for CPU time on the system. The fair share scheduling class described in this chapter enables you to allocate CPU time based on shares instead of the priority scheme of the timesharing (TS) scheduling class.
A fundamental job of the operating system is to arbitrate which processes get access to the system's resources. The process scheduler, which is also called the dispatcher, is the portion of the kernel that controls allocation of the CPU to processes. The scheduler supports the concept of scheduling classes. Each class defines a scheduling policy that is used to schedule processes within the class. The default scheduler in the Solaris operating environment, the TS scheduler, tries to give every process relatively equal access to the available CPUs. However, you might want to specify that certain processes be given more resources than others.
You can use the fair share scheduler (FSS) to control the allocation of available CPU resources among workloads, based on their importance. This importance is expressed by the number of shares of CPU resources that you assign to each workload.
You give each project CPU shares to control the project's entitlement to CPU resources. The FSS guarantees a fair dispersion of CPU resources among projects that is based on allocated shares, independent of the number of processes that are attached to a project. The FSS achieves fairness by reducing a project's entitlement for heavy CPU usage and increasing its entitlement for light usage, relative to usage by other projects.
The FSS consists of a kernel scheduling class module and class-specific versions of the dispadmin(1M) and priocntl(1) commands. Project shares used by the FSS are specified through the project.cpu-shares property in the project(4) database.
The term “share” is used to define a portion of the system's CPU resources that is allocated to a project. If you assign a greater number of CPU shares to a project, relative to other projects, the project receives more CPU resources from the fair share scheduler.
CPU shares are not equivalent to percentages of CPU resources. Shares are used to define the relative importance of workloads in relation to other workloads. When you assign CPU shares to a project, your primary concern is not the number of shares the project has. Knowing how many shares the project has in comparison with other projects is more important. You must also take into account how many of those other projects will be competing with it for CPU resources.
Processes in projects with zero shares always run at the lowest system priority (0). These processes only run when projects with nonzero shares are not using CPU resources.
In the Solaris operating environment, a project workload usually consists of more than one process. From the fair share scheduler perspective, each project workload can be in either an idle state or an active state. A project is considered idle if none of its processes are using any CPU resources. This usually means that such processes are either sleeping (waiting for I/O completion) or stopped. A project is considered active if at least one of its processes is using CPU resources. The sum of shares of all active projects is used in calculating the portion of CPU resources to be assigned to projects.
The following formula shows how the FSS scheduler calculates per-project allocation of CPU resources.
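The formula itself does not survive in this text. Reconstructed from the surrounding description (each active project's allocation is its shares divided by the total shares of all active projects), it reads:

```latex
\text{allocation}_{p} \;=\; \frac{\text{shares}_{p}}{\sum_{i \,\in\, \text{active projects}} \text{shares}_{i}}
```

This is a reconstruction consistent with the examples that follow; consult FSS(7) for the authoritative statement.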
When more projects become active, each project's CPU allocation is reduced, but the proportion between the allocations of different projects does not change.
Share allocation is not the same as utilization. A project that is allocated 50 percent of the CPU resources might average only a 20 percent CPU use. Moreover, shares serve to limit CPU usage only when there is competition from other projects. Regardless of how low a project's allocation is, it always receives 100 percent of the processing power if it is running alone on the system. Available CPU cycles are never wasted. They are distributed between projects.
The allocation of a small share to a busy workload might slow its performance. However, the workload is not prevented from completing its work if the system is not overloaded.
Assume you have a system with two CPUs running two parallel CPU-bound workloads called A and B. Each workload is running as a separate project. The projects have been configured so that project A is assigned SA shares, and project B is assigned SB shares.
On average, under the traditional TS scheduler, each of the workloads that is running on the system would be given the same amount of CPU resources. Each workload would get 50 percent of the system's capacity.
When run under the control of the FSS scheduler with SA=SB, these projects are also given approximately the same amounts of CPU resources. However, if the projects are given different numbers of shares, their CPU resource allocations are different.
The next three examples illustrate how shares work in different configurations. These examples show that share-based allocations match actual usage only when demand meets or exceeds the available resources.
If A and B each have two CPU-bound processes, and SA = 1 and SB = 3, then the total number of shares is 1 + 3 = 4. In this configuration, given sufficient CPU demand, projects A and B are allocated 25 percent and 75 percent of CPU resources, respectively.
If A and B have only one CPU-bound process each, and SA = 1 and SB = 100, then the total number of shares is 101. Each project cannot use more than one CPU because each project has only one running process. Because no competition exists between projects for CPU resources in this configuration, projects A and B are each allocated 50 percent of all CPU resources. In this configuration, CPU share values are irrelevant. The projects' allocations would be the same (50/50), even if both projects were assigned zero shares.
If A and B have two CPU-bound processes each, and project A is given 1 share and project B is given 0 shares, then project B is not allocated any CPU resources and project A is allocated all CPU resources. Processes in B always run at system priority 0, so they will never be able to run because processes in project A always have higher priorities.
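The arithmetic in the first and third examples can be sketched directly. The function below is illustrative (fss_allocation is not a Solaris interface) and assumes every project has enough CPU-bound processes to consume whatever it is allocated, which is why it does not model the second example, where demand is below capacity:

```python
def fss_allocation(shares):
    """Return each active project's CPU fraction under FSS, assuming
    sufficient demand: shares / total shares of all active projects.
    Zero-share projects run only when nonzero-share projects are idle,
    so under full demand they receive nothing."""
    total = sum(shares.values())
    if total == 0:
        # No project holds shares; idle CPU would be shared out evenly.
        n = len(shares)
        return {p: 1.0 / n for p in shares}
    return {p: s / total for p, s in shares.items()}

# S_A = 1, S_B = 3: A is allocated 25%, B is allocated 75%.
print(fss_allocation({"A": 1, "B": 3}))
# S_A = 1, S_B = 0: B's processes run at priority 0, so A gets everything.
print(fss_allocation({"A": 1, "B": 0}))
```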
Projects are the workload containers in the FSS scheduler. Groups of users who are assigned to a project are treated as single controllable blocks. Note that you can create a project with its own number of shares for an individual user.
Users can be members of multiple projects that have different numbers of shares assigned. By moving processes from one project to another project, processes can be assigned CPU resources in varying amounts.
For more information on the project(4) database and name services, see project Database.
The configuration of CPU shares is managed by the name service as a property of the project database.
When the first task (or process) associated with a project is created through the setproject(3PROJECT) library function, the number of CPU shares defined as resource control project.cpu-shares in the project database is passed to the kernel. A project that does not have the project.cpu-shares resource control defined is assigned one share.
In the following example, this entry in the /etc/project file sets the number of shares for project x-files to 5:
x-files:100::::project.cpu-shares=(privileged,5,none)
If you alter the number of CPU shares allocated to a project in the database when processes are already running, the number of shares for that project will not be modified at that point. The project must be restarted for the change to become effective.
If you want to temporarily change the number of shares assigned to a project without altering the project's attributes in the project database, use prctl(1). For example, to change the value of project x-files's project.cpu-shares resource control to 3 while processes associated with that project are running, type the following:
# prctl -r -n project.cpu-shares -v 3 -i project x-files
-r
    Replaces the current value for the named resource control.
-n name
    Specifies the name of the resource control.
-v value
    Specifies the value for the resource control.
-i idtype
    Specifies the ID type of the next argument.
x-files
    Specifies the object of the change. In this instance, project x-files is the object.
The system project, with project ID 0, includes all system daemons that are started by the boot-time initialization scripts. system can be viewed as a project with an unlimited number of shares. This means that system is always scheduled first, regardless of how many shares have been given to other projects. If you do not want the system project to have unlimited shares, you can specify a number of shares for this project in the project database.
As stated previously, processes that belong to projects with zero shares are always given zero system priority. Projects with one or more shares are running with priorities one and higher. Thus, projects with zero shares are only scheduled when CPU resources are available that are not requested by a nonzero share project.
The maximum number of shares that can be assigned to one project is 65535.
The FSS can be used in conjunction with processor sets to provide more fine-grained controls over allocations of CPU resources among projects that run on each processor set than would be available with processor sets alone. The FSS scheduler treats processor sets as entirely independent partitions, with each processor set controlled independently with respect to CPU allocations.
The CPU allocations of projects running in one processor set are not affected by the CPU shares or activity of projects running in another processor set because the projects are not competing for the same resources. Projects only compete with each other if they are running within the same processor set.
The number of shares that is allocated to a project is system wide. Regardless of which processor set it is running on, each portion of a project is given the same number of shares.
When processor sets are used, project CPU allocations are calculated for active projects that run within each processor set, as shown in the following figure.
Project partitions that run on different processor sets might have different CPU allocations. The CPU allocation for each project partition in a processor set depends only on the allocations of other projects that run on the same processor set.
The performance and availability of applications that run within the boundaries of their processor sets are not affected by the introduction of new processor sets. The applications are also not affected by changes that are made to the share allocations of projects that run on other processor sets.
Empty processor sets (sets without processors in them) or processor sets without processes bound to them do not have any impact on the FSS scheduler behavior.
Assume that a server with eight CPUs is running several CPU-bound applications in projects A, B, and C. Project A is allocated one share, project B is allocated two shares, and project C is allocated three shares.
Project A is running only on processor set 1. Project B is running on processor sets 1 and 2. Project C is running on processor sets 1, 2, and 3. Assume that each project has enough processes to utilize all available CPU power. Thus, there is always competition for CPU resources on each processor set.
The total system-wide project CPU allocations on such a system are shown in the following table.
| Project | Allocation |
|---|---|
| Project A | 4% = (1/6 × 2/8)pset1 |
| Project B | 28% = (2/6 × 2/8)pset1 + (2/5 × 4/8)pset2 |
| Project C | 67% = (3/6 × 2/8)pset1 + (3/5 × 4/8)pset2 + (3/3 × 2/8)pset3 |
These percentages do not match the corresponding amounts of CPU shares that are given to projects. However, within each processor set, the per-project CPU allocation ratios are proportional to their respective shares.
On the same system without processor sets, the distribution of CPU resources would be different, as shown in the following table.
| Project | Allocation |
|---|---|
| Project A | 16.66% = 1/6 |
| Project B | 33.33% = 2/6 |
| Project C | 50% = 3/6 |
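The allocations in both tables follow from the same rule. The sketch below (pset_allocations is an illustrative helper, not a Solaris interface) computes each project's system-wide CPU fraction, treating the no-pset case as a single 8-CPU set; it assumes, as in the example, that every project has enough processes to keep each of its processor sets busy:

```python
def pset_allocations(psets):
    """System-wide CPU fraction per project when FSS runs within
    processor sets.  psets maps a pset name to (cpus, {project: shares}).
    Shares compete only inside a pset, and each pset contributes in
    proportion to its fraction of the machine's CPUs."""
    total_cpus = sum(cpus for cpus, _ in psets.values())
    alloc = {}
    for cpus, shares in psets.values():
        pset_total = sum(shares.values())
        for project, s in shares.items():
            alloc[project] = alloc.get(project, 0.0) + (s / pset_total) * (cpus / total_cpus)
    return alloc

# The 8-CPU example: A=1, B=2, C=3 shares; psets of 2, 4, and 2 CPUs.
alloc = pset_allocations({
    "pset1": (2, {"A": 1, "B": 2, "C": 3}),
    "pset2": (4, {"B": 2, "C": 3}),
    "pset3": (2, {"C": 3}),
})

# The same machine without processor sets: one 8-CPU set.
flat = pset_allocations({"all": (8, {"A": 1, "B": 2, "C": 3})})
```

Evaluating alloc reproduces the roughly 4/28/67 percent split from the first table, and flat reproduces the 1/6, 2/6, 3/6 split from the second.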
By default, the FSS scheduling class uses the same range of priorities (0 to 59) as the timesharing (TS), interactive (IA), and fixed priority (FX) scheduling classes. Therefore, you should avoid having processes from these scheduling classes share the same processor set. A mix of processes in the FSS, TS, IA, and FX classes could result in unexpected scheduling behavior.
With the use of processor sets, you can mix TS, IA, and FX with FSS in one system. However, all the processes that run on each processor set must be in one scheduling class, so they do not compete for the same CPUs. In particular, the FX scheduler should not be used in conjunction with the FSS scheduling class unless processor sets are used. Confining the two classes to separate processor sets prevents applications in the FX class from using priorities high enough to starve applications in the FSS class.
You can mix processes in the TS and IA classes in the same processor set, or on the same system without processor sets.
The Solaris operating environment also offers a real-time (RT) scheduler to users with superuser privileges. By default, the RT scheduling class uses system priorities in a different range (usually from 100 to 159) than FSS. Because RT and FSS are using disjoint ranges of priorities, FSS can coexist with the RT scheduling class within the same processor set. However, the FSS scheduling class does not have any control over processes that run in the RT class.
For example, on a four-processor system, a single-threaded RT process can consume one entire processor if the process is CPU bound. If the system also runs FSS, regular user processes compete for the three remaining CPUs that are not being used by the RT process. Note that the RT process might not use the CPU continuously. When the RT process is idle, FSS utilizes all four processors.
You can type the following command to find out which scheduling classes the processor sets are running in and ensure that each processor set is configured to run either TS, IA, FX, or FSS processes.
$ ps -ef -o pset,class | grep -v CLS | sort | uniq
  1 FSS
  1 SYS
  2 TS
  2 RT
  3 FX
To set the default scheduler for the system, see FSS Configuration Examples and dispadmin(1M). To move running processes into a different scheduling class, see FSS Configuration Examples and priocntl(1).
You can use prstat(1M) to monitor CPU usage by active projects.
You can use the extended accounting data for tasks to obtain per-project statistics on the amount of CPU resources that are consumed over longer periods. See Chapter 6, Extended Accounting for more information.
To monitor the CPU usage of projects that run on the system, type the following:
% prstat -J
To monitor the CPU usage of projects on a list of processor sets, type the following:
% prstat -J -C pset-list
pset-list
    A list of processor set IDs, separated by commas.
As with other scheduling classes in the Solaris environment, commands to set the scheduler class, configure the scheduler's tunable parameters, and configure the properties of individual processes can be used with FSS.
Use the dispadmin command to set FSS as the default scheduler for the system.
# dispadmin -d FSS
This change takes effect on the next reboot. After reboot, every process on the system runs in the FSS scheduling class.
You can manually move processes from the TS scheduling class into the FSS scheduling class without changing the default scheduling class and rebooting.
Become superuser.
Move the init process (pid 1) into the FSS scheduling class.
# priocntl -s -c FSS -i pid 1
Move all processes from the TS scheduling class into the FSS scheduling class.
# priocntl -s -c FSS -i class TS
All processes again run in the TS scheduling class after reboot.
You might be using a default class other than TS. For example, your system might be running a window environment that uses the IA class by default. You can manually move all processes into the FSS scheduling class without changing the default scheduling class and rebooting.
Become superuser.
Move the init process (pid 1) into the FSS scheduling class.
# priocntl -s -c FSS -i pid 1
Move all processes from their current scheduling classes into the FSS scheduling class.
# priocntl -s -c FSS -i all
All processes again run in the default scheduling class after reboot.
You can manually move processes in a particular project from their current scheduling class to the FSS scheduling class.
# priocntl -s -c FSS -i projid 10
The project's processes again run in the default scheduling class after reboot.
You can use the dispadmin command to examine and tune the FSS scheduler's time quantum value. Time quantum is the amount of time that a thread is allowed to run before it must relinquish the processor. To display the current time quantum for the FSS scheduler, type the following:
$ dispadmin -c FSS -g
#
# Fair Share Scheduler Configuration
#
RES=1000
#
# Time Quantum
#
QUANTUM=110
When you use the -g option, you can also use the -r option to specify the resolution that is used for printing time quantum values. If no resolution is specified, time quantum values are displayed in milliseconds by default. Type the following:
$ dispadmin -c FSS -g -r 100
#
# Fair Share Scheduler Configuration
#
RES=100
#
# Time Quantum
#
QUANTUM=11
To set scheduling parameters for the FSS scheduling class, use dispadmin -s. The values in file must be in the format output by the -g option. These values overwrite the current values in the kernel. Type the following:
$ dispadmin -c FSS -s file
For more information on how to use the FSS scheduler, see priocntl(1), ps(1), dispadmin(1M), and FSS(7).
The resource capping daemon rcapd regulates physical memory consumption by processes running in projects that have resource caps defined.
A resource cap is an upper bound placed on the consumption of a resource, such as physical memory. Per-project physical memory caps are supported.
The resource capping daemon and its associated utilities provide mechanisms for physical memory resource cap enforcement and administration.
Like resource controls, resource caps can be defined by using attributes of project entries in the project database. However, while resource controls are synchronously enforced by the kernel, resource caps are asynchronously enforced at the user level by the resource capping daemon. With asynchronous enforcement, a small delay occurs as a result of the sampling interval used by the daemon.
For information about rcapd, see the rcapd(1M) man page. For information about projects and the project database, see Chapter 5, Projects and Tasks and the project(4) man page. For information about resource controls, see Chapter 7, Resource Controls.
The daemon repeatedly samples the resource utilization of projects that have physical memory caps. The sampling interval used by the daemon is specified by the administrator. See Determining Sample Intervals for additional information. When the system's physical memory utilization exceeds the threshold for cap enforcement, and other conditions are met, the daemon takes action to reduce the resource consumption of projects with memory caps to levels at or below the caps.
The virtual memory system divides physical memory into segments known as pages. Pages are the fundamental unit of physical memory in the Solaris memory management subsystem. To read data from a file into memory, the virtual memory system reads in one page at a time, or pages in a file. To reduce resource consumption, the daemon can page out, or relocate, infrequently used pages to a swap device, which is an area outside of physical memory.
The daemon manages physical memory by regulating the size of a project workload's resident set relative to the size of its working set. The resident set is the set of pages that are resident in physical memory. The working set is the set of pages that the workload actively uses during its processing cycle. The working set changes over time, depending on the process's mode of operation and the type of data being processed. Ideally, every workload has access to enough physical memory to enable its working set to remain resident. However, when a workload's working set does not fit in physical memory, secondary disk storage holds the portion that does not fit.
Only one instance of rcapd can run at any given time.
To define a physical memory resource cap for a project, establish a resident set size (RSS) cap by adding this attribute to the project database entry:
rcap.max-rss
    The total amount of physical memory, in bytes, that is available to processes in the project.
For example, the following line in the /etc/project database sets an RSS cap of 10 gigabytes for a project named db.
db:100::db,root::rcap.max-rss=10737418240
The system might round the specified cap value to a multiple of the page size.
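The rounding noted above can be modeled simply. This sketch assumes rounding down to a whole number of pages and an 8-Kbyte page size; both are assumptions for illustration (the page size varies by platform and can be queried with the pagesize command, and rcapd(1M) documents the actual behavior):

```python
PAGE_SIZE = 8192  # assumed page size in bytes; check with `pagesize` on a real system

def round_cap_to_pages(cap_bytes, page_size=PAGE_SIZE):
    """Round a hypothetical rcap.max-rss value down to a whole number
    of pages.  Illustrative only; the system's actual rounding rule
    may differ."""
    return (cap_bytes // page_size) * page_size

# The 10-gigabyte cap from the example entry is already page-aligned,
# so it passes through unchanged.
print(round_cap_to_pages(10737418240))
```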
You use the rcapadm command to configure the resource capping daemon. You can perform the following actions:
Set the threshold value for cap enforcement
Set intervals for the operations performed by rcapd
Enable or disable resource capping
Display the current status of the configured resource capping daemon
To configure the daemon, you must have superuser privileges or have the Process Management profile in your list of profiles. The Process Management role and the System Administrator role both include the Process Management profile. See “RBAC Elements: Reference Information” in System Administration Guide: Security Services.
Configuration changes can be incorporated into rcapd according to the configuration interval (see rcapd Operation Intervals) or on demand by sending a SIGHUP (see the kill(1) man page).
If used without arguments, rcapadm displays the current status of the resource capping daemon if it has been configured.
The following subsections discuss cap enforcement, cap values, and rcapd operation intervals.
The memory cap enforcement threshold is the percentage of physical memory utilization on the system that triggers cap enforcement. When the system exceeds this utilization, caps are enforced. The physical memory used by applications and the kernel is included in this percentage. The percentage of utilization determines the way in which memory caps are enforced.
To enforce caps, memory can be paged out from project workloads.
Memory can be paged out to reduce the size of the portion of memory that is over its cap for a given workload.
Memory can be paged out to reduce the proportion of physical memory used that is over the memory cap enforcement threshold on the system.
A workload is permitted to use physical memory up to its cap. A workload can use additional memory as long as the system's memory utilization stays below the memory cap enforcement threshold.
To set the value for cap enforcement, see How to Set the Memory Cap Enforcement Threshold.
If a project cap is set too low, there might not be enough memory for the workload to proceed effectively under normal conditions. The paging that occurs because the workload requires more memory has a negative effect on system performance.
Projects that have caps set too high can consume available physical memory before their caps are exceeded. In this case, physical memory is effectively managed by the kernel and not by rcapd.
In determining caps on projects, consider these factors.
The daemon can attempt to reduce a project workload's physical memory usage whenever the sampled usage exceeds the project's cap. During cap enforcement, the swap devices and other devices that contain files that the workload has mapped are used. The performance of the swap devices is a critical factor in determining the performance of a workload that routinely exceeds its cap. The execution of the workload is similar to running it on a machine with the same amount of physical memory as the workload's cap.
The daemon's CPU usage varies with the number of processes in the project workloads it is capping and the sizes of the workloads' address spaces.
A small portion of the daemon's CPU time is spent sampling the usage of each workload. Adding processes to workloads increases the time spent sampling usage.
Another portion of the daemon's CPU time is spent enforcing caps when they are exceeded. The time spent is proportional to the amount of virtual memory involved. CPU time spent increases or decreases in response to corresponding changes in the total size of a workload's address space. This information is reported in the vm column of rcapstat output. See Monitoring Resource Utilization With rcapstat and the rcapstat(1) man page for more information.
The daemon cannot determine which pages of memory are shared with other processes or which are mapped multiple times within the same process. Since rcapd assumes that each page is unique, this results in a discrepancy between the reported (estimated) RSS and the actual RSS.
Certain workloads, such as databases, use shared memory extensively. For these workloads, you can sample a project's regular usage to determine a suitable initial cap value. Use output from the prstat command with the -J option. See the prstat(1M) man page.
You can tune the intervals for the periodic operations performed by rcapd.
All intervals are specified in seconds. The rcapd operations and their default interval values are described in the following table.
| Operation | Default Interval Value in Seconds | Description |
|---|---|---|
| scan | 15 | Number of seconds between scans for processes that have joined or left a project workload. Minimum value is 1 second. |
| sample | 5 | Number of seconds between samplings of resident set size and subsequent cap enforcements. Minimum value is 1 second. |
| report | 5 | Number of seconds between updates to paging statistics. If set to 0, statistics are not updated, and output from rcapstat is not current. |
| config | 60 | Number of seconds between reconfigurations. In a reconfiguration event, rcapd reads the configuration file for updates and scans the project database for new or revised project caps. Sending a SIGHUP to rcapd causes an immediate reconfiguration. |
To tune intervals, see How to Set Operation Intervals.
The scan interval controls how often rcapd looks for new processes. On systems with many processes running, the scan through the list takes more time, so it might be preferable to lengthen the interval in order to reduce the overall CPU time spent. However, the scan interval also represents the minimum amount of time that a process must exist to be attributed to a capped workload. If there are workloads that run many short-lived processes, rcapd might not attribute the processes to a workload if the scan interval is lengthened.
The sample interval configured with rcapadm is the shortest amount of time rcapd waits between sampling a workload's usage and enforcing the cap if it is exceeded. If you reduce this interval, rcapd will, under most conditions, enforce caps more frequently, possibly resulting in increased I/O due to paging. However, a shorter sample interval can also lessen the impact that a sudden increase in a particular workload's physical memory usage might have on other workloads. The window between samplings, in which the workload can consume memory unhindered and possibly take memory from other capped workloads, is narrowed.
If the sample interval specified to rcapstat is shorter than the interval specified to rcapd with rcapadm, the output for some intervals can be zero. This situation occurs because rcapd does not update statistics more frequently than the interval specified with rcapadm. The interval specified with rcapadm is independent of the sampling interval used by rcapstat.
Use rcapstat to monitor the resource utilization of capped projects. To view an example rcapstat report, see Producing Reports With rcapstat.
You can set the sampling interval for the report and specify the number of times that statistics are repeated.
interval
    Specifies the sampling interval in seconds. The default interval is 5 seconds.
count
    Specifies the number of times that the statistics are repeated. By default, rcapstat reports statistics until a termination signal is received or until the rcapd process exits.
The paging statistics in the first report issued by rcapstat show the activity since the daemon was started. Subsequent reports reflect the activity since the last report was issued.
The following table defines the column headings in an rcapstat report.
| rcapstat Column Headings | Description |
|---|---|
| id | The project ID of the capped project. |
| project | The project name. |
| nproc | The number of processes in the project. |
| vm | The total virtual memory size used by processes in the project, in kilobytes (K), megabytes (M), or gigabytes (G). |
| rss | The estimated total resident set size (RSS) of the processes in the project, in kilobytes (K), megabytes (M), or gigabytes (G), not accounting for pages that are shared. |
| cap | The RSS cap defined for the project. See Attribute to Limit Physical Memory Usage or the rcapd(1M) man page for information about how to specify memory caps. |
| at | The total amount of memory that rcapd attempted to page out since the last rcapstat sample. |
| avgat | The average amount of memory that rcapd attempted to page out during each sample cycle that occurred since the last rcapstat sample. The rate at which rcapd samples RSS can be set with rcapadm. See rcapd Operation Intervals. |
| pg | The total amount of memory that rcapd successfully paged out since the last rcapstat sample. |
| avgpg | An estimate of the average amount of memory that rcapd successfully paged out during each sample cycle that occurred since the last rcapstat sample. The rate at which rcapd samples process RSS sizes can be set with rcapadm. See rcapd Operation Intervals. |
This section contains procedures for configuring the resource capping daemon with rcapadm. See rcapd Configuration and the rcapadm(1M) man page for more information.
If used without arguments, rcapadm displays the current status of the resource capping daemon if it has been configured.
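When the daemon has been configured, the status display summarizes its state, intervals, and enforcement threshold. The output below is a sketch of the typical format with hypothetical values; the fields and values on your system depend on the local configuration.

```
# rcapadm
                                      state: enabled
           memory cap enforcement threshold: 0%
                    process scan rate (sec): 15
                 reconfiguration rate (sec): 60
                          report rate (sec): 5
                    RSS sampling rate (sec): 5
```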
Caps can be configured so that they will not be enforced until the physical memory available to processes is low. See Memory Cap Enforcement Threshold for more information.
The minimum (and default) value is 0, which means that memory caps are always enforced. To set a different minimum, follow this procedure.
Become superuser.
Use the -c option of rcapadm to set a different physical memory utilization value for memory cap enforcement.
```
# rcapadm -c percent
```
percent is in the range 0 to 100. Higher values are less restrictive. A higher value means capped project workloads can execute without having caps enforced until the system's memory utilization exceeds this threshold.
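For example, to defer cap enforcement until the system's physical memory utilization exceeds 75 percent (a threshold chosen here purely for illustration):

```
# rcapadm -c 75
```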
To display the current physical memory utilization and the cap enforcement threshold, see Reporting Memory Utilization and the Memory Cap Enforcement Threshold.
rcapd Operation Intervals contains information about the intervals for the periodic operations performed by rcapd. To set operation intervals using rcapadm, follow this procedure.
Become superuser.
Use the -i option to set interval values.
```
# rcapadm -i interval=value,...,interval=value
```
All interval values are specified in seconds.
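According to the rcapadm(1M) man page, the interval names are scan, sample, report, and config. As a sketch, the following sets the RSS sampling interval to 10 seconds and the statistics-reporting interval to 5 seconds; the values are illustrative only:

```
# rcapadm -i sample=10,report=5
```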
There are two ways to enable resource capping on your system.
Become superuser.
Enable the resource capping daemon in one of the following ways:
To enable the resource capping daemon so that it will be started now and also be started each time the system is booted, type:
```
# rcapadm -E
```
To enable the resource capping daemon at boot without starting it now, also specify the -n option:
```
# rcapadm -n -E
```
There are two ways to disable resource capping on your system.
Become superuser.
Disable the resource capping daemon in one of the following ways:
To disable the resource capping daemon so that it will be stopped now and not be started when the system is booted, type:
```
# rcapadm -D
```
To disable the resource capping daemon without stopping it, also specify the -n option:
```
# rcapadm -n -D
```
Use rcapadm -D to safely disable rcapd. If the daemon is killed (see the kill(1) man page), processes might be left in a stopped state and need to be manually restarted. To resume a process running, use the prun command. See the prun(1) man page for more information.
Use rcapstat to report resource capping statistics. Monitoring Resource Utilization With rcapstat explains how to use the rcapstat command to generate reports. That section also describes the column headings in the report. The rcapstat(1) man page also contains this information.
The following subsections use examples to illustrate how to produce reports for specific purposes.
In this example, caps are defined for two projects associated with two users. user1 has a cap of 50 megabytes, and user2 has a cap of 10 megabytes.
The following command produces five reports at 5-second sampling intervals.
```
user1machine% rcapstat 5 5
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
112270 user1       24  123M   35M   50M   50M    0K 3312K    0K
 78194 user2        1 2368K 1856K   10M    0K    0K    0K    0K
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
112270 user1       24  123M   35M   50M    0K    0K    0K    0K
 78194 user2        1 2368K 1856K   10M    0K    0K    0K    0K
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
112270 user1       24  123M   35M   50M    0K    0K    0K    0K
 78194 user2        1 2368K 1928K   10M    0K    0K    0K    0K
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
112270 user1       24  123M   35M   50M    0K    0K    0K    0K
 78194 user2        1 2368K 1928K   10M    0K    0K    0K    0K
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
112270 user1       24  123M   35M   50M    0K    0K    0K    0K
 78194 user2        1 2368K 1928K   10M    0K    0K    0K    0K
```
The first three lines of output constitute the first report, which contains the cap and project information for the two projects and paging statistics since rcapd was started. The at and pg columns show values greater than zero for user1 and zero for user2, which indicates that at some point in the daemon's history, user1 exceeded its cap but user2 did not.
The subsequent reports show no significant activity.
The following example shows project user1, which has an RSS in excess of its RSS cap.
The following command produces five reports at 5-second sampling intervals.
```
user1machine% rcapstat 5 5
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
376565 user1        3 6249M 6144M 6144M  690M  220M 5528K 2764K
376565 user1        3 6249M 6144M 6144M    0M  131M 4912K 1637K
376565 user1        3 6249M 6171M 6144M   27M  147M 6048K 2016K
376565 user1        3 6249M 6146M 6144M 4872M  174M 4368K 1456K
376565 user1        3 6249M 6156M 6144M   12M  161M 3376K 1125K
```
The user1 project has three processes that are actively using physical memory. The positive values in the pg column indicate that rcapd is consistently paging out memory as it attempts to meet the cap by lowering the physical memory utilization of the project's processes. However, rcapd does not succeed in keeping the RSS below the cap value. This is indicated by the varying rss values that do not show a corresponding decrease. As soon as memory is paged out, the workload uses it again and the RSS count goes back up. This means that all of the project's resident memory is being actively used and the working set size (WSS) is greater than the cap. Thus, rcapd is forced to page out some of the working set to meet the cap. Under this condition, the system will continue to experience high page fault rates, and associated I/O, until one of the following occurs:
The WSS becomes smaller.
The cap is raised.
The application changes its memory access pattern.
In this situation, shortening the sample interval might reduce the discrepancy between the RSS value and the cap value by causing rcapd to sample the workload and enforce caps more frequently.
A page fault occurs when either a new page must be created or the system must copy in a page from a swap device.
The following example is a continuation of the previous example, and it uses the same project.
The previous example shows that the user1 project is using more physical memory than its cap allows. This example shows how much memory the project workload requires.
```
user1machine% rcapstat 5 5
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
376565 user1        3 6249M 6144M 6144M  690M    0K  689M    0K
376565 user1        3 6249M 6144M 6144M    0K    0K    0K    0K
376565 user1        3 6249M 6171M 6144M   27M    0K   27M    0K
376565 user1        3 6249M 6146M 6144M 4872K    0K 4816K    0K
376565 user1        3 6249M 6156M 6144M   12M    0K   12M    0K
376565 user1        3 6249M 6150M 6144M 5848K    0K 5816K    0K
376565 user1        3 6249M 6155M 6144M   11M    0K   11M    0K
376565 user1        3 6249M 6150M   10G   32K    0K   32K    0K
376565 user1        3 6249M 6214M   10G    0K    0K    0K    0K
376565 user1        3 6249M 6247M   10G    0K    0K    0K    0K
376565 user1        3 6249M 6247M   10G    0K    0K    0K    0K
376565 user1        3 6249M 6247M   10G    0K    0K    0K    0K
376565 user1        3 6249M 6247M   10G    0K    0K    0K    0K
376565 user1        3 6249M 6247M   10G    0K    0K    0K    0K
376565 user1        3 6249M 6247M   10G    0K    0K    0K    0K
```
Halfway through the cycle, the cap on the user1 project was increased from 6 gigabytes to 10 gigabytes. This increase stops cap enforcement and allows the resident set size to grow, limited only by other processes and the amount of memory in the machine. The rss column might stabilize to reflect the project working set size (WSS), 6247M in this example. This is the minimum cap value that allows the project's processes to operate without continually incurring page faults.
The following two figures graphically show the effect rcapd has on user1 while the cap is 6 gigabytes and 10 gigabytes. Every 5 seconds, corresponding to the sample interval, the RSS decreases and I/O increases as rcapd pages out some of the workload's memory. Shortly after the page out completes, the workload, needing those pages, pages them back in as it continues running. This cycle repeats until the cap is raised to 10 gigabytes approximately halfway through the example, and the RSS stabilizes at 6.1 gigabytes. Since the workload's RSS is now below the cap, no more paging occurs. The I/O associated with paging stops as well, as the vmstat (see vmstat(1M)) or iostat (see iostat(1M)) commands would show. Thus, you can infer that the project required 6.1 gigabytes to perform the work it was doing at the time it was being observed.
You can use the -g option of rcapstat to report the following:
Current physical memory utilization as a percentage of physical memory installed on the system
System memory cap enforcement threshold set by rcapadm
The -g option causes a memory utilization and cap enforcement line to be printed at the end of the report for each interval.
```
# rcapstat -g
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
376565 rcap         0    0K    0K   10G    0K    0K    0K    0K
physical memory utilization: 55%   cap enforcement threshold: 0%
    id project  nproc    vm   rss   cap    at avgat    pg avgpg
376565 rcap         0    0K    0K   10G    0K    0K    0K    0K
physical memory utilization: 55%   cap enforcement threshold: 0%
```
This chapter discusses resource pools, which are used for partitioning machine resources. Resource pools enable you to separate workloads so that workload consumption of certain resources does not overlap. This resource reservation helps to achieve predictable performance on systems with mixed workloads.
Resource pools provide a persistent configuration mechanism for processor set configuration and, optionally, scheduling class assignment.
By grouping multiple partitions, pools provide a handle to associate with labeled workloads. Each project entry in the /etc/project database can have a pool associated with it. New work that is started on a project is bound to the appropriate pool.
The pools mechanism is primarily for use on large machines of more than four CPUs. However, small machines can still benefit from this functionality. On small machines, you can create pools that share noncritical resource partitions. The pools are separated only on the basis of critical resources.
Resource pools offer a versatile mechanism that can be applied to many administrative scenarios, as described in the following sections.
Use pools functionality to split a server into two pools.
One pool is used for login sessions and interactive work by timesharing users. The other pool is used for jobs that are submitted through the batch system.
Partition the resources for interactive applications in accordance with the applications' requirements.
Set user expectations.
You might initially deploy a machine that is running only a fraction of the services that the machine is ultimately expected to deliver. User difficulties can occur if reservation-based resource management mechanisms are not established when the machine comes online.
For example, the fair share scheduler optimizes CPU utilization. The response times for a machine that is running only one application can be misleadingly fast when compared to the response times users see with multiple applications loaded. By using separate pools for each application, you can ensure that a ceiling on the number of CPUs available to each application is in place before all applications are deployed.
Partition a server that supports large user populations.
Server partitioning provides an isolation mechanism that leads to a more predictable per-user response.
By dividing users into groups that bind to separate pools, and using the fair share scheduling (FSS) facility, you can tune CPU allocations to favor sets of users that have priority. This assignment can be based on user role, accounting chargeback, and so forth.
Use resource pools to adjust to changing demand.
Your site might experience predictable shifts in workload demand over long periods of time, such as monthly, quarterly, or annual cycles. If your site experiences these shifts, you can alternate between multiple pools configurations by invoking pooladm from a cron(1M) job.
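As a sketch, a root crontab could commit an alternate configuration file at each shift boundary. The file names and times below are hypothetical; pooladm -c accepts a filename operand in place of the default /etc/pooladm.conf.

```
# Switch pools configurations at 06:00 and 18:00 (hypothetical files)
0 6  * * * /usr/sbin/pooladm -c /etc/pools/daytime.conf
0 18 * * * /usr/sbin/pooladm -c /etc/pools/nighttime.conf
```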
Create a real-time pool by using the RT scheduler and designated processor resources.
The commands that are shown in the following table provide the primary administrative interface to the pools facility.
| Command | Description |
|---|---|
| pooladm(1M) | Activates a particular configuration or deactivates the current configuration. If run without options, pooladm prints out the current running pools configuration. |
| poolbind(1M) | Enables the manual binding of projects, tasks, and processes to a pool. |
| poolcfg(1M) | Creates and modifies pools configuration files. If run with the info subcommand argument to the -c option, poolcfg displays the current configuration. |
A library API is provided by libpool(3LIB). The library can be used by programs to manipulate pool configurations.
The resource pools framework stores its view of the machine in a private configuration file. (The location of the file is private to the implementation of the pools framework.) This configuration file represents the pools framework's view of the machine. The file also contains information about configured pools and the organization of partitionable resources. Each pool can contain the following:
A reference to a processor set or a CPU resource partition
A property that specifies the pool's default scheduling class
Pools can be implemented on a system by using one of these methods.
When the Solaris software boots, an init script checks to see if the /etc/pooladm.conf file exists. If this file is found, then pooladm is invoked to make this configuration the active pools configuration. The system creates a private configuration file to reflect the organization that is requested in /etc/pooladm.conf, and the machine's resources are partitioned accordingly.
When the Solaris environment is running, a pools configuration can either be activated if it is not already present, or modified by using the pooladm command. By default, pooladm operates on /etc/pooladm.conf. However, you can optionally specify an alternate location and file name, and use this file to update the pools configuration.
Dynamic reconfiguration (DR) enables you to reconfigure hardware while the system is running. Because DR affects available resource amounts, the pools facility must be included in these operations. When a DR operation is initiated, the pools framework acts to validate the configuration.
If the DR operation can proceed without causing the current pools configuration to become invalid, then the private configuration file is updated. An invalid configuration is one that cannot be supported by the available resources.
If the DR operation would cause the pools configuration to be invalid, then the operation fails and you are notified by a message to the message log. If you want to force the configuration to completion, you must use the DR force option. The pools configuration is then modified to comply with the new resource configuration.
The configuration file contains a description of the pools to be created on the system. The file describes the entities and resource types that can be manipulated.
| Type | Description |
|---|---|
| pset | A processor set resource |
| pool | A named collection of resource associations |
| system | The machine-level entity |
See poolcfg(1M) for more information on the elements that can be manipulated.
You can create a structured /etc/pooladm.conf file in two ways.
You can use poolcfg to discover the resources on the current system and place the results in a configuration file.
This method simplifies file construction. All active resources and components on the system that are capable of being manipulated by the pools facility are recorded. The resources include existing processor set configurations. You can then modify the configuration to rename the processor sets or to create additional pools if necessary.
You can use poolcfg to create a new pools configuration.
Use this method when you develop configurations for other machines or when you create configurations that you want to apply to the current machine at a later time.
Use poolcfg or libpool to modify the /etc/pooladm.conf file. Do not directly edit this file.
Use the discover subcommand argument to the -c option of /usr/sbin/poolcfg to create the pools configuration file. The resulting file, /etc/pooladm.conf, contains any existing processor sets.
You can also supply a file name to use instead of the default /etc/pooladm.conf. If the file name is supplied, then the poolcfg commands are applied to the contents of the named file.
For example, to place a discovered configuration in /tmp/foo, do the following:
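Assuming the standard poolcfg synopsis, in which an optional filename operand follows the -c command, the invocation would be:

```
# poolcfg -c discover /tmp/foo
```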
Use the create subcommand argument to the -c option of /usr/sbin/poolcfg to create a simple configuration file for a system named tester. Note that you must quote subcommand arguments that contain white space.
Become superuser.
Type the following:
```
# poolcfg -c 'create system tester'
```
View the contents of the configuration file in readable form.
```
# poolcfg -c info
system tester
        int     system.version 1
        boolean system.bind-default true
        string  system.comment
```
To enhance your simple configuration, create a processor set named batch and a pool named batch. Then join them with an association. Note that you must quote subcommand arguments that contain white space.
Become superuser.
Create processor set batch.
```
# poolcfg -c 'create pset batch (uint pset.min = 2; uint pset.max = 10)'
```
Create pool batch.
```
# poolcfg -c 'create pool batch'
```
Join with an association.
```
# poolcfg -c 'associate pool batch (pset batch)'
```
Display the edited configuration.
```
# poolcfg -c info
system tester
        int     system.version 1
        boolean system.bind-default true
        string  system.comment

        pool batch
                boolean pool.default false
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                pset    batch

        pset batch
                int     pset.sys_id -2
                string  pset.units population
                boolean pset.default true
                uint    pset.max 10
                uint    pset.min 2
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0
```
You can associate a pool with a scheduling class so that all processes bound to the pool use this scheduler. To do this, set the pool.scheduler property to the name of the scheduler class. This example shows how to associate the pool batch with the FSS.
Become superuser.
Modify pool batch to be associated with the FSS.
```
# poolcfg -c 'modify pool batch (string pool.scheduler="FSS")'
```
Display the edited configuration.
```
# poolcfg -c info
system tester
        int     system.version 1
        boolean system.bind-default true
        string  system.comment

        pool batch
                boolean pool.default false
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                string  pool.scheduler FSS
                pset    batch

        pset batch
                int     pset.sys_id -2
                string  pset.units population
                boolean pset.default true
                uint    pset.max 10
                uint    pset.min 2
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0
```
poolcfg -f can take input from a text file that contains poolcfg subcommand arguments to the -c option. This technique is appropriate when you want a set of operations to be performed atomically. When processing multiple commands, the configuration is only updated if all of the commands succeed. For large or complex configurations, this technique can be more useful than per-subcommand invocations.
Create the input file.
```
$ cat > poolcmds.txt
create system tester
create pset batch (uint pset.min = 2; uint pset.max = 10)
create pool batch
associate pool batch (pset batch)
```
Become superuser.
Type the following:
```
# /usr/sbin/poolcfg -f poolcmds.txt
```
Use pooladm(1M) to make a particular pool configuration active or to remove an active pools configuration.
To activate the configuration in the default static configuration file, /etc/pooladm.conf, invoke pooladm with the -c option, “commit configuration.”
To remove the running configuration and all associated resources, such as processor sets, use the -x option for “remove configuration.”
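The two invocations look like this; pooladm with no filename operand operates on /etc/pooladm.conf:

```
# pooladm -c        # activate the configuration in /etc/pooladm.conf
# pooladm -x        # remove the running configuration and its resources
```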
The -x option to pooladm removes the dynamic private configuration file as well as all resource configurations that are associated with the dynamic configuration. Thus, the -x option provides a mechanism for recovering from a poorly designed pools configuration. After removal, all processes share all of the resources on the machine.
Mixing scheduling classes within one processor set can lead to unpredictable results. If you use pooladm -x to recover from a bad configuration, you should then use priocntl(1) to move running processes into a different scheduling class.
You can bind a running process to a pool in two ways.
You can use the poolbind(1M) command to bind a specific process to a named resource pool.
You can use the project.pool attribute in the project(4) database to identify the pool binding for a new login session or a task that is launched through newtask(1).
The following procedure manually binds a process (for example, the current shell) to a pool named ohare.
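Become superuser, then pass the shell's own process ID to poolbind; $$ expands to the PID of the current shell:

```
# poolbind -p ohare $$
```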
To bind tasks or projects to a pool, use poolbind with the -i option. The following example binds all processes in the airmiles project to the laguardia pool.
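Sketched with the -i idtype option, the command would be:

```
# poolbind -i project -p laguardia airmiles
```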
To automatically bind new processes in a project to a pool, add the project.pool attribute to each entry in the project database.
For example, assume you have a configuration with two pools that are named studio and backstage. The /etc/project file has the following contents.
```
user.paul:1024::::project.pool=studio
user.george:1024::::project.pool=studio
user.ringo:1024::::project.pool=backstage
passes:1027::paul::project.pool=backstage
```
With this configuration, processes that are started by user paul are bound by default to the studio pool.
Using the previous configuration, user paul can modify the pool binding for processes he starts. He can use newtask to bind work to the backstage pool as well, by launching in the passes project.
Launch a process in the passes project.
```
$ newtask -l -p passes
```
Verify the pool binding for the process.
```
$ poolbind -q $$
process id 6384 : pool 'backstage'
```
This chapter reviews the resource management framework and describes a hypothetical server consolidation project. In this example, five applications are being consolidated onto a single system. The target applications have resource requirements that vary, different user populations, and different architectures.
Currently, each application exists on a dedicated server that is designed to meet the requirements of the application. The applications and their characteristics are identified in the following table.
| Application Description | Characteristics |
|---|---|
| Application server | Exhibits negative scalability beyond 2 CPUs |
| Database instance for application server | Heavy transaction processing |
| Application server in test and development environment | GUI-based, with untested code execution |
| Transaction processing server | Primary concern is response time |
| Standalone database instance | Processes a large number of transactions and serves multiple time zones |
The following configuration is used to consolidate the applications onto a single system.
The application server has a two–CPU processor set.
The database instance for the application server and the standalone database instance are consolidated onto a single processor set of at least four CPUs. The standalone database instance is guaranteed 75 percent of that resource.
The test and development application server requires the IA scheduling class to ensure UI responsiveness. Memory limitations are imposed to lessen the effects of bad code builds.
The transaction processing server is assigned a dedicated processor set of at least two CPUs, to minimize response latency.
Edit the project database file. Add entries to implement the required resource controls and to map users to resource pools, and then view the file.
```
# cat /etc/project
.
.
.
user.app_server:2001:Production Application Server:::project.pool=appserver_pool
user.app_db:2002:App Server DB:::project.pool=db_pool;project.cpu-shares=(privileged,1,deny)
development:2003:Test and development::staff:project.pool=dev_pool;process.max-address-space=(privileged,536870912,deny)
user.tp_engine:2004:Transaction Engine:::project.pool=tp_pool
user.geo_db:2005:EDI DB:::project.pool=db_pool;project.cpu-shares=(privileged,3,deny)
.
.
.
```
The development team has to execute tasks in the development project because access for this project is based on a user's group ID (GID).
Create an input file named pool.host, which will be used to configure the required resource pools. View the file.
```
# cat pool.host
create system host
create pset default_pset (uint pset.min = 1)
create pset dev_pset (uint pset.max = 2)
create pset tp_pset (uint pset.min = 2)
create pset db_pset (uint pset.min = 4; uint pset.max = 6)
create pset app_pset (uint pset.min = 1; uint pset.max = 2)
create pool default_pool (string pool.scheduler="TS"; boolean pool.default = true)
create pool dev_pool (string pool.scheduler="IA")
create pool appserver_pool (string pool.scheduler="TS")
create pool db_pool (string pool.scheduler="FSS")
create pool tp_pool (string pool.scheduler="TS")
associate pool default_pool (pset default_pset)
associate pool dev_pool (pset dev_pset)
associate pool appserver_pool (pset app_pset)
associate pool db_pool (pset db_pset)
associate pool tp_pool (pset tp_pset)
```
Type the following:
```
# poolcfg -f pool.host
```
Make the configuration active.
```
# pooladm -c
```
The framework is now functional on the system.
To view the framework configuration, type:
```
# pooladm
system host
        int     system.version 1
        boolean system.bind-default true
        string  system.comment

        pool default_pool
                boolean pool.default true
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                string  pool.scheduler TS
                pset    default_pset

        pool dev_pool
                boolean pool.default false
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                string  pool.scheduler IA
                pset    dev_pset

        pool appserver_pool
                boolean pool.default false
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                string  pool.scheduler TS
                pset    app_pset

        pool db_pool
                boolean pool.default false
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                string  pool.scheduler FSS
                pset    db_pset

        pool tp_pool
                boolean pool.default false
                boolean pool.active true
                int     pool.importance 1
                string  pool.comment
                string  pool.scheduler TS
                pset    tp_pset

        pset default_pset
                int     pset.sys_id -1
                string  pset.units population
                boolean pset.default true
                uint    pset.max 4294967295
                uint    pset.min 1
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0

        pset dev_pset
                int     pset.sys_id 1
                string  pset.units population
                boolean pset.default false
                uint    pset.max 2
                uint    pset.min 0
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0

        pset tp_pset
                int     pset.sys_id 2
                string  pset.units population
                boolean pset.default false
                uint    pset.max 4294967295
                uint    pset.min 2
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0

        pset db_pset
                int     pset.sys_id 3
                string  pset.units population
                boolean pset.default false
                uint    pset.max 6
                uint    pset.min 4
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0

        pset app_pset
                int     pset.sys_id 4
                string  pset.units population
                boolean pset.default false
                uint    pset.max 2
                uint    pset.min 1
                string  pset.comment
                boolean pset.escapable false
                uint    pset.load 0
                uint    pset.size 0
```
A graphic representation of the framework follows.
In the db_pool, the standalone database instance is guaranteed 75 percent of the CPU resource.
This chapter describes the resource control and performance monitoring features in the Solaris Management Console.
You can use the console to monitor system performance and to enter resource control values for projects, tasks, and processes. The console provides a convenient, secure alternative to the command-line interface (CLI) for managing hundreds of configuration parameters that are spread across many systems. Each system is managed individually. The console's graphical interface supports all experience levels.
| Task | Description | For Instructions |
|---|---|---|
| Use the console | Start the Solaris Management Console in a local environment or in a name service or directory service environment. Note that the performance tool is not available in a name service environment. | "Starting the Solaris Management Console" in System Administration Guide: Basic Administration and "Using the Solaris Management Tools in a Name Service Environment (Task Map)" in System Administration Guide: Basic Administration |
| Monitor system performance | Access the Performance tool under System Status. | |
| Add resource controls to projects | Access the Resource Controls tab under System Configuration. | |
Resource management functionality is a component of the Solaris Management Console. The console is a container for GUI-based administrative tools that are stored in collections called toolboxes. For information on the console and how to use it, see “Working With the Management Console (Tasks)” in System Administration Guide: Basic Administration.
When you use the console and its tools, the main source of documentation is the online help system in the console itself. For a description of the documentation available in the online help, see “Solaris Management Console (Overview)” in System Administration Guide: Basic Administration.
The term management scope refers to the name service environment that you choose to use with the selected management tool. The management scope choices for the resource control and performance tools are the /etc/project local file or NIS.
The management scope that you select during a console session should correspond to the primary name service that is identified in the /etc/nsswitch.conf file.
The Performance tool is used to monitor resource utilization. Resource utilization can be summarized for the system, viewed by project, or viewed for an individual user.
The Performance tool is located under System Status in the Navigation pane. To access the Performance tool, do the following:
Click the System Status control entity in the Navigation pane.
The control entity is used to expand menu items in the Navigation pane.
Click the Performance control entity.
Click the System control entity.
Double-click Summary, Projects, or Users.
Your choice depends on the usage you want to monitor.
Values are shown for the following attributes.
| Attribute | Description |
|---|---|
| Active Processes | Number of processes that are active on the system |
| Physical Memory Used | Amount of system memory that is in use |
| Physical Memory Free | Amount of system memory that is available |
| Swap Used | Amount of system swap space that is in use |
| Swap Free | Amount of free system swap space |
| Page Rate | Rate of system paging activity |
| System Calls | Number of system calls per second |
| Network Packets | Number of network packets that are transmitted per second |
| CPU Usage | Percentage of CPU that is currently in use |
| Load Average | Number of processes in the system run queue, averaged over the last 1, 5, and 15 minutes |
Values are shown for the following attributes.
| Attribute | Short Name | Description |
|---|---|---|
| Input Blocks | inblk | Number of blocks read |
| Blocks Written | oublk | Number of blocks written |
| Chars Read/Written | ioch | Number of characters read and written |
| Data Page Fault Sleep Time | dftime | Amount of time spent processing data page faults |
| Involuntary Context Switches | ictx | Number of involuntary context switches |
| System Mode Time | stime | Amount of time spent in kernel mode |
| Major Page Faults | majfl | Number of major page faults |
| Messages Received | mrcv | Number of messages received |
| Messages Sent | msend | Number of messages sent |
| Minor Page Faults | minf | Number of minor page faults |
| Num Processes | nprocs | Number of processes owned by the user or the project |
| Num LWPs | count | Number of lightweight processes |
| Other Sleep Time | slptime | Sleep time other than tftime, dftime, kftime, and ltime |
| CPU Time | pctcpu | Percentage of recent CPU time used by the process, the user, or the project |
| Memory Used | pctmem | Percentage of system memory used by the process, the user, or the project |
| Heap Size | brksize | Amount of memory allocated for the process data segment |
| Resident Set Size | rsssize | Current amount of memory claimed by the process |
| Process Image Size | size | Size of the process image in Kbytes |
| Signals Received | sigs | Number of signals received |
| Stopped Time | stoptime | Amount of time spent in the stopped state |
| Swap Operations | swaps | Number of swap operations in progress |
| System Calls Made | sysc | Number of system calls made over the last time interval |
| System Page Fault Sleep Time | kftime | Amount of time spent processing page faults |
| System Trap Time | ttime | Amount of time spent processing system traps |
| Text Page Fault Sleep Time | tftime | Amount of time spent processing text page faults |
| User Lock Wait Sleep Time | ltime | Amount of time spent waiting for user locks |
| User Mode Time | utime | Amount of time spent in user mode |
| User and System Mode Time | time | Cumulative CPU execution time |
| Voluntary Context Switches | vctx | Number of voluntary context switches |
| Wait CPU Time | wtime | Amount of time spent waiting for CPU (latency) |
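Many of these attributes correspond to the process microstate accounting data that can also be inspected from the command line with prstat(1M). A hedged sketch (Solaris-specific commands; the exact columns reported depend on your Solaris release):

```
# Report microstate accounting data (-m), summarized by project (-J).
prstat -m -J

# Report the same data summarized by user instead.
prstat -m -a
```

The command-line output is useful for scripting or for monitoring a system on which the console is not available.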
Resource controls allow you to associate a project with a set of resource constraints. These constraints determine the allowable resource usage of tasks and processes that run in the context of the project.
The Resource Controls tab is located under System Configuration in the Navigation pane. To access Resource Controls, do the following:
1. Click the System Configuration control entity in the Navigation pane.
2. Double-click Projects.
3. Click a project in the console main window to select it.
4. Select Properties from the Action menu.
5. Click the Resource Controls tab.
6. View, add, edit, or delete resource control values for processes, projects, and tasks.
To view the list of available resource controls, see About Resource Controls in the console or Available Resource Controls.
You can view, add, edit, or delete resource control values for processes, projects, and tasks. These operations are performed through dialog boxes in the console.
Resource controls and values are viewed in tables in the console. The Resource Control column lists the resource controls that can be set. The Value column displays the properties that are associated with each resource control. In the table, these values are enclosed in parentheses, and they appear as plain text separated by commas. The values in parentheses comprise an “action clause.” Each action clause is composed of a threshold, a privilege level, one signal, and one local action that is associated with the particular threshold. Each resource control can have multiple action clauses, which are also separated by commas.
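The same action-clause syntax appears in the project database itself. As an illustrative sketch, the following hypothetical `/etc/project` entry (project name and ID invented for this example) sets a privileged threshold of 100 LWPs per task and denies requests beyond it:

```
# Hypothetical /etc/project entry. Fields are
# name:projid:comment:user-list:group-list:attributes.
# The action clause (privileged,100,deny) caps each task at 100 LWPs.
x-files:101::::task.max-lwps=(privileged,100,deny)
```

Edits made in the console's Resource Controls tab are written to the project database in this format.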
On a running system, values that are altered in the project database through the console take effect only for new tasks that are started in a project.
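Because altered values apply only to new tasks, an existing session can pick up the new values by starting a new task in the project with newtask(1). A sketch, using a hypothetical project name:

```
# Start a new shell as a new task in project "x-files" (hypothetical
# name), so that the updated resource control values apply to it.
newtask -p x-files
```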
For information on projects and tasks, see Chapter 5, Projects and Tasks. For information on resource controls, see Chapter 7, Resource Controls. For information on the fair share scheduler (FSS), see Chapter 8, Fair Share Scheduler.