Sun N1 Grid Engine 6.1 Administration Guide

Resource Quota Overview

To prevent users from consuming all available resources, the N1 Grid Engine 6.1 software supports complex attributes that you can configure at the global, queue, or host layer. This layered approach to resource management is powerful, but it leaves gaps that become particularly important in large installations with many different custom resources, user groups, and projects. The resource quota feature closes these gaps: in such enterprise environments, it lets you control which project or department must yield when a single bottleneck resource runs out.

The resource quota feature enables you to apply limits to several kinds of resources, to several kinds of resource consumers, to all jobs in the cluster, and to combinations of consumers. In this context, a resource is any complex attribute defined in the N1 Grid Engine configuration. For more information about complex attributes, see the complex(5) man page. Resources can be built-in attributes such as slots, arch, mem_total, num_proc, and swap_total, or any custom-defined resource such as compiler_license. Resource consumers are (per) users, (per) queues, (per) hosts, (per) projects, and (per) parallel environments.

The resource quota feature lets you limit the resources that a consumer can use at any time. Such limits are an indirect way to prioritize users, departments, and projects. To define priorities directly, that is, the order in which users obtain a resource, use the resource urgency and share-based policies described in Configuring the Urgency Policy and Configuring the Share-Based Policy.

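To define resource quotas in the N1 Grid Engine 6.1 software, use the qconf command or the QMON graphical interface. To display the resource quotas that are in effect and how much of each is currently used, use the qquota command. For more information, see the qquota(1) and qconf(1) man pages.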

About Resource Quota Sets

Resource quota sets enable you to specify the maximum resource consumption for any job request. Once you define resource quota sets, the scheduler uses them when it selects the next jobs to run, ensuring that no quota is exceeded. As a result, only jobs that stay within their resource quotas are scheduled and run.

A resource quota set defines a maximum resource quota for a particular job request. All configured resource quota sets apply at all times. If multiple resource quota sets are defined, the most restrictive set determines the limit. Every resource quota set consists of one or more resource quota rules. Within a set, the rules are evaluated in order, and the first rule that matches a specific request is used. A resource quota set therefore yields at most one effective resource quota rule for a specific request.
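
The evaluation order described above can be modeled in a few lines of Python. This is a simplified sketch, not the scheduler's actual implementation: the names Rule, first_match, and effective_limit are hypothetical, rule filters are reduced to plain sets of user names, and each rule limits a single numeric resource.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Rule:
    # Hypothetical, simplified rule: None means "matches every user".
    users: Optional[FrozenSet[str]]
    limit: int


def first_match(rule_set: List[Rule], user: str) -> Optional[int]:
    """Return the limit of the first matching rule in one set, or None."""
    for rule in rule_set:
        if rule.users is None or user in rule.users:
            return rule.limit
    return None


def effective_limit(rule_sets: List[List[Rule]], user: str) -> Optional[int]:
    """All sets apply; the most restrictive (smallest) matching limit wins."""
    limits = [first_match(rule_set, user) for rule_set in rule_sets]
    limits = [value for value in limits if value is not None]
    return min(limits) if limits else None


# Made-up sets: set_a special-cases user1 before a catch-all rule.
set_a = [Rule(frozenset({"user1"}), 20), Rule(None, 5)]
set_b = [Rule(None, 10)]

print(effective_limit([set_a, set_b], "user1"))  # 10
print(effective_limit([set_a, set_b], "user2"))  # 5
```

In set_a, user1 matches the first rule, so the catch-all rule below it is never consulted for that user. Because set_b also applies, its lower limit is the one that takes effect for user1.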

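A resource quota set consists of the following information, as the sample below illustrates: a name, a description, an enabled flag, and one or more limit rules.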


Example 6–1 Sample Resource Quota Set

The following example resource quota set restricts user1 and user2 to two gigabytes of free virtual memory (virtual_free) on each host in the host group lx_hosts.

     {
        name         max_virtual_free_on_lx_hosts
        description  "resource quota for virtual_free restriction"
        enabled      true
        limit        users {user1,user2} hosts {@lx_hosts} to virtual_free=2g
     }

Static and Dynamic Resource Quotas

Resource quota rules always define a maximum value of a resource that can be used. In most cases, these values are static and the same for every scope that the rule's filters match. You could define several rules to apply different values to different scopes, but you would then have several rules that are nearly identical. Rather than duplicating rules, you can define a dynamic limit.

A dynamic limit uses an algebraic expression to derive the rule limit value. The algebraic formula can reference a complex attribute whose value is used to calculate the resulting limit.


Example 6–2 Dynamic Limit Example

The following example illustrates the use of dynamic limits. Users are allowed to use 5 slots per CPU on all Linux hosts.


     limit hosts {@linux_hosts} to slots=$num_proc*5

The value of num_proc is the number of processors on the host. The limit is calculated by the formula $num_proc*5, and can be different on each host. Expanding the example above, a host with two processors gets a limit of 10 slots, and a host with one processor gets a limit of 5 slots.

Instead of num_proc, you could use any other complex attribute known for a host as either a load value or a consumable resource.
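
The arithmetic behind the dynamic limit in Example 6–2 can be sketched in Python. The host names and processor counts below are made up for illustration, and dynamic_slot_limit is a hypothetical helper, not part of the N1 Grid Engine software.

```python
def dynamic_slot_limit(num_proc: int, factor: int = 5) -> int:
    """Evaluate the formula slots=$num_proc*5 for a single host."""
    return num_proc * factor


# Hypothetical members of @linux_hosts with their num_proc values.
linux_hosts = {"hostA": 1, "hostB": 2, "hostC": 4}

# One rule, but a different resulting limit on each host.
limits = {host: dynamic_slot_limit(n) for host, n in linux_hosts.items()}

for host, slots in limits.items():
    print(f"{host}: slots={slots}")
```

A single rule thus replaces one static rule per host size, and a host added to the group later picks up a limit that fits its processor count without any change to the rule.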