This section describes how to configure policies to manage cluster resources.
The grid engine software orchestrates the delivery of computational power, based on enterprise resource policies that the administrator manages. The system uses these policies to examine available computer resources in the grid. The system gathers these resources, and then it allocates and delivers them automatically, in a way that optimizes usage across the grid.
To enable cooperation in the grid, project owners must do the following:
Negotiate policies
Ensure that policies for manual overrides for unique project requirements are flexible
Automatically monitor and enforce policies
As administrator, you can define high-level usage policies that are customized for your site. Four such policies are available:
Urgency policy – See Configuring the Urgency Policy
Share-based policy – See Configuring the Share-Based Policy
Functional policy – See Configuring the Functional Policy
Override policy – See Configuring the Override Policy
Policy management automatically controls the use of shared resources in the cluster to achieve your goals. High-priority jobs are dispatched preferentially. These jobs receive greater CPU entitlements when they are competing with other, lower-priority jobs. The grid engine software monitors the progress of all jobs. It adjusts their relative priorities correspondingly, and with respect to the goals that you define in the policies.
This policy-based resource allocation grants each user, team, department, and all projects an allocated share of system resources. This allocation of resources extends over a specified period of time, such as a week, a month, or a quarter.
On the QMON Main Control window, click the Policy Configuration button. The Policy Configuration dialog box appears.
The Policy Configuration dialog box shows the following information:
Policy Importance Factor
Urgency Policy
Ticket Policy. You can readjust the policy-related tickets.
From this dialog box you can access specific configuration dialog boxes for the three ticket-based policies.
Before the grid engine system dispatches jobs, the jobs are brought into priority order, highest priority first. Without any administrator influence, the order is first-in-first-out (FIFO).
On the Policy Configuration dialog box, under Policy Importance Factor, you can specify the relative importance of the three priority types that control the sorting order of jobs:
Priority. Also called POSIX priority. The -p option of the qsub command specifies site-specific priority policies.
Urgency Policy. Jobs can have an urgency value that determines their relative importance. Pending jobs are sorted according to their urgency value.
Ticket Policy. Jobs are always treated according to their relative importance as defined by the number of tickets that the jobs have. Pending jobs are sorted in ticket order.
For more information about job priorities, see Job Sorting.
You can specify a weighting factor for each priority type. This weighting factor determines the degree to which each type of priority affects overall job priority. To make it easier to control the range of values for each priority type, normalized values are used instead of the raw ticket values, urgency values, and POSIX priority values.
The following formula expresses how a job's priority values are determined:
Job priority = Urgency * normalized urgency value + Ticket * normalized ticket value + Priority * normalized priority value
Urgency, Ticket, and Priority are the three weighting factors you specify under Policy Importance Factor. For example, if you specify Priority as 1, Urgency as 0.1, and Ticket as 0.01, job priority that is specified by the qsub -p command is given the most weight, job priority that is specified by the Urgency Policy is considered next, and job priority that is specified by the Ticket Policy is given the least weight.
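The weighting can be illustrated with a short Python sketch. It assumes that each normalized value lies between 0 and 1, which this section does not state explicitly, and the function and parameter names are hypothetical. The default weights are the ones from the example above.

```python
def job_priority(norm_urgency, norm_tickets, norm_posix,
                 w_urgency=0.1, w_ticket=0.01, w_priority=1.0):
    """Weighted sum of the three normalized priority contributions.

    The [0, 1] range of the normalized inputs is an assumption made
    for this illustration only.
    """
    return (w_urgency * norm_urgency
            + w_ticket * norm_tickets
            + w_priority * norm_posix)


# Job A: highest POSIX priority, but no urgency and no tickets.
job_a = job_priority(norm_urgency=0.0, norm_tickets=0.0, norm_posix=1.0)

# Job B: highest urgency and ticket values, but only a medium POSIX priority.
job_b = job_priority(norm_urgency=1.0, norm_tickets=1.0, norm_posix=0.5)

# With these weights the POSIX priority dominates, so job A sorts first.
print(job_a, job_b, job_a > job_b)
```

With Priority weighted at 1, a job that leads only in POSIX priority still outranks a job that leads in both urgency and tickets.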
The Urgency Policy defines an urgency value for each job. This urgency value is determined by the sum of the following three contributing elements:
Resource requirement. Each resource attribute defined in the complex can have an urgency value. For information about setting urgency values for resource attributes, see Configuring Complex Resource Attributes With QMON. Each job request for a resource attribute adds the attribute's urgency value to the total.
Deadline. The urgency value for deadline jobs is determined by dividing the Weight Deadline specified in the Policy Configuration dialog box by the free time, in seconds, until the job's deadline initiation time specified by the qsub -dl command.
Waiting time. The urgency value for a job's waiting time is determined by multiplying the job's waiting time by the Weight Waiting Time specified in the Policy Configuration dialog box. The job's waiting time is measured in seconds.
For details about how the grid engine system arrives at the urgency value total, see About the Urgency Policy.
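The three contributions listed above can be sketched in Python as follows. This is a minimal illustration that follows only the rules stated in this section. The function name, parameter names, and sample numbers are hypothetical, and the sketch deliberately refuses to guess what happens once a deadline has passed, because this section does not cover that case.

```python
def urgency_value(resource_urgencies, waiting_seconds, weight_waiting_time,
                  seconds_to_deadline=None, weight_deadline=0.0):
    """Sum the resource, waiting-time, and deadline contributions.

    resource_urgencies: urgency values of the requested resource attributes.
    seconds_to_deadline: free time until the deadline initiation time,
        or None for a job without a deadline.
    """
    resource_part = sum(resource_urgencies)
    waiting_part = waiting_seconds * weight_waiting_time

    deadline_part = 0.0
    if seconds_to_deadline is not None:
        if seconds_to_deadline <= 0:
            # Not described in this section, so the sketch does not guess.
            raise ValueError("deadline already reached")
        deadline_part = weight_deadline / seconds_to_deadline

    return resource_part + waiting_part + deadline_part


# Two requested attributes, 10 minutes of waiting, one hour to the deadline.
total = urgency_value([1000, 500], waiting_seconds=600,
                      weight_waiting_time=0.01,
                      seconds_to_deadline=3600, weight_deadline=3600000)
print(total)
```

In this example the contributions are 1500 for the resources, 6 for the waiting time, and 1000 for the deadline.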
The tickets that are currently assigned to individual policies are listed under Current Active Tickets. The numbers reflect the relative importance of the policies. The numbers indicate whether a certain policy currently dominates the cluster or whether policies are in balance.
Tickets provide a quantitative measure. For example, you might assign twice as many tickets to the share-based policy as you assign to the functional policy. This means that the share-based policy is allocated twice the resource entitlement that is allocated to the functional policy. In this sense, tickets behave very much like stock shares.
The total number of all tickets has no particular meaning. Only the relations between policies count. Hence, total ticket numbers are usually quite high to allow for fine adjustment of the relative importance of the policies.
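The following Python sketch illustrates why only the ratio between ticket pools matters. The function name and the ticket numbers are hypothetical.

```python
def policy_ticket_fractions(tickets_by_policy):
    """Return each policy's fraction of the total ticket amount."""
    total = sum(tickets_by_policy.values())
    if total <= 0:
        raise ValueError("at least one policy needs tickets")
    return {policy: tickets / total
            for policy, tickets in tickets_by_policy.items()}


# Same 2:1 ratio, expressed with small and with large ticket numbers.
small = policy_ticket_fractions({"share_based": 200, "functional": 100})
large = policy_ticket_fractions({"share_based": 200000, "functional": 100000})

print(small)
print(large)
```

Both calls yield the same fractions. The larger pool only makes it possible to express finer differences, such as 200000 to 100500.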
Under Edit Tickets, you can modify the number of tickets that are allocated to the share tree policy and the functional policy. For details, see Editing Tickets.
Select the Share Override Tickets check box to control the total ticket amount distributed by the override policy. Clear the check box to control the importance of individual jobs relative to the ticket pools that are available for the other policies and override categories. For detailed information, see Sharing Override Tickets.
Select the Share Functional Tickets check box to give a category member a constant entitlement level for the sum of all its jobs. Clear the check box to give each job the same entitlement level, based on its category member's entitlement. For detailed information, see Sharing Functional Ticket Shares.
You can set the maximum number of jobs that can be scheduled in the functional policy. The default value is 200.
You can set the maximum number of pending subtasks that are allowed for each array job. The default value is 50. Use this setting to reduce scheduling overhead.
You can specify the Ticket Policy Hierarchy to resolve certain cases of conflicting policies. The resolving of policy conflicts applies particularly to pending jobs. For detailed information, see Setting the Ticket Policy Hierarchy.
To refresh the information displayed, click Refresh.
To save any changes that you make to the Policy Configuration, click Apply. To close the dialog box without saving changes, click Done.
You can edit the total number of share-tree tickets and functional tickets. Override tickets are assigned directly through the override policy configuration. The other ticket pools are distributed automatically among jobs that are associated with the policies and with respect to the actual policy configuration.
All share-based tickets and functional tickets are always distributed among the jobs associated with these policies. Override tickets might not be applicable to the currently active jobs. Consequently, the active override tickets might be zero, even though the override policy has tickets defined.
The administrator assigns tickets to the different members of the override categories, that is, to individual users, projects, departments, or jobs. Consequently, the number of tickets that are assigned to a category member determines how many tickets are assigned to jobs under that category member. For example, the number of tickets that are assigned to user A determines how many tickets are assigned to all jobs of user A.
The number of tickets that are assigned to the job category does not determine how many tickets are assigned to jobs in that category.
Use the Share Override Tickets check box to set the share_override_tickets parameter of sched_conf(5). This parameter controls how job ticket values are derived from their category member ticket value. When you select the Share Override Tickets check box, the tickets of the category members are distributed evenly among the jobs under this member. If you clear the Share Override Tickets check box, each job inherits the ticket amount defined for its category member. In other words, the category member tickets are replicated for all jobs underneath.
Select the Share Override Tickets check box to control the total ticket amount distributed by the override policy. With this setting, ticket amounts that are assigned to a job can become negligibly small if many jobs are under one category member. For example, ticket amounts might diminish if many jobs belong to one member of the user category.
Clear the Share Override Tickets check box to control the importance of individual jobs relative to the ticket pools that are available for the other policies and override categories. With this setting, the number of jobs that are under a category member does not matter. The jobs always get the same number of tickets. However, the total number of override tickets in the system increases as the number of jobs with a right to receive override tickets increases. Other policies can lose importance in such cases.
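The two behaviors of the Share Override Tickets setting can be compared with a small Python sketch. The function name and the numbers are hypothetical. The sketch looks at a single category member only.

```python
def override_tickets_per_job(member_tickets, job_count, share_override_tickets):
    """Override tickets that each job under one category member receives.

    share_override_tickets=True: the member's tickets are divided evenly.
    share_override_tickets=False: every job inherits the full amount.
    """
    if job_count <= 0:
        raise ValueError("job_count must be positive")
    if share_override_tickets:
        return member_tickets / job_count
    return member_tickets


member_tickets = 1000
job_count = 4

shared = override_tickets_per_job(member_tickets, job_count, True)
replicated = override_tickets_per_job(member_tickets, job_count, False)

# Per-job amount and total override tickets in the system for this member.
print(shared, shared * job_count)
print(replicated, replicated * job_count)
```

With sharing, the total stays at 1000 and each job gets 250. Without sharing, each job gets 1000 and the total grows to 4000, which is how other policies can lose importance.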
The functional policy defines entitlement shares for the functional categories. Then the policy defines shares for all members of each of these categories. The functional policy is thus similar to a two-level share tree. The difference is that a job can be associated with several categories at the same time. The job belongs to a particular user, for instance, but the job can also belong to a project, a department, and a job class.
However, as in the share tree, the entitlement shares that a job receives from a functional category are determined by the following:
The shares that are defined for its corresponding category member (for example, its project)
The shares that are given to the category (project instead of user, department, and so on)
Use the Share Functional Tickets check box to set the share_functional_shares parameter of sched_conf(5). This parameter defines how the category member shares are used to determine the shares of a job. The shares assigned to the category members, such as a particular user or project, can be replicated for each job. Or shares can be distributed among the jobs under the category member.
Selecting the Share Functional Tickets check box means that functional shares are distributed among jobs.
Clearing the Share Functional Tickets check box means that functional shares are replicated among jobs.
Those shares are comparable to stock shares. Such shares have no effect for the jobs that belong to the same category member. All jobs under the same category member have the same number of shares in both cases. But the share number has an effect when comparing the share amounts within the same category. Jobs with many siblings that belong to the same category member receive relatively small share portions if you select the Share Functional Tickets check box. On the other hand, if you clear the Share Functional Tickets check box, all sibling jobs receive the same share amount as their category member.
Select the Share Functional Tickets check box to give a category member a constant entitlement level for the sum of all its jobs. The entitlement of an individual job can get negligibly small, however, if the job has many siblings.
Clear the Share Functional Tickets check box to give each job the same entitlement level, based on its category member's entitlement. The number of job siblings in the system does not matter.
A category member with many jobs underneath can dominate the functional policy.
Be aware that the setting of share_functional_shares does not determine the total number of functional tickets that are distributed. The total number is always as defined by the administrator for the functional policy ticket pool. The share_functional_shares parameter influences only how functional tickets are distributed within the functional policy.
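The effect of share_functional_shares on the distribution of a fixed ticket pool can be sketched in Python. This is a simplified illustration that assumes a single functional category (user) and a purely proportional split of the pool. The real scheduler combines several categories. All names and numbers are hypothetical.

```python
def functional_tickets(member_shares, jobs_per_member, pool,
                       share_functional_shares):
    """Split a fixed functional ticket pool among jobs of one category."""
    job_shares = {}
    for member, shares in member_shares.items():
        count = jobs_per_member.get(member, 0)
        for index in range(count):
            if share_functional_shares:
                # The member's shares are divided among its jobs.
                job_shares[(member, index)] = shares / count
            else:
                # Every job gets the member's full share amount.
                job_shares[(member, index)] = shares
    total = sum(job_shares.values())
    if total == 0:
        return {job: 0.0 for job in job_shares}
    return {job: pool * shares / total for job, shares in job_shares.items()}


shares = {"user_a": 100, "user_b": 100}
jobs = {"user_a": 4, "user_b": 1}

shared = functional_tickets(shares, jobs, 10000, True)
replicated = functional_tickets(shares, jobs, 10000, False)

print(shared)
print(replicated)
```

In both cases the pool of 10000 tickets is fully distributed. With sharing, user_a's four jobs together hold 5000 tickets, the same as user_b's single job. With replication, user_a's jobs together hold 8000 tickets, which shows how a member with many jobs can dominate.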
The following example describes a common scenario where a user wishes to translate the SGE-5.3 Scheduler Option -user_sort true to an N1GE 6.1 Configuration but does not understand the share override functional policy ticket feature.
For a plain user-based equal share, you configure your global configuration sge_conf(5) with
-enforce_user auto
-auto_user_fshare 100
Then you use -weight_tickets_functional 10000 in the scheduler configuration sched_conf(5). This action causes the functional policy to be used for user-based equal share scheduling with 100 shares for each user.
Pending jobs are sorted according to the number of tickets that each job has, as described in Job Sorting. The scheduler reports the number of tickets each pending job has to the master daemon sge_qmaster. However, on systems with very large numbers of jobs, you might want to turn off ticket reporting. When you turn off ticket reporting, you disable ticket-based job priority. The sort order of jobs is based only on the time each job is submitted.
To turn off the reporting of pending job tickets to sge_qmaster, clear the Report Pending Job Tickets check box on the Policy Configuration dialog box. Doing so sets the report_pjob_tickets parameter of sched_conf(5) to false.
Ticket policy hierarchy provides the means to resolve certain cases of conflicting ticket policies. The resolving of ticket policy conflicts applies particularly to pending jobs.
Such cases can occur in combination with the share-based policy and the functional policy. With both policies, assigning priorities to jobs that belong to the same leaf-level entities is done on a first-come-first-served basis. Leaf-level entities include:
User leaves in the share tree
Project leaves in the share tree
Any member of the following categories in the functional policy: user, project, department, or queue
Members of the job category are not included among leaf-level entities. So, for example, the first job of the same user gets the most, the second gets the next most, the third next, and so on.
A conflict can occur if another policy mandates an order that is different. So, for example, the override policy might define the third job as the most important, whereas the first job that is submitted should come last.
A policy hierarchy might give the override policy priority over the share-tree policy or the functional policy. Such a policy hierarchy ensures that high-priority jobs under the override policy get more entitlements than jobs in the other two policies. Such jobs must belong to the same leaf-level entity (user or project) in the share tree.
The Ticket Policy Hierarchy can be a combination of up to three letters. These letters are the first letters of the names of the following three ticket policies:
S – Share-based
F – Functional
O – Override
Use these letters to establish a hierarchy of ticket policies. The first letter defines the top policy. The last letter defines the bottom of the hierarchy. Policies that are not listed in the policy hierarchy do not influence the hierarchy. Such policies can still be a source of tickets for jobs, but those tickets do not influence the ticket calculations in other policies. All tickets of all policies are added up for each job to define its overall entitlement.
The following examples describe two settings and how they influence the order of the pending jobs.
policy_hierarchy=OS
The override policy assigns the appropriate number of tickets to each pending job.
The number of tickets determines the entitlement assignment in the share tree in case two jobs belong to the same user or to the same leaf-level project. Then the share tree tickets are calculated for the pending jobs.
The tickets from the override policy and from the share-tree policy are added together, along with all other active policies not in the hierarchy. The job with the highest resulting number of tickets has the highest entitlement.
policy_hierarchy=OF
The override policy assigns the appropriate number of tickets to each pending job. Then the tickets from the override policy are added up.
The resulting number of tickets influences the entitlement assignment in the functional policy in case two jobs belong to the same functional category member. Based on this entitlement assignment, the functional tickets are calculated for the pending jobs.
The resulting value is added to the ticket amount from the override policy. The job with the highest resulting number of tickets has the highest entitlement.
All combinations of the three letters are theoretically possible, but only a subset of the combinations are meaningful or have practical relevance. The last letter should always be S or F, because only those two policies can be influenced due to their characteristics described in the examples.
The following form is recommended for policy_hierarchy settings:
[O][S|F]
If the override policy is present, O should occur as the first letter only, because the override policy can only influence. The share-based policy and the functional policy can only be influenced. Therefore S or F should occur as the last letter.
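A site that keeps scheduler settings under version control might want to check a policy_hierarchy value against the recommended form before applying it. The following Python sketch shows one reading of that form: an optional leading O followed by exactly one of S or F. The function name is hypothetical, and the check reflects only the recommendation in this section, not the full set of values that the scheduler accepts.

```python
import re

# Optional "O" first, then exactly one of "S" or "F" as the last letter.
_RECOMMENDED_FORM = re.compile(r"O?[SF]")


def follows_recommended_form(policy_hierarchy):
    """Return True if the value matches the recommended [O][S|F] form."""
    return _RECOMMENDED_FORM.fullmatch(policy_hierarchy) is not None


for value in ("OS", "OF", "S", "F", "SO", "FS"):
    print(value, follows_recommended_form(value))
```

The values OS, OF, S, and F pass the check. SO fails because O does not come first, and FS fails because it lists both of the policies that can only be influenced.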
Share-based scheduling grants each user and project its allocated share of system resources during an accumulation period such as a week, a month, or a quarter. Share-based scheduling is also called share tree scheduling. It constantly adjusts each user's and project's potential resource share for the near term, until the next scheduling interval. Share-based scheduling can be defined for users, for projects, or for both.
Share-based scheduling ensures that a defined share is guaranteed to the instances that are configured in the share tree over time. Jobs that are associated with share-tree branches where fewer resources were consumed in the past than anticipated are preferred when the system dispatches jobs. At the same time, full resource usage is guaranteed, because unused share proportions are still available for pending jobs associated with other share-tree branches.
By giving each user or project its targeted share as far as possible, groups of users or projects also get their targeted share. Departments or divisions are examples of such groups. Fair share for all entities is attainable only when every entity that is entitled to resources contends for those resources during the accumulation period. If a user, a project, or a group does not submit jobs during a given period, the resources are shared among those who do submit jobs.
Share-based scheduling is a feedback scheme. The share of the system to which any user or user-group, or project or project-group, is entitled is a configuration parameter. The share of the system to which any job is entitled is based on the following factors:
The share allocated to the job's user or project
The accumulated past usage for each user and user group, and for each project and project group. This usage is adjusted by a decay factor. “Old” usage has less impact.
The grid engine software keeps track of how much usage users and projects have already received. At each scheduling interval, the Scheduler adjusts all jobs' share of resources. Doing so ensures that all users, user groups, projects, and project groups get close to their fair share of the system during the accumulation period. In other words, resources are granted or are denied in order to keep everyone more or less at their targeted share of usage.
Half-life determines how fast the system “forgets” about a user's resource consumption. The administrator decides whether to penalize a user for high resource consumption, be it six months ago or six days ago. The administrator also decides how to apply the penalty. On each node of the share tree, grid engine software maintains a record of users' resource consumption.
With this record, the system administrator can decide how far to look back to determine a user's underusage or overusage when setting up a share-based policy. The resource usage in this context is the mathematical sum of all the computer resources that are consumed over a “sliding window of time.”
The length of this window is determined by a “half-life” factor, which in the grid engine system is an internal decay function. This decay function reduces the impact of accrued resource consumption over time. A short half-life quickly lessens the impact of resource overconsumption. A longer half-life gradually lessens the impact of resource overconsumption.
The half-life is specified as a unit of time. For example, consider a half-life of seven days that is applied to a resource consumption of 1,000 units. This half-life decay factor results in the following usage “penalty” adjustment over time.
500 after 7 days
250 after 14 days
125 after 21 days
62.5 after 28 days
The half-life-based decay diminishes the impact of a user's resource consumption over time, until the effect of the penalty is negligible.
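The figures in the list above follow from ordinary exponential decay. A minimal Python sketch, with a hypothetical function name, reproduces them:

```python
def decayed_usage(initial_usage, elapsed_days, half_life_days):
    """Usage that still counts after elapsed_days, given a half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return initial_usage * 0.5 ** (elapsed_days / half_life_days)


# 1,000 units of usage with a half-life of seven days.
for days in (7, 14, 21, 28):
    print(days, decayed_usage(1000, days, 7))
```

The same function also gives intermediate values, for example roughly 707 units after three and a half days.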
Override tickets that a user receives are not subjected to a past usage penalty, because override tickets belong to a different policy system. The decay function is a characteristic of the share-tree policy only.
Sometimes the comparison shows that actual usage is well below targeted usage. In such a case, the adjusting of a user's share or a project's share of resource can allow a user to dominate the system. Such an adjustment is based on the goal of reaching target share. This domination might not be desirable.
The compensation factor enables an administrator to limit how much a user or a project can dominate the resources in the near term.
For example, a compensation factor of two limits a user's or project's current share to twice its targeted share. Assume that a user or a project should get 20 percent of the system resources over the accumulation period. If the user or project currently gets much less, the maximum it can get in the near term is only 40 percent.
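The limit in this example can be written as a short Python sketch. The function names are hypothetical, and shares are expressed as fractions of the system, so 0.20 stands for 20 percent.

```python
def near_term_share_limit(target_share, compensation_factor):
    """Upper bound on the near-term share: target multiplied by the factor."""
    return target_share * compensation_factor


def capped_share(adjusted_share, target_share, compensation_factor):
    """Apply the limit to a share that was raised to catch up on usage."""
    limit = near_term_share_limit(target_share, compensation_factor)
    return min(adjusted_share, limit)


# Target of 20 percent and a compensation factor of 2.
print(near_term_share_limit(0.20, 2))

# An adjustment that would otherwise grant 65 percent is held at the limit.
print(capped_share(0.65, 0.20, 2))

# An adjustment below the limit is left unchanged.
print(capped_share(0.30, 0.20, 2))
```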
The share-based policy defines long-term resource entitlements of users or projects as per the share tree. When combined with the share-based policy, the compensation factor makes automatic adjustments in entitlements.
If a user or project is either under or over the defined target entitlement, the grid engine system compensates. The system raises or lowers that user's or project's entitlement for a short term over or under the long-term target. This compensation is calculated by a share tree algorithm.
The compensation factor provides an additional mechanism to control the amount of compensation that the grid engine system assigns. The additional compensation factor (CF) calculation is carried out only if the following conditions are true:
Short-term-entitlement is greater than long-term-entitlement multiplied by the CF
The CF is greater than 0
If either condition is not true, or if both conditions are not true, the compensation as defined and implemented by the share-tree algorithm is used.
The smaller the value of the CF, the greater is its effect. If the value is greater than 1, the grid engine system's compensation is limited. The upper limit for compensation is calculated as long-term-entitlement multiplied by the CF. And as defined earlier, the short-term entitlement must exceed this limit before anything happens based on the compensation factor.
If the CF is 1, the grid engine system compensates in the same way as with the raw share-tree algorithm. So a value of one has an effect that is similar to a value of zero. The only difference is an implementation detail. If the CF is one, the CF calculations are carried out without an effect. If the CF is zero, the calculations are suppressed.
If the value is less than 1, the grid engine system overcompensates. Jobs receive much more compensation than they are entitled to based on the share-tree algorithm. Jobs also receive this overcompensation earlier, because the criterion for activating the compensation is met at lower short-term entitlement values. The activating criterion is short-term-entitlement > long-term-entitlement * CF.
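The activation criterion can be expressed as a small Python predicate. This sketch covers only the two conditions stated above, not the compensation calculation itself, and the function name is hypothetical.

```python
def cf_calculation_applies(short_term_entitlement, long_term_entitlement, cf):
    """True when the additional compensation factor calculation is carried out."""
    return cf > 0 and short_term_entitlement > long_term_entitlement * cf


# CF of 0: the calculation is suppressed regardless of the entitlements.
print(cf_calculation_applies(0.50, 0.20, 0))

# CF of 2: the criterion is met only above twice the long-term entitlement.
print(cf_calculation_applies(0.30, 0.20, 2))
print(cf_calculation_applies(0.50, 0.20, 2))

# CF of 0.5: the criterion is already met at a lower short-term entitlement.
print(cf_calculation_applies(0.15, 0.20, 0.5))
```

The last call illustrates why a CF below 1 takes effect earlier: a short-term entitlement of 0.15 already exceeds 0.20 multiplied by 0.5.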
The share-based policy is implemented through a hierarchical share tree. The share tree specifies, for a moving accumulation period, how system resources are to be shared among all users and projects. The length of the accumulation period is determined by a configurable decay constant. The grid engine system bases a job's share entitlement on the degree to which each parent node in the share tree reaches its accumulation limit. A job's share entitlement is based on its leaf node share allocation, which in turn depends on the allocations of its parent nodes. All jobs associated with a leaf node split the associated shares.
The entitlement derived from the share tree is combined with other entitlements, such as entitlements from a functional policy, to determine a job's net entitlement. The share tree is allotted the total number of tickets for share-based scheduling. This number determines the weight of share-based scheduling among the four scheduling policies.
The share tree is defined during installation. The share tree can be altered at any time. When the share tree is edited, the new share allocations take effect at the next scheduling interval.
On the QMON Policy Configuration dialog box (Figure 5–1), click Share Tree Policy. The Share Tree Policy dialog box appears.
Under Node Attributes, the attributes of the selected node are displayed:
Identifier. A user, project, or agglomeration name.
Shares. The number of shares that are allocated to this user or project.
Shares define relative importance. They are not percentages. Shares also do not have quantitative meaning. The specification of hundreds or even thousands of shares is generally a good idea, as high numbers allow fine tuning of importance relationships.
Level Percentage. This node's portion of the total shares at the level of the same parent node in the tree. The number of this node's shares divided by the sum of its own shares and its siblings' shares.
Total Percentage. This node's portion of the total shares in the entire share tree. The long-term targeted resource share of the node.
Actual Resource Usage. The percentage of all the resources in the system that this node has consumed so far in the accumulation period. The percentage is expressed in relation to all nodes in the share tree.
Targeted Resource Usage. Same as Actual Resource Usage, but only taking the currently active nodes in the share tree into account. Active nodes have jobs in the system. In the short term, the grid engine system attempts to balance the entitlement among active nodes.
Combined Usage. The total usage for the node. Combined Usage is the sum of the usage that is accumulated at this node. Leaf nodes accumulate the usage of all jobs that run under them. Inner nodes accumulate the usage of all descendant nodes. Combined Usage includes CPU, memory, and I/O usage according to the ratio specified under Share Tree Policy Parameters. Combined usage is decayed at the half-life decay rate that is specified by the parameters.
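The Level Percentage and Total Percentage attributes can be illustrated with a Python sketch. The level percentage follows the definition given above. Computing the total percentage as the product of the level percentages along the path from the root is an assumption of this sketch, and the tree, names, and share numbers are hypothetical.

```python
def level_percentage(node_shares, sibling_shares):
    """Node's shares divided by the sum of its own and its siblings' shares."""
    total = node_shares + sum(sibling_shares)
    if total <= 0:
        raise ValueError("shares at this level must sum to a positive number")
    return 100.0 * node_shares / total


def total_percentage(level_percentages_on_path):
    """Assumed: product of the level percentages from the root to the node."""
    fraction = 1.0
    for percentage in level_percentages_on_path:
        fraction *= percentage / 100.0
    return 100.0 * fraction


# Hypothetical tree: project_a (600) and project_b (400) under the root,
# alice (300) and bob (100) under project_a.
project_a_level = level_percentage(600, [400])
alice_level = level_percentage(300, [100])
alice_total = total_percentage([project_a_level, alice_level])

print(project_a_level, alice_level, alice_total)
```

Here project_a holds 60 percent at its level, alice holds 75 percent within project_a, and alice's long-term target is therefore about 45 percent of the whole tree.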
When a user node or a project node is removed and then added back, the user's or project's usage is retained. A node can be added back either at the same place or at a different place in the share tree. You can zero out that usage before you add the node back to the share tree. To do so, first remove the node from the users or projects configured in the grid engine system. Then add the node back to the users or projects there.
Users or projects that were not in the share tree but that ran jobs have nonzero usage when added to the share tree. To zero out usage when you add such users or projects to the tree, first remove them from the users or projects configured in the grid engine system. Then add them to the tree.
To add an interior node under the selected node, click Add Node. A blank Node Info window appears, where you can enter the node's name and number of shares. You can enter any node name or share number.
To add a leaf node under the selected node, click Add Leaf. A blank Node Info window appears, where you can enter the node's name and number of shares. The node's name must be an existing grid engine user (see Configuring User Objects With QMON) or project (see Defining Projects).
The following rules apply when you are adding a leaf node:
All nodes have a unique path in share tree.
A project is not referenced more than once in share tree.
A user appears only once in a project subtree.
A user appears only once outside of a project subtree.
A user does not appear as a nonleaf node.
All leaf nodes in a project subtree reference a known user or the reserved name default. See a detailed description of this special user in About the Special User default.
Project subtrees do not have subprojects.
All leaf nodes not in a project subtree reference a known user or known project.
All user leaf nodes in a project subtree have access to the project.
To edit the selected node, click Modify. A Node Info window appears. The window displays the node's name and its number of shares.
To cut or copy the selected node to a buffer, click Cut or Copy. To paste the contents of the most recently cut or copied node under the selected node, click Paste.
To delete the selected node and all its descendants, click Delete.
To clear the usage that has accumulated in the entire share-tree hierarchy, click Clear Usage. Clear the usage when the share-based policy is aligned to a budget and needs to start from scratch at the beginning of each budget term. The Clear Usage facility also is handy when setting up or modifying test N1 Grid Engine 6.1 software environments.
QMON periodically updates the information displayed in the Share Tree Policy dialog box. Click Refresh to force the display to refresh immediately.
To save all the node changes that you make, click Apply. To close the dialog box without saving changes, click Done.
To search the share tree for a node name, click Find, and then type a search string. Node names that begin with the case-sensitive search string are indicated. Click Find Next to find the next occurrence of the search string.
Click Help to open the online help system.
To display the Share Tree Policy Parameters, click the arrow at the right of the Node Attributes.
CPU [%] slider — This slider's setting indicates what percentage of Combined Usage CPU is. When you change this slider, the MEM and I/O sliders change to compensate for the change in CPU percentage.
MEM [%] slider — This slider's setting indicates what percentage of Combined Usage memory is. When you change this slider, the CPU and I/O sliders change to compensate for the change in MEM percentage.
I/O [%] slider — This slider's setting indicates what percentage of Combined Usage I/O is. When you change this slider, the CPU and MEM sliders change to compensate for the change in I/O percentage.
CPU [%], MEM [%], and I/O [%] always add up to 100%.
Lock Symbol — When a lock is open, the slider that it guards can change freely. The slider can change either because the slider was moved or because it is compensating for another slider's being moved.
When a lock is closed, the slider that it guards cannot change. If two locks are closed and one lock is open, no sliders can be changed.
Half-life — Use this field to specify the half-life for usage. Usage is decayed during each scheduling interval so that any particular contribution to accumulated usage has half the value after a duration of half-life.
Days/Hours selection menu — Select whether half-life is to be measured in days or hours.
Compensation Factor — This field accepts a positive integer for the compensation factor. Reasonable values are in the range between 2 and 10.
The actual usage of a user or project can be far below its targeted usage. The compensation factor prevents such users or projects from dominating resources when they first get those resources. See Compensation Factor for more information.
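The half-life rule can be made concrete with a short calculation. The following Python sketch is illustrative only and is not the scheduler's implementation. It assumes that usage is multiplied by a constant factor at every scheduling interval. The function names and the sample values are hypothetical.

```python
def decay_factor(interval_hours: float, half_life_hours: float) -> float:
    """Per-interval multiplier chosen so that usage halves after one half-life."""
    return 0.5 ** (interval_hours / half_life_hours)


def decayed_usage(usage: float, elapsed_hours: float, half_life_hours: float) -> float:
    """Remaining value of a past usage contribution after elapsed_hours."""
    return usage * 0.5 ** (elapsed_hours / half_life_hours)


half_life = 7 * 24  # a half-life of 7 days, expressed in hours

print(decayed_usage(1000.0, 7 * 24, half_life))   # one half-life later: 500.0
print(decayed_usage(1000.0, 14 * 24, half_life))  # two half-lives later: 250.0
```

A short half-life makes the share-based policy forget past consumption quickly. A long half-life makes past consumption influence entitlements for longer.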
You can use the special user default to reduce the amount of share-tree maintenance for sites with many users. Under the share-tree policy, a job's priority is determined based on the node the job maps to in the share tree. Users who are not explicitly named in the share tree are mapped to the default node, if it exists.
The specification of a single default node allows for a simple share tree to be created. Such a share tree makes user-based fair sharing possible.
You can also use the default user in cases where the same share entitlement is assigned to most users. Assigning the same share entitlement to all users is also known as equal share scheduling.
The default user configures all user entries under the default node, giving the same share amount to each user. Each user who submits jobs receives the same share entitlement as that configured for the default user. To activate the facility for a particular user, you must add this user to the list of grid engine users.
The share tree displays “virtual” nodes for all users who are mapped to the default node. The display of virtual nodes enables you to examine the usage and the fair-share scheduling parameters for users who are mapped to the default node.
You can also use the default user for “hybrid” share trees, where users are subordinated under projects in the share tree. The default user can be a leaf node under a project node.
The short-term entitlements of users vary according to differences in the amount of resources that the users consume. However, long-term entitlements of users remain the same.
You might want to assign lower or higher entitlements to some users while maintaining the same long-term entitlement for all other users. To do so, configure a share tree with individual user entries next to the default user for those users with special entitlements.
In Example A, all users submitting to Project A get equal long-term entitlements. The users submitting to Project B only contribute to the accumulated resource consumption of Project B. Entitlements of Project B users are not managed.
Compare Example A with Example B:
In Example B, treatment for Project A is the same as for Example A. But all default users who submit jobs to Project B, except users A and B, receive equal long-term resource entitlements. Default users have 20 shares. User A, with 10 shares, receives half the entitlement of the default users. User B, with 40 shares, receives twice the entitlement of the default users.
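The share arithmetic in Example B can be checked with a few lines of Python. This sketch only compares share values among sibling nodes under Project B. The dictionary and function names are illustrative and are not part of the grid engine software.

```python
def relative_entitlement(shares: dict[str, int], reference: str) -> dict[str, float]:
    """Express each node's shares as a multiple of the reference node's shares."""
    base = shares[reference]
    return {name: value / base for name, value in shares.items()}


# Share values taken from Example B (nodes under Project B).
project_b = {"default": 20, "userA": 10, "userB": 40}

for name, ratio in relative_entitlement(project_b, "default").items():
    print(f"{name}: {ratio:.1f} x the entitlement of a default user")
```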
Use QMON to configure the share tree policy, because a hierarchical tree is well-suited for graphical display and for editing. However, if you need to integrate share tree modifications in shell scripts, for example, you can use the qconf command and its options.
To configure the share-based policy from the command line, use the qconf command with appropriate options.
The qconf options -astree, -mstree, -dstree, and -sstree enable you to do the following:
Add a new share tree
Modify an existing share tree
Delete a share tree
Display the share tree configuration
See the qconf(1) man page for details about these options. The share_tree(5) man page contains a description of the format of the share tree configuration.
The -astnode, -mstnode, -dstnode, and -sstnode options do not address the entire share tree, but only a single node. The node is referenced as a path through all parent nodes down the share tree, similar to a directory path. The options enable you to add, modify, delete, and display a node. The information contained in a node includes its name and the attached shares.
The weighting of the usage parameters CPU, memory, and I/O is contained in the scheduler configuration as usage_weight. The half-life is contained in the scheduler configuration as halftime. The compensation factor is contained in the scheduler configuration as compensation_factor. You can access the scheduler configuration from the command line by using the -msconf and the -ssconf options of qconf. See the sched_conf(5) man page for details about the format.
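As a rough illustration of how the three usage weights interact, the following Python sketch forms a combined usage value as a weighted sum of CPU, memory, and I/O usage. The weighted-sum form, the function name, and the sample numbers are assumptions made for illustration. See the sched_conf(5) man page for the authoritative definition.

```python
def combined_usage(cpu: float, mem: float, io: float,
                   w_cpu: float, w_mem: float, w_io: float) -> float:
    """Weighted sum of the three usage components (assumed form).

    The weights correspond to the CPU [%], MEM [%], and I/O [%] sliders
    divided by 100, so they must add up to 1.
    """
    if abs(w_cpu + w_mem + w_io - 1.0) > 1e-9:
        raise ValueError("CPU, MEM, and I/O weights must add up to 1 (100%)")
    return w_cpu * cpu + w_mem * mem + w_io * io


# Hypothetical usage values with a 50/25/25 weighting.
print(combined_usage(100.0, 50.0, 10.0, 0.5, 0.25, 0.25))
```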
The objective of this setup is to guarantee a certain share assignment of all the cluster resources to different projects over time.
Specify the number of share-tree tickets (for example, 1000000) in the scheduler configuration.
See Configuring Policy-Based Resource Management With QMON, and the sched_conf(5) man page.
(Optional) Add one user for each scheduling-relevant user.
See Configuring User Objects With QMON, and the user(5) man page.
Add one project for each scheduling-relevant project.
See Defining Projects With QMON, and the project(5) man page.
Use QMON to set up a share tree that reflects the structure of all scheduling-relevant projects as nodes.
Assign share tree shares to the projects.
For example, if you are creating project-based share-tree scheduling with first-come, first-served scheduling among jobs of the same project, a simple structure might look like the following:
If you are creating project-based share-tree scheduling with equal shares for each user, a simple structure might look like the following:
If you are creating project-based share-tree scheduling with individual user shares in each project, add users as leaves to their projects. Then assign individual shares. A simple structure might look like the following:
If you want to assign individual shares to only a few users, designate the user default in combination with individual users below a project node. For example, you can condense the tree illustrated previously into the following:
Functional scheduling is a nonfeedback scheme for determining a job's importance. Functional scheduling associates a job with the submitting user, project, department, and job class. Functional scheduling is sometimes called priority scheduling. The functional policy setup ensures that a defined share is guaranteed to each user, project, or department at any time. Jobs of users, projects, or departments that have used fewer resources than anticipated are preferred when the system dispatches jobs to idle resources.
At the same time, full resource usage is guaranteed, because unused share proportions are distributed among those users, projects, and departments that need the resources. Past resource consumption is not taken into account.
Functional policy entitlement to system resources is combined with other entitlements in determining a job's net entitlement. For example, functional policy entitlement might be combined with share-based policy entitlement.
The total number of tickets that are allotted to the functional policy determines the weight of functional scheduling among the three scheduling policies. During installation, the administrator divides the total number of functional tickets among the functional categories of user, department, project, job, and job class.
Functional shares are assigned to every member of each functional category: user, department, project, job, and job class. These shares indicate what proportion of the tickets for a category each job associated with a member of the category is entitled to. For example, user davidson has 200 shares, and user donlee has 100. A job submitted by davidson is entitled to twice as many user functional tickets as a job submitted by donlee, no matter how many tickets there are.
The functional tickets that are allotted to each category are shared among all the jobs that are associated with a particular category.
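The davidson and donlee example can be expressed as a small calculation. This Python sketch is a simplification: it assumes one job per user and a strictly proportional split of the user category's tickets. The function name and the ticket totals are hypothetical.

```python
def tickets_per_member(category_tickets: int, shares: dict[str, int]) -> dict[str, float]:
    """Split a category's functional tickets in proportion to member shares."""
    total = sum(shares.values())
    return {name: category_tickets * s / total for name, s in shares.items()}


user_shares = {"davidson": 200, "donlee": 100}

# The 2:1 ratio holds regardless of how many tickets the category has.
for pool in (3000, 900000):
    alloc = tickets_per_member(pool, user_shares)
    print(pool, alloc, alloc["davidson"] / alloc["donlee"])
```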
At the bottom of the QMON Policy Configuration dialog box, click Functional Policy. The Functional Policy dialog box appears.
Select the functional category for which you are defining functional shares: user, project, department, or job.
The table under Functional Shares is scrollable. The table displays the following information:
A list of the members of the category currently selected from the Function Category list.
The number of functional shares for each member of the category. Shares are used as a convenient indication of the relative importance of each member of the functional category. You can edit this field.
The percentage of the functional share allocation for this category of functional ticket that this number of functional shares represents. This field is a feedback device and is not editable.
QMON periodically updates the information displayed in the Functional Policy dialog box. Click Refresh to force the display to refresh immediately.
To save all node changes that you make, click Apply. To close the dialog box without saving changes, click Done.
Click the jagged arrow above the Functional Shares table to open a configuration dialog box.
For User functional shares, the User Configuration dialog box appears. Use the User tab to switch to the appropriate mode for changing the configuration of grid engine users. See Configuring User Objects With QMON.
For Department functional shares, the User Configuration dialog box appears. Use the Userset tab to switch to the appropriate mode for changing the configuration of departments that are represented as usersets. See Defining Usersets As Projects and Departments.
For Project functional shares, the Project Configuration dialog box appears. See Defining Projects With QMON.
For Job functional shares, the Job Control dialog box appears. See Monitoring and Controlling Jobs With QMON in Sun N1 Grid Engine 6.1 User’s Guide.
To display the Ratio Between Sorts Of Functional Tickets, click the arrow at the right of the Functional Shares table.
User [%], Department [%], Project [%], Job [%] and Job Class [%] always add up to 100%.
When you change any of the sliders, all other unlocked sliders change to compensate for the change.
When a lock is open, the slider that it guards can change freely. The slider can change either because it is moved or because the moving of another slider causes this slider to change. When a lock is closed, the slider that it guards cannot change. If four locks are closed and one lock is open, no sliders can change.
User slider – Indicates the percentage of the total functional tickets to allocate to the users category
Departments slider – Indicates the percentage of the total functional tickets to allocate to the departments category
Project slider – Indicates the percentage of the total functional tickets to allocate to the projects category
Job slider – Indicates the percentage of the total functional tickets to allocate to the jobs category
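The interaction between sliders and locks can be modeled in a few lines. The following Python sketch assumes that a change is absorbed by the other unlocked sliders in proportion to their current values. How QMON actually distributes the change is not stated here, so the proportional rule and all names in the sketch are assumptions.

```python
def move_slider(values, locked, name, new_value):
    """Return new percentages after moving one slider; the total stays at 100.

    values: dict of slider name -> percentage (summing to 100)
    locked: set of slider names whose locks are closed
    """
    others = [k for k in values if k != name and k not in locked]
    if name in locked or not others:
        return dict(values)  # no unlocked slider can absorb the change
    free = sum(values[k] for k in others)        # percentage the others hold now
    new_value = max(0.0, min(new_value, values[name] + free))
    remaining = values[name] + free - new_value  # what the others must share
    result = dict(values)
    result[name] = new_value
    for k in others:
        # Keep the others' proportions; split evenly if they are all at zero.
        result[k] = remaining * values[k] / free if free else remaining / len(others)
    return result


sliders = {"user": 20, "department": 20, "project": 20, "job": 20, "jobclass": 20}

# Two locks closed: department and project absorb the change.
print(move_slider(sliders, {"job", "jobclass"}, "user", 40))

# Four locks closed and one open: nothing can change.
print(move_slider(sliders, {"department", "project", "job", "jobclass"}, "user", 40))
```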
You can assign functional shares to jobs only by using QMON. No command-line interface is available for this function.
To configure the functional share policy from the command line, use the qconf command with the appropriate options.
Use the qconf -muser command to configure the user category. The -muser option modifies the fshare parameter of the user entry file. See the user(5) man page for information about the user entry file.
Use the qconf -mu command to configure the department category. The -mu option modifies the fshare parameter of the access list file. See the access_list(5) man page for information about the access list file, which is used to represent departments.
Use the qconf -mprj command to configure the project category. The -mprj option modifies the fshare parameter of the project entry file. See the project(5) man page for information about the project entry file.
Use the qconf -mq command to configure the job class category. The -mq option modifies the fshare parameter of the queue configuration file. See the queue_conf(5) man page for information about the queue configuration file, which is used to represent job classes.
The weighting between different categories is defined in the scheduler configuration sched_conf and can be changed using qconf -msconf. The parameters to change are weight_user, weight_department, weight_project, weight_job, and weight_jobclass. Each parameter value ranges between 0 and 1, and the five values must add up to 1.
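Before you edit the scheduler configuration, it can help to check a planned set of weights against these two constraints. The following Python sketch is a hypothetical pre-check and is not part of the grid engine software. Only the five parameter names come from the text above. The function name, the tolerance, and the sample values are illustrative.

```python
WEIGHT_KEYS = ("weight_user", "weight_department", "weight_project",
               "weight_job", "weight_jobclass")


def check_functional_weights(conf: dict[str, float], tol: float = 1e-6) -> None:
    """Raise ValueError unless every weight is in [0, 1] and all five sum to 1."""
    for key in WEIGHT_KEYS:
        if key not in conf:
            raise ValueError(f"missing parameter: {key}")
        if not 0.0 <= conf[key] <= 1.0:
            raise ValueError(f"{key} must be between 0 and 1, got {conf[key]}")
    total = sum(conf[key] for key in WEIGHT_KEYS)
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights must add up to 1, got {total}")


# Illustrative values only: equal weighting across four categories.
planned = {"weight_user": 0.25, "weight_department": 0.25,
           "weight_project": 0.25, "weight_job": 0.25, "weight_jobclass": 0.0}
check_functional_weights(planned)  # passes silently
```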
Use this setup to create a certain share assignment of all the resources in the cluster to different users, projects, or departments. First-come, first-served scheduling is used among jobs of the same user, project, or department.
In the Scheduler Configuration dialog box, select the Share Functional Tickets check box.
See Sharing Functional Ticket Shares, and the sched_conf(5) man page.
Specify the number of functional tickets (for example, 1000000) in the scheduler configuration.
See Configuring Policy-Based Resource Management With QMON, and the sched_conf(5) man page.
Add scheduling-relevant items:
Add one user for each scheduling-relevant user.
See Configuring User Objects With QMON, and the user(5) man page.
Add one project for each scheduling-relevant project.
See Defining Projects With QMON, and the project(5) man page.
Add each scheduling-relevant department.
Assign functional shares to each user, project, or department.
See Configuring User Access Lists With QMON, and the access_list(5) man page.
Assign the shares as a percentage of the whole. Examples follow:
For users:
UserA (10)
UserB (20)
UserC (20)
UserD (20)
For projects:
ProjectA (55)
ProjectB (45)
For departments:
DepartmentA (90)
DepartmentB (5)
DepartmentC (5)
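Shares are relative values, so the percentage that a member receives depends on the total of all shares in its category. The following Python sketch converts the example values above into percentages. The function name is illustrative. Note that the user shares total 70 rather than 100, so UserA's 10 shares correspond to roughly 14.3 percent of the user category.

```python
def share_percentages(shares: dict[str, int]) -> dict[str, float]:
    """Convert relative shares into percentages of the category total."""
    total = sum(shares.values())
    return {name: round(100.0 * s / total, 1) for name, s in shares.items()}


users = {"UserA": 10, "UserB": 20, "UserC": 20, "UserD": 20}
projects = {"ProjectA": 55, "ProjectB": 45}
departments = {"DepartmentA": 90, "DepartmentB": 5, "DepartmentC": 5}

for category in (users, projects, departments):
    print(share_percentages(category))
```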
Override scheduling enables a grid engine system manager or operator to dynamically adjust the relative importance of one job or of all jobs that are associated with a user, a department, a project, or a job class. This adjustment adds tickets to the specified job, user, department, project, or job class. By adding override tickets, override scheduling increases the total number of tickets that a user, department, project, or job has. As a result, the overall share of resources is increased.
The addition of override tickets also increases the total number of tickets in the system. These additional tickets deflate the value of every job's tickets.
You can use override tickets for the following two purposes:
To temporarily override the share-based policy or the functional policy without having to change the configuration of these policies.
To establish resource entitlement levels with an associated fixed amount of tickets. The establishment of entitlement levels is appropriate for scenarios like high, medium, or low job classes, or high, medium, or low priority classes.
Override tickets that are assigned directly to a job go away when the job finishes. All other tickets are inflated back to their original value. Override tickets that are assigned to users, departments, projects, and job classes remain in effect until the administrator explicitly removes the tickets.
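The deflation effect can be seen in a small calculation. This Python sketch assumes that a job's entitlement is its ticket count divided by the total number of tickets in the system, and it ignores the weighting between policies. The job names and ticket counts are hypothetical.

```python
def ticket_shares(tickets: dict[str, int]) -> dict[str, float]:
    """Each job's fraction of all tickets currently in the system."""
    total = sum(tickets.values())
    return {job: t / total for job, t in tickets.items()}


before = {"job1": 1000, "job2": 1000, "job3": 2000}

# Assign 2000 override tickets to job1. The other jobs keep their tickets,
# but each ticket is now worth less because the system total has grown.
after = dict(before, job1=before["job1"] + 2000)

print(ticket_shares(before))
print(ticket_shares(after))
```

In this sketch, job1 rises from a quarter to half of all tickets, while job3 drops from half to a third even though its own ticket count is unchanged. When job1 finishes and its override tickets go away, the shares return to the original values.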
The Policy Configuration dialog box displays the current number of override tickets that are active in the system.
Override entries remain in the Override dialog box. These entries can influence subsequent work if they are not explicitly deleted by the administrator when they are no longer needed.
At the bottom of the QMON Policy Configuration dialog box, click Override Policy. The Override Policy dialog box appears.
Select the category for which you are defining override tickets: user, project, department, or job.
The override table is scrollable. It displays the following information:
A list of the members of the category for which you are defining tickets. The categories are user, project, department, job, and job class.
The number of override tickets for each member of the category. This field is editable.
QMON periodically updates the information that is displayed in the Override Policy dialog box. Click Refresh to force the display to refresh immediately.
To save all override changes that you make, click Apply. To close the dialog box without saving changes, click Done.
Click the jagged arrow above the override table to open a configuration dialog box.
For User override tickets, the User Configuration dialog box appears. Use the User tab to switch to the appropriate mode for changing the configuration of grid engine users. See Configuring User Objects With QMON.
For Department override tickets, the User Configuration dialog box appears. Use the Userset tab to switch to the appropriate mode for changing the configuration of departments that are represented as usersets. See Defining Usersets As Projects and Departments.
For Project override tickets, the Project Configuration dialog box appears. See Defining Projects With QMON.
For Job override tickets, the Job Control dialog box appears. See Monitoring and Controlling Jobs With QMON in Sun N1 Grid Engine 6.1 User’s Guide.
You can assign override tickets to jobs only by using QMON. No command-line interface is available for this function.
To configure the override policy from the command line, use the qconf command with the appropriate options.
Use the qconf -muser command to configure the user category. The -muser option modifies the oticket parameter of the user entry file. See the user(5) man page for information about the user entry file.
Use the qconf -mu command to configure the department category. The -mu option modifies the oticket parameter of the access list file. See the access_list(5) man page for information about the access list file, which is used to represent departments.
Use the qconf -mprj command to configure the project category. The -mprj option modifies the oticket parameter of the project entry file. See the project(5) man page for information about the project entry file.
Use the qconf -mq command to configure the job class category. The -mq option modifies the oticket parameter of the queue configuration file. See the queue_conf(5) man page for information about the queue configuration file, which is used to represent job classes.