Extended jobs and advanced jobs are more complex forms of job submission. Before attempting to submit such jobs, you should understand some important background information about the process. The following sections describe those job processes.
The General tab of the Submit Job dialog box enables you to configure the following parameters for an extended job. The General tab is shown in Figure 3–2.
Prefix – A prefix string that is used for script-embedded submit options. See Active Comments for details.
Job Script – The job script to use. Click the icon at the right of the Job Script field to open a file selection box. The file selection box is shown in Figure 3–4.
Job Tasks – The task ID range for submitting array jobs. See Submitting Array Jobs for details.
Job Name – The name of the job. A default is set after you select a job script.
Job Args – Arguments to the job script.
Priority – A counting box for setting the job's initial priority. This priority ranks a single user's jobs and tells the scheduler how to choose among that user's jobs when several of them are in the system simultaneously.
To enable users to set the priorities of their own jobs, the administrator must enable priorities with the weight_priority parameter of the scheduler configuration. For more information, see Chapter 5, Managing Policies and the Scheduler, in Sun N1 Grid Engine 6.1 Administration Guide.
Job Share – Defines the share of the job's tickets relative to other jobs. The job share influences only the share tree policy and the functional policy.
Start At – The time at which the job is considered eligible for execution. Click the icon at the right of the Start At field to open a dialog box for entering the correctly formatted time:
Project – The project to which the job is subordinated. Click the icon at the right of the Project field to select among the available projects:
Current Working Directory – A flag that indicates whether to execute the job in the current working directory. Use this flag only for identical directory hierarchies between the submit host and the potential execution hosts.
Shell – The command interpreter to use to run the job script. See How a Command Interpreter Is Selected for details. Click the icon at the right of the Shell field to open a dialog box for entering the command interpreter specifications of the job:
Merge Output – A flag indicating whether to merge the job's standard output and standard error output together into the standard output stream.
stdout – The standard output redirection to use. See Output Redirection for details. A default is used if nothing is specified. Click the icon at the right of the stdout field to open a dialog box for entering the output redirection alternatives:
stderr – The standard error output redirection to use, similar to the standard output redirection.
stdin – The standard input file to use, similar to the standard output redirection.
Request Resources – Click this button to define the resource requirement for your job. If resources are requested for a job, the button changes its color.
Restart depends on Queue – Click this button to define whether the job can be restarted after being aborted by a system crash or similar events. This button also controls whether the restart behavior depends on the queue or is demanded by the job.
Notify Job – A flag indicating whether the job is to be notified by a SIGUSR1 or a SIGUSR2 signal if the job is about to be suspended or cancelled, respectively.
Hold Job – A flag indicating that either a user hold or a job dependency is to be assigned to the job. The job is not eligible for execution as long as any type of hold is assigned to the job. See Monitoring and Controlling Jobs for more details. The Hold Job field enables restricting the hold only to a specific range of tasks of an array job. See Submitting Array Jobs for information about array jobs.
Start Job Immediately – A flag that forces the job to be started immediately if possible, or to be rejected otherwise. Jobs are not queued if this flag is selected.
Job Reservation – A flag specifying that resources should be reserved for this job. See Resource Reservation and Backfilling in Sun N1 Grid Engine 6.1 Administration Guide for details.
The buttons at the right side of the Submit Job dialog box enable you to start various actions:
Submit – Submit the currently specified job.
Edit – Edit the selected script file in an X terminal, using either vi or the editor defined by the EDITOR environment variable.
Clear – Clear all settings in the Submit Job dialog box, including any specified resource requests.
Reload – Reload the specified script file, parse any script-embedded options, parse default settings, and discard intermediate manual changes to these settings. For more information, see Active Comments and Default Request Files. This action is equivalent to a Clear action followed by specification of the previous script file. The option has an effect only if a script file is already selected.
Save Settings – Save the current settings to a file. Use the file selection box to select the file. The saved files can either be loaded later or be used as default requests. For more information, see Load Settings and Default Request Files.
Load Settings – Load settings previously saved with the Save Settings button. The loaded settings overwrite the current settings. See Save Settings.
Done – Closes the Submit Job dialog box.
Figure 3–5 shows the Submit Job dialog box with most of the parameters set.
The parameters of the job configured in the example are:
The job has the script file flow.sh, which must reside in the working directory of QMON.
The job is called Flow.
The script file takes the single argument big.data.
The job starts with priority 3.
The job is not eligible for execution before 4:30:44 PM on April 22, 2004.
The project definition means that the job is subordinated to project crash.
The job is executed in the submission working directory.
The job uses the tcsh command interpreter.
Standard output and standard error output are merged into the file flow.out, which is created in the current working directory.
To submit the extended job request that is shown in Figure 3–5 from the command line, type the following command:
% qsub -N Flow -p -111 -P devel -a 200404221630.44 -cwd \
  -S /bin/tcsh -o flow.out -j y flow.sh big.data
The Advanced tab of the Submit Job dialog box enables you to define the following additional parameters:
Parallel Environment – A parallel environment interface to use
Environment – A set of environment variables to set for the job before the job runs. Click the icon at the right of the Environment field to open a dialog box that enables you to define the environment variables to export:
Environment variables can be taken from QMON's runtime environment, or you can define your own environment variables.
Context – A list of name/value pairs that can be used to store and communicate job-related information. This information is accessible anywhere from within a cluster. You can modify context variables from the command line with the -ac, -dc, and -sc options to qsub, qrsh, qsh, qlogin, and qalter. You can retrieve context variables with the qstat -j command.
Checkpoint Object – The checkpointing environment to use if checkpointing the job is desirable and suitable. See Using Job Checkpointing for details.
Account – An account string to associate with the job. The account string is added to the accounting record that is kept for the job. The accounting record can be used for later accounting analysis.
Verify Mode – The Verify flag determines the consistency checking mode for your job. To check for consistency of the job request, the grid engine system assumes an empty and unloaded cluster. The system tries to find at least one queue in which the job could run. Possible checking modes are as follows:
Skip – No consistency checking at all.
Warning – Inconsistencies are reported, but the job is still accepted. Warning mode might be desirable if the cluster configuration should change after the job is submitted.
Error – Inconsistencies are reported. The job is rejected if any inconsistencies are encountered.
Just verify – The job is not submitted. An extensive report is generated about the suitability of the job for each host and queue in the cluster.
Mail – The events about which the user is notified by email. The events start, end, abort, and suspend are currently defined for jobs.
Mail To – A list of email addresses to which these notifications are sent. Click the icon at the right of the Mail To field to open a dialog box for defining the mailing list.
Hard Queue List, Soft Queue List – Lists of queue names that are requested for the execution of the job. Queues in the Hard Queue List are mandatory, while queues in the Soft Queue List are preferred. Both lists are treated identically to a corresponding resource requirement.
Master Queue List – A list of queue names that are eligible as master queue for a parallel job. A parallel job is started in the master queue. All other queues to which the job spawns parallel tasks are called slave queues.
Job Dependencies – A list of IDs of jobs that must finish before the submitted job can be started. The newly created job depends on completion of those jobs.
Deadline – The deadline initiation time for deadline jobs. Deadline initiation defines the point in time at which a deadline job must reach maximum priority to finish before a given deadline. To determine the deadline initiation time, subtract an estimate of the running time, at maximum priority, of a deadline job from its desired deadline time. Click the icon at the right of the Deadline field to open the dialog box that enables you to set the deadline.
Not all users are allowed to submit deadline jobs. Ask your system administrator if you are permitted to submit deadline jobs. Contact the cluster administrator for information about the maximum priority that is given to deadline jobs.
Figure 3–6 shows an example of an advanced job submission.
The job defined in Extended Job Example has the following additional characteristics as compared to the job definition in Submitting Extended Jobs With QMON.
The job requires the use of the parallel environment mpi. The job needs at least 4 parallel processes to be created. The job can use up to 16 processes if the processes are available.
Two environment variables are set and exported for the job.
Two context variables are set.
The account string FLOW is to be added to the job accounting record.
Mail must be sent to me@myhost.com as soon as the job starts and finishes.
The job should preferably be executed in the queue big_q.
To submit the advanced job request that is shown in Figure 3–6 from the command line, type the following command:
% qsub -N Flow -p -111 -P devel -a 200012240000.00 -cwd \
  -S /bin/tcsh -o flow.out -j y -pe mpi 4-16 \
  -v SHARED_MEM=TRUE,MODEL_SIZE=LARGE \
  -ac JOB_STEP=preprocessing,PORT=1234 \
  -A FLOW -w w -m b,e -q big_q \
  -M me@myhost.com,me@other.address \
  flow.sh big.data
The preceding command shows that advanced job requests can be rather complex and unwieldy, in particular if similar requests need to be submitted frequently. To avoid the cumbersome and error-prone task of entering such commands, users can embed qsub options in the script files, or use default request files. For more information, see Active Comments.
The -binary yes|no option, when specified with the y argument, allows you to use qrsh to submit executable jobs without the script wrapper. See the qsub man page for details.
The cluster administrator can set up a default request file for all grid engine system users. Users, on the other hand, can create private default request files located in their home directories. Users can also create application-specific default request files that are located in their working directories.
Default request files contain, on one or more lines, the qsub options to apply to jobs by default. The global cluster default request file is located at sge-root/cell/common/sge_request. The private general default request file is located at $HOME/.sge_request. Application-specific default request files are located at $cwd/.sge_request.
If more than one of these files are available, the files are merged into one default request, with the following order of precedence:
Application-specific default request file
General private default request file
Global default request file
Script embedding and the qsub command line have higher precedence than the default request files. Therefore, script embedding overrides default request file settings. The qsub command line options can override these settings again.
To discard any previous settings, use the qsub -clear command in a default request file, in embedded script commands, or in the qsub command line.
Here is an example of a private default request file:
-A myproject -cwd -M me@myhost.com -m b,e -r y -j y -S /bin/ksh
Unless overridden, for all of this user's jobs the following is true:
The account string is myproject
The jobs execute in the current working directory
Mail notification is sent to me@myhost.com at the beginning and at the end of the jobs
The standard output and standard error output are merged
The ksh is used as command interpreter
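The precedence rules above can be illustrated with a small sketch. This is an illustrative model, not the actual grid engine implementation: it merges option lists from the lowest-precedence source (the global default request file) to the highest (the qsub command line), honoring qsub -clear, and relies on the convention that when an option occurs more than once, the later occurrence wins.

```python
# Illustrative sketch (assumption): merge default request sources in
# increasing order of precedence, where "-clear" discards everything
# collected so far and later options override earlier ones.

def merge_requests(global_opts, private_opts, app_opts,
                   embedded_opts, cmdline_opts):
    """Return the merged option list, lowest-precedence source first."""
    merged = []
    for source in (global_opts, private_opts, app_opts,
                   embedded_opts, cmdline_opts):
        for token in source:
            if token == "-clear":
                merged = []          # discard all previously collected options
            else:
                merged.append(token)
    return merged

# The command line's -m option appears after the default request file's
# -m option, so it takes effect when the list is evaluated left to right.
defaults = ["-A", "myproject", "-m", "b,e"]
cmdline = ["-m", "n"]
print(merge_requests(defaults, [], [], [], cmdline))
```

A -clear anywhere in a higher-precedence source wipes out all lower-precedence settings, which matches the behavior described above for default request files, embedded script commands, and the qsub command line.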
In the examples so far, the submit options do not express any resource requirements for the hosts on which the jobs are to be executed. The grid engine system assumes that such jobs can be run on any host. In practice, however, most jobs require that certain prerequisites be met on the executing host in order for the job to finish successfully. Such prerequisites include enough available memory, required software to be installed, or a certain operating system architecture. Also, the cluster administration usually imposes restrictions on the use of the machines in the cluster. For example, the CPU time that can be consumed by the jobs is often restricted.
The grid engine system provides users with the means to find suitable hosts for their jobs without precise knowledge of the cluster's equipment and its usage policies. Users specify the requirements of their jobs and let the grid engine system manage the task of finding a suitable and lightly loaded host.
You specify resource requirements through requestable attributes, which are described in Requestable Attributes. QMON provides a convenient way to specify the requirements of a job. The Requested Resources dialog box displays only those attributes in the Available Resource list that are currently eligible. Click Request Resources in the Submit Job dialog box to open the Requested Resources dialog box. See Figure 3–7 for an example.
When you double-click an attribute, the attribute is added to the Hard or Soft Resources list of the job. A dialog box opens to guide you in entering a value specification for the attribute in question, except for BOOLEAN attributes, which are set to True. For more information, see How the Grid Engine System Allocates Resources.
Figure 3–7 shows a resource profile for a job that requests a solaris64 host with an available permas license offering at least 750 MBytes of memory. If more than one queue that fulfills this specification is found, any defined soft resource requirements are taken into account. However, if no queue satisfying both the hard and the soft requirements is found, any queue that grants the hard requirements is considered suitable.
When more than one queue is suitable for a job, the queue_sort_method parameter of the scheduler configuration determines in which queue the job is started. See the sched_conf(5) man page for more information.
The attribute permas, an integer, is an administrator extension to the global resource attributes. The attribute arch, a string, is a host resource attribute. The attribute h_vmem, memory, is a queue resource attribute.
An equivalent resource requirement profile can as well be submitted from the qsub command line:
% qsub -l arch=solaris64,h_vmem=750M,permas=1 \
  permas.sh
The implicit -hard switch before the first -l option is omitted.
The notation 750M for 750 MBytes is an example of the quantity syntax of the grid engine system. For attributes that request memory consumption, you can specify integer decimal, floating-point decimal, integer octal, or integer hexadecimal numbers. The following multipliers can be appended to these numbers:
k – Multiplies the value by 1000
K – Multiplies the value by 1024
m – Multiplies the value by 1000 times 1000
M – Multiplies the value by 1024 times 1024
Octal constants are specified by a leading zero, using digits 0 through 7 only. Hexadecimal constants are specified by a 0x prefix, using digits 0 through 9, a through f, and A through F. If no multiplier is appended, the value counts as bytes. If you use a floating-point decimal, the resulting value is truncated to an integer value.
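The quantity rules above can be summarized in a short parser sketch. This is an assumption for illustration, not the grid engine system's actual parser: it accepts a decimal, octal, or hexadecimal number, applies an optional k/K/m/M multiplier, and truncates the result to an integer byte count.

```python
# Illustrative sketch (assumption) of the memory quantity syntax:
# number in decimal, octal (leading 0), or hexadecimal (0x prefix),
# with an optional multiplier suffix, truncated to integer bytes.

MULTIPLIERS = {"k": 1000, "K": 1024, "m": 1000 * 1000, "M": 1024 * 1024}

def parse_memory(spec: str) -> int:
    mult = 1
    if spec and spec[-1] in MULTIPLIERS:
        mult = MULTIPLIERS[spec[-1]]
        spec = spec[:-1]
    if spec.lower().startswith("0x"):
        value = int(spec, 16)               # hexadecimal constant
    elif spec.startswith("0") and spec != "0" and "." not in spec:
        value = int(spec, 8)                # octal constant (leading zero)
    else:
        value = float(spec)                 # decimal, possibly floating point
    return int(value * mult)                # truncate to an integer byte count

print(parse_memory("750M"))   # 750 * 1024 * 1024 = 786432000
print(parse_memory("1.5k"))   # 1.5 * 1000 = 1500
```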
For attributes that impose a time limit, you can specify time values in terms of hours, minutes, and seconds. Hours, minutes, and seconds are specified as decimal digits separated by colons. A time of 3:5:11 is translated to 11111 seconds. If the value for hours, minutes, or seconds is zero, you can omit it as long as the colons remain. Thus a value of :5: is interpreted as 5 minutes. The form used in the Requested Resources dialog box shown in Figure 3–7 is an extension that is valid only within QMON.
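The colon-separated time syntax can likewise be sketched in a few lines. This is an illustrative assumption, not the actual parser, and it assumes the full h:m:s form with all three fields present (possibly empty):

```python
# Illustrative sketch (assumption) of the h:m:s time-limit syntax:
# decimal fields separated by colons, with an empty field counting
# as zero, so ":5:" means 5 minutes.

def parse_time(spec: str) -> int:
    hours, minutes, seconds = (int(field) if field else 0
                               for field in spec.split(":"))
    return hours * 3600 + minutes * 60 + seconds

print(parse_time("3:5:11"))  # 3*3600 + 5*60 + 11 = 11111 seconds
print(parse_time(":5:"))     # 300 seconds
```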
As shown in the previous section, knowing how grid engine software processes resource requests and allocates resources is important. The schematic view of grid engine software's resource allocation algorithm is as follows.
Read in and parse all default request files. See Default Request Files for details.
Process the script file for embedded options. See Active Comments for details.
Read all script-embedded options when the job is submitted, regardless of their position in the script file.
Read and parse all requests from the command line.
As soon as all qsub requests are collected, hard and soft requests are processed separately, the hard requests first. The requests are evaluated, according to the following order of precedence:
From left to right of the script or default request file
From top to bottom of the script or default request file
From left to right of the command line
In other words, you can use the command line to override the embedded flags.
The resources requested as hard are allocated. If a request is not valid, the submission is rejected. If one or more requests cannot be met at submit time, the job is spooled and rescheduled to be run at a later time. A request might not be met, for example, if a requested queue is busy. If all hard requests can be met, the requests are allocated and the job can be run.
The resources requested as soft are checked. The job can run even if some or all of these requests cannot be met. If multiple queues that meet the hard requests provide parts of the soft resources list, the grid engine software selects the queues that offer the most soft requests.
The job is started and occupies the allocated resources.
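The hard and soft allocation steps above can be sketched in simplified form. This is an assumption for illustration, not the actual scheduler: queues that cannot satisfy every hard request are discarded, and among the remainder, the queues satisfying the most soft requests are preferred.

```python
# Simplified sketch (assumption) of hard/soft request handling during
# queue selection. Queue names and attributes are illustrative.

def eligible_queues(queues, hard, soft):
    """queues: dict mapping queue name -> set of attributes it offers."""
    candidates = {name: attrs for name, attrs in queues.items()
                  if hard <= attrs}                  # every hard request met
    if not candidates:
        return []                                    # job stays pending
    best = max(len(soft & attrs) for attrs in candidates.values())
    return sorted(name for name, attrs in candidates.items()
                  if len(soft & attrs) == best)      # most soft requests met

queues = {
    "big_q":   {"solaris64", "permas", "big_mem"},
    "small_q": {"solaris64", "permas"},
    "linux_q": {"lx24-amd64"},
}
print(eligible_queues(queues, hard={"solaris64", "permas"}, soft={"big_mem"}))
# ['big_q']
```

If no candidate offers any of the soft requests, all queues that grant the hard requests remain equally suitable, which matches the behavior described above.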
To gain a feeling for how argument list options, embedded options, and hard and soft requests influence one another, experiment with small test script files that execute UNIX commands such as hostname or date.
Often the most convenient way to build a complex task is to split the task into subtasks. In these cases, subtasks depend on the completion of other subtasks before the dependent subtasks can get started. An example is that a predecessor task produces an output file that must be read and processed by a dependent task.
The grid engine system supports interdependent tasks with its job dependency facility. You can configure jobs to depend on the completion of one or more other jobs. The facility is invoked with the qsub -hold_jid option, which lets you specify a list of jobs upon which the submitted job depends. The list of jobs can also contain subsets of array jobs. The submitted job is not eligible for execution unless all jobs in the dependency list have finished.
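The dependency rule reduces to a simple eligibility check, sketched below as an illustrative assumption rather than the actual scheduler logic:

```python
# Illustrative sketch (assumption) of the job dependency rule: a job
# becomes eligible only when every job in its hold list has finished.

def is_eligible(hold_list, finished_jobs):
    return all(job_id in finished_jobs for job_id in hold_list)

print(is_eligible([101, 102], {101, 102, 103}))  # True
print(is_eligible([101, 104], {101, 102, 103}))  # False, job 104 still running
```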
Parameterized and repeated execution of the same set of operations that are contained in a job script is an ideal application for the array job facility of the grid engine system. Typical examples of such applications are found in the Digital Content Creation industries for tasks such as rendering. Computation of an animation is split into frames. The same rendering computation can be performed for each frame independently.
The array job facility offers a convenient way to submit, monitor, and control such applications. The grid engine system provides an efficient implementation of array jobs, handling the computations as an array of independent tasks joined into a single job. The tasks of an array job are referenced through an array index number. The indexes for all tasks span an index range for the entire array job. The index range is defined during submission of the array job by a single qsub command.
You can monitor and control an array job. For example, you can suspend, resume, or cancel an array job as a whole or by individual task or subset of tasks. To reference the tasks, the corresponding index numbers are suffixed to the job ID. Tasks are executed very much like regular jobs. Tasks can use the environment variable SGE_TASK_ID to retrieve their own task index number and to access input data sets designated for this task identifier.
Follow the instructions in How To Submit a Simple Job With QMON, additionally taking into account the following information.
The submission of array jobs from QMON works virtually identically to how the submission of a simple job is described in How To Submit a Simple Job With QMON. The only difference is that the Job Tasks input window that is shown in Figure 3–5 must contain the task range specification. The task range specification uses syntax that is identical to the qsub -t command. See the qsub(1) man page for detailed information about array index syntax.
For information about monitoring and controlling jobs in general, and about array jobs in particular, see Monitoring and Controlling Jobs and Monitoring and Controlling Jobs From the Command Line. See also the man pages for qstat(1), qhold(1), qrls(1), qmod(1), and qdel(1).
Array jobs offer full access to all facilities of the grid engine system that are available for regular jobs. In particular, array jobs can be parallel jobs at the same time. Array jobs also can have interdependencies with other jobs.
Array tasks cannot have interdependencies with other jobs or with other array tasks.
To submit an array job from the command line, type the qsub command with appropriate arguments.
The following is an example of how to submit an array job:
% qsub -l h_cpu=0:45:0 -t 2-10:2 render.sh data.in
The -t option defines the task index range. In this case, 2-10:2 specifies that 2 is the lowest index number, and 10 is the highest index number. Only every second index, the :2 part of the specification, is used. Thus, the array job is made up of 5 tasks with the task indices 2, 4, 6, 8, and 10. Each task requests a hard CPU time limit of 45 minutes with the -l option. Each task executes the job script render.sh once the task is dispatched and started by the grid engine system. Tasks can use SGE_TASK_ID to find their index number, which they can use to find their input data record in the data file data.in.
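The expansion of the -t range and the use of SGE_TASK_ID can be sketched as follows. This is an illustrative model under stated assumptions: the range parser handles only the simple n-m:s form, and the data records are a stand-in for the contents of data.in.

```python
# Illustrative sketch (assumption) of -t task-range expansion and of a
# task selecting its input via the SGE_TASK_ID environment variable.
import os

def expand_task_range(spec: str):
    """Expand a range like '2-10:2' into the task index list [2, 4, 6, 8, 10]."""
    bounds, _, step = spec.partition(":")
    low, _, high = bounds.partition("-")
    return list(range(int(low), int(high) + 1, int(step or 1)))

print(expand_task_range("2-10:2"))  # [2, 4, 6, 8, 10]

# Inside a running task, SGE_TASK_ID identifies this task's input record
# (the records dict is a hypothetical stand-in for data.in):
records = {2: "frame_002", 4: "frame_004"}
task_id = int(os.environ.get("SGE_TASK_ID", "2"))
print(records.get(task_id, "no record for this task"))
```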