Sun N1 Grid Engine 6.1 User's Guide

Chapter 3 Submitting Jobs

This chapter provides background information about submitting jobs, as well as instructions for how to submit jobs for processing. The chapter begins with an example of how to run a simple job. The chapter then continues with instructions for how to run more complex jobs.

Instructions for accomplishing the following tasks are included in this chapter.

Submitting a Simple Job

Use the information and instructions in this section to become familiar with basic procedures involved in submitting jobs.


Note –

If you installed the N1 Grid Engine 6.1 software under an unprivileged user account, you must log in as that user to be able to run jobs. See Installation Accounts in Sun N1 Grid Engine 6.1 Installation Guide for details.


Procedure: How To Submit a Simple Job From the Command Line

Before you run any grid engine system command, you must first set your executable search path and other environment conditions properly.

  1. From the command line, type one of the following commands.

    • If you are using csh or tcsh as your command interpreter, type the following:


      % source sge-root/cell/common/settings.csh

      sge-root specifies the location of the root directory of the grid engine system. This directory was specified at the beginning of the installation procedure.

    • If you are using sh, ksh, or bash as your command interpreter, type the following:


      $ . sge-root/cell/common/settings.sh

      Note –

      You can add these commands to your .login, .cshrc, or .profile file, whichever is appropriate. By adding these commands, you guarantee proper settings for all interactive sessions that you start later.


  2. Submit a simple job script to your cluster by typing the following command:


    % qsub simple.sh
    

    The command assumes that simple.sh is the name of the script file, and that the file is located in your current working directory.

    You can find the following job in the file sge-root/examples/jobs/simple.sh.


    #!/bin/sh
    #
    #
    # (c) 2004 Sun Microsystems, Inc. Use is subject to license terms.
    
    # This is a simple example of a SGE batch script
    
    # request Bourne shell as shell for job
    #$ -S /bin/sh
    
    #
    # print date and time
    date
    # Sleep for 20 seconds
    sleep 20
    # print date and time again
    date

    If the job submits successfully, the qsub command responds with a message similar to the following example:


    your job 1 (“simple.sh”) has been submitted

  3. Type the following command to retrieve status information about your job.


    % qstat

    You should receive a status report that provides information about all jobs currently known to the grid engine system. For each job, the status report lists the following items:

    • Job ID, which is the unique number that is included in the submit confirmation

    • Name of the job script

    • Owner of the job

    • State indicator; for example, r means running

    • Submit or start time

    • Name of the queue in which the job runs

    If qstat produces no output, no jobs are currently known to the system. For example, your job might already have finished.

    You can check the results of finished jobs by examining their stdout and stderr redirection files. By default, these files are generated in the job owner's home directory on the host that ran the job. Each file name consists of the job script file name, followed by a .o extension for the stdout file or a .e extension for the stderr file, followed by the unique job ID. For example, if your job was the first job ever executed in a newly installed grid engine system, its stdout and stderr files are named simple.sh.o1 and simple.sh.e1, respectively.
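
The naming rule above can be sketched in a few lines of sh. This is only an illustration: the helper names job_id_from_message and default_output_files are hypothetical, and the sketch assumes that the confirmation message looks like the example in step 2, which might differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: extract the first number from a qsub confirmation
# message such as: your job 1 ("simple.sh") has been submitted
job_id_from_message() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\).*$/\1/p'
}

# Hypothetical helper: print the default stdout and stderr file names
# for a given script name and job ID.
default_output_files() {
    script_name=$1
    job_id=$2
    printf '%s.o%s %s.e%s\n' "$script_name" "$job_id" "$script_name" "$job_id"
}

msg='your job 1 ("simple.sh") has been submitted'
id=$(job_id_from_message "$msg")
default_output_files simple.sh "$id"
```

For the message shown, the sketch prints the two file names that are mentioned above, simple.sh.o1 and simple.sh.e1.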

Procedure: How To Submit a Simple Job With QMON

The graphical user interface QMON offers a more convenient way to submit and control jobs and to get an overview of the grid engine system. Among other facilities, QMON provides a Submit Job dialog box and a Job Control dialog box for the tasks of submitting and monitoring jobs.

  1. Type the following command to start the QMON GUI:


    % qmon
    

    During startup, a message window appears, and then the QMON Main Control window appears.

    Figure 3–1 QMON Main Control Window

    QMON Main Control window. Shows callouts indicating that you should first click the Job Control button and then click the Submit Jobs button.

  2. Click the Job Control button, and then click the Submit Jobs button.


    Tip –

    The button names, such as Job Control, are displayed when you rest the mouse pointer over the buttons.


    The Submit Job and the Job Control dialog boxes appear, as shown in the following figures.

    Figure 3–2 Submit Job Dialog Box

    Dialog box titled Submit Job. Shows callouts indicating that you should first click the script button and then click Submit to submit the job.

    Figure 3–3 Job Control Dialog Box

    Dialog box titled Job Control. Shows the Running Jobs tab with job information. Shows buttons for manipulating jobs.

  3. In the Submit Job dialog box, click the icon at the right of the Job Script field.

    The Select a File dialog box appears.

    Figure 3–4 Select a File Dialog Box

    Dialog box titled Select a File. Shows lists of directories and files from which to select a job script. Shows OK, Filter, Cancel, and Help buttons.

  4. Select your script file.

    For example, select the file simple.sh that was used in the command line example.

  5. Click OK to close the Select a File dialog box.

  6. On the Submit Job dialog box, click Submit.

    After a few seconds you should be able to monitor your job on the Job Control dialog box. You first see your job on the Pending Jobs tab. The job quickly moves to the Running Jobs tab once the job starts running.

Submitting Batch Jobs

The following sections describe how to submit more complex jobs through the grid engine system.

About Shell Scripts

A shell script, also called a batch job, is a sequence of command-line instructions that are assembled in a file. Script files are made executable by the chmod command. When a script is invoked, a command interpreter is started. Each instruction is interpreted as if the instruction were typed manually by the user who is running the script. csh, tcsh, sh, and ksh are typical command interpreters. You can invoke arbitrary commands, applications, and other shell scripts from within a shell script.

The command interpreter can be invoked as a login shell. To do so, the name of the command interpreter must be contained in the login_shells list of the grid engine system configuration that is in effect for the particular host and queue that is running the job.


Note –

The grid engine system configuration might be different for the various hosts and queues configured in your cluster. You can display the effective configurations with the -sconf and -sq options of the qconf command. For detailed information, see the qconf(1) man page.


If the command interpreter is invoked as a login shell, the environment of your job is the same as if you logged in and ran the script. With csh, for example, .login and .cshrc are executed in addition to the system default startup resource files, such as /etc/login, whereas only .cshrc is executed if csh is not invoked as a login shell. For a description of the difference between being invoked and not being invoked as a login shell, see the man page of your command interpreter.

Example of a Shell Script

Example 3–1 is a simple shell script. The script first compiles the application flow from its Fortran77 source and then runs the application.


Example 3–1 Simple Shell Script


#!/bin/csh
# This is a sample script file for compiling and 
# running a sample FORTRAN program under N1 Grid Engine 6 
cd TEST
# Now we need to compile the program "flow.f" and
# name the executable "flow".
f77 flow.f -o flow
# Once it is compiled, we can run the program.
flow

Your local system user's guide provides detailed information about building and customizing shell scripts. You might also want to look at the sh, ksh, csh, or tcsh man page. The following sections emphasize special things that you should consider when you prepare batch scripts for the grid engine system.

In general, you can submit to the grid engine system all shell scripts that you can run from your command prompt by hand. Such shell scripts must not require a terminal connection, and the scripts must not need interactive user intervention. The exceptions are the standard error and standard output devices, which are automatically redirected. Therefore, Example 3–1 is ready to be submitted to the grid engine system and the script will perform the desired action.

Extensions to Regular Shell Scripts

Some extensions to regular shell scripts influence the behavior of scripts that run under grid engine system control. The following sections describe these extensions.

How a Command Interpreter Is Selected

At submit time, you can specify the command interpreter to use to process the job script file as shown in Figure 3–5. However, if nothing is specified, the configuration variable shell_start_mode determines how the command interpreter is selected:

Output Redirection

Since batch jobs do not have a terminal connection, their standard output and their standard error output must be redirected into files. The grid engine system enables the user to define the location of the files to which the output is redirected. Defaults are used if no output files are specified.

The standard location for the files is the current working directory where the jobs run. The default standard output file name is job-name.ojob-id. The default standard error output is redirected to job-name.ejob-id. The job-name can be built from the script file name, or defined by the user. See, for example, the -N option in the submit(1) man page. job-id is a unique identifier that is assigned to the job by the grid engine system.

For array job tasks, the task identifier is added to these file names, separated by a dot. The resulting standard redirection paths are job-name.ojob-id.task-id and job-name.ejob-id.task-id. For more information, see Submitting Array Jobs.
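
As a small sketch of the default naming scheme just described, the following sh function prints both file names for a job, and appends the task identifier when one is given. The function name redirect_names and the sample job names and IDs are hypothetical and serve only as illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the default stdout and stderr file names
# for a job, or for one task of an array job when a task ID is given.
redirect_names() {
    job_name=$1
    job_id=$2
    task_id=$3                      # optional; empty for non-array jobs
    suffix=${task_id:+.$task_id}    # ".task-id" only when a task ID is set
    printf '%s.o%s%s\n' "$job_name" "$job_id" "$suffix"
    printf '%s.e%s%s\n' "$job_name" "$job_id" "$suffix"
}

redirect_names flow.sh 17        # non-array job
redirect_names render.sh 18 4    # task 4 of an array job
```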

If the standard locations are not suitable, the user can specify output redirections with QMON, as shown in Figure 3–6, or with the -e and -o options of the qsub command. Standard output and standard error output can be merged into one file. The redirections can be specified on a per execution host basis, in which case the location of the output redirection file depends on the host on which the job is executed. To build custom but unique redirection file paths, use dummy environment variables together with the qsub -e and -o options. A list of these variables follows.

When the job runs, these variables are expanded into the actual values, and the redirection path is built with these values.

See the qsub(1) man page for further details.

Active Comments

Lines with a leading # sign are treated as comments in shell scripts. However, the grid engine system recognizes special comment lines and uses these lines in a special way. The special comment script line is treated as part of the command line argument list of the qsub command. The qsub options that are supplied within these special comment lines are also interpreted by the QMON Submit Job dialog box. The corresponding parameters are preset when a script file is selected.

By default, the special comment lines are identified by the #$ prefix string. You can redefine the prefix string with the qsub -C command.

This use of special comments is called script embedding of submit arguments. The following example shows a script file that uses script-embedded command-line options.


Example 3–2 Using Script-Embedded Command Line Options


#!/bin/csh

#Force csh if not Grid Engine default 
#shell

#$ -S /bin/csh

# This is a sample script file for compiling and
# running a sample FORTRAN program under N1 Grid Engine 6
# We want Grid Engine to send mail
# when the job begins
# and when it ends.

#$ -M EmailAddress
#$ -m b,e

# We want to name the file for the standard output
# and standard error.

#$ -o flow.out -j y

# Change to the directory where the files are located.

cd TEST

# Now we need to compile the program "flow.f" and
# name the executable "flow".

f77 flow.f -o flow

# Once it is compiled, we can run the program.

flow
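
To see which options a script embeds, you can list its special comment lines yourself. The following sh sketch is a simplified illustration, not the parser that qsub uses: it assumes the default #$ prefix, ignores a prefix redefined with -C, and the function name embedded_options is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the text that follows the default "#$"
# prefix on each special comment line of a job script.
embedded_options() {
    sed -n 's/^#\$[[:space:]]*//p' "$1"
}

# Build a small demonstration script and list its embedded options.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/csh
#$ -S /bin/csh
# ordinary comment, not an active comment
#$ -o flow.out -j y
cd TEST
EOF
embedded_options "$demo"
rm -f "$demo"
```

Ordinary comments and commands are skipped. Only the two lines that start with #$ are printed, without the prefix.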

Environment Variables

When a job runs, several variables are preset into the job's environment.

Submitting Extended Jobs and Advanced Jobs

Extended jobs and advanced jobs are more complex forms of job submission. Before attempting to submit such jobs, you should understand some important background information about the process. The following sections describe those job processes.

Submitting Extended Jobs With QMON

The General tab of the Submit Job dialog box enables you to configure the following parameters for an extended job. The General tab is shown in Figure 3–2.

The buttons at the right side of the Submit Job dialog box enable you to start various actions:

Extended Job Example

Figure 3–5 shows the Submit Job dialog box with most of the parameters set.

Figure 3–5 Extended Job Submission Example

Dialog box titled Submit Job. Shows the General tab. The previous section describes the parameters and buttons that are shown.

The parameters of the job configured in the example are:

Submitting Extended Jobs From the Command Line

To submit the extended job request that is shown in Figure 3–5 from the command line, type the following command:


% qsub -N Flow -p -111 -P devel -a 200404221630.44 -cwd \
	-S /bin/tcsh -o flow.out -j y flow.sh big.data

Submitting Advanced Jobs With QMON

The Advanced tab of the Submit Job dialog box enables you to define the following additional parameters:

Advanced Job Example

Figure 3–6 shows an example of an advanced job submission.

Figure 3–6 Advanced Job Submission Example

Dialog box titled Submit Job. Shows the Advanced tab. Previous sections describe the parameters and buttons that are shown.

The job defined in Extended Job Example has the following additional characteristics as compared to the job definition in Submitting Extended Jobs With QMON.

Submitting Advanced Jobs From the Command Line

To submit the advanced job request that is shown in Figure 3–6 from the command line, type the following command:


% qsub -N Flow -p -111 -P devel -a 200012240000.00 -cwd \
 -S /bin/tcsh -o flow.out -j y -pe mpi 4-16 \
 -v SHARED_MEM=TRUE,MODEL_SIZE=LARGE \
 -ac JOB_STEP=preprocessing,PORT=1234 \
 -A FLOW -w w -m s,e -q big_q \
 -M me@myhost.com,me@other.address \
 flow.sh big.data

Default Request Files

The preceding command shows that advanced job requests can be rather complex and unwieldy, in particular if similar requests need to be submitted frequently. To avoid the cumbersome and error-prone task of entering such commands, users can embed qsub options in the script files, or use default request files. For more information, see Active Comments.


Note –

The -binary yes|no option, when specified with the yes argument, allows you to use qrsh to submit executable jobs without the script wrapper. See the qsub(1) man page.


The cluster administration can set up a default request file for all grid engine system users. Users, on the other hand, can create private default request files located in their home directories. Users can also create application-specific default request files that are located in their working directories.

Default request files contain the qsub options to apply by default to the jobs in one or more lines. The location of the global cluster default request file is sge-root/cell/common/sge_request. The private general default request file is located under $HOME/.sge_request. The application-specific default request files are located under $cwd/.sge_request.

If more than one of these files is available, the files are merged into one default request, with the following order of precedence:

  1. Application-specific default request file

  2. General private default request file

  3. Global default request file

Script embedding and the qsub command line have higher precedence than the default request files. Therefore, script embedding overrides default request file settings. The qsub command line options can override these settings again.

To discard any previous settings, use the qsub -clear command in a default request file, in embedded script commands, or in the qsub command line.
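
The merge order can be pictured with a short sh sketch that prints the existing default request files from lowest to highest precedence, so that options printed later would win. This is an illustration under assumptions, not the way the grid engine system implements the merge: the function name default_requests is hypothetical, the sge-root directory and cell name are passed as arguments, and -clear is not interpreted.

```shell
#!/bin/sh
# Hypothetical helper: print the contents of the default request files
# that exist, lowest precedence first.
# Arguments: sge-root directory and cell name (both site-specific).
default_requests() {
    sge_root=$1
    cell=$2
    for f in "$sge_root/$cell/common/sge_request" \
             "$HOME/.sge_request" \
             "$PWD/.sge_request"; do
        # Skip files that do not exist or cannot be read.
        [ -r "$f" ] && cat "$f"
    done
    return 0
}

# Example call, assuming SGE_ROOT and SGE_CELL were set by the settings file:
#   default_requests "$SGE_ROOT" "$SGE_CELL"
```

If the current working directory is also your home directory, the same file is printed twice, which does not change the effective settings in this simple model.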

Here is an example of a private default request file:


-A myproject -cwd -M me@myhost.com -m b,e
-r y -j y -S /bin/ksh

Unless overridden, for all of this user's jobs the following is true:

Defining Resource Requirements

In the examples so far, the submit options do not express any resource requirements for the hosts on which the jobs are to be executed. The grid engine system assumes that such jobs can be run on any host. In practice, however, most jobs require that certain prerequisites be met on the executing host in order for the job to finish successfully. Such prerequisites include enough available memory, required software to be installed, or a certain operating system architecture. Also, the cluster administration usually imposes restrictions on the use of the machines in the cluster. For example, the CPU time that can be consumed by the jobs is often restricted.

The grid engine system provides users with the means to find suitable hosts for their jobs without precise knowledge of the cluster's equipment and its usage policies. Users specify the requirements of their jobs and let the grid engine system manage the task of finding a suitable and lightly loaded host.

You specify resource requirements through requestable attributes, which are described in Requestable Attributes. QMON provides a convenient way to specify the requirements of a job. The Requested Resources dialog box displays only those attributes in the Available Resource list that are currently eligible. Click Request Resources in the Submit Job dialog box to open the Requested Resources dialog box. See Figure 3–7 for an example.

Figure 3–7 Requested Resources Dialog Box

Dialog box titled Requested Resources. Shows lists of hard and soft resources, and a list of available resources.

When you double-click an attribute, the attribute is added to the Hard or Soft Resources list of the job. A dialog box opens to guide you in entering a value specification for the attribute in question, except for BOOLEAN attributes, which are set to True. For more information, see How the Grid Engine System Allocates Resources.

Figure 3–7 shows a resource profile for a job that requests a solaris64 host that has an available permas license and offers at least 750 MBytes of memory. If more than one queue that fulfills this specification is found, any defined soft resource requirements are taken into account. However, if no queue satisfying both the hard and the soft requirements is found, any queue that grants the hard requirements is considered suitable.


Note –

The queue_sort_method parameter of the scheduler configuration determines where to start the job only if more than one queue is suitable for a job. See the sched_conf(5) man page for more information.


The attribute permas, an integer, is an administrator extension to the global resource attributes. The attribute arch, a string, is a host resource attribute. The attribute h_vmem, memory, is a queue resource attribute.

An equivalent resource requirement profile can also be submitted from the qsub command line:


% qsub -l arch=solaris64,h_vmem=750M,permas=1 \
	permas.sh

The implicit -hard switch before the first -l option has been omitted.

The notation 750M for 750 MBytes is an example of the quantity syntax of the grid engine system. For those attributes that request a memory consumption, you can specify integer decimal, floating-point decimal, integer octal, or integer hexadecimal numbers. The following multipliers can be appended to these numbers:

Octal constants are specified by a leading zero and digits ranging from 0 to 7 only. To specify a hexadecimal constant, you must prefix the number with 0x. You must also use digits ranging from 0 to 9, a through f, and A through F. If no multipliers are appended, the values are considered to count as bytes. If you are using floating-point decimals, the resulting value is truncated to an integer value.

For those attributes that impose a time limit, you can specify time values in terms of hours, minutes, or seconds, or any combination. Hours, minutes, and seconds are specified in decimal digits separated by colons. A time of 3:5:11 is translated to 11111 seconds. If zero is a specifier for hours, minutes, or seconds, you can leave it out if the colon remains. Thus a value of :5: is interpreted as 5 minutes. The form used in the Requested Resources dialog box that is shown in Figure 3–7 is an extension, which is valid only within QMON.
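
The conversion described above is simple arithmetic: seconds = hours × 3600 + minutes × 60 + seconds. The following sh sketch reproduces it for the three-field colon form only. The function name hms_to_seconds is hypothetical, and the sketch is not the parser that the grid engine system uses.

```shell
#!/bin/sh
# Hypothetical helper: convert an hours:minutes:seconds value to seconds.
# Empty fields count as zero, so ":5:" means five minutes.
# Only the three-field colon form is handled.
hms_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

hms_to_seconds 3:5:11    # 3*3600 + 5*60 + 11 = 11111
hms_to_seconds :5:       # 5*60 = 300
```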

How the Grid Engine System Allocates Resources

As shown in the previous section, knowing how the grid engine software processes resource requests and allocates resources is important. A schematic view of the resource allocation algorithm follows.

  1. Read in and parse all default request files. See Default Request Files for details.

  2. Process the script file for embedded options. See Active Comments for details.

  3. Read all script-embedding options when the job is submitted, regardless of their position in the script file.

  4. Read and parse all requests from the command line.

As soon as all qsub requests are collected, hard and soft requests are processed separately, the hard requests first. The requests are evaluated according to the following order of precedence:

  1. From left to right of the script or default request file

  2. From top to bottom of the script or default request file

  3. From left to right of the command line

In other words, you can use the command line to override the embedded flags.
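
One way to picture this override behavior is a "last occurrence wins" rule applied to the combined argument list: default request files first, then embedded options, then the command line. The following sh sketch models only that rule for single-valued options. It is a deliberately simplified assumption, not the grid engine algorithm, and the function name last_value is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given an option name followed by a combined
# argument list, print the value of the last occurrence of that option.
last_value() {
    want=$1
    shift
    value=""
    while [ $# -gt 0 ]; do
        if [ "$1" = "$want" ] && [ $# -gt 1 ]; then
            value=$2
            shift          # skip the option name; its value is skipped below
        fi
        shift
    done
    printf '%s\n' "$value"
}

# Default request file, then embedded options, then command line:
last_value -N  -N nightly -cwd  -N flow -j y  -N Flow
```

In this example, the job name given last, on the command line, is the one that is printed.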

The resources requested as hard are allocated. If a request is not valid, the submission is rejected. If one or more requests cannot be met at submit time, the job is spooled and rescheduled to be run at a later time. A request might not be met, for example, if a requested queue is busy. If all hard requests can be met, the requests are allocated and the job can be run.

The resources requested as soft are checked. The job can run even if some or all of these requests cannot be met. If multiple queues that meet the hard requests provide parts of the soft resources list, the grid engine software selects the queues that offer the most soft requests.

The job is started and covers the allocated resources.

To learn how argument list options and embedded options, or hard and soft requests, influence each other, you can experiment with small test script files that execute UNIX commands such as hostname or date.

Job Dependencies

Often the most convenient way to build a complex task is to split the task into subtasks. In these cases, subtasks depend on the completion of other subtasks before the dependent subtasks can get started. An example is that a predecessor task produces an output file that must be read and processed by a dependent task.

The grid engine system supports interdependent tasks with its job dependency facility. You can configure jobs to depend on the completion of one or more other jobs. You request a dependency with the -hold_jid option of the qsub command. You can specify a list of jobs upon which the submitted job depends. The list of jobs can also contain subsets of array jobs. The submitted job is not eligible for execution until all jobs in the dependency list have finished.
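
As a minimal sketch of building such a dependency list, the following sh fragment joins job IDs with commas and prints the resulting qsub command instead of running it. The job IDs, the script name postprocess.sh, and the function name join_job_ids are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: join job IDs with commas to form the job list
# that is passed to -hold_jid.
join_job_ids() {
    list=""
    for id in "$@"; do
        list="$list${list:+,}$id"
    done
    printf '%s\n' "$list"
}

# Dry run: print the command instead of submitting it.
echo qsub -hold_jid "$(join_job_ids 101 102 103)" postprocess.sh
```

Remove the echo to submit the dependent job for real, once the predecessor jobs have been submitted and their IDs are known.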

Submitting Array Jobs

Parameterized and repeated execution of the same set of operations that are contained in a job script is an ideal application for the array job facility of the grid engine system. Typical examples of such applications are found in the Digital Content Creation industries for tasks such as rendering. Computation of an animation is split into frames. The same rendering computation can be performed for each frame independently.

The array job facility offers a convenient way to submit, monitor, and control such applications. The grid engine system provides an efficient implementation of array jobs, handling the computations as an array of independent tasks joined into a single job. The tasks of an array job are referenced through an array index number. The indexes for all tasks span an index range for the entire array job. The index range is defined during submission of the array job by a single qsub command.

You can monitor and control an array job. For example, you can suspend, resume, or cancel an array job as a whole or by individual task or subset of tasks. To reference the tasks, the corresponding index numbers are suffixed to the job ID. Tasks are executed very much like regular jobs. Tasks can use the environment variable SGE_TASK_ID to retrieve their own task index number and to access input data sets designated for this task identifier.
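
For illustration, a task might select its input with SGE_TASK_ID as in the following sh sketch. The sketch assumes one input record per line of a data file, which is only one possible layout. The function name record_for_task and the sample records are hypothetical, and SGE_TASK_ID is set by hand here for a dry run outside the grid engine system.

```shell
#!/bin/sh
# Hypothetical helper: print the line of a data file whose line number
# equals the task index of the current array task.
record_for_task() {
    data_file=$1
    task_id=${SGE_TASK_ID:-1}    # falls back to 1 when run outside a job
    sed -n "${task_id}p" "$data_file"
}

# Dry run with a small demonstration file.
data=$(mktemp)
printf 'frame-a\nframe-b\nframe-c\nframe-d\n' > "$data"
SGE_TASK_ID=4
record_for_task "$data"
rm -f "$data"
```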

Submitting an Array Job With QMON

Follow the instructions in How To Submit a Simple Job With QMON, additionally taking into account the following information.

Submitting an array job from QMON works almost exactly like submitting a simple job, as described in How To Submit a Simple Job With QMON. The only difference is that the Job Tasks input window that is shown in Figure 3–5 must contain the task range specification. The task range specification uses syntax that is identical to the qsub -t command. See the qsub(1) man page for detailed information about array index syntax.

For information about monitoring and controlling jobs in general, and about array jobs in particular, see Monitoring and Controlling Jobs and Monitoring and Controlling Jobs From the Command Line. See also the man pages for qstat(1), qhold(1), qrls(1), qmod(1), and qdel(1).

Array jobs offer full access to all facilities of the grid engine system that are available for regular jobs. In particular, array jobs can be parallel jobs at the same time. Array jobs also can have interdependencies with other jobs.


Note –

Array tasks cannot have interdependencies with other jobs or with other array tasks.


Submitting an Array Job From the Command Line

To submit an array job from the command line, type the qsub command with appropriate arguments.

The following is an example of how to submit an array job:


% qsub -l h_cpu=0:45:0 -t 2-10:2 render.sh data.in

The -t option defines the task index range. In this case, 2-10:2 specifies that 2 is the lowest index number, and 10 is the highest index number. Only every second index, the :2 part of the specification, is used. Thus, the array job is made up of 5 tasks with the task indices 2, 4, 6, 8, and 10. Each task requests a hard CPU time limit of 45 minutes with the -l option. Each task executes the job script render.sh once the task is dispatched and started by the grid engine system. Tasks can use SGE_TASK_ID to find their index number, which they can use to find their input data record in the data file data.in.
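
To check which task indices a range specification produces before you submit, you can expand it by hand. The following sh sketch handles the first-last:step form used in the example, with the step defaulting to 1 when it is left out. The function name expand_task_range is hypothetical, and the sketch performs no validation of its input.

```shell
#!/bin/sh
# Hypothetical helper: expand a task range such as 2-10:2 into the list
# of task indices, separated by spaces.
expand_task_range() {
    spec=$1
    case $spec in
        *:*) step=${spec##*:}; range=${spec%%:*} ;;
        *)   step=1; range=$spec ;;
    esac
    first=${range%%-*}
    last=${range##*-}
    i=$first
    out=""
    while [ "$i" -le "$last" ]; do
        out="$out${out:+ }$i"
        i=$((i + step))
    done
    printf '%s\n' "$out"
}

expand_task_range 2-10:2    # prints: 2 4 6 8 10
```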

Submitting Interactive Jobs

The submission of interactive jobs instead of batch jobs is useful in situations where a job requires your direct input to influence the job results. Such situations are typical for X Windows applications or for tasks in which your interpretation of immediate results is required to steer further processing.

You can create interactive jobs in three ways:

The default handling of interactive jobs differs from the handling of batch jobs. Interactive jobs are not queued if the jobs cannot be executed when they are submitted. Instead, the user is notified immediately that the cluster is currently too busy, which indicates that not enough appropriate resources are available to dispatch the job at the time of submission.

You can change this default behavior with the -now no option to qsh, qlogin, and qrsh. If you use this option, interactive jobs are queued like batch jobs. When you use the -now yes option, batch jobs that are submitted with qsub can also be handled like interactive jobs. Such batch jobs are either dispatched for running immediately, or they are rejected.


Note –

Interactive jobs can be run only in queues of the type INTERACTIVE. See Configuring Queues in Sun N1 Grid Engine 6.1 Administration Guide for details.


The following sections describe how to use the qlogin and qsh facilities. The qrsh command is explained in a broader context in Transparent Remote Execution.

Submitting Interactive Jobs With QMON

The only type of interactive job that you can submit from QMON is a job that brings up an xterm on a host selected by the grid engine system.

At the right side of the Submit Job dialog box, click the button above the Submit button until the Interactive icon is displayed. Doing so prepares the Submit Job dialog box to submit interactive jobs. See Figure 3–8 and Figure 3–9.

The meaning and the use of the selection options in the dialog box are the same as described for batch jobs in Submitting Batch Jobs. The difference is that several input fields are grayed out because those fields do not apply to interactive jobs.

Figure 3–8 Interactive Submit Job Dialog Box, General Tab

Dialog box titled Submit Job. Shows General tab with fields for interactive jobs filled in.

Figure 3–9 Interactive Submit Job Dialog Box, Advanced Tab

Dialog box titled Submit Job. Shows Advanced tab with fields for interactive jobs filled in.

Submitting Interactive Jobs With qsh

qsh is very similar to qsub. qsh supports several of the qsub options, as well as the additional option -display to direct the display of the xterm to be invoked. See the qsub(1) man page for details.

To submit an interactive job with qsh, type a command like the following:


% qsh -l arch=solaris64

This command starts an xterm on any available Sun Solaris 64-bit operating system host.

Submitting Interactive Jobs With qlogin

Use the qlogin command from any terminal or terminal emulation to start an interactive session under the control of the grid engine system.

To submit an interactive job with qlogin, type a command like the following:


% qlogin -l star-cd=1,h_cpu=6:0:0

This command locates a lightly loaded host that has a Star-CD license available and at least one queue that can provide a hard CPU time limit of at least six hours.


Note –

Depending on the remote login facility that is configured to be used by the grid engine system, you might have to provide your user name, your password, or both, at a login prompt.


Transparent Remote Execution

The grid engine system provides a set of closely related facilities that support the transparent remote execution of certain computational tasks. The core tool for this functionality is the qrsh command, which is described in Remote Execution With qrsh. Two high-level facilities, qtcsh and qmake, build on top of qrsh. These two commands enable the grid engine system to transparently distribute implicit computational tasks, thereby enhancing the standard UNIX facilities make and csh. qtcsh is described in Transparent Job Distribution With qtcsh. qmake is described in Parallel Makefile Processing With qmake.

Remote Execution With qrsh

qrsh is built around the standard rsh facility. See the information that is provided in sge-root/3rd_party for details on the involvement of rsh. qrsh can be used for various purposes, including the following:

By virtue of these capabilities, qrsh is the major enabling infrastructure for the implementation of the qtcsh and the qmake facilities. qrsh is also used for the tight integration of the grid engine system with parallel environments such as MPI or PVM.

Invoking Transparent Remote Execution With qrsh

Type the qrsh command, adding options and arguments according to the following syntax:


% qrsh	[options] program|shell-script [arguments] \
	[> stdout] [>&2 stderr] [< stdin]
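The following sketch shows one way a script might use this syntax. The wrapper name run_remote is hypothetical, and the fallback to local execution is an assumption of the sketch, not a feature of qrsh. The resource request -l arch=solaris64 is taken from the qsh example earlier in this chapter and might need to be changed for your cluster.

```sh
#!/bin/sh
# Hypothetical wrapper: run a command through qrsh when the qrsh binary is
# on the search path. Otherwise, run the command on the local host.
run_remote() {
    if command -v qrsh >/dev/null 2>&1; then
        # Assumed resource request; replace it with one your cluster defines.
        qrsh -l arch=solaris64 "$@"
    else
        "$@"
    fi
}

run_remote echo hello
```

Because qrsh passes standard output, error output, and standard input through, the wrapper can be redirected like any local command.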

qrsh understands almost all options of qsub. qrsh provides the following options:

Transparent Job Distribution With qtcsh

qtcsh is a fully compatible replacement for the widely known and used UNIX C shell derivative tcsh. qtcsh is built around tcsh. See the information that is provided in sge-root/3rd_party for details on the involvement of tcsh. qtcsh provides a command shell with the extension of transparently distributing execution of designated applications to suitable and lightly loaded hosts that use the grid engine system. The .qtask configuration files define the applications to execute remotely and the requirements that apply to the selection of an execution host.

Applications that are defined in this way are submitted to the grid engine system through the qrsh facility, transparently to the user. qrsh provides standard output, error output, and standard input handling as well as terminal control connection to the remotely executing application. Three noticeable differences between running such an application remotely and running the application on the same host as the shell are:

In addition to the standard use, qtcsh is a suitable platform for third-party code and tool integration. The single-application execution form of qtcsh is qtcsh -c app-name. The use of this form of qtcsh inside integration environments presents a persistent interface that almost never needs to be changed. All the required application, tool, integration, site, and even user-specific configurations are contained in appropriately defined .qtask files. A further advantage is that this interface can be used in shell scripts of any type, in C programs, and even in Java applications.

qtcsh Usage

The invocation of qtcsh is exactly the same as for tcsh. qtcsh extends tcsh in providing support for the .qtask file and by offering a set of specialized shell built-in modes.

The .qtask file is defined as follows. Each line in the file has the following format:


[!]app-name qrsh-options

The optional leading exclamation mark (!) defines the precedence between conflicting definitions in a global cluster .qtask file and the personal .qtask file of the qtcsh user. If the exclamation mark is missing in the global cluster file, a conflicting definition in the user file overrides the definition in the global cluster file. If the exclamation mark is in the global cluster file, the corresponding definition cannot be overridden.

app-name specifies the name of the application that, when typed on a command line in a qtcsh, is submitted to the grid engine system for remote execution.

qrsh-options specifies the options to the qrsh facility to use. These options define resource requirements for the application.
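The precedence rule for the exclamation mark can be sketched as follows. The function name effective_qtask and the file contents are hypothetical. The sketch only models the rule that is described above and is not how qtcsh itself is implemented. A leading exclamation mark in the user file is not handled.

```sh
#!/bin/sh
# Hypothetical helper: print the qrsh options that apply to an application.
# Usage: effective_qtask GLOBAL_QTASK_FILE USER_QTASK_FILE APP_NAME
effective_qtask() {
    awk -v app="$3" '
        # Remove the application name (and an optional "!") from a line.
        function opts(line) {
            sub(/^[ \t]*!?[^ \t]+[ \t]*/, "", line)
            return line
        }
        /^[ \t]*#/ || NF == 0 { next }       # skip comments and blank lines
        FILENAME == ARGV[1] {                  # global cluster file
            if ($1 == ("!" app)) { g = opts($0); have_g = 1; forced = 1 }
            else if ($1 == app)  { g = opts($0); have_g = 1 }
            next
        }
        $1 == app { u = opts($0); have_u = 1 } # user file
        END {
            if (have_g && forced) print g      # "!" entry cannot be overridden
            else if (have_u)      print u      # user entry overrides global
            else if (have_g)      print g
        }
    ' "$1" "$2"
}

# Example data, invented for this sketch.
dir=$(mktemp -d)
printf '%s\n' '!netscape -v DISPLAY=myhost:0' 'grep -l h=filesurfer' > "$dir/global"
printf '%s\n' 'netscape -v DISPLAY=other:0' 'grep -l h=myserver' > "$dir/user"
effective_qtask "$dir/global" "$dir/user" netscape
effective_qtask "$dir/global" "$dir/user" grep
rm -rf "$dir"
```

In this example, the netscape entry from the global file is kept because it carries the exclamation mark, while the grep entry from the user file replaces the global one.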

The application name must appear in the command line exactly as the application is defined in the .qtask file. If the application name is prefixed with a path name, a local binary is addressed. No remote execution is intended.

csh aliases are expanded before a comparison with the application names is performed. The applications intended for remote execution can also appear anywhere in a qtcsh command line, in particular before or after standard I/O redirections.

Hence, the following examples are valid and meaningful syntax:


# .qtask file
netscape -v DISPLAY=myhost:0
grep -l h=filesurfer

Given this .qtask file, the following qtcsh command lines:


netscape
~/mybin/netscape
cat very_big_file | grep pattern | sort | uniq

implicitly result in:


qrsh -v DISPLAY=myhost:0 netscape
~/mybin/netscape
cat very_big_file | qrsh -l h=filesurfer grep pattern | sort | uniq
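The rewriting shown above can be imitated with a short sketch. The function name qtask_rewrite is hypothetical. The sketch handles a single command word and its arguments, not complete pipelines, and it ignores alias expansion. It illustrates the matching rules only and does not reproduce the qtcsh implementation.

```sh
#!/bin/sh
# Hypothetical helper: print how one command would be rewritten according
# to a .qtask file. Usage: qtask_rewrite QTASK_FILE WORD [ARGS...]
qtask_rewrite() {
    file=$1; word=$2; shift 2
    prefix=""
    case $word in
        */*) ;;  # a path name addresses a local binary: leave it unchanged
        *)
            prefix=$(awk -v app="$word" '
                /^[ \t]*#/ { next }
                $1 == app || $1 == ("!" app) {
                    sub(/^[ \t]*!?[^ \t]+[ \t]*/, "")
                    out = ($0 == "") ? "qrsh" : "qrsh " $0
                    print out
                    exit
                }' "$file") ;;
    esac
    if [ -n "$prefix" ]; then
        set -- "$prefix" "$word" "$@"
    else
        set -- "$word" "$@"
    fi
    printf '%s\n' "$*"
}

qtask=$(mktemp)
cat > "$qtask" <<'EOF'
# .qtask file
netscape -v DISPLAY=myhost:0
grep -l h=filesurfer
EOF
qtask_rewrite "$qtask" netscape
qtask_rewrite "$qtask" '~/mybin/netscape'
qtask_rewrite "$qtask" grep pattern
rm -f "$qtask"
```

The three calls correspond to the three command lines in the example: the plain application name is prefixed with qrsh and its options, and the path-qualified name is left unchanged.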

qtcsh can operate in different modes, influenced by switches that can be set on or off:

The setting of these modes can be changed using option arguments of qtcsh at start time or with the shell built-in command qrshmode at runtime. See the qtcsh(1) man page for more information.

Parallel Makefile Processing With qmake

qmake is a replacement for the standard UNIX make facility. qmake extends make by enabling the distribution of independent make steps across a cluster of suitable machines. qmake is built around the popular GNU-make facility gmake. See the information that is provided in sge-root/3rd_party for details on the involvement of gmake.

To ensure that a distributed make process can run to completion, qmake first allocates the required resources in a way analogous to a parallel job. qmake then manages this set of resources without further interaction with the scheduler. qmake distributes make steps as resources become available, using the qrsh facility with the -inherit option.

qrsh provides standard output, error output, and standard input handling as well as terminal control connection to the remotely executing make step. Therefore, only three noticeable differences exist between executing a make procedure locally and using qmake:

The most common use of make is the compilation of complex software packages. Compilation might not be the major application for qmake, however. Program files are often quite small as a matter of good programming practice. Therefore, compilation of a single program file, which is a single make step, often takes only a few seconds. Furthermore, compilation usually implies significant file access, which nested include files can increase further. File access might not be accelerated if done for multiple make steps in parallel because the file server can become a bottleneck. Such a bottleneck effectively serializes all the file access. Therefore, the compilation process sometimes cannot be accelerated in a satisfactory manner.

Other potential applications of qmake are more appropriate. An example is the steering of the interdependencies and the workflow of complex analysis tasks through makefiles. Each make step in such environments is typically a simulation or data analysis operation with nonnegligible resource and computation time requirements. A considerable acceleration can be achieved in such cases.

qmake Usage

The command-line syntax of qmake looks similar to the syntax of qrsh:


% qmake [-pe pe-name pe-range][options] \
 -- [gnu-make-options][target]

Note –

The -inherit option is also supported by qmake, as described later in this section.


Pay special attention to the use of the -pe option and its relation to the gmake -j option. You can use both options to express the amount of parallelism to be achieved. The difference is that gmake provides no possibility with -j to specify something like a parallel environment to use. Therefore, qmake assumes that a default environment for parallel makes is configured that is called make. Furthermore, the gmake -j option allows for no specification of a range, but only for a single number. qmake interprets the number that is given with -j as a range of 1-n. By contrast, -pe permits the detailed specification of all these parameters. Consequently, the following command-line examples are equivalent:


% qmake -- -j 10
% qmake -pe make 1-10 --

The following command lines cannot be expressed using the -j option:


% qmake -pe make 5-10,16 --
% qmake -pe mpi 1-99999 --
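The mapping between the two forms can be written down as a small sketch. The function name j_to_pe is hypothetical. The sketch prints the -pe form that the text describes as equivalent to a given -j value, and it assumes the default parallel environment named make.

```sh
#!/bin/sh
# Hypothetical helper: print the -pe form that corresponds to
# "qmake -- -j N", where N is interpreted as the range 1-N.
j_to_pe() {
    case $1 in
        ''|*[!0-9]*)
            echo "j_to_pe: expected a positive integer" >&2
            return 1 ;;
    esac
    if [ "$1" -lt 1 ]; then
        echo "j_to_pe: expected a positive integer" >&2
        return 1
    fi
    printf 'qmake -pe make 1-%s --\n' "$1"
}

j_to_pe 10
```

No such mapping exists in the other direction for requests like 5-10,16 or for a parallel environment other than make, which is why those command lines need the -pe option.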

Apart from the syntax, qmake supports two modes of invocation: interactively from the command line without the -inherit option, or within a batch job with the -inherit option. These two modes start different sequences of actions:

See the qmake(1) man page for further details.

How Jobs Are Scheduled

The grid engine software's policy management automatically controls the use of shared resources in the cluster to best achieve the goals of the administration. High priority jobs are dispatched preferentially. Such jobs receive better access to resources. The administration of a cluster can define high-level usage policies. The following policies are available:

The grid engine software can be set up to routinely use either a share-based policy, a functional policy, or both. These policies can be combined in any proportion, from giving zero weight to one policy and using only the second policy, to giving both policies equal weight.

Along with the routine policies, jobs can be submitted with an initiation deadline. See the description of the deadline submission parameter under Submitting Advanced Jobs With QMON. Deadline jobs disturb routine scheduling. Administrators can also temporarily override share-based scheduling and functional scheduling. An override can be applied to an individual job, or to all jobs associated with a user, a department, or a project.

Job Priorities

In addition to the four policies for mediating among all jobs, the grid engine software sometimes lets users set priorities among their own jobs. A user who submits several jobs can specify, for example, that job 3 is the most important and that jobs 1 and 2 are equally important but less important than job 3.

Priorities for jobs are set by using the QMON Submit Job parameter Priority or by using the qsub -p option. A priority range of -1024 (lowest) to 1023 (highest) can be given. This priority tells the scheduler how to choose among a single user's jobs when several of that user's jobs are in the system simultaneously. The relative importance assigned to a particular job depends on the maximum and minimum priorities that are given to any of that user's jobs, and on the priority value of the specific job.
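The following sketch checks a priority value against the range given above and prints the matching qsub command line. The function name priority_cmd and the script names are hypothetical, and the bounds are copied from this section, so check the qsub(1) man page for the range that your installation accepts. The example uses 0 and a negative value on the assumption that ordinary users might not be permitted to raise priorities.

```sh
#!/bin/sh
# Bounds as stated in this section (assumption: verify against qsub(1)).
PRIO_MIN=-1024
PRIO_MAX=1023

# Hypothetical helper: validate a priority and print the qsub command line.
# Usage: priority_cmd PRIORITY SCRIPT [ARGS...]
priority_cmd() {
    prio=$1; shift
    case $prio in
        ''|-|*[!0-9-]*|?*-*)
            echo "priority_cmd: not an integer: $prio" >&2
            return 1 ;;
    esac
    if [ "$prio" -lt "$PRIO_MIN" ] || [ "$prio" -gt "$PRIO_MAX" ]; then
        echo "priority_cmd: $prio is outside $PRIO_MIN..$PRIO_MAX" >&2
        return 1
    fi
    printf 'qsub -p %s %s\n' "$prio" "$*"
}

# Jobs 1 and 2 are equally important; job 3 is the most important.
priority_cmd -100 job1.sh
priority_cmd -100 job2.sh
priority_cmd 0 job3.sh
```

The sketch prints the commands only. Remove the printf indirection if you want to submit the jobs directly.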

Ticket Policies

The functional policy, the share-based policy, and the override policy are all implemented with tickets. Each ticket policy has a ticket pool from which tickets are allocated to jobs that are entering the multimachine grid engine system. Each routine ticket policy that is in force allocates some tickets to each new job. The ticket policy can reallocate tickets to the executing job at each scheduling interval. The criteria that each ticket policy uses to allocate tickets are explained in this section.

Tickets weight the three policies. For example, if no tickets are allocated to the functional policy, that policy is not used. If an equal number of tickets are assigned to the functional ticket pool and to the share-based ticket pool, both policies have equal weight in determining a job's importance.
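The effect of the ticket pools on the relative weight of the two routine policies can be illustrated with simple arithmetic. The function name policy_weights and the ticket counts are invented for this sketch. The calculation shows the proportion only and does not reproduce the scheduler's ticket algorithm.

```sh
#!/bin/sh
# Hypothetical helper: print the percentage weight of the functional and
# the share-based policy, given the number of tickets in each pool.
# Usage: policy_weights FUNCTIONAL_TICKETS SHARE_TICKETS
policy_weights() {
    func=$1; share=$2
    total=$((func + share))
    if [ "$total" -eq 0 ]; then
        # No tickets in either pool: neither policy is used.
        echo "functional=0% share-based=0%"
        return 0
    fi
    echo "functional=$((100 * func / total))% share-based=$((100 * share / total))%"
}

policy_weights 5000 5000    # equal pools: both policies have equal weight
policy_weights 0 10000      # no functional tickets: policy is not used
```

The percentages use integer division, so they are rounded down.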

Grid engine managers allocate tickets to the routine ticket policies at system configuration. Managers and operators can change ticket allocations at any time. Additional tickets can be injected into the system temporarily to indicate an override. Ticket policies are combined by assignment of tickets: when tickets are allocated to multiple ticket policies, a job gets a portion of its tickets from each ticket policy in force.

The grid engine system grants tickets to jobs that are entering the system to indicate their importance under each ticket policy in force. Each running job can gain tickets, for example, from an override; lose tickets, for example, because the job is getting more than its fair share of resources; or keep the same number of tickets at each scheduling interval. The number of tickets that a job holds represents the resource share that the grid engine system tries to grant that job during each scheduling interval.

You can display the number of tickets that a job holds with QMON or with qstat -ext. See Monitoring and Controlling Jobs With QMON. The qstat command also displays the priority value that was assigned to a job, for example, with qsub -p. See the qstat(1) man page for more details.

Queue Selection

The grid engine system does not dispatch jobs that request nonspecific queues if the jobs cannot be started immediately. Such jobs are marked as spooled at the sge_qmaster, which tries to reschedule the jobs from time to time. The jobs are dispatched to the next suitable queue that becomes available.

In contrast to spooled jobs, jobs that are submitted to a certain queue by name go directly to the named queue, regardless of whether the jobs can be started or need to be spooled. Therefore, viewing the queues of the grid engine system as computer science batch queues is valid only for jobs requested by name. Jobs submitted with nonspecific requests use the spooling mechanism of sge_qmaster for queuing, thus using a more abstract and flexible queuing concept.

If a job is scheduled and multiple free queues meet its resource requests, the job is usually dispatched to a suitable queue belonging to the least loaded host. By setting the scheduler configuration entry queue_sort_method to seq_no, the cluster administration can change this load-dependent scheme into a fixed order algorithm. The queue configuration entry seq_no defines a precedence among the queues, assigning the highest priority to the queue with the lowest sequence number.
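The two selection schemes can be compared with a small sketch. The function name pick_queue, the input format, and the queue data are invented for illustration. The sketch treats load as a single number per queue, which is a simplification, because the way the scheduler evaluates load depends on its configuration.

```sh
#!/bin/sh
# Hypothetical helper: choose one queue from candidate lines on standard
# input. Each line has the invented format "queue-name seq_no host-load".
# Usage: pick_queue load|seqno
pick_queue() {
    case $1 in
        load)  col=3 ;;   # least loaded host wins
        seqno) col=2 ;;   # lowest sequence number wins
        *)
            echo "usage: pick_queue load|seqno" >&2
            return 1 ;;
    esac
    awk -v col="$col" '
        NF >= 3 && (best == "" || $col + 0 < min) { best = $1; min = $col + 0 }
        END { if (best != "") print best }
    '
}

# Example candidates that all meet the resource requests of a job.
queues='fast.q 20 0.15
big.q 10 0.80
test.q 30 0.40'

printf '%s\n' "$queues" | pick_queue load
printf '%s\n' "$queues" | pick_queue seqno
```

With the load-dependent scheme the sketch selects fast.q, and with the fixed order it selects big.q, because big.q has the lowest sequence number.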