Sun N1 Grid Engine 6.1 User's Guide

Submitting Batch Jobs

The following sections describe how to submit more complex jobs through the grid engine system.

About Shell Scripts

Shell scripts, also called batch jobs, are a sequence of command-line instructions that are assembled in a file. Script files are made executable by the chmod command. If scripts are invoked, a command interpreter is started. Each instruction is interpreted as if the instruction were typed manually by the user who is running the script. csh, tcsh, sh, or ksh are typical command interpreters. You can invoke arbitrary commands, applications, and other shell scripts from within a shell script.

The command interpreter can be invoked as login shell. To do so, the name of the command interpreter must be contained in the login_shells list of the grid engine system configuration that is in effect for the particular host and queue that is running the job.


Note –

The grid engine system configuration might be different for the various hosts and queues configured in your cluster. You can display the effective configurations with the -sconf and -sq options of the qconf command. For detailed information, see the qconf(1) man page.


If the command interpreter is invoked as login shell, the environment of your job is the same as if you logged in and ran the script. In using csh, for example, .login and .cshrc are executed in addition to the system default startup resource files, such as /etc/login, whereas only .cshrc is executed if csh is not invoked as login-shell. For a description of the difference between being invoked and not being invoked as login-shell, see the man page of your command interpreter.

Example of a Shell Script

Example 3–1 is a simple shell script. The script first compiles the application flow from its Fortran77 source and then runs the application.


Example 3–1 Simple Shell Script


#!/bin/csh
# This is a sample script file for compiling and 
# running a sample FORTRAN program under N1 Grid Engine 6 
cd TEST
# Now we need to compile the program "flow.f" and
# name the executable "flow".
f77 flow.f -o flow

Your local system user's guide provides detailed information about building and customizing shell scripts. You might also want to look at the sh, ksh, csh, or tcsh man page. The following sections emphasize special things that you should consider when you prepare batch scripts for the grid engine system.

In general, you can submit to the grid engine system all shell scripts that you can run from your command prompt by hand. Such shell scripts must not require a terminal connection, and the scripts must not need interactive user intervention. The exceptions are the standard error and standard output devices, which are automatically redirected. Therefore, Example 3–1 is ready to be submitted to the grid engine system and the script will perform the desired action.

Extensions to Regular Shell Scripts

Some extensions to regular shell scripts influence the behavior of scripts that run under grid engine system control. The following sections describe these extensions.

How a Command Interpreter Is Selected

At submit time, you can specify the command interpreter to use to process the job script file as shown in Figure 3–5. However, if nothing is specified, the configuration variable shell_start_mode determines how the command interpreter is selected:

Output Redirection

Since batch jobs do not have a terminal connection, their standard output and their standard error output must be redirected into files. The grid engine system enables the user to define the location of the files to which the output is redirected. Defaults are used if no output files are specified.

The standard location for the files is in the current working directory where the jobs run. The default standard output file name is job-name.ojob-id, the default standard error output is redirected to job-name>.ejob-id. The job-name can be built from the script file name, or defined by the user. See, for example, the -N option in the submit(1) man page. job-id is a unique identifier that is assigned to the job by the grid engine system.

For array job tasks , the task identifier is added to these filenames, separated by a dot. The resulting standard redirection paths are job-name.ojob-id.task-id> and job-name.ejob-id.task-id. For more information, see Submitting Array Jobs.

In case the standard locations are not suitable, the user can specify output directions with QMON, as shown in Figure 3–6. Or the user can use the -e and -o options to the qsub command to specify output directions. Standard output and standard error output can be merged into one file. The redirections can be specified on a per execution host basis, in which case, the location of the output redirection file depends on the host on which the job is executed. To build custom but unique redirection file paths, use dummy environment variables together with the qsub -e and -o options. A list of these variables follows.

When the job runs, these variables are expanded into the actual values, and the redirection path is built with these values.

See the qsub(1) man page for further details.

Active Comments

Lines with a leading # sign are treated as comments in shell scripts. However, the grid engine system recognizes special comment lines and uses these lines in a special way. The special comment script line is treated as part of the command line argument list of the qsub command. The qsub options that are supplied within these special comment lines are also interpreted by the QMON Submit Job dialog box. The corresponding parameters are preset when a script file is selected.

By default, the special comment lines are identified by the #$ prefix string. You can redefine the prefix string with the qsub -C command.

This use of special comments is called script embedding of submit arguments. The following example shows a script file that uses script-embedded command-line options.


Example 3–2 Using Script-Embedded Command Line Options


#!/bin/csh

#Force csh if not Grid Engine default 
#shell

#$ -S /bin/csh

# This is a sample script file for compiling and
# running a sample FORTRAN program under N1 Grid Engine 6
# We want Grid Engine to send mail
# when the job begins
# and when it ends.

#$ -M EmailAddress
#$ -m b e

# We want to name the file for the standard output
# and standard error.

#$ -o flow.out -j y

# Change to the directory where the files are located.

cd TEST

# Now we need to compile the program "flow.f" and
# name the executable "flow".

f77 flow.f -o flow

# Once it is compiled, we can run the program.

flow

Environment Variables

When a job runs, several variables are preset into the job's environment.