APPENDIX C

Sun Grid Engine Reference

This appendix provides basic information about the Sun Grid Engine commands and options. More thorough information is available in the Sun Grid Engine documentation. See Related Documentation.

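Topics in this section include:

Accessing the Sun Grid Engine Environment
Setting Up the Sun Grid Engine Environment Variables
Basic Sun Grid Engine Commands
qsub and qrsh Commands
Example Sun Grid Engine Job Script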


Accessing the Sun Grid Engine Environment

To access Sun Grid Engine, the client host NFS mounts the Sun Grid Engine installation. Your client host should mount Sun Grid Engine so that you can use the same $SGE_ROOT as the NFS server does. (The default is /gridware/sge.)


To Access the Sun Grid Engine Environment

1. Test the accessibility of the $SGE_ROOT directory from your client host:


# ls /net/nfsserverhostname/gridware

where /gridware is the base directory of your $SGE_ROOT.

2. From your client host, access the NFS server’s $SGE_ROOT as the client’s own $SGE_ROOT using /etc/vfstab, /etc/fstab, or automount.



Note - Client hosts must not mount the NFS server with the nosuid option, because rlogin and rsh need setuid.


a. Add the following line to the /etc/auto_direct file:


/gridware  -rw,suid,bg,hard,noquota,intr nfsserverhostname:/gridware

where /gridware is the base directory of your $SGE_ROOT.

b. Restart the automounter:



Note - The easiest method to automount every file system from the NFS server is to create a symbolic link. For example:
# ln -s /net/nfsserverhostname/$SGE_ROOT $SGE_ROOT
However, you must ensure that such a mount allows suid access.


a. Add the following line to the /etc/fstab file:


nfsserverhostname:/gridware    /gridware  nfs        auto,suid,bg,intr          0 0

b. Type these two commands:


# mkdir /gridware
# mount /gridware

where /gridware is the base directory of your $SGE_ROOT.



Note - If you use NIS to resolve host names, add the server’s name to the /etc/hosts file and ensure that files appears in the hosts entry in /etc/nsswitch.conf.


3. If your grid installation requires it, copy the sge_qmaster line from the server’s /etc/services file into your client’s /etc/services file.

This step is not needed if the SGE settings files set the SGE_QMASTER_PORT environment variable. See Setting Up the Sun Grid Engine Environment Variables.
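The three steps above can be spot-checked from the client with a few small shell helpers. This is a minimal sketch, not part of Sun Grid Engine: the function names are hypothetical, and the checks only cover directory access, the mount option string, and the sge_qmaster entry in a services file.

```shell
# Hypothetical helper functions; the names are illustrative, not part of SGE.

# Step 1: succeed if the directory exists and can be listed.
sge_root_accessible() {
    [ -d "$1" ] && ls "$1" >/dev/null 2>&1
}

# Step 2: succeed if a comma-separated mount option string contains nosuid.
has_nosuid() {
    case ",$1," in
        *,nosuid,*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Step 3: print the sge_qmaster entry from a services file
# (defaults to /etc/services when no file is given).
qmaster_service_line() {
    grep '^sge_qmaster[[:space:]]' "${1:-/etc/services}"
}
```

For example, `has_nosuid "rw,suid,bg,hard,noquota,intr"` fails (returns nonzero), which is the desired result for the automount options shown in step 2.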


Setting Up the Sun Grid Engine Environment Variables


To Set Up the Sun Grid Engine Environment Variables

Set up the Sun Grid Engine environment variables by sourcing the SGE settings file for your login shell, using your value for $SGE_ROOT (the default is /gridware/sge).



Note - These commands add $SGE_ROOT/bin/$ARCH to $path, add $SGE_ROOT/man to $MANPATH, set $SGE_ROOT, and, if needed, set $SGE_CELL (usually default). They typically also set your SGE_QMASTER_PORT environment variable.


You might want to insert such a command in your login configuration file, guarded by a test that the settings file exists and is readable.
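A guarded version for Bourne-compatible login files might look like the following sketch. It assumes the settings file is at /gridware/sge/default/common/settings.sh, that is, the default $SGE_ROOT with a cell named default; the function name load_sge_settings is hypothetical. C shell users would source the corresponding settings.csh file instead.

```shell
# Source the SGE settings file only if it is readable.
# Usage: load_sge_settings [SETTINGS_FILE]
# The default path below is an assumption; adjust it for your site.
load_sge_settings() {
    _sge_settings=${1:-/gridware/sge/default/common/settings.sh}
    if [ -r "$_sge_settings" ]; then
        # Source (do not execute) the file so the variables persist in this shell.
        . "$_sge_settings"
    else
        return 1
    fi
}

# At login, a missing settings file is not treated as an error.
load_sge_settings || :
```

Wrapping the test in a function keeps the login file quiet on hosts where the NFS mount is temporarily unavailable.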


Basic Sun Grid Engine Commands

TABLE C-1 provides a brief description of the basic Sun Grid Engine commands.


TABLE C-1 Basic Sun Grid Engine Commands

Command     Description

qmon &      Starts a graphical user interface (GUI) for displaying the Sun Grid Engine
            state and for submitting jobs. A Sun Grid Engine administrator can also use
            the GUI to alter the state of Sun Grid Engine.

qstat       Shows jobs that you have submitted but that are not yet complete.

qstat -f    Shows available queues and execution hosts, the architecture (operating
            system and processor type), the current state (au means unavailable), all
            running jobs, and other information.

qsub        Submits a job for future execution. The job might need to wait until the
            necessary resources are available. Job output is saved in files.

qrsh        Submits an interactive job. If the job cannot start immediately, you are
            told to try again later. The job is not queued. Job output goes to the
            invoking window.


If SGE commands such as qstat are still not found after setting up the environment, have your system administrator verify that the NFS server contains binaries for your client’s architecture (operating system and processor, as output by SGE’s arch command at $SGE_ROOT/util/arch). For example, if arch prints sol-amd64, then $SGE_ROOT/bin should contain a subdirectory named sol-amd64.
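The architecture check described above can be scripted. The following is a minimal sketch with a hypothetical function name; it only tests for the presence of the $SGE_ROOT/bin subdirectory that matches a given architecture string.

```shell
# Succeed if the SGE installation contains binaries for the given architecture.
# Usage: has_arch_binaries ARCH [SGE_ROOT_DIR]
# The second argument defaults to the current value of $SGE_ROOT.
has_arch_binaries() {
    [ -d "${2:-$SGE_ROOT}/bin/$1" ]
}

# Typical use on a client, with the architecture reported by SGE's arch script:
#   has_arch_binaries "$("$SGE_ROOT/util/arch")" ||
#       echo "No SGE binaries for this host; contact your administrator." >&2
```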

Output from a qsub job is redirected to a file, whereas qrsh output is not. By default, this file is $HOME/JobName.oJobId. Any error output is likewise saved in the error file, which defaults to $HOME/JobName.eJobId. Other differences between qsub and qrsh are presented in TABLE C-3. Both qsub and qrsh require an absolute (starting with /) or relative path to the program or script to be submitted. $PATH is not searched to locate the program or script.
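To make the default file names concrete, the following sketch prints the output and error file paths for a given job name and job ID. The function name is hypothetical, and the job name and ID in the usage comment are made-up values.

```shell
# Print the default qsub output and error file paths, one per line.
# Usage: job_output_files JOB_NAME JOB_ID
job_output_files() {
    echo "$HOME/$1.o$2"
    echo "$HOME/$1.e$2"
}

# Example: for a job named glxspheres that was assigned job ID 1234,
#   job_output_files glxspheres 1234
# prints the paths $HOME/glxspheres.o1234 and $HOME/glxspheres.e1234.
```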


qsub and qrsh Commands

The qsub command starts batch jobs at a later time. The qrsh command runs jobs interactively.

Some Common qsub and qrsh Options

TABLE C-2 provides command options common to both qsub and qrsh.


TABLE C-2 Common qsub and qrsh Options

Option                  Description

-v variable             Names an environment variable whose value is copied from the
                        current shell to the job. You can also use -v variable=value to
                        assign the value that should be saved with the job for that
                        variable.

-q queue-name           Enables you to demand that your job execute on a particular queue.
                        Using wildcards such as "*@myserver", you can demand any queue on
                        a certain host without specifying which queue. Quoting is needed
                        to pass the wildcard characters to qsub, rather than having the
                        characters expanded by your interactive shell.

-l resource=value[,resource=value]...
                        Specifies Sun Grid Engine job resource attributes, such as the
                        following:

  graphics=1            Allocates use of a graphics accelerator.

  arch=string           Where string identifies the processor and operating system.
                        For example:

                          sol-sparc64    Solaris on SPARC (64-bit)
                          sol-sparc      Solaris on SPARC (32-bit)
                          sol-amd64      Solaris on x64 (64-bit)
                          sol-x86        Solaris on x86 (32-bit)
                          lx24-amd64     Linux (2.4 or 2.6 kernel) on x64 (64-bit)
                          lx24-x86       Linux (2.4 or 2.6 kernel) on x86 (32-bit)

                        Wildcarding is supported, if quoted to keep the submit shell from
                        expanding the wildcards. For example:

                          "sol-sparc*"   Solaris on SPARC (32-bit or 64-bit)
                          "*-x86"        Solaris or Linux on x86 (32-bit)
                          "lx24-*"       Linux (2.4 or 2.6 kernel, 32-bit or 64-bit)

  h_rt=hour:minute:seconds
  s_rt=hour:minute:seconds
                        Hard (h_rt) and soft (s_rt) runtime limits. When the hard runtime
                        limit is reached, Sun Grid Engine aborts the job using the SIGKILL
                        signal. When the soft limit is reached, Sun Grid Engine warns the
                        job by sending it the SIGUSR1 signal. The warning is effective
                        only if the job catches and handles that signal.

                        Jobs that do not specify an elapsed time limit inherit a system
                        default. The default is necessary for the Advance Reservation
                        system to assure resource availability.
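Because the s_rt warning helps only if the job handles SIGUSR1, a job script needs a signal handler. The following is a minimal sketch of such a script: the limit values, the handler name, and the cleanup action are all hypothetical, and a real job would do its work after the trap is installed.

```shell
#!/bin/sh
# Sketch of a job script that reacts to the soft runtime limit warning.
# The limits below are example values: warn at 55 minutes, kill at 1 hour.
#$ -l s_rt=0:55:00,h_rt=1:00:00

warned=0

# Handler for the SIGUSR1 warning that SGE sends when s_rt is reached.
on_soft_limit() {
    warned=1
    echo "soft runtime limit reached; saving state" 1>&2
    # Hypothetical cleanup: checkpoint or flush partial results here.
}
trap on_soft_limit USR1

# ... the job's real work would go here ...
```

Setting s_rt somewhat shorter than h_rt leaves the handler time to finish before SIGKILL arrives, which cannot be caught.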


Different Default Behavior of qsub and qrsh

Although both the qsub and qrsh commands start jobs, their default behaviors differ. TABLE C-3 presents the differences in the qsub and qrsh defaults for certain options.


TABLE C-3 Differences in qsub and qrsh Command Options

Option       Mnemonic   qsub Default   qrsh Default    Behavior
                        (batch job)    (interactive)

-now [yn]    now        n              y               If the job cannot run immediately:
                                                       y = Submission fails.
                                                       n = Spool the job for later.

-b [yn]      binary     n              y               n = Target script file is copied into
                                                           the job and scanned for #$ options
                                                           (job default functions).
                                                       y = Neither of these events happens.

-w [ewnv]    warn       n              e               e (error) = Fail submit if job cannot run.
                                                       w (warning) = Print message if job cannot run.
                                                       n (none) = Enqueue syntactically valid jobs.
                                                       v (verify) = Explain any reason job cannot run.




Note - Use the -w option of qsub or qrsh to obtain more information about why Sun Grid Engine cannot schedule a job to run.
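As an illustration of overriding these defaults, the sketch below wraps qsub with the -w v setting from TABLE C-3. The wrapper name why_not and the job script name in the usage comments are hypothetical.

```shell
# Ask Sun Grid Engine to explain any reason the job cannot run,
# by passing -w v ahead of the caller's own qsub arguments.
# Usage: why_not [QSUB_OPTIONS] ./jobscript [ARGS]
why_not() {
    qsub -w v "$@"
}

# Example uses (myjob.sh and myprogram are placeholder names):
#   why_not -l gfx=1 ./myjob.sh
#   qrsh -now n ./myprogram    # spool the interactive job instead of failing
```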



Example Sun Grid Engine Job Script

The following example job script starts /opt/VirtualGL/bin/glxspheres on a Solaris or Linux graphics server. This script is a simplified version of $SGE_ROOT/graphics/RUN.glxspheres. Commentary in this listing appears as shell comments.


#!/bin/sh
# This script is interpreted by the Bourne shell, sh.
#
# The name of my job:
#$ -N glxspheres
#
# The interpreter SGE must use
# (Sun Grid Engine always uses sh to interpret this script):
#$ -S /bin/sh
#
# Join stdout and stderr:
#$ -j y
#
# This job needs a graphics device
# (allocate a graphics resource to this job):
#$ -l gfx=1
#
# Specify that these environment variables are to be sent to SGE with the job:
#$ -v DISPLAY
#$ -v VGL_CLIENT
#$ -v VGL_GAMMA
#$ -v VGL_GLLIB
#$ -v VGL_SPOIL
#$ -v VGL_X11LIB
#$ -v SSH_CLIENT
# If these variables are not set before qsub/qrsh is invoked,
# then the job will find these variables set, but with a null string value ("").
#
# Script can run on what systems?
# Solaris (SPARC or x86, 32-bit or 64-bit) and Linux systems (32- or 64-bit),
# provided glxspheres is installed on the target system in one of the paths below.
#$ -l arch=sol-sparc|sol-sparc64|sol-x86|sol-amd64|lx24-x86|lx24-amd64

# If VGL_DISPLAY is set by SGE, then run program with vglrun. Otherwise don't.
if [ "${VGL_DISPLAY+set}" ]; then
    # VGL_DISPLAY is set (even if null), so use vglrun to launch the application.
    VGLRUN=/opt/VirtualGL/bin/vglrun
    if [ ! -x $VGLRUN ]; then
        echo 1>&2 "vglrun not found on host ${HOSTNAME:=`hostname`}"
        exit 1
    fi
else
    VGLRUN=""
fi
 
if  [ -x /opt/VirtualGL/bin/glxspheres ]; then
    path=/opt/VirtualGL/bin/glxspheres
else
    echo 1>&2 "glxspheres not found on host ${HOSTNAME:=`hostname`}"
    exit 2
fi
    
# Sun Grid Engine job starts vglrun which starts glxspheres
# with any arguments passed to this script.  If VGL_DISPLAY is not set,
# $VGLRUN will be the empty string, and vglrun won't be invoked.
$VGLRUN "$path" "$@"
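When a script such as this one is submitted with -b n, the lines that begin with #$ are what Sun Grid Engine scans for options. To preview those embedded options before submitting, a small filter can help. This is a sketch only: the function name is hypothetical, and it assumes the default #$ prefix shown in the listing.

```shell
# Print the options embedded in a job script, one "#$" line per output line,
# with the "#$" prefix and any following whitespace removed.
# Usage: embedded_options SCRIPT
embedded_options() {
    sed -n 's/^#\$[[:space:]]*//p' "$1"
}
```

Ordinary comment lines (those that start with # but not #$) are ignored, so only the submit options appear in the output.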