This section explains the grid engine system's load parameters. Instructions are included for writing your own load sensors.
By default, sge_execd periodically reports several load parameters and their corresponding values to sge_qmaster. These values are stored in the sge_qmaster internal host object, which is described in About Hosts and Daemons. However, the values are used internally only if a complex resource attribute with a corresponding name is defined. Such complex resource attributes contain the definition as to how load values are to be interpreted. See Assigning Resource Attributes to Queues, Hosts, and the Global Cluster for more information.
After the primary installation, a standard set of load parameters is reported. All attributes required for the standard load parameters are defined as host-related attributes. Subsequent releases of N1 Grid Engine 6.1 software may provide extended sets of default load parameters. For this reason, the set of load parameters that is reported by default is documented in the file sge-root/doc/load_parameters.asc.
How load attributes are defined determines their accessibility. By defining load parameters as global resource attributes, you make them available for the entire cluster and for all hosts. By defining load parameters as host-related attributes, you provide the attributes for all hosts but not for the global cluster.
Do not define load attributes as queue attributes. Queue attributes would not be available to any host or to the cluster.
The set of default load parameters might not be adequate to completely describe the load situation in a cluster. This possibility is especially likely with respect to site-specific policies, applications, and configurations. Therefore, the grid engine software provides the means to extend the set of load parameters. For this purpose, sge_execd offers an interface through which additional load parameters and their current load values can be fed into sge_execd. Afterwards, these parameters are treated like the default load parameters. As with the default load parameters, corresponding attributes must be defined in the complex for the site-specific load parameters to become effective. See Default Load Parameters for more information.
To feed sge_execd with additional load information, you must supply a load sensor. The load sensor can be a script or a binary executable. In either case, the load sensor's handling of the standard input and standard output streams and its control flow must comply with the following rules:
The load sensor must be written as an infinite loop that waits at a certain point for input from STDIN.
If the string quit is read from STDIN, the load sensor is supposed to exit.
As soon as an end-of-line is read from STDIN, a retrieval cycle for loading data is supposed to start.
The load sensor then performs whatever operation is necessary to compute the desired load figures. At the end of the cycle, the load sensor writes the result to STDOUT.
If load retrieval takes a long time, the load measurement process can be started immediately after sending a load report. When the next end-of-line is received, the load values are then available to be sent.
The load sensor must write its load value reports in the following format:
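The last rule can be sketched as follows. This is a minimal illustration, not a supplied component: the function names run_sensor and slow_measure are hypothetical, and slow_measure stands in for whatever expensive probe a site needs. The sensor answers each request with the value measured during the previous cycle, then starts the next measurement right after the report is written.

```sh
#!/bin/sh
# Sketch only: slow_measure is a hypothetical stand-in for a slow probe.
slow_measure() {
    # Replace with the real measurement; here, count distinct logged-in users.
    who | cut -f1 -d" " | sort | uniq | wc -l | sed "s/^ *//"
}

# run_sensor: request loop that always replies with an already-measured value.
run_sensor() {
    myhost=`uname -n`
    value=`slow_measure`            # prime the cache before the first request
    while :; do
        read input || return 1      # EOF or read error: stop
        if [ "$input" = quit ]; then
            return 0
        fi
        # Reply immediately with the value from the previous cycle.
        echo begin
        echo "$myhost:logins:$value"
        echo end
        # Start the next measurement right after sending the report.
        value=`slow_measure`
    done
}
```

A real load sensor script built this way would end with a call to run_sensor. The trade-off is that each reported value is one cycle old, which is usually acceptable when the measurement itself takes longer than a reply should.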
A load value report starts with a line that contains nothing but the word begin.
Individual load values are separated by newlines.
Each load value consists of three parts separated by colons (:) and contains no blanks.
The first part of a load value is either the name of the host for which load is reported or the special name global.
The second part of a load value is the symbolic name of the load value, as defined in the complex. See the complex(5) man page for details. If a load value is reported for which no entry in the complex exists, the reported load value is not used.
The third part of a load value is the measured load value.
A load value report ends with a line that contains nothing but the word end.
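The report format above can be sketched as a small shell helper. The function name emit_report, the host name node01, and the load value name license_count are hypothetical and serve only as illustration. The name logins matches the example used later in this section.

```sh
#!/bin/sh
# emit_report HOST NAME VALUE [HOST NAME VALUE ...]
# Writes one load value report: begin, one host:name:value line per triple, end.
emit_report() {
    echo begin
    while [ $# -ge 3 ]; do
        echo "$1:$2:$3"
        shift 3
    done
    echo end
}

# One host-specific value and one cluster-wide value in the same report.
emit_report node01 logins 3 global license_count 12
```

The call at the end prints four lines: begin, node01:logins:3, global:license_count:12, and end.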
The following example shows a load sensor. The load sensor is a Bourne shell script.
#!/bin/sh

myhost=`uname -n`

while [ 1 ]; do
    # wait for input
    read input
    result=$?
    if [ $result != 0 ]; then
        exit 1
    fi
    # quote $input so that an empty line does not break the test
    if [ "$input" = quit ]; then
        exit 0
    fi
    # send users logged in
    logins=`who | cut -f1 -d" " | sort | uniq | wc -l | sed "s/^ *//"`
    echo begin
    echo "$myhost:logins:$logins"
    echo end
done

# we never get here
exit 0
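Save this script to the file load.sh. Assign executable permission to the file with the chmod command. To test the script interactively from the command line, type ./load.sh in the directory that contains the file and repeatedly press the Return key. Each press of the Return key should produce one load value report. Type quit to end the script.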
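The interactive test can be complemented by an automated format check. The following is a minimal sketch under the format rules listed earlier. The function name check_report is hypothetical, and the check covers a single report only.

```sh
#!/bin/sh
# check_report: reads one load value report on standard input.
# Returns 0 if the report follows the format rules, 1 otherwise.
check_report() {
    awk '
        NR == 1     { if ($0 != "begin") bad = 1; next }  # must open with begin
        $0 == "end" { seen_end = 1; next }                # must close with end
        seen_end    { bad = 1 }                           # nothing may follow end
        /[ \t]/     { bad = 1 }                           # load values contain no blanks
                    { if (split($0, f, ":") != 3) bad = 1 }  # host:name:value
        END         { exit ((bad || !seen_end) ? 1 : 0) }
    '
}
```

Assuming the script was saved as load.sh in the current directory, one request followed by quit could be checked like this:

```sh
printf '\nquit\n' | ./load.sh | check_report && echo "format OK"
```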
As soon as the procedure works, you can install it for any execution host. To install the procedure, configure the load sensor path as the load_sensor parameter for the cluster configuration, global configuration, or the host-specific configuration. See Basic Cluster Configuration or the sge_conf(5) man page for more information.
The corresponding QMON window might look like the following figure:
The reported load parameter logins is usable as soon as a corresponding attribute is added to the complex. The required definition might look like the last table entry shown in the following figure.
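In text form, such a complex entry might look like the line printed by the sketch below. The column layout follows the complex(5) format of the 6.x series. The specific values (shortcut, type, relational operator, and so on) are assumptions chosen for illustration, not the values from the original figure, and the function name logins_complex_entry is hypothetical.

```sh
#!/bin/sh
# Hypothetical complex entry for the logins load value (illustrative values only).
logins_complex_entry() {
    cat <<'EOF'
#name    shortcut  type  relop  requestable  consumable  default  urgency
logins   logins    INT   <=     YES          NO          0        0
EOF
}

logins_complex_entry
```

A line of this kind could be added in the editor session that qconf -mc opens, and the result inspected with qconf -sc.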