Sun HPC ClusterTools 3.0 Administrator's Guide: With LSF

Configuring for Fast Interactive Batch Response Time

There are several steps you can take to optimize the response time of an interactive batch queue. These steps are discussed in Sections "Set PRIORITY in lsb.queues" through "Add Optimization Parameters to lsb.params".

Set PRIORITY in lsb.queues

The PRIORITY parameter defines a batch queue's priority relative to other batch queues. To ensure faster dispatching, assign a higher PRIORITY value to interactive batch queues than you give to noninteractive queues. A higher number equals a higher priority. For example, the following setting

PRIORITY=12

means that jobs on that queue will usually be serviced sooner than jobs on queues with a setting of PRIORITY=11 or lower.
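
As a rough illustration, two queue definitions in lsb.queues might differ in priority as sketched here. The Begin Queue/End Queue layout follows the usual lsb.queues format. The queue names and descriptions are hypothetical, and PRIORITY=11 for the noninteractive queue is chosen only to mirror the example above.

```
# lsb.queues (sketch; queue names and descriptions are hypothetical)
Begin Queue
QUEUE_NAME=interactive_batch
PRIORITY=12
DESCRIPTION=Interactive batch jobs
End Queue

Begin Queue
QUEUE_NAME=overnight
PRIORITY=11
DESCRIPTION=Noninteractive jobs
End Queue
```

With these values, jobs submitted to interactive_batch would usually be considered for dispatch before jobs waiting in overnight.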

Set NICE in lsb.queues

Set the queue's NICE parameter to 10. NICE sets the UNIX nice value at which the queue's jobs run, so this setting ensures that jobs in this queue receive the same CPU priority as jobs in other interactive queues.

Set NEW_JOB_SCHED_DELAY in lsb.queues

Set the NEW_JOB_SCHED_DELAY parameter to 0. This will allow a new job scheduling session to be started as soon as a job is submitted to this queue.
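
Taken together, the queue-level settings described in this section could appear in a single queue definition along the following lines. This is a minimal sketch: the queue name is hypothetical, the PRIORITY value is reused from the earlier example, and a real queue definition would typically carry other parameters as well.

```
# lsb.queues (sketch; the queue name is hypothetical)
Begin Queue
QUEUE_NAME=interactive_batch
# Higher than the PRIORITY of any noninteractive queue
PRIORITY=12
# CPU priority (nice value) for jobs run from this queue
NICE=10
# Start a new scheduling session as soon as a job is submitted
NEW_JOB_SCHED_DELAY=0
End Queue
```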

Add Optimization Parameters to lsb.params

During installation of the Sun HPC ClusterTools 3.0 packages, you are asked whether you want to modify the lsb.params file to optimize interactive batch response time. If you answer yes, the SUNWrte package makes the following changes to the lsb.params file:

MBD_SLEEP_TIME=1
MAX_SBD_FAIL=30
JOB_ACCEPT_INTERVAL=0

The first parameter, MBD_SLEEP_TIME, specifies the number of seconds LSF Batch will wait between attempts to dispatch jobs. The default is 60 seconds. SUNWrte changes the interval to 1 second.

The MAX_SBD_FAIL parameter specifies how many times LSF Batch will try to reach an unresponsive slave batch daemon before giving up. MBD_SLEEP_TIME controls the frequency of these attempts. If MAX_SBD_FAIL is not specified, its default value is three times the MBD_SLEEP_TIME value. SUNWrte sets MAX_SBD_FAIL to 30.

The JOB_ACCEPT_INTERVAL parameter specifies how many MBD_SLEEP_TIME periods LSF Batch will wait after successfully dispatching a job to a host before it dispatches another job to the same host. SUNWrte sets this parameter to 0, allowing the host to accept multiple jobs in each job dispatching period (MBD_SLEEP_TIME).

If you answered no during the installation but now want to enable these optimizations, edit the lsb.params file and set the three parameters to the values shown above.
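
For reference, a hand-edited lsb.params containing these settings might look roughly like the following sketch. The Begin Parameters/End Parameters wrapper is the usual layout of that file. An existing file will normally hold other parameters too, and those should be left in place. It is assumed here that the changes take effect only after LSF Batch is reconfigured (for example, with badmin reconfig); check the LSF documentation for your release.

```
# lsb.params (sketch; keep any other parameters already in the file)
Begin Parameters
# Seconds between job dispatch attempts
MBD_SLEEP_TIME=1
# Attempts to reach an unresponsive slave batch daemon before giving up
MAX_SBD_FAIL=30
# MBD_SLEEP_TIME periods to wait before dispatching another job to the same host
JOB_ACCEPT_INTERVAL=0
End Parameters
```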