Sun HPC ClusterTools 3.0 Administrator's Guide: With CRE

Removing CRE Jobs that have Exited

When a job does not exit cleanly, all of its processes may reach a final state while the job object itself is never removed from the CRE database. The following are two indicators of such incompletely exited jobs:

If you see a job in one of these defunct states, perform the following steps to clear the job from the CRE database:

  1. Execute mpps -e again in case the CRE has had time to update the database (and remove the job).

  2. If the job is still running, kill it, specifying its job ID.

% mpkill jid

  3. If necessary, remove the job object from the CRE database.

    If mpps continues to report the killed job, use the -C option to mpkill to remove the job object from the CRE database. This must be done as root, from the master node.

# mpkill -C jid
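
The three steps above can be sketched as a small shell function. This is an illustration, not part of ClusterTools. The names job_listed, clear_cre_job, and CRE_WAIT are hypothetical. The sketch assumes that a job's ID appears as a whitespace-separated field in the output of mpps -e, which should be checked against the actual listing format. It does not verify that it is running as root on the master node, which the mpkill -C step requires.

```shell
#!/bin/sh
# Sketch only. Assumption: a job's ID appears as a whitespace-separated
# field in the output of `mpps -e`; verify this against the real listing.

# Succeeds if the given job ID is still listed by mpps -e.
job_listed() {
    mpps -e | awk -v id="$1" '
        { for (i = 1; i <= NF; i++) if (($i "") == (id "")) found = 1 }
        END { exit found ? 0 : 1 }'
}

# Hypothetical helper: walks through the three steps for one job ID.
clear_cre_job() {
    jid=$1
    wait_secs=${CRE_WAIT:-5}   # hypothetical pause so the CRE can update

    # Step 1: look again in case the CRE has already removed the job.
    if ! job_listed "$jid"; then
        echo "job $jid is no longer listed"
        return 0
    fi

    # Step 2: kill the job, specifying its job ID.
    mpkill "$jid"
    sleep "$wait_secs"
    if ! job_listed "$jid"; then
        echo "job $jid cleared by mpkill"
        return 0
    fi

    # Step 3: remove the job object itself (root, master node).
    mpkill -C "$jid"
    sleep "$wait_secs"
    if job_listed "$jid"; then
        echo "job $jid is still listed after mpkill -C" >&2
        return 1
    fi
    echo "job $jid removed with mpkill -C"
}
```

After sourcing the file, the function would be called with a job ID, for example clear_cre_job jid, run as root on the master node so that the final step is permitted.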