5.5 Recovering Jobs from a System Crash

If the system crashes, all active jobs (status_cd = RUN) fail. To recover these jobs, run the recover_mantas.sh script, which changes their status_cd to RES so that they can restart and finish running. The script takes one optional parameter: the date on which the system ran the start_mantas.sh script, in DD-MM-YYYY format. If you omit the parameter, the script uses the current date.

Running the recover_mantas.sh script with the date parameter ensures that the script recovers only the jobs started on that date. The dispatcher must be running to pick up the restarted jobs. Each restarted job then ends in either successful completion (status_cd = FIN) or failure (status_cd = ERR).
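As an illustration of the date parameter, the following is a minimal sketch of a wrapper that checks the DD-MM-YYYY shape before building the command line. The function name recover_cmd and the sample date are hypothetical. Only the script name, the date format, and the current-date default come from the text above. The sketch prints the command rather than running it, because the script's install location is not given here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): builds the
# recover_mantas.sh command line for a given start date.

recover_cmd() {
    # Default to the current date, formatted as DD-MM-YYYY.
    run_date="${1:-$(date +%d-%m-%Y)}"

    # Rough shape check only; it does not validate calendar dates.
    case "$run_date" in
        [0-3][0-9]-[0-1][0-9]-[0-9][0-9][0-9][0-9]) ;;
        *)
            echo "expected DD-MM-YYYY, got: $run_date" >&2
            return 1
            ;;
    esac

    # Print the command instead of executing it (dry run).
    printf 'recover_mantas.sh %s\n' "$run_date"
}

recover_cmd 15-03-2024   # recover only jobs started on 15 March 2024
recover_cmd              # no argument: uses the current date
```

Printing the command keeps the sketch safe to try on any machine. On a real installation you would run the printed command from the directory that holds the script, with the dispatcher already running.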

You can restart jobs that ended in failure by running the restart_mantas.sh script. Running restart_mantas.sh <template group id> changes the status_cd from ERR to RES for every job in the specified template group whose status_cd is ERR, so that the dispatcher can pick up those jobs.
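The status_cd transitions described in this section can be summarized in a small model. This is an illustrative sketch only, not product code: the function next_status and the action names recover and restart are hypothetical, and it assumes that each script leaves every other status unchanged.

```shell
#!/bin/sh
# Illustrative model of the status_cd transitions in this section.
# Assumption: statuses not named in the text are left unchanged.

next_status() {
    action="$1"   # "recover" or "restart" (hypothetical labels)
    status="$2"   # current status_cd: RUN, RES, ERR, or FIN

    case "$action:$status" in
        recover:RUN) echo RES ;;        # recover_mantas.sh: RUN -> RES
        restart:ERR) echo RES ;;        # restart_mantas.sh: ERR -> RES
        *)           echo "$status" ;;  # anything else stays as it is
    esac
}

next_status recover RUN   # prints RES
next_status restart ERR   # prints RES
next_status restart FIN   # prints FIN
```

In both cases the job only moves to RES. The dispatcher then picks up the job and drives it to FIN or ERR.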