5.5 Recovering Jobs from a System Crash
If the system crashes, all active jobs (status_cd = RUN) fail. You can recover the jobs by running the script recover_mantas.sh. This script changes the status_cd to RES so that these jobs can restart and finish running.
The recover_mantas.sh script has an optional parameter: the date on which the system ran the start_mantas.sh script. This parameter has a DD-MM-YYYY format. The default value is the current date. Running the recover_mantas.sh script with this parameter ensures the script recovers only the jobs started that day. The dispatcher must be running to pick up the restarted jobs. Each restarted job then ends in either successful completion (status_cd = FIN) or failure (status_cd = ERR).
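As an illustration of the date parameter described above, the following is a minimal sketch rather than part of the product. The helper names is_valid_run_date and build_recover_cmd are hypothetical. The sketch only prints the command line it would run, and it assumes recover_mantas.sh is reachable from the current directory or PATH. The format check is a simple pattern test, not a full calendar validation.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with the product): build the
# recover_mantas.sh command line with an optional DD-MM-YYYY date.

# Return 0 if $1 looks like DD-MM-YYYY (pattern check only).
is_valid_run_date() {
  case "$1" in
    [0-3][0-9]-[0-1][0-9]-[0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the command to run; with no argument the script's own default
# (the current date) applies, so no date is appended.
build_recover_cmd() {
  if [ -z "${1:-}" ]; then
    echo "recover_mantas.sh"
  elif is_valid_run_date "$1"; then
    echo "recover_mantas.sh $1"
  else
    echo "invalid date: $1 (expected DD-MM-YYYY)" >&2
    return 1
  fi
}

# Example: pass today's date explicitly, formatted as DD-MM-YYYY.
build_recover_cmd "$(date +%d-%m-%Y)"
```

Printing the command instead of running it makes it easy to confirm the date format before any status_cd values are changed.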
You can restart jobs that ended in failure by running the restart_mantas.sh script. Running restart_mantas.sh <template group id> changes the status_cd from ERR to RES for any jobs in the specified template group that have a status_cd of ERR, so that the dispatcher can pick them up.
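The restart invocation can be sketched in the same way. The helper build_restart_cmd is hypothetical, and the template group id 123 is a made-up placeholder, not a real id. The sketch only prints the command line, and it assumes restart_mantas.sh is reachable from the current directory or PATH.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): print the
# restart_mantas.sh command for one template group id.

build_restart_cmd() {
  # Require exactly one non-empty argument: the template group id.
  if [ "$#" -ne 1 ] || [ -z "$1" ]; then
    echo "usage: build_restart_cmd <template group id>" >&2
    return 1
  fi
  printf 'restart_mantas.sh %s\n' "$1"
}

# Example with a placeholder template group id.
build_restart_cmd 123
```

As with recovery, the dispatcher must be running for the jobs reset to RES to be picked up.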