For each group of restartable processes present in a ChorusOS system, the Hot Restart Controller stores a list of the processes for each group in a persistent memory block. A process is added to the list when it is first started. When a process cleanly terminates, the Hot Restart Controller notes this in the list. When all processes in the list have terminated cleanly, the Hot Restart Controller performs the following:
Deallocates the persistent memory blocks used to store the
images of the terminated processes, as well as blocks that were allocated
using the HR_GROUP_KEY
and HR_GROUP_KEYSIZE
deletion key macros. The process names used by the processes
can then be reused by other restartable processes (which will be loaded into
memory as new processes).
Adds the group ID to the list of available IDs for new process groups.
A group of processes can only terminate if all of its member processes terminate cleanly. This is important to remember in situations where not all indirect processes are restarted after a group restart. This is a matter of execution flow: if certain conditions in a direct process change the process's flow from one execution to another, the direct process may not restart an indirect process that was running prior to the restart. As a result, the indirect process will never terminate cleanly and so the group will not be able to terminate.
For example, consider the situation in the following diagram. The direct process spawns the indirect process only after certain conditions are met. These conditions are met the first time the direct process runs. After the direct process restarts, the conditios are no longer satisfied, so the indirect process is no longer spawned.
In the preceding diagram, the process group will not be able to terminate until the indirect process has been rerun using hrfexec(), and has terminated cleanly.
When a restart group cannot terminate because of one or more direct processes, the Hot Restart Controller detects this situation and displays the following message on the target console:
HR_CTRL: group gid blocked, some members have not terminated: list_of_processes |
gid is the ID of the group in question, and list_of_processes provides the name of each process which prevents the group from terminating. When this message is displayed, a common solution is to kill the process group using the akill command with the -g option. However, this solution is useful only if none of the indirect processes need to be run to complete the group's task.
A better solution is to use careful application design. If the preceding situation is likely to occur, flags can be stored in persistent memory to identify indirect processes that have not terminated cleanly. A process can then be made responsible for cleaning up the group, that is, restarting each indirect process that is flagged. This clean-up process can be run using the arun -g command when the Hot Restart Controller notification is displayed on the target console. Alternatively, the group could be designed so that the clean-up process is always run just before the group is expected to terminate. In this case the problem is solved without accessing the C_INIT console.