ChorusOS 4.0 Hot Restart Programmer's Guide

4.2.4.1 Group Termination

For each group of restartable actors present in a ChorusOS system, the Hot Restart Controller stores a list of the actors in the group in a persistent memory block. An actor is added to the list when it is first started. When an actor cleanly terminates, the Hot Restart Controller notes this in the list. When all actors in the list have terminated cleanly, the Hot Restart Controller does the following:

A group of actors can only terminate if all of its member actors terminate cleanly. This matters when not all indirect actors are restarted after a group restart, which is a matter of execution flow: if certain conditions in a direct actor change the actor's flow from one execution to the next, the direct actor may not restart an indirect actor that was running prior to the restart. As a result, the indirect actor will never terminate cleanly, and so the group will not be able to terminate.
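The bookkeeping rule described above can be sketched in plain C. This is only an illustrative model, not the Hot Restart Controller's implementation: the GroupList and ActorEntry types, the MAX_GROUP_ACTORS limit, and all function names are hypothetical, and an ordinary struct stands in for the persistent memory block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUP_ACTORS 8   /* illustrative limit, not a ChorusOS constant */

/* Hypothetical per-actor record in the group list. */
typedef struct {
    const char *name;
    bool terminated_cleanly;
} ActorEntry;

/* Hypothetical stand-in for the list kept in a persistent memory block. */
typedef struct {
    int gid;
    size_t count;
    ActorEntry actors[MAX_GROUP_ACTORS];
} GroupList;

/* Add an actor when it is first started; a restart does not add it again. */
void group_note_start(GroupList *g, const char *name)
{
    for (size_t i = 0; i < g->count; i++) {
        if (strcmp(g->actors[i].name, name) == 0) {
            return;   /* already listed from an earlier run */
        }
    }
    assert(g->count < MAX_GROUP_ACTORS);
    g->actors[g->count].name = name;
    g->actors[g->count].terminated_cleanly = false;
    g->count++;
}

/* Record that the named actor has terminated cleanly. */
void group_note_clean_exit(GroupList *g, const char *name)
{
    for (size_t i = 0; i < g->count; i++) {
        if (strcmp(g->actors[i].name, name) == 0) {
            g->actors[i].terminated_cleanly = true;
        }
    }
}

/* Count the listed actors that have not yet terminated cleanly. */
size_t group_blocking_count(const GroupList *g)
{
    size_t blocking = 0;
    for (size_t i = 0; i < g->count; i++) {
        if (!g->actors[i].terminated_cleanly) {
            blocking++;
        }
    }
    return blocking;
}

/* The group may terminate only when every listed actor has exited cleanly. */
bool group_can_terminate(const GroupList *g)
{
    return g->count > 0 && group_blocking_count(g) == 0;
}
```

In this model, an indirect actor that was listed during the first run but is never spawned again after a restart keeps group_blocking_count() above zero, so group_can_terminate() stays false until that actor is rerun and exits cleanly.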

For example, consider the situation in the following figure. The direct actor spawns the indirect actor only if a certain condition is fulfilled. This condition is fulfilled the first time the direct actor runs. After the direct actor restarts, the condition is no longer fulfilled, so the indirect actor is no longer spawned.

Figure 4-2 Conditional Spawning of a Restartable Actor

In the situation illustrated above, the actor group will not be able to terminate until the indirect actor has been rerun using hrfexec(), and has terminated cleanly.

When a restart group cannot terminate because one or more of its actors are in this situation, the Hot Restart Controller detects this and prints the following message on the target console:


HR_CTRL: group gid blocked, some members have not terminated: list_of_actors

gid is the ID of the group concerned, and list_of_actors gives the name of each actor that prevents the group from terminating. When this message appears, a basic solution is to kill the actor group using the akill command with the -g option, as described in "4.3 Killing Restartable Actors". This solution is only useful if none of the indirect actors needs to be run to complete the group's task.

A better solution is careful application design. If the situation is likely to occur, flags can be stored in persistent memory to indicate which indirect actors have not terminated cleanly. An actor can then be made responsible for cleaning up the group, that is, restarting each indirect actor that is flagged. This clean-up actor can be run using the arun -g command when the Hot Restart Controller notification appears on the target console. Alternatively, the group could be designed so that the clean-up actor is always run just before the group is expected to terminate, in which case the problem is solved without the need for access to the C_INIT console.
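The flag-based design can be sketched as follows. This is a minimal sketch under stated assumptions, not ChorusOS code: CleanupFlags, MAX_INDIRECT, RerunFn, and every function name are hypothetical, an ordinary struct stands in for the persistent memory block, and the record_rerun() hook only records requests where a real clean-up actor would rerun the indirect actor with hrfexec().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_INDIRECT 4   /* illustrative number of indirect actors */

/* Hypothetical flag block; in a real design it would live in persistent
 * memory so that it survives a hot restart. */
typedef struct {
    bool pending[MAX_INDIRECT];   /* true: spawned but not yet cleanly exited */
} CleanupFlags;

/* Hypothetical hook that reruns one indirect actor; returns 0 on success. */
typedef int (*RerunFn)(size_t slot);

/* Set the flag when an indirect actor is spawned. */
void flag_spawned(CleanupFlags *f, size_t slot)
{
    assert(slot < MAX_INDIRECT);
    f->pending[slot] = true;
}

/* Clear the flag when the indirect actor terminates cleanly. */
void flag_clean_exit(CleanupFlags *f, size_t slot)
{
    assert(slot < MAX_INDIRECT);
    f->pending[slot] = false;
}

/* Body of the clean-up actor: rerun every indirect actor still flagged.
 * A flag stays set until the rerun actor itself exits cleanly.
 * Returns the number of actors successfully rerun. */
size_t cleanup_group(const CleanupFlags *f, RerunFn rerun)
{
    size_t rerun_count = 0;
    for (size_t slot = 0; slot < MAX_INDIRECT; slot++) {
        if (f->pending[slot] && rerun(slot) == 0) {
            rerun_count++;
        }
    }
    return rerun_count;
}

/* Demo hook: counts rerun requests per slot instead of starting an actor. */
int rerun_requests[MAX_INDIRECT];

int record_rerun(size_t slot)
{
    rerun_requests[slot]++;
    return 0;
}
```

Whether the clean-up actor is started by hand in response to the console notification or is always run just before the group is expected to terminate, the same cleanup_group() loop applies; only the trigger differs.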