This chapter describes how to set up your ChorusOS 4.0 system to use the hot restart feature. It covers the following:
Configuring your ChorusOS 4.0 system for hot restart: see "2.1 System Configuration".
Running the graphical hot restart demonstration program provided with Sun Embedded Workshop: see "2.2 Running the Hot Restart Demonstration Program".
This chapter assumes that you have already correctly installed Sun Embedded Workshop on a host machine, and that you have a target machine which can be booted from a network boot server. You should also be familiar with configuring your ChorusOS 4.0 system and building a system image. For more information on these topics, see the related documents cited in the Preface of this guide.
This chapter does not cover hints for linking and building your own hot restartable applications. For information on this topic, see Appendix A, Hot Restart Programming Environment.
Before beginning to program and run actors which use the hot restart feature, you will need to update and configure your system for hot restart. System configuration for hot restart involves the following steps:
including the necessary ChorusOS optional features in your system
ensuring that the settings for the tunable parameters used by the hot restart feature are suitable for your system
These steps are described in the sections which follow.
To incorporate hot restart in your ChorusOS 4.0 system, use the ews graphical tool or the configurator(1CC) command line utility to include the following optional features in your system profile:
HOT_RESTART. This feature exports the hot restart API and restart mechanism.
ACTOR_EXTENDED_MNGT, LAPSAFE, LAPBIND and ADMIN_SHUTDOWN. These features provide necessary support for the HOT_RESTART feature.
As stated in Chapter 1, Introduction, the hot restart feature implements persistent memory as a portion of physical memory (RAM) on the target device. Although the persistent memory bank does not itself use virtual memory or swapping, hot restart is compatible with all three of the main memory models: flat, protected, and virtual.
The size of the persistent memory bank is defined in bytes by a system tunable parameter, pmm.rambankSize. The value of this parameter is static: its value cannot be modified while the system is running. In addition, because the RAM persistent memory bank does not use virtual memory or swapping, objects in persistent memory are locked in memory until they are freed. For these two reasons, it is important to make sure that pmm.rambankSize is set to a value realistic for the amount of data likely to be stored in persistent memory at any one time.
A portion of space reserved for an object in the persistent memory bank is termed a persistent memory block. A block is a contiguous set of memory pages, which means that the size of a block is always a multiple of the page size. Use vmPageSize(2K) to find out the page size for your platform.
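As a rough illustration of the rounding described above, the following Python sketch computes the size a block would occupy for a given request. The 4096-byte page size is an assumption chosen for the example only; the real value for your platform comes from vmPageSize(2K). The function name is hypothetical and is not part of any ChorusOS API.

```python
def block_size(requested_bytes: int, page_size: int) -> int:
    """Round a requested size up to a whole number of pages."""
    if requested_bytes <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")
    pages = -(-requested_bytes // page_size)  # ceiling division
    return pages * page_size


# Assumed page size for illustration; query vmPageSize(2K) on a real target.
ASSUMED_PAGE_SIZE = 4096

print(block_size(10_000, ASSUMED_PAGE_SIZE))  # 3 pages: 12288
```

With these assumed figures, a 10,000-byte object costs three pages of the persistent memory bank, which is why many small allocations can consume the bank faster than their nominal sizes suggest.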
For each running restartable actor, the system stores the following data in persistent memory:
The text and initialized data which were loaded into memory from stable storage. This is known as the actor image. The actor image occupies a single block of persistent memory.
The executed text, initialized data and BSS (data initialized to zero), from which the actor is running. This is known as the actor's executing image. The executing image occupies two blocks of persistent memory: one block for the text and one block for the data. The heap and stack for the executing actor are stored in non-persistent memory.
The persistent memory blocks used to store the actor image and executing image will only be freed when the actor's group terminates cleanly (note that this may be some time after the actor itself has terminated). The actor can also allocate its own blocks of persistent memory to store run-time data while it is executing.
Although it can be difficult to predict the likely required value of pmm.rambankSize early in the development cycle, the following rule of thumb, derived from the statements above, may be of use to developers at the system design stage:
Restartable actors occupy an absolute minimum of twice their size in persistent memory. This minimum will accommodate the actor's actor image and executing image (although it does not allow for rounding up of memory block sizes to the nearest page). The actor may also allocate additional portions of memory. pmm.rambankSize should therefore be greater than twice the combined size of the restartable actors expected to run simultaneously on a system.
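The rule of thumb can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the actor sizes are invented figures, the function names are hypothetical, and the result is a lower bound that ignores page rounding and any persistent memory the actors allocate for themselves.

```python
DEFAULT_RAMBANK_SIZE = 1024 * 1024  # default pmm.rambankSize, in bytes


def min_rambank_size(actor_sizes):
    """Lower bound: twice the combined size of the restartable actors."""
    return 2 * sum(actor_sizes)


def bank_is_large_enough(actor_sizes, rambank_size=DEFAULT_RAMBANK_SIZE):
    """The rule asks for a bank strictly greater than the lower bound."""
    return rambank_size > min_rambank_size(actor_sizes)


# Hypothetical actor sizes in bytes, for illustration only.
actors = [300_000, 150_000]
print(min_rambank_size(actors), bank_is_large_enough(actors))

# Adding a third hypothetical actor pushes the bound past one megabyte.
actors.append(100_000)
print(min_rambank_size(actors), bank_is_large_enough(actors))
```

Treat a passing check as necessary rather than sufficient: leave headroom for rounding and for run-time data.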
Sharing persistent memory blocks between user actors, or between user and supervisor actors is not supported. Persistent memory blocks can only be shared between supervisor actors.
The default value of the pmm.rambankSize tunable parameter is 1024*1024 bytes, that is, one megabyte.
The HOT_RESTART feature uses a number of system tunable parameters. Each parameter has a default value which can serve as a guideline and is generally suitable for getting started with hot restart programming. All tunable parameters are static: they cannot be modified while the system is running.
Two parameters define limits for persistent memory occupation in the system's persistent memory bank:
pmm.rambankSize is the maximum amount of persistent memory available in the system, in bytes. The default value is one megabyte (0x100000). See the previous section for guidelines on setting this parameter to suit your system. If you want to run the hot restart demonstration program, you will need to increase the value of this parameter to four megabytes (0x400000).
pmm.maxBlocks is the maximum number of recorded persistent memory blocks which can be allocated in the persistent memory bank. A block is a variable-sized number of contiguous pages of RAM. Each time an actor (supervisor or user) issues a request to store a piece of data in persistent memory, a block of the appropriate size, rounded up to the nearest whole page, is allocated. The default value is 30.
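To get a feel for how quickly the block limit is reached, the following Python sketch counts blocks per restartable actor. It relies on the earlier statement that each running restartable actor holds one block for its actor image and two for its executing image, and it assumes that these system-allocated blocks count against pmm.maxBlocks in the same way as blocks the actor allocates itself. The function names and the example workload are hypothetical.

```python
BLOCKS_PER_RESTARTABLE_ACTOR = 3  # 1 actor image + 2 executing image blocks
DEFAULT_MAX_BLOCKS = 30           # default pmm.maxBlocks


def blocks_needed(own_blocks_per_actor):
    """Total blocks for a set of actors, given each actor's own allocations."""
    return sum(BLOCKS_PER_RESTARTABLE_ACTOR + own for own in own_blocks_per_actor)


def within_block_limit(own_blocks_per_actor, max_blocks=DEFAULT_MAX_BLOCKS):
    """True if the estimated block count fits under the configured limit."""
    return blocks_needed(own_blocks_per_actor) <= max_blocks


# Ten actors that allocate no persistent data of their own: exactly 30 blocks.
print(blocks_needed([0] * 10), within_block_limit([0] * 10))

# Eight actors that each allocate one block of their own: 32 blocks.
print(blocks_needed([1] * 8), within_block_limit([1] * 8))
```

Under these assumptions the default limit is reached with about ten restartable actors, so systems that run more, or whose actors store several separate objects, will probably need a larger pmm.maxBlocks.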
Two parameters control the maximum number of restartable actors and restart groups permitted in the system:
hrCtrl.maxActors is the maximum number of hot restartable actors which can be registered in the system. An actor is registered in the system when it is first run, and remains registered until all the actors in its group have terminated normally. The default value is 32. If hrCtrl.maxActors is greater than 65536, 65536 is used instead.
hrCtrl.maxGroups is the maximum number of restart groups which can be present in the system at the same time. Its default value is 32.
Two parameters define the system's restart policy (see "1.2.4 Site Restart"). These parameters are quite sensitive: different values can produce very different behavior in the system. The system manages a restart counter for each restart group. Each time a group is restarted, the system increases its restart counter by one.
hrCtrl.interval is the interval, in seconds, at which a group's restart counter is decreased. Every hrCtrl.interval seconds, the system decreases the group's restart counter by one (until the counter reaches zero). The default value for hrCtrl.interval is 3 seconds.
hrCtrl.maxBadness is the maximum value a group's restart counter can reach before it triggers a site restart. In other words, when a group's restart counter reaches this value, a site restart is automatically performed. The default value is 25. If set to zero, the system never triggers a site restart.
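The interaction between these two parameters is easier to see in a small simulation. The Python sketch below models one restart group's counter and reports when a site restart would be triggered. It is a simplified model, not the system's implementation: it assumes that the decrement timer starts when the counter leaves zero, and that a pending decrement is applied before a restart occurring at the same instant. The function name and the failure pattern are hypothetical.

```python
def site_restart_time(restart_times, interval, max_badness):
    """Return the time of the group restart that triggers a site restart,
    or None if the restart policy never triggers one."""
    if max_badness == 0:
        return None  # zero disables automatic site restart
    counter = 0
    ref = 0  # time of the last decrement, or of the restart that left zero
    for t in sorted(restart_times):
        if counter > 0:
            # Apply every decrement that fell due since the reference time.
            ticks = (t - ref) // interval
            ref += ticks * interval
            counter = max(0, counter - ticks)
        if counter == 0:
            ref = t  # assumed: the decrement timer starts from this restart
        counter += 1
        if counter >= max_badness:
            return t
    return None


# Default policy (interval 3, maxBadness 25) with a hypothetical group that
# is restarted once per second for 100 seconds.
print(site_restart_time(range(100), interval=3, max_badness=25))

# The same failure pattern with automatic site restart disabled.
print(site_restart_time(range(100), interval=3, max_badness=0))
```

In this model the counter gains roughly two for every three seconds of one-restart-per-second failures, so a persistently failing group does reach the threshold eventually, while isolated restarts decay away. Changing either parameter shifts that balance considerably, which is why the values deserve careful testing.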
Once you have updated your system's features for hot restart and the tunable parameter settings are appropriate for your needs, you are ready to build the system image.
If you want to run the hot restart demonstration and examples, ensure that you include the examples directory and X11 library in your system build paths if they are not already included. For information on building a system image for your particular target platform, see the corresponding document in the ChorusOS 4.0 Target Family Guide collection.
After the system image has been correctly built, copy it to your boot server and reboot your target machine. You are now ready to begin programming and running applications which use the hot restart feature.
Sun Embedded Workshop includes a graphical demonstration of the hot restart feature. The demonstration is based on the well-known program Xmaze, which has been slightly modified to make it hot restartable. Some of the program's data is stored in persistent memory, which means that when the program is restarted, it starts at a point close to the point it had reached prior to the restart. The resulting application is a ChorusOS actor called xdemo_s.
To run the hot restart demonstration program, do the following:
Ensure that your system features are correctly set for hot restart: see "2.1.1 Features".
Adjust the following system tunable parameters to suit the memory requirements of the Xmaze demonstration program, using ews or the configurator(1CC) command line utility:
Tunable parameter | Description | Required value |
---|---|---|
pmm.rambankSize | Size of persistent memory bank, in bytes | 0x400000 |
kern.exec.dflSysStackSize | Default system stack size, in bytes | 0x8000 |
Configure your system image build to include the X11 library and ChorusOS examples directory, if this is not already the case.
If you have made changes to the system image since the previous build, rebuild the system, copy the system image to the appropriate location (for example, the boot directory if you are using tftp-based boot), and reboot the target machine.
Ensure that a copy of the xdemo_s actor is present in a directory which is mounted on the target machine. If you use the make root command, a copy of the actor is already stored in build_dir/root/bin/examples. If this directory is not mounted, or you prefer to use a different mounted directory:
$ cp build_dir/BUILD_EXAMPLES/restartDemo/xdemo_s example_directory
Set the target machine's DISPLAY environment variable to the host machine which you are currently working on:
$ rsh target setenv DISPLAY host_IP_address:0.0
Run the restartable actor:
$ rsh target arun -g 0 example_directory/xdemo_s
The actor will be run as a member of the restart group with group ID 0.
The Xmaze demonstration appears on the screen. As the demonstration runs, it periodically stores its state as data in persistent memory. Let the demonstration advance a little, then restart the actor by typing the following on the host console:
$ rsh target akill aid
aid is the actor identifier which is printed on the host console when the actor starts. The actor is restarted, and the Xmaze demonstration continues from a point close to the point it had reached before the restart.
The akill command provoked the restart because it was invoked without the restart-specific -g option. To kill the Xmaze demonstration actor without restarting it, type:
$ rsh target akill -g 0
As the xdemo_s actor is run from the command line, it is a direct actor, and will be started automatically by the system when the site is restarted. To check this, rerun the actor, and then provoke a site restart by typing the following:
$ rsh target restart
When the system has been re-initialized, the demonstration will be restarted.
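For repeated test runs, the command lines used in this procedure can be gathered in a small host-side helper. The Python sketch below only builds and prints the commands shown above; it does not execute them. The target name, host address and actor directory are placeholders invented for the example, and the function names are hypothetical.

```python
def demo_commands(target, host_ip, actor_dir, group_id=0):
    """Build the rsh command lines from this procedure as argument lists."""
    actor = f"{actor_dir}/xdemo_s"
    return {
        "set_display": ["rsh", target, "setenv", "DISPLAY", f"{host_ip}:0.0"],
        "run": ["rsh", target, "arun", "-g", str(group_id), actor],
        "kill_group": ["rsh", target, "akill", "-g", str(group_id)],
        "site_restart": ["rsh", target, "restart"],
    }


def restart_actor_command(target, aid):
    """akill without -g: provokes a hot restart of the actor."""
    return ["rsh", target, "akill", str(aid)]


# Placeholder values for illustration only.
commands = demo_commands("target", "192.0.2.10", "example_directory")
for name, argv in commands.items():
    print(f"{name}: {' '.join(argv)}")
print("restart_actor:", " ".join(restart_actor_command("target", "aid")))
```

Each list could be passed to subprocess.run() on a host where rsh and the target are available; that step is left out here because it depends on your network setup.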
Of course, this is a very simple illustration of the use of hot restart. The site restart is provoked manually from the command line. As an alternative, try restarting the actor (using akill without the -g option, as described above) sufficiently frequently to trigger an automatic site restart. To do this, you will first need to set the system's restart policy to be more sensitive to actor failure. The following configuration will cause a site restart if the actor is restarted twice in the space of four seconds:
Tunable parameter | Value |
---|---|
hrCtrl.interval | 4 |
hrCtrl.maxBadness | 2 |