This section describes how to set up a ChorusOS system to use the hot restart feature. It covers the following:
Configuring the ChorusOS system for hot restart.
Running the graphical hot restart demonstration program provided with the Sun Embedded Workshop software.
This chapter assumes that you have already correctly installed the Sun Embedded Workshop software on a host machine, and that you have a target machine which can be booted from a network boot server. You should also be familiar with configuring your ChorusOS system and building a system image. For more information on these topics, see Chapter 4, Building Makefiles and Configuring the System Image.
Before beginning to program and run processes that will use the hot restart feature, you must update and configure your system for hot restart. System configuration for hot restart involves the following steps:
Including the necessary ChorusOS optional features in your system
Ensuring that the settings for the tunable parameters used by the hot restart feature are suitable for your system.
These steps are described in the following sections.
To incorporate hot restart into your ChorusOS system, use the ews graphical tool or the configurator(1CC) command line utility to include the following optional features in your system profile:
HOT_RESTART. This feature exports the hot restart API and restart mechanism.
LAPSAFE and LAPBIND. These features provide the necessary support for the HOT_RESTART feature.
The HOT_RESTART feature implements persistent memory as a portion of the random access memory (RAM) on the target device.
Although the persistent memory bank does not use virtual memory or swapping, HOT_RESTART is compatible with all three main memory models: flat, protected, and virtual.
The size of the persistent memory bank is defined in bytes by the system tunable parameter, pmm.rambankSize. The value of this parameter is static and cannot be modified while the system is running. In addition, because the RAM persistent memory bank does not use virtual memory or swapping, objects in persistent memory are locked in memory until they are freed. For these two reasons, it is important to ensure that pmm.rambankSize is set to a value realistic for the amount of data likely to be stored in persistent memory at any given time.
A portion of space reserved for an object in the persistent memory bank is known as a persistent memory block. A block is a contiguous set of memory pages, which means that the size of a block is always a multiple of the page size. For more information on the page size relevant to your platform, see the vmPageSize(2K) man page.
For each restartable process that is running, the system stores the following data in persistent memory:
The text and initialized data which were loaded into memory from stable storage. This is known as the process image. The process image occupies a single block of persistent memory.
The executed text, initialized data, and BSS (data initialized to zero), from which the process is running. This is known as the executing image of the process. The executing image occupies two blocks of persistent memory: one block for the text and one block for the data. The heap and stack for the executing process are stored in non-persistent memory.
Although it can be difficult to predict the likely required value of pmm.rambankSize in the early stages of the development cycle, the following rule of thumb, derived from the previously mentioned statements, may be of use to developers at the system design stage:
Restartable processes require an absolute minimum of twice their size in persistent memory. This minimum accommodates the process image and the executing image, although it does not allow for the rounding of each block size up to a whole number of pages.
The process can also allocate additional memory. Therefore, pmm.rambankSize should be greater than twice the combined size of the restartable processes expected to run simultaneously.
Sharing persistent memory blocks between user processes, or between user and supervisor processes is not supported. Persistent memory blocks can only be shared between supervisor processes.
The default value of the pmm.rambankSize tunable parameter is one megabyte.
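The rule of thumb above can be turned into a rough estimate. The following Python sketch is only an illustration, not part of the ChorusOS tools: the function names, the 4096-byte page size, and the segment sizes are all assumptions, and the real page size for your platform should be taken from vmPageSize(2K). It adds up the one block for the process image and the two blocks for the executing image, each rounded up to whole pages, plus any blocks the process allocates itself.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; see vmPageSize(2K) for the real value


def round_up_to_pages(nbytes, page_size=PAGE_SIZE):
    """Round a byte count up to a whole number of pages."""
    return -(-nbytes // page_size) * page_size


def persistent_footprint(text, data, bss, extra_blocks=(), page_size=PAGE_SIZE):
    """Estimate the persistent memory (in bytes) used by one restartable process.

    text, data, bss: segment sizes in bytes (hypothetical inputs).
    extra_blocks: sizes of additional blocks the process allocates itself.
    """
    process_image = round_up_to_pages(text + data, page_size)  # one block
    exec_text = round_up_to_pages(text, page_size)             # one block
    exec_data = round_up_to_pages(data + bss, page_size)       # one block
    extras = sum(round_up_to_pages(size, page_size) for size in extra_blocks)
    return process_image + exec_text + exec_data + extras


# Hypothetical process: 300000 bytes of text, 40000 of initialized data,
# 60000 of BSS, and one extra 10000-byte persistent block.
one_process = persistent_footprint(300_000, 40_000, 60_000, extra_blocks=[10_000])
print(hex(one_process))           # 0xba000 (761856 bytes)
print(one_process <= 0x100000)    # True: one such process fits in the default bank
print(2 * one_process <= 0x100000)  # False: two of them do not
```

With these assumed figures, a single process fits in the default one-megabyte bank, but two such processes running at the same time would require a larger pmm.rambankSize.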
The HOT_RESTART feature uses a number of system tunable parameters. Each parameter has a default value which can serve as a guideline, and is generally suitable for getting started with hot restart programming. All tunable parameters are static -- they cannot be modified while the system is running.
Two parameters define the limits for persistent memory occupation of the system's persistent memory bank:
pmm.rambankSize is the maximum amount of persistent memory available in the system (in bytes). The default value is one megabyte (0x100000). See the previous section for guidelines on setting this parameter to suit your system. To run the hot restart demonstration program, you will need to increase the value of this parameter to four megabytes (0x400000).
pmm.maxBlocks is the maximum number of persistent memory blocks that can be allocated in the persistent memory bank. A block is a variable number of contiguous pages of RAM. Each time a process (supervisor or user) issues a request to store a piece of data in persistent memory, a block of the appropriate size, rounded up to the nearest whole page, is allocated. The default value is 30.
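Because every request consumes one block and at least one whole page, many small requests can exhaust pmm.maxBlocks and pmm.rambankSize long before the data itself would. The following Python sketch illustrates the effect. The function name, the 4096-byte page size, and the request sizes are assumptions made for the example, and it does not call any ChorusOS API.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; see vmPageSize(2K) for the real value


def blocks_and_bytes(request_sizes, page_size=PAGE_SIZE):
    """Return (block_count, bytes_reserved) for a list of allocation requests.

    Each request becomes one block, rounded up to a whole number of pages.
    """
    pages = [-(-size // page_size) for size in request_sizes]
    return len(request_sizes), sum(pages) * page_size


# Forty 64-byte records stored as forty separate blocks...
separate = blocks_and_bytes([64] * 40)
# ...versus the same records packed into a single block.
combined = blocks_and_bytes([64 * 40])

print(separate)  # (40, 163840): more blocks than the default pmm.maxBlocks of 30
print(combined)  # (1, 4096)
```

Under these assumptions, grouping related data into a single block keeps both the block count and the reserved memory well within the default limits.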
Two parameters control the maximum number of restartable processes and restart groups permitted in a system:
hrCtrl.maxprocesses is the maximum number of hot restartable processes which can be registered in the system. A process is registered in the system when it is first run, and remains registered until all processes in its group have terminated cleanly. The default value is 32. If hrCtrl.maxprocesses is set to a value greater than 65536, the value 65536 is used instead.
hrCtrl.maxGroups is the maximum number of restart groups that can be present in the system at the same time. Its default value is 32.
Two parameters define the system's restart policy (see "Site Restart"). These parameters are fairly sensitive -- different values can produce very different behavior in the system. The system manages a restart counter for each restart group. Each time a group is restarted, the system increases its restart counter by one.
hrCtrl.interval is the interval, in seconds, at which a group's restart counter is decreased. Every hrCtrl.interval seconds, the system decreases the group's restart counter by one (until the counter reaches zero). The default value for hrCtrl.interval is three seconds.
hrCtrl.maxBadness is the maximum value a group's restart counter can reach before it triggers a site restart. In other words, when a group's restart counter reaches this value, a site restart is automatically performed. The default value is 25. If set to zero, the system will never trigger a site restart.
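The interaction between these two parameters can be explored with a small simulation. The following Python sketch is a simplified model, not the kernel's implementation: the function name is hypothetical, and it assumes that the decay timer for a group starts when its counter leaves zero and then fires every hrCtrl.interval seconds. The actual timing in the system may differ.

```python
def first_site_restart(restart_times, interval, max_badness):
    """Return the time of the first site restart, or None if none is triggered.

    restart_times: times (in seconds) at which the group is restarted.
    interval: modelled value of hrCtrl.interval.
    max_badness: modelled value of hrCtrl.maxBadness (0 disables site restart).
    """
    counter = 0
    next_decay = None  # time of the next decrement (assumed timing model)
    for t in sorted(restart_times):
        # Apply every decrement that falls due before this restart.
        while counter > 0 and next_decay <= t:
            counter -= 1
            next_decay += interval
        if counter == 0:
            next_decay = t + interval  # the decay timer restarts from here
        counter += 1                   # each group restart adds one
        if max_badness > 0 and counter >= max_badness:
            return t
    return None


# Two restarts two seconds apart, with interval 4 and maxBadness 2.
print(first_site_restart([1, 3], interval=4, max_badness=2))     # 3
# Two restarts a full interval apart: the counter decays in between.
print(first_site_restart([0, 4], interval=4, max_badness=2))     # None
# maxBadness set to zero: a site restart is never triggered.
print(first_site_restart([1, 2, 3], interval=4, max_badness=0))  # None
```

In this model, a smaller hrCtrl.maxBadness or a larger hrCtrl.interval makes the system more sensitive to repeated group restarts.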
After updating your system's features for hot restart and setting the tunable parameters to suit your requirements, you are ready to build the system image.
To run the examples and hot restart demonstration, include the examples directory and X11 library in your system build paths (if they are not already included). For information on building a system image for your particular target platform, see the corresponding document in the ChorusOS 5.0 Target Platform Collection.
After the system image has been correctly built, copy the image to your boot server and reboot the target machine. You are now ready to begin programming and running applications that can use the hot restart feature.
The Sun Embedded Workshop software includes a graphical demonstration of the hot restart feature. The demonstration is based on a well-known program, Xmaze, which has been slightly modified to make it hot restartable. Some of the program's data is stored in persistent memory, which means that when the program is restarted, it starts at a point close to the point it had reached prior to the restart. The resulting application is a ChorusOS process called xdemo.
To run the hot restart demonstration program, do the following:
Ensure that your system features are correctly set for hot restart (see "Features").
Using the ews graphical tool or the configurator(1CC) command line utility, adjust the following system parameters to suit the memory requirements of the Xmaze demonstration program, as shown in the following table:
| Tunable parameter | Description | Required value |
|---|---|---|
| pmm.rambankSize | Size of persistent memory bank (in bytes) | 0x400000 |
| kern.exec.dflSysStackSize | Default system stack size (in bytes) | 0x8000 |
Configure your system image build to include the X11 library and ChorusOS examples directory (if this is not already the case):
% make reconfigure NEWCONF '_s <src_dir>/opt/X11'
If you have made changes to the system image since the previous build, rebuild the system and copy the system image to the appropriate location (for example, the boot directory if you are using tftp-based boot). Reboot the target machine.
Ensure that a copy of the xdemo process is present in a directory mounted on the target machine. If you use the make root command, a copy of the process is already stored in build_dir/root/bin/examples. If this directory is not mounted, or to use a different mounted directory, do the following:
$ cp build_dir/BUILD_EXAMPLES/restartDemo/xdemo example_directory
Set the target machine's DISPLAY environment variable to the host machine:
$ rsh target setenv DISPLAY host_IP_address:0.0
Run the restartable process:
$ rsh target arun -g 0 example_directory/xdemo
The process is run as a member of the restart group with group ID 0.
The Xmaze demonstration appears on the screen. As the demonstration runs, it periodically stores its state as data in persistent memory. Allow the demonstration to run for a short time, then restart the process by typing the following command on the host console:
$ rsh target akill pid
The process identifier (PID) is printed on the host console when the process starts.
The process is restarted, and the Xmaze demonstration continues from a point close to where it left off before the restart.
The akill command provoked the restart because it was not called with the restart-specific option -g. To kill the Xmaze demonstration process without restarting it, type:
$ rsh target akill -g 0
Because the xdemo process runs from the command line, it is a direct process, and will be started automatically by the system when the site is restarted. To confirm this, rerun the process, and provoke a site restart by typing the following:
$ rsh target restart
After the system has been re-initialized, the demonstration will be restarted.
This is a basic illustration of the use of the hot restart feature, in which the site restart is provoked manually from the command line. As an alternative, try restarting the process with akill (without the -g option, which kills the process instead of restarting it) frequently enough to trigger an automatic site restart. To do this, set the system's restart policy to be more sensitive to process failure. The following example configuration will invoke a site restart if the process is restarted twice within four seconds:
| Tunable parameter | Value |
|---|---|
| hrCtrl.interval | 4 |
| hrCtrl.maxBadness | 2 |