ChorusOS 4.0 Hot Restart Programmer's Guide

Chapter 2 Getting Started With Hot Restart

This chapter describes how to set up your ChorusOS 4.0 system to use the hot restart feature. It covers two topics: "2.1 System Configuration" and "2.2 Running the Hot Restart Demonstration Program".


Note -

This chapter assumes that you have already correctly installed Sun Embedded Workshop on a host machine, and that you have a target machine which can be booted from a network boot server. You should also be familiar with configuring your ChorusOS 4.0 system and building a system image. For more information on these topics, see the related documents cited in the Preface of this guide.


This chapter does not cover hints for linking and building your own hot restartable applications. For information on this topic, see Appendix A, Hot Restart Programming Environment.

2.1 System Configuration

Before beginning to program and run actors which use the hot restart feature, you will need to update and configure your system for hot restart. System configuration for hot restart involves selecting the required features, setting the tunable parameters (including the size of the persistent memory bank), and building the system image.

These steps are described in the sections which follow.

2.1.1 Features

To incorporate hot restart in your ChorusOS 4.0 system, use the ews graphical tool or the configurator(1CC) command line utility to include the following optional features in your system profile:

2.1.2 Memory Requirements and Design Considerations

As stated in Chapter 1, Introduction, the hot restart feature implements persistent memory as a portion of physical memory (RAM) on the target device. Although the persistent memory bank does not itself use virtual memory or swapping, hot restart is compatible with all three of the main memory models: flat, protected, and virtual.

The size of the persistent memory bank is defined in bytes by a system tunable parameter, pmm.rambankSize. The value of this parameter is static: its value cannot be modified while the system is running. In addition, because the RAM persistent memory bank does not use virtual memory or swapping, objects in persistent memory are locked in memory until they are freed. For these two reasons, it is important to make sure that pmm.rambankSize is set to a value realistic for the amount of data likely to be stored in persistent memory at any one time.

A portion of space reserved for an object in the persistent memory bank is termed a persistent memory block. A block is a contiguous set of memory pages, which means that the size of a block is always a multiple of the page size. Use vmPageSize(2K) to find out the page size for your platform.
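Because a block always occupies whole pages, the space an object consumes in the persistent memory bank is its size rounded up to the next page boundary. The following is a minimal sketch of that arithmetic. The function name round_up_to_page and the 4096-byte page size are assumptions chosen for illustration; use vmPageSize(2K) to obtain the real page size for your platform.

```shell
#!/bin/sh
# Round a requested size (in bytes) up to the next multiple of the page size.
# Usage: round_up_to_page SIZE PAGE_SIZE
# Both values are illustrative; the real page size comes from vmPageSize(2K).
round_up_to_page() {
    size=$1
    page=$2
    echo $(( (size + page - 1) / page * page ))
}

# With an assumed 4096-byte page, a 5000-byte object occupies two pages.
round_up_to_page 5000 4096   # prints 8192
```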

For each running restartable actor, the system stores the following data in persistent memory:

The persistent memory blocks used to store the actor image and executing image will only be freed when the actor's group terminates cleanly (note that this may be some time after the actor itself has terminated). The actor can also allocate its own blocks of persistent memory to store run-time data while it is executing.

Although it can be difficult to predict the likely required value of pmm.rambankSize early in the development cycle, the following rule of thumb, derived from the statements above, may be of use to developers at the system design stage:


Note -

Sharing persistent memory blocks between user actors, or between user and supervisor actors, is not supported. Persistent memory blocks can only be shared between supervisor actors.


The default value of the pmm.rambankSize tunable parameter is 1024*1024 bytes, that is, one megabyte.
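When judging whether the default is large enough, it can help to total the blocks you expect to hold in persistent memory at one time, each rounded up to a whole number of pages, and compare the result with pmm.rambankSize. The sketch below does only that; it is not the system's own accounting. The function name persistent_bytes_needed, the block sizes, and the 4096-byte page size are all hypothetical.

```shell
#!/bin/sh
# Print the total bytes occupied by a set of blocks, each rounded up to whole pages.
# Usage: persistent_bytes_needed PAGE_SIZE BLOCK_SIZE...
persistent_bytes_needed() {
    page=$1
    shift
    total=0
    for size in "$@"; do
        total=$(( total + (size + page - 1) / page * page ))
    done
    echo "$total"
}

bank=$(( 1024 * 1024 ))   # default pmm.rambankSize, in bytes

# Hypothetical blocks of 300000, 150000, and 5000 bytes; assumed 4096-byte page.
needed=$(persistent_bytes_needed 4096 300000 150000 5000)
if [ "$needed" -le "$bank" ]; then
    echo "fits: $needed of $bank bytes"
else
    echo "too small: need $needed, have $bank bytes"
fi
```

An estimate of this kind is only a lower bound on the value to choose, since the system also stores its own data for each restartable actor in the same bank.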

2.1.3 Tunable Parameters

The HOT_RESTART feature uses a number of system tunable parameters. Each parameter has a default value which can serve as a guideline and is generally suitable for getting started with hot restart programming. All tunable parameters are static: they cannot be modified while the system is running.

Two parameters define limits for persistent memory occupation in the system's persistent memory bank:

Two parameters control the maximum number of restartable actors and restart groups permitted in the system:

Two parameters define the system's restart policy (see "1.2.4 Site Restart"). These parameters are quite sensitive: different values can produce very different behavior in the system. The system manages a restart counter for each restart group. Each time a group is restarted, the system increases its restart counter by one.

2.1.4 Building the System Image

Once you have updated your system's features for hot restart and the tunable parameter settings are appropriate for your needs, you are ready to build the system image.

If you want to run the hot restart demonstration and examples, ensure that you include the examples directory and X11 library in your system build paths if they are not already included. For information on building a system image for your particular target platform, see the corresponding document in the ChorusOS 4.0 Target Family Guide collection.

After the system image has been correctly built, copy it to your boot server and reboot your target machine. You are now ready to begin programming and running applications which use the hot restart feature.

2.2 Running the Hot Restart Demonstration Program

Sun Embedded Workshop includes a graphical demonstration of the hot restart feature. The demonstration is based on the well-known program Xmaze, which has been slightly modified to make it hot restartable. Some of the program's data is stored in persistent memory, which means that when the program is restarted, it starts at a point close to the point it had reached prior to the restart. The resulting application is a ChorusOS actor called xdemo_s.

To run the hot restart demonstration program, do the following:

  1. Ensure that your system features are correctly set for hot restart: see "2.1.1 Features".

  2. Adjust the following system tunable parameters to suit the memory requirements of the Xmaze demonstration program, using ews or the configurator(1CC) command line utility:

    Tunable parameter          Description                                Required value
    pmm.rambankSize            Size of persistent memory bank, in bytes   0x400000
    kern.exec.dflSysStackSize  Default system stack size, in bytes        0x8000

  3. Configure your system image build to include the X11 library and ChorusOS examples directory, if this is not already the case.

  4. If you have made changes to the system image since the previous build, rebuild the system, copy the system image to the appropriate location (for example, the boot directory if you are using tftp-based boot), and reboot the target machine.

  5. Ensure that a copy of the xdemo_s actor is present in a directory which is mounted on the target machine. If you use the make root command, a copy of the actor is already stored in build_dir/root/bin/examples. If this directory is not mounted, or you prefer to use a different mounted directory, copy the actor there:


    $ cp build_dir/BUILD_EXAMPLES/restartDemo/xdemo_s example_directory
    
  6. Set the target machine's DISPLAY environment variable to the host machine which you are currently working on:


    $ rsh target setenv DISPLAY host_IP_address:0.0
    
  7. Run the restartable actor:


    $ rsh target arun -g 0 example_directory/xdemo_s
    

    The actor will be run as a member of the restart group with group ID 0.
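Steps 6 and 7 can be gathered into a small helper so that the target name, display host, and actor directory are supplied in one place. The sketch below only prints the two commands and executes nothing. The function name demo_commands and the sample values are hypothetical; the commands themselves are those shown in the steps above.

```shell
#!/bin/sh
# Print the commands from steps 6 and 7 for a given target, display host, and directory.
# Usage: demo_commands TARGET DISPLAY_HOST ACTOR_DIR
# Nothing is run; review the output before executing it yourself.
demo_commands() {
    target=$1
    display_host=$2
    actor_dir=$3
    echo "rsh $target setenv DISPLAY $display_host:0.0"
    echo "rsh $target arun -g 0 $actor_dir/xdemo_s"
}

# Hypothetical values; substitute your own target name, host address, and directory.
demo_commands target 192.0.2.10 example_directory
```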

The Xmaze demonstration appears on the screen. As the demonstration runs, it periodically stores its state as data in persistent memory. Let the demonstration advance a little, then restart the actor by typing the following on the host console:


$ rsh target akill aid

aid is the actor identifier which is printed on the host console when the actor starts. The actor is restarted, and the Xmaze demonstration continues from a point close to the point it had reached before the restart.

The akill command provoked the restart because it was invoked without the restart-specific -g option. To kill the Xmaze demonstration actor without restarting it, pass -g with the restart group ID:


$ rsh target akill -g 0

As the xdemo_s actor is run from the command line, it is a direct actor, and will be started automatically by the system when the site is restarted. To check this, rerun the actor, and then provoke a site restart by typing the following:


$ rsh target restart

When the system has been re-initialized, the demonstration will be restarted.

This is a very simple illustration of the use of hot restart, in which the site restart is provoked manually from the command line. As an alternative, try restarting the actor (using akill without the -g option, as shown earlier) sufficiently frequently to trigger an automatic site restart. To do this, you will first need to set the system's restart policy to be more sensitive to actor failure. The following configuration will cause a site restart if the actor is restarted twice in the space of four seconds:

Tunable parameter    Value
hrCtrl.interval
hrCtrl.maxBadness
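To reason about a policy such as "twice in the space of four seconds" before trying it on a target, the condition can be modelled offline. The sketch below assumes a simple sliding window over restart times. This is a simplification for illustration only and is not the algorithm the system uses to maintain its restart counters. The function name would_trigger_site_restart and the timestamps are hypothetical.

```shell
#!/bin/sh
# Succeed if any MAX consecutive restarts fall within INTERVAL seconds of each other.
# Usage: would_trigger_site_restart INTERVAL MAX TIMESTAMP...
# Timestamps are in seconds and must be in ascending order.
# Assumption: a plain sliding window, not the real restart counter logic.
would_trigger_site_restart() {
    interval=$1
    max=$2
    shift 2
    while [ "$#" -ge "$max" ]; do
        first=$1
        eval "last=\${$max}"   # timestamp MAX-1 positions after the first
        if [ $(( last - first )) -le "$interval" ]; then
            return 0
        fi
        shift
    done
    return 1
}

# Two restarts, at t=10 and t=13, with a four-second window and a limit of two.
if would_trigger_site_restart 4 2 10 13; then
    echo "site restart"
else
    echo "no site restart"
fi
```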