The ChorusOS operating system provides two types of APIs:
APIs that are specific to ChorusOS systems which are not directly portable to other environments.
Standard POSIX APIs that are portable to other environments.
The ChorusOS specific APIs provide more efficient access to the microkernel primitives when real-time operation is required. These APIs also enable compatibility with applications developed for earlier versions of the ChorusOS system. However, they do not enable POSIX-compliant development. Their use is therefore encouraged only when these constraints outweigh the benefits of application compatibility and portability.
This section of the guide describes how to build applications that use ChorusOS specific APIs and POSIX APIs. Examples of applications that can be built with each type of API are also presented.
This guide does not cover the ChorusOS product hardware abstraction layer and board support packages. For instructions on porting ChorusOS systems to specific hardware, or writing drivers, see the ChorusOS 5.0 Board Support Package Developer's Guide.
This chapter introduces the steps involved in developing applications to run on ChorusOS systems. It describes the APIs available to ChorusOS applications and the libraries available within these APIs.
The ChorusOS operating system provides an environment for applications running on a network of target machines, controlled by a remote host.
The target system runs the ChorusOS operating system and provides the execution environment.
The host machine provides the development and debugging environment. The user can develop applications on the host and, from the host, start and debug them on the target.
The ChorusOS operating system supports three kinds of applications:
POSIX processes
Most of the applications running on ChorusOS systems will be POSIX processes. These applications have access to pure POSIX APIs, a few "POSIX-like" extended APIs, and a small number of restricted microkernel system calls.
ChorusOS actors
These applications run on top of the microkernel and are restricted to the microkernel API. ChorusOS actors include drivers, subsystem events, and protocol stacks.
ChorusOS 4.x Legacy applications
These applications are supported for backward compatibility and use the same APIs that they used in previous versions of the ChorusOS system. Although these applications are supported in ChorusOS 5.0, you are encouraged to move such applications to the POSIX level, using the standard POSIX APIs.
The different groups of APIs represent different development environments - you can develop new applications in one or the other. You are strongly encouraged to use the ChorusOS POSIX API because it offers maximum flexibility and enables porting of POSIX applications with relative ease. A ChorusOS POSIX application cannot access the ChorusOS microkernel API (apart from a few restricted calls, which are described at the end of this chapter). If performance is critical, or if you require direct access to the microkernel primitives, you should use the ChorusOS microkernel API.
This section provides an overview of the programming interfaces available to applications developed for ChorusOS systems. The programming interface may differ from one application to another, depending on:
Application type - whether it is a POSIX process, a ChorusOS actor, or an embedded actor.
Execution mode - whether running in user or supervisor space.
Execution structure - whether it contains one or more threads.
Library names used in the ChorusOS operating system adhere to the following conventions with regard to their suffixes:
.u.a | These static libraries can only be used to build actors that will be loaded in a user address space. |
.s.a | These static libraries can only be used to build actors that will be loaded in the supervisor address space. |
.a | These static libraries can be used to build either user or supervisor actors or processes. |
.so | These libraries are shared or dynamic and can be used to build either user or supervisor actors or processes. |
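The suffix conventions above can be sketched as a small classifier. This is only an illustration: the enum, the helper names, and the function classify_library() are hypothetical and are not part of any ChorusOS API.

```c
#include <string.h>

/* Where a library may be used, according to the suffix conventions. */
enum lib_kind {
    LIB_USER_STATIC,   /* .u.a : user address space only */
    LIB_SUP_STATIC,    /* .s.a : supervisor address space only */
    LIB_ANY_STATIC,    /* .a   : user or supervisor */
    LIB_ANY_SHARED,    /* .so  : shared or dynamic, user or supervisor */
    LIB_UNKNOWN
};

/* Returns nonzero if `name` ends with `suffix`. */
static int ends_with(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Hypothetical helper: classify a library file name by its suffix.
 * The longer suffixes are tested first because ".u.a" also ends in ".a". */
enum lib_kind classify_library(const char *name)
{
    if (ends_with(name, ".u.a")) return LIB_USER_STATIC;
    if (ends_with(name, ".s.a")) return LIB_SUP_STATIC;
    if (ends_with(name, ".a"))   return LIB_ANY_STATIC;
    if (ends_with(name, ".so"))  return LIB_ANY_SHARED;
    return LIB_UNKNOWN;
}
```

For example, classify_library("libebd.u.a") yields LIB_USER_STATIC, while classify_library("libc.a") yields LIB_ANY_STATIC.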
All header file and library pathnames listed in the following subsections are relative to the installation path of your ChorusOS system, typically install_dir/chorus-target.
The programming environment of ChorusOS actors consists of the following interfaces:
Microkernel API
Private Data API
Standard-C API
Console I/O API
The ChorusOS actor APIs enable you to build two kinds of applications:
Embedded actors
Standard actors
Embedded user actors use the embedded library:
kernel/lib/embedded/libebd.u.a
Embedded supervisor actors use the embedded library:
kernel/lib/embedded/libebd.s.a
Standard actors not linked to the embedded library use the library:
os/lib/libc.a
Note that this is the same library used by POSIX processes.
The programming environment of POSIX processes includes the standard POSIX APIs and a number of POSIX extensions that can be accessed directly by any process.
Certain libraries (libc and libstdc++, for example) are implicitly provided when you select the appropriate Imakefile rule (see "The imake Environment"). Other libraries must be explicitly stated in the Imakefile.
The libraries to which a process has access may be either static (.a) or shared (.so). Whether the process uses a static or shared library depends on the rule selected in the Imakefile. Shared processes use libc.so.
The following sections describe all libraries that can be used by a process. Note that the libraries can be accessed by either a user or supervisor actor or process.
The os/lib/libc.a and os/lib/libc.so libraries are automatically included by any process or actor and include the following APIs:
Private Data API
LAP API
MIPC API
IPC API
Standard C API
POSIX I/O API
POSIX Network API
C++ actors or processes have access to the same APIs through the C++ library (os/lib/libstdc++.a).
The following libraries can be accessed by either a supervisor or user process or actor. These libraries must be explicitly listed in your Imakefile with their full pathnames.
To add a library to an Imakefile, add $(OS_DIR)/lib/lib<libraryname>.a or $(OS_DIR)/lib/lib<libraryname>.so to the relevant rule.
Table 5-1 Static libraries (.a)
If you use the libsysevent.a library, you must also use the libnvpair.a, librpc.a, and libpthreads.a libraries.
Library name and path | Description |
---|---|
os/lib/libldap.so | LDAP library |
os/lib/libnsl.so | Network services library (symbolic link to libresolv.so) |
os/lib/libpam.so | Interface library for PAM |
os/lib/libpthreads.so | POSIX pthread library |
os/lib/libresolv.so | Network services library |
os/lib/librpc.so | RPC library |
The extensions to the standard POSIX APIs can be divided into two groups:
Restricted microkernel calls
POSIX-like extended services
The restricted microkernel calls are directly available from the libc.a or libc.so libraries and include the following APIs:
MIPC
IPC
LAP
To add POSIX-like extended services, you must specify the required library in your Imakefile. These services include the following APIs:
Blackbox
SysEvent
LDAP
The embedded libraries (libebd.u.a and libebd.s.a) and the libc.a and libstdc++.a libraries have been made thread-safe to support multithreaded actors. Multithreading is managed by the library in the following way:
Protecting shared variables with mutexes.
Using the Private Data library to maintain private variables per thread (for errno management, see the next section).
Defining errno as a single global variable for an actor is not suitable for multithreaded actors. Situations can arise where a thread, when examining errno on return from a failed system call, concludes that the call failed for the wrong reason (because the global errno was changed by another system call in another thread in the meantime). Some programs also test errno rather than system call return values to detect errors.
To prevent global errno changes by another system call, the header file errno.h (which is exported by the POSIX environment) should be included in any source file using errno. This will result in a separate value of errno for each thread.
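The errno guidance above can be sketched as follows. The helper open_status() is hypothetical, and the example assumes only the standard POSIX open() and close() calls. It includes errno.h, detects failure from the return value rather than from errno, and copies errno immediately after the failed call.

```c
#include <errno.h>   /* must be included wherever errno is used */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open `path` read-only.
 * Returns 0 on success, or the errno value of the failed call. */
int open_status(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        /* Failure is detected from the return value. errno is copied at
         * once, before any later call in this thread can overwrite it. */
        int saved = errno;
        return saved;
    }
    close(fd);
    return 0;
}
```

Testing the return value first avoids acting on a stale errno value left by an earlier call that failed.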
Routines implemented in the mathematical API can be used by actors that use the microkernel API and by processes that use the POSIX API.
The mathematical API is located in kernel/lib/libm.
The mathematical library is libm.a.
The ChorusOS operating system provides a set of application programming interfaces (APIs) for the development of everything from board support packages to network drivers and subsystems to POSIX-compatible applications. However, indiscriminately mixing functions from different APIs in the same program could lead to disaster.
The ChorusOS API Classification model defines ChorusOS APIs in terms of specific classes and highlights the restrictions that exist between APIs of each class. For the complete API Classification model, see the API(5FEA) man page.
This chapter demonstrates the use of imake macros to build different types of ChorusOS applications written in the C and C++ languages. The examples in this chapter are installed with the ChorusOS product as part of the SUNWewp examples package.
As previously discussed, ChorusOS applications can be POSIX processes or actors. This chapter provides the build information for both types of applications.
After completing the examples provided in this chapter, you will understand how to use imake to build an application with the ChorusOS product.
The default installation directory for all examples used in this book is install_dir/chorus_family/src/opt/examples/. To check that the examples package is installed, you can use the Solaris tool pkginfo(1) as follows:
% pkginfo | grep SUNWewp
application SUNWewps Sun Embedded Workshop for Solaris/SPARC Examples
To build a ChorusOS component, use the make and imake tools. All development tools are provided in the install_dir/5.0/chorus-family/tools directory of your ChorusOS installation.
The make environment is defined by a file containing variable definitions and rules. This file provides the rules for compiling C, C++, and assembly language. The rules are specific to your compiler - the name of the file indicates the compiler used. For example, when using the gcc compiler, the make environment file is named tgt-make/gcc-devsys.mk. This file contains the variables and rules required for building the component. The following variables are defined:
CFLAGS and CXXFLAGS specify the compilation options for C and C++ files, respectively. The compilation options are shown in Table 6-1.
Table 6-1 Compilation Options
Option | Possible Settings | Default Setting |
---|---|---|
WARN | $(WARN_ON), $(WARN_OFF) | $(WARN_ON) |
DEBUG | $(DEBUG_ON), $(DEBUG_OFF) | $(DEBUG_OFF) |
PROF | $(PROF_ON), $(PROF_OFF) | $(PROF_OFF) |
OPT | $(OPT_ON), $(OPT_OFF) | $(OPT_ON) |
INCLUDES and DEFINES specify include paths and preprocessor definitions. These variables can be overridden at the application level. The variables are contained in the CPPFLAGS flag, which is used in compilation and in computing dependencies. Both INCLUDES and DEFINES can be initialized at the application level.
LD_UCRT0, LD_SCRT0, LD_DCRT0, LD_PSCRT0, LD_PUCRT0, LD_CRTI, and LD_CRTN are used to manage different types of crt object files. A crt object file (crt0) contains the entry point of the application (_start or _main).
LD_UCRT0 | Specifies the entry point for an embedded user actor |
LD_SCRT0 | Specifies the entry point for an embedded supervisor actor |
LD_DCRT0 | Specifies the entry point for drivers |
LD_PSCRT0 | Specifies the entry point for a supervisor actor or process |
LD_PUCRT0 | Specifies the entry point for a user actor or process |
LD_CRTI | Contains the crti.o file. This file is always the first object file of a link list and should contain the beginning of the .init and .fini sections |
LD_CRTN | Contains the .fini section |
LD_U_ACTOR and LD_S_ACTOR specify link information for user and supervisor actors.
EBD_U_LIBS, EBD_S_LIBS, CLX_U_LIBS, CLX_S_LIBS, EBD_CXX_LIBS, and CLX_LIBS are provided to manage the libraries for embedded actors, actors and processes.
Table 6-3 Library Variables
EBD_U_LIBS | Manages the C library for embedded user actors |
EBD_S_LIBS | Manages the C library for embedded supervisor actors |
CLX_U_LIBS | Manages the C library for user actors and processes |
CLX_S_LIBS | Manages the C library for supervisor actors and processes |
EBD_CXX_LIBS | Manages the C++ library for embedded actors |
CLX_LIBS | Manages the C++ library for actors and processes |
The make environment includes the following commands:
cc
ld
as
mkctors
The ChorusOS imake environment enhances the make environment by providing template rules for common ChorusOS build operations using generic names. When using the predefined imake rules, it is not necessary to know which libraries, crt files, or entry points should be used to build an application, because they are selected automatically.
Instead of creating Makefiles you create Imakefiles, and imake generates Makefiles from them.
The imake environment is defined by four files containing sets of variables and rules. These files are located in the tools/imake directory. The rules are independent of the compiler you use.
Imake.tmpl contains definitions of variables. See "imake Variable Definitions".
Imake.rules contains build rules. See "imake Build Rules".
Package.rules contains rules used to build a binary distribution. See "imake Packaging Rules".
The Imake.tmpl file contains the following definitions:
FAMILY indicates the target family (x86, usparc, ppc60x, mpc860).
COMPILER identifies the compiler to be used (for example, gcc).
REL_DIR specifies the path of the current directory. This variable is automatically set in subdirectories by imake.
HOST indicates the host operating system (SOLARIS).
DEVTOOLS_DIR indicates the path of the ChorusOS development tools.
The Imake.rules file contains macros known as Imake build rules. A list of Imake build rules and their functions is displayed in Table 6-4.
Table 6-4 imake Build Rules
Macro name | Function |
---|---|
MakeDir(dir) | Creates a directory named dir. |
LibraryTarget(lib, objs) | Adds the objects indicated by objs into the library lib. |
Depend(srcs) | Computes the dependencies of srcs and adds them to the dependency list in the Makefile (using makedepend). |
ActorTarget(prog, objs, options, crt0, libs) | Uses objs to create a C actor called prog, and passes options, crt0 and libs to the linker. |
UserActorTarget(prog, objs, libs) | Creates a user C actor or process. |
SupActorTarget(prog, objs, libs) | Creates a supervisor C actor or process. |
EmbeddedUserActorTarget(prog, objs, libs) | Creates an embedded user C actor. |
EmbeddedSupActorTarget(prog, objs, libs) | Creates an embedded supervisor C actor. |
BuiltinDriver(prog, objs, libs) | Creates a ChorusOS operating system driver. |
BspProgTarget(prog, entry, objs, libs) | Creates a BSP program. |
CXXActorTarget(prog, objs, options, crt0, libs) | Uses objs to create a C++ actor or process named prog, and passes options, crt0 and libs to the linker. |
CXXUserActorTarget(prog, objs, libs) | Creates a user C++ actor or process. |
CXXSupActorTarget(prog, objs, libs) | Creates a supervisor C++ actor or process. |
CXXEmbeddedUserActorTarget(prog, objs, libs) | Creates an embedded user C++ actor. |
CXXEmbeddedSupActorTarget(prog, objs, libs) | Creates an embedded supervisor C++ actor. |
DynamicUserTarget(prog, objs, libs, dynamicLibs, dlDeps, options) | Creates a dynamic user C process. |
DynamicSupTarget(prog, objs, libs, dynamicLibs, dlDeps, options) | Creates a dynamic supervisor C process. |
DynamicCXXUserTarget(prog, objs, libs, dynamicLibs, dlDeps, options) | Creates a dynamic user C++ process. |
DynamicCXXSupTarget(prog, objs, libs, dynamicLibs, dlDeps, options) | Creates a dynamic supervisor C++ process. |
DynamicLibraryTarget(dlib, objs, staticLibs, dynamicLibs, dlDeps, options) | Creates a dynamic library. |
SharedUserTarget(prog, shobjs, staticLibs, sharedLibs, slDeps, options) | Creates a shared user C process. |
SharedSupTarget(prog, objs, staticLibs, sharedLibs, slDeps, options) | Creates a shared supervisor C process. |
SharedCXXUserTarget(prog, objs, staticLibs, sharedLibs, slDeps, options) | Creates a shared user C++ process. |
SharedCXXSupTarget(prog, objs, staticLibs, sharedLibs, slDeps, options) | Creates a shared supervisor C++ process. |
SharedLibraryTarget(shlib, shobjs, sharedLibs, staticLibs, slDeps, options) | Creates a shared library. |
The rules for building actors and processes use the following common arguments:
objs is the list of binary objects included in the actor/process.
prog is the name of the actor/process.
libs is the list of additional libraries used to build the actor/process. The application is linked by default with the library to provide either the basic or extended environment.
The rules used to build dynamic or shared libraries for actors and processes are described in more detail in "Building a Dynamic Process".
The Package.rules file contains macros known as Imake packaging rules which are used for building a binary distribution. Their names and functions are listed in Table 6-5.
Table 6-5 imake Packaging Rules
Macro name | Function |
---|---|
DistLibrary(lib, dir) | Creates the directory dir and copies the library lib into it. |
DistActor(actor, dir) | Creates the directory dir and copies the actor actor into it. |
DistFile(file, dir) | Creates the directory dir and copies the file file into it. |
DistRenFile(file, nFile, dir) | Creates the directory dir, copies file into it, changing the name of file to nFile. |
DistProgram(program, dir) | Creates the directory dir and copies program into it. |
The following examples demonstrate how to create an Imakefile using single and multiple source files.
The application in this example is composed of a single C source file, myprog.c, in the directory myprog. Writing an Imakefile is fairly straightforward.
Set the SRCS variable to the list of source files.
In this case there is only one source file:
SRCS = myprog.c
Specify how to build the executable.
The macro you use depends on the type of binary you want. If you want to build a user-mode binary (for example myprog_u), use the UserActorTarget() macro, as illustrated in the following file extract. The first argument is the name of the executable and the second lists the object files. The third argument enables you to specify which libraries your program requires. In this example there is no library, therefore the argument is empty (you could also pass a NullParameter).
UserActorTarget(myprog_u,myprog.o,)
The rest of this example builds the user-mode binary. If you want to build a supervisor-mode binary instead (for example, myprog_s.r), use the SupActorTarget() macro as shown. The arguments are the same as for UserActorTarget().
SupActorTarget(myprog_s.r,myprog.o,)
Use the Depend() macro to generate the Makefile dependencies.
Depend($(SRCS))
The Imakefile is now complete and looks as follows:
SRCS = myprog.c
UserActorTarget(myprog_u,myprog.o,)
Depend($(SRCS))
Generate the Makefile with the ChorusOSMkMf tool.
See the ChorusOSMkMf(1CC) man page for details of how to do this. In the myprog directory, type:
% ChorusOSMkMf build_dir
Where build_dir is the directory where you have built the ChorusOS system image on which your application will run.
Generate the make dependencies.
To do this, type:
% make depend
Compile and link the program.
To do this, type:
% make
The compiled program is now in your myprog directory and ready to be executed.
If an application uses source files located in several subdirectories, you need to create a root Imakefile in the root directory, containing only the following:
#define IHaveSubdirs
SUBDIRS = subdir1 subdir2 ...
Where subdir1, subdir2, ... are the subdirectories containing the source files (or other intermediate root Imakefile files). Next, create an Imakefile in each subdirectory containing source files. To generate the first Makefile, go to the root directory and type:
% ChorusOSMkMf build_dir
Next, populate the tree with Makefile files, generate dependencies and finally compile the programs by typing make Makefiles, make depend, and then make.
% make Makefiles
% make depend
% make
The compiled program is now ready to be executed.
Examples of Imakefiles which can be modified and used to build your own applications are provided in install_dir/chorus_family/src/opt/examples.
Actors and processes are defined according to the types of libraries they use. This section describes the different library types in the ChorusOS operating system and illustrates the kinds of actors and processes that would use each library type.
There are three types of library in the ChorusOS operating system:
Static
Static library names are suffixed by .a. A static library is a collection of binary object files (.o). The linker concatenates all required binary objects within the static libraries into an executable program file.
Dynamic
Dynamic library names are suffixed by .so and can be linked with a program at runtime. Dynamic linking is supported by a ChorusOS operating system component named the runtime linker (see "Runtime Linker"). There are two cases in which dynamic linking occurs:
At actor start-up - in order to build the executable, the runtime linker loads and links a list of libraries. These libraries are known as the dependencies of the executable.
During actor execution - with the dynamic linking API, an actor can explicitly load and link to dynamic libraries using the dlopen() function. This facilitates dynamic programming.
Dynamic libraries are loaded at runtime and are not included in the executable. This reduces the size of executable files. Dynamic libraries use relocatable code format. This code is turned into absolute code by the runtime linker.
In the ChorusOS operating system, both user and supervisor actors (but not boot actors) can use dynamic libraries.
Shared
Shared libraries are also suffixed by .so. Like dynamic libraries, shared libraries can be dynamically linked but unlike dynamic libraries, they are not duplicated in physical memory. When two actors, each with its own private copy of data, use the same shared library, only one instance of the library is present in physical memory.
Shared libraries use Position Independent Code (PIC). In PIC, references to global symbols are made indirectly through symbol tables. References to global symbols, as well as local jumps, are made relative to the program counter. As a result, this code can run anywhere in memory without being modified; however, the tables must be relocated before program execution can begin. PIC in shared libraries does not perform as well as the code in dynamic libraries because the global symbol information is accessed through indirect references.
To compile an application in PIC, you must set the following feature in your Imakefile:
FPIC = ON
The FLARGEGOT feature complements the FPIC feature. Setting FLARGEGOT = ON enables support for large tables of global symbols.
Both dynamic and shared library names are suffixed by .so. However, each library type is built in a different way and is readily distinguished. The imake tool does not check that library components are PIC format binary object files. If the resulting library contains .o objects that are not in PIC format, it can still be dynamically loaded but can no longer be shared.
It is possible to check whether or not a library is shared. To check whether or not the libfoo.so library is shared, for example, run the following GNU command:
% objdump -x libfoo.so | grep TEXTREL
objdump is located in the chorus_family/tools tree. The specific path includes a family-dependent prefix.
If the library is shared, no output is generated by this command. Only dynamic library object files contain references to TEXTREL.
Actors or processes can be divided into two groups, according to the types of libraries they use.
Relocatable actors and processes
Upon actor start-up, the runtime linker loads the actor and performs the necessary relocations. A relocatable actor or process uses only static libraries.
Dynamic actors/processes
Upon actor start-up, the runtime linker loads the actor or process and performs the necessary relocations. It also loads and links the actor dependencies (the required dynamic libraries).
When you build a process or actor other than an embedded actor, you must specify one of the following macros in your Imakefile:
Relocatable processes/actors:
UserActorTarget
SupActorTarget
CXXUserActorTarget
CXXSupActorTarget
Dynamic processes/actors:
DynamicUserTarget
DynamicSupTarget
DynamicCXXUserTarget
DynamicCXXSupTarget
Shared processes/actors:
SharedUserTarget
SharedSupTarget
SharedCXXUserTarget
SharedCXXSupTarget
Each of these rules implicitly calls the libc.a or libc.so libraries. Therefore, when creating the Imakefile for a process or actor, there is no need to think about the libc.a or libc.so libraries because this is taken care of when you select the Imakefile rule. The C++ library is automatically included by specifying CXX in the relevant Imakefile rule.
The static linker runs on the development host and the runtime linker (dynamic linker) runs on the target.
The following table summarizes the actions performed by the static linker and by the runtime linker.
Link | Relocatable executable | Dynamic executable |
---|---|---|
Static Linker | .a - Static linker adds the required objects (.o) of a static library (.a) to the executable. | .a - Static linker adds the required objects (.o) of a static library (.a) to the executable. .so - Static linker adds the library to the list of libraries to be loaded at actor start-up. |
Runtime Linker | n/a | .so - At actor start-up, libraries are loaded and linked by the runtime linker. Libraries to be loaded are defined either as static links, or in the LD_PRELOAD environment variable. The runtime linker uses a library search path to locate dynamic and shared libraries. |
Runtime Linker (dlopen) | n/a | .so - Application explicitly asks the runtime linker to dynamically load and link a dynamic library using the dlopen() function of the dynamic linking API. |
Dynamic linking of libraries applies recursively to library dependencies; when a library is loaded, all libraries it uses are also loaded.
Dynamic applications consist of one or more dynamic objects. The dynamic application is typically a dynamic executable and associated dynamic object dependencies. As part of the initialization of a dynamic application, the runtime linker completes the binding of the application to its dynamic object dependencies.
In addition to initializing an application, the runtime linker provides services that enable the application to extend its address space by adding dynamic objects and binding to symbols within them.
The runtime linker performs the following functions:
Analyzes the executable's dynamic information section and determines which dynamic libraries are required.
Locates and loads the required dynamic libraries, and analyzes their dynamic information sections to determine whether additional dynamic library dependencies are required.
Performs the necessary relocations to bind dynamic libraries in preparation for actor execution (after all dynamic libraries are located and loaded).
Calls any initialization functions provided by the dynamic libraries. The initialization functions are called in the reverse order of the topologically sorted dependencies. Should cyclical dependencies exist, the initialization functions are called using the sorted order with the cycle removed.
Passes control to the application.
Calls all finalization functions on deletion of dynamic objects from the actor. The finalization functions are called in the order of the topologically sorted dependencies.
Acquires additional dynamic objects with dlopen() and binds to symbols within these objects using dlsym(), if required by the application.
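The ordering of initialization functions described above can be modeled with a depth-first walk of the dependency graph. This is a sketch of the ordering only, not the runtime linker's implementation; struct dyn_obj, MAX_OBJS, and init_order() are hypothetical names introduced for illustration.

```c
#define MAX_OBJS 8

/* A hypothetical dynamic object and the indices of the objects it needs. */
struct dyn_obj {
    const char *name;
    int ndeps;
    int deps[MAX_OBJS];
};

/* Depth-first post-order walk: every dependency is emitted before the
 * object that uses it. The visited[] flags also break cycles, because
 * each object is entered only once. */
static void visit(const struct dyn_obj *objs, int i, int *visited,
                  int *order, int *count)
{
    if (visited[i])
        return;
    visited[i] = 1;
    for (int d = 0; d < objs[i].ndeps; d++)
        visit(objs, objs[i].deps[d], visited, order, count);
    order[(*count)++] = i;
}

/* Fills order[] with the initialization order for everything reachable
 * from object `root` and returns the number of objects reached. */
int init_order(const struct dyn_obj *objs, int root, int *order)
{
    int visited[MAX_OBJS] = {0};
    int count = 0;
    visit(objs, root, visited, order, &count);
    return count;
}
```

With an executable that depends on libfoo.so and libdyn.so, where libfoo.so itself depends on libdyn.so, this model initializes libdyn.so first, then libfoo.so, then the executable. Reading order[] backwards gives the finalization order.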
The runtime linker uses a prescribed search path for locating the dynamic dependencies of an object. The default search path is the runpath recorded in the object, followed by /usr/lib. The runpath is specified when the dynamic object is constructed using the -rpath option of the linker. The environment variable LD_LIBRARY_PATH can be used to indicate directories that are to be searched ahead of the default directories.
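The search order can be written out as a short sketch: the LD_LIBRARY_PATH directories first, then the runpath, then /usr/lib. The function library_search_order() is a hypothetical helper that only models the ordering; it does not query or drive the runtime linker.

```c
#include <stdio.h>

/* Hypothetical helper: write the directories, in the order they would be
 * searched, into buf as a colon-separated list. Either argument may be
 * NULL or empty. Returns 0 on success, -1 if buf is too small. */
int library_search_order(const char *ld_library_path, const char *runpath,
                         char *buf, size_t size)
{
    int have_env = ld_library_path != NULL && *ld_library_path != '\0';
    int have_run = runpath != NULL && *runpath != '\0';

    /* LD_LIBRARY_PATH entries come ahead of the default directories. */
    int n = snprintf(buf, size, "%s%s%s%s/usr/lib",
                     have_env ? ld_library_path : "",
                     have_env ? ":" : "",
                     have_run ? runpath : "",
                     have_run ? ":" : "");
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A caller could pass getenv("LD_LIBRARY_PATH") as the first argument and the directory given to the linker's -rpath option as the second.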
The runtime linker requires a file system to load dynamic objects. This file system can be on a host and can be accessed through NFS from the target. In embedded systems without a network connection, a bank of the system image can be used. For example, /image/sys_bank.
The following environment variables are used by the runtime linker:
LD_LIBRARY_PATH specifies a colon-separated list of directories that are to be searched ahead of the default directories defined previously. The colon-separated list is used to enhance the search path that the runtime linker uses to locate dynamic and shared libraries.
LD_PRELOAD provides a dynamic object name that is linked after the program is loaded but before any other dynamic objects that the program references.
LD_DEBUG is a colon-separated list of tokens for debugging the runtime linking of an application. Each token is associated with a set of traces that are displayed during runtime linking. The supported tokens are: file-ops, reloc, symbol-resolution, malloc, segment-alloc, dependency, misc, linking, dynamic-map-op, group, and error.
Wildcard substitutions can also be used; for example, s* matches both symbol-resolution and segment-alloc. Using just the wildcard (*) matches all traces.
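The wildcard behavior of LD_DEBUG tokens can be illustrated with the standard POSIX fnmatch() function. This is a model of the matching described above, not the runtime linker's own code; count_matching_tokens() is a hypothetical helper.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Trace tokens listed in this section. */
static const char *const ld_debug_tokens[] = {
    "file-ops", "reloc", "symbol-resolution", "malloc", "segment-alloc",
    "dependency", "misc", "linking", "dynamic-map-op", "group", "error"
};

/* Hypothetical helper: count the tokens selected by one LD_DEBUG pattern. */
int count_matching_tokens(const char *pattern)
{
    size_t n = sizeof ld_debug_tokens / sizeof ld_debug_tokens[0];
    int matches = 0;

    for (size_t i = 0; i < n; i++)
        if (fnmatch(pattern, ld_debug_tokens[i], 0) == 0)
            matches++;
    return matches;
}
```

In this model, the pattern s* selects two tokens (symbol-resolution and segment-alloc) and the pattern * selects all eleven.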
Immediate binding:
The runtime linker performs both data reference and function reference relocations during process initialization (before transferring control to the application). This behavior is equivalent to the LD_BIND_NOW behavior in the Solaris operating environment (lazy binding is not supported).
No version checking:
The runtime linker does not perform version dependency checking. When looking for a library, the runtime linker looks for a file name that matches the library name exactly. This behavior is equivalent to the LD_NOVERSION behavior in the Solaris operating environment.
Weak symbols and aliases:
During symbol resolution, weak symbols are silently overridden by any global definition with the same name. Weak symbols can be defined alone or as aliases to global symbols. Weak symbols are defined with pragma definitions.
Call debug:
The runtime linker obtains debug information from the dynamic library, thereby enabling you to debug the dynamic library.
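As an illustration of the weak symbol behavior described above, the following sketch defines a weak function with a pragma. It assumes the GCC form of the directive (#pragma weak); the exact pragma accepted by your compiler may differ, and both function names are hypothetical.

```c
/* Declare the symbol weak before defining it. A global definition of
 * default_trace_level in another object file would override this one. */
#pragma weak default_trace_level
int default_trace_level(void)
{
    return 0;   /* fallback used when nothing overrides it */
}

/* Callers reference the symbol normally; symbol resolution decides which
 * definition is actually used. */
int current_trace_level(void)
{
    return default_trace_level();
}
```

When no other object defines default_trace_level, the weak definition is the one that runs.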
In a dynamic program, an actor can link a dynamic or shared library explicitly during its execution. This on-demand object linking has several advantages:
By processing a dynamic object when it is required rather than during the initialization of an application, start-up time can be greatly reduced. In fact, the object might not be required if its services are not required during a particular run of the application.
The application can choose between several different dynamic objects depending on the exact services required. For example, if different libraries implement the same driver interface, the application can choose a specific driver implementation and load it dynamically.
Any dynamic object added to the actor address space during execution can be freed after use, thereby reducing overall memory consumption.
Typically, an application performs the following sequence to access an additional dynamic object using the dynamic library API:
A dynamic object is located and added to the address space of a running application using dlopen(). Any dependencies this dynamic object has are also located and added at this time.
The added dynamic object and its dependencies are relocated, and any initialization sections within these objects are called.
The application locates symbols within the added objects using dlsym(). The application is then able to reference the data or call the functions defined by these new symbols.
After the application has finished with the objects, the address space can be freed using dlclose(). Any termination section within the objects being freed will be called at this time.
Any error conditions that occur as a result of using these runtime linker interface routines can be displayed using dlerror().
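The sequence above can be sketched as follows. This is a minimal illustration that assumes a POSIX host providing the standard <dlfcn.h> header (rather than the ChorusOS <cx/dlfcn.h> header), and the helper name call_int_symbol() is hypothetical. Passing NULL as the path asks dlopen() for a handle on the running program and its already loaded dependencies.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

typedef int (*int_fn)(int);

/*
 * Hypothetical helper: load the object at `path` (NULL means the running
 * program and its dependencies), look up `symbol`, call it with `arg`
 * and store the result in *out. Returns 0 on success, -1 on failure.
 */
int call_int_symbol(const char *path, const char *symbol, int arg, int *out)
{
    void *handle;
    void *addr;
    const char *msg;

    assert(symbol != NULL && out != NULL);

    /* Locate the object and add it to the address space. */
    handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        msg = dlerror();
        fprintf(stderr, "dlopen: %s\n", msg ? msg : "unknown error");
        return -1;
    }

    /* Locate the symbol within the added object. */
    addr = dlsym(handle, symbol);
    if (addr == NULL) {
        msg = dlerror();
        fprintf(stderr, "dlsym: %s\n", msg ? msg : "symbol not found");
        dlclose(handle);
        return -1;
    }

    /* Call the function through the pointer returned by dlsym(). */
    *out = ((int_fn) addr)(arg);

    /* Release the object once it is no longer needed. */
    dlclose(handle);
    return 0;
}
```

A caller might invoke call_int_symbol(NULL, "abs", -5, &r) to look up the C library routine abs() at runtime. On some hosts the program must also be linked with -ldl.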
Both dynamic and shared libraries can be used in dynamic programs. The only difference is that shared libraries are not duplicated in memory.
The following imake macro is used to build dynamic libraries:
DynamicLibraryTarget(dlib, objs, staticLibs, dynamicLibs, dlDeps, options)
This macro includes the following arguments:
dlib: name of the resulting dynamic library (suffixed by .so).
objs: library components. A list of binary object files (suffixed by .o).
staticLibs: a list of the static libraries (.a) that can be statically linked.
dynamicLibs: a list of dependencies. This list includes the dynamic libraries that must be loaded together with the resulting library. Each library can be defined in one of two ways:
-L path -l name
On the host, the linker looks for the library path/libname.so. On the target, the runtime linker looks for libname.so in the library search path.
path
This library path can be absolute or relative and is used on the host by the linker, and on the target by the runtime linker. A relative path containing a "/" is interpreted as relative to the current directory by the runtime linker. A path without the "/" is searched in the library search path by the runtime linker.
dlDeps: a list of dynamic libraries on which the resulting library depends. If these dynamic libraries are changed, the resulting library dlib will be rebuilt. Each library must be defined as a path on the host. Generally dlDeps duplicates the libraries described in dynamicLibs. To express the dependency without embedding a path in the executable, use the -L path -l name syntax.
options: any linker options.
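The lookup rule described for the path form of dynamicLibs can be sketched in a few lines of C. This only illustrates the rule as stated above and is not the runtime linker's code. The helper name library_candidate() and its single search_dir parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the file name that would be tried for a
 * library reference. A name containing '/' is used as given (absolute,
 * or relative to the current directory). A bare name is looked up in a
 * directory of the library search path.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
int library_candidate(const char *name, const char *search_dir,
                      char *out, size_t out_size)
{
    int n;

    assert(name != NULL && search_dir != NULL && out != NULL);

    if (strchr(name, '/') != NULL) {
        n = snprintf(out, out_size, "%s", name);
    } else {
        n = snprintf(out, out_size, "%s/%s", search_dir, name);
    }
    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

With a search directory of /lib, the bare name libdyn.so yields /lib/libdyn.so, while ./sub/libdyn.so is left unchanged.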
The following example builds a dynamic library named libfoo.so from the binary object files a.o and b.o. When this library is loaded dynamically, the runtime linker will also load the dynamic library libdyn.so, which must be in its search path.
DynamicLibraryTarget( libfoo.so, a.o b.o, , libdyn.so, , )
Dynamic libraries are supported with the gcc compiler only.
The following imake macros are used to build dynamic applications:
DynamicUserTarget(prog, objs, staticLibs, dynamicLibs, dlDeps, options)
DynamicSupTarget(prog, objs, staticLibs, dynamicLibs, dlDeps, options)
DynamicCXXUserTarget(prog, objs, staticLibs, dynamicLibs, dlDeps, options)
DynamicCXXSupTarget(prog, objs, staticLibs, dynamicLibs, dlDeps, options)
For details of the functions that each macro performs, refer to "imake Build Rules".
The prog argument is the name of the resulting process. Other arguments are similar to those in the DynamicLibraryTarget() macro. For the options argument, the following options are particularly useful:
-soname=<name>
Within a DynamicLibraryTarget() rule, this option sets the internal soname of the library. If a library is used as a dependency in a rule that builds a dynamic executable and has a soname defined, the executable records the soname instead of the dynamicLibs argument.
-rpath <dir>
This option defines the runpath directory that is added to the library search path.
The following example builds a dynamic application called dyn_app from the binary object files a.o and b.o. The process is statically linked with the static ChorusOS operating system library. When this application is started, the runtime linker loads the dynamic library libdyn.so. In the target file system, this library can be located in the /libraries directory because this directory will be added to the search path of the runtime linker.
DynamicUserTarget( dyn_app, a.o b.o, , libdyn.so, , -rpath /libraries)
The following imake macro is used to build shared libraries:
SharedLibraryTarget(shlib, shobjs, sharedLibs, staticLibs, slDeps, options)
shlib: the name of the resulting shared library (suffixed by .so).
shobjs: a list of binary object files in PIC format (suffixed by .o).
sharedLibs: a list of dependencies. This includes the shared libraries that must be loaded together with the resulting library. Each library can be defined in one of two ways:
-L path -l name
On the host, the linker looks for the library path/libname.so. On the target, the runtime linker looks for libname.so in the library search path.
path
This is an absolute or relative library path used on the host by the linker, and on the target by the runtime linker. A relative path containing a "/" is interpreted as relative to the current directory by the runtime linker. A path without the "/" is searched in the library search path by the runtime linker.
staticLibs: a list of static libraries (.a) that are statically linked.
slDeps: a list of shared libraries on which the resulting library depends. If these libraries are changed, the resulting library shlib will be rebuilt. Each library must be defined as a path on the host. Generally slDeps duplicates the libraries described in sharedLibs. To express the dependency without embedding a path in the executable, use the -L path -l name syntax.
options: any linker options, preceded by -Xlinker. This option must be used to supply system-specific linker options which cannot be recognized by the compiler. To pass an option that takes an argument, you must use -Xlinker twice, once for the option and once for the argument.
The following example builds a shared library named libfoo.so from the PIC binary object files a.o and b.o. When this library is loaded dynamically, the runtime linker will also load libshared.so.
SharedLibraryTarget( libfoo.so, a.o b.o, , libshared.so, , )
The following imake macros are used to build shared applications:
SharedUserTarget(prog, shobjs, staticLibs, sharedLibs, slDeps, options)
SharedSupTarget(prog, objs, staticLibs, sharedLibs, slDeps, options)
SharedCXXUserTarget(prog, objs, staticLibs, sharedLibs, slDeps, options)
SharedCXXSupTarget(prog, objs, staticLibs, sharedLibs, slDeps, options)
For more information on each of these macros, refer to "imake Build Rules".
The prog argument is the name of the resulting process. Other arguments are the same as the SharedLibraryTarget() macro. For the options argument, the following options are particularly useful:
-Xlinker -soname=<name>
Within the SharedLibraryTarget() rule, this option sets the internal soname of the library. If a library used as a dependency in a rule that builds a dynamic application has a soname defined, the executable records the soname instead of the sharedLibs argument.
-Xlinker -rpath -Xlinker <dir>
This option defines the runpath directory that is added to the library search path.
The following example builds a shared application named shr_app from the PIC binary object files a.o and b.o. When this process is started, the runtime linker loads the shared libraries libshared.so and libcx.u.so (if required). In the target file system, this library is located in the /libraries directory because this directory will be added to the search path of the runtime linker.
SharedUserTarget( shr_app, a.o b.o, , libshared.so, , -Xlinker -rpath -Xlinker /libraries)
As with shared libraries, it is possible to build shared applications that do not contain .o objects in PIC format. These applications can still use shared libraries; however, the application code itself is not shared.
This section discusses two examples of dynamic and shared applications. In these examples, it is assumed that a standard development environment has been set up (system build tree, search path, boot and initialization of target machine).
In these examples, the chorus_root_directory is the path of the target root directory on the NFS host (for example /home/chorus/root), the name of the target is jericho, and the environment variable WORK refers to the directory used for building these examples.
The following dynamic application uses a custom dynamic library which will be loaded and linked at application start-up. It uses the function foo() which is defined in the dynamic library. This function calls the bar() function defined in the main application.
The following is the dynamic application progdyn.c:
#include <stdio.h>
#include <chorus.h>

extern void foo();

int main()
{
    foo();    /* calling foo defined in the library */
    return 0;
}

void bar()
{
    printf("bar called\n");
}
The following is the dynamic library libdyn.c:
#include <stdio.h>
#include <chorus.h>

extern void bar();

void foo()
{
    printf("Calling bar\n");
    bar();    /* calling bar defined in the main application */
}
Create the directory and the Imakefile.
Create a directory libdyndir in $WORK, containing libdyn.c and the following Imakefile:
SRCS = libdyn.c

DynamicLibraryTarget (libdyn.so, libdyn.o, , , ,)
Depend(libdyn.c)
Build the dynamic library.
In the libdyndir directory, build the dynamic library libdyn.so using the commands ChorusOSMkMf, make depend, and make.
Create the directory and the Imakefile.
Create a directory progdyndir in $WORK, containing progdyn.c and the following Imakefile:
SRCS = progdyn.c

DynamicUserTarget (progdyn, progdyn.o, , $(WORK)/libdyndir/libdyn.so, $(WORK)/libdyndir/libdyn.so, )
Depend(progdyn.c)
Build the dynamic application.
In the progdyndir directory, build the dynamic application progdyn using the commands ChorusOSMkMf, make depend and make.
% ChorusOSMkMf $WORK
% make depend
% make
Copy the dynamic application.
Copy the dynamic application into the /bin subdirectory of the chorus_root_directory directory:
% cp $WORK/progdyndir/progdyn chorus_root_directory/bin
Copy the dynamic library.
Copy the dynamic library into the /lib subdirectory of the chorus_root_directory directory:
% cp $WORK/libdyndir/libdyn.so chorus_root_directory/lib
The following command will notify the runtime linker where to find the libdyn.so dynamic library:
% rsh jericho setenv LD_LIBRARY_PATH /lib
Alternatively, set the runpath to /lib in the ldopts argument of the application macro (-rpath /lib).
Start the application.
Start the application and dynamically load the libdyn.so library, using the following command:
% rsh jericho /bin/progdyn
The following program explicitly loads a dynamic library at runtime using the function dlopen(). This program searches for the address of the dynfunc() function (defined in the library) and calls this function.
The following is the dynamic program progdyn2.c:
#include <stdio.h>
#include <stdlib.h>
#include <chorus.h>
#include <cx/dlfcn.h>

int main()
{
    void (*funcptr)();    /* pointer to function to search */
    void *handle;         /* handle to the dynamic library */

    /* finding the library */
    handle = dlopen("libdyn2.so", RTLD_NOW);
    if (!handle) {
        printf("Cannot find library libdyn2.so\n");
        exit(1);
    }

    /* finding the function in the library */
    funcptr = (void (*)()) dlsym(handle, "dynfunc");
    if (!funcptr) {
        printf("Cannot find function dynfunc\n");
        exit(1);
    }

    /* calling library function */
    (*funcptr)();
    return 0;
}
The following is the dynamic library libdyn2.c:
#include <stdio.h>
#include <chorus.h>

void dynfunc()
{
    printf("Calling dynfunc\n");
}
The program and library above are built in the same way as in the previous example, using two Imakefiles:
Create the first directory.
Create a directory called libdyn2dir in $WORK, containing libdyn2.c and the following Imakefile:
SRCS = libdyn2.c

DynamicLibraryTarget (libdyn2.so, libdyn2.o, , , , )
Depend(libdyn2.c)
Create the second directory.
Create a directory progdyn2dir in $WORK, containing progdyn2.c and the following Imakefile:
SRCS = progdyn2.c

DynamicUserTarget (progdyn2, progdyn2.o, , , , )
Depend(progdyn2.c)
Copy the program.
Copy the dynamic program into the /bin subdirectory of the chorus_root_directory directory:
% cp $WORK/progdyn2dir/progdyn2 chorus_root_directory/bin
Copy the library.
Copy the dynamic library into the /lib subdirectory of the chorus_root_directory directory:
% cp $WORK/libdyn2dir/libdyn2.so chorus_root_directory/lib
Notify the runtime linker.
Use the following command to notify the runtime linker where to find the libdyn2.so dynamic library:
% rsh jericho setenv LD_LIBRARY_PATH /lib
Start the program.
Use the following command to start the program:
% rsh jericho arun /bin/progdyn2
At program start-up, the runtime linker loads only the progdyn2 executable. The libdyn2.so library is loaded when the dlopen() function is called.
A small modification to the previous example enables a shared application to load a shared library dynamically.
Create the shared library file.
Copy the libdyn.c file into the libshareddir directory and rename it libshared.c. The following Imakefile builds a PIC binary object file called libshared.o and a shared library named libshared.so:
FPIC = ON
SRCS = libshared.c

SharedLibraryTarget (libshared.so, libshared.o, , , , )
Depend($(SRCS))
Create the shared program file.
Copy the progdyn.c file into the progshareddir directory and rename it progshared.c. The following Imakefile builds a PIC binary object file called progshared.o and an executable file called progshared_u:
FPIC = ON
SRCS = progshared.c
LIBPATH = <pathname of the shared library directory>

SharedUserTarget (progshared_u, progshared.o, $(UTILS_LIB) $(CLX_UTILS_LIB), -L$(LIBPATH) -Xlinker -rpath -Xlinker /shared -lshared,,)
Depend($(SRCS))
Copy the shared application.
Copy the shared application into the /bin subdirectory of the chorus_root_directory directory:
% cp $WORK/progshareddir/progshared_u chorus_root_directory/bin
Copy the shared library.
Copy the shared library into the /shared subdirectory of the chorus_root_directory directory:
% cp $WORK/libshareddir/libshared.so chorus_root_directory/shared
There is no need to set LD_LIBRARY_PATH as the path for the runtime linker. This path was specified in the Imakefile by setting the -Xlinker -rpath option.
Start the application.
The following command starts the application and loads the libshared.so and libc.so libraries (if they have not already been loaded by another application):
% rsh jericho /bin/progshared_u
The following procedure enables a shared program to load a shared library explicitly.
Create the library file.
Copy the libdyn2.c file into the libshared2dir directory and rename it libshared2.c. The following Imakefile will build a shared library called libshared2.so:
FPIC = ON
SRCS = libshared2.c

SharedLibraryTarget (libshared2.so, libshared2.o,,,,)
Depend($(SRCS))
Create the program file.
Copy the progdyn2.c file into the progshared2dir directory and rename it progshared2.c. The following Imakefile builds a PIC binary object file called progshared2.o and an executable file called progshared2_u:
FPIC = ON
SRCS = progshared2.c

SharedUserTarget (progshared2_u, progshared2.o, $(UTILS_LIB) $(CLX_UTILS_LIB),,,)
Depend($(SRCS))
Copy the application.
Copy the shared application into the /bin subdirectory of the chorus_root_directory directory:
% cp $WORK/progshared2dir/progshared2_u chorus_root_directory/bin
Copy the shared library.
Copy the shared library into the /lib subdirectory of the chorus_root_directory directory:
% cp $WORK/libshared2dir/libshared2.so chorus_root_directory/lib
Notify the runtime linker.
The following command will notify the runtime linker where to find the libshared2.so shared library:
% rsh jericho setenv LD_LIBRARY_PATH /lib/shared:/lib
Alternatively, set the runpath to /lib in the ldopts argument of the program macro (-rpath /lib).
Start the program.
The following command will start the program and load the libshared2.so and libc.so libraries (if they have not already been loaded by another actor):
% rsh jericho /bin/progshared2_u
This chapter deals with threads, which are the basic unit of execution on ChorusOS systems. It also demonstrates the use of the ChorusOS native application programming interfaces for thread creation and deletion, and for multithreaded application development.
After implementing the examples provided in this chapter, you should understand how these APIs can be used in an application.
A thread is the flow of control within an actor. Each thread is associated with an actor and defines a unique execution state. An actor may contain multiple threads. The threads share the resources of that actor, such as memory regions or message spaces, and are scheduled independently.
Each thread has a local identifier, threadli. A thread may obtain its local identifier using the following ChorusOS operating system service:
#include <chorus.h>
int threadSelf();
An example of how this call can be used is provided in Example 7-1.
For an overview of the ChorusOS thread model, see the "Microkernel" chapter in the ChorusOS 5.0 Features and Architecture Overview.
A thread may be created dynamically, using the following ChorusOS operating system service:
#include <chorus.h>
int threadCreate(KnCap* actorCap, KnThreadLid* thLi, KnThreadStatus status, void* schedParam, void* startInfo);
The actorCap parameter identifies the actor in which the new thread will be created. The standard way to create a new thread in the current actor is to pass K_MYACTOR as the actor capability. If this is successful, the local identifier of the newly created thread is returned at the location defined by the thLi parameter.
The schedParam parameter is used to define the scheduling properties of the thread to be created. If this parameter is set to 0, the created thread inherits the scheduling attributes of the creator thread.
The startInfo parameter defines the initial state of the thread, such as the initial program counter of the thread (the thread entry point). It also defines the initial value of the stack pointer to be used by the created thread and whether the thread will run as a user thread or as a supervisor thread.
A thread needs a stack to run so as to have room in which to store its local variables. When you create a user thread, you must explicitly specify a user stack for the thread. Stacks for supervisor threads are implicitly allocated by the system. In fact, a system stack is allocated for all threads, even those running in user mode. When a thread switches from user to supervisor mode, the system stack is used, rather than the user stack.
Because the operating system does not prevent the user stack from overflowing, checks must be made each time a thread is created.
System stacks must not overflow: an overflow corrupts memory, resulting in unpredictable operating system behavior.
Example 7-1 is a basic program illustrating the creation of a thread by the main thread of an actor. The actor is loaded by the arun command. Its main thread will be implicitly created by the system. The following example includes these steps:
A thread is created, which prints a message, including its thread identifier.
Simultaneously, the main thread prints another message with both thread identifiers.
The main thread then terminates the actor.
The following example works without modification, whether run as a user or supervisor actor. A user thread is created when the actor is a user actor, and a supervisor thread is created when the actor is a supervisor actor. The actorPrivilege(2K) call is used to retrieve the privilege of the actor, which determines the privilege of the thread to be created.
The example requires synchronization between the main thread and the created thread. Execution of a thread can be suspended for a given delay:
#include <chorus.h>
int threadDelay(KnTimeVal* waitLimit);
The preceding call suspends the execution of the invoking thread for a period specified by the KnTimeVal structure. There are two predefined values:
K_NOTIMEOUT specifies an infinite delay.
K_NOBLOCK specifies no delay. This is an explicit request for the processor to yield and reschedule another thread of the same priority.
These two values may be used instead of the pointer to the KnTimeVal data structure. There is also a predefined macro which sets this type of data structure from a delay expressed in milliseconds: K_MILLI_TO_TIMEVAL(KnTimeVal* waitLimit, int delay). For more information, see the threadCreate(2K), threadDelay(2K) and threadSelf(2K) man pages.
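The conversion performed by a macro such as K_MILLI_TO_TIMEVAL can be pictured with the following sketch. It is an assumption-laden illustration, not the ChorusOS definition: the SketchTimeVal structure, its field names, and the seconds/nanoseconds split are all hypothetical stand-ins for KnTimeVal, whose real layout is given in the man pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for KnTimeVal; the field names are assumptions. */
typedef struct {
    long sec;     /* whole seconds */
    long nsec;    /* remaining nanoseconds, 0 to 999999999 */
} SketchTimeVal;

/* Fill *tv from a delay expressed in milliseconds. */
void sketch_milli_to_timeval(SketchTimeVal *tv, long delay_ms)
{
    assert(tv != NULL && delay_ms >= 0);
    tv->sec = delay_ms / 1000;
    tv->nsec = (delay_ms % 1000) * 1000000L;
}
```

Under these assumptions, a delay of 1500 milliseconds becomes 1 second plus 500000000 nanoseconds.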
(file: progov/thCreate.c)

#include <stdio.h>
#include <stdlib.h>
#include <chorus.h>

#define USER_STACK_SIZE (1024 * sizeof(long))

int childCreate(KnPc entry)
{
    KnActorPrivilege     actorP;
    KnDefaultStartInfo_f startInfo;
    char*                userStack;
    int                  childLid = -1;
    int                  res;

    /* Set defaults startInfo fields */
    startInfo.dsType = K_DEFAULT_START_INFO;
    startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

    /* Get actor's privilege */
    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get the privilege of the actor, error %d\n", res);
        exit(1);
    }

    /* Set thread privilege */
    if (actorP == K_SUPACTOR) {
        startInfo.dsPrivilege = K_SUPTHREAD;
    } else {
        startInfo.dsPrivilege = K_USERTHREAD;
    }

    /* Allocate a stack for user threads */
    if (actorP != K_SUPACTOR) {
        userStack = malloc(USER_STACK_SIZE);
        if (userStack == NULL) {
            printf("Cannot allocate user stack\n");
            exit(1);
        }
        startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
    }

    /* Set entry point for the new thread */
    startInfo.dsEntry = entry;

    /* Create the thread in the active state */
    res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, NULL, &startInfo);
    if (res != K_OK) {
        printf("Cannot create the thread, error %d\n", res);
        exit(1);
    }
    return childLid;
}

void sampleThread()
{
    int myThreadLi;

    myThreadLi = threadSelf();
    printf("I am the new thread. My thread identifier is: %d\n", myThreadLi);

    /* Block itself for ever */
    (void) threadDelay(K_NOTIMEOUT);
}

int main(int argc, char** argv, char** envp)
{
    int       myThreadLi;
    int       newThreadLi;
    KnTimeVal wait;

    newThreadLi = childCreate((KnPc)sampleThread);
    myThreadLi = threadSelf();

    /* Initialize KnTimeVal structure */
    K_MILLI_TO_TIMEVAL(&wait, 10);

    /*
     * Suspend myself for 10 milliseconds to give the newly
     * created thread the opportunity to run before
     * the actor terminates.
     */
    (void) threadDelay(&wait);

    printf("Parent thread identifier = %d, Child thread identifier = %d\n",
           myThreadLi, newThreadLi);
    return 0;
}
In the previous example the schedParam parameter is set to a NULL pointer. As a result, the created thread inherits the scheduling attributes of the creator thread. The actorPrivilege(2K) service enables the program to determine whether it must allocate a user stack area for the created thread. It also indicates the type of thread to be created.
If the actor is a supervisor actor, the line startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE indicates to the system the expected usage of the system stack. The maximum system stack length is defined by a global tunable value.
On some platforms, the stack pointer value passed in dsUserStackPointer is automatically decremented by the microkernel before being used for the thread. The decrementation occurs either to enforce the platform-required alignment (on 8-byte or 16-byte boundaries, for example) or to reserve a space. This space will be accessed by a typical C language routine because of the platform-specific calling conventions, such as saving the return address to the caller.
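The kind of adjustment described above can be illustrated with plain pointer arithmetic. The sketch below is not the microkernel's code: the function name sketch_initial_sp() and its alignment and reserved parameters are assumptions chosen for the example, and the stack is assumed to grow toward lower addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: starting from the address just past the end of
 * the stack area, reserve `reserved` bytes and round the result down to
 * a multiple of `alignment` (which must be a power of two).
 */
uintptr_t sketch_initial_sp(uintptr_t stack_top, size_t alignment,
                            size_t reserved)
{
    uintptr_t sp;

    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(stack_top >= reserved + alignment);

    sp = stack_top - reserved;               /* room for caller-saved data */
    sp &= ~((uintptr_t) alignment - 1);      /* round down to the boundary */
    return sp;
}
```

Because the value can only decrease, the usable stack is slightly smaller than the allocated area, which is one more reason to size user stacks generously.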
The status parameter specifies that the thread is created in the active state, so that it is ready to execute as soon as it is created. Note that, although this program creates only one thread, there are in fact two threads running in this actor: the main thread, created implicitly by the system when the actor is loaded, and the thread created explicitly by the program.
The example uses the threadDelay(2K) service, which enables a thread to suspend its execution for a certain period. The parent thread suspends itself for ten milliseconds, so that the child thread is able to run before exit(3STDC) is called. Without this suspension period in the parent thread, the actor could terminate before the created thread had run. The termination of an actor implies that all its resources are freed. Threads are not an exception to this rule. Therefore, the exit(3STDC) call at the end of the main routine results in the destruction of both threads.
Using the suspension period within the parent thread does not guarantee that the child thread will execute correctly. Depending on the load of the system, ten milliseconds might not be sufficient to ensure that the child thread has completed its task. The threadDelay(2K) call is used in this example merely for the sake of simplicity, and is not recommended in practice for synchronizing threads. A more reliable synchronization method should be used to ensure that the actor does not terminate before the second thread has completed its work. These synchronization mechanisms are further explained in "Mutual Exclusion Locks" and "Semaphores".
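To show what explicit synchronization looks like in contrast to a fixed delay, here is a minimal sketch using the POSIX threads API instead of the ChorusOS native calls. It is an analogue only: the ChildState structure and the function names are hypothetical, and the ChorusOS mutex and semaphore services described later play the equivalent role in native code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state used by the parent to wait for the child. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  finished;
    int             done;      /* set to 1 by the child when it is finished */
    int             result;    /* value produced by the child */
} ChildState;

static void *child_main(void *arg)
{
    ChildState *st = arg;

    assert(st != NULL);
    pthread_mutex_lock(&st->lock);
    st->result = 42;                       /* stands in for the child's work */
    st->done = 1;
    pthread_cond_signal(&st->finished);
    pthread_mutex_unlock(&st->lock);
    return NULL;
}

/* Create a child thread and block until it reports completion. */
int run_child_and_wait(void)
{
    ChildState st;
    pthread_t  tid;
    int        result;

    st.done = 0;
    st.result = 0;
    pthread_mutex_init(&st.lock, NULL);
    pthread_cond_init(&st.finished, NULL);

    if (pthread_create(&tid, NULL, child_main, &st) != 0) {
        pthread_cond_destroy(&st.finished);
        pthread_mutex_destroy(&st.lock);
        return -1;
    }

    /* Wait on the condition rather than sleeping for a guessed delay. */
    pthread_mutex_lock(&st.lock);
    while (!st.done) {
        pthread_cond_wait(&st.finished, &st.lock);
    }
    result = st.result;
    pthread_mutex_unlock(&st.lock);

    pthread_join(tid, NULL);               /* child no longer touches st */
    pthread_cond_destroy(&st.finished);
    pthread_mutex_destroy(&st.lock);
    return result;
}
```

The parent proceeds only after the child has signalled completion, regardless of system load.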
The child thread uses the K_NOTIMEOUT special value to suspend itself forever. This is a simple way to avoid undesirable behavior of the child thread until the actor terminates. Suppose the call to threadDelay(2K) were omitted. The child thread, having executed the printf(3STDC) statement, would reach the end of the sampleThread() routine which, since it is written in C, terminates with a return instruction. However, the child thread has nowhere to return. As a result it would return to an unspecified location, probably resulting in a memory fault.
An appropriate method to terminate a created thread is to make it delete itself using the threadDelete() service (see "Deleting a Thread").
The ChorusOS operating system does not preset the stack of a thread to ensure that the thread is deleted upon return from its starting routine. You, the ChorusOS application developer, must ensure that threads are properly cleaned up after they have finished running. Mechanisms for coping with these types of situations are described in Chapter 8, Native Memory Management.
A thread may be deleted dynamically by itself, or by another thread using the following service:
#include <chorus.h>
int threadDelete(KnCap* actorCap, KnThreadLid thLi);
This call enables one thread to delete another inside the same actor: the actorCap parameter is set to K_MYACTOR and thLi is the local identifier of the thread to be deleted. The call also enables one thread to delete another inside a different actor (provided they are both running on the same machine), as long as it provides both the actor capability and the target thread identifier. The predefined thread identifier K_MYSELF enables a thread to name itself without knowing its actual thread identifier.
Example 7-2 is a slightly different version of the previous example. The subroutine childCreate() is unchanged; however, in this example the created thread deletes itself instead of going idle forever.
This strategy does not solve the synchronization problem that occurred in the previous example: the main thread still does not know exactly when to terminate the actor.
For more information, refer to the threadDelete(2K) man page.
(file: progov/thDelete.c)

#include <stdio.h>
#include <stdlib.h>
#include <chorus.h>

#define USER_STACK_SIZE (1024 * sizeof(long))

int childCreate(KnPc entry)
{
    KnActorPrivilege     actorP;
    KnDefaultStartInfo_f startInfo;
    char*                userStack;
    int                  childLid = -1;
    int                  res;

    startInfo.dsType = K_DEFAULT_START_INFO;
    startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get the privilege of the actor, error %d\n", res);
        exit(1);
    }

    if (actorP == K_SUPACTOR) {
        startInfo.dsPrivilege = K_SUPTHREAD;
    } else {
        startInfo.dsPrivilege = K_USERTHREAD;
    }

    if (actorP != K_SUPACTOR) {
        userStack = malloc(USER_STACK_SIZE);
        if (userStack == NULL) {
            printf("Cannot allocate user stack\n");
            exit(1);
        }
        startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
    }

    startInfo.dsEntry = entry;

    res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, NULL, &startInfo);
    if (res != K_OK) {
        printf("Cannot create the thread, error %d\n", res);
        exit(1);
    }
    return childLid;
}

void sampleThread()
{
    int myThreadLi;

    myThreadLi = threadSelf();
    printf("I am the new thread. My thread identifier is: %d\n", myThreadLi);

    /* Suicide */
    (void) threadDelete(K_MYACTOR, K_MYSELF);

    /* Should never reach this point! */
}

int main(int argc, char** argv, char** envp)
{
    int       myThreadLi;
    int       newThreadLi;
    int       res;
    KnTimeVal wait;

    newThreadLi = childCreate((KnPc)sampleThread);
    myThreadLi = threadSelf();

    /* Initialize KnTimeVal structure */
    K_MILLI_TO_TIMEVAL(&wait, 10);

    /*
     * Suspend myself for 10 milliseconds to give the newly
     * created thread the opportunity to run before
     * the actor terminates.
     */
    (void) threadDelay(&wait);

    printf("Parent thread identifier = %d, Child thread identifier = %d\n",
           myThreadLi, newThreadLi);
    return 0;
}
The exit(3STDC) function is used instead of the threadDelete(2K) function in the main thread. Using threadDelete(2K) would leave the actor in a passive situation, with no thread running within it. This implies that resources used by an actor are not freed when the last thread is deleted.
In the case of a user thread, deleting a thread does not imply that the stack of the thread will also be freed. If the user stack was allocated through a call to malloc(3STDC), it must be freed through a call to free(3STDC). This cannot be done by the thread itself and must be done by a separate thread. In the previous example, the actor is going to terminate so there is no real need to free the stack because all resources used by the actor will be returned to the system. In the case of a supervisor thread, the ChorusOS operating system frees the system stack it allocated at threadCreate(2K) time.
Within an actor, whether user or supervisor, more than one thread may execute concurrently. For an overview of the ChorusOS multithreaded model, refer to the "Microkernel" chapter in the ChorusOS 5.0 Features and Architecture Overview.
One of the most common issues in a multithreaded environment is the management of per-thread data structures. This issue is particularly important in the context of libraries. In a single-threaded process, managing these data structures as global variables is sufficient. In a multithreaded environment, this approach no longer works.
The ChorusOS operating system provides a convenient way for threads to manage per-thread data. Any piece of data that needs to be instantiated on a per-thread basis must be associated with a unique key. This key can be obtained using the system call ptdKeyCreate(2K). This data may be accessed using ptdSet(2K) and ptdGet(2K).
#include <pd/chPd.h>
int ptdKeyCreate (PdKey* key, KnPdHdl destructor);
The ptdKeyCreate(2K) call generates a unique, opaque key. This key is stored in the location defined by the key argument. You may specify a routine as the destructor argument. This routine is invoked at thread deletion time and is passed the value associated with key. Upon return from ptdKeyCreate(2K), the value associated with key is 0. This type of key is visible to all threads of the actor, but each thread using a specific key has its own private copy of the data.
#include <pd/chPd.h>
int ptdSet (PdKey key, void* value);
The ptdSet(2K) call enables a thread to associate the value value with the key key, which has been generated by a call to ptdKeyCreate(2K).
#include <pd/chPd.h>
void* ptdGet (PdKey key);
The ptdGet(2K) call returns the last value associated with the key by the same thread.
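For readers more familiar with POSIX, the same key-based pattern can be sketched with pthread_key_create(), pthread_setspecific() and pthread_getspecific(). This is an analogue written against the POSIX threads API, not the ChorusOS ptd services. The names snw_set(), snw_next() and count_in_two_threads() are hypothetical, and snw_next() returns the current word before advancing, a simplification chosen for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static pthread_key_t  snw_key;
static pthread_once_t snw_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once, whichever thread gets there first. */
static void snw_make_key(void)
{
    (void) pthread_key_create(&snw_key, NULL);
}

/* Record the string that the calling thread will iterate over. */
void snw_set(const char *str)
{
    pthread_once(&snw_once, snw_make_key);
    pthread_setspecific(snw_key, str);
}

/* Return the calling thread's next word, or NULL when none remain. */
const char *snw_next(void)
{
    const char *p;
    const char *s;

    pthread_once(&snw_once, snw_make_key);
    p = pthread_getspecific(snw_key);
    if (p == NULL || *p == '\0') {
        return NULL;
    }
    s = strchr(p, ' ');
    pthread_setspecific(snw_key, s != NULL ? s + 1 : NULL);
    return p;
}

/* Thread body: count the words of the string passed as argument. */
static void *count_words(void *arg)
{
    long n = 0;

    snw_set(arg);
    while (snw_next() != NULL) {
        n++;
    }
    return (void *) (intptr_t) n;
}

/* Count words of two strings concurrently; each thread has its own cursor. */
int count_in_two_threads(const char *a, const char *b, long *na, long *nb)
{
    pthread_t ta, tb;
    void *ra = NULL;
    void *rb = NULL;

    assert(a != NULL && b != NULL && na != NULL && nb != NULL);
    if (pthread_create(&ta, NULL, count_words, (void *) a) != 0) {
        return -1;
    }
    if (pthread_create(&tb, NULL, count_words, (void *) b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, &ra);
    pthread_join(tb, &rb);
    *na = (long) (intptr_t) ra;
    *nb = (long) (intptr_t) rb;
    return 0;
}
```

Both threads share the single key, yet each one sees only the cursor it stored itself, which is the property that makes this kind of library safe to call from several threads.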
Example 7-3 includes a small library that returns a pointer to the next word of a string. This is a simplified version of the strtok(3STDC) C library routine. For simplicity, it is assumed that words are always separated by spaces in the string.
This library may be called simultaneously from different threads, each thread working on its own string. The routine that returns the pointer to the next word does not take any parameters. These routines are called snw routines (where snw stands for String Next Word). The snwSet(char *str) routine defines the string that will be looked up by the invoking thread, while the char* snwGet() routine returns a pointer to the next word.
The library is invoked from the main thread and the created thread on two different strings in order to count the number of words in each string. The results are printed and the threads are synchronized before terminating the actor.
For more information, refer to the ptdKeyCreate(2K), ptdSet(2K) and ptdGet(2K) man pages.
(file: progov/perThreadData.c)

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <chorus.h>
#include <pd/chPd.h>

#define USER_STACK_SIZE (1024 * sizeof(long))

KnSem sampleSem;
PdKey snwKey;

int childCreate(KnPc entry)
{
    KnActorPrivilege     actorP;
    KnDefaultStartInfo_f startInfo;
    char*                userStack;
    int                  childLid = -1;
    int                  res;

    startInfo.dsType = K_DEFAULT_START_INFO;
    startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get the privilege of the actor, error %d\n", res);
        exit(1);
    }

    if (actorP == K_SUPACTOR) {
        startInfo.dsPrivilege = K_SUPTHREAD;
    } else {
        startInfo.dsPrivilege = K_USERTHREAD;
    }

    if (actorP != K_SUPACTOR) {
        userStack = malloc(USER_STACK_SIZE);
        if (userStack == NULL) {
            printf("Cannot allocate user stack\n");
            exit(1);
        }
        startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
    }

    startInfo.dsEntry = entry;

    res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, NULL, &startInfo);
    if (res != K_OK) {
        printf("Cannot create the thread, error %d\n", res);
        exit(1);
    }
    return childLid;
}

void snwInit()
{
    int res;

    /* Just allocate a key for our "snw" library */
    res = ptdKeyCreate(&snwKey, NULL);
    if (res != K_OK) {
        printf("Cannot create a ptd key, error %d\n", res);
        exit(1);
    }
}

void snwSet(char* str)
{
    int res;

    res = ptdSet(snwKey, str);
    if (res != K_OK) {
        printf("Cannot set the ptd key, error %d\n", res);
        exit(1);
    }
}

char* snwGet()
{
    int   res;
    char* p;
    char* s;

    p = (char*)ptdGet(snwKey);
    if (p == NULL) return NULL;

    s = strchr(p, ' ');
    if (s != NULL) {
        s++;
    } else if (*p != '\0') {
        /* Last word might not have a following space */
        s = p + strlen(p);
    }
    res = ptdSet(snwKey, s);
    return s;
}

void sampleThread()
{
    char* ptr;
    int   words = 0;
    int   res;

    snwSet("This is the child thread!");
    for (ptr = snwGet(); ptr != NULL; ptr = snwGet()) {
        words++;
    }
    printf("Child thread found %d words.\n", words);

    res = semV(&sampleSem);
    if (res != K_OK) {
        printf("Cannot perform the semV operation, error %d\n", res);
        exit(1);
    }
    (void) threadDelete(K_MYACTOR, K_MYSELF);
}

int main(int argc, char** argv, char** envp)
{
    char* ptr;
    int   words = 0;
    int   res;

    res = semInit(&sampleSem, 0);
    if (res != K_OK) {
        printf("Cannot initialize the semaphore, error %d\n", res);
        exit(1);
    }

    snwInit();

    (void) childCreate((KnPc)sampleThread);

    snwSet("I am the main thread and counting words in this string!");
    for (ptr = snwGet(); ptr != NULL; ptr = snwGet()) {
        words++;
    }
    printf("Main thread found %d words.\n", words);

    res = semP(&sampleSem, K_NOTIMEOUT);
    if (res != K_OK) {
        printf("Cannot perform the semP operation, error %d\n", res);
        exit(1);
    }
    return 0;
}
As illustrated in the previous example, it is often the case that C and C++ libraries have been designed for UNIX processes which were initially mono-threaded entities. To enable C programmers to continue using the usual libraries within multithreaded actors, the ChorusOS operating environment provides a set of adapted C libraries. These can be used from different threads of a given actor without encountering problems.
Some of these adapted libraries (printf(), fprintf(), fopen(), and malloc()) were used in the previous examples. All of these C libraries have been adapted to work efficiently, even within a multithreaded actor. Modifications are not visible to the programmer. They rely mainly on synchronization such as mutexes for protecting critical sections, and on the per-thread data mechanism to store per-thread global data.
Some libraries required no modification and work in a straightforward fashion within a multithreaded actor. These libraries, such as strtol() (string to long integer conversion), work exclusively on local variables and do not access or generate any global states.
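As an illustration of why a routine such as strtol() needs no per-thread adaptation, the following sketch wraps it in a small parsing helper. The helper name demoParseLong() is hypothetical. The parse position is returned through a caller-supplied pointer rather than kept in a hidden static variable, so concurrent callers do not interfere with one another. The only shared name involved is errno, which thread-aware C libraries typically keep per thread; that is stated here as a general assumption, not as a documented ChorusOS detail.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a complete decimal string into *out.
 * Returns 0 on success, -1 on empty input, trailing junk, or overflow. */
int demoParseLong(const char *text, long *out)
{
    char *end = NULL;   /* parse position lives in the caller's frame */
    long  value;

    if (text == NULL || out == NULL) {
        return -1;
    }
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE) {
        return -1;
    }
    *out = value;
    return 0;
}
```

Contrast this with the strtok()-style interface discussed earlier, where the position is remembered between calls and therefore must be stored per thread.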
This chapter deals with the ChorusOS application programming interfaces for managing memory.
After implementing the examples provided in this chapter, you will understand how these APIs might be used in an application.
Detailed information on ChorusOS memory models and address spaces is not provided in this chapter. For information on the memory models and how they are implemented, refer to "Processes" in the ChorusOS 5.0 Features and Architecture Overview.
A memory management module may support several different user address spaces and perform memory context switches when required in thread scheduling. Actors within the ChorusOS operating system environment may extend their address space using the malloc(3STDC) library call. However, this is an inflexible way of allocating memory because there is no way to control the attributes of the allocated memory, that is, whether it is a read-only or a read/write memory area. The malloc(3STDC) routine uses the ChorusOS operating system services described in this section. These services can also be used to share a region of memory between two or more actors.
The hot restart feature provides support for persistent memory (memory that can extend beyond the lifetime of the runtime instance of an actor). Hot restart is not covered in this chapter. For information about using hot restart and the persistent memory services it provides, see Chapter 15, Recovering From Application Failure: Hot Restart.
The ChorusOS operating system offers various services which enable an actor to extend its address space dynamically by allocating memory regions. An actor may also shrink its address space by freeing memory regions. A memory area to be allocated or freed is identified by the system through a region descriptor of the following type:
typedef struct {
    VmAddr  startAddr;
    VmSize  size;
    VmFlags options;
    VmAddr  endAddr;
    void*   opaque1;
    VmFlags opaque2;
} KnRgnDesc;
The startAddr field defines the starting address of the memory region, while the size field defines its length (expressed in bytes). The options field enables low level control over the attributes of the memory region to be allocated. The opaque1 and opaque2 fields should be set to NULL if they are not being used by the application.
The options field consists of a number of flags, of which the following are the most important:
K_WRITABLE notifies the system that the newly created memory region must be writable, otherwise the memory region will be read-only.
K_FILLZERO notifies the system that the content of the memory region to be created must be filled with zeroes upon creation. If this flag is not set, the content of the memory region will be unspecified.
K_ANYWHERE notifies the system that the address used to allocate the region is not critical to the application. An appropriate address will be selected by the system and returned to the application. This avoids the need for the application to find out which addresses are already in use within the actor address space. It also enables memory to be allocated within a library without conflict with an existing address space.
K_SUPERVISOR notifies the system that the memory to be allocated will be part of the supervisor address space rather than part of the user address space. This flag is usually set by actors running in supervisor mode.
Additional options are available on a ChorusOS operating system configured with the virtual memory feature. The most important of these enables control of the swap policy to be applied to the pages of the created memory region:
K_NODEMAND notifies the system that no page fault should ever occur on the memory region being created. Physical pages are allocated to the memory region at creation time and will never be swapped out. Thus, the region is locked in memory.
A memory region is allocated through the following call:
#include <chorus.h>
int rgnAllocate(KnCap* actorCap, KnRgnDesc* rgnDesc);
This call allocates a memory region within the address space of an actor. The memory region is described by the rgnDesc parameter and the actor is defined by the actorCap parameter. The rgnFree(2K) call, which frees a memory region, takes the same two parameters.
The following example includes these steps:
A memory region is allocated with the K_ANYWHERE option.
The address of the allocated region is retrieved and printed.
A string is copied to the beginning of the region.
A second region is created, immediately preceding the first.
The string is copied from the beginning of the first region to the beginning of the second region.
An area of memory spanning the junction between the two regions is freed.
The string is copied from the beginning of the second region to the lowest memory address still accessible outside the freed memory area.
It is confirmed that the program is able to run as a user or supervisor actor.
The illustration in Figure 8-1 shows only the main steps of the example. For more information, refer to the rgnAllocate(2K) and rgnFree(2K) man pages.
(file: progov/rgnAlloc.c)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <chorus.h>

#define RGN_SIZE_1        (6 * vmPageSize())
#define RGN_SIZE_2        (3 * vmPageSize())
#define FREE_START        (2 * vmPageSize())
#define FREE_SIZE         (4 * vmPageSize())
#define STILL_ALLOC_START (FREE_SIZE - (RGN_SIZE_2 - FREE_START))

int main(int argc, char** argv, char** envp)
{
    KnRgnDesc        rgnDesc;
    KnActorPrivilege actorP;
    int              res;
    VmFlags          rgnOpt = 0;
    char*            ptr1 = NULL;   /* Avoids "unexpected" warning */
    char*            ptr2 = NULL;   /* Avoids "unexpected" warning */

    printf("Starting rgnAllocate example\n");

    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get actor privilege, error %d\n", res);
        exit(1);
    }
    if (actorP == K_SUPACTOR) {
        rgnOpt = K_SUPERVISOR;
    }

    rgnDesc.size    = RGN_SIZE_1;
    rgnDesc.options = rgnOpt | K_WRITABLE | K_FILLZERO | K_ANYWHERE;
    rgnDesc.opaque1 = NULL;
    rgnDesc.opaque2 = NULL;
    /*
     * No need to set rgnDesc.startAddr
     * since we set the K_ANYWHERE flag
     */
    res = rgnAllocate(K_MYACTOR, &rgnDesc);
    if (res == K_OK) {
        printf("Successfully allocated memory starting at 0x%x\n",
               rgnDesc.startAddr);
        ptr1 = (char*) rgnDesc.startAddr;
    } else {
        printf("First rgnAllocate failed with error %d\n", res);
        exit(1);
    }

    strcpy(ptr1, "Fill the allocated memory with this string\n");

    /*
     * Second allocate has a fixed address, such that
     * both memory areas will be contiguous. Hence
     * we do not want the K_ANYWHERE flag any more.
     */
    rgnDesc.size       = RGN_SIZE_2;
    rgnDesc.options   &= ~K_ANYWHERE;
    rgnDesc.startAddr -= RGN_SIZE_2;
    res = rgnAllocate(K_MYACTOR, &rgnDesc);
    if (res == K_OK) {
        printf("Successfully allocated memory starting at 0x%x\n",
               rgnDesc.startAddr);
        ptr2 = (char*) rgnDesc.startAddr;
    } else {
        printf("Second rgnAllocate failed with error %d\n", res);
        exit(1);
    }

    /* Copy from first allocated area to second one */
    strcpy(ptr2, ptr1);

    /*
     * Free a memory area spanning both areas
     * previously created
     */
    rgnDesc.options   = 0;
    rgnDesc.startAddr = (VmAddr) (ptr2 + FREE_START);
    rgnDesc.size      = FREE_SIZE;
    res = rgnFree(K_MYACTOR, &rgnDesc);
    if (res != K_OK) {
        printf("Cannot free memory, error %d\n", res);
        exit(1);
    }

    /*
     * Access to ptr2: beginning of secondly allocated area
     * is still valid.
     * Access to ptr1 is now invalid: memory has been freed.
     */
    printf("%s", ptr2);

    /*
     * Access to "end" of first allocated area
     * is still valid
     */
    ptr1 += STILL_ALLOC_START;
    strcpy(ptr1, ptr2);

    /*
     * Remaining memory areas not yet freed will
     * effectively be freed at actor termination time.
     */
    return 0;
}
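The offsets used in the example above can be checked with plain arithmetic. The sketch below recomputes them for a given page size and first-region address. The DemoLayout type and the demoLayout() function are hypothetical helpers, and the page size is passed in as a parameter instead of being obtained from vmPageSize().

```c
/* Hypothetical helper mirroring the macros of the region example:
 * RGN_SIZE_2 = 3 pages, FREE_START = 2 pages, FREE_SIZE = 4 pages. */
typedef struct {
    unsigned long secondStart; /* start of the region placed just below the first */
    unsigned long freeStart;   /* first address of the freed area */
    unsigned long freeEnd;     /* first address past the freed area */
    unsigned long stillValid;  /* lowest surviving address of the first region */
} DemoLayout;

DemoLayout demoLayout(unsigned long firstStart, unsigned long pageSize)
{
    DemoLayout    l;
    unsigned long rgnSize2  = 3 * pageSize;
    unsigned long freeOff   = 2 * pageSize;
    unsigned long freeSize  = 4 * pageSize;

    l.secondStart = firstStart - rgnSize2;
    l.freeStart   = l.secondStart + freeOff;
    l.freeEnd     = l.freeStart + freeSize;
    /* Same expression as STILL_ALLOC_START, applied to the first region. */
    l.stillValid  = firstStart + (freeSize - (rgnSize2 - freeOff));
    return l;
}
```

With a 4096-byte page and a first region at 0x100000, the second region starts at 0xFD000, the freed area covers 0xFF000 up to (but not including) 0x103000, and 0x103000 is also the lowest address of the first region that remains accessible. Two pages of the second region and three pages of the first region therefore survive the free operation.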
Region descriptors are only used to describe a creation or deletion operation to the system. The system does not keep them in the form in which they were passed to the rgnAllocate(2K) call. For example, an allocation of two contiguous areas with the same attributes (writable, fill zero) and the same opaque fields results in the system recognizing a single region, the size of which is the sum of the sizes passed as part of the two region descriptors.
You cannot allocate a region in a range of addresses that are not free. No implicit deallocation of the address space is undertaken by the system. Instead an error code K_EOVERLAP is returned to the caller.
A call to rgnFree(2K) does not need to reuse a region descriptor that was previously used to allocate a memory area. A free operation can span several regions that were allocated by separate operations. A free operation can also free only a chunk of memory in the middle of a large memory area that was allocated in a single operation.
Only the precise region described by the region descriptor can be freed. The free operation is not extended to match the address range which was allocated at rgnAllocate(2K) time.
The options field of the region descriptor must be set to 0 to use the free operation. If the options field is set to K_FREEALL, all memory regions of the actor will be freed (the code, the data, and the stacks). The K_FREEALL option should therefore be used with care.
All memory areas that have been dynamically allocated are freed when an actor terminates.
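The allocation and free rules described above can be summarized with a toy page map. This is a sketch under simplifying assumptions, not the ChorusOS memory manager: addresses are page indexes in a 16-page space, DEMO_EOVERLAP merely stands in for K_EOVERLAP (its real value is not assumed), and freeing pages that are not allocated is silently accepted.

```c
#define DEMO_PAGES     16
#define DEMO_OK        0
#define DEMO_EOVERLAP  (-1)  /* placeholder for K_EOVERLAP */
#define DEMO_EINVAL    (-2)  /* range outside the modeled address space */

static unsigned char demoPageUsed[DEMO_PAGES];

static int demoRangeOk(int first, int count)
{
    return first >= 0 && count > 0 && first + count <= DEMO_PAGES;
}

/* Allocate [first, first + count). Fails without side effects if any page
 * in the range is already in use: nothing is implicitly deallocated. */
int demoAllocate(int first, int count)
{
    int i;

    if (!demoRangeOk(first, count)) {
        return DEMO_EINVAL;
    }
    for (i = first; i < first + count; i++) {
        if (demoPageUsed[i]) {
            return DEMO_EOVERLAP;
        }
    }
    for (i = first; i < first + count; i++) {
        demoPageUsed[i] = 1;
    }
    return DEMO_OK;
}

/* Free exactly [first, first + count). The range may span pages obtained
 * by separate allocations, or cover only part of one allocation. */
int demoFree(int first, int count)
{
    int i;

    if (!demoRangeOk(first, count)) {
        return DEMO_EINVAL;
    }
    for (i = first; i < first + count; i++) {
        demoPageUsed[i] = 0;
    }
    return DEMO_OK;
}

int demoIsUsed(int page)
{
    return page >= 0 && page < DEMO_PAGES && demoPageUsed[page];
}
```

Because the model records only which pages are in use, two adjacent allocations are indistinguishable from one larger allocation, which matches the way the system merges contiguous regions with identical attributes.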
The following code example shows how to share memory between applications.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <chorus.h>
#include <cx/afexec.h>

AcParam param;

#define RGN_SIZE        (3 * vmPageSize())
#define SHARED_RGN_SIZE (1 * vmPageSize())

typedef struct sampleSharedArea {
    KnSem sem;
    char  data[1];
} sharea_t;

char capString[80];
char sharedAddr[20];

int main(int argc, char** argv, char** envp)
{
    KnRgnDesc        rgnDesc;
    sharea_t*        ptr;
    KnCap            spawningCap;
    KnCap            spawnedCap;
    int              res;
    VmFlags          rgnOpt = 0;
    KnActorPrivilege actorP;

    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get privilege, error %d\n", res);
        exit(1);
    }
    if (actorP == K_SUPACTOR) {
        rgnOpt = K_SUPERVISOR;
    }

    if (argc == 1) {
        /*
         * This is the first actor (or spawning actor):
         *   Allocate a memory region
         *   Initialize a semaphore within the region
         *   Spawn the second actor
         *   Wait on the semaphore
         *   Get data written in shared mem by spawned actor
         *   Terminate
         */
        rgnDesc.size    = RGN_SIZE;
        rgnDesc.options = rgnOpt | K_ANYWHERE | K_WRITABLE | K_FILLZERO;
        rgnDesc.opaque1 = NULL;
        rgnDesc.opaque2 = NULL;
        res = rgnAllocate(K_MYACTOR, &rgnDesc);
        if (res != K_OK) {
            printf("Cannot allocate memory, error %d\n", res);
            exit(1);
        }

        ptr = (sharea_t*) rgnDesc.startAddr;
        strcpy(&ptr->data[0], "First actor initializing the shared mem\n");
        res = semInit(&ptr->sem, 0);

        /*
         * Get my own capability and pass it as a string argument
         * to spawned actor, so that it may use it to share memory
         */
        (void) actorSelf(&spawningCap);
        sprintf(capString, "0x%x 0x%x 0x%x 0x%x",
                spawningCap.ui.uiHead, spawningCap.ui.uiTail,
                spawningCap.key.keyHead, spawningCap.key.keyTail);

        /*
         * Pass address of memory to be shared as a string argument
         * to spawned actor.
         */
        sprintf(sharedAddr, "0x%x", ptr);

        param.acFlags = (actorP == K_SUPACTOR) ?
                        AFX_SUPERVISOR_SPACE : AFX_USER_SPACE;
        res = afexeclp(argv[0], &spawnedCap, &param,
                       argv[0], capString, sharedAddr, NULL);
        if (res == -1) {
            printf("cannot spawn second actor, error %d\n", errno);
            exit(1);
        }

        (void) semP(&ptr->sem, K_NOTIMEOUT);
        printf("%s", &ptr->data[0]);
    } else {
        KnRgnDesc     srcRgn;
        KnRgnDesc     tgtRgn;
        unsigned long uHead, uTail, kHead, kTail;

        /*
         * This is the spawned actor:
         *   Get arguments
         *   Set up the memory sharing
         *   Write some string in shared memory
         *   Wake up spawning actor
         *   Terminate
         */
        sscanf(argv[1], "0x%x 0x%x 0x%x 0x%x",
               &uHead, &uTail, &kHead, &kTail);
        spawningCap.ui.uiHead   = uHead;
        spawningCap.ui.uiTail   = uTail;
        spawningCap.key.keyHead = kHead;
        spawningCap.key.keyTail = kTail;

        sscanf(argv[2], "0x%x", &srcRgn.startAddr);

        if (actorP != K_SUPACTOR) {
            srcRgn.size      = SHARED_RGN_SIZE;
            tgtRgn.startAddr = srcRgn.startAddr;
            tgtRgn.size      = SHARED_RGN_SIZE;
            tgtRgn.options   = rgnOpt | K_WRITABLE;
            tgtRgn.opaque1   = NULL;
            tgtRgn.opaque2   = NULL;
            res = rgnMapFromActor(K_MYACTOR, &tgtRgn, &spawningCap, &srcRgn);
            if (res != K_OK) {
                printf("Cannot share memory, error %d\n", res);
                exit(1);
            }
            ptr = (sharea_t*) tgtRgn.startAddr;
        } else {
            /*
             * Both actors are running in supervisor space,
             * There is no need to perform the rgnMapFromActor.
             * One may use the received shared address.
             */
            ptr = (sharea_t*) srcRgn.startAddr;
        }

        /* Get data from spawning actor */
        printf("Spawning actor sent: %s", &ptr->data[0]);

        /* Modify contents of shared memory */
        sprintf(&ptr->data[0],
                "Spawned actor mapped shared memory at 0x%x\n", ptr);

        res = semV(&ptr->sem);
        if (res != K_OK) {
            printf("Spawned actor failed on semV, error %d\n", res);
            exit(1);
        }
    }
    return 0;
}
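The example passes the spawning actor's capability to the spawned actor as a text argument and parses it back on the other side. The round trip can be sketched in isolation as follows. The DemoCap structure is a hypothetical stand-in with four unsigned int fields named after those used in the example; the real KnCap layout is not assumed. The sketch uses snprintf() and checks the sscanf() conversion count, which the original example omits for brevity.

```c
#include <stdio.h>

/* Hypothetical stand-in for a capability: four unsigned words. */
typedef struct {
    unsigned int uiHead;
    unsigned int uiTail;
    unsigned int keyHead;
    unsigned int keyTail;
} DemoCap;

/* Format the capability as "0x.. 0x.. 0x.. 0x..".
 * Returns 0 on success, -1 if the buffer is too small. */
int demoCapToString(const DemoCap *cap, char *buf, size_t len)
{
    int n = snprintf(buf, len, "0x%x 0x%x 0x%x 0x%x",
                     cap->uiHead, cap->uiTail, cap->keyHead, cap->keyTail);

    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Parse the string produced by demoCapToString().
 * Returns 0 on success, -1 if fewer than four fields were converted. */
int demoCapFromString(const char *text, DemoCap *cap)
{
    unsigned int a, b, c, d;

    if (sscanf(text, "0x%x 0x%x 0x%x 0x%x", &a, &b, &c, &d) != 4) {
        return -1;
    }
    cap->uiHead  = a;
    cap->uiTail  = b;
    cap->keyHead = c;
    cap->keyTail = d;
    return 0;
}
```

Matching the conversion specifier to the destination type (here %x with unsigned int) keeps the parsing correct on platforms where long and int differ in size.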
This chapter demonstrates the use of the ChorusOS application programming interfaces for synchronizing threads and also shows how mutexes and semaphores are used in the ChorusOS operating system. For a detailed description of the available models for thread scheduling and ChorusOS scheduling policies, refer to "Scheduling" in ChorusOS 5.0 Features and Architecture Overview.
After implementing the examples provided in this chapter, you should understand how these APIs can be used in an application.
To set the scheduling attributes of threads at thread creation time, use the void* schedParams parameter of threadCreate(2K). To obtain and modify the scheduling attributes of a thread dynamically, use the following call:
#include <chorus.h>
#include <sched/chFifo.h>
#include <sched/chRr.h>
#include <sched/chRt.h>
int threadScheduler(KnCap* actorCap, KnThreadLid thLi, void* oldParam, void* newParam);
The previous service enables you to get or set the scheduling parameters for any thread of any actor, provided both the actor capability and the thread identifier are known. The threadScheduler(2K) call returns the current scheduling attributes of the target thread at the location defined by oldParam (if non-NULL). It also sets the attributes of the target thread according to the description provided at the location defined by newParam (if non-NULL).
Because the size, layout, and semantics of scheduling parameters can vary (depending on the scheduler configured in the system, or on the class of the ROUND_ROBIN framework), the parameters are left untyped (void*) in the generic interface definition. However, all scheduling parameter descriptions are similar, at least for the initial fields:
typedef struct {
    KnSchedClass   fifoClass;      /* Always set to K_SCHED_FIFO */
    KnFifoPriority fifoPriority;
} KnFifoThParms;

typedef struct {
    KnSchedClass rrClass;          /* Always set to K_SCHED_RR */
    KnRrPriority rrPriority;
} KnRrThParms;
In the previous structures, the first field defines the scheduling policy applied (or to be applied) to the thread. The second field defines the priority of the thread within the scheduling policy.
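Because every parameter structure starts with the scheduling class, code that receives an untyped pointer can inspect that first field before deciding how to interpret the rest. The following sketch shows the idea with hypothetical stand-in types (DemoSchedClass, DemoFifoParms, DemoRrParms) and an int priority; it does not use the real ChorusOS headers or constant values.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the scheduling class and parameter types. */
typedef enum { DEMO_SCHED_FIFO = 1, DEMO_SCHED_RR = 2 } DemoSchedClass;

typedef struct {
    DemoSchedClass fifoClass;     /* always DEMO_SCHED_FIFO */
    int            fifoPriority;
} DemoFifoParms;

typedef struct {
    DemoSchedClass rrClass;       /* always DEMO_SCHED_RR */
    int            rrPriority;
} DemoRrParms;

/* Read the priority through an untyped pointer.
 * Returns 0 on success, -1 for a NULL argument or an unknown class. */
int demoPriorityOf(const void *params, int *priority)
{
    DemoSchedClass cls;

    if (params == NULL || priority == NULL) {
        return -1;
    }
    /* The class is the first member, so it sits at offset 0 of either type. */
    cls = *(const DemoSchedClass *)params;
    switch (cls) {
    case DEMO_SCHED_FIFO:
        *priority = ((const DemoFifoParms *)params)->fifoPriority;
        return 0;
    case DEMO_SCHED_RR:
        *priority = ((const DemoRrParms *)params)->rrPriority;
        return 0;
    default:
        return -1;
    }
}
```

This is the usual reason for keeping the initial fields identical across variants: a generic interface can accept any of them without knowing the full layout in advance.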
Code Example 9-1 demonstrates how the thread-creation routine, here named childSchedCreate(), receives the scheduling attributes of the thread to be created. The main thread invokes this modified routine so that the created thread starts running as soon as it is created, rather than waiting for the main thread to give up the processor. To achieve this, the created thread must be given a higher priority than the main thread. For more information, refer to the threadScheduler(2K) and threadCreate(2K) man pages.
This example applies only in the case of the FIFO and RR scheduling classes.
(file: progov/thSched.c)

#include <stdio.h>
#include <stdlib.h>
#include <chorus.h>

#define USER_STACK_SIZE (1024 * sizeof(long))

KnSem sampleSem;

int childSchedCreate(KnPc entry, void* schedParams)
{
    KnActorPrivilege     actorP;
    KnDefaultStartInfo_f startInfo;
    char*                userStack;
    int                  childLid = -1;
    int                  res;

    /* Set defaults startInfo fields */
    startInfo.dsType = K_DEFAULT_START_INFO;
    startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

    /* Get actor's privilege */
    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get the privilege of the actor, error %d\n", res);
        exit(1);
    }

    /* Set thread privilege */
    if (actorP == K_SUPACTOR) {
        startInfo.dsPrivilege = K_SUPTHREAD;
    } else {
        startInfo.dsPrivilege = K_USERTHREAD;
    }

    /* Allocate a stack for user threads */
    if (actorP != K_SUPACTOR) {
        userStack = malloc(USER_STACK_SIZE);
        if (userStack == NULL) {
            printf("Cannot allocate user stack\n");
            exit(1);
        }
        startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
    }

    /* Set entry point for the new thread */
    startInfo.dsEntry = entry;

    /* Create the thread in the active state */
    res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, schedParams, &startInfo);
    if (res != K_OK) {
        printf("Cannot create the thread, error %d\n", res);
        exit(1);
    }
    return childLid;
}

void sampleThread()
{
    int myThreadLi;
    int res;

    myThreadLi = threadSelf();
    printf("I am the new thread. My thread identifier is: %d\n", myThreadLi);

    res = semV(&sampleSem);
    if (res != K_OK) {
        printf("Cannot perform the semV operation, error %d\n", res);
        exit(1);
    }
    (void) threadDelete(K_MYACTOR, K_MYSELF);
}

int main(int argc, char** argv, char** envp)
{
    int                  myThreadLi;
    int                  newThreadLi;
    int                  res;
    KnThreadDefaultSched schedParams;

    res = semInit(&sampleSem, 0);
    if (res != K_OK) {
        printf("Cannot initialize the semaphore, error %d\n", res);
        exit(1);
    }

    /* acquire my own scheduling attributes */
    res = threadScheduler(K_MYACTOR, K_MYSELF, &schedParams, NULL);
    if (res != K_OK) {
        printf("threadScheduler failed, res=%d\n", res);
        exit(1);
    }

    /* Increase priority of thread to be created */
    schedParams.tdPriority -= 1;

    newThreadLi = childSchedCreate((KnPc)sampleThread, &schedParams);
    myThreadLi = threadSelf();
    printf("Parent thread identifier = %d, Child thread identifier = %d\n",
           myThreadLi, newThreadLi);

    res = semP(&sampleSem, K_NOTIMEOUT);
    if (res != K_OK) {
        printf("Cannot perform the semP operation, error %d\n", res);
        exit(1);
    }
    return 0;
}
A semaphore is an integer counter associated with a queue of waiting threads (the queue can be empty). At initialization, the semaphore counter receives a user-defined positive or zero value. Initialization is performed by invoking the following ChorusOS operating system call:
#include <chorus.h>
int semInit(KnSem* semaphore, unsigned int count);
The semaphore parameter is the location of the semaphore, and count is the semaphore counter. As is the case with mutexes, the ChorusOS operating system does not allocate semaphores itself. You must have allocated the semaphore previously. This enables you to allocate semaphores freely, wherever convenient for your application. Because data structures representing semaphores are allocated by the applications themselves, the ChorusOS operating system does not impose any limit on the maximum number of semaphores that may be used within the system.
Two atomic operations, P and V, are provided with these semaphores.
#include <chorus.h>
int semP(KnSem* semaphore, KnTimeVal* waitLimit);
semP(2K) decrements the counter by one. When the counter reaches a negative value, the invoking thread is blocked and queued within the semaphore queue. Otherwise, the thread continues its execution normally. The waitLimit parameter may be used to control how long the thread remains queued. If waitLimit is set to K_NOTIMEOUT, the thread will remain blocked until the V operation is performed. If the thread is awakened due to the expiration of the time period, a specific error code will be returned as the result of the semP(2K) invocation. In this case, the counter is incremented to compensate for the effect of the semP(2K) operation.
#include <chorus.h>
int semV(KnSem* semaphore);
semV(2K) increments the counter by one. If the counter is still lower than or equal to zero, one of the waiting threads is picked up from the queue and awakened. If the counter is greater than zero, there should be no threads waiting in the queue.
Figure 9-1 shows an example of two threads synchronizing by means of a semaphore.
In the following example, two threads explicitly synchronize by means of a semaphore, so that the actor is eventually destroyed when the created thread has performed its function. Refer to the semInit(2K) man page for more information.
(file: progov/semaphore.c)

#include <stdio.h>
#include <stdlib.h>
#include <chorus.h>

#define USER_STACK_SIZE (1024 * sizeof(long))

KnSem sampleSem;   /* Semaphore allocated as global variable */

int childCreate(KnPc entry)
{
    KnActorPrivilege     actorP;
    KnDefaultStartInfo_f startInfo;
    char*                userStack;
    int                  childLid = -1;
    int                  res;

    startInfo.dsType = K_DEFAULT_START_INFO;
    startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get the privilege of the actor, error %d\n", res);
        exit(1);
    }

    if (actorP == K_SUPACTOR) {
        startInfo.dsPrivilege = K_SUPTHREAD;
    } else {
        startInfo.dsPrivilege = K_USERTHREAD;
    }

    if (actorP != K_SUPACTOR) {
        userStack = malloc(USER_STACK_SIZE);
        if (userStack == NULL) {
            printf("Cannot allocate user stack\n");
            exit(1);
        }
        startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
    }

    startInfo.dsEntry = entry;

    res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, 0, &startInfo);
    if (res != K_OK) {
        printf("Cannot create the thread, error %d\n", res);
        exit(1);
    }
    return childLid;
}

void sampleThread()
{
    int myThreadLi;
    int res;

    myThreadLi = threadSelf();
    printf("I am the new thread. My thread identifier is: %d\n", myThreadLi);

    res = semV(&sampleSem);
    if (res != K_OK) {
        printf("Cannot perform the semV operation, error %d\n", res);
        exit(1);
    }

    /* Suicide */
    res = threadDelete(K_MYACTOR, K_MYSELF);
    if (res != K_OK) {
        printf("Cannot suicide, error %d\n", res);
        exit(1);
    }
    /* Should never reach this point! */
}

int main(int argc, char** argv, char** envp)
{
    int myThreadLi;
    int newThreadLi;
    int res;

    /*
     * Initialize the semaphore to 0 so that
     * the first semP() operation blocks.
     */
    res = semInit(&sampleSem, 0);
    if (res != K_OK) {
        printf("Cannot initialize the semaphore, error %d\n", res);
        exit(1);
    }

    newThreadLi = childCreate((KnPc)sampleThread);
    myThreadLi = threadSelf();
    printf("Parent thread identifier = %d, Child thread identifier = %d\n",
           myThreadLi, newThreadLi);

    /*
     * Since semaphore has been initialized to 0
     * this semP will block until a semV is performed
     * by the created thread, letting the main thread know
     * that created thread's job is done.
     */
    res = semP(&sampleSem, K_NOTIMEOUT);
    if (res != K_OK) {
        printf("Cannot perform the semP operation, error %d\n", res);
        exit(1);
    }

    /*
     * Created thread has run and done all of its job.
     * It is time to safely exit.
     */
    return 0;
}
The semaphore sampleSem is allocated as global data of the actor. Because the address space of the actor is shared by all threads running within the actor, both threads can freely access the semaphore for synchronization.
Avoid performing the semaphore initialization after creating a child thread. Depending on the scheduling, the second thread may start its execution as soon as it is created, and could reach the semV() operation before the semaphore has been initialized. Although the semV() appears to work in this case, semP() would never return because semInit() would reset the counter to 0.
The synchronization works regardless of the order in which the semP() and semV() operations are executed. If semP() is performed first, the counter will be set to -1 and the main thread will be blocked. The semV() call is used to wake the main thread. If scheduling is reversed, the semV() will set the counter to 1, so that when the semP() operation occurs, the counter will be decremented to 0, but the thread will not be blocked.
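The counter behavior described above can be traced with a minimal model that tracks only the counter and reports whether a thread would block or be awakened. This is an illustrative sketch, not the ChorusOS implementation: the DemoSem type and the demoSem*() functions are hypothetical, and no real blocking or queuing takes place.

```c
/* Hypothetical counter-only model of a semaphore. A negative count
 * means that many threads would currently be waiting in the queue. */
typedef struct {
    int count;
} DemoSem;

void demoSemInit(DemoSem *s, unsigned int count)
{
    s->count = (int)count;
}

/* P operation: decrement. Returns 1 if the caller would block. */
int demoSemP(DemoSem *s)
{
    s->count--;
    return s->count < 0;
}

/* V operation: increment. Returns 1 if a waiting thread would be awakened. */
int demoSemV(DemoSem *s)
{
    s->count++;
    return s->count <= 0;
}

/* A P operation that times out gives back its decrement. */
void demoSemTimeout(DemoSem *s)
{
    s->count++;
}
```

Starting from 0, P followed by V goes through -1 (blocked) and back to 0 (awakened), while V followed by P goes through 1 and back to 0 without blocking. The model also shows the initialization hazard: if V runs first and the semaphore is then initialized to 0, the V is lost and the following P blocks.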
Assume that two threads need to access one or more global variables in a consistent fashion (for example, if each thread needs to add two numbers to a unique global counter). The unique global counter should always reflect the accurate sum of all numbers added by both threads, regardless of the scheduling.
Reflecting this sum could be done using semaphores. However, the ChorusOS operating system provides mutexes which have been specifically designed and tuned for these specific requirements.
A mutex is a binary flag associated with a (possibly empty) queue of waiting threads. The mutex may be locked or free. At initialization, the mutex is set to the free state.
#include <chorus.h>
int mutexInit(KnMutex* mutex);
As with semaphores, a mutex must have been allocated by the application before it is initialized. Mutexes may be allocated wherever convenient for the application. There is no limit imposed on the maximum number of mutexes.
Three operations are provided with mutexes.
mutexGet() acquires the mutex. If the mutex is free, it is atomically locked and the thread continues its execution.
#include <chorus.h>
int mutexGet(KnMutex* mutex);
If the mutex is locked when the mutexGet() operation is invoked, the thread is blocked and queued in the list of threads (waiting for the mutex to become free).
There is no way to limit the time that a thread waits to acquire a mutex.
mutexRel() releases the mutex, returning it to its free state. If threads are blocked while waiting for the mutex, one of them is picked up from the list and activated with the mutex locked.
#include <chorus.h>
int mutexRel(KnMutex* mutex);
mutexTry() is similar to mutexGet() but the thread is not blocked if the mutex is already locked when the operation is invoked.
#include <chorus.h>
int mutexTry(KnMutex* mutex);
By checking the return value of mutexTry(), you can determine whether the mutex was free and has been acquired by the current thread, or, whether the mutex was already locked, in which case the operation has failed.
For more information, refer to the mutexInit(2K) man page.
The following example shows a small, basic library routine called sampleAdd() which receives two integer arguments and adds them to a global variable one after the other. Both the main thread and the created thread perform a number of calls to the library. When the job is completed, the main thread prints the result and terminates the actor.
(file: progov/mutex.c)

#include <stdio.h>
#include <stdlib.h>
#include <chorus.h>

#define USER_STACK_SIZE (1024 * sizeof(long))

KnSem   sampleSem;
KnMutex sampleMutex;
long    grandTotal;

int childCreate(KnPc entry)
{
    KnActorPrivilege     actorP;
    KnDefaultStartInfo_f startInfo;
    char*                userStack;
    int                  childLid = -1;
    int                  res;

    startInfo.dsType = K_DEFAULT_START_INFO;
    startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

    res = actorPrivilege(K_MYACTOR, &actorP, NULL);
    if (res != K_OK) {
        printf("Cannot get the privilege of the actor, error %d\n", res);
        exit(1);
    }

    if (actorP == K_SUPACTOR) {
        startInfo.dsPrivilege = K_SUPTHREAD;
    } else {
        startInfo.dsPrivilege = K_USERTHREAD;
    }

    if (actorP != K_SUPACTOR) {
        userStack = malloc(USER_STACK_SIZE);
        if (userStack == NULL) {
            printf("Cannot allocate user stack\n");
            exit(1);
        }
        startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
    }

    startInfo.dsEntry = entry;

    res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, 0, &startInfo);
    if (res != K_OK) {
        printf("Cannot create the thread, error %d\n", res);
        exit(1);
    }
    return childLid;
}

/* Add both arguments to the shared total under protection of the mutex */
void sampleAdd(int a, int b)
{
    (void) mutexGet(&sampleMutex);
    grandTotal += a + b;
    (void) mutexRel(&sampleMutex);
}

void sampleThread()
{
    int res;
    int i;

    for (i = 0; i < 10; i++) {
        sampleAdd(threadSelf(), i);   /* Why not ??? */
    }

    res = semV(&sampleSem);
    if (res != K_OK) {
        printf("Cannot perform the semV operation, error %d\n", res);
        exit(1);
    }

    /* Suicide */
    (void) threadDelete(K_MYACTOR, K_MYSELF);
}

int main(int argc, char** argv, char** envp)
{
    int i;
    int res;

    res = semInit(&sampleSem, 0);
    if (res != K_OK) {
        printf("Cannot initialize the semaphore, error %d\n", res);
        exit(1);
    }

    (void) mutexInit(&sampleMutex);

    (void) childCreate((KnPc)sampleThread);

    for (i = 0; i < 20; i++) {
        sampleAdd(threadSelf(), i);   /* Why not ??? */
    }

    res = semP(&sampleSem, K_NOTIMEOUT);
    if (res != K_OK) {
        printf("Cannot perform the semP operation, error %d\n", res);
        exit(1);
    }

    printf("grandTotal is %ld\n", grandTotal);
    return 0;
}
Note the following points concerning the previous example:
The mutex is allocated within the global data of the actor and is initialized before being used.
The sampleAdd() routine uses the mutex to protect access to the grandTotal variable and make it atomic. Note that the mutexGet() and mutexRel() operations perform the bulk of the work. Mutex operations should always be used in pairs, as in the previous example.
A mutex is not recursive. A thread that has locked a mutex will deadlock if it tries to perform a second mutexGet() operation on the same mutex.
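The difference between a blocking acquire and a try operation, and the reason a non-recursive mutex cannot be acquired twice by the same thread, can be seen in a minimal try-lock built on a C11 atomic flag. This is a sketch of the general technique, not the ChorusOS mutex: the DemoMutex type and its functions are hypothetical, and the return convention (1 for acquired, 0 for already locked) is chosen for the sketch and is not the documented return value of mutexTry(2K).

```c
#include <stdatomic.h>

/* Hypothetical binary lock: the flag is set while the mutex is held. */
typedef struct {
    atomic_flag locked;
} DemoMutex;

/* Put the mutex in the free state. */
void demoMutexInit(DemoMutex *m)
{
    atomic_flag_clear(&m->locked);
}

/* Try to acquire without blocking.
 * Returns 1 if the mutex was free and is now held by the caller,
 * 0 if it was already locked (by any thread, including the caller). */
int demoMutexTry(DemoMutex *m)
{
    return !atomic_flag_test_and_set(&m->locked);
}

/* Release the mutex, returning it to the free state. */
void demoMutexRel(DemoMutex *m)
{
    atomic_flag_clear(&m->locked);
}
```

The lock records only whether it is held, not by whom. A second attempt by the current holder therefore fails in the try variant, and would wait forever in a blocking variant, which is the deadlock described above.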
This chapter deals with the concepts of local access points that are native to ChorusOS systems. It also demonstrates the use of the application programming interfaces for handling LAPs.
After implementing the examples provided in this chapter, you should understand how these APIs can be used in an application.
Local Access Points (LAPs) are a low-overhead mechanism that enables both user and supervisor actors to call service routines in supervisor actors on the local site.
A LAP is designated and invoked via its LAP descriptor. A LAP descriptor can be directly transmitted by a server to one or more specific client actors via shared memory or as an argument to another invocation.
For more information, refer to the lapInvoke(2K), lapResolve(2K), svLapBind(2K), svLapCreate(2K) and svLapDelete(2K) man pages.
The LAPBIND feature provides a nameserver from which a LAP descriptor may be requested and obtained indirectly, using a static symbolic name. This symbolic name may be an arbitrary character string, whose size is limited to K_LAPNAMEMAX (currently set to 7 characters). If the name exceeds this limit, svLapBind() returns K_ENOMEM. Using the nameserver, a LAP may be exported to any potential client that knows the symbolic name of the LAP (or of the service exported by the LAP).
A server may optionally establish a name binding using svLapBind(2K). The binding can be removed using svLapUnbind(2K). A client uses lapResolve(2K) to obtain a LAP descriptor, given its symbolic name, and optionally waiting if the name is not yet available.
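The bind, resolve and unbind sequence described above can be pictured with a small name table. The following is a minimal sketch in portable C, not the ChorusOS implementation: the ns_* functions, the NS_* return codes and the table size are hypothetical, the integer desc stands in for a LAP descriptor, and LAP_NAME_MAX only mirrors the 7-character limit quoted in the text. The model does not wait for a missing name, which lapResolve(2K) can optionally do, and its return codes do not correspond to ChorusOS error codes.

```c
#include <assert.h>
#include <string.h>

#define LAP_NAME_MAX 7  /* assumption: mirrors the K_LAPNAMEMAX value quoted above */
#define MAX_BINDINGS 8  /* hypothetical table size for this sketch */

/* Hypothetical return codes; these are not ChorusOS error codes. */
enum { NS_OK = 0, NS_TOO_LONG = -1, NS_FULL = -2, NS_NOT_FOUND = -3, NS_EXISTS = -4 };

typedef struct {
    char name[LAP_NAME_MAX + 1];
    int  desc;  /* stand-in for a LAP descriptor */
    int  used;
} Binding;

static Binding table[MAX_BINDINGS];

static Binding *find(const char *name)
{
    for (int i = 0; i < MAX_BINDINGS; i++) {
        if (table[i].used && strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* Server side: publish a descriptor under a symbolic name. */
int ns_bind(const char *name, int desc)
{
    if (strlen(name) > LAP_NAME_MAX)
        return NS_TOO_LONG;
    if (find(name) != NULL)
        return NS_EXISTS;
    for (int i = 0; i < MAX_BINDINGS; i++) {
        if (!table[i].used) {
            strcpy(table[i].name, name);  /* safe: length checked above */
            table[i].desc = desc;
            table[i].used = 1;
            return NS_OK;
        }
    }
    return NS_FULL;
}

/* Client side: obtain the descriptor from the name alone. */
int ns_resolve(const char *name, int *desc)
{
    Binding *b = find(name);
    if (b == NULL)
        return NS_NOT_FOUND;
    *desc = b->desc;
    return NS_OK;
}

/* Server side: withdraw the name before deleting the LAP. */
int ns_unbind(const char *name)
{
    Binding *b = find(name);
    if (b == NULL)
        return NS_NOT_FOUND;
    b->used = 0;
    return NS_OK;
}
```

Checking the name length before binding, as ns_bind() does, lets a server report an over-long name clearly instead of interpreting an error code after the fact.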
If the LAPSAFE feature has been configured in the system, a LAP frame descriptor is allocated to register the calling thread as a temporary resource of the invoked actor. Each LAP frame has an associated level (LAP frame level) which represents the number of LAP frames for the considered thread.
LAPSAFE enforces stronger checking during LAP invocation. This ensures that the microkernel will synchronize the svLapDelete() operation with concurrent LAP invocations. The full context of the calling thread is saved prior to the LAP invocation. This ensures that the calling thread can return from its invocation even if a failure occurs during the execution of the LAP handler. This option is mandatory if a LAP is called from user mode.
The svLapCreate() system call creates a new local access point for the actor designated by the actor capability (KnCap).
The svLapBind() system call binds the LAP descriptor pointed to by lapdesc with the symbolic name pointed to by name.
The svLapDelete() system call deletes the local access point whose descriptor is pointed to by lapdesc.
The following example demonstrates a POSIX application that uses these restricted microkernel calls. To launch the example, type:
rsh <target_name> lap <lapserver>
This example applies to supervisor actors only. It is a POSIX application, running in privileged mode.
(file: progov/lap.c)

    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <chorus.h>
    #include <spawn.h>
    #include <lap/chLap.h>
    #include <exec/chExec.h>

    KnCap     actorCap;
    KnLapDesc lapDesc;
    char*     lapArgument;
    KnTimeVal timeval;

    /* LAP handler: prints the identity of the running thread and its arguments */
    void lapHandler(char* message, char* cookie)
    {
        int res;

        res = actorSelf(&actorCap);
        if (res != K_OK) {
            printf("actorSelf failed, returns %d\n", res);
            exit(1);
        }
        printf("LAP handler is running \n");
        printf(" thread LI = %d, actor UI = 0x%x 0x%x\n",
               threadSelf(), actorCap.ui.uiHead, actorCap.ui.uiTail);
        printf(" Argument = %s, cookie = %s\n", message, cookie);
    }

    int main(int argc, char** argv, char** envp)
    {
        int              res;
        KnActorPrivilege actorP;
        char*            cookie = "Chorus";
        char*            spawn_args[4];

        res = actorPrivilege(K_MYACTOR, &actorP, NULL);
        if (res != K_OK) {
            printf("Cannot get actor privilege, error %d\n", res);
            exit(1);
        }

        if (argc == 1) {
            printf("Must be run with one argument: LAP name\n");
            exit(1);
        }

        if (argc == 2) {
            /*
             * This is the Server process.
             * connect a LAP handler,
             * bind a symbolic name to the LAP (name given as argv[1])
             * spawn a client actor (give LAP symbolic name in argument)
             * wait one minute for Lap invocation
             */

            /*
             * Spawn the client process
             */
            if (actorP != K_SUPACTOR) {
                printf("This program must be run in supervisor mode\n");
                exit(1);
            }
            spawn_args[0] = argv[0];
            spawn_args[1] = argv[1];
            spawn_args[2] = "ARGH";
            spawn_args[3] = NULL;
            res = posix_spawnp(NULL, spawn_args[0], NULL, NULL, spawn_args, NULL);
            if (res < 0) {
                printf("Cannot spawn client actor, error %d\n", res);
                exit(1);
            }

            /*
             * Create the LAP
             */
            res = svLapCreate(K_MYACTOR, (KnLapHdl) lapHandler, cookie,
                              K_LAP_SETJMP, &lapDesc);
            if (res != K_OK) {
                printf("svLapCreate failed, returns %d\n", res);
                exit(1);
            }

            /*
             * Bind a symbolic name
             */
            res = svLapBind(&lapDesc, argv[1], 0);
            if (res != K_OK) {
                printf("svLapBind failed, returns %d\n", res);
                exit(1);
            }

            /*
             * Wait one minute
             * Other client processes can be run from the console:
             * rsh target lap lap-name lap-argument
             */
            timeval.tmSec = 60;
            timeval.tmNSec = 0;
            (void) threadDelay(&timeval);

            /*
             * Unbind the LAP name and Delete the LAP
             */
            res = svLapUnbind(argv[1]);
            if (res != K_OK) {
                printf("svLapUnBind failed, returns %d\n", res);
                exit(1);
            }
            res = svLapDelete(&lapDesc);
            if (res != K_OK) {
                printf("svLapDelete failed, returns %d\n", res);
                exit(1);
            }
            printf("Server actor is leaving ...\n");
        } else {
            /*
             * This is the Client Actor:
             * argv[1] is the LAP name, argv[2] is the LAP argument.
             * Get the LAP descriptor and invoke the LAP handler.
             */
            res = actorSelf(&actorCap);
            if (res != K_OK) {
                printf("actorSelf failed ! Return code = %d\n", res);
                exit(1);
            }
            printf("Client actor is running, thread li = %d, "
                   "actor UI = 0x%x 0x%x\n",
                   threadSelf(), actorCap.ui.uiHead, actorCap.ui.uiTail);

            /*
             * Get the LAP descriptor knowing its name
             */
            res = lapResolve(&lapDesc, argv[1], 0);
            if (res != K_OK) {
                printf("lapResolve failed, returns %d\n", res);
                exit(1);
            }

            /*
             * Invoke the LAP handler
             */
            res = lapInvoke(&lapDesc, argv[2]);
            if (res != K_OK) {
                printf("lapInvoke failed, returns %d\n", res);
                exit(1);
            }
            printf("Client actor is leaving ...\n");
        }
        return 0;
    }
In the previous example, the main thread:
Checks if it is running as a supervisor process.
Spawns another copy of itself using posix_spawnp().
Creates a Local Access Point and connects a LAP handler, which prints the unique identifier of the current thread (Actor UI + thread LI), the LAP argument and the LAP cookie.
Binds a symbolic name received as the first argument.
Waits one minute for LAP invocations.
Frees the LAP and its name.
Terminates.
The spawned actor:
Receives two arguments: the symbolic LAP name and the argument to be passed to the LAP handler.
Retrieves the LAP descriptor.
Invokes the LAP handler.
Terminates.
This chapter deals with the aspects of messaging and interprocess communications that are native to ChorusOS systems. It also demonstrates the use of the application programming interfaces for handling message queues and IPC.
After implementing the examples provided in this chapter, you should understand how these APIs can be used in an application.
The MIPC API is an optional feature that provides message queues in ChorusOS systems. This feature enables an application, composed of one or more actors, to create a shared communication environment, often referred to as a message space. Within this message space, actors can exchange messages efficiently. Supervisor and user actors of the same application can use MIPC to exchange messages. Furthermore, messages may be initially allocated and sent by interrupt handlers to be processed later by threads.
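The MIPC API includes the following system calls:
API | Purpose |
---|---|
msgAllocate() | Allocate a message |
msgFree() | Free a message |
msgGet() | Get a message |
msgGetAny() | Get a message from any declared queues |
msgPoolStat() | Retrieve statistics about a message pool |
msgPut() | Post a message |
msgQueueStat() | Retrieve statistics about a message queue |
msgRemove() | Remove a message from a queue |
msgSpaceCreate() | Create a message space |
msgSpaceOpen() | Open a message space |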
Message queues are designed around the concept of a message space which encapsulates within a single entity:
A set of message pools shared by all actors of the application.
A set of message queues through which these actors exchange messages allocated from the shared message pools.
A message space is a temporary resource that must be created explicitly by one actor within the application. Once created, a message space can be opened by other actors within the application. Actors that have opened the same message space are said to share the message space. A message space is deleted automatically when the actor is deleted.
A message pool is defined by two parameters (message size and number of messages) provided by the application when it creates the message space. The configuration of the set of message pools defined within a message space depends on the application requirements.
A message is an array of bytes which can be structured and used at application level through any appropriate convention. Messages are presented to actors as pointers within their address space.
Messages are posted to message queues belonging to the same message space. All actors sharing a message space can allocate messages from the message pools. In the same way, all actors sharing a message space have send and receive rights on each queue of the message space.
Even though most applications need to create only a single message space, the message queue feature is designed to enable an application to create or open multiple message spaces. However, messages allocated from one message space cannot be sent to a queue of a different message space. The following example illustrates the typical use of message spaces:
The first actor, aware of the overall requirements of the application, creates the message space.
Other actors of the application open the shared message space.
An actor allocates a message from a message pool, and fills it with the data to be sent.
The actor which allocated the message then posts it to the appropriate queue and assigns a priority to the message.
The destination actor obtains the message from the queue. At this point, the message is removed from the queue.
After the destination actor has processed the message, the message can be freed. It is then available to be reallocated by the application. Alternatively, the destination actor may modify the message and post it to another queue.
To make the service as efficient as possible, physical memory is allocated for all messages and data structures of the message space at message space creation. At message space open time, this memory is transparently mapped by the system into the actor address space. Further operations such as posting and receiving a message are performed without any copy.
You can create a message space as follows:
    #include <mipc/chMipc.h>

    int msgSpaceCreate(KnMsgSpaceId     spaceGid,
                       unsigned int     msgQueueNb,
                       unsigned int     msgPoolNb,
                       const KnMsgPool* msgPools);
The spaceGid parameter is a unique global identifier assigned by the application to the message space being created. This identifier is also used by other actors of the application to open the message space. Therefore, the identifier serves to bind actors participating in the application to the same message space. The K_PRIVATEID predefined global identifier indicates that the message space created will be private to the invoking actor. This means that its queues and message pools will only be accessible to threads executing within this actor. No other actor will be able to open the message space. The message space is described by the following three parameters:
msgQueueNb indicates how many queues must be created within the message space. Each queue in the message space is then designated by its index within the set of queues. This index ranges from 0 to msgQueueNb - 1.
msgPoolNb is the number of message pools to be created in the message space.
msgPools is a pointer to an array of msgPoolNb pool descriptions. Each pool is described by a KnMsgPool data structure, which includes the following information:
msgSize, which defines the size of each message belonging to the pool
msgNumber, which defines how many messages of msgSize bytes must be created in this pool
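Because every message is backed by memory reserved when the message space is created, it can be useful to estimate the payload size that a pool configuration implies before calling msgSpaceCreate(2K). The following is a minimal sketch in portable C: ModelMsgPool is a hypothetical stand-in that only mirrors the msgSize and msgNumber fields named above, and model_payload_bytes() is an illustrative helper, not a ChorusOS call. The estimate covers message payload only; any internal bookkeeping the system adds is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in mirroring the two KnMsgPool fields described above. */
typedef struct {
    unsigned int msgSize;    /* size in bytes of each message in the pool */
    unsigned int msgNumber;  /* number of messages in the pool */
} ModelMsgPool;

/* Sum of msgSize * msgNumber over all pools: payload bytes only. */
size_t model_payload_bytes(const ModelMsgPool *pools, unsigned int poolNb)
{
    size_t total = 0;

    assert(pools != NULL || poolNb == 0);
    for (unsigned int i = 0; i < poolNb; i++)
        total += (size_t)pools[i].msgSize * pools[i].msgNumber;
    return total;
}
```

With the pool sizes used in Example 11-1 (4 messages of 256 bytes and 13 messages of 32 bytes), the helper reports 1440 bytes of payload.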
Figure 11-1 shows an example of a message space recently created by an actor.
The created message space is assigned a local identifier, which is returned to the invoking actor as the return value of the msgSpaceCreate(2K) call. The scope of this local identifier is the invoking actor.
A message space may be opened by other actors through the following call:
    #include <mipc/chMipc.h>

    int msgSpaceOpen(KnMsgSpaceId spaceGid);
The message space assigned with the spaceGid unique global identifier must have been created previously by a call to msgSpaceCreate(2K). A local identifier is returned to the invoking actor. This message space local identifier can then be used to manipulate messages and queues within the message space. Figure 11-2 shows an example of a message space recently opened by a second actor.
A message may be allocated by the following call:
    #include <mipc/chMipc.h>

    int msgAllocate(int          spaceLid,
                    unsigned int poolIndex,
                    unsigned int msgSize,
                    KnTimeVal*   waitLimit,
                    char**       msgAddr);
msgAllocate(2K) attempts to allocate a message from the appropriate pool of the message space identified by the spaceLid return value of a previous call to msgSpaceOpen(2K), or to msgSpaceCreate(2K). If poolIndex is not set to K_ANY_MSGPOOL, the allocated message is the first free (unallocated) message of the pool defined by poolIndex, regardless of the value specified by the msgSize parameter. Otherwise, if poolIndex is set to K_ANY_MSGPOOL, the message is allocated from the first pool for which the message size fits the requested msgSize. In this context, first pool means the pool with the lowest index in the set of pools defined at message space creation time. If the pool is empty, no attempt is made to allocate a message from another pool.
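The pool selection rule described above can be modeled in a few lines. The following is a minimal sketch in portable C, not the ChorusOS implementation: ModelPool, MODEL_ANY_POOL and model_select_pool() are hypothetical names, and the sketch assumes that a pool "fits" when its message size is greater than or equal to the requested size. It models only which pool is chosen, not blocking or the allocation itself.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ANY_POOL (-1)  /* hypothetical stand-in for K_ANY_MSGPOOL */

typedef struct {
    unsigned int msgSize;
    unsigned int msgNumber;
} ModelPool;

/*
 * Returns the index of the pool a request would be served from,
 * or -1 if no pool qualifies.
 */
int model_select_pool(const ModelPool *pools, int poolNb,
                      int poolIndex, unsigned int msgSize)
{
    assert(pools != NULL && poolNb > 0);

    /* Explicit pool: msgSize is ignored, as the text states. */
    if (poolIndex != MODEL_ANY_POOL)
        return (poolIndex >= 0 && poolIndex < poolNb) ? poolIndex : -1;

    /* Any pool: lowest index whose message size fits the request. */
    for (int i = 0; i < poolNb; i++) {
        if (pools[i].msgSize >= msgSize)
            return i;
    }
    return -1;
}
```

Under this reading, pool order matters. With the layout used in Example 11-1, where the 256-byte pool has index 0 and the 32-byte pool has index 1, a request for 32 bytes from any pool would be served from the large pool, so naming the small pool explicitly keeps large messages available.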
If the message pool is empty (all messages have been allocated and none are free), msgAllocate(2K) will block, waiting for a message in the pool to be freed. The invoking thread is blocked until the wait condition defined by waitLimit expires.
If successful, the address of the allocated message is stored at the location defined by msgAddr. The returned address is the address of the message within the address space of the actor. Remember that a message space is mapped to the address space of the actors sharing it. However, the message spaces (and therefore the messages themselves), may be mapped to different addresses in different actors. This is especially true for message spaces shared between supervisor and user actors.
Figure 11-3 illustrates two actors allocating two messages from two different pools of the same message space.
After a message has been allocated and initialized by the application, it may be posted to a message queue with:
    #include <mipc/chMipc.h>

    int msgPut(int          spaceLid,
               unsigned int queueIndex,
               char*        msg,
               unsigned int prio);
msgPut(2K) posts the message (the address of which is msg) to the message queue queueIndex within the message space (the local identifier of which is spaceLid). The message has already been allocated previously by a call to msgAllocate(2K). The message is inserted into the queue according to its priority, prio. Messages with a high priority are taken from the queue first.
A message is posted to a queue without any message copy. Additionally, the message may be posted within an interrupt handler, or with preemption disabled.
Figure 11-4 illustrates how the actors from the previous figure post their messages to different queues.
To obtain a message from a queue (if any) use the following call:
    #include <mipc/chMipc.h>

    int msgGet(int          spaceLid,
               unsigned int queueIndex,
               KnTimeVal*   waitLimit,
               char**       msgAddr,
               KnUniqueId*  srcActor);
msgGet(2K) enables the invoking thread to get the first message with the highest priority pending in the message queue queueIndex, within the message space whose local identifier is spaceLid. Messages with equal priority are posted and delivered in a first-in first-out order.
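The ordering rule, highest priority first and first-in first-out among equal priorities, can be modeled with a stable insertion into a sorted array. The following is a minimal sketch in portable C, not the ChorusOS implementation: ModelQueue, model_put() and model_get() are hypothetical names, the capacity is arbitrary, and the sketch assumes that a numerically larger prio value means a higher priority, which the text does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_Q_CAP 16  /* arbitrary capacity for this sketch */

typedef struct {
    const char  *msg;
    unsigned int prio;
} ModelEntry;

typedef struct {
    ModelEntry e[MODEL_Q_CAP];
    int        len;
} ModelQueue;

/*
 * Insert behind every entry of equal or higher priority, so that
 * equal-priority messages keep their arrival order.
 * Assumption: a larger prio value means a higher priority.
 */
int model_put(ModelQueue *q, const char *msg, unsigned int prio)
{
    int pos;

    if (q->len == MODEL_Q_CAP)
        return -1;
    pos = q->len;
    while (pos > 0 && q->e[pos - 1].prio < prio) {
        q->e[pos] = q->e[pos - 1];
        pos--;
    }
    q->e[pos].msg = msg;
    q->e[pos].prio = prio;
    q->len++;
    return 0;
}

/* Remove and return the head of the queue, or NULL if it is empty. */
const char *model_get(ModelQueue *q)
{
    const char *msg;

    if (q->len == 0)
        return NULL;
    msg = q->e[0].msg;
    memmove(&q->e[0], &q->e[1], (size_t)(q->len - 1) * sizeof(ModelEntry));
    q->len--;
    return msg;
}
```

Posting A (priority 1), B (priority 3), C (priority 1) and D (priority 3) in that order yields B, D, A, C on retrieval: both priority 3 messages come first, and each pair keeps its posting order.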
The address of the message delivered to the invoking thread is returned at the location defined by the msgAddr parameter. If no message is pending, the invoking thread is blocked until a message is sent to the message queue, or until the time-out (as defined by the expiration of the waitLimit parameter).
If the srcActor parameter is non-NULL, it points to a location where the unique identifier of the actor that posted the message (the source actor) is to be stored.
No data copy is performed to deliver the message to the invoking thread. Multiple threads can be blocked, waiting in the same message queue. With the msgGetAny() call it is possible for one thread to wait for message arrival on all message queues, and to obtain the first posted message from any queue. It is not possible to define a subset of queues and to wait for a message to arrive on this subset. Consequently, msgGet() allows you to wait for a message on a queue described by an index held in its second argument, and msgGetAny() allows you to wait for a message from all queues.
The msgGetAny() call is defined as follows:
    int msgGetAny(int           spaceLid,
                  unsigned int* msgQueueId,
                  KnTimeVal*    waitLimit,
                  char**        msgAddr,
                  KnUniqueId*   srcActor);
Figure 11-5 illustrates previously created actors receiving messages from queues.
A message that is of no further use to the application can be returned to its pool of messages and will be available for reallocation with the following call:
    #include <mipc/chMipc.h>

    int msgFree(int spaceLid, char* msg);

To retrieve information about a specific message queue or message pool, use the following calls:

    #include <mipc/chMipc.h>

    int msgPoolStat(int            spaceLid,
                    unsigned int   msgPoolId,
                    KnMsgPoolStat* stat);

    int msgQueueStat(int             spaceLid,
                     unsigned int    msgQueueId,
                     KnMsgQueueStat* stat);
Information about a message pool is stored in the KnMsgPoolStat structure which contains:
the size of the messages in the pool.
the number of messages in the pool.
the number of free messages in the pool (messages that can be allocated using msgAllocate).
Information about a message queue is stored in the KnMsgQueueStat structure which contains:
the number of messages currently waiting in the queue.
msgFree and msgNumber may be invalid after the next time slice or thread schedule because other threads or applications have access to the message space.
Example 11-1 illustrates the basic use of the message queue feature. The example uses the posix_spawnp() call and demonstrates how MIPC can be used with a POSIX process.
For more information, refer to the msgSpaceCreate(2K), msgSpaceOpen(2K), msgAllocate(2K), msgPut(2K), msgGet(2K) and msgFree(2K) man pages.
(file: opt/examples/progov/msgSpace.c)

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>
    #include <chorus.h>
    #include <mipc/chMipc.h>
    #include <spawn.h>

    char* spawn_args[3];

    #define NB_MSG_POOLS  2
    #define NB_MSG_QUEUES 3
    #define SMALL_MSG_SZ  32
    #define LARGE_MSG_SZ  256
    #define NB_SMALL_MSG  13
    #define NB_LARGE_MSG  4
    #define SAMPLE_SPACE  1111
    #define LARGE_POOL    0
    #define SMALL_POOL    1
    #define Q1            0
    #define Q2            1
    #define Q3            2

    KnMsgPool samplePools[NB_MSG_POOLS];
    char*     tagPtr = "Spawned";

    int main(int argc, char** argv, char** envp)
    {
        int              res;
        int              msgSpaceLi;
        char*            smallMsg;
        char*            smallReply;
        char*            largeMsg;
        KnActorPrivilege actorP;

        res = actorPrivilege(K_MYACTOR, &actorP, NULL);
        if (res != K_OK) {
            printf("Cannot get actor privilege, error %d\n", res);
            exit(1);
        }

        if (argc == 1) {
            /*
             * This is the first actor (or spawning actor):
             * Create a message space,
             * Spawn another actor,
             * Allocate, modify and post a small message on Q2
             * Get a large Message from Q3, print its contents, free it
             * Get reply of small message on Q1, print its contents, free it.
             */
            samplePools[LARGE_POOL].msgSize = LARGE_MSG_SZ;
            samplePools[LARGE_POOL].msgNumber = NB_LARGE_MSG;
            samplePools[SMALL_POOL].msgSize = SMALL_MSG_SZ;
            samplePools[SMALL_POOL].msgNumber = NB_SMALL_MSG;

            msgSpaceLi = msgSpaceCreate(SAMPLE_SPACE, NB_MSG_QUEUES,
                                        NB_MSG_POOLS, samplePools);
            if (msgSpaceLi < 0) {
                printf("Cannot create the message space error %d\n", msgSpaceLi);
                exit(1);
            }

            /*
             * Message Space has been created, spawn the other actor,
             * argv[1] set to "Spawned" to differentiate the 2 actors.
             */
            spawn_args[0] = argv[0];
            spawn_args[1] = tagPtr;
            spawn_args[2] = NULL;
            res = posix_spawnp(NULL, spawn_args[0], NULL, NULL, spawn_args, NULL);
            if (res < 0) {
                printf("Cannot spawn second actor, error %d\n", res);
                exit(1);
            }

            /*
             * Allocate a small message
             */
            res = msgAllocate(msgSpaceLi, SMALL_POOL, SMALL_MSG_SZ,
                              K_NOTIMEOUT, &smallMsg);
            if (res != K_OK) {
                printf("Cannot allocate a small message, error %d\n", res);
                exit(1);
            }

            /*
             * Initialize the allocated message
             */
            strncpy(smallMsg, "Sending a small message\n", SMALL_MSG_SZ);

            /*
             * Post the allocated small message to Q2 with priority 2
             */
            res = msgPut(msgSpaceLi, Q2, smallMsg, 2);
            if (res != K_OK) {
                printf("Cannot post the small message to Q2, error %d\n", res);
                exit(1);
            }

            /*
             * Get a large message from Q3 and print its contents
             */
            res = msgGet(msgSpaceLi, Q3, K_NOTIMEOUT, &largeMsg, NULL);
            if (res != K_OK) {
                printf("Cannot get the large message from Q3, error %d\n", res);
                exit(1);
            }
            printf("Received large message contains:\n%s\n", largeMsg);

            /*
             * Free the received large message
             */
            res = msgFree(msgSpaceLi, largeMsg);
            if (res != K_OK) {
                printf("Cannot free the large message, error %d\n", res);
                exit(1);
            }

            /*
             * Get the reply to small message from Q1 and print its contents
             */
            res = msgGet(msgSpaceLi, Q1, K_NOTIMEOUT, &smallReply, NULL);
            if (res != K_OK) {
                printf("Cannot get the small message reply from Q1, "
                       "error %d\n", res);
                exit(1);
            }
            printf("Received small reply contains:\n%s\n", smallReply);

            /*
             * Free the received small reply
             */
            res = msgFree(msgSpaceLi, smallReply);
            if (res != K_OK) {
                printf("Cannot free the small reply message, error %d\n", res);
                exit(1);
            }
        } else {
            /*
             * This is the spawned actor:
             * Check we have effectively been spawned
             * Open the message space
             * Allocate, initialize and post a large message to Q3
             * Get a small message from Q2, print its contents
             * Modify it and repost it to Q1
             */
            int l;

            if ((argc != 2) || (strcmp(argv[1], tagPtr) != 0)) {
                printf("%s does not take any argument!\n", argv[0]);
                exit(1);
            }

            /*
             * Open the message space, using the same global identifier
             */
            msgSpaceLi = msgSpaceOpen(SAMPLE_SPACE);
            if (msgSpaceLi < 0) {
                printf("Cannot open the message space error %d\n", msgSpaceLi);
                exit(1);
            }

            /*
             * Allocate the large message
             */
            res = msgAllocate(msgSpaceLi, K_ANY_MSGPOOL, LARGE_MSG_SZ,
                              K_NOTIMEOUT, &largeMsg);
            if (res != K_OK) {
                printf("Cannot allocate a large message, error %d\n", res);
                exit(1);
            }
            strcpy(largeMsg, "Sending a very large large large message\n");

            /*
             * Post the large message to Q3 with priority 0
             */
            res = msgPut(msgSpaceLi, Q3, largeMsg, 0);
            if (res != K_OK) {
                printf("Cannot post the large message to Q3, error %d\n", res);
                exit(1);
            }

            /*
             * Get the small message from Q2
             */
            res = msgGet(msgSpaceLi, Q2, K_NOTIMEOUT, &smallMsg, NULL);
            if (res != K_OK) {
                printf("Cannot get the small message from Q2, error %d\n", res);
                exit(1);
            }
            printf("Spawned actor received small message containing:\n%s\n",
                   smallMsg);

            /* Convert lower case characters to upper case */
            for (l = 0; l < strlen(smallMsg); l++) {
                if ((smallMsg[l] >= 'a') && (smallMsg[l] <= 'z')) {
                    smallMsg[l] = smallMsg[l] - 'a' + 'A';
                }
            }

            /*
             * Post the small message back to Q1, with priority 4
             */
            res = msgPut(msgSpaceLi, Q1, smallMsg, 4);
            if (res != K_OK) {
                printf("Cannot post the small message reply to Q1, error %d\n",
                       res);
                exit(1);
            }
        }
        return 0;
    }
Two actors are used, one spawned by the other. The first actor:
Creates a message space with two pools of messages and three message queues (as shown in Figure 11-5).
Allocates a small message, initializes it and posts it to a queue.
Waits for a large message on a second queue, prints its contents and deallocates it.
Waits for the small message to come back on a third queue, prints its contents, deallocates it, then terminates.
During this time, the second actor:
Opens the message space, allocates a large message to be initialized and sends it to the first actor.
Receives the small message, converts all lower case characters to upper case, and posts it back to the third queue before terminating.
The IPC feature enables threads to communicate and synchronize when they do not share memory, for example, when they do not run on the same node. Communications rely on the exchange of messages through ports.
The IPC feature includes a number of APIs. These are listed in the following table.
API | Purpose |
---|---|
actorPi() | Modify the PI of an actor |
portCreate() | Create a port |
portDeclare() | Declare a port |
portDelete() | Destroy a port |
portDisable() | Disable a port |
portEnable() | Enable a port |
portGetSeqNum() | Get a port sequence number |
portLi() | Acquire the LI of a port |
portMigrate() | Migrate a port |
portPi() | Modify the PI of a port |
portUi() | Acquire the UI of a port |
grpAllocate() | Allocate a group name |
grpPortInsert() | Insert a port into a group |
grpPortRemove() | Remove a port from a group |
ipcCall() | Send synchronously |
ipcGetData() | Get the current message body |
ipcReceive() | Receive a message |
ipcReply() | Reply to the current message |
ipcRestore() | Restore a message as the current message |
ipcSave() | Save the current message |
ipcSend() | Send asynchronously |
ipcSysInfo() | Get information about the current message |
ipcTarget() | Construct an address |
svMsgHandler() | Connect a message handler |
svMsgHdlReply() | Prepare a reply to a handled message |
The IPC feature enables threads to exchange messages in either asynchronous or demand/response mode. The demand/response mode is also known as a Remote Procedure Call (RPC).
In asynchronous mode, the sender of an asynchronous message is only blocked during the time of local processing of the message. The system does not guarantee that the message has been deposited at the destination location.
In RPC mode, the RPC protocol enables the construction of client-server applications using a demand/response protocol for the management of transactions. The client is blocked until a response is returned from the server, or a user-defined (optional) timeout occurs. RPC ensures at-most-once semantics for the delivery of the request. It also ensures that the response received by a client originates from the server. The response received by the client must also correspond to the request (and not to a former request to which the response might have been lost).
RPC also enables a client to be unblocked (which will generate an error) if the server is unreachable or has crashed before emitting a response.
Finally, the RPC protocol supports abort propagation. When a thread that is waiting for an RPC reply is aborted, the event will be propagated to the thread which is currently servicing the client request.
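The guarantee that a response corresponds to the current request, and not to a former one, can be illustrated with transaction identifiers. The following is a minimal sketch in portable C of one common way to obtain this property; the source does not describe how ChorusOS implements it, and ModelRpcClient and the model_rpc_* functions are hypothetical names introduced for this illustration.

```c
#include <assert.h>

/* Hypothetical client-side state for one blocking RPC at a time. */
typedef struct {
    unsigned int nextId;     /* last transaction identifier issued */
    unsigned int pendingId;  /* identifier of the outstanding request */
    int          waiting;    /* non-zero while a reply is awaited */
} ModelRpcClient;

/* Start a request: issue a fresh identifier and mark the client as waiting. */
unsigned int model_rpc_begin(ModelRpcClient *c)
{
    assert(!c->waiting);  /* the client is blocked while a call is pending */
    c->pendingId = ++c->nextId;
    c->waiting = 1;
    return c->pendingId;
}

/*
 * Offer a reply to the client. Returns 1 if it matches the outstanding
 * request and 0 if it is stale or unexpected, in which case it is dropped.
 */
int model_rpc_accept(ModelRpcClient *c, unsigned int replyId)
{
    if (!c->waiting || replyId != c->pendingId)
        return 0;
    c->waiting = 0;
    return 1;
}

/* The optional timeout expired: stop waiting for the current request. */
void model_rpc_timeout(ModelRpcClient *c)
{
    c->waiting = 0;
}
```

In this model, if a first request times out and a second one is issued, a late reply carrying the first identifier is rejected, and only the reply to the second request unblocks the client.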
A thread that attempts to receive a message on a port is blocked until a new message is received or until a user-defined (optional) time-out occurs.
A thread can attempt to receive a message on several ports at a time. Among the set of ports attached to an actor, a subset of enabled ports is defined. A thread can also attempt to receive a message sent to any of the enabled ports in its actor.
Ports attached to an actor can be dynamically enabled or disabled. When a port is enabled, it receives a priority value. If several of the enabled ports hold a message when a thread attempts to receive the message on the enabled set of ports, the port with the highest priority will be selected.
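The selection among enabled ports can be modeled as a scan for the highest-priority enabled port that holds a message. The following is a minimal sketch in portable C, not the ChorusOS implementation: ModelPort and model_select_port() are hypothetical names, the sketch assumes that a numerically larger prio value means a higher priority, and it breaks ties in favor of the lowest index, which the text does not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of a port as seen by a multiple-port receive. */
typedef struct {
    int enabled;  /* non-zero if the port is in the enabled set */
    int prio;     /* priority given when the port was enabled */
    int pending;  /* number of messages waiting on the port */
} ModelPort;

/*
 * Returns the index of the port a receive on the enabled set would
 * pick, or -1 if no enabled port holds a message.
 * Assumption: a larger prio value means a higher priority.
 */
int model_select_port(const ModelPort *ports, int portNb)
{
    int best = -1;

    assert(ports != NULL || portNb == 0);
    for (int i = 0; i < portNb; i++) {
        if (!ports[i].enabled || ports[i].pending == 0)
            continue;  /* disabled or empty ports are never candidates */
        if (best < 0 || ports[i].prio > ports[best].prio)
            best = i;
    }
    return best;
}
```

A disabled port is skipped even if it holds messages and has the highest priority, which matches the statement that disabled ports cannot take part in multiple-port receive requests.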
An actor's default port is not automatically enabled. A port that has not been explicitly enabled is disabled by default. This does not mean that the port cannot be used to send or to receive messages. It means that the port cannot be used in multiple-port receive requests.
The conventional way for an actor to receive messages delivered to its ports is to have threads explicitly express receive requests. An alternative to this method is the use of message handlers. Instead of explicitly creating threads, an actor can attach a handler (a routine in its address space) to the port. When a message is delivered to the port, the handler is executed within the context of a thread provided by the microkernel.
Message handlers and explicit receive requests are exclusive. When a message handler has been attached to a port, any attempt by a thread to receive a message on that port will return an error.
The use of message handlers is restricted to supervisor actors. Message handlers enable significant optimization of the RPC protocol when both client and server reside on the same site, thereby avoiding thread context switches and memory copies. From the point of view of the microkernel, the client thread is used to run the handler and copying of the message into microkernel buffers is avoided.
The method in which messages are consumed (threads or handlers) is completely transparent to the client (the sender of the message). The strategy is selected on the server only.
The IPC_REMOTE feature enables support for communication among multiple sites in a network, using a location-transparent communication feature. Without this feature, IPC services may be used only within a single site.
The following example illustrates the use of the IPC mechanisms provided in the ChorusOS operating system.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>
    #include <chorus.h>
    #include <spawn.h>
    #include "ipc/chIpc.h"

    #define ABORT_DELAY 1000 /* Delay for ipcReceive */

    static KnUniqueId thePortUi; /* Our port unique identifier */
    static int        thePortLi; /* ....... local identifier */
    static KnCap      groupCap;  /* Our group capability */

    /* The outgoing annex and body */
    static char sndAnnex[] = "Hello world from Chorus ...\n";
    static char sndBody[] = "The sea is calm, the tide is full ...\n";

    /* The received annex and body */
    static char rcvAnnex[K_CMSGANNEXSIZE];
    static char rcvBody[1000];

    /* Arguments for the spawned process */
    static char* spawn_args[3];

    /* Port group stamp */
    char* stamp;
    #define MYSTAMP   100
    #define STAMPSIZE 10

    int main(int argc, char** argv, char** envp)
    {
        int       rslt;    /* Work */
        KnMsgDesc smsg;    /* Descriptor for message being sent */
        KnMsgDesc rmsg;    /* Descriptor for message being received */
        KnIpcDest ipcDest; /* IPC address */

        if (argc == 1) {
            /*
             * Server actor
             * Create the destination port
             */
            thePortLi = portCreate(K_MYACTOR, &thePortUi);
            if (thePortLi < 0) {
                printf("portCreate failed, returns %d\n", thePortLi);
                exit(1);
            }

            /*
             * Allocate a port group and insert the port into it
             */
            rslt = grpAllocate(K_STATUSER, &groupCap, MYSTAMP);
            if (rslt < 0) {
                printf("grpAllocate failed, returns %d\n", rslt);
                exit(1);
            }
            rslt = grpPortInsert(&groupCap, &thePortUi);
            if (rslt < 0) {
                printf("grpPortInsert failed, returns %d\n", rslt);
                exit(1);
            }

            /*
             * Spawn the client actor
             * The group stamp is given in argument
             * Exit in case of malloc failure
             */
            stamp = malloc(STAMPSIZE);
            if (stamp == NULL) {
                printf("Can't allocate %d bytes\n", STAMPSIZE);
                exit(1);
            }
            sprintf(stamp, "%d", MYSTAMP);
            spawn_args[0] = argv[0];
            spawn_args[1] = stamp;
            spawn_args[2] = NULL;
            rslt = posix_spawnp(NULL, spawn_args[0], NULL, NULL, spawn_args, NULL);
            if (rslt < 0) {
                printf("Cannot spawn client actor, error %d\n", rslt);
                exit(1);
            }

            /*
             * Receive the message
             */
            rmsg.flags = 0;
            rmsg.bodySize = sizeof(rcvBody);
            rmsg.bodyAddr = (VmAddr)rcvBody;
            rmsg.annexAddr = (VmAddr)rcvAnnex;
            rslt = ipcReceive(&rmsg, &thePortLi, ABORT_DELAY);
            if (rslt < 0) {
                printf("ipcReceive failed, returns %d\n", rslt);
                exit(1);
            }
            printf("%s\n%s\n", rcvAnnex, rcvBody);

            rslt = portDelete(K_MYACTOR, thePortLi);
            if (rslt < 0) {
                printf("portDelete failed, returns %d\n", rslt);
                exit(1);
            }
        } else {
            /*
             * Get the port group capability giving the stamp.
             * Stamp has been received in argv[1]
             */
            rslt = grpAllocate(K_STATUSER, &groupCap, (int) atoi(argv[1]));
            if (rslt < 0) {
                printf("grpAllocate failed, returns %d\n", rslt);
                exit(1);
            }

            /*
             * Prepare the message descriptor for the message to send
             */
            smsg.flags = 0;
            smsg.bodySize = sizeof(sndBody);
            smsg.bodyAddr = (VmAddr)sndBody;
            smsg.annexAddr = (VmAddr)sndAnnex;

            /*
             * Prepare the IPC address for the message destination.
             * Send the message in broadcast mode.
             */
            ipcDest.target = groupCap.ui;
            rslt = ipcTarget(&ipcDest.target, K_BROADMODE);
            if (rslt < 0) {
                printf("ipcTarget failed, returns %d\n", rslt);
                exit(1);
            }

            /* Send from our DEFAULT port */
            rslt = ipcSend(&smsg, K_DEFAULTPORT, &ipcDest);
            if (rslt < 0) {
                printf("ipcSend failed, returns %d\n", rslt);
                exit(1);
            }
        }
        return 0;
    }
In this example, the main thread (which is implicitly created):
Creates a port
Creates a port group and inserts the port into it
Spawns another copy of itself, using posix_spawnp(), and passes the port group stamp as an argument
Waits for a message on the created port
Prints the contents of the body and the annex
Frees all used resources and terminates
The spawned actor:
Retrieves the port group capability, passing the stamp received as an argument
Prepares and sends a message to this group in broadcast mode (the annex and body of the message are initialized with strings)
Terminates
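The example hands the port group stamp from the server to the spawned client as a command-line string. The following sketch isolates that step in portable C, outside ChorusOS: it formats an integer stamp into a bounded buffer and parses it back with error checking, which the atoi() call in the listing does not provide. The helper names stamp_to_arg and arg_to_stamp are hypothetical and are not part of any ChorusOS API.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write the stamp as decimal text into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
int stamp_to_arg(int stamp, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%d", stamp);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: parse a decimal stamp such as the one received
 * in argv[1]. Returns 0 on success, -1 if the text is not a valid int. */
int arg_to_stamp(const char *arg, int *stamp)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX) {
        return -1;
    }
    *stamp = (int)value;
    return 0;
}
```

In the listing above, the server side would correspond to stamp_to_arg(MYSTAMP, stamp, STAMPSIZE), and the client side to arg_to_stamp(argv[1], &value) followed by an error exit when it returns -1.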
The following comparison will help you decide which APIs to use for your specific application.
The MIPC service provides local communication only. MIPC has the following major benefits:
MIPC system calls are real-time compliant.
Messages are exchanged through a zero-copy interface.
Messages can be allocated and posted from interrupt handlers.
The MIPC service is therefore highly suited to real-time applications.
The IPC subsystem provides location-transparent, message-based communication services between applications. Two communication models are provided:
The RPC model (see the ipcCall(2K), ipcReceive(2K), and ipcReply(2K) man pages).
Asynchronous communication (see the ipcSend(2K) and ipcReceive(2K) man pages).
Although the IPC service implements fine-grained locking policies (which makes its services highly preemptable) it cannot enforce real-time constraints. IPC services are therefore intended specifically for distributed applications.
This chapter deals with the time management services available on ChorusOS systems and demonstrates the use of the ChorusOS native application programming interfaces for managing CPU and real-time.
After implementing the examples provided in this chapter, you will understand how these APIs might be used in an application.
The ChorusOS operating system offers five time management services:
Clock service (tick)
Date service
Timeout service
Timer service (TIMER)
Virtual time and virtual timeout service (VTIMER)
The configuration of your ChorusOS operating system determines which services are available. Table 12-1 indicates which services are available for a given configuration:
Table 12-1 Time Management Service Availability
Service | Availability |
---|---|
tick | always available |
date | configured with |
timeout | always available |
timer | configured with |
virtual time | configured with |
The tick service enables the system to manage the clock, counting only the ticks since the last reboot. No other tick count is available.
A user or supervisor actor can obtain the time elapsed since the last reboot through the following system call:
    #include <exec>
    int sysTime(KnTimeVal* time);
This call fills the time data structure, which is built from the following two fields:
tmSec which indicates the number of whole seconds elapsed since the last reboot
tmNSec which indicates the number of nanoseconds
The resolution of the value depends on the platform on which the system is running, and may be obtained by a call to:
    #include <chorus.h>
    int sysTimeGetRes(KnTimeVal* resolution);
The time value returned at the location defined by the resolution parameter represents the smallest possible difference between two distinct values of the system time.
The two functions described previously return K_OK if successful and K_EFAULT if the data is outside the actor's address space. The following code example illustrates the use of these two general functions:
    #include <stdio.h>
    #include <stdlib.h>
    #include <chorus.h>

    int main()
    {
        KnTimeVal time;
        int       result;

        /* Time elapsed since the last reboot */
        result = sysTime(&time);
        if (result != K_OK) {
            fprintf(stderr, "error on sysTime %s\n", strSysError(result));
            exit(1);
        }
        printf("system time: %d seconds and %d nanoseconds\n",
               time.tmSec, time.tmNSec);

        /* Smallest difference between two distinct system time values */
        result = sysTimeGetRes(&time);
        if (result != K_OK) {
            fprintf(stderr, "error on sysTimeGetRes %s\n", strSysError(result));
            exit(1);
        }
        printf("time resolution: %d seconds and %d nanoseconds\n",
               time.tmSec, time.tmNSec);
        return 0;
    }
This program produces the following output:
    neon time1_u
    started aid = 23
    system time: 12013 seconds and 240000000 nanoseconds
    time resolution: 0 seconds and 10000000 nanoseconds
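Because the system time is returned as a seconds/nanoseconds pair, measuring an interval means subtracting two such pairs with a borrow. The following is a minimal sketch in portable C. SketchTimeVal is a stand-in for KnTimeVal whose field types are assumed, and sketch_time_diff is a hypothetical helper rather than a ChorusOS call.

```c
#define SKETCH_NS_PER_SEC 1000000000L

/* Stand-in for KnTimeVal; the real field types are an assumption. */
typedef struct {
    long tmSec;   /* whole seconds */
    long tmNSec;  /* nanoseconds, 0 .. 999999999 */
} SketchTimeVal;

/* Hypothetical helper: return later - earlier, normalized so that
 * tmNSec stays within one second. Assumes later >= earlier. */
SketchTimeVal sketch_time_diff(SketchTimeVal later, SketchTimeVal earlier)
{
    SketchTimeVal d;

    d.tmSec  = later.tmSec - earlier.tmSec;
    d.tmNSec = later.tmNSec - earlier.tmNSec;
    if (d.tmNSec < 0) {
        /* Borrow one second from the seconds field */
        d.tmSec  -= 1;
        d.tmNSec += SKETCH_NS_PER_SEC;
    }
    return d;
}
```

An application would typically call sysTime() before and after the code being measured and pass both samples to a helper of this kind.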
The date service enables the ChorusOS operating system to maintain the current date (usually expressed in seconds since 01/01/1970). Calls to set and get the time are available in the standard C libraries ctime and localtime, and are not detailed in this document.
This section demonstrates how the date service is handled on ChorusOS systems. The following example uses the ctime() function to convert the system date to a standard character string.
    #include <stdio.h>
    #include <time.h>
    #include <chorus.h>
    #include <date/chDate.h>

    int       result;
    KnTimeVal resol, time1, time2;

    int main()
    {
        time_t seconds;    /* ctime() expects a pointer to a time_t */

        result = univTimeGetRes(&resol);
        if (result != K_OK)
            printf("error on univTimeGetRes: %s\n", strSysError(result));
        else
            printf("resolution: %d sec, %d nano\n", resol.tmSec, resol.tmNSec);

        result = univTime(&time1);
        if (result != K_OK)
            printf("error on univTime: %s\n", strSysError(result));
        printf("time is: %d seconds, %d nanoseconds\n",
               time1.tmSec, time1.tmNSec);
        seconds = (time_t)time1.tmSec;
        printf(" ==> %s\n", ctime(&seconds));

        /* Set a new date, then read it back */
        time2.tmSec = 40000000;
        time2.tmNSec = 0;
        result = univTimeSet(&time1, &time2);
        if (result != K_OK)
            printf("error on univTimeSet: %s\n", strSysError(result));

        result = univTime(&time1);
        if (result != K_OK)
            printf("error on univTime: %s\n", strSysError(result));
        printf("time is: %d seconds, %d nanoseconds\n",
               time1.tmSec, time1.tmNSec);
        seconds = (time_t)time1.tmSec;
        printf(" ==> %s\n", ctime(&seconds));
        return 0;
    }
This program produces the following output:
    neon date_s.r
    started aid = 2
    resolution: 0 sec, 10000000 nano
    time is: 947179797 seconds, 590000000 nanoseconds
    Thu Jan 6 17:29:57 2000
    time is: 40000000 seconds, 10000000 nanoseconds
    Thu Apr 8 23:06:40 1971
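The conversion from a count of seconds since 01/01/1970 to a readable date can also be tried outside ChorusOS with the standard C library alone. The sketch below uses gmtime() and strftime(), not the ChorusOS date calls. The helper name sketch_format_date is hypothetical, and the choice of UTC is an assumption made so that the result does not depend on the local time zone.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a count of seconds since 01/01/1970 as
 * UTC text in buf. Returns 0 on success, -1 on failure. */
int sketch_format_date(long seconds, char *buf, size_t size)
{
    time_t t = (time_t)seconds;
    struct tm *utc = gmtime(&t);

    if (utc == NULL) {
        return -1;
    }
    return strftime(buf, size, "%a %b %d %H:%M:%S %Y", utc) > 0 ? 0 : -1;
}
```

For 40000000 seconds, the value set in the example, this helper produces Thu Apr 08 23:06:40 1971, which is the same instant as the second date in the sample output.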
The timeout service enables supervisor actors to set up timeouts. A timeout may be described as a callback performed after a given delay has expired. Callbacks are performed using the LAP invocation mechanism.
The TIMEOUT feature provides the traditional one-shot timeout service. At timeout expiration, a caller-provided handler is executed directly at the interrupt level. The execution is generally performed on the interrupt stack (if it exists) with the thread scheduling disabled. The handler execution environment will be restricted accordingly. This feature is restricted to supervisor threads.
In the current version of the ChorusOS operating system, timeouts are based on a regular system-wide clock tick, and timeout granularity is determined by the clock tick.
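Tick-based granularity means that a requested delay is effectively served as a whole number of clock ticks. The sketch below shows the arithmetic in portable C. It assumes that a delay is rounded up to the next tick so that a timeout never fires early; the text above states only that granularity is determined by the tick, so the rounding direction is an assumption. The helper name sketch_delay_to_ticks is hypothetical.

```c
/* Hypothetical helper: number of whole clock ticks needed to cover
 * delay_ns, rounding up. tick_ns must be greater than zero and
 * delay_ns must not be negative. */
long long sketch_delay_to_ticks(long long delay_ns, long long tick_ns)
{
    return (delay_ns + tick_ns - 1) / tick_ns;
}
```

With a 10-millisecond tick (10000000 nanoseconds), a 50-millisecond timeout covers exactly 5 ticks, while a 55-millisecond timeout needs 6.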
The timeout API includes the following system calls:
Cancel a timeout
Request a timeout
Get timeout resolution
The following example demonstrates the use of timeout and LAP handlers. The example assumes that the program is run in supervisor mode.
In this example, the original value of val is 10. The example includes the following steps:
A LAP is created, which doubles the value of val when it is called and posts an event.
The main program installs the LAP, sets a timeout and waits for a semaphore.
After being awakened, the program completes its function and then exits.
    #include <exec/chExec.h>
    #include <exec/chIo.h>
    #include <stdcIntf.h>
    #include <stdio.h>
    #include <string.h>
    #include <exec/chTimeout.h>

    int       result;
    int       val;
    KnTimeVal servTimeout = {0, 50 * K_NPERM};     /* 50 millisec */
    KnTimeVal servDelay   = {0, 500 * K_NPERM};    /* 500 millisec */
    KnTimeout servTimeoutDesc;
    KnThSem   servThSem;

    /*
     * Timeout handler, used to test
     */
    void toHandler(void* args, void* opMsg)
    {
        val = val * 2;
        threadSemPost(&servThSem);
        sysWrite("timeout handler", strlen("timeout handler"));
    }

    void timeout()
    {
        void *opMsg = NULL;    /* Cookie for the handler; not used here */

        if (_stdc_execPrivilege == K_SUPACTOR) {
            KnLapDesc lapDesc;

            /* Create the LAP that wraps the timeout handler */
            result = svLapCreate(&_stdc_execActor, toHandler, opMsg,
                                 K_LAP_SAMEHOST, &lapDesc);
            if (result != K_EOK) {
                printf("svLapCreate error\n");
            }

            /* Request the timeout */
            result = svSysTimeoutSet(&servTimeoutDesc, &servTimeout, 0,
                                     &lapDesc);
            if (result != K_EOK) {
                printf("svSysTimeoutSet error\n");
            }

            /* Wait for the timeout handler to complete.
               Should NOT return K_ETIMEOUT */
            result = threadSemWait(&servThSem, &servDelay);
            if (result != K_EOK) {
                printf("threadSemWait error \n");
            }
            printf("timeout completed\n");
        }
    }

    int main(int argc, char* argv[], char** envp)
    {
        val = 10;
        printf("val = %d\n", val);
        threadSemInit(&servThSem);
        timeout();
        printf("val = %d\n", val);
        return 0;
    }
The timer service is an extension of the timeout mechanism, enabling user and supervisor actors to set up callbacks in a flexible manner. Both one-shot and periodic timers are provided. Timeout notification is achieved through user-provided handler threads, which are woken in the application actor.
The timer facility uses the principle of a timer object within the actor. Timer objects may be created, set and deleted dynamically. After being created, the timer objects are addressed by local identifiers within the context of the actor, and are deleted automatically when the actor terminates.
The application creates one or more threads dedicated to timer notification handling. These threads declare themselves ready to handle these types of events. The relationship between a timer object and a thread (or set of threads) is established through a threadPool object that is used to block threads waiting for expiration of a timer.
Therefore, the basic mechanism for dealing with timers is:
Allocate and initialize a threadPool object.
Create one thread which will block on the threadPool object.
Create a timer associated with the threadPool object.
Set the timer (effectively arming it).
The second and third steps may be performed in any order. When timer expiration occurs, the dedicated thread is unblocked so that it may now execute any operation that should be performed at timer expiration. For example, the thread may print a warning message, re-arm the timer (unless it is a periodic timer), and block itself again. As is usually the case with the ChorusOS operating system data structures, these threadPool objects must be pre-allocated by the application.
A threadPool object is initialized as follows:
    #include <etimer>
    int timerThreadPoolInit(KnThreadPool* threadPool);
A timer can then be created as follows:
    int timerCreate(KnCap* actorCap,
                    int clockType,
                    KnThreadPool* threadPool,
                    void* cookie,
                    int* timerLi);
The previous code fragment creates a timer object in the actor defined by the actorCap parameter. Applications usually use K_MYACTOR. When the timer is armed and reaches expiration, one of the threads blocked on the threadPool object is selected and awakened. This thread is passed the cookie parameter of the timerCreate(2K) call. If successful, timerCreate(2K) returns the local identifier of the created timer in the location defined by the timerLi parameter. At the time of this printing the only clock type supported is K_CLOCK_REALTIME, which corresponds to the time returned by sysTime(2K).
A thread can block itself on a threadPool object through the following call:
    #include <etimer>
    int timerThreadPoolWait(KnThreadPool* threadPool,
                            void** cookie,
                            int* overrun,
                            KnTimeVal* waitLimit);
The threadPool object must have been initialized previously.
The timerThreadPoolWait(2K) call blocks the invoking thread until a timer associated with threadPool expires or until the waitLimit condition is reached. Upon timer expiration, the thread returns from this call, and the cookie field has been updated with the value associated with the timer.
The overrun counter is used to indicate to the thread that the timeout notification has been delayed (in this case the overrun value is 1), or that a number of timeout notifications have been lost (in this case the overrun value is always greater than 1).
A timer may be armed with:
    #include <etimer>
    int timerSet(KnCap* actorCap,
                 int timerLi,
                 int flag,
                 KnITimer* new,
                 KnITimer* old);
The timerSet() call arms the timer defined by the first two parameters where timerLi is the timer identifier returned by timerCreate(2K). The timerSet(2K) call enables the specification of the timeout with a relative or an absolute time (using the flag parameter). The timeout can be specified using the new parameter, a structure containing the following fields:
KnTimeVal ITmValue. This field specifies the specific time at which the timeout will be invoked for the first time.
If the flag is set to K_TIMER_ABSOLUTE, the time value is an absolute time (in relation to time as managed by the sysTime() service).
If the flag is set to K_TIMER_INTERVAL, the time value is a delay relative to the current time.
KnTimeVal ITmReload. This field contains the subsequent interval for a periodic timer. If its value is 0, the timer will be a one-shot timer.
If the old parameter is non-NULL, the time remaining before the timer expires is returned at the location defined by old. If new is non-NULL and the timer has already been set, the current setting is cancelled and replaced with the new one. If the new time specified is 0, the current setting is simply cancelled. If new is set to NULL, the current setting specification is left unchanged.
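The interaction between the flag, ITmValue, and ITmReload can be illustrated with a small stand-alone model. The sketch below is portable C, not ChorusOS code: SketchTimeVal and SketchITimer are stand-ins for KnTimeVal and KnITimer, the flag constants use assumed numeric values, and sketch_expiry_ns is a hypothetical helper that computes when the k-th expiry of a timer would fall on the system clock.

```c
#include <stdbool.h>

#define SKETCH_NS_PER_SEC 1000000000LL

/* Stand-ins for KnTimeVal, KnITimer and the flag values; the real
 * definitions and numeric values are assumptions. */
typedef struct { long tmSec; long tmNSec; } SketchTimeVal;
typedef struct {
    SketchTimeVal ITmValue;   /* first expiry */
    SketchTimeVal ITmReload;  /* period, or zero for a one-shot timer */
} SketchITimer;
enum { SKETCH_TIMER_INTERVAL = 0, SKETCH_TIMER_ABSOLUTE = 1 };

static long long sketch_to_ns(SketchTimeVal t)
{
    return (long long)t.tmSec * SKETCH_NS_PER_SEC + t.tmNSec;
}

/* Hypothetical helper: compute the time of the k-th expiry (k = 0 is
 * the first) in nanoseconds on the system clock. Returns false when
 * the timer is one-shot and k > 0, because no such expiry exists. */
bool sketch_expiry_ns(const SketchITimer *timer, int flag,
                      long long now_ns, int k, long long *expiry_ns)
{
    long long first  = sketch_to_ns(timer->ITmValue);
    long long reload = sketch_to_ns(timer->ITmReload);

    if (flag == SKETCH_TIMER_INTERVAL) {
        first += now_ns;    /* value is a delay relative to now */
    }
    if (k > 0 && reload == 0) {
        return false;       /* one-shot timers expire only once */
    }
    *expiry_ns = first + (long long)k * reload;
    return true;
}
```

For instance, a timer with a one-second value and a one-second reload, armed as an interval at system time 5 seconds, expires at 6, 7, 8 seconds and so on, whereas an absolute one-shot timer set to 35 seconds expires once at 35 seconds regardless of when it is armed.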
For more information, refer to the timerThreadPoolInit(2K), timerCreate(2K), timerSet(2K), and timerThreadPoolWait(2K) man pages.
The following example illustrates the use of timer services for both user and supervisor actors.
(file: opt/examples/progov/timers.c)

    #include <stdio.h>
    #include <stdlib.h>
    #include <chorus.h>
    #include <etimer.h>

    KnThreadPool samplePool;
    int periodic;
    int oneShot;
    int periodicLid;
    int oneShotLid;

    #define USER_STACK_SIZE (1024 * sizeof(long))

    KnSem sampleSem;    /* Semaphore allocated as global variable */

    int childCreate(KnPc entry)
    {
        KnActorPrivilege    actorP;
        KnDefaultStartInfo_f startInfo;
        char*               userStack;
        int                 childLid = -1;
        int                 res;

        startInfo.dsType = K_DEFAULT_START_INFO;
        startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;

        res = actorPrivilege(K_MYACTOR, &actorP, NULL);
        if (res != K_OK) {
            printf("Cannot get the privilege of the actor, error %d\n", res);
            exit(1);
        }

        if (actorP == K_SUPACTOR) {
            startInfo.dsPrivilege = K_SUPTHREAD;
        } else {
            startInfo.dsPrivilege = K_USERTHREAD;
        }

        /* User threads need a stack allocated by the application */
        if (actorP != K_SUPACTOR) {
            userStack = malloc(USER_STACK_SIZE);
            if (userStack == NULL) {
                printf("Cannot allocate user stack\n");
                exit(1);
            }
            startInfo.dsUserStackPointer = userStack + USER_STACK_SIZE;
        }

        startInfo.dsEntry = entry;

        res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE, 0, &startInfo);
        if (res != K_OK) {
            printf("Cannot create the thread, error %d\n", res);
            exit(1);
        }
        return childLid;
    }

    void timerWait(int myThLi)
    {
    }

    void sampleThread()
    {
        int       myThLi;
        int       res;
        void*     cookie;
        int       overrun;
        KnITimer  periodicTimer;
        KnTimeVal tv;

        myThLi = threadSelf();
        printf("Thread %d started\n", myThLi);

        for (;;) {
            res = timerThreadPoolWait(&samplePool, &cookie, &overrun,
                                      K_NOTIMEOUT);
            if (res != K_OK) {
                printf("Cannot wait on thread pool, error %d\n", res);
                exit(1);
            }
            if (overrun != 0) {
                printf("Thread %d. We were late! overrun set to : %d\n",
                       myThLi, overrun);
            }
            if (cookie == &periodic) {
                printf("Thread %d. Time is flying away!\n", myThLi);
            } else if (cookie == &oneShot) {
                printf("Thread %d. Isn't it time to go home?\n", myThLi);
                /* A zero time value cancels the current setting */
                periodicTimer.ITmValue.tmSec = 0;     /* seconds */
                periodicTimer.ITmValue.tmNSec = 0;    /* nanoseconds */
                periodicTimer.ITmReload.tmSec = 0;    /* seconds */
                periodicTimer.ITmReload.tmNSec = 0;   /* nanoseconds */
                res = timerSet(K_MYACTOR, periodicLid, 0,
                               &periodicTimer, NULL);
                if (res != K_OK) {
                    printf("Cannot cancel periodic timer, error %d\n", res);
                    exit(1);
                }
                /*
                 * Periodic timer is cancelled
                 * Get current time,
                 * Wait for a short while (3 seconds) and quit
                 */
                res = sysTime(&tv);
                if (res != K_OK) {
                    printf("Cannot get system time, error %d\n", res);
                    exit(1);
                }
                printf("Current system time is %d seconds\n", tv.tmSec);
                printf("No more periodic messages should be printed now!\n");
                K_MILLI_TO_TIMEVAL(&tv, 3000);
                (void) threadDelay(&tv);
                /* We are all done ! */
                exit(0);
            } else {
                printf("Spurious timer!\n");
            }
        } /* for() */
    }

    int main(int argc, char** argv, char** envp)
    {
        int       res;
        KnTimeVal tv;
        int       thLi1;
        int       thLi2;
        KnITimer  periodicTimer;
        KnITimer  oneShotTimer;

        res = timerThreadPoolInit(&samplePool);
        if (res != K_OK) {
            printf("Cannot initialize thread pool, error %d\n", res);
            exit(1);
        }

        res = timerCreate(K_MYACTOR, K_CLOCK_REALTIME, &samplePool,
                          &periodic, &periodicLid);
        if (res != K_OK) {
            printf("Cannot create periodic timer, error %d\n", res);
            exit(1);
        }

        res = timerCreate(K_MYACTOR, K_CLOCK_REALTIME, &samplePool,
                          &oneShot, &oneShotLid);
        if (res != K_OK) {
            printf("Cannot create one shot timer, error %d\n", res);
            exit(1);
        }

        thLi1 = childCreate((KnPc)sampleThread);
        thLi2 = childCreate((KnPc)sampleThread);

        res = sysTime(&tv);
        if (res != K_OK) {
            printf("Cannot get system time, error %d\n", res);
            exit(1);
        }
        printf("Current system time is %d seconds\n", tv.tmSec);

        /* Periodic timer: first expiry after 1 second, then every second */
        periodicTimer.ITmValue.tmSec = 1;      /* seconds */
        periodicTimer.ITmValue.tmNSec = 0;     /* nanoseconds */
        periodicTimer.ITmReload.tmSec = 1;     /* seconds */
        periodicTimer.ITmReload.tmNSec = 0;    /* nanoseconds */
        res = timerSet(K_MYACTOR, periodicLid, 0, &periodicTimer, NULL);
        if (res != K_OK) {
            printf("Cannot arm periodic timer, error %d\n", res);
            exit(1);
        }

        /* One-shot timer: absolute expiry 30 seconds from now */
        oneShotTimer.ITmValue.tmSec = tv.tmSec + 30;   /* seconds */
        oneShotTimer.ITmValue.tmNSec = 0;              /* nanoseconds */
        oneShotTimer.ITmReload.tmSec = 0;              /* seconds */
        oneShotTimer.ITmReload.tmNSec = 0;             /* nanoseconds */
        res = timerSet(K_MYACTOR, oneShotLid, K_TIMER_ABSOLUTE,
                       &oneShotTimer, NULL);
        if (res != K_OK) {
            printf("Cannot arm one shot timer, error %d\n", res);
            exit(1);
        }

        res = threadDelete(K_MYACTOR, K_MYSELF);
        if (res != K_OK) {
            printf("Cannot kill myself, error %d\n", res);
            exit(1);
        }
        return 0;
    }
The previous example includes the following steps:
The main thread sets up everything that is required, so that the two newly-created threads will respond to a single periodic timer of one second for a duration of thirty seconds.
The thirty-second period is bounded by a one-shot timer handled by the same pool of two threads.
Before starting, the current system time is printed.
When the thirty second timer has elapsed, the periodic timer is cancelled and the current system time is displayed again.
A small delay occurs before the actor terminates, enabling you to check that the periodic timer has been cancelled correctly.
This chapter demonstrates the use of the ChorusOS native application programming interfaces for handling thread exceptions, aborts, faults, traps, timeouts, and so forth. The APIs covered here include those related to each type of handler.
After implementing the examples provided in this chapter, you will understand how these APIs might be used in an application.
The ChorusOS operating system provides three kinds of exceptions:
Traps: generated voluntarily by the current thread. Traps are used to change the thread privilege level to execute system code.
Exceptions: generated involuntarily by the current thread, usually due to errors such as division by 0.
Panics: explicitly generated by specific microkernel modules or supervisor actors. Panics correspond to software/hardware faults which are not recoverable at the application level.
The core executive API provides basic services for subsystems to handle these three kinds of exceptions. For this purpose, the core executive exports an interface for subsystems to declare trap, exception, and panic handlers. Exception handlers are declared on a per actor basis, while trap handlers and panic handlers are declared on a site-wide basis.
The core executive API includes a number of exception handler system calls, described in the following table.
Table 13-1 Exception Handler System Calls
System Call | Purpose |
---|---|
svExcHandler() | Sets an actor's exception handler (compatible with the ChorusOS 4.x API) |
svActorExcHandlerConnect() | Connects an actor's exception handler |
svActorExcHandlerDisconnect() | Disconnects an actor's exception handler |
svActorExcHandlerGetConnected() | Gets an actor's exception handler |
A variety of microkernel system calls can be used to block a current thread when a request cannot be satisfied immediately or has to await a particular event. Thread blocking and wakeup operations are performed by microkernel features through the invocation of a microkernel internal interface.
A thread can be forced to exit the blocked state. Subsystem managers might need to awaken blocked threads prematurely to enable them to perform their function (to process asynchronous signals, for example). When a thread is blocked, it may be ABORTABLE, depending on the blocking primitive invoked and/or the arguments of this invocation.
The threadAbort() primitive forces a thread which is blocked in an ABORTABLE state to be awakened. Generally, a corresponding blocking call returns a specific error code, indicating an abort.
Because the invoker of threadAbort() may not know whether the thread is currently blocked or not, the microkernel will define the behavior of the threadAbort() call if the thread is active.
Therefore, the effect of threadAbort() depends on the state of the thread.
If a thread is blocked in an ABORTABLE microkernel call without an abort handler for its home actor, the aborted thread returns with the K_EABORT error code. If an abort handler is set, the error code is not returned and the abort handler is invoked.
If a thread is not blocked in an ABORTABLE state, the abort event will be recorded. The thread is said to be in an ABORTED state. The abort will be handled later.
When a thread is in an ABORTED state, it handles the abort if any of the following situations occurs:
The thread executes in its home actor (internal mode). In this case, the thread can execute an abort handler. Supervisor actors can attach abort handlers to other actors to trap the entry of threads of actors in an ABORTED state. If this type of handler is attached to the thread's home actor, it is invoked as soon as the thread executes in its home actor while in the ABORTED state.
If the thread was executing in a different actor or a microkernel call when aborted, and an abort handler is in effect, the handler will be invoked as soon as the thread returns to its home actor execution environment.
After the abort handler has been executed, the abort is considered to have been handled.
The thread invokes an ABORTABLE blocking primitive. In this case, the primitive returns with the K_EABORT error code. The abort has been handled.
The thread invokes the threadAborted() microkernel primitive. This primitive returns its current state (aborted or not) and clears the state.
Asynchronous execution control operations (threadAbort()) take effect immediately only if the target thread is currently running in internal mode, that is, if its current execution actor matches its home actor. If the thread has changed its execution actor (through a microkernel call, trap, or other mechanism), the operation is deferred until the execution actor is reset to the home actor, on return from the microkernel call or trap, for example. Deletion of a thread is also considered an asynchronous control operation in this sense. Thus, a thread is immune from deletion (except if it deletes itself) when its execution actor is changed for a trap or other cross-actor invocation. It is deleted only when it returns to its home actor.
The following examples illustrate the abort handling capabilities of the ChorusOS operating system.
This example enables an application to abort a thread whose identification is given as an argument. The first four parameters correspond to the capability of the thread home actor, and the fifth argument is the local identification of the thread in the home actor.
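    #include <stdio.h>
    #include <stdlib.h>
    #include <chorus.h>
    #include <utilities.h>

    KnCap actorCap;
    int   result;

    int main(int argc, char *argv[])
    {
        if (argc != 6) {
            fprintf(stderr, "bad call\n");
            exit(1);
        }
        /* argv[1] to argv[4] hold the capability of the home actor */
        readCap(argv + 1, &actorCap);
        /* argv[5] is the local identifier of the thread to abort */
        result = threadAbort(&actorCap, atoi(argv[5]));
        if (result < 0)
            fprintf(stderr, "error on threadAbort: %s\n", strSysError(result));
        return 0;
    }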
In the following example, a thread requests an abort of itself before it enters a loop and commits to an infinite abortable delay.
    #include <stdio.h>
    #include <chorus.h>

    int main()
    {
        int i, result;

        /* Request an abort of the current thread */
        threadAbort(K_MYACTOR, K_MYSELF);
        printf("Before looping\n");
        for (i = 0; i < 50000000; i++)
            ;    /* busy loop, not abortable */
        printf("After looping\n");
        result = threadDelay(K_NOTIMEOUT);
        if (result == K_EABORT)
            printf("threadDelay aborted\n");
        else
            printf("result = %d\n", result);
        return 0;
    }
The output of this example is:
    neon abortableDelay_u
    started aid = 23
    Before looping
    After looping
    threadDelay aborted
In the previous example, the abort request has no effect on the loop that is executed first. It takes effect as soon as the thread calls the threadDelay() primitive, which returns immediately with the K_EABORT error code.
The following example modifies the code used in the previous example to produce a non-abortable application (notAbortDel). The code has been modified as follows:
The request to auto-abort the thread is deleted.
The abortable call threadDelay(K_NOTIMEOUT) is replaced by the equivalent non-abortable call threadDelay(K_NOTIMEOUT_NOABORT).
The notAbortDel application is then executed using the following:
    neon-n notAbortDel_u &
    [1] 16123
    started aid = 23
    Before looping
    After looping
A request for the thread to abort is then made using the threadAbort application created in Example 13-1. The following session verifies that the request has no effect and that the thread or actor must be killed:
    $ rsh neon arun cs -la 23
    started aid = 22
    ChorusOS r5.0.0 Site 0 Time 1d 19h 14m 22
    ACTOR-UI KEY LID TYPE STATUS TH# NAME
    200000d0 869da80a 00000017 00000000 0023 USER STARTED 001 notAbortDel_u
    THREAD-LI PRIORITY TT IT CTX SC-MS-PN
    0007 140 00000430 00000430 ff6380 0- 1- 0 main
    ............................
    $ neon threadAbort_u 200000d0 869da80a 17 0 7
    started aid = 22
    $ rsh neon aps grep notAbort
    0 23 notAbortDel_u 0 N/A
    $ rsh neon akill 23
In the following example, an application called abortedState is created as follows:
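    #include <stdio.h>
    #include <chorus.h>

    int main()
    {
        int i, result;

        threadAbort(K_MYACTOR, K_MYSELF);
        printf("Before looping \n");
        for (i = 0; i < 100000; i++)
            ;
        printf("After looping\n");
        /* Second request: ignored, the thread is already aborted */
        threadAbort(K_MYACTOR, K_MYSELF);

        /* First query reports the aborted state and clears it */
        result = threadAborted();
        if (result == 1)
            printf("Aborted state\n");
        else
            printf("Non aborted\n");

        /* Second query finds the state already cleared */
        result = threadAborted();
        if (result == 1)
            printf("Aborted state\n");
        else
            printf("Non aborted\n");
        return 0;
    }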
The output of this example is:
    $ neon abortedState_u
    started aid = 23
    Before looping
    After looping
    Aborted state
    Non aborted
Note that abort requests are not accumulated. An abort request for a thread that is already in the aborted state will be ignored.
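The behavior shown by abortedState can be modeled as a single flag per thread. The sketch below is a stand-alone model in portable C, not microkernel code: SketchThread, sketch_abort, and sketch_aborted are hypothetical names that mimic only the "record once, report and clear" semantics described above.

```c
#include <stdbool.h>

/* Hypothetical model of one thread's abort state; not a kernel type. */
typedef struct {
    bool aborted;
} SketchThread;

/* Models an abort request on a thread that is not blocked: the request
 * is recorded, and repeated requests do not accumulate. */
void sketch_abort(SketchThread *t)
{
    t->aborted = true;
}

/* Models the state query: report whether an abort is pending (1 or 0)
 * and clear the state, so that the abort counts as handled. */
int sketch_aborted(SketchThread *t)
{
    int was = t->aborted ? 1 : 0;

    t->aborted = false;
    return was;
}
```

Two calls to sketch_abort() followed by two calls to sketch_aborted() return 1 and then 0, which mirrors the "Aborted state" and "Non aborted" lines in the output above.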
A specific handler is associated with a given trap. It receives, as an argument, the context of the calling thread when the trap occurred. Such a trap handler is defined system-wide, that is, once it has been defined, all actors invoke the same handler. The handler has access to the information required to identify the system call and to retrieve the arguments.
Only one handler may be attached to a given trap. After the handler is installed, it is automatically invoked. The ChorusOS operating system provides an interface that is based on the LAP mechanism for connecting and disconnecting trap handlers. This interface can be used by supervisor threads only and enables a specific trap handler to be connected to different trap numbers.
When a trap handler is called, its argument is a pointer to a KnSysTrapDesc object, which has the following fields:
The thread's context that is saved when the trap occurs. The values of the processor's registers can be accessed by the trap handler. The trap handler can also modify the register values before returning.
The number of the trap that was invoked.
The core executive API includes the following trap handler system calls:
Table 13-2 Trap Handler System Calls
System Call | Purpose |
---|---|
svSysTrapHandlerConnect() | Connects a trap handler |
svSysTrapHandlerDisconnect() | Disconnects a trap handler |
svSysTrapHandlerGetConnected() | Gets a trap handler |
svTrapConnect() | Connects a trap handler |
svTrapDisconnect() | Disconnects a trap handler |
svTrapConnect() and svTrapDisconnect() are implemented in the library and are provided for backward compatibility only. The new system calls svSysTrapHandlerConnect() and svSysTrapHandlerDisconnect() use a LAP handler for enhanced security.
The following three examples illustrate how to connect, disconnect, and get trap handlers in the ChorusOS operating system.
A trap handler specified as a LAP handler can be connected to a given trap number by calling the primitive:
    #include <exec/chTrap.h>
    int svSysTrapHandlerConnect(unsigned int trapNumber,
                                KnLapDesc *trapLapDesc);
This call returns K_OK if it is successful and returns a negative value if unsuccessful.
Value | Error |
---|---|
K_EBUSY | trapNumber is already connected |
K_EINVAL | trapNumber is invalid |
A trap handler previously connected to a trap number with a call to svSysTrapHandlerConnect() can be disconnected by calling the primitive:
    #include <exec/chTrap.h>
    int svSysTrapHandlerDisconnect(unsigned int trapNumber,
                                   KnLapDesc *currentTrapLapDesc);
The value of currentTrapLapDesc can be K_CONNECTED_LAP. If this is not the case, the value must point to a LAP descriptor that is identical to the LAP descriptor currently connected to the trap number.
The function returns K_OK if successful or a negative value if an error occurs.
Value | Error |
---|---|
K_EINVAL | currentTrapLapDesc is not K_CONNECTED_LAP and does not match the LAP descriptor of the current trap handler. |
A copy of the LAP descriptor corresponding to the trap handler that is currently connected to the trap number can be obtained by calling the primitive:
    #include <exec/chTrap.h>
    int svSysTrapHandlerGetConnected(unsigned int trapNumber,
                                     KnLapDesc *currentTrapLapDesc);
The function returns K_OK on success and K_EINVAL if an error occurs.
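The connect, disconnect, and get rules can be summarized with a small stand-alone model of a trap table. The sketch below is portable C and does not use LAP descriptors or any ChorusOS header: plain function pointers stand in for handlers, the table size and error values are placeholders, and all sketch_ names are hypothetical. Treating "no handler connected" as an invalid-argument error in the disconnect and get paths is an assumption.

```c
#include <stddef.h>

#define SKETCH_TRAP_COUNT 16     /* assumed table size */
#define SKETCH_OK      0
#define SKETCH_EBUSY  (-1)       /* placeholder error values */
#define SKETCH_EINVAL (-2)

typedef void (*SketchTrapHandler)(void *context);

/* One slot per trap number; NULL means no handler is connected. */
static SketchTrapHandler sketch_table[SKETCH_TRAP_COUNT];

/* Two do-nothing handlers used to exercise the table. */
void sketch_handler_a(void *context) { (void)context; }
void sketch_handler_b(void *context) { (void)context; }

/* Connect a handler; only one handler may be attached to a trap. */
int sketch_trap_connect(unsigned int trap, SketchTrapHandler handler)
{
    if (trap >= SKETCH_TRAP_COUNT || handler == NULL) {
        return SKETCH_EINVAL;
    }
    if (sketch_table[trap] != NULL) {
        return SKETCH_EBUSY;
    }
    sketch_table[trap] = handler;
    return SKETCH_OK;
}

/* Disconnect a handler. Passing NULL as expected plays the role of
 * the "currently connected" wildcard; otherwise it must match. */
int sketch_trap_disconnect(unsigned int trap, SketchTrapHandler expected)
{
    if (trap >= SKETCH_TRAP_COUNT || sketch_table[trap] == NULL) {
        return SKETCH_EINVAL;
    }
    if (expected != NULL && expected != sketch_table[trap]) {
        return SKETCH_EINVAL;
    }
    sketch_table[trap] = NULL;
    return SKETCH_OK;
}

/* Copy out the handler currently connected to a trap number. */
int sketch_trap_get(unsigned int trap, SketchTrapHandler *current)
{
    if (trap >= SKETCH_TRAP_COUNT || sketch_table[trap] == NULL) {
        return SKETCH_EINVAL;
    }
    *current = sketch_table[trap];
    return SKETCH_OK;
}
```

Requiring the caller to name the handler it expects to remove guards against one subsystem accidentally disconnecting a handler installed by another.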
This chapter demonstrates the use of the ChorusOS application programming interfaces that help when diagnosing and recovering from application or system failure. The APIs covered here include:
Software black box
Provides a robust repository for realtime collection and storage of a time ordered vector of historical event information to enable postmortem data analysis for system and application failures.
Application-level two-stage watchdog timer
Provides a two-stage watchdog timer. If the timer is not periodically reset, it will generate an interrupt to initiate a controlled processor restart and then, if the system cannot initiate a controlled restart, will trigger a hardware reset without software intervention (to ensure that the target restarts).
System logging
Provides a configurable system log service to enable the system and applications to log errors, warnings, or other messages to local or remote files.
After implementing the examples provided in this chapter, you should understand how these APIs can be used in an application.
The ChorusOS black box enables applications to read the event buffer and log events. The black box feature can be defined as a set of microkernel ring buffers. Multiple black boxes can be configured, each identified with a unique integer identifier. The number and sizes of black boxes on your ChorusOS system are tunable. A specified black box can be frozen and events directed to another black box.
Events are tagged with specific information (event producer identifier, event producer name, filter data, and timestamp). Some filtering is performed before events are inserted into a black box. Events can be filtered in or out based on the event producer name, and on the severity of the event. Descriptive event tags can be added to a black box along with each event to enable finer grained post-process filtering.
All black box APIs are thread-safe. Calling any one of these APIs will produce valid results, even if the call is made by multiple threads. The bb_event and bb_freeze calls are async-signal-safe. This means that they can safely be called from a signal handler, and cannot be interrupted by a signal. No black box APIs are cancel-safe or interruptible.
At node startup, a specific number of black boxes are allocated. The node will select one of these black boxes to be active. This is performed sufficiently early in the boot process so that all user applications are able to log events to it as soon as they start. Some black box parameters are configurable at node startup time, that is before applications start running.
Before an application component manager launches an application, it prepares to catch the application in case it fails unexpectedly. If the application does fail, the component manager logs information about the failure to the black box, and triggers a freeze of the current black box.
Any application can log events in a black box at any point by calling bb_event. If multiple entities on the node are generating black box events, a specific thread may have fewer events than expected recorded in the black box.
At some point, an event may occur which requires freezing the black box. This is accomplished by calling bb_freeze. When the current black box is frozen, a new black box is chosen (if one is available), and new events are routed into it.
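The life cycle described above (log, freeze, switch to another black box, release) can be pictured with a small stand-alone model. This is only a sketch and does not use the ChorusOS black box API: the names toy_event, toy_freeze, toy_release, and toy_count, the two-box configuration, and the four-slot buffer size are all hypothetical, as is the assumption that releasing a box clears it. The model also shows why a thread may find fewer of its events than expected: a ring buffer overwrites its oldest entries when it is full.

```c
#include <string.h>

#define BB_COUNT   2    /* hypothetical number of black boxes */
#define BB_SLOTS   4    /* hypothetical number of events per box */
#define BB_MSG_LEN 32   /* hypothetical maximum message length */

typedef struct {
    char     msg[BB_SLOTS][BB_MSG_LEN];
    unsigned next;      /* total number of events written so far */
    int      frozen;    /* nonzero once the box has been frozen */
} toy_bb_t;

static toy_bb_t boxes[BB_COUNT];
static int active = 0;  /* index of the active box, -1 if none */

/* Log into the active box, overwriting the oldest slot when full. */
int toy_event(const char *msg)
{
    toy_bb_t *bb;
    char *slot;

    if (active < 0)
        return -1;                      /* no box available */
    bb = &boxes[active];
    slot = bb->msg[bb->next % BB_SLOTS];
    strncpy(slot, msg, BB_MSG_LEN - 1);
    slot[BB_MSG_LEN - 1] = '\0';
    bb->next++;
    return 0;
}

/* Freeze the active box and select another one if available.
 * Returns the index of the box that was frozen, or -1. */
int toy_freeze(void)
{
    int frozen = active, i;

    if (active < 0)
        return -1;
    boxes[active].frozen = 1;
    active = -1;
    for (i = 0; i < BB_COUNT; i++) {
        if (!boxes[i].frozen) {
            active = i;
            break;
        }
    }
    return frozen;
}

/* Release a frozen box so that it can be selected again
 * (assumption: the contents are cleared on release). */
void toy_release(int id)
{
    memset(&boxes[id], 0, sizeof boxes[id]);
    if (active < 0)
        active = id;
}

/* Number of events currently retained in a box. */
unsigned toy_count(int id)
{
    return boxes[id].next < BB_SLOTS ? boxes[id].next : BB_SLOTS;
}
```

In this model, logging six events into a four-slot box leaves only the four most recent ones, and freezing both boxes makes toy_event fail until one of them is released.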
The criteria by which event filtering is performed can be modified dynamically. See "Filtering APIs" for details on the capabilities of the filtering APIs.
A system process can locate a frozen black box, open it, and read its contents to diagnose a failure. After the failure has been diagnosed, the process calls bb_release to allow the system to reselect that black box.
The black box API can also be used to store microkernel-level information. In this case the black box logs exceptions, traps, panics, and failed system calls. To use this feature, the kern.blackbox.kernelLogging tunable must be set to 1.
When the HOT_RESTART feature is selected, persistent memory can be used to store the black box data. To use persistent memory in this way, the BSP must be compiled with the -DBB_PERSISTENT flag.
When the BSP is compiled with the -DBB_PERSISTENT flag, the total size allocated for the black boxes is fixed and equally distributed for the specified number of black boxes.
Example 14-1 demonstrates the basic use of the black box API in an application.
    #include <errno.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/blackbox.h>

    #define SEV_LOW   4
    #define SEV_HIGH  24
    #define SEV_EMERG 31

    int main(int argc, char **argv)
    {
        int fd;

        bb_event("main", SEV_LOW, "Starting with %d args", argc);
        if (argc > 1) {
            bb_event("main", SEV_LOW, "argv[1] = %s", argv[1]);
        } else {
            bb_event("main", SEV_EMERG, "no arguments exiting");
            exit(1);
        }

        /* open() returns -1 on failure */
        if ((fd = open(argv[1], O_RDWR)) == -1) {
            bb_event("main", SEV_HIGH, "failed to open \"%s\": error %d",
                argv[1], errno);
        }
        ...
    }
Depending on the node's filtering configuration, some of the events in this example may not be added to the black box.
The filtering APIs include the following interfaces:
bb_getfilters
bb_getprodids
bb_getseverity
bb_setfilters
bb_setprodids
bb_setseverity
BB_SEV_CLEAR
BB_SEV_CLEAR_LIMIT
BB_SEV_SET
BB_SEV_SET_LIMIT
BB_SEV_TEST
You can specify an event producer to use the filter list and filtered severity bitmap (also known as the fine grained filters) by identifying the producer in a set passed to bb_setprodids.
The filter list contains a set of pairs of tag and severity. These entities are described in the definition of bb_getseverity. An event is entered into the black box if all of the following are true:
The call to bb_event has a tag that matches a tag in the filter list.
The severity of the call is enabled in the tag's severity bitmap in the filter list.
The caller of bb_event has been enabled to use the filter list.
The filtered severity bitmap is a node-wide severity bitmap. An event will be entered into the black box if both of the following are true:
The call to bb_event has a severity that is enabled in the filtered severity bitmap.
The caller of bb_event has been enabled to use the filtered severity bitmap.
The global severity bitmap is also a node-wide severity bitmap. If a call to bb_event does not find a match in the filter list or the filtered severity bitmap, or if the caller is not using these filters, the bb_event call will fall back to using the global severity bitmap. An event will be entered into the black box if the call to bb_event has a severity that is enabled in the global severity bitmap.
The global severity bitmap can be modified by calling bb_getseverity, modifying the bitmap, and passing the new bitmap to bb_setseverity.
The three filters are used in order, from the most to the least specific:
filter list
filtered severity bitmap
global severity bitmap
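The order in which the three filters are consulted can be sketched as a small decision function. This is a stand-alone model, not the black box implementation: toy_filter_t, toy_config_t, sev_limit, sev_test, and toy_accept are hypothetical names, the bitmap is assumed to hold one bit for each severity from 0 to 31, and sev_limit assumes that a "limit" enables the given severity and every higher one (an interpretation inferred from the example traces in this section, not from the API definition).

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t sev_bitmap_t;      /* one bit per severity, 0..31 */

typedef struct {
    const char  *tag;               /* tag to match, e.g. "libA:foo_init" */
    sev_bitmap_t severity;          /* severities enabled for this tag */
} toy_filter_t;

typedef struct {
    const toy_filter_t *filters;    /* filter list */
    size_t              nfilters;
    sev_bitmap_t        filtered;   /* filtered severity bitmap */
    sev_bitmap_t        global;     /* global severity bitmap */
} toy_config_t;

/* Enable 'level' and all higher severities (assumed meaning of a limit). */
sev_bitmap_t sev_limit(unsigned level)
{
    return level >= 32 ? 0 : (sev_bitmap_t)(~(sev_bitmap_t)0 << level);
}

/* Nonzero if severity 'sev' is enabled in 'map'. */
int sev_test(sev_bitmap_t map, unsigned sev)
{
    return sev < 32 && ((map >> sev) & 1u);
}

/* Returns 1 if the event would be recorded, 0 if it would be dropped.
 * 'fine_grained' is nonzero when the producer has been enabled to use
 * the filter list and the filtered severity bitmap. */
int toy_accept(const toy_config_t *cfg, int fine_grained,
               const char *tag, unsigned sev)
{
    size_t i;

    if (fine_grained) {
        /* 1. filter list: tag must match and severity must be enabled */
        for (i = 0; i < cfg->nfilters; i++) {
            if (strcmp(cfg->filters[i].tag, tag) == 0 &&
                sev_test(cfg->filters[i].severity, sev))
                return 1;
        }
        /* 2. filtered severity bitmap */
        if (sev_test(cfg->filtered, sev))
            return 1;
    }
    /* 3. fall back to the global severity bitmap */
    return sev_test(cfg->global, sev);
}
```

With a filter for the tag "libA:foo_init" at limit 8 and a global limit of 28, this model accepts a severity 8 event tagged "libA:foo_init" but drops a severity 8 event carrying any other tag, because only the global bitmap applies to it.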
The following example illustrates the use of tags in libraries and some filtering interfaces. It also shows what a black box dump should look like.
    #include <stdlib.h>
    #include <sys/blackbox.h>

    #define SEV_TRACE 8
    #define SEV_INIT  15
    #define SEV_HIGH  28
    #define SEV_OOM   30

    int foo_init(foo_t **fp, hat_t *hp)
    {
        hat_t *nhp;
        bar_t *bp;
        uint_t total_reqs, i;

        *fp = NULL;
        if (hp == NULL) {
            bb_event("libA:foo_init", SEV_TRACE,
                "hat pointer is NULL, skipping");
            return (0);
        }

        /* count the requests over the whole hat list */
        for (nhp = hp, total_reqs = 0; nhp != NULL; nhp = nhp->hat_next)
            total_reqs += nhp->hat_reqs;
        bb_event("libA:foo_init", SEV_TRACE, "total of %d requests",
            total_reqs);
        if (total_reqs == 0)
            return (0);

        *fp = (foo_t *)malloc(sizeof (foo_t));
        if (*fp == NULL) {
            bb_event("libA:foo_init", SEV_OOM, "out of memory for foo_t");
            return (-1);
        }
        bp = (bar_t *)malloc(total_reqs * sizeof (bar_t));
        if (bp == NULL) {
            bb_event("libA:foo_init", SEV_OOM, "no memory for %d bars",
                total_reqs);
            free(*fp);
            return (-1);
        }

        (*fp)->foo_array = bp;
        (*fp)->foo_hatptr = hp;
        (*fp)->foo_reqcnt = total_reqs;

        /* give each hat its own slice of the bar array */
        for (nhp = hp; nhp != NULL; nhp = nhp->hat_next) {
            nhp->hat_array = bp;
            for (i = 0; i < nhp->hat_reqs; i++)
                bp[i].bar_id = i;
            bp += nhp->hat_reqs;
        }
        bb_event("libA:foo_init", SEV_INIT, "foo at %p created", *fp);
        return (0);
    }
Within the program itself:
    int process_hatlist(hat_t *hp)
    {
        foo_t *fp;

        bb_event("process_hatlist", SEV_TRACE, "received hat list %p", hp);
        if (foo_init(&fp, hp) != 0) {
            bb_event("process_hatlist", SEV_HIGH,
                "can't make foo for hat list %p", hp);
            return (-1);
        }
        if (fp != NULL) {
            foo_getreqs(fp);
            foo_enqueue(fp);
        }
        return (0);
    }
Given the program from the previous example, the following examples illustrate filters and the potential black box traces that they may generate. (Note that this is simply an example trace format and may bear no resemblance to any output an administrative CLI program might display.)
Filter:
    bb_filter_t filts[] = {
        { 0, "libA:foo_init" },
    };
    bb_prodid_t prods[] = {
        { BB_ALL_PROD, BB_ALL_PIDS }
    };
    bb_severity_t sev;

    BB_SEV_SET_LIMIT(SEV_TRACE, &filts[0].bbf_severity);
    BB_SEV_SET_LIMIT(SEV_HIGH, &sev);
    bb_setfilters(filts, 1);
    bb_setprodids(prods, 1);
    bb_setseverity(sev);
Trace:
    2000-03-15 18:56:02.928980 hatrouter[2583/1]: libA:foo_init(8): total of 93 requests
    2000-03-15 18:56:02.929301 hatrouter[2583/1]: libA:foo_init(15): foo at 0x78a89380 created
    2000-03-15 18:56:03.017327 foodb[2448/1]: libA:foo_init(8): total of 93 requests
    2000-03-15 18:56:03.017936 foodb[2448/1]: libA:foo_init(15): foo at 0x300004bc8d8 created
In the previous example, the filter and trace indicate that foo_init was successful for both producers. No errors occurred and the chain of 93 hat_ts was successfully passed from the hatrouter to the foodb. The number after the tag libA:foo_init indicates the severity level of the event.
Filter:
    bb_filter_t filts[] = {
        { 0, BB_ALL_TAGS },
    };
    bb_prodid_t prods[] = {
        { "hatrouter", BB_ALL_PIDS }
    };
    bb_severity_t sev;

    BB_SEV_SET_LIMIT(SEV_TRACE, &filts[0].bbf_severity);
    BB_SEV_SET_LIMIT(SEV_HIGH, &sev);
    bb_setfilters(filts, 1);
    bb_setprodids(prods, 1);
    bb_setseverity(sev);
Trace:
    2000-03-15 19:33:28.139821 hatrouter[2583/1]: process_hatlist(8): received hat list 0x7f78a190
    2000-03-15 19:33:28.148295 hatrouter[2583/1]: libA:foo_init(30): out of memory for 28173 bar's
    2000-03-15 19:33:28.148376 hatrouter[2583/1]: process_hatlist(28): can't make foo for hat list 0x7f78a190
In the previous example, the filter and trace suggest that the hatrouter received an extremely long chain of hat_ts and could not allocate enough bar_ts to back them up.
The watchdog timer API provides a set of system calls that enable a ChorusOS actor to manage a two-stage watchdog timer. The two-stage watchdog timer can be used by ChorusOS applications or middleware to monitor their sanity.
A user or supervisor actor takes control of the timer by invoking the watchdog timer API. When under the control of an application, the two-stage watchdog timer must be reloaded periodically. If the first-stage timer expires, an interrupt is triggered, enabling the collection of diagnostic information by means of the system dump feature. The system is then restarted.
If the system freezes before the completion of stage one, it will be reset when the second-stage timer expires.
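The relationship between the two stages can be summarized with a small stand-alone model. This is a sketch only and does not use the watchdog timer API: wd_outcome_t and wd_outcome are hypothetical names, the timeouts are expressed in whole seconds for simplicity, and the model assumes that the stage-two timeout is longer than the stage-one timeout.

```c
typedef enum {
    WD_OK,                  /* timer reloaded in time, nothing happens */
    WD_CONTROLLED_RESTART,  /* stage one: interrupt, dump, restart */
    WD_AWAITING_RESET,      /* system frozen, stage two not yet expired */
    WD_HARDWARE_RESET       /* stage two: reset without software help */
} wd_outcome_t;

/*
 * since_pat     seconds elapsed since the timer was last reloaded
 * stage1        stage-one (interrupt mode) timeout, in seconds
 * stage2        stage-two (reset mode) timeout, in seconds;
 *               assumed to be greater than stage1
 * can_interrupt nonzero if the system can still service the
 *               stage-one interrupt (that is, it is not frozen)
 */
wd_outcome_t wd_outcome(unsigned since_pat, unsigned stage1,
                        unsigned stage2, int can_interrupt)
{
    if (since_pat < stage1)
        return WD_OK;
    if (can_interrupt)
        return WD_CONTROLLED_RESTART;
    if (since_pat >= stage2)
        return WD_HARDWARE_RESET;
    return WD_AWAITING_RESET;
}
```

For example, with a stage-one timeout of 10 seconds and a stage-two timeout of 15 seconds, an application that stops reloading the timer on a responsive system leads to a controlled restart after 10 seconds, whereas a frozen system is reset by the hardware after 15 seconds.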
The two-stage watchdog timer uses two watchdog devices. The first device operates in interrupt mode (stage one), while the second operates in reset mode (stage two). The first-stage interrupt handler is system-wide and is therefore not exposed to the user, as shown in Figure 14-1.
When the watchdog timer is enabled, the second stage reset-mode watchdog device is armed and is running all the time, even when the two-stage watchdog timer is unallocated and not controlled by an application.
In this situation, the running watchdog device is reloaded silently by the watchdog API. It is never subsequently disarmed. This ensures that the system is guarded continually against system lockups.
If a process is unscheduled for a long time, for example, as the result of a debugging session, a system dump will occur (if implemented). Following the system dump, the system will be reset on timeout of the first-stage and second-stage watchdog devices.
The watchdog timer API includes the following system calls. Note that in this table, the watchdog timer refers to the watchdog timer named by handle.
Table 14-1 Watchdog Timer API System Calls
System Call | Purpose |
---|---|
wdt_alloc() | Returns a valid handle that identifies the allocated watchdog timer |
wdt_realloc() | Reallocates the watchdog timer that was returned by the last call to wdt_alloc(). This enables a new actor created in a highly available (HA) environment to take over watchdog management for an actor that has died. If the watchdog timer is armed, it will also be reloaded by this call. |
wdt_free() | Disarms and releases the watchdog timer |
wdt_get_maxinterval() | Returns the maximum timeout interval that can be set for the timer |
wdt_set_interval() | Sets the timeout interval for the watchdog timer. If both components of interval are zero, the timer is disarmed. |
wdt_get_interval() | Returns the current timeout interval set for the watchdog timer |
wdt_arm() | Starts a new timeout interval, the duration of which will be set by wdt_set_interval() |
wdt_disarm() | Disarms the watchdog timer |
wdt_is_armed() | Returns the state of the watchdog timer. A positive value is returned if the timer is armed. |
wdt_pat() | Reloads the watchdog timer. The timer begins a new timeout interval, the duration of which is set by wdt_set_interval(). |
wdt_startup_commit() | Indicates whether the watchdog timer startup sequence has completed successfully. Successful completion denotes that the system no longer needs to be rebooted if the reset mode watchdog device armed by the boot framework expires. Because the reset mode watchdog device must not be disarmed, the system will continue to reload the reset mode device silently until it is shut down, or until the watchdog timer is explicitly allocated and armed by the HA framework (or by another application). |
wdt_shutdown() | Indicates whether the system is being shut down. During shutdown, the reset mode watchdog device must not be reloaded for more than the configured timeout interval. This will ensure that the system is reset even if the shutdown sequence does not complete within the expected time period. This call will fail if it is invoked while the watchdog device is armed. |
The watchdog API is typically used to monitor the progress of an application. In the following example, a thread is dedicated to the watchdog.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <errno.h>
    #include <assert.h>
    #include <util/chKnTimeVal.h>
    #include <sched/chSched.h>
    #include <exec/chExec.h>
    #include <sync/chSem.h>
    #include <cx/wdt.h>

    #define WDT_RUNS        10   /* number of times the test is run */
    #define WDT_PAT_TIMES   3    /* number of times the timer
                                  * is patted at each interval */
    #define WDT_PAT_INC     10   /* timeout interval increment */
    #define WDT_PAT_MIN     5    /* initial, minimum timeout interval */
    #define WDT_STACKSIZE   1000 /* worker thread stack size */
    #define WDT_THREAD_PRIO 3    /* worker thread (very high) priority */
    #define WDT_EXTRA_SLEEP 10   /* extra delay to check that the timer
                                  * is truly disarmed */

    wdt_handle_t wdt_handle = 0; /* watchdog timer handle */

    /*
     * sleep for 'secs' seconds, displaying a symbol at each elapsed second
     */
    void wdt_sleep(int secs)
    {
        KnTimeVal tv = { 1, 0 };
        int i;

        putchar('[');
        for (i = 0; i < secs; i++) {
            threadDelay(&tv);
            if (i && (i % 60 == 59))
                putchar('#');
            else if (i && (i % 10 == 9))
                putchar('|');
            else
                putchar('.');
            fflush(stdout);
        }
        printf("]\n");
    }

    /*
     * check the return value of a system call
     */
    void wdt_check(char* fn, int res)
    {
        printf("%s: ", fn);
        if (res < 0)
            printf("ERROR: res = %d, errno = %d (%s)\n",
                   res, errno, strerror(errno));
        else
            printf("SUCCESS: res = %d\n", res);
    }

    /*
     * arm, pat, then disarm the watchdog timer
     */
    int wdt_arm_pat_disarm(int min, int max, int inc, int times, int armed)
    {
        timespec_t tv;
        int res, intv, wait, pats;

        intv = min;
        while (intv < max) {
            tv.tv_sec = intv;
            tv.tv_nsec = 0;

            /* timer already armed? */
            if (!armed) {
                printf("\nsetting interval to %d s...\n", intv);
                res = wdt_set_interval(wdt_handle, &tv);
                wdt_check("wdt_set_interval()", res);
                if (res < 0) return res;

                printf("arming...\n");
                res = wdt_arm(wdt_handle);
                wdt_check("wdt_arm()", res);
                if (res < 0) return res;
            }

            printf("patting...\n");
            for (pats = 0; pats < times; pats++) {
                /* pat 1 sec before the timer expires */
                wait = intv - 1;
                printf("sleeping for %d s\n", wait);
                wdt_sleep(wait);
                res = wdt_pat(wdt_handle);
                wdt_check("wdt_pat()", res);
                if (res < 0) return res;
            }

            printf("disarming...\n");
            res = wdt_disarm(wdt_handle);
            wdt_check("wdt_disarm()", res);
            if (res < 0) return res;
            armed = 0;

            /* wait 'WDT_EXTRA_SLEEP' secs more than last configured interval */
            wait = intv + WDT_EXTRA_SLEEP;
            printf("checking that the watchdog timer is stopped...\n");
            printf("sleeping for %d s...\n", wait);
            wdt_sleep(wait);

            intv += inc;
        }
        return 0;
    }

    static KnSem wdt_sem;

    /*
     * worker thread
     */
    void wdt_thread()
    {
        timespec_t max;
        int armed, i, res, min;

        for (i = 0; i < WDT_RUNS; i++) {
            /* Allocate watchdog timer. If timer is already allocated
               (busy), a previous instance of the actor was killed.
               In this case, reallocate timer and, if it is armed,
               continue the test without disarming it */
            res = wdt_alloc(&wdt_handle);
            wdt_check("wdt_alloc()", res);
            if (res < 0 && errno == EBUSY) {
                printf("watchdog timer busy -> reallocating it\n");
                /* Reallocate timer, using the magic handle (as we have
                   no knowledge of the actual handle previously allocated).
                   this call also pats the timer if it is armed */
                wdt_handle = WDT_MAGIC_HANDLE;
                res = wdt_realloc(&wdt_handle);
                wdt_check("wdt_realloc()", res);
            }
            printf("--> watchdog handle = %d\n", wdt_handle);
            if (res < 0) break;

            /* the timer may be already armed when we reallocate it */
            res = armed = wdt_is_armed(wdt_handle);
            wdt_check("wdt_is_armed()", res);
            if (res < 0) break;
            printf("--> the watchdog timer is currently%s armed\n",
                   armed ? "" : " NOT");

            /* ok, get current timeout interval of the armed timer */
            if (armed) {
                timespec_t intv;
                res = wdt_get_interval(wdt_handle, &intv);
                wdt_check("wdt_get_interval()", res);
                if (res < 0) break;
                printf("--> current interval = %d.%09d s\n",
                       intv.tv_sec, intv.tv_nsec);
                /* continue test at point where preceding actor was killed */
                min = intv.tv_sec;
            } else {
                min = WDT_PAT_MIN;
            }

            /* then get the maximum allowed timeout interval */
            res = wdt_get_maxinterval(wdt_handle, &max);
            wdt_check("wdt_get_maxinterval()", res);
            printf("--> max interval = %d.%09d s\n", max.tv_sec, max.tv_nsec);
            if (res < 0) break;

            res = wdt_arm_pat_disarm(min, max.tv_sec - WDT_EXTRA_SLEEP,
                                     WDT_PAT_INC, WDT_PAT_TIMES, armed);
            if (res < 0) break;

            res = wdt_free(wdt_handle);
            wdt_check("wdt_free()", res);
            if (res < 0) break;
        }

        /* test completed */
        semV(&wdt_sem);
        threadDelete(K_MYACTOR, K_MYSELF);
    }

    static long wdt_stack[WDT_STACKSIZE]; /* watchdog thread stack */

    int main()
    {
        KnThreadLid childLid;
        KnThreadDefaultSched schedParam;
        KnDefaultStartInfo_f startInfo;
        KnActorPrivilege actorPriv;
        int res;

        semInit(&wdt_sem, 0);

        actorPrivilege(K_MYACTOR, &actorPriv, NULL);
        startInfo.dsPrivilege = (actorPriv == K_SUPACTOR) ?
            K_SUPTHREAD : K_USERTHREAD;
        startInfo.dsType = K_DEFAULT_START_INFO;
        startInfo.dsSystemStackSize = K_DEFAULT_STACK_SIZE;
        startInfo.dsUserStackPointer = &wdt_stack[WDT_STACKSIZE];
        startInfo.dsEntry = (KnPc) wdt_thread;

        /* heavily increase the priority of the worker thread to ensure
           that it will not be later delayed by other applicative
           threads (if any) */
        schedParam.tdClass = K_SCHED_DEFAULT;
        schedParam.tdPriority = WDT_THREAD_PRIO;

        /* create the worker thread */
        res = threadCreate(K_MYACTOR, &childLid, K_ACTIVE,
                           &schedParam, &startInfo);
        assert(res == 0);

        /* then wait for the completion of the test */
        semP(&wdt_sem, K_NOTIMEOUT);
        return 0;
    }
The ChorusOS operating system provides a general message logging feature (syslog) which is used by the microkernel and applications. This logging feature supports logging console activity on a target system and provides a way to keep log information on persistent storage (disk).
The Syslog feature consists of a logging server (syslogd) and a device into which messages are written. Distribution of messages is controlled by the configuration file syslog.conf. The syslog APIs enable applications to send messages to the syslog.
System logging enables the microkernel and other applications to produce error, warning, and other messages. The messages are saved in a file in append-only mode and can be viewed at a later time. Log file data are never overwritten. The logging service modifies the file by appending new data.
To use the system logging feature, an application writes log messages tagged with a specific header (including the facility and the severity). These messages are then sent to the logging service. The local configuration of the logging service defines how the messages are processed. The criteria used to select how a message is processed are based on the values in the Facility and/or Severity fields. Depending on these values, a message is handled in one of the following ways:
discarded (filtered out)
appended to a local file
forwarded to another node where it will be stored in a file
The following example shows a basic LOG_ALERT message.
The syslog() call logs a message at priority LOG_ALERT.
The FTP daemon ftpd makes a call to openlog() to announce that all messages it logs should have the identifying string, ftpd. It also announces that messages should be treated by syslogd in the same way that other messages from system daemons are treated, and should include the process ID of the process logging the message.
The FTP daemon then makes a call to setlogmask() to indicate that only messages with priorities from LOG_EMERG through LOG_ERR should be logged. Messages with any other priority will not be logged.
The FTP daemon then calls syslog() to log a message at priority LOG_INFO.
    syslog(LOG_ALERT, "who: internal error 23");

    openlog("ftpd", LOG_PID, LOG_DAEMON);
    setlogmask(LOG_UPTO(LOG_ERR));
    syslog(LOG_INFO, "Connection from host %d", CallingHost);
A utility written locally would use the following call to syslog() to log a message at priority LOG_INFO. The message would be treated by syslogd in the same way that other messages to the facility LOG_LOCAL2 are treated:
syslog(LOG_INFO|LOG_LOCAL2, "error: %m");
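The effect of the log mask on the calls shown above can be checked with a short sketch that uses only the standard syslog interfaces (openlog(), setlogmask(), syslog(), closelog()). The helper names priority_enabled and log_with_mask, and the message texts, are hypothetical and are introduced here only for illustration. Note that with the mask LOG_UPTO(LOG_ERR), a message logged at priority LOG_INFO is filtered out before it reaches the logging service.

```c
#include <syslog.h>

/* Nonzero if a message of 'priority' passes the log mask 'mask'. */
int priority_enabled(int mask, int priority)
{
    return (mask & LOG_MASK(priority)) != 0;
}

/*
 * Open the log as in the ftpd example, restrict logging to the
 * priorities LOG_EMERG through LOG_ERR, then log one message that
 * passes the mask and one that does not. Returns the log mask that
 * was in effect while logging.
 */
int log_with_mask(const char *ident)
{
    int mask;

    openlog(ident, LOG_PID, LOG_DAEMON);
    setlogmask(LOG_UPTO(LOG_ERR));

    syslog(LOG_ERR, "example: this message passes the mask");
    syslog(LOG_INFO, "example: this message is filtered out");

    mask = setlogmask(0);   /* a zero argument only queries the mask */
    closelog();
    return mask;
}
```

An application that needs informational messages to reach the log would widen the mask, for example with setlogmask(LOG_UPTO(LOG_INFO)), before calling syslog().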
This chapter deals with the process restart and persistent memory services available on ChorusOS systems and demonstrates the use of the ChorusOS APIs for increasing the availability of application processes.
After implementing the examples provided in this chapter, you will understand how these APIs might be used in an application.
For additional information on ChorusOS hot restart and its features, refer to the ChorusOS 5.0 Features and Architecture Overview.
This section describes how to set up a ChorusOS system to use the hot restart feature. It covers the following:
Configuring the ChorusOS system for hot restart.
Running the graphical hot restart demonstration program provided with the Sun Embedded Workshop software.
This chapter assumes that you have already correctly installed the Sun Embedded Workshop software on a host machine, and that you have a target machine which can be booted from a network boot server. You should also be familiar with configuring your ChorusOS system and building a system image. For more information on these topics, see Chapter 4, Building Makefiles and Configuring the System Image.
Before beginning to program and run processes that will use the hot restart feature, you must update and configure your system for hot restart. System configuration for hot restart involves the following steps:
Including the necessary ChorusOS optional features in your system
Ensuring that the settings for the tunable parameters used by the hot restart feature are suitable for your system.
These steps are described in the following sections.
To incorporate hot restart into your ChorusOS system, use the ews graphical tool or the configurator(1CC) command line utility to include the following optional features in your system profile:
HOT_RESTART. This feature exports the hot restart API and restart mechanism.
LAPSAFE and LAPBIND. These features provide the necessary support for the HOT_RESTART feature.
The HOT_RESTART feature implements persistent memory as a portion of the random access memory (RAM) on the target device. Although the persistent memory bank does not use virtual memory or swapping, HOT_RESTART is compatible with all three main memory models: flat, protected, and virtual.
The size of the persistent memory bank is defined in bytes by the system tunable parameter, pmm.rambankSize. The value of this parameter is static and cannot be modified while the system is running. In addition, because the RAM persistent memory bank does not use virtual memory or swapping, objects in persistent memory are locked in memory until they are freed. For these two reasons, it is important to ensure that pmm.rambankSize is set to a value realistic for the amount of data likely to be stored in persistent memory at any given time.
A portion of space reserved for an object in the persistent memory bank is known as a persistent memory block. A block is a contiguous set of memory pages, which means that the size of a block is always a multiple of the page size. For more information on the page size relevant to your platform, see the vmPageSize(2K) man page.
For each restartable process that is running, the system stores the following data in persistent memory:
The text and initialized data which were loaded into memory from stable storage. This is known as the process image. The process image occupies a single block of persistent memory.
The executed text, initialized data, and BSS (data initialized to zero), from which the process is running. This is known as the executing image of the process. The executing image occupies two blocks of persistent memory: one block for the text and one block for the data. The heap and stack for the executing process are stored in non-persistent memory.
Although it can be difficult to predict the likely required value of pmm.rambankSize in the early stages of the development cycle, the following rule of thumb, derived from the previously mentioned statements, may be of use to developers at the system design stage:
Restartable processes require an absolute minimum of twice their size in persistent memory. This minimum accommodates the process image and the executing image of the process (although it does not account for the rounding up of each memory block to a whole number of pages).
The process can also allocate additional memory. Therefore, pmm.rambankSize should be greater than twice the combined size of the restartable processes expected to run simultaneously.
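The rule of thumb above can be turned into a rough estimate. The following sketch is an illustration under stated assumptions, not a ChorusOS interface: round_up_to_page and restartable_footprint are hypothetical helper names, the page size is passed in as a parameter (see vmPageSize(2K) for the value on your platform), and the three blocks are sized according to the description of the process image and executing image given earlier in this section.

```c
#include <stddef.h>

/* Round 'bytes' up to a whole number of pages. */
size_t round_up_to_page(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size * page_size;
}

/*
 * Estimate the persistent memory occupied by one restartable process,
 * before it allocates any persistent memory blocks of its own:
 *   - one block for the process image (text + initialized data)
 *   - one block for the executing text
 *   - one block for the executing data (initialized data + BSS)
 */
size_t restartable_footprint(size_t text, size_t data, size_t bss,
                             size_t page_size)
{
    size_t process_image = round_up_to_page(text + data, page_size);
    size_t exec_text     = round_up_to_page(text, page_size);
    size_t exec_data     = round_up_to_page(data + bss, page_size);

    return process_image + exec_text + exec_data;
}
```

For example, assuming 4096-byte pages, a process with 10000 bytes of text, 3000 bytes of initialized data, and 2000 bytes of BSS occupies 16384 + 12288 + 8192 = 36864 bytes in this model. Summing this estimate over all restartable processes expected to run simultaneously, and adding the blocks they allocate themselves, gives a lower bound for pmm.rambankSize.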
Sharing persistent memory blocks between user processes, or between user and supervisor processes is not supported. Persistent memory blocks can only be shared between supervisor processes.
The default value of the pmm.rambankSize tunable parameter is one megabyte.
The HOT_RESTART feature uses a number of system tunable parameters. Each parameter has a default value which can serve as a guideline, and is generally suitable for getting started with hot restart programming. All tunable parameters are static -- they cannot be modified while the system is running.
Two parameters define the limits for persistent memory occupation of the system's persistent memory bank:
pmm.rambankSize is the maximum amount of persistent memory available in the system (in bytes). The default value is one megabyte (0x100000). See the previous section for guidelines on setting this parameter to suit your system. To run the hot restart demonstration program, you will need to increase the value of this parameter to four megabytes (0x400000).
pmm.maxBlocks is the maximum number of recorded persistent memory blocks which can be allocated in the persistent memory bank. A block is a variable-sized number of contiguous pages of RAM. Each time a process (supervisor or user) issues a request to store a piece of data in persistent memory, a block of the appropriate size, rounded up to the nearest whole page, is allocated. The default value is 30.
Two parameters control the maximum number of restartable processes and restart groups permitted in a system:
hrCtrl.maxprocesses is the maximum number of hot restartable processes which can be registered in the system. A process is registered in the system when it is first run, and remains registered until all processes in its group have terminated cleanly. The default value is 32. If hrCtrl.maxprocesses is set to a value greater than 65536, the value 65536 is used instead.
hrCtrl.maxGroups is the maximum number of restart groups that can be present in the system at the same time. Its default value is 32.
Two parameters define the system's restart policy (see "Site Restart"). These parameters are fairly sensitive -- different values can produce very different behavior in the system. The system manages a restart counter for each restart group. Each time a group is restarted, the system increases its restart counter by one.
hrCtrl.interval is the frequency with which a group's restart counter is decreased, in seconds. Every hrCtrl.interval seconds, the system decreases the group's restart counter by one (until the counter reaches zero). The default value for hrCtrl.interval is three seconds.
hrCtrl.maxBadness is the maximum value a group's restart counter can reach before it triggers a site restart. In other words, when a group's restart counter reaches this value, a site restart is automatically performed. The default value is 25. If set to zero, the system will never trigger a site restart.
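The interaction between hrCtrl.interval and hrCtrl.maxBadness can be explored with a small simulation. This is a simplified model rather than the system's implementation: site_restart_index is a hypothetical function, times are given in whole seconds in non-decreasing order, and the model assumes that the decay of a group's restart counter is measured from the moment the counter last left zero or was last decreased.

```c
/*
 * Simulate the restart counter of one restart group.
 *
 * times        times (in seconds, non-decreasing) at which the group
 *              is restarted
 * n            number of entries in 'times'
 * interval     model of hrCtrl.interval, in seconds
 * max_badness  model of hrCtrl.maxBadness; 0 disables site restart
 *
 * Returns the index of the restart that triggers a site restart,
 * or -1 if no site restart is triggered.
 */
int site_restart_index(const unsigned *times, int n,
                       unsigned interval, unsigned max_badness)
{
    unsigned counter = 0;   /* restart counter of the group */
    unsigned last = 0;      /* reference time for the next decrease */
    int i;

    if (max_badness == 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (counter > 0 && interval > 0) {
            /* apply the decreases that elapsed since 'last' */
            unsigned decays = (times[i] - last) / interval;

            if (decays >= counter) {
                counter = 0;
            } else {
                counter -= decays;
                last += decays * interval;
            }
        }
        if (counter == 0)
            last = times[i];

        counter++;          /* this restart */
        if (counter >= max_badness)
            return i;
    }
    return -1;
}
```

With an interval of 4 and a maximum badness of 2, two restarts at seconds 3 and 5 trigger a site restart on the second restart in this model, whereas restarts at seconds 1 and 9 do not, because the counter has returned to zero in between.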
After updating your system's features for hot restart and setting the tunable parameters to suit your requirements, you are ready to build the system image.
To run the examples and hot restart demonstration, include the examples directory and X11 library in your system build paths (if they are not already included). For information on building a system image for your particular target platform, see the corresponding document in the ChorusOS 5.0 Target Platform Collection.
After the system image has been correctly built, copy the image to your boot server and reboot the target machine. You are now ready to begin programming and running applications that can use the hot restart feature.
The Sun Embedded Workshop software includes a graphical demonstration of the hot restart feature. The demonstration is based on a well-known program, Xmaze, which has been slightly modified to make it hot restartable. Some of the program's data is stored in persistent memory, which means that when the program is restarted, it starts at a point close to the point it had reached prior to the restart. The resulting application is a ChorusOS process called xdemo.
To run the hot restart demonstration program, do the following:
Ensure that your system features are correctly set for hot restart (see "Features").
Adjust the following system parameters to suit the memory requirements of the Xmaze demonstration program using ews or the configurator(1CC) command line utility, as shown in the following table:
Tunable parameter | Description | Required Value |
---|---|---|
pmm.rambankSize | Size of persistent memory bank (in bytes) | 0x400000 |
kern.exec.dflSysStackSize | Default system stack size (in bytes) | 0x8000 |
Configure your system image build to include the X11 library and ChorusOS examples directory (if this is not already the case):
% make reconfigure NEWCONF '_s <src_dir>/opt/X11'
If you have made changes to the system image since the previous build, rebuild the system and copy the system image to the appropriate location (for example, the boot directory if you are using tftp-based boot). Reboot the target machine.
Ensure that a copy of the xdemo process is present in a directory mounted on the target machine. If you use the make root command, a copy of the process is already stored in build_dir/root/bin/examples. If this directory is not mounted, or to use a different mounted directory, do the following:
$ cp build_dir/BUILD_EXAMPLES/restartDemo/xdemo example_directory
Set the target machine's DISPLAY environment variable to the host machine:
$ rsh target setenv DISPLAY host_IP_address:0.0
Run the restartable process:
$ rsh target arun -g 0 example_directory/xdemo
The process will be run as a member of the restart group with a group ID 0.
The Xmaze demonstration appears on the screen. As the demonstration runs, it periodically stores its state as data in persistent memory. Allow the demonstration to run for a short time, then restart the process by typing the following command on the host console:
$ rsh target akill pid
The process identifier (PID) is printed on the host console when the process starts.
The process is restarted, and the Xmaze demonstration continues from a point close to where it left off before the restart.
The akill command provoked the restart because it was not called with the restart-specific option -g. To kill the Xmaze demonstration process without restarting it, type:
$ rsh target akill -g 0
Because the xdemo process runs from the command line, it is a direct process and will be started automatically by the system when the site is restarted. To confirm this, rerun the process, and provoke a site restart by typing the following:
$ rsh target restart
After the system has been re-initialized, the demonstration will be restarted.
This is a basic illustration of the use of the hot restart feature. The site restart is provoked manually from the command line. As an alternative, try restarting the process using akill (without the -g option) sufficiently frequently to trigger an automatic site restart. To do this, set the system's restart policy to be more sensitive to process failure. The following example configuration will invoke a site restart if the process is restarted twice within four seconds:
Tunable parameter | Value |
---|---|
hrCtrl.interval | 4 |
hrCtrl.maxBadness | 2 |
This section provides a description of the API exported by the Persistent Memory Manager. In particular, it covers the following topics:
How persistent memory is managed by the system.
Allocating and retrieving blocks of persistent memory with the Persistent Memory Manager API.
Freeing blocks of persistent memory with the Persistent Memory Manager API.
In this section, an example "hello world" program is used to illustrate different aspects of the Persistent Memory Manager interface. The code for this example is provided in "A Basic Application".
To run the example, compile the code and copy it to a directory which is mounted on the target machine. See "Compiling and Running the Examples" for information about compiling and running the hot restart examples.
Within a running ChorusOS system, access to persistent memory is provided by a ChorusOS process known as the Persistent Memory Manager. The Persistent Memory Manager exports a specific API for allocating and freeing blocks of memory in the persistent memory bank. This API differs from the API used for allocating and deallocating traditional ChorusOS memory regions (rgnAllocate(2K), rgnFree(2K), svPagesAllocate(2K) and svPagesFree(2K)) for the following reasons:
Persistent memory blocks, by definition, persist across a process or site restart. The API provided for manipulating traditional ChorusOS memory regions does not support memory recovery after a restart.
Persistent memory blocks, unlike traditional memory regions, are named. This name is used to retrieve a block of memory allocated in the persistent memory bank.
Persistent memory blocks, unlike traditional memory regions, can be grouped, for the purposes of simultaneous deallocation. In other words, a single API call can free multiple blocks of persistent memory (which may have been allocated by different processes in the ChorusOS system).
The Persistent Memory Manager API is available to all ChorusOS processes (not just restartable processes). This section describes the use of this API.
Before proceeding with a description of the different functions in the Persistent Memory Manager API, consider the following basic restartable application, an implementation of the "hello world" example. When the process is run for the first time, it displays the following message on the host console:
Hello world!
When the process is restarted, it will display the following message on the target console:
Hello again! I have been restarted.
The basic flow of execution is as follows:
The restartable process begins at the start of its main() program -- initialization of program data.
The process uses the pmmAllocate(2RESTART) function to allocate a block of persistent memory. This block is used to store a status counter (which the process will set to zero).
The first message is displayed, and the counter is incremented by one.
The process attempts to access an invalid pointer value, thereby causing a crash that will require the process to be restarted. Note that the ChorusOS VIRTUAL_ADDRESS_SPACE optional feature must be set to true for this crash to be invoked.
The restarting process recommences execution at the start of its main() program, and will call pmmAllocate() a second time to retrieve the value of the status counter.
Because the counter is no longer set to zero, the process will display the second message.
The process calls pmmFree(2RESTART) to free the persistent memory block where the counter is stored, and then exits cleanly.
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <string.h>     /* strcpy(), bzero() */
#include <pmm/chPmm.h>
#include <hr/hr.h>

#define HR_GROUP "HELLO_GROUP"

int main()
{
    int        res;
    int        any = 1;
    int*       counter_p;   /* It will be stored in persistent memory */
    long       *p;
    PmmName    name;
    KnRgnDesc  rgn;

    /*
     * Initialize the name and medium fields
     * to identify the persistent memory block in the system.
     */
    bzero(&name, sizeof(name));
    strcpy(name.medium, "RAM");
    strcpy(name.name, "PM1");

    /*
     * Initialize the block fields
     */
    bzero(&rgn, sizeof(rgn));
    rgn.options = K_ANYWHERE | K_RESERVED;
    rgn.size = vmPageSize();
    res = rgnAllocate(K_MYACTOR, &rgn);
    if (res != K_OK) {
        printf("rgnAllocate() failed res=%d\n", res);
        HR_EXIT_HDL();
        exit(-1);
    }
    p = (long*) rgn.startAddr;
    /*
     * From now on p is a bad pointer, since
     * VIRTUAL_ADDRESS_SPACE is true.
     */

    /*
     * Allocate the persistent memory block that stores
     * counter_p.
     */
    res = pmmAllocate((VmAddr *)&counter_p, &name, sizeof(int),
                      HR_GROUP, sizeof(HR_GROUP));
    if (res != K_OK) {
        printf("Cannot allocate or map the persistent memory block called %s."
               " Error = %d\n", name.name, res);
        HR_EXIT_HDL();
        exit(-1);
    }

    /*
     * From the value of *counter_p the process detects
     * whether it has been hot restarted or not.
     */
    if ( *counter_p == 0 ) {
        /*
         * This is the first time the process is run.
         */
        printf("Hello world!\n");
        /*
         * Increment the counter
         */
        (*counter_p)++;
        /*
         * Normally the next instruction causes a core dump and
         * a hot restart of the process
         */
        *p = 0xDeadBeef;
    } else {
        /*
         * The process has been restarted
         * NOTE: this message will appear on the console!
         */
        printf("The process has been restarted.\n");
        /*
         * Free the persistent memory block before exiting
         */
        res = pmmFree(&name);
        if (res != K_OK) {
            printf(" pmmFree failed, res=%d. Exit\n", res);
            HR_EXIT_HDL();
            exit(-1);
        }
        /*
         * Terminate cleanly.
         */
        printf("Example finished. Exit.\n");
        HR_EXIT_HDL();
        exit(0);
    }
    /* Never reached */
}
The aspects of this program that are of interest to users of the Persistent Memory Manager API are discussed in the rest of this section.
The "hello world" application uses a block of persistent memory to store a counter indicating whether it has been restarted. The value of the counter controls the program's flow of execution. This is a common use of persistent memory. A counter or flag such as this is usually necessary because it is the only way a process can know whether it has been restarted.
A block of persistent memory is described in the system by a structure of the following type:
#include <pmm/chPmm.h>

typedef struct {
    PmmMedium   medium;
    PmmMemName  name;
} PmmName;

PmmName MyPmmName = { "RAM", "myname" };
Within the structure, medium is a character string which identifies the memory bank to be used. In the current implementation, it must always be set to RAM. The name field is a user-defined, null-terminated character string that uniquely identifies the block of memory in the memory bank. The lifetime of a block name is identical to the lifetime of the block itself in persistent memory. The system parameter pmm.maxBlocks defines the number of distinct persistent memory blocks (and therefore names) that can be allocated at any one time. The default value is 30.
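Because both fields are fixed-size character strings, it can be worth checking lengths before copying into them. The following C sketch shows the idea on a stand-in structure; SimPmmName, its field sizes (16 and 32 bytes), and sim_pmm_name_init() are hypothetical and are not taken from chPmm.h, whose real field sizes are not given here.

```c
#include <assert.h>
#include <string.h>

#define SIM_MEDIUM_LEN 16  /* assumed size; the real PmmMedium size may differ */
#define SIM_NAME_LEN   32  /* assumed size; the real PmmMemName size may differ */

/* Stand-in for PmmName, for illustration only. */
typedef struct {
    char medium[SIM_MEDIUM_LEN];
    char name[SIM_NAME_LEN];
} SimPmmName;

/*
 * Fill in a block name after checking that both strings fit, including
 * their terminating null byte. Returns 0 on success, or -1 (leaving *n
 * untouched) if either string is too long.
 */
int sim_pmm_name_init(SimPmmName *n, const char *medium, const char *name)
{
    assert(n != NULL && medium != NULL && name != NULL);

    if (strlen(medium) >= sizeof(n->medium) ||
        strlen(name) >= sizeof(n->name)) {
        return -1;
    }

    memset(n, 0, sizeof(*n));   /* same effect as the bzero() call in the example */
    strcpy(n->medium, medium);  /* safe: length checked above */
    strcpy(n->name, name);
    return 0;
}
```

The same check could be applied to a real PmmName by using sizeof on its medium and name fields.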
Sharing persistent memory blocks between user processes, or between user and supervisor processes is not supported. Persistent memory blocks can only be shared between supervisor processes.
To allocate or retrieve a block of persistent memory, use the pmmAllocate() function call, defined as follows:
#include <pmm/chPmm.h>

KnError pmmAllocate(VmAddr     *addr,
                    PmmName    *name,
                    size_t      size,
                    PmmDelKey   delKey,
                    size_t      delKeySize);
If no memory block corresponding to the specified PmmName structure is found in persistent memory, pmmAllocate() allocates a block of size size in persistent memory, fills it with nulls, and stores the address of the block in *addr. The address is determined by the system and cannot be manually specified or changed.
If a block identified with the specified PmmName already exists in persistent memory, pmmAllocate() stores the address of the existing memory block in *addr, and the size parameter is ignored. Persistent memory blocks are always mapped to the same address. In other words, the address returned by the first and subsequent calls to pmmAllocate() is always the same for a given block.
As a result of this dual functionality of the pmmAllocate() call, the difference between initially allocating and subsequently retrieving a persistent memory block is transparent at the programming level. The first time the code of the "hello world" example is run, the call to pmmAllocate() will allocate an integer-sized block of persistent memory that contains the initialized value of counter (0).
res=pmmAllocate((VmAddr *)&counter_p, &name,sizeof(int), HR_GROUP, sizeof(HR_GROUP));
The second time the code is run, pmmAllocate() returns a pointer to the value of counter in persistent memory.
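The allocate-or-retrieve behavior described above can be modeled in ordinary C. The sketch below is a host-side simulation only: sim_pmm_allocate(), SimBlock, and the fixed-size table are hypothetical stand-ins, and ordinary heap memory takes the place of the persistent memory bank, so nothing here survives a real restart.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIM_MAX_BLOCKS 30  /* mirrors the default value of pmm.maxBlocks */
#define SIM_NAME_LEN   32  /* assumed limit on block name length */

/* One entry in the simulated memory bank. */
typedef struct {
    char   name[SIM_NAME_LEN];
    void  *addr;
    size_t size;
    int    used;
} SimBlock;

static SimBlock sim_bank[SIM_MAX_BLOCKS];

/*
 * If a block called `name` exists, store its address in *addr and ignore
 * `size`. Otherwise allocate a zero-filled block of `size` bytes under
 * that name. Returns 0 on success, -1 on failure.
 */
int sim_pmm_allocate(void **addr, const char *name, size_t size)
{
    int i, free_slot = -1;

    assert(addr != NULL && name != NULL);
    if (strlen(name) >= SIM_NAME_LEN) {
        return -1;
    }

    for (i = 0; i < SIM_MAX_BLOCKS; i++) {
        if (sim_bank[i].used && strcmp(sim_bank[i].name, name) == 0) {
            *addr = sim_bank[i].addr;  /* retrieval: same address as before */
            return 0;
        }
        if (!sim_bank[i].used && free_slot < 0) {
            free_slot = i;
        }
    }

    if (free_slot < 0 || size == 0) {
        return -1;                     /* bank full, or nothing to allocate */
    }

    sim_bank[free_slot].addr = calloc(1, size);  /* zero-filled, like a new block */
    if (sim_bank[free_slot].addr == NULL) {
        return -1;
    }
    strcpy(sim_bank[free_slot].name, name);
    sim_bank[free_slot].size = size;
    sim_bank[free_slot].used = 1;
    *addr = sim_bank[free_slot].addr;
    return 0;
}
```

Calling sim_pmm_allocate() twice with the name "PM1" yields the same address both times, and a value written after the first call is still visible after the second, which is the property the "hello world" counter relies on.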
The delKey and delKeySize parameters passed to pmmAllocate() are used to define the deletion key associated with a memory block. A deletion key is a user-defined binary array, used to mark a set of persistent memory blocks which can be freed simultaneously using the pmmFreeAll(2RESTART) function, described in "Freeing a Persistent Memory Block Explicitly".
This section describes the API calls used to free persistent memory blocks.
A persistent memory block can remain in memory beyond the lifetime of a run-time instance of the process that allocates the block. This immediately raises the question of responsibility for freeing blocks of persistent memory. When a traditional ChorusOS user process terminates, any memory regions it previously allocated (using rgnAllocate(2K)) are freed automatically. Clearly, this basic rule makes little sense in the case of persistent memory blocks (which can survive beyond such a termination).
The hot restart feature provides two solutions to this problem:
Processes can free blocks of persistent memory explicitly, using the API function pmmFree() or pmmFreeAll(). This is the only solution available for non-restartable processes that use persistent memory. For these processes, freeing persistent memory is entirely the programmer's responsibility.
If persistent memory needs to survive beyond the lifetime of the allocating process (that is, even after the process has terminated cleanly), implementing this solution will require careful application design or the presence of a garbage collection process.
Explicit freeing of persistent memory blocks is described in the following section.
Hot restartable processes can benefit by using the automatic clean up mechanism provided by the Hot Restart Controller. This mechanism is described in more detail in "Freeing Persistent Memory".
In both cases, freeing a persistent memory block has the same effect -- the block is freed immediately and permanently and cannot be retrieved. The name of the block becomes available for reuse and can be used to identify a different memory block.
Use the pmmFree() or pmmFreeAll() function to free a persistent memory block explicitly. The explicit freeing of a given memory block can be performed by any process (not necessarily the process that originally allocated the block). It is the programmer's responsibility to ensure that any persistent memory block that has been freed is no longer in use.
Use pmmFree() to free a single memory block identified by a PmmName:
#include <pmm/chPmm.h>

int pmmFree( PmmName *name );
Use pmmFreeAll() to free a group of persistent memory blocks that were allocated with the same deletion key. The deletion key for a persistent memory block is specified when the block is allocated with pmmAllocate().
#include <pmm/chPmm.h>

int pmmFreeAll( PmmDelKey delkey, size_t delKeySize );
A typical use of a deletion key is to mark all persistent memory blocks used by a process or a group of processes with the same key, and then have a separate, independent process that frees all the blocks when a particular job is completed (or a specific event occurs). The "hello world" example uses pmmFree() to free the single memory block it allocated before terminating. If the "hello world" process did not free its own persistent memory block, the following call to pmmFreeAll() from another process would free the block, along with any other blocks marked with the deletion key HR_GROUP:
pmmFreeAll( HR_GROUP, sizeof(HR_GROUP) );
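The grouping effect of a deletion key can be pictured with a small simulation. In the C sketch below, SimKeyedBlock, sim_tag_block(), and sim_free_all() are hypothetical; the sketch assumes that keys match when they have the same length and the same bytes, which is a plausible reading of "binary array" but is not confirmed by the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_KEY_MAX 16  /* assumed upper bound on key length, for illustration */

/* A simulated persistent block tagged with a deletion key. */
typedef struct {
    const char   *name;
    unsigned char key[SIM_KEY_MAX];
    size_t        key_size;
    int           allocated;
} SimKeyedBlock;

/* Mark a block as allocated under the given deletion key. */
void sim_tag_block(SimKeyedBlock *b, const char *name,
                   const void *key, size_t key_size)
{
    assert(b != NULL && name != NULL && key != NULL);
    assert(key_size <= SIM_KEY_MAX);
    b->name = name;
    memcpy(b->key, key, key_size);
    b->key_size = key_size;
    b->allocated = 1;
}

/*
 * Free every allocated block whose key matches `key` byte for byte.
 * Returns the number of blocks freed.
 */
int sim_free_all(SimKeyedBlock *blocks, size_t n,
                 const void *key, size_t key_size)
{
    size_t i;
    int freed = 0;

    assert(blocks != NULL && key != NULL);
    for (i = 0; i < n; i++) {
        if (blocks[i].allocated &&
            blocks[i].key_size == key_size &&
            memcmp(blocks[i].key, key, key_size) == 0) {
            blocks[i].allocated = 0;
            freed++;
        }
    }
    return freed;
}
```

In this model, two blocks tagged with "HELLO_GROUP" are released by a single sim_free_all() call, while a block tagged with a different key is left in place.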
This section discusses programming and running restartable processes on the ChorusOS operating system. The following topics are presented:
An overview of how restartable processes are represented and managed by the system.
A description of the API and C_INIT commands used for loading, restarting, and terminating a restartable process.
A description of the API and C_INIT commands used to restart a site.
An overview of the restartSpawn example program, used to illustrate the use of the Hot Restart Controller and Persistent Memory Manager APIs.
A restartable process can be reconstructed rapidly from a process image (text and data) without accessing stable storage. The management of restartable processes is handled by a ChorusOS supervisor process known as the Hot Restart Controller. The Hot Restart Controller is responsible for:
Loading and running restartable processes, and controlling their storage in persistent memory.
Monitoring restartable processes for abnormal termination, and restarting their restart group if an abnormal termination occurs.
Triggering a site restart if a group is restarted too frequently (based on a system's restart policy, as described in "Getting Started with Hot Restart").
The following section looks at the API provided by the Hot Restart Controller and the corresponding restart-related commands provided by the C_INIT process. Before proceeding to a description of the API, however, it is important to understand how a restartable process is managed within the system.
Processes do not explicitly declare themselves restartable, that is, there is no function call to declare a process restartable at the start of its main() program. Instead, a process can be run as a restartable process. Specifically, a process can be run as either a direct or indirect restartable process:
Direct restartable processes are loaded and run using the C_INIT command arun with the -g option.
Indirect restartable processes are spawned from restartable processes using the hrfexec(2RESTART) family of API calls. hrfexec() calls operate in a similar manner to afexec() calls, but take an additional PmmName parameter. This parameter names the spawned process so that the Hot Restart Controller can identify it when the process is restarted:
#include <hr/hr.h>

int hrfexecve(PmmName * baseName,
              const char * path,
              KnCap * cprocesscap,
              const AcParam * param,
              char const * argv,
              char const * envp);
(...)
The distinction between direct and indirect processes is important in understanding the automatic restart mechanism provided by the Hot Restart Controller. When an error occurs, the Hot Restart Controller first stops all processes in the group. After the processes are stopped, only the direct restartable processes will be restarted. These processes (re-executed from their initial entry point) are responsible for restarting any indirect processes they may have spawned.
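The restart sequence described above (stop every member, restart only the direct processes, and let those respawn their indirect processes) can be modeled as follows. This is a host-side C simulation; SimProc, sim_restart_group(), and sim_respawn_indirect() are hypothetical names, and sim_respawn_indirect() merely stands in for a direct process calling hrfexec() again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SIM_DIRECT, SIM_INDIRECT } SimKind;

/* A simulated member of a restart group. */
typedef struct {
    const char *name;
    SimKind     kind;
    int         running;
} SimProc;

/*
 * Model a group restart: every member is stopped, then only the direct
 * members are started again. Returns the number of processes restarted.
 */
int sim_restart_group(SimProc *procs, size_t n)
{
    size_t i;
    int restarted = 0;

    assert(procs != NULL);
    for (i = 0; i < n; i++) {
        procs[i].running = 0;          /* stop all members first */
    }
    for (i = 0; i < n; i++) {
        if (procs[i].kind == SIM_DIRECT) {
            procs[i].running = 1;      /* re-executed from its entry point */
            restarted++;
        }
    }
    return restarted;
}

/*
 * Model a direct process respawning one of its indirect processes by name.
 * Returns 0 on success, -1 if no indirect member has that name.
 */
int sim_respawn_indirect(SimProc *procs, size_t n, const char *name)
{
    size_t i;

    assert(procs != NULL && name != NULL);
    for (i = 0; i < n; i++) {
        if (procs[i].kind == SIM_INDIRECT &&
            strcmp(procs[i].name, name) == 0) {
            procs[i].running = 1;
            return 0;
        }
    }
    return -1;
}
```

After sim_restart_group() on a group of one direct parent and one indirect child, only the parent is running; the child runs again only once the parent respawns it.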
Restartable processes, just like traditional ChorusOS processes, are identified in the system by a unique capability and PID. Restartable processes also run in a user group (with a user ID) like traditional ChorusOS processes. The life of each of these credentials is the same as the life of a specific run-time instance of the process -- when a restartable process is restarted, it is given a new capability, PID and user ID.
Hot restartable processes also have two additional credentials which persist across a process restart, and characterize the processes in the Hot Restart Controller:
Each restartable process has a unique name. The maximum number of restartable processes (unique names) that can be registered in the Hot Restart Controller is fixed by the hrCtrl.maxActors system parameter.
It is the programmer's responsibility to ensure that each process running in the system uses a unique name because this is not checked by the system. Attempting to run two processes that use the same name will cause unpredictable results.
Each restartable process is a member of a restart group. A restart group is uniquely identified in the system by an integer (known as the group's ID). The maximum number of group IDs in a system is fixed by the hrCtrl.maxGroups parameter.
As previously discussed in "Memory Requirements and Design Constraints", the system uses persistent memory to store the following data for each executing restartable process:
The process's process image: a copy of the text and initialized data segments from which the process will be loaded after a restart.
The process's executing image: a copy of the text and data from which the process is executed.
This data is stored in three persistent memory blocks. One memory block is used for the process image, one is used for the executed text, and the last is used for the process data. These blocks are allocated and freed upon requests from the Hot Restart Controller to the Persistent Memory Manager. Other processes cannot access or free these persistent memory blocks. However, restartable processes can allocate additional blocks under the control of the Hot Restart Controller. This is described in "Freeing Persistent Memory".
One approach to understanding how the Hot Restart Controller API is used is to consider it in the context of the run-time life-cycle of a restartable process. A restartable process's code is not executed just once (from the start of the main() program to its final return). The code may be re-executed several times if there are multiple restarts. Data that is initialized, and processes that are initially loaded, during the first execution will only need to be retrieved or restarted on subsequent executions. Therefore, it is important to view the restart API in the context of this first execution, and then of subsequent executions.
This section looks at the way the Hot Restart Controller API is used in the context of the life-cycle of a typical restartable process.
Use the C_INIT command arun with the -g option, or the function call hrfexec() to load a restartable process from stable storage into persistent memory. Both arun and hrfexec() provide support for specifying the persistent credentials of a restartable process when the process is initially loaded.
For a direct process that was run with the arun command, the process name will be system-generated, and the group ID is passed using the -g option. If the group ID is not already in use, a new group is created which contains the direct process. If the group ID already exists, the direct process is added to the corresponding restart group. If no ID is passed after -g, the process is started in the restart group with ID 0.
A restart group can contain any number of direct processes.
For an indirect process run with hrfexec(), the process name is specified using a PmmName structure. An indirect process automatically becomes a member of the same process group as the process that spawned it.
Processes that were created directly using acreate(2K) or actorCreate(2K) are not hot restartable and are unable to use the Hot Restart Controller API.
When a process is run as a restartable process, the Hot Restart Controller checks whether a process identified with the specified name is already registered. If this is not the case (as with the initial load), the Hot Restart Controller first solicits the Persistent Memory Manager to allocate the persistent memory blocks which will store the process's process image and executing image. If successful, the Hot Restart Controller registers the name of the new process as a restartable process, running in the specified group.
The subsequent load and start of the persistent process is the same as for a process run using a member of the afexec(2K) function family (see the man page for a description of this process). The only difference is that the process is loaded from its process image (in persistent memory) and not from stable storage.
A restartable process's name remains registered in the Hot Restart Controller for the life of its process group. The lifespan of the group may extend beyond the lifespan of the process. It is the programmer's responsibility to ensure that no two restartable processes will attempt to register with the same name in the Hot Restart Controller.
After a restartable process has been registered and loaded, it runs under the control of the Hot Restart Controller. If the process fails, the failure will invoke the restart of all direct members of its restart group. These direct processes will be responsible for restarting any indirect processes registered in the group. To query a process's restart group, use hrGetActorGroup(2RESTART):
#include <hr/hr.h>

int hrGetActorGroup(int pid);
In the context of hot restart, a process is considered to have terminated abnormally (and will therefore invoke the restart of its group) if any of the following occur:
Unrecoverable error (division by zero, unresolved page fault, invalid op code, and so forth).
Premature exit call, that is, an exit call prior to the expected completion of the process's task.
The process is killed without using the restart-specific command (akill(1M) with the -g option) or function call (hrKillGroup(2RESTART)) provided for this purpose.
There is no single API call that can explicitly force a group of processes to restart. When it is desirable to provoke a restart (for example, for testing purposes), the easiest way to do so is to deliberately provoke one of the previous cases. In the "hello world" example introduced in the previous chapter, this was done by causing a segmentation fault.
When a process fails, all processes in the failed restart group stop running and the Hot Restart Controller restarts all direct processes in the group from their initial entry point. The direct processes are responsible for restarting any indirect processes, using hrfexec(). When hrfexec() is called with a name that is already registered in the Hot Restart Controller, the Controller recognizes the process name and restarts the process from the process image, instead of loading it from stable storage.
A restartable process is always restarted at the same address. Its capability, process ID and user ID will not necessarily be the same after restart. All system resources obtained before the restart are lost: in particular, open files, including those that were inherited at the time of initial creation, are lost. This may include standard I/O connected to an rsh connection.
A restarted process uses the same arguments and environment parameters that were specified when the process was initially started. For direct restartable processes, a new set of pre-open stdin/stdout/stderr is provided, which is connected to /dev/console. For indirect members, a new set of pre-open stdin/stdout/stderr is provided by the invoker of hrfexec(), just as for afexec(2K).
Just like any process, a restartable process can free persistent memory blocks using pmmFree() or pmmFreeAll(). This is described in "Freeing a Persistent Memory Block".
Restartable processes which allocate memory with pmmAllocate() can also use a basic automatic deallocation mechanism provided by the Hot Restart Controller. This saves the process from having to free its persistent memory explicitly. Instead, the persistent memory will remain allocated for the lifespan of the process's group, and then be freed automatically by the Hot Restart Controller when the last member of the process's restart group terminates cleanly. The disadvantage of this system is that the lifespan of the restart group may extend well beyond the point at which the memory block is no longer required. In this situation, the memory block will take up space in persistent memory unnecessarily.
To mark a persistent memory block for automatic de-allocation by the Hot Restart Controller, pass the macros HR_GROUP_KEY and HR_GROUP_KEYSIZE as the delKey and delKeySize arguments respectively in the call to pmmAllocate(). These macros tie the lifespan of the persistent memory block to the lifespan of the calling process's restart group.
A block marked for automatic de-allocation by the Hot Restart Controller can still be freed explicitly by calling pmmFree() with the block's PmmName. However, attempting to call pmmFreeAll() by passing the HR_GROUP_KEY and HR_GROUP_KEYSIZE macros will result in an error because this is not permitted.
Any process that exits before the expected completion of its task is considered to have aborted abnormally and will cause a restart of its process group. This can be useful for cases where the process exits prematurely as a result of an error. This mechanism can also be useful for invoking a process restart where this is required, for example, if an execution problem is detected.
To enable a restartable process to terminate cleanly without causing a restart, use the HR_EXIT_HDL() macro prior to the call to exit(3STDC):
#include <hr/hr.h>

HR_EXIT_HDL();
The purpose of the preceding macro is to register an additional hot restart exit handler through the process's atexit(3STDC) mechanism. The hot restart exit handler effectively removes the process in question from the Hot Restart Controller's responsibility. After a process has called HR_EXIT_HDL(), the Hot Restart Controller will no longer monitor the process for abnormal termination. As a result, when the process exits, it will terminate cleanly and not trigger a restart.
The HR_EXIT_HDL() macro should be called shortly before the process exits. Calling this macro earlier in the process code will mean that any unexpected exit between the macro call and the final exit will not be detected by the Hot Restart Controller. As a result, the process will not be restarted if it exits abnormally.
Cleanly terminating a process does not unregister the process in the Hot Restart Controller or remove the process's process image and executing image from persistent memory. This is because a cleanly terminated process will still be restarted if its group is restarted (because a group is always restarted in its initial state). In other words, when a group is restarted, all direct restartable processes will recommence execution at their initial entry point, regardless of whether or not they had already exited before the restart occurred. This is demonstrated by the following diagram. Both direct process one (DP1) and indirect process two (IP2) terminate cleanly, but are automatically restarted when direct process two (DP2) crashes.
Because of this behavior, it is useful to record the clean termination of any restartable process that should not be re-executed in full during its group's lifetime, by setting a flag in persistent memory. A restarted process can check the state of this flag at the start of its execution, and therefore detect whether it should re-execute or not.
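The flag check described above might look like the following C sketch. It is a simulation under stated assumptions: SimTaskState stands in for a structure that a real process would place in a persistent memory block obtained from pmmAllocate() (and which would therefore start out zero-filled), and sim_entry_point() is a hypothetical stand-in for the start of main().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for state kept in persistent memory. A newly allocated
 * persistent block is zero-filled, so `finished` starts at 0.
 */
typedef struct {
    int finished;   /* set to 1 once the task has completed cleanly */
    int work_done;  /* counts how many times the real work was executed */
} SimTaskState;

/*
 * Model the entry point of a restartable process.
 * Returns 1 if the task was executed, 0 if it was skipped because a
 * previous run had already completed it.
 */
int sim_entry_point(SimTaskState *state)
{
    assert(state != NULL);

    if (state->finished) {
        return 0;           /* restarted after a clean exit: nothing to redo */
    }

    state->work_done++;     /* stand-in for the process's real task */
    state->finished = 1;    /* record clean completion before exiting */
    return 1;
}
```

Calling sim_entry_point() a second time on the same state models a group restart after the process has already finished: the work is not repeated.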
For each group of restartable processes present in a ChorusOS system, the Hot Restart Controller stores a list of the processes for each group in a persistent memory block. A process is added to the list when it is first started. When a process cleanly terminates, the Hot Restart Controller notes this in the list. When all processes in the list have terminated cleanly, the Hot Restart Controller performs the following:
Deallocates the persistent memory blocks used to store the images of the terminated processes, as well as blocks that were allocated using the HR_GROUP_KEY and HR_GROUP_KEYSIZE deletion key macros. The process names used by the processes can then be reused by other restartable processes (which will be loaded into memory as new processes).
Adds the group ID to the list of available IDs for new process groups.
A group of processes can only terminate if all of its member processes terminate cleanly. This is important to remember in situations where not all indirect processes are restarted after a group restart. This is a matter of execution flow: if certain conditions in a direct process change the process's flow from one execution to another, the direct process may not restart an indirect process that was running prior to the restart. As a result, the indirect process will never terminate cleanly and so the group will not be able to terminate.
For example, consider the situation in the following diagram. The direct process spawns the indirect process only after certain conditions are met. These conditions are met the first time the direct process runs. After the direct process restarts, the conditions are no longer satisfied, so the indirect process is no longer spawned.
In the preceding diagram, the process group will not be able to terminate until the indirect process has been rerun using hrfexec(), and has terminated cleanly.
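The termination rule can be stated compactly: a group terminates only when every member on its list has terminated cleanly. The C sketch below models that rule and the blocked case; SimMember and sim_group_can_terminate() are hypothetical and do not reflect how the Hot Restart Controller stores its list.

```c
#include <assert.h>
#include <stddef.h>

/* A simulated entry in a restart group's member list. */
typedef struct {
    const char *name;
    int         terminated_cleanly;
} SimMember;

/*
 * Returns 1 if every member has terminated cleanly, so the group itself
 * can terminate; otherwise returns 0. In both cases *blocking receives
 * the number of members that have not yet terminated cleanly.
 */
int sim_group_can_terminate(const SimMember *members, size_t n,
                            size_t *blocking)
{
    size_t i, pending = 0;

    assert(members != NULL && blocking != NULL);
    for (i = 0; i < n; i++) {
        if (!members[i].terminated_cleanly) {
            pending++;
        }
    }
    *blocking = pending;
    return pending == 0;
}
```

An indirect process that was never respawned after a restart stays in the list with terminated_cleanly equal to 0, so the group remains blocked until that process is rerun and exits cleanly.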
When a restart group cannot terminate because of one or more indirect processes, the Hot Restart Controller detects this situation and displays the following message on the target console:
HR_CTRL: group gid blocked, some members have not terminated: list_of_processes
gid is the ID of the group in question, and list_of_processes provides the name of each process which prevents the group from terminating. When this message is displayed, a common solution is to kill the process group using the akill command with the -g option. However, this solution is useful only if none of the indirect processes need to be run to complete the group's task.
A better solution is to use careful application design. If the preceding situation is likely to occur, flags can be stored in persistent memory to identify indirect processes that have not terminated cleanly. A process can then be made responsible for cleaning up the group, that is, restarting each indirect process that is flagged. This clean-up process can be run using the arun -g command when the Hot Restart Controller notification is displayed on the target console. Alternatively, the group could be designed so that the clean-up process is always run just before the group is expected to terminate. In this case the problem is solved without accessing the C_INIT console.
At times it may be necessary to circumvent the automatic restart mechanism provided by the Hot Restart Controller and explicitly terminate (kill) a restartable process. Processes which are killed will not be restarted. Killing a process automatically kills all processes within the process's restart group. This is because a restart group must remain consistent. The restart group may not be able to function correctly if a process is no longer available.
Restartable processes can be explicitly killed using either of the following:
the C_INIT command akill(1M) with the -g option,
the API call hrKillGroup(2RESTART) with the process's group ID:
#include <hr/hr.h>

int hrKillGroup(int groupId);
The group ID can be queried using the hrGetActorGroup(2RESTART) call:
#include <hr/hr.h>

int hrGetActorGroup(int pid);
Either method produces the same result: all processes in the associated restart group are killed. The Hot Restart Controller terminates the group as though all processes had exited cleanly (see "Group Termination").
A site restart is a hot restart of the entire system. All data of boot processes are reset to their original values from the previously loaded system image and the system enters its start-up phase again. As C_INIT restarts, sysadm.ini is reexecuted. Any calls to start restartable processes in the sysadm.ini file are ignored for a site restart because all direct restartable processes are automatically restarted by the system after the sysadm.ini file has been read.
When the system is restarted, previously mounted disks are not automatically remounted. To resolve this issue, ensure that the disks are mounted in the sysadm.ini file, or create a hot restartable process that will automatically mount the disks.
A site restart can be invoked automatically by the Hot Restart Controller, according to the tunable parameters that define the system's restart policy. For more information, see "Tunable Parameters".
To invoke a site restart programmatically, use the sysShutdown(2K) function call with the -i 1 arguments:
int sysShutdown (int argc, char** argv)
To provoke a site restart from the C_INIT command-line console, use the shutdown -i 1 command or the restart(1M) command.
The following code example, restartSpawn, illustrates many of the function calls covered in this and previous chapters. The example is provided as an overview of the restart mechanism and the use of persistent memory. Specific parts of the example could be used as the basis of a more complex user application that incorporates hot restart.
The restartSpawn example uses two restartable processes: a parent process, HR_parent.r, and a child process, HR_child.r, which is spawned by the parent. Both processes should be compiled as supervisor processes. The source code for the two processes is provided in "Example Application Code". The example can be summarized as follows:
The parent process uses a set of control structures stored in persistent memory. It spawns the child process using hrfexec(), then explicitly crashes, causing the system to restart it. Each time the parent process runs, it restarts the child process indirectly through a call to hrfexec().
The child process also uses a set of control structures stored in persistent memory. It executes a four-step loop which causes the following to be displayed:
    =========== Message ===========
     STEP 1  STEP 2  STEP 3  STEP 4
    ======== End of message========
The message is displayed independently of the number of times the parent process crashes, or the site is restarted.
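The checkpoint idea behind the child process can be tried out on a host machine. The following is a simplified simulation, not ChorusOS code: struct sim_persistent stands in for the blocks that the real example obtains with pmmAllocate(), and sim_run() stands in for one lifetime of the process, which ends either at a simulated crash or at the end of a message cycle. All of these names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SIM_MESSAGE_SIZE 100

/* Stand-in for data kept in persistent memory across restarts. */
struct sim_persistent {
    int  checkpoint;                 /* next step to run: 0..3 */
    char message[SIM_MESSAGE_SIZE];  /* text accumulated so far */
};

/*
 * Run steps starting from the saved checkpoint. Stops after max_steps
 * steps (simulating a crash) or when step 4 completes the message.
 * Returns 1 if a full message was copied to out, 0 otherwise.
 */
int sim_run(struct sim_persistent *pm, int max_steps,
            char *out, size_t out_size)
{
    static const char *steps[4] = {
        " STEP 1 ", " STEP 2 ", " STEP 3 ", " STEP 4 "
    };
    int done;

    assert(pm != NULL && out != NULL && out_size > 0);
    for (done = 0; done < max_steps; done++) {
        int step = pm->checkpoint;

        strncat(pm->message, steps[step],
                sizeof(pm->message) - strlen(pm->message) - 1);
        /* Advance the checkpoint only after the step's text is saved. */
        pm->checkpoint = (step + 1) % 4;

        if (step == 3) {
            /* End of cycle: hand out the message and reset it. */
            snprintf(out, out_size, "%s", pm->message);
            memset(pm->message, 0, sizeof(pm->message));
            return 1;
        }
    }
    return 0;   /* "crashed" before the cycle finished */
}
```

For example, starting from a zero-filled structure, a call with max_steps set to 2 returns 0 and leaves checkpoint at 2. A second call on the same structure resumes at step 3 and produces the complete four-step message, which is the behavior the real child process relies on persistent memory to achieve.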
This section describes the environment used for programming and compiling applications that use the API exported by the hot restart feature. For additional general information about compiling and linking ChorusOS processes, see Chapter 6, Building Applications for ChorusOS Systems.
The hot restart programming interface is declared in the following files:
For the Persistent Memory Manager API:
install_dir/chorus-family/kernel/include/chorus/pmm/chPmm.h
For the Hot Restart Controller API:
install_dir/chorus-family/os/include/chorus/hr/hr.h
install_dir/chorus-family/os/include/chorus/hr/hrCtrl.h
Detailed descriptions of each function call are available in the ChorusOS man pages.
A restartable process can be compiled using any of the following standard Imakefile macros:
UserprocessTarget
SupprocessTarget
CXXUserprocessTarget
CXXSupprocessTarget
Processes that use dynamic or shared libraries (compiled with Imake macros of the type Dynamic...Target or Shared...Target) are not hot restartable.
Use the following table to link processes that use the API exported by the hot restart feature. Note that all ChorusOS processes are automatically linked with the libc.a library.
API Function | Library |
---|---|
hrfexec() HR_EXIT_HDL() hrKillGroup() hrGetActorGroup() | libc.a |
pmmAllocate() pmmFree() pmmFreeAll() | pmmlib.a |
The following is an example Imakefile for a restartable process that uses the Persistent Memory Manager API:
    SRCS = HR_process.c

    UserprocessTarget(HR_process_s, HR_process.o, $(NUCLEUS_DIR)/lib/pmm/pmmlib.a)

    Depend($(SRCS))
This section provides the following:
instructions for compiling and running the "hello world" and process spawn examples described in this guide.
source code and Makefiles for these examples.
The source code for the example applications is also provided in install_dir/chorus-family/src/opt/examples after the examples package has been installed on your system.
Two examples have been designed to illustrate the use of the hot restart API and are provided with the Sun Embedded WorkShop software. The examples are as follows:
helloRestart: a basic illustration of persistent memory programming using the "hello world" process. This example is discussed in "The "hello world" Restartable process". helloRestart can be compiled as either a supervisor or a user process.
restartSpawn: this example illustrates how a hot-restartable process can be spawned from a process. This example is further discussed in "The restartSpawn Example". Both processes in the restartSpawn example are supervisor processes.
To compile the examples, ensure that the examples directory is included in your system image build configuration. After the examples directory has been built, binaries for all examples are available in build_dir/build-EXAMPLES.
To run the examples, first copy them to a directory which is mounted on the target, or use the make root command to build a root directory to mount.
Use the C_INIT command arun with the -g option to run a restartable process from the command line. For example, to run the 'hello world' restart example:
    $ rsh target arun -g 0 example_directory/HR_hello_u
Where target is the target name, and example_directory is the directory mounted on the target machine where the restartable hello world process binary is stored. The -g 0 option runs the hello world restartable process as a member of a restart group with ID 0.
The restartable "hello world" process is a basic illustration of the use of persistent memory.
See "Using Persistent Memory" for further information on this process.
    #include <stdio.h>
    #include <stdlib.h>     /* exit() */
    #include <string.h>     /* strcpy() */
    #include <strings.h>    /* bzero() */
    #include <pmm/chPmm.h>
    #include <hr/hr.h>

    #define HR_GROUP "HELLO_GROUP"

    int main()
    {
        int res;
        int any = 1;
        int* counter_p;     /* It will be stored in persistent memory */
        long *p;
        PmmName name;
        KnRgnDesc rgn;

        /*
         * Initialize the name and medium fields
         * to identify the persistent memory block in the system.
         */
        bzero(&name, sizeof(name));
        strcpy(name.medium, "RAM");
        strcpy(name.name, "PM1");

        /*
         * Initialize the block fields
         */
        bzero(&rgn, sizeof(rgn));
        rgn.options = K_ANYWHERE | K_RESERVED;
        rgn.size = vmPageSize();
        res = rgnAllocate(K_MYprocess, &rgn);
        if (res != K_OK) {
            printf("rgnAllocate() failed res=%d\n", res);
            HR_EXIT_HDL();
            exit(-1);
        }
        p = (long*) rgn.startAddr;
        /*
         * From now on p is a bad pointer, since
         * VIRTUAL_ADDRESS_SPACE is true.
         */

        /*
         * Allocate the persistent memory block that stores
         * counter_p.
         */
        res = pmmAllocate((VmAddr *)&counter_p, &name, sizeof(int),
                          HR_GROUP, sizeof(HR_GROUP));
        if (res != K_OK) {
            printf("Cannot allocate or map the persistent memory block called %s."
                   " Error = %d\n", name.name, res);
            HR_EXIT_HDL();
            exit(-1);
        }

        /*
         * From the value of *counter_p the process detects
         * whether it has been hot restarted or not.
         */
        if (*counter_p == 0) {
            /*
             * This is the first time the process is run.
             */
            printf("Hello world!\n");
            /*
             * Increment the counter
             */
            (*counter_p)++;
            /*
             * Normally the next instruction causes a core dump and
             * a hot restart of the process
             */
            *p = 0xDeadBeef;
        } else {
            /*
             * The process has been restarted
             * NOTE: this message will appear on the console!
             */
            printf("The process has been restarted.\n");
            /*
             * Free the persistent memory block before exiting
             */
            res = pmmFree(&name);
            if (res != K_OK) {
                printf(" pmmFree failed, res=%d. Exit\n", res);
                HR_EXIT_HDL();
                exit(-1);
            }
            /*
             * Terminate cleanly.
             */
            printf("Example finished. Exit.\n");
            HR_EXIT_HDL();
            exit(0);
        }
        /* Never reached */
    }
    SRCS = helloRestart.c

    SupprocessTarget(helloRestart.r, helloRestart.o, $(NUCLEUS_DIR)/lib/pmm/pmmlib.a)
    UserprocessTarget(helloRestart_u, helloRestart.o, $(NUCLEUS_DIR)/lib/pmm/pmmlib.a)

    Depend($(SRCS))
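The "hello world" example depends on an allocate-or-map behavior: the first request for a named block creates it, and a request made after a restart returns the block that already exists. The following host-side simulation illustrates that idea only. It is not the Persistent Memory Manager API: sim_pm_allocate(), sim_pm_free(), and sim_hello_run() are hypothetical names, the store is an ordinary static array, and new blocks are zero-filled here because the first-run test in the example assumes a zero counter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_MAX_BLOCKS 4
#define SIM_NAME_LEN   16

/* One named block in the simulated persistent store. */
struct sim_block {
    int  in_use;
    char name[SIM_NAME_LEN];
    int  counter;               /* payload: a single int per block */
};

/* Survives simulated restarts because it is never reinitialized. */
static struct sim_block sim_store[SIM_MAX_BLOCKS];

/*
 * Allocate-or-map: return the payload of the existing block with this
 * name, or create a new zeroed block. Returns NULL if the store is
 * full or the name is too long.
 */
int *sim_pm_allocate(const char *name)
{
    int i, free_slot = -1;

    assert(name != NULL);
    if (strlen(name) >= SIM_NAME_LEN)
        return NULL;
    for (i = 0; i < SIM_MAX_BLOCKS; i++) {
        if (sim_store[i].in_use && strcmp(sim_store[i].name, name) == 0)
            return &sim_store[i].counter;      /* map existing block */
        if (!sim_store[i].in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return NULL;
    sim_store[free_slot].in_use = 1;
    strcpy(sim_store[free_slot].name, name);
    sim_store[free_slot].counter = 0;
    return &sim_store[free_slot].counter;
}

/* Release a named block. Returns 0 on success, -1 if not found. */
int sim_pm_free(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < SIM_MAX_BLOCKS; i++) {
        if (sim_store[i].in_use && strcmp(sim_store[i].name, name) == 0) {
            sim_store[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}

/*
 * One simulated process lifetime: returns 0 on a first run, 1 when the
 * counter shows that an earlier run already happened, -1 on failure.
 */
int sim_hello_run(void)
{
    int *counter = sim_pm_allocate("PM1");

    if (counter == NULL)
        return -1;
    if (*counter == 0) {
        (*counter)++;           /* first run: mark it, then "crash" */
        return 0;
    }
    sim_pm_free("PM1");         /* restarted: clean up before exiting */
    return 1;
}
```

Calling sim_hello_run() twice in a row returns 0 and then 1, which corresponds to the "Hello world!" run followed by the restarted run. A third call returns 0 again because the block was freed.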
The restartSpawn example comprises two processes: HR_parent.r and HR_child.r. Both processes must be compiled as supervisor processes (see "Putting It All Together: the restartSpawn Example Program" for an overview of the restartSpawn example).
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>     /* strcpy(), strcat(), strrchr() */
    #include <strings.h>
    #include <am/afexec.h>
    #include <pmm/chPmm.h>
    #include <exec/chModules.h>
    #include <hr/hr.h>
    #include <err.h>
    #include <errno.h>

    #define PM_MEDIUM "RAM"
    #define PM_NAME "PARENT_PM"
    #define MAX_LOOPS 8

    /*
     * Some static variables
     */
    char baseName[PATH_MAX];
    char last_global_data;

    /*
     * Declaration of objects that will be stored in persistent memory.
     * restarted: number of times the process has been restarted.
     * counter: number of times the process's main loop is run.
     */
    typedef struct _HR_Status {
        int restarted;
        int counter;
    } HR_Status;

    /*
     * Wait "sec" seconds.
     */
    void waitSec(int sec)
    {
        KnTimeVal delay;

        delay.tmSec = sec;
        delay.tmNSec = 0;
        (void) threadDelay(&delay);
    }

    /*
     * Create a child hot restartable process.
     * Start the child process only if the parent has
     * not been hot restarted.
     */
    void childCreate()
    {
        KnCap childCap;
        KnprocessPrivilege curActPriv;
        PmmName childName;
        int res;
        int childPid = -1;
        char path[PATH_MAX];
        char* argv[3];

        res = processPrivilege(K_MYprocess, &curActPriv, NULL);
        if (res != K_OK) {
            printf("processPrivilege failed, res=%d\n", res);
            HR_EXIT_HDL();
            exit(-1);
        }
        /* argv[0] uses the same binary name as the path built below. */
        if (curActPriv == K_SUPprocess) {
            argv[0] = "HR_child";
        } else {
            argv[0] = "HR_child_u";
        }
        argv[1] = NULL;
        argv[2] = NULL;

        strcpy(childName.medium, "RAM");
        strcpy(childName.name, "CHILD");

        strcpy(path, baseName);
        if (curActPriv == K_SUPprocess) {
            strcat(path, "HR_child");
        } else {
            strcat(path, "HR_child_u");
        }

        childPid = hrfexecv(&childName, path, &childCap, NULL, argv);
        if (childPid == -1) {
            printf("Cannot hrfexecv(%s), error=%d\n", path, errno);
            HR_EXIT_HDL();
            exit(-1);
        }
    }

    /*
     * Cause a hot restart by exiting without
     * first calling HR_EXIT_HDL().
     */
    void crash_exit()
    {
        printf("\nPARENT hot-restarts (exits with no HR_EXIT_HDL)!\n");
        exit(1);
    }

    /*
     * Cause a segmentation fault.
     */
    void crash_seg()
    {
        KnRgnDesc rgn;
        unsigned long* badSupPtr;
        int res;

        rgn.options = K_ANYWHERE | K_RESERVED;
        rgn.size = vmPageSize();
        rgn.opaque1 = NULL;
        rgn.opaque2 = 0;
        res = rgnAllocate(K_MYprocess, &rgn);
        if (res != K_OK) {
            printf("unable to allocate a page res=%d\n", res);
            return;
        }
        badSupPtr = (unsigned long*) rgn.startAddr;
        printf("\nPARENT crashes (segmentation fault)!\n");
        /*
         * Generate an unrecoverable page fault, since
         * VIRTUAL_ADDRESS_SPACE is true
         */
        *badSupPtr = (unsigned long) 0xffffffff;
        /*
         * This point should never be reached.
         */
        printf("Can't generate a crash\n");
        return;
    }

    /*
     * Cause a failure due to division by 0.
     * Note: This does not crash on some platforms.
     */
    int crash_div()
    {
        int i;
        int z;
        int x = 1;

        printf("\nPARENT tries to crash with division by 0!\n");
        for (i = 10; i > -1; i--) {
            z = x/i;
        }
        return z;
    }

    /*
     * Perform a site restart.
     */
    void site_restart()
    {
        char* argv[3];
        int res;

        argv[0] = "shutdown";
        argv[1] = "-i";
        argv[2] = "1";
        res = sysShutdown(3, argv);
        if (res) {
            printf("parent error=%d\n", res);
        } else {
            waitSec(5);
            printf("Timeout ! \n");
        }
    }

    /*
     * Kill the group processes and free persistent memory
     * blocks allocated by the parent process.
     */
    void clean_up(PmmName *np)
    {
        int res;
        int group = 1;
        int actId;

        actId = agetId();
        res = pmmFree(np);
        if (res != K_OK) {
            printf("\nCannot free the persistent memory block called %s."
                   " Error = %d\n", np->name, res);
            HR_EXIT_HDL();
            exit(-1);
        }
        printf("\nPersistent memory has been freed.\n");

        group = hrGetActorGroup(actId);
        if (group < 0) {
            /* errno is an int, so print it with %d. */
            printf("Cannot get process group. Error = %d\n", errno);
            HR_EXIT_HDL();
            exit(-1);
        }
        printf("Example finished. Exit.\n");
        res = hrKillGroup(group);
        if (res != K_OK) {
            printf("Cannot kill process group %d. Error = %d\n", group, res);
            HR_EXIT_HDL();
            exit(-1);
        }
    }

    /*
     * main
     */
    int main(int argc, char** argv, char** envp)
    {
        int res;
        int counter;
        int ref;
        int mem_version;    /* filled in by sysGetConf() */
        static PmmName name;
        HR_Status* st;
        char* endPath;
        KnprocessPrivilege curActPriv;

        /*
         * Check that argc != 0. Otherwise exit.
         */
        if (argc == 0) {
            printf("Cannot start this test. argc == %d. Exit.\n", argc);
            HR_EXIT_HDL();
            exit(-1);
        }

        res = processPrivilege(K_MYprocess, &curActPriv, NULL);
        if (res != K_OK) {
            printf("processPrivilege failed, res=%d\n", res);
            HR_EXIT_HDL();
            exit(-1);
        }
        if (curActPriv != K_SUPprocess) {
            printf("This example can only be run in supervisor mode. Exit.\n");
            HR_EXIT_HDL();
            exit(-1);
        }

        /*
         * If the example runs in flat memory mode, it will not work.
         * Some of the failures will not always cause a hot-restart.
         * Print an error message and exit.
         */
        res = sysGetConf(K_MODULE_MEM_NAME, K_GETCONF_VERSION, &mem_version);
        if (res != K_OK) {
            printf("Cannot get memory configuration."
                   " res=%d\n", res);
            HR_EXIT_HDL();
            exit(-1);
        }
        if (mem_version == K_MEM_VERSION_FLM) {
            printf("Sorry. The example cannot be run in flat memory"
                   " configuration. Exit.\n");
            HR_EXIT_HDL();
            exit(-1);
        }

        /*
         * Get the directory of the current process.
         */
        strcpy(baseName, argv[0]);
        endPath = strrchr(baseName, '/');
        *(endPath+1) = '\0';

        /*
         * Initialize the name and medium fields to identify
         * the HR_Status structure.
         */
        bzero(&name, sizeof(name));
        strcpy(name.medium, PM_MEDIUM);
        strcpy(name.name, PM_NAME);

        /*
         * Allocate or map the data in st in persistent memory.
         */
        res = pmmAllocate((VmAddr *)&st, &name, sizeof(HR_Status),
                          HR_GROUP_KEY, HR_GROUP_KEYSIZE);
        if (res != K_OK) {
            printf("Cannot allocate or map the persistent memory block called %s."
                   " Error = %d, errno=%d\n", name.name, res, errno);
            HR_EXIT_HDL();
            exit(-1);
        }

        /*
         * If the process has been restarted, print out a message.
         */
        if (st->restarted > 0) {
            printf("PARENT RESTARTS (%d-th time)\n", st->restarted);
        }

        /*
         * Increase the "restarted" counter.
         */
        st->restarted++;

        /*
         * Create a child hot-restartable process.
         */
        childCreate();

        /*
         * main loop
         * provokes different faults in the parent process.
         * This causes the parent AND the child to hot restart.
         */
        while (st->counter < MAX_LOOPS) {
            waitSec(2 + rand() % 2);
            st->counter++;
            ref = (st->counter % 5);
            switch (ref) {
            case 1:
                crash_seg();
                break;
            case 2:
                res = crash_div();
                /*
                 * If you get here, it means that division by 0 does not
                 * crash your system!
                 */
                printf("The parent process does not crash"
                       " with division by 0. Continue.\n");
                break;
            case 3:
                crash_exit();
                break;
            case 4:
                site_restart();
                break;
            default:
                break;
            }
        }

        /*
         * Example complete. Free persistent memory blocks and exit.
         */
        clean_up(&name);
        return 0;
    }
    #include <stdio.h>
    #include <stdlib.h>     /* exit(), atexit() */
    #include <string.h>     /* strcpy(), strcat() */
    #include <strings.h>
    #include <pmm/chPmm.h>
    #include <hr/hr.h>
    #include <exec/chExec.h>
    #include <pd/chPd.h>
    #include <errno.h>

    #define PM_MEDIUM "RAM"
    #define PM_NAME "CHILD_PM"
    #define MESSAGE_NAME "CHILD_MESSAGE"
    #define MESSAGE_SIZE 100

    typedef struct _HR_Status {
        int restarted;
        int checkpoint;
    } HR_Status;

    /*
     * Static variables
     */
    static HR_Status *st;

    /*
     * Wait "sec" seconds.
     */
    void waitSec(int sec)
    {
        KnTimeVal delay;

        delay.tmSec = sec;
        delay.tmNSec = 0;
        (void) threadDelay(&delay);
    }

    /*
     * General operations in all steps.
     */
    void gen_step(char** message, char* m_out)
    {
        strcat(*message, m_out);
        printf("%s", m_out);
        fflush(NULL);
        /*
         * st is stored in persistent memory.
         * If the process does not reach the end of the next instruction
         * before a hot restart, the current step will be repeated.
         */
        st->checkpoint = (st->checkpoint + 1) % 4;
    }

    /*
     * step1
     */
    void step1(char** message)
    {
        gen_step(message, " STEP 1 ");
    }

    /*
     * step2
     */
    void step2(char** message)
    {
        gen_step(message, " STEP 2 ");
    }

    /*
     * step3
     */
    void step3(char** message)
    {
        gen_step(message, " STEP 3 ");
    }

    /*
     * step4
     */
    void step4(char** message)
    {
        gen_step(message, " STEP 4 ");
        /*
         * Print out the entire message at the end of the cycle.
         * The entire message is printed even if the child process
         * is restarted during a cycle.
         *
         * =========== Message ===========
         *  STEP 1  STEP 2  STEP 3  STEP 4
         * ======== End of message========
         *
         * Note that output from the parent process may garble
         * this output.
         */
        printf("\n\n=========== Message ===========\n");
        printf("%s", *message);
        printf("\n======== End of message========\n\n");
        /*
         * Reset the message.
         */
        bzero(*message, MESSAGE_SIZE);
    }

    /*
     * Function to be executed before the process exits for any reason.
     */
    void before_exit()
    {
        printf("CHILD EXITS!\n");
    }

    /*
     * main
     */
    int main(int argc, char** argv, char** envp)
    {
        int res;
        int counter;
        static PmmName name;
        static PmmName m_name;
        size_t size;
        PdKey key;
        char* message;      /* points to a block in persistent memory */
        KnprocessPrivilege curActPriv;

        res = processPrivilege(K_MYprocess, &curActPriv, NULL);
        if (res != K_OK) {
            printf("processPrivilege failed, res=%d\n", res);
            HR_EXIT_HDL();
            exit(-1);
        }

        if (curActPriv == K_SUPprocess) {
            /*
             * Create a private process data key with a
             * destructor associated with it.
             */
            res = padKeyCreate(&key, (KnPdHdl)before_exit);
            if (res != 0) {
                printf("Couldn't create PD key. Exit with errno %d\n", errno);
                HR_EXIT_HDL();
                exit(-1);
            }
            res = padSet(K_MYprocess, key, "M");
            if (res != K_OK) {
                printf("Cannot set the PD key, error %d\n", res);
                HR_EXIT_HDL();
                exit(-1);
            }
        } else {
            res = atexit(&before_exit);
            /*
             * atexit() accepts up to 32 functions so this cannot fail.
             */
        }

        /*
         * Initialize the name and medium fields for the
         * HR_Status structure.
         */
        bzero(&name, sizeof(name));
        strcpy(name.medium, PM_MEDIUM);
        strcpy(name.name, PM_NAME);

        /*
         * Allocate or map the data in st in persistent memory.
         */
        res = pmmAllocate((VmAddr *)&st, &name, sizeof(HR_Status),
                          HR_GROUP_KEY, HR_GROUP_KEYSIZE);
        if (res != K_OK) {
            printf("Cannot allocate or map the persistent memory block called %s."
                   " Error = %d\n", name.name, res);
            HR_EXIT_HDL();
            exit(-1);
        }

        /*
         * Initialize the name and medium fields for the
         * message char buffer.
         */
        bzero(&m_name, sizeof(m_name));
        strcpy(m_name.medium, PM_MEDIUM);
        strcpy(m_name.name, MESSAGE_NAME);

        /*
         * Allocate or map the message data in persistent memory.
         */
        res = pmmAllocate((VmAddr *)&message, &m_name, MESSAGE_SIZE,
                          HR_GROUP_KEY, HR_GROUP_KEYSIZE);
        if (res != K_OK) {
            printf("Cannot allocate or map the persistent memory block called %s."
                   " Error = %d\n", m_name.name, res);
            HR_EXIT_HDL();
            exit(-1);
        }

        /*
         * If the process has been restarted, print out a message.
         */
        if (st->restarted > 0) {
            printf("CHILD RESTARTS (%d-th time)\n", st->restarted);
        }

        /*
         * Increase the "restarted" counter.
         */
        st->restarted++;

        /*
         * Loop forever.
         * Each time the parent process crashes, the child process will be
         * stopped with it since they belong
         * to the same group.
         */
        while (1) {
            waitSec(1);
            switch (st->checkpoint) {
            case 0:
                step1(&message);
                break;
            case 1:
                step2(&message);
                break;
            case 2:
                step3(&message);
                break;
            case 3:
                step4(&message);
                break;
            default:
                break;
            }
        }
        return 0;
    }
    SRCS = HR_child.c HR_parent.c

    SupprocessTarget(HR_child.r, HR_child.o, $(NUCLEUS_DIR)/lib/pmm/pmmlib.a)
    SupprocessTarget(HR_parent.r, HR_parent.o, $(NUCLEUS_DIR)/lib/pmm/pmmlib.a)

    Depend($(SRCS))