The Solaris fault management daemon (fmd) is the central point in Solaris for fault management. It receives observations from various sources and delivers them to subscribing diagnosis engines; if those diagnosis engines diagnose a problem, the fault manager publishes additional protocol
events to track the problem lifecycle from initial diagnosis through repair and final problem resolution. The event protocol is specified in the Sun Fault Management Event Protocol Specification. The interfaces described here allow an external process to subscribe to protocol events. See the Fault
Management Daemon Programmer's Reference Guide for additional information on fmd.
The fmd module API (not a Committed interface) allows plugin modules to load within the fmd process, subscribe to events of interest, and participate in various diagnosis and response activities. Of those modules, some are notification agents and will subscribe to events describing diagnoses
and their subsequent lifecycle and render these to console/syslog (for the syslog-msgs agent) and via SNMP trap and browsable MIB (for the snmp-trapgen module and the corresponding dlmod for the SNMP daemon). It has not been possible to subscribe to protocol
events outside of the context of an fmd plugin. The libfmevent interface provides this external subscription mechanism. External subscribers may receive protocol events as fmd modules do, but they cannot participate in other aspects of the fmd module API such as diagnosis. External
subscribers are therefore suitable as notification agents and for transporting fault management events.
Fault Management Protocol Events
This protocol is defined in the Sun Fault Management Event Protocol Specification. Note that while the API described on this manual page is Committed, the protocol events themselves (in class names and all event payload) are not Committed along with this API. The protocol specification
document describes the commitment level of individual event classes and their payload content. In broad terms, the list.* events are Committed in most of their content and semantics while events of other classes are generally Uncommitted with a few exceptions.
All protocol events include an identifying class string, with the hierarchies defined in the protocol document and individual events registered in the Events Registry. The libfmevent mechanism will permit subscription to events with Category 1 class of “list”
and “swevent”, that is, to classes matching patterns “list.*” and “swevent.*”.
All protocol events consist of a number of (name, datatype, value) tuples (“nvpairs”). Depending on the event class, various nvpairs are required and have well-defined meanings. In Solaris, fmd protocol events are represented as name-value lists using the libnvpair(3LIB) interfaces.
The API is simple to use in the common case (see Examples), but provides substantial control to cater for more-complex scenarios.
We obtain an opaque subscription handle using fmev_shdl_init(), quoting the ABI version and optionally nominating alloc(), zalloc() and free() functions (the defaults use the umem family). More
than one handle may be opened if desired. Each handle opened establishes a communication channel with fmd, the implementation of which is opaque to the libfmevent mechanism.
On a handle we may establish one or more subscriptions using fmev_shdl_subscribe(). Events of interest are specified using a simple wildcarded pattern which is matched against the event class of incoming events. For each match that is made a callback is performed to a
function we associate with the subscription, passing a nominated cookie to that function. Subscriptions may be dropped using fmev_shdl_unsubscribe() quoting exactly the same class or class pattern as was used to establish the subscription.
Each call to fmev_shdl_subscribe() creates a single thread dedicated to serving callback requests arising from this subscription.
An event callback handler has as arguments an opaque event handle, the event class, the event nvlist, and the cookie it was registered with in fmev_shdl_subscribe(). The timestamp for when the event was generated (not when it was received) is available as a struct
timespec with fmev_timespec(), or more directly with fmev_time_sec() and fmev_time_nsec(); an event handle and struct tm can also be passed to fmev_localtime() to fill the struct
tm. A high-resolution timestamp for an event may be retrieved using fmev_hrtime(); this value has the semantics described in gethrtime(3C).
The event handle, class string pointer, and nvlist_t pointer passed as arguments to a callback are valid for the duration of the callback. If the application wants to continue to process the event beyond the duration of the callback then it can hold the event with fmev_hold(),
and later release it with fmev_rele(). When the reference count drops to zero the event is freed.
In libfmevent.h an enumeration fmev_err_t of error types is defined. To render an error message string from an fmev_err_t use fmev_strerror(). An fmev_errno is defined which
returns the error number for the last failed libfmevent API call made by the current thread. You may not assign to fmev_errno.
If a function returns type fmev_err_t, then success is indicated by FMEV_SUCCESS (or FMEV_OK as an alias); on failure a FMEVERR_* value is returned (see fm/libfmevent.h).
If a function returns a pointer type then failure is indicated by a NULL return, and fmev_errno will record the error type.
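The two error-reporting styles described above can be modelled in a few lines of standard C. This is only a sketch: the `toy_*` names and error codes are hypothetical stand-ins rather than libfmevent definitions, and the use of C11 thread-local storage to give each thread its own error slot is an assumption about how such a per-thread value could be kept.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy error codes for illustration; real values live in fm/libfmevent.h. */
typedef enum { TOY_OK = 0, TOY_ERR_ALLOC, TOY_ERR_API } toy_err_t;

/* One error slot per thread (C11 thread-local storage). */
static _Thread_local toy_err_t toy_errno_slot = TOY_OK;

/* Read-only accessor: callers can inspect the slot but not assign to it. */
toy_err_t
toy_errno(void)
{
	return (toy_errno_slot);
}

/* Style 1: the return value is itself the error code. */
toy_err_t
toy_check_arg(const void *arg)
{
	return (arg == NULL ? TOY_ERR_API : TOY_OK);
}

/* Style 2: pointer return; NULL on failure with the error recorded. */
void *
toy_alloc(size_t sz)
{
	void *p = (sz == 0) ? NULL : malloc(sz);

	if (p == NULL)
		toy_errno_slot = (sz == 0) ? TOY_ERR_API : TOY_ERR_ALLOC;
	return (p);
}
```

A caller of the second style tests the returned pointer first and consults the per-thread error value only after a NULL return, since the slot is not reset by successful calls.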
A subscription handle is required in order to establish and manage subscriptions. This handle represents the abstract communication mechanism between the application and the fault management daemon running in the current zone.
A subscription handle is represented by the opaque fmev_shdl_t datatype. A handle is initialized with fmev_shdl_init() and quoted to subsequent API members.
To simplify usage of the API, subscription attributes for all subscriptions established on a handle are a property of the handle itself; they cannot be varied per-subscription. Where subscriptions need differing attributes, multiple handles must be used.
libfmevent ABI version
The first argument to fmev_shdl_init() indicates the libfmevent ABI version with which the handle is being opened. Specify either LIBFMEVENT_VERSION_LATEST to indicate the most recent version available at compile time or LIBFMEVENT_VERSION_1
(_2, etc. as the interface evolves) for an explicit choice.
Interfaces present in an earlier version of the interface will continue to be present with the same or compatible semantics in all subsequent versions. When additional interfaces and functionality are introduced the ABI version will be incremented. When an ABI version is chosen in fmev_shdl_init(),
only interfaces introduced in or before that version will be available to the application via that handle. Attempts to use later API members will fail with FMEVERR_VERSION_MISMATCH.
This manual page describes LIBFMEVENT_VERSION_1.
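The version gating rule can be pictured as a simple comparison between the version a handle was opened with and the version in which an interface first appeared. The sketch below is a model only; `toy_handle_t`, `toy_version_check()` and the `TOY_*` constants are hypothetical and say nothing about how the library stores or checks versions internally.

```c
#include <stdint.h>

/* Toy constants for illustration; not the library's definitions. */
#define	TOY_VERSION_1	1u
#define	TOY_VERSION_2	2u

typedef enum { TOY_OK = 0, TOY_ERR_VERSION_MISMATCH } toy_err_t;

/* Records the ABI version that was quoted when the handle was opened. */
typedef struct {
	uint32_t opened_version;
} toy_handle_t;

/*
 * An interface introduced in version `introduced` is usable only through
 * a handle opened at that version or a later one.
 */
toy_err_t
toy_version_check(const toy_handle_t *hdl, uint32_t introduced)
{
	if (introduced > hdl->opened_version)
		return (TOY_ERR_VERSION_MISMATCH);
	return (TOY_OK);
}
```

Under this model a handle opened at version 1 can use every version 1 interface, while a call that first appeared in version 2 is refused on that handle even if the installed library supports it.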
The libfmevent API is not least-privilege aware; you need to have all privileges to call fmev_shdl_init(). Once a handle has been initialized with fmev_shdl_init() a process can drop privileges down to the basic set and continue to
use fmev_shdl_subscribe() and other libfmevent interfaces on that handle.
Underlying Event Transport
The implementation of the event transport by which events are published from the fault manager and multiplexed out to libfmevent consumers is strictly private. It is subject to change at any time, and you should not encode any dependency on the underlying mechanism into
your application. Use only the API described on this manual page and in libfmevent.h.
The underlying transport mechanism is guaranteed to have the property that a subscriber may attach to it even before the fault manager is running. If the fault manager starts first then any events published before the first consumer subscribes will wait in the transport until a consumer appears.
The underlying transport will also have some maximum depth to the queue of events pending delivery. This may be hit if there are no consumers, or if consumers are not processing events quickly enough. In practice the rate of events is small. When this maximum depth is reached additional
events will be dropped.
The underlying transport has no concept of priority delivery; all events are treated equally.
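The queueing behaviour described above (first-in first-out, a fixed maximum depth, new events dropped once the queue is full) can be illustrated with a small ring buffer. This is a model under stated assumptions, not the private transport: the depth of 4, the integer event ids, and every `toy_*` name are invented for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define	TOY_QUEUE_DEPTH	4	/* arbitrary; the real limit is private */

typedef struct {
	int slots[TOY_QUEUE_DEPTH];
	size_t head;		/* index of the oldest pending event */
	size_t count;		/* number of pending events */
	size_t dropped;		/* events discarded because the queue was full */
} toy_queue_t;

/* Enqueue an event id; returns false (and counts a drop) when full. */
bool
toy_publish(toy_queue_t *q, int event_id)
{
	if (q->count == TOY_QUEUE_DEPTH) {
		q->dropped++;
		return (false);
	}
	q->slots[(q->head + q->count) % TOY_QUEUE_DEPTH] = event_id;
	q->count++;
	return (true);
}

/* Dequeue the oldest event; returns false when nothing is pending. */
bool
toy_deliver(toy_queue_t *q, int *event_id)
{
	if (q->count == 0)
		return (false);
	*event_id = q->slots[q->head];
	q->head = (q->head + 1) % TOY_QUEUE_DEPTH;
	q->count--;
	return (true);
}
```

The practical consequence for a consumer is the same as in the model: events are seen in publication order, and a slow or absent consumer loses the newest events rather than the oldest.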
Subscription Handle Initialization
Obtain a new subscription handle with fmev_shdl_init(). The first argument is the libfmevent ABI version to be used (see above). The remaining three arguments should be all NULL to leave the library to use its default
allocator functions (the libumem family), or all non-NULL to appoint wrappers to custom allocation functions if required.
The library does not support the version requested.
An error occurred in trying to allocate data structures.
The alloc(), zalloc(), or free() arguments must either be all NULL or all non-NULL.
Insufficient privilege to perform operation. In version 1 root privilege is required.
Internal library error.
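The all-or-nothing rule for the allocator arguments is easy to get wrong, so a sketch of the check may help. The hook types below are assumptions made for illustration (in particular the sized free hook taking a pointer and a length); consult fm/libfmevent.h for the real prototypes. The wrapper functions and `toy_allocators_valid()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed hook shapes, named here only for the sketch. */
typedef void *(*toy_alloc_fn)(size_t);
typedef void (*toy_free_fn)(void *, size_t);

/* Example wrappers around the C library allocator. */
void *
my_alloc(size_t sz)
{
	return (malloc(sz));
}

void *
my_zalloc(size_t sz)
{
	return (calloc(1, sz));
}

void
my_free(void *p, size_t sz)
{
	(void) sz;		/* size is not needed by free(3C) */
	free(p);
}

/* True when the three hooks are all NULL or all non-NULL. */
bool
toy_allocators_valid(toy_alloc_fn alloc, toy_alloc_fn zalloc,
    toy_free_fn freefn)
{
	int supplied = (alloc != NULL) + (zalloc != NULL) + (freefn != NULL);

	return (supplied == 0 || supplied == 3);
}
```

Supplying only one or two hooks is the case the third error description above refers to; a mixed set would leave memory allocated by one family to be freed by another.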
Fault Manager Authority Information
Once a subscription handle has been initialized, authority information for the fault manager to which the client is connected may be retrieved with fmev_shdl_getauthority(). The caller is responsible for freeing the returned nvlist using nvlist_free(3NVPAIR).
Subscription Handle Finalization
Close a subscription handle with fmev_shdl_fini(). This call must not be performed from within the context of an event callback handler, else it will fail with FMEVERR_API.
The fmev_shdl_fini() call will remove all active subscriptions on the handle and free resources used in managing the handle.
May not be called from event delivery context for a subscription on the same handle.
Subscribing To Events
To establish a new subscription on a handle, use fmev_shdl_subscribe(). Besides the handle argument you provide the class or class pattern to subscribe to (the latter permitting simple wildcarding using '*'), a callback function pointer for a function to be called for
all matching events, and a cookie to pass to that callback function.
The class pattern must match events per the fault management protocol specification, such as “list.suspect” or “list.*”. Patterns that do not map onto existing events will not be rejected - they just won't result in any callbacks.
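To get a feel for which classes a pattern selects, the matching can be approximated with fnmatch(3C). This is an assumption made for illustration: the library's own matcher is not specified here, and fnmatch() also accepts `?` and bracket expressions that the subscription interface may not. The helper name `class_matches()` is hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of libfmevent: returns true when an
 * event class would be selected by a subscription pattern, treating
 * '*' as a shell-style wildcard via fnmatch(3C).
 */
bool
class_matches(const char *pattern, const char *class)
{
	return (fnmatch(pattern, class, 0) == 0);
}
```

With this approximation, “list.*” selects “list.suspect” and “list.repaired” but not “swevent.” classes, and a pattern without a wildcard selects only the identical class string.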
A callback function has type fmev_cbfunc_t. The first argument is an opaque event handle for use in event access functions described below. The second argument is the event class string, and the third argument is the event nvlist; these could be retrieved using fmev_class()
and fmev_attr_list() on the event handle, but they are supplied as arguments for convenience. The final argument is the cookie requested when the subscription was established in fmev_shdl_subscribe().
Each call to fmev_shdl_subscribe() opens a new door into the process that the kernel uses for event delivery. Each subscription therefore uses one file descriptor in the process.
See below for more detail on event callback context.
Class pattern is NULL or callback function is NULL.
Class pattern is the empty string, or exceeds the maximum length of FMEV_MAX_CLASS.
An attempt to fmev_shdl_zalloc() additional memory failed.
Duplicate subscription request. Only one subscription for a given class pattern may exist on a handle.
A system-imposed limit on the maximum number of subscribers to the underlying transport mechanism has been reached.
An unknown error occurred in trying to establish the subscription.
An unsubscribe request using fmev_shdl_unsubscribe() must exactly match a previous subscription request or it will fail with FMEVERR_NOMATCH. The request stops further callbacks for this subscription, waits for any existing active callbacks to complete,
and drops the subscription.
Do not call fmev_shdl_unsubscribe() from event callback context, else it will fail with FMEVERR_API.
A NULL pattern was specified, or the call was attempted from callback context.
The pattern provided does not match any open subscription. The pattern must be an exact match.
The class pattern is the empty string or exceeds FMEV_MAX_CLASS.
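The bookkeeping rules for subscriptions (one subscription per pattern on a handle, and unsubscription by exact pattern string rather than by wildcard match) can be sketched with a small table. Everything here is a model: the table sizes, the `toy_*` functions and the `TOY_ERR_*` codes are hypothetical, and no claim is made about the library's internal data structures.

```c
#include <stddef.h>
#include <string.h>

#define	TOY_MAX_SUBS	8	/* arbitrary table size for the sketch */
#define	TOY_MAX_CLASS	64	/* arbitrary; stands in for FMEV_MAX_CLASS */

typedef enum {
	TOY_OK = 0, TOY_ERR_API, TOY_ERR_BADCLASS, TOY_ERR_DUPLICATE,
	TOY_ERR_NOMATCH, TOY_ERR_FULL
} toy_err_t;

typedef struct {
	char patterns[TOY_MAX_SUBS][TOY_MAX_CLASS];
	int used[TOY_MAX_SUBS];
} toy_subs_t;

/* Reject NULL, empty, and over-long patterns. */
static toy_err_t
toy_validate(const char *pat)
{
	if (pat == NULL)
		return (TOY_ERR_API);
	if (pat[0] == '\0' || strlen(pat) >= TOY_MAX_CLASS)
		return (TOY_ERR_BADCLASS);
	return (TOY_OK);
}

/* Exact string comparison: "list.*" does not find "list.suspect". */
static int
toy_find(const toy_subs_t *t, const char *pat)
{
	for (int i = 0; i < TOY_MAX_SUBS; i++) {
		if (t->used[i] && strcmp(t->patterns[i], pat) == 0)
			return (i);
	}
	return (-1);
}

toy_err_t
toy_subscribe(toy_subs_t *t, const char *pat)
{
	toy_err_t err = toy_validate(pat);

	if (err != TOY_OK)
		return (err);
	if (toy_find(t, pat) >= 0)
		return (TOY_ERR_DUPLICATE);
	for (int i = 0; i < TOY_MAX_SUBS; i++) {
		if (!t->used[i]) {
			(void) strcpy(t->patterns[i], pat);
			t->used[i] = 1;
			return (TOY_OK);
		}
	}
	return (TOY_ERR_FULL);
}

toy_err_t
toy_unsubscribe(toy_subs_t *t, const char *pat)
{
	toy_err_t err = toy_validate(pat);
	int i;

	if (err != TOY_OK)
		return (err);
	if ((i = toy_find(t, pat)) < 0)
		return (TOY_ERR_NOMATCH);
	t->used[i] = 0;
	return (TOY_OK);
}
```

In this model, subscribing to “list.*” and then unsubscribing from “list.suspect” fails with the no-match code, because the pattern is compared as a string and never expanded.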
Event Callback Context
Event callback context is defined as the duration of a callback event, from the moment we enter the registered callback function to the moment it returns. There are a few restrictions on actions that may be performed from callback context:
You can perform long-running actions, but this thread will not be available to service other event deliveries until you return.
You must not cause the current thread to exit.
You must not call either fmev_shdl_unsubscribe() or fmev_shdl_fini() for the subscription handle on which this callback has been made.
You can invoke fork(), popen(), etc.
A callback receives an fmev_t as a handle on the associated event. The callback may use the access functions described below to retrieve various event attributes.
By default, an event handle fmev_t is valid for the duration of the callback context. You cannot access the event outside of callback context.
If you need to continue to work with an event beyond the initial callback context in which it is received, you may place a “hold” on the event with fmev_hold(). When finished with the event, release it with fmev_rele(). These calls increment
and decrement a reference count on the event; when it drops to zero the event is freed. On initial entry to a callback the reference count is 1, and this is always decremented when the callback returns.
An alternative to fmev_hold() is fmev_dup(), which duplicates the event and returns a new event handle with a reference count of 1. When fmev_rele() is applied to the new handle and reduces the reference count to 0, the event is freed.
The advantage of fmev_dup() is that it allocates new memory to hold the event rather than continuing to hold a buffer provided by the underlying delivery mechanism. If your operation is going to be long-running, you may want to use fmev_dup() to avoid starving
the underlying mechanism of event buffers.
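The hold, release and duplicate semantics described above follow a conventional reference-counting pattern, sketched here with a toy event type. The real fmev_t is opaque, so `toy_event_t`, its fields and the `toy_*` functions are all hypothetical; the `freed_flag` member exists only so the sketch can show when the free happens.

```c
#include <stdlib.h>
#include <string.h>

/* Toy event for illustration; the real fmev_t is opaque. */
typedef struct toy_event {
	int refs;		/* reference count */
	int *freed_flag;	/* set to 1 when the event is freed */
	char payload[32];
} toy_event_t;

/* Create an event with the single reference owned by the callback. */
toy_event_t *
toy_event_new(const char *payload, int *freed_flag)
{
	toy_event_t *ev = calloc(1, sizeof (*ev));

	if (ev == NULL)
		return (NULL);
	ev->refs = 1;
	ev->freed_flag = freed_flag;
	(void) strncpy(ev->payload, payload, sizeof (ev->payload) - 1);
	return (ev);
}

/* Analogous to fmev_hold(): take an extra reference. */
void
toy_hold(toy_event_t *ev)
{
	ev->refs++;
}

/* Analogous to fmev_rele(): drop a reference, freeing at zero. */
void
toy_rele(toy_event_t *ev)
{
	if (--ev->refs == 0) {
		*ev->freed_flag = 1;
		free(ev);
	}
}

/* Analogous to fmev_dup(): independent copy with its own count of 1. */
toy_event_t *
toy_dup(const toy_event_t *ev, int *freed_flag)
{
	return (toy_event_new(ev->payload, freed_flag));
}
```

In the model, a callback that calls `toy_hold()` leaves the event alive after the implicit release at callback return, and the event is freed only by the matching `toy_rele()`. A duplicate has a separate lifetime, so releasing the original does not affect it.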
Given an fmev_t, a callback function can use fmev_ev2shdl() to retrieve the subscription handle on which the subscription was made that resulted in this event delivery.
The fmev_hold() and fmev_rele() functions always succeed.
The fmev_dup() function may fail and return NULL with fmev_errno of:
A NULL event handle was passed.
The fmev_shdl_alloc() call failed.
A delivery callback already receives the event class as an argument, so fmev_class() will only be of use outside of callback context (that is, for an event that was held or duped in callback context and is now being processed in an asynchronous handler). This is a convenience
function that returns the same result as accessing the event attributes with fmev_attr_list() and using nvlist_lookup_string(3NVPAIR)
to look up a string member of name “class”.
The string returned by fmev_class() is valid for as long as the event handle itself.
The fmev_class() function may fail and return NULL with fmev_errno of:
A NULL event handle was passed.
The event appears corrupted.
Event Attribute List
All events are defined as a series of (name, type) pairs. An instance of an event is therefore a series of tuples (name, type, value). Allowed types are defined in the protocol specification. In Solaris, and in libfmevent, an event is represented as an nvlist_t using
the libnvpair(3LIB) library.
The nvlist of event attributes can be accessed using fmev_attr_list(). The resulting nvlist_t pointer is valid for the same duration as the underlying event handle. You may look up members,
iterate over members, and so on using the libnvpair interfaces. Do not use nvlist_free() to free the nvlist.
The fmev_attr_list() function may fail and return NULL with fmev_errno of:
A NULL event handle was passed.
The event appears corrupted.
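For readers unfamiliar with name-value lists, the (name, type, value) tuple idea can be modelled in plain C as a tagged union with a lookup by name. This is a toy, not libnvpair: `toy_pair_t` and `toy_lookup_string()` are hypothetical, and the only convention borrowed from the real lookup functions is returning 0 on success and ENOENT when no member of that name and type exists.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for illustration only; libnvpair's real types are opaque. */
typedef enum { TOY_TYPE_STRING, TOY_TYPE_UINT64 } toy_type_t;

typedef struct {
	const char *name;
	toy_type_t type;
	union {
		const char *str;
		uint64_t u64;
	} value;
} toy_pair_t;

/*
 * Look up a string member by name.  Returns 0 and sets *out on success,
 * ENOENT if no pair has that name with string type.
 */
int
toy_lookup_string(const toy_pair_t *pairs, size_t n, const char *name,
    const char **out)
{
	for (size_t i = 0; i < n; i++) {
		if (pairs[i].type == TOY_TYPE_STRING &&
		    strcmp(pairs[i].name, name) == 0) {
			*out = pairs[i].value.str;
			return (0);
		}
	}
	return (ENOENT);
}
```

The returned string points into the list itself, which mirrors the lifetime rule above: it stays valid only as long as the list (and so the event) does.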
These functions refer to the time at which the event was originally produced, not the time at which it was forwarded to libfmevent or delivered to the callback.
Use fmev_timespec() to fill a struct timespec with the event time in seconds since the Epoch (tv_sec, signed integer) and nanoseconds past that second (tv_nsec, a signed long). This call can fail and return
FMEVERR_OVERFLOW if the seconds value will not fit in a signed 32-bit integer (as used in struct timespec tv_sec).
You can use fmev_time_sec() and fmev_time_nsec() to retrieve the same second and nanosecond values as uint64_t quantities.
The fmev_localtime() function takes an event handle and a struct tm pointer and fills that structure according to the timestamp. The result is suitable for use with strftime(3C). This call will return NULL and fmev_errno of FMEVERR_OVERFLOW under the same conditions as above.
The fmev_timespec() function cannot fit the seconds value into the signed long integer tv_sec member of a struct timespec.
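The overflow condition amounts to a range check when a 64-bit unsigned seconds count is narrowed to a signed 32-bit field. A minimal sketch of that check, assuming the 32-bit case described above; `toy_sec_to_int32()` is a hypothetical helper, not a library function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Store an unsigned 64-bit seconds count in a signed 32-bit field.
 * Returns false, leaving *out untouched, when the value does not fit.
 */
bool
toy_sec_to_int32(uint64_t sec, int32_t *out)
{
	if (sec > (uint64_t)INT32_MAX)
		return (false);
	*out = (int32_t)sec;
	return (true);
}
```

Callers that must handle timestamps beyond that range can avoid the narrowing altogether by working with the 64-bit values from fmev_time_sec() and fmev_time_nsec().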
A string can be duplicated using fmev_shdl_strdup(); this will allocate memory for the copy using the allocator nominated in fmev_shdl_init(). The caller is responsible for freeing the buffer using fmev_shdl_strfree(); the caller can
modify the duplicated string but must not change the string length.
An FMRI retrieved from a received event as an nvlist_t may be rendered as a string using fmev_shdl_nvl2str(). The nvlist must be a legal FMRI (recognized class, version and payload), or NULL is returned with fmev_errno of
FMEVERR_INVALIDARG. The formatted string is rendered into a buffer allocated using the memory allocation functions nominated in fmev_shdl_init(), and the caller is responsible for freeing that buffer using fmev_shdl_strfree().
The fmev_shdl_alloc(), fmev_shdl_zalloc(), and fmev_shdl_free() functions allocate and free memory using the choices made for the given handle when it was initialized, typically the libumem(3LIB) family if all were specified NULL.
Subscription Handle Control
The fmev_shdlctl_*() interfaces offer control over various properties of the subscription handle, allowing fine-tuning for particular applications. In the common case the default handle properties will suffice.
These properties apply to the handle and uniformly to all subscriptions made on that handle. The properties may only be changed when there are no subscriptions in place on the handle, otherwise FMEVERR_BUSY is returned.
Event delivery is performed through invocations of a private door. A new door is opened for each fmev_shdl_subscribe() call. These invocations occur in the context of a single private thread associated with the door for a subscription. Many of the fmev_shdlctl_*() interfaces
are concerned with controlling various aspects of this delivery thread.
If you have applied fmev_shdlctl_thrcreate(), “custom thread creation semantics” apply on the handle; otherwise “default thread creation semantics” are in force. Some fmev_shdlctl_*() interfaces apply only to default thread creation semantics.
The fmev_shdlctl_serialize() control requests that all deliveries on a handle, regardless of which subscription request they are for, be serialized - no concurrent deliveries on this handle. Without this control applied deliveries arising from each subscription established
with fmev_shdl_subscribe() are individually single-threaded, but if multiple subscriptions have been established then deliveries arising from separate subscriptions may be concurrent. This control applies to both custom and default thread creation semantics.
The fmev_shdlctl_thrattr() control applies only to default thread creation semantics. Threads that are created to service subscriptions will be created with pthread_create(3C)
using the pthread_attr_t provided by this interface. The attribute structure is not copied and so must persist for as long as it is in force on the handle.
The default thread attributes are also the minimum requirement: threads must be created PTHREAD_CREATE_DETACHED and PTHREAD_SCOPE_SYSTEM. A NULL pointer for the pthread_attr_t will reinstate these default attributes.
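An attribute structure meeting the minimum requirement can be prepared with standard POSIX calls before it is handed to the thread attribute control. The sketch below covers only that preparation; `build_delivery_attr()` is a hypothetical helper name, and any additional attributes (stack size, for example) would be set the same way.

```c
#include <pthread.h>

/*
 * Initialize *attr with the minimum attributes named above: detached
 * state and system contention scope.  Returns 0 or an errno value.
 * On success the caller eventually calls pthread_attr_destroy().
 */
int
build_delivery_attr(pthread_attr_t *attr)
{
	int err;

	if ((err = pthread_attr_init(attr)) != 0)
		return (err);
	if ((err = pthread_attr_setdetachstate(attr,
	    PTHREAD_CREATE_DETACHED)) == 0)
		err = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM);
	if (err != 0)
		(void) pthread_attr_destroy(attr);
	return (err);
}
```

Because the structure is not copied, it is safest to keep it in static or otherwise long-lived storage rather than on the stack of the function that configures the handle.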
The fmev_shdlctl_sigmask() control applies only to default thread creation semantics. Threads that are created to service subscriptions will be created with the requested signal set masked - a pthread_sigmask(3C) request to SIG_SETMASK to this mask prior to pthread_create(). The default is to mask all signals except SIGABRT.
See door_xcreate(3C) for a detailed description of thread setup and creation functions for door server threads.
The fmev_shdlctl_thrsetup() function runs in the context of the newly-created thread before it binds to the door created to service the subscription. It is therefore a suitable place to perform any thread-specific operations the application may require. This control applies
to both custom and default thread creation semantics.
Using fmev_shdlctl_thrcreate() forfeits the default thread creation semantics described above. The function appointed is responsible for all of the tasks required of a door_xcreate_server_func_t in door_xcreate().
The fmev_shdlctl_*() functions may fail and return NULL with fmev_errno of:
Subscriptions are in place on this handle.
Example 1 Subscription example
The following example subscribes to list.suspect events and prints out a simple message for each one that is received. It forgoes most error checking for the sake of clarity.
#include <fm/libfmevent.h>
#include <libnvpair.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
/* Callback to receive list.suspect events */
static void
mycb(fmev_t ev, const char *class, nvlist_t *attr, void *cookie)
{
    struct tm tm;
    char buf[64], *evcode = "unknown";	/* fallback if "code" is absent */
    if (strcmp(class, "list.suspect") != 0)
        return; /* only happens if this code has a bug! */
    (void) strftime(buf, sizeof (buf), NULL, fmev_localtime(ev, &tm));
    (void) nvlist_lookup_string(attr, "code", &evcode);
    (void) fprintf(stderr, "Event class %s published at %s, "
        "event code %s\n", class, buf, evcode);
}
int
main(int argc, char *argv[])
{
    fmev_shdl_t hdl;
    sigset_t set;
    int sig;
    hdl = fmev_shdl_init(LIBFMEVENT_VERSION_LATEST, NULL, NULL, NULL);
    (void) fmev_shdl_subscribe(hdl, "list.suspect", mycb, NULL);
    /* Wait here until signalled with SIGTERM to finish */
    (void) sigemptyset(&set);
    (void) sigaddset(&set, SIGTERM);
    (void) sigwait(&set, &sig);	/* POSIX two-argument form */
    /* fmev_shdl_fini would do this for us if we skipped it */
    (void) fmev_shdl_unsubscribe(hdl, "list.suspect");
    (void) fmev_shdl_fini(hdl);
    return (0);
}