Sun Cluster Data Service for Kerberos Guide for Solaris OS

Tuning the Sun Cluster HA for Kerberos Fault Monitor

The Sun Cluster HA for Kerberos fault monitor is contained in the resource that represents Kerberos. You create this resource when you register and configure Sun Cluster HA for Kerberos. For more information, see Registering and Configuring Sun Cluster HA for Kerberos.

System properties and extension properties of this resource control the behavior of the fault monitor. The default values of these properties determine the preset behavior of the fault monitor. The preset behavior should be suitable for most Sun Cluster installations. Therefore, you should tune the Sun Cluster HA for Kerberos fault monitor only if you need to modify this preset behavior.

Tuning the Sun Cluster HA for Kerberos fault monitor involves the following tasks:

Setting the interval between fault monitor probes
Setting the timeout for fault monitor probes
Defining the criteria for persistent faults
Specifying the failover behavior of a resource

Perform these tasks when you register and configure Sun Cluster HA for Kerberos. For more information, see Registering and Configuring Sun Cluster HA for Kerberos.

The information that you need to tune the Sun Cluster HA for Kerberos fault monitor is provided in the following subsection.

Operations by the Fault Monitor During a Probe

The probe checks whether kadmind(1M) and krb5kdc(1M) are listening on their respective ports. During more thorough probing, a new principal is created and authenticated, and this principal then fetches its own record from the database to test the administrative daemon, kadmind.
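The port check described above can be illustrated with a minimal Python sketch. This is not the fault monitor's actual implementation. The function names are hypothetical, and the port numbers are an assumption: 88 and 749 are the conventional ports for the KDC and kadmind, but a given installation may use others.

```python
import socket

# Assumed defaults: 88 is the conventional KDC port and 749 the conventional
# kadmind port. A real deployment may configure different ports.
KERBEROS_PORTS = {"krb5kdc": 88, "kadmind": 749}


def port_is_listening(host: str, port: int, timeout: float) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def cheap_probe(host: str, ports: dict, timeout: float) -> dict:
    """Check each daemon's port and report which ones accepted a connection."""
    return {name: port_is_listening(host, port, timeout)
            for name, port in ports.items()}
```

Calling cheap_probe("localhost", KERBEROS_PORTS, timeout) returns a dictionary that maps each daemon name to True or False. The timeout argument plays the role that Probe_timeout plays for the real probe.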

The probe executes the following steps.

Probe the ports for kadmind(1M) and krb5kdc(1M) to make sure that they are listening. Run the probe command by using the timeout value that the resource property Probe_timeout specifies. This probe runs once every Cheap_probe_interval, which is 30 seconds by default.
Once every Thorough_probe_interval (300 seconds by default), use kadmin.local(1M) to add a principal, run kinit(1) with the newly created principal, and then run kadmin(1M) as that principal to retrieve its record from the principal database.
Each of these probe commands either succeeds or fails. If Kerberos responds successfully, the probe returns to its infinite loop and waits for the next probe time.

If a probe command fails, the probe treats this as a failure of the Kerberos data service and records the failure in its history. The Kerberos probe treats every failure as a complete failure.
Based on the success or failure history, a failure can cause a local restart or a data service failover. Refer to Tuning Fault Monitors for Sun Cluster Data Services in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for further details.
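The probe cadence and the history-based decision in the steps above can be sketched in Python as follows. This is an illustrative model only, not the fault monitor's code. The two intervals are the defaults stated above. The FailureHistory class and its retry_count and retry_interval parameters are assumptions introduced for illustration, as is the rule that more than retry_count failures inside the window trigger a failover. The authoritative policy is the one described in the referenced guide.

```python
from collections import deque

CHEAP_PROBE_INTERVAL = 30       # seconds, default stated in the steps above
THOROUGH_PROBE_INTERVAL = 300   # seconds, default stated in the steps above


def probes_due(elapsed: int) -> list:
    """Return which probes fall due at `elapsed` seconds after monitor start."""
    due = []
    if elapsed > 0 and elapsed % CHEAP_PROBE_INTERVAL == 0:
        due.append("cheap")
    if elapsed > 0 and elapsed % THOROUGH_PROBE_INTERVAL == 0:
        due.append("thorough")
    return due


class FailureHistory:
    """Sliding window of failure timestamps (hypothetical helper)."""

    def __init__(self, retry_count: int, retry_interval: int):
        # Assumed tuning knobs: tolerate up to retry_count failures
        # within any retry_interval seconds before failing over.
        self.retry_count = retry_count
        self.retry_interval = retry_interval
        self.failures = deque()

    def record_failure(self, now: int) -> str:
        """Record a complete failure at time `now` and return the action."""
        self.failures.append(now)
        # Forget failures that are older than the window.
        while self.failures and now - self.failures[0] > self.retry_interval:
            self.failures.popleft()
        if len(self.failures) > self.retry_count:
            return "failover"
        return "restart"
```

Under these assumptions, a history created with retry_count=2 and retry_interval=600 answers "restart" for failures at 0 and 100 seconds and "failover" for a third failure at 200 seconds. A failure that arrives after the earlier ones have aged out of the window yields "restart" again.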