Probing Algorithm and Functionality

The behavior of the HA for Samba fault monitor is governed by extension properties that set the probing frequency. The default values of these properties define the preset behavior of the fault monitor, which should be suitable for most Oracle Solaris Cluster installations. Tune the HA for Samba fault monitor only if you need to change this preset behavior. Tuning involves the following tasks:

Setting the interval between fault monitor probes (Thorough_probe_interval)
Setting the time-out for fault monitor probes (Probe_timeout)
Setting the number of times the fault monitor attempts to restart the resource (Retry_count)

The HA for Samba fault monitor checks the smbd, nmbd, and winbindd components in an infinite loop. During each cycle, the fault monitor checks the relevant component and reports either a failure or success.

If the fault monitor is successful, it returns to its infinite loop and continues the next cycle of probing and sleeping.
If the fault monitor reports a failure, it requests that the cluster restart the resource. Each subsequent failure that the fault monitor reports results in another restart request.

If the number of successive restarts exceeds the value of the Retry_count property within the time that is set by the Thorough_probe_interval property, a request is made to fail over the resource group to a different node.
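The restart and failover rule described above can be sketched as a small decision function. This is only an illustrative model, not the fault monitor's actual code: the function name next_action, the deque of timestamps, and the window_seconds parameter are hypothetical, and the counting window is assumed to be the interval that the text above names.

```python
from collections import deque


def next_action(restart_times, now, retry_count, window_seconds):
    """Decide what to request after a probe failure observed at `now`.

    restart_times: deque of timestamps (seconds) of earlier requests.
    Returns "restart" or "failover". Hypothetical helper for illustration.
    """
    # Forget requests that are older than the counting window.
    while restart_times and now - restart_times[0] > window_seconds:
        restart_times.popleft()
    restart_times.append(now)
    # More requests than Retry_count inside the window: fail over instead.
    if len(restart_times) > retry_count:
        return "failover"
    return "restart"


# Example: Retry_count of 2 and an assumed window of 300 seconds.
history = deque()
decisions = [next_action(history, t, 2, 300) for t in (0, 60, 120, 1000)]
```

In this example the first two failures lead to restart requests, the third failure inside the window leads to a failover request, and the failure at 1000 seconds is counted afresh because the earlier requests have aged out of the window.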

Operations of the winbind Probe

The winbind fault monitor periodically checks that the fault monitor user can be retrieved by using the getent passwd samba-fault-monitor-user command.

Note - The winbindd daemon resolves user and group information as a service to the name service switch. When running winbindd, the name service cache daemon must be turned off. To disable this daemon, see Step 6 in How to Configure the Samba Software.
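The winbind check can be approximated outside the cluster framework with a short script that runs the same getent passwd lookup. This is a minimal sketch, not the agent's probe: the function name, the injectable run parameter, and the 10-second time-out are assumptions made for illustration.

```python
import subprocess


def fault_monitor_user_resolves(user, run=subprocess.run):
    """Return True if `getent passwd <user>` returns an entry.

    `run` is injectable so that the check can be exercised without a
    live name service (an assumption for testing, not part of the agent).
    """
    result = run(
        ["getent", "passwd", user],
        capture_output=True,
        text=True,
        timeout=10,  # assumed limit; the real probe honors Probe_timeout
    )
    # getent exits with a nonzero status when the key cannot be found.
    return result.returncode == 0 and result.stdout.strip() != ""
```

Calling fault_monitor_user_resolves with the name of the fault monitor user reports whether the name service switch, and therefore winbindd, can currently resolve that account.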

Operations of the Samba Probe

The Samba probe checks the nmbd daemon by using the nmblookup program for each interface that is specified within the smb.conf file.

The Samba probe checks the smbd daemon by using the smbclient program, together with the samba-fault-monitor-user, to access the scmondir share.

If the smbclient program cannot connect, network or server issues might be causing the failure. These errors might be transient and correctable within a few seconds. Therefore, before the probe declares a failure, the smbclient program is retried for a period equal to 85% of the Probe_timeout property setting, less 15 seconds. The 15 seconds is approximately the time that the first smbclient failure takes to time out. Retrying is realistic only if the Probe_timeout setting is at least 30 seconds. If the Probe_timeout setting is less than 30 seconds, the smbclient program is tried only once.
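The retry window described above is simple arithmetic, worked through in the following sketch. The function name and the use of None to mean "no retry" are illustrative choices, not part of the product.

```python
def smbclient_retry_window(probe_timeout):
    """Return the seconds available for retrying smbclient, or None.

    Follows the rule in the text: 85% of Probe_timeout, less about
    15 seconds for the first failed attempt. Below 30 seconds the
    program is tried only once, so no retry window exists.
    """
    first_attempt_allowance = 15  # approximate time-out of the first failure
    if probe_timeout < 30:
        return None
    return probe_timeout * 85 / 100 - first_attempt_allowance


# Probe_timeout of 30 seconds: 25.5 - 15 = 10.5 seconds for retries.
# Probe_timeout of 60 seconds: 51.0 - 15 = 36.0 seconds for retries.
windows = {t: smbclient_retry_window(t) for t in (20, 30, 60)}
```

At the 30-second minimum, only about 10.5 seconds remain for retries, which shows why shorter Probe_timeout settings leave no practical room for a second attempt.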