Sun Cluster 3.1 Data Services Developer's Guide

Returning From svc_start

Even when svc_start returns successfully, it is possible the underlying application failed to start. Therefore, svc_start must probe the application to verify that it is running before returning a success message. The probe must also take into account that the application might not be immediately available because it takes some time to start up. The svc_start method calls svc_wait, which is defined in xfnts.c, to verify the application is running, as follows.

Example 8–7

   /* Wait for the service to start up fully */
   scds_syslog_debug(DBG_LEVEL_HIGH,
       "Calling svc_wait to verify that service has started.");

   rc = svc_wait(scds_handle);

   scds_syslog_debug(DBG_LEVEL_HIGH,
       "Returned from svc_wait");

   if (rc == 0) {
      scds_syslog(LOG_INFO, "Successfully started the service.");
   } else {
      scds_syslog(LOG_ERR, "Failed to start the service.");
   }

The svc_wait method calls scds_get_netaddr_list(3HA) to obtain the network-address resources needed to probe the application, as follows.

Example 8–8

   /* obtain the network resource to use for probing */
   if (scds_get_netaddr_list(scds_handle, &netaddr)) {
      scds_syslog(LOG_ERR,
          "No network address resources found in resource group.");
      return (1);
   }

   /* Return an error if there are no network resources */
   if (netaddr == NULL || netaddr->num_netaddrs == 0) {
      scds_syslog(LOG_ERR,
          "No network address resource in resource group.");
      return (1);
   }

Then svc_wait obtains the start_timeout and probe_timeout values, as follows.

Example 8–9

   svc_start_timeout = scds_get_rs_start_timeout(scds_handle);
   probe_timeout = scds_get_ext_probe_timeout(scds_handle);

To account for the time the server might take to start up, svc_wait calls scds_svc_wait and passes a timeout value equivalent to three percent of the start_timeout value. Then svc_wait calls svc_probe to verify that the application has started. The svc_probe method makes a simple socket connection to the server on the specified port. If it fails to connect to the port, svc_probe returns a value of 100, indicating total failure. If the connection succeeds but the disconnect from the port fails, svc_probe returns a value of 50, indicating partial failure.
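The timeout arithmetic and the probe status values described above can be sketched in a few lines of C. This is a minimal illustration, not code from xfnts.c: the helper names initial_wait_seconds and classify_probe and the PROBE_* constants are hypothetical, and only the values 3, 50, and 100 come from the text.

```c
/* Hypothetical names; only the numeric values come from the guide. */
#define SVC_WAIT_PCT        3    /* percent of start_timeout to wait first */
#define PROBE_OK            0    /* connect and disconnect both succeeded */
#define PROBE_PARTIAL_FAIL  50   /* connected, but the disconnect failed */
#define PROBE_TOTAL_FAIL    100  /* could not connect at all */

/* Initial wait, in seconds, before the first probe (integer arithmetic). */
static int
initial_wait_seconds(int start_timeout)
{
    return ((start_timeout * SVC_WAIT_PCT) / 100);
}

/* Map the outcome of the connect and disconnect steps to a probe status. */
static int
classify_probe(int connect_ok, int disconnect_ok)
{
    if (!connect_ok)
        return (PROBE_TOTAL_FAIL);
    if (!disconnect_ok)
        return (PROBE_PARTIAL_FAIL);
    return (PROBE_OK);
}
```

With a start_timeout of 300 seconds, initial_wait_seconds returns 9. Because the division is integer division, any start_timeout below 34 seconds yields an initial wait of 0.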

On failure or partial failure of svc_probe, svc_wait calls scds_svc_wait with a timeout value of 5 seconds, which limits the probes to one every five seconds. The scds_svc_wait method also counts the number of attempts to start the service. If the number of attempts exceeds the value of the Retry_count property of the resource within the period specified by the Retry_interval property of the resource, the scds_svc_wait method returns failure. In this case, the svc_start method also returns failure.

Example 8–10

#define    SVC_CONNECT_TIMEOUT_PCT    95
#define    SVC_WAIT_PCT       3
   if (scds_svc_wait(scds_handle, (svc_start_timeout * SVC_WAIT_PCT)/100)
      != SCHA_ERR_NOERR) {

      scds_syslog(LOG_ERR, "Service failed to start.");
      return (1);
   }

   do {
      /*
       * probe the data service on the IP address of the
       * network resource and the portname
       */
      rc = svc_probe(scds_handle,
          netaddr->netaddrs[0].hostname,
          netaddr->netaddrs[0].port_proto.port, probe_timeout);
      if (rc == SCHA_ERR_NOERR) {
         /* Success. Free up resources and return */
         scds_free_netaddr_list(netaddr);
         return (0);
      }

      /*
       * Call scds_svc_wait() so that if service fails too
       * many times, we give up and return early.
       */
      if (scds_svc_wait(scds_handle, SVC_WAIT_TIME)
         != SCHA_ERR_NOERR) {
         scds_syslog(LOG_ERR, "Service failed to start.");
         return (1);
      }

   /* Rely on RGM to timeout and terminate the program */
   } while (1);
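The retry accounting that the text attributes to scds_svc_wait can be illustrated with a small stand-alone model. This sketch is an assumption about the bookkeeping only. It is not the DSDL implementation, and the retry_window_t type, the retry_window_record function, and the MAX_TRACKED limit are hypothetical names. The model counts start attempts inside a sliding window of retry_interval seconds and reports when the count exceeds retry_count.

```c
#include <stddef.h>

#define MAX_TRACKED 64   /* hypothetical cap on remembered attempts */

typedef struct {
    int    retry_count;            /* models the Retry_count property */
    long   retry_interval;         /* models Retry_interval, in seconds */
    long   attempts[MAX_TRACKED];  /* timestamps of recent start attempts */
    size_t nattempts;
} retry_window_t;

/*
 * Record a start attempt at time `now` (in seconds). Returns 1 if the
 * number of attempts inside the last retry_interval seconds now exceeds
 * retry_count, and 0 otherwise.
 */
static int
retry_window_record(retry_window_t *w, long now)
{
    size_t i, kept = 0;

    /* Drop attempts that have aged out of the window. */
    for (i = 0; i < w->nattempts; i++) {
        if (now - w->attempts[i] < w->retry_interval)
            w->attempts[kept++] = w->attempts[i];
    }
    w->nattempts = kept;

    /* Remember this attempt, discarding the oldest if the table is full. */
    if (w->nattempts == MAX_TRACKED) {
        for (i = 1; i < MAX_TRACKED; i++)
            w->attempts[i - 1] = w->attempts[i];
        w->nattempts--;
    }
    w->attempts[w->nattempts++] = now;

    return ((int)w->nattempts > w->retry_count);
}
```

In this model, with retry_count set to 2 and retry_interval set to 300, a third attempt inside 300 seconds makes retry_window_record return 1. A caller such as svc_wait would treat that as the signal to stop probing and return failure.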

Note –

Before it exits, the xfnts_start method calls scds_close to reclaim resources allocated by scds_initialize. See The scds_initialize Call and the scds_close(3HA) man page for details.