Sun Cluster 2.2 API Developer's Guide

Chapter 2 Sample Data Service

This chapter describes the Sun Cluster Data Services API sample application, in.named. The in.named daemon is the Solaris implementation of the Internet Domain Name Service (DNS).

Overview

The sample application described in this chapter demonstrates how to make a data service application highly available. This sample is for illustrative purposes only. There is no guarantee that this particular application will be highly available.

This chapter assumes that you have read the Sun Cluster Data Services API man pages describing hareg(1M) and haget(1M).

This sample application demonstrates many, but not all, of the features included in the API. The sections that follow describe the aspects of the sample application worth noting.

Setting Up the Sample Application

The in.named data service uses only one logical host, even when the underlying cluster has more than one logical host. The method implementations determine dynamically which logical host is being used. For example, if the hahost1 logical host is used, then the in.named data is placed on the hahost1 diskset.

An administrator can place the boot file (the argument to the -b flag) on any file system in the diskset, depending on which file system has space. However, the HA-in.named method implementations need a fixed starting point from which to find the boot file. The sample application places this starting point in the hainnamed subdirectory of the administrative file system, in a configuration file named hainnamed.config. This file contains a single directory name, which identifies the directory elsewhere on the logical host's multihosted disks where the data actually resides. The configuration file is thus a level of indirection.

For our hahost1 logical host, the path name for the file hainnamed.config is /hahost1/hainnamed/hainnamed.config

In general, the path name for an arbitrary logical host would be /loghost/hainnamed/hainnamed.config

The HA-in.named methods determine dynamically which logical host HA-in.named uses by testing each logical host for the presence or absence of this configuration file.

For example, if file systems A1 through A5 reside on the hahost1 diskset, and the administrator chooses to locate the HA-in.named data in the directory /hahost1/A1/hainnamed, then the hainnamed.config file must contain that directory name.
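The indirection through hainnamed.config can be sketched as two small Bourne shell functions. This is an illustrative sketch only and is not part of the sample application: the function names write_hainnamed_config and hainnamed_data_dir are hypothetical, and the administrative file system path is passed in as an argument rather than obtained from haget(1M).

```sh
# Hypothetical helpers illustrating the hainnamed.config indirection.

# write_hainnamed_config ADMIN_FS DATA_DIR
# Record DATA_DIR as the single line of ADMIN_FS/hainnamed/hainnamed.config.
write_hainnamed_config() {
	mkdir -p "$1/hainnamed" || return 1
	echo "$2" > "$1/hainnamed/hainnamed.config"
}

# hainnamed_data_dir ADMIN_FS
# Print the data directory named in the configuration file.
# Fail if the configuration file is missing or empty.
hainnamed_data_dir() {
	if [ ! -s "$1/hainnamed/hainnamed.config" ]; then
		return 1
	fi
	cat "$1/hainnamed/hainnamed.config"
}
```

With the example values used in this chapter, an administrator would run write_hainnamed_config /hahost1 /hahost1/A1/hainnamed once, and a method would later call hainnamed_data_dir /hahost1 to locate the named.boot directory.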

In the /hahost1/A1/hainnamed directory, the administrator must create a named.boot file for in.named. (See the in.named(1M) man page for information about the contents of the named.boot file.) The administrator updates the in.named database by editing the named.boot file in this directory, just as he or she would edit the /etc/named.boot file in a non-HA in.named configuration. See "Administering HA-in.named: Updating the Database" for additional discussion of administration and updates.

Basic Functionality of the in.named Method Implementations

Consider the basic functionality of the HA-in.named method implementations. The start method is not registered in this case, and all the work is accomplished in the start_net method. Similarly, the stop method is not registered for HA-in.named, and all the work is accomplished in the stop_net method. The start_net method starts up the in.named daemon, and the stop_net method kills the in.named daemon by sending it a TERM signal.

The Sun Cluster API requires each method to be idempotent--that is, repeated calls on a method must have the same effect as a single call on that method. For HA-in.named, the idempotency is achieved by having each method test whether its work has already been accomplished. That is, start_net tests whether the in.named daemon is already running, and stop_net tests whether the in.named daemon is already stopped.

The Sun Cluster process monitor facility consists of two components, the pmfadm(1M) command and the rpc.pmfd(1M) process monitor daemon. In the sample application, the pmfadm(1M) command is used to start and kill the in.named daemon, and to query whether the in.named daemon is already running. See the pmfadm(1M) and rpc.pmfd(1M) man pages for details.
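The idempotency test and the pmfadm(1M) calls can be isolated in a short sketch. The pmfadm invocations are the ones used by the sample methods in this chapter. The function names start_innamed and stop_innamed and the PMFADM variable are hypothetical additions that allow the command to be replaced by a stub outside a cluster.

```sh
# Sketch of idempotent start and stop built on the pmfadm calls used
# by the sample methods. PMFADM is a hypothetical override hook.
PMFADM=${PMFADM:-pmfadm}

# Start in.named only if the process monitor does not already know it.
start_innamed() {
	if $PMFADM -q hainnamed >/dev/null 2>&1; then
		return 0		# already running: nothing to do
	fi
	$PMFADM -c hainnamed -n 4 -t 60 /usr/sbin/in.named -b named.boot
}

# Stop in.named only if the process monitor still knows it.
stop_innamed() {
	if $PMFADM -q hainnamed >/dev/null 2>&1; then
		$PMFADM -s hainnamed TERM
		return $?
	fi
	return 0			# already stopped: nothing to do
}
```

Calling either function a second time finds the work already done and returns success without repeating it, which is the behavior the API requires of each method.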

The HA-in.named method implementations use the haget(1M) utility program to extract information about the Sun Cluster configuration. (See the haget(1M) man page for details.) The method implementations log their error messages to syslog(3), because the code runs without user attendance. They use the same syslog facility that Sun Cluster uses. Determine the syslog facility name by calling haget(1M) with the option -f syslog_facility.
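The logging convention can be wrapped in a small helper so that every message carries the method name and uses the cluster's syslog facility. This is a sketch, not part of the sample: the log_err function and the LOGGER variable are hypothetical, and the helper assumes the caller has already set ARGV0 and SYSLOG_FACILITY in the way the sample methods do.

```sh
# Hypothetical helper: log an error through the Sun Cluster syslog facility.
# The caller is expected to have set, as the sample methods do:
#	ARGV0=`basename $0`
#	SYSLOG_FACILITY=`haget -f syslog_facility`
# LOGGER is an assumption added so the command can be substituted.
LOGGER=${LOGGER:-logger}

log_err() {
	$LOGGER -p "${SYSLOG_FACILITY}.err" "${ARGV0}: $*"
}
```

A method would then report a failure with a call such as log_err "directory $HA_INNAMED_DIR missing or not mounted".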

start_net Method for the in.named Data Service

The following is a sample start_net method for the in.named data service.


#! /bin/sh
#
# Copyright 13 Apr 1996 Sun Microsystems, Inc. All Rights Reserved.
#
#ident	"@(#)innamed_start_net.sh	1.1	96/04/13 SMI"
#
# HA-in.named start_net method
#
ARGV0=`basename $0`

SYSLOG_FACILITY=`haget -f syslog_facility`
MASTERED_LOGICAL_HOSTS="$1"

if [ -z "$MASTERED_LOGICAL_HOSTS" ]; then
	# This physical host does not currently master any logical hosts.
	exit 0
fi

# Replace comma with space to form an sh word list:
MASTERED_LOGICAL_HOSTS="`echo $MASTERED_LOGICAL_HOSTS | tr ',' ' '`"

# Dynamically search the list of logical hosts which this physical
# host currently masters, to see if one of them is the logical host
# that HA-in.named uses.
MYLH=
for LH in $MASTERED_LOGICAL_HOSTS ; do
	# Map logical hostname to administrative file system name:
	PATHPREFIX_FS=`haget -f pathprefix $LH`
	CONFIG="${PATHPREFIX_FS}/hainnamed/hainnamed.config"
	if [ -f $CONFIG ]; then
		MYLH=$LH
		break
	fi
done

if [ -z "$MYLH" ]; then
	# This host does not currently master the logical host
	# that HA-in.named uses.
	exit 0
fi

# This host currently masters the logical host that HA-in.named uses, $MYLH.
# See if in.named is already running; if so, exit. (We must have
# started it on some earlier cluster reconfiguration when this
# physical host first took over mastery of the $MYLH logical host.)
# We determine whether in.named is already running by using the pmfadm
# command to query its status: if the query succeeds, it is already
# running.
if pmfadm -q hainnamed >/dev/null 2>&1 ; then
	exit 0
fi

HA_INNAMED_DIR="`cat $CONFIG`"
if [ ! -d $HA_INNAMED_DIR ]; then
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: directory $HA_INNAMED_DIR missing or not mounted"
	exit 1
fi

# We cd to the HA_INNAMED_DIR directory because the named.boot file
# contains the names of other files. By cd'ing, we permit all of
# those names to be relative names, relative to the current directory.
cd $HA_INNAMED_DIR

if [ ! -s named.boot ]; then
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: file $HA_INNAMED_DIR/named.boot is missing or empty"
	exit 1
fi

# Run the in.named daemon under the control of the Sun Cluster process
# monitor facility. Let it crash and restart up to 4 times an hour;
# if it crashes more often than that, the process monitor facility daemon
# will cease trying to restart it.
pmfadm -c hainnamed -n 4 -t 60 /usr/sbin/in.named -b named.boot
if [ $? -ne 0 ]; then
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: pmfadm -c of in.named failed"
	exit 1
fi

exit 0
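
The sample methods receive their logical host lists as comma-separated strings: start_net reads the mastered list from its first argument, and stop_net reads the not-mastered list from its second. The conversion to an sh word list can be sketched on its own as follows. The function names to_word_list and list_contains are hypothetical and do not appear in the sample.

```sh
# to_word_list COMMA_LIST
# Print the comma-separated list as a space-separated sh word list.
to_word_list() {
	echo "$1" | tr ',' ' '
}

# list_contains WORD LIST...
# Succeed if WORD equals one of the remaining arguments.
list_contains() {
	want="$1"
	shift
	for item in "$@"; do
		if [ "$item" = "$want" ]; then
			return 0
		fi
	done
	return 1
}
```

For example, list_contains hahost1 `to_word_list "$1"` succeeds inside a method when hahost1 is among the logical hosts named in the first argument.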

stop_net Method for the in.named Data Service

The following is a sample stop_net method for the in.named data service.


#! /bin/sh
#
# Copyright 13 Apr 1996 Sun Microsystems, Inc. All Rights Reserved.
#
#ident	"@(#)innamed_stop_net.sh	1.1	96/04/13 SMI"
#
# HA-in.named stop_net method
#
ARGV0=`basename $0`

SYSLOG_FACILITY=`haget -f syslog_facility`
NOT_MASTERED_LOGICAL_HOSTS="$2"

if [ -z "$NOT_MASTERED_LOGICAL_HOSTS" ]; then
	# This physical host currently masters all logical hosts.
	exit 0
fi

# Replace comma with space to form an sh word list:
NOT_MASTERED_LOGICAL_HOSTS="`echo $NOT_MASTERED_LOGICAL_HOSTS | tr ',' ' '`"

# Dynamically search the list of logical hosts that this physical
# host should not master, to see if one of them is the logical host
# that HA-in.named uses.  There are two cases to consider:
# (1) This physical host gave up mastery of that logical host during
# some earlier cluster reconfiguration.  In that case, the HA administrative
# file system for the logical host will no longer be mounted so the
# /HA administrative_file_system/hainnamed directory will not exist.
# This method has no work to do, because the work got done during the
# earlier cluster reconfiguration when this physical host first gave up
# mastery of the logical host.
# (2) This cluster reconfiguration is the one in which this physical
# host is giving up mastery of the logical host.  In that case, the
# administrative file system is still mounted when the stop_net method
# is called and the /HA administrative_file_system/hainnamed directory
# will exist.
MYLH=
for LH in $NOT_MASTERED_LOGICAL_HOSTS ; do
	# Map logical hostname to pathprefix file system name:
	PATHPREFIX_FS=`haget -f pathprefix $LH`
	CONFIGDIR="${PATHPREFIX_FS}/hainnamed"
	if [ -d $CONFIGDIR ]; then
		MYLH=$LH
		break
	fi
done

if [ -z "$MYLH" ]; then
	# This host is not giving up mastery of the HA-in.named logical host
	# during this cluster reconfiguration.
	exit 0
fi

# This host is giving up mastery of the HA-in.named logical host, $MYLH,
# during this cluster reconfiguration.
#
# See if in.named is running, and if so, kill it. If it is not running,
# then either we must have killed it during some earlier reconfiguration
# when this physical host first gave up mastery of the logical host, or
# this physical host has not had mastery of the logical host since it
# last rebooted.
#
# Tell process monitor to kill the in.named daemon, if it was already
# running.
if pmfadm -q hainnamed; then
	pmfadm -s hainnamed TERM
	if [ $? -ne 0 ]; then
		logger -p ${SYSLOG_FACILITY}.err \
			"${ARGV0}: pmfadm -s of in.named failed"
		exit 1
	fi
fi

exit 0

abort_net Method for the in.named Data Service

The abort method is not registered for the HA-in.named example. The abort_net method uses the same code as the stop_net method; when HA-in.named is registered with Sun Cluster by the hareg(1M) utility, the abort_net registration points to the code used by stop_net.

Setting Timeout Values for in.named Methods

When you register your data service using hareg(1M), you can specify timeout values for methods you have created, such as the start_net, stop_net, and fm_start methods. However, the timeout values you set for your methods must be less than half the timeout value set for logical host takeover during cluster reconfiguration. The default timeout value for logical host takeover is 180 seconds. Therefore, if the timeout values you set for your methods are greater than 90 seconds, you must increase the timeout value for logical host takeover. Otherwise, your methods will time out.
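
The relationship between the two timeouts is simple arithmetic and can be checked before registration. The following sketch uses a hypothetical function name, method_timeout_ok, and applies the rule strictly: a method timeout passes only if it is less than half the takeover timeout.

```sh
# method_timeout_ok METHOD_SECS TAKEOVER_SECS
# Succeed only if the method timeout is less than half the logical
# host takeover timeout.
method_timeout_ok() {
	[ `expr "$1" \* 2` -lt "$2" ]
}
```

With the default takeover timeout of 180 seconds, method_timeout_ok 60 180 succeeds and method_timeout_ok 120 180 fails. A 120-second method timeout passes once the takeover timeout is raised above 240 seconds.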

You can increase the logical host takeover timeout values with the scconf(1M) command. Refer to the scconf(1M) man page and to Section 3.15, "Configuring Timeouts for Cluster Transition Steps," in the Sun Cluster 2.2 System Administration Guide for details.

Improving the in.named Methods

Consider some possible improvements to the start_net and stop_net methods for HA-in.named. The methods can benefit from better error detection and handling. For example, you can test whether the /usr/sbin/in.named binary exists, is executable, and is non-empty. If not, an error message can be logged. Before attempting to cat(1) the file hainnamed.config, verify that the file exists, exhibits the correct permissions, and is non-empty.

The methods also can test for the existence of the non-HA in.named data file /etc/named.boot. If the file exists, there is confusion about whether this host is running non-HA in.named or HA-in.named; only one can run at a time. The code can treat this case as a severe configuration error, log appropriate messages, and neither start nor kill in.named.
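
The checks suggested in this section might look like the following sketch. None of these functions is part of the sample application: the names check_executable, check_config, and nonha_boot_file_present are hypothetical, and treating "readable" as the correct permission for hainnamed.config is an assumption.

```sh
# Hypothetical pre-flight checks for the HA-in.named methods.

# check_executable FILE
# Succeed if FILE exists, is executable, and is non-empty.
check_executable() {
	[ -x "$1" ] && [ -s "$1" ]
}

# check_config FILE
# Succeed if FILE is a regular file that is readable and non-empty.
# Treating "readable" as the correct permission is an assumption.
check_config() {
	[ -f "$1" ] && [ -r "$1" ] && [ -s "$1" ]
}

# nonha_boot_file_present [FILE]
# Succeed if the non-HA boot file exists. FILE defaults to
# /etc/named.boot; the argument exists only so the check can be tried
# against another path.
nonha_boot_file_present() {
	[ -f "${1:-/etc/named.boot}" ]
}
```

A start_net method could call check_executable /usr/sbin/in.named and check_config $CONFIG before starting the daemon, and could log an error and exit without starting or killing in.named when nonha_boot_file_present succeeds.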

DNS Clients

In Solaris, a host that is a client of DNS has an /etc/resolv.conf file. The file lists name server hosts to contact for DNS service. The name server hosts are listed as IP addresses rather than host names. More than one host IP address might be listed.

Network clients of HA-in.named would list the IP address of the logical host, for example, that of hahost1, in the /etc/resolv.conf file.

There are periods when a physical host does not master the logical host that HA-in.named uses. However, the host must have the ability to be a client of HA-in.named during those periods. To achieve this, add the IP address of the logical host to the /etc/resolv.conf file on all physical hosts of the cluster.
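
An administrator can verify that a host lists the logical host's address with a check like the one sketched below. The function name resolv_lists_nameserver is hypothetical, the address 192.0.2.10 in the usage note is a placeholder from the documentation address range rather than a real hahost1 address, and the sketch assumes an awk that accepts variable assignments as operands.

```sh
# resolv_lists_nameserver ADDR [FILE]
# Succeed if FILE (default /etc/resolv.conf) contains a line of the
# form "nameserver ADDR".
resolv_lists_nameserver() {
	awk '$1 == "nameserver" && $2 == addr { found = 1 }
	     END { exit !found }' addr="$1" "${2:-/etc/resolv.conf}"
}
```

For example, resolv_lists_nameserver 192.0.2.10 succeeds on a host whose /etc/resolv.conf contains the line "nameserver 192.0.2.10".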

Administering HA-in.named: Updating the Database

Administration of HA-in.named resembles that of non-HA in.named. To update the in.named database, log in to the server (it is a security risk to grant root NFS access to the file system where the in.named data files are stored). For HA-in.named, log in to the physical server that currently masters the logical host that HA-in.named has been configured to use. Use the hastat(1M) utility to determine which physical host masters which logical hosts.

You perform an update to HA-in.named by editing its data files. Do this in a way that leaves the data files well-formed in the event of a sudden crash. For example, after logging in, cd to the directory where the HA-in.named data is stored (in our example, the directory /hahost1/A1/hainnamed). Then edit a new temporary copy of the data file, and once you are finished, move this copy onto the real data file name. For example:


% cd /hahost1/A1/hainnamed
% cp named.boot named.boot.new
% vi named.boot.new
% sync
% mv named.boot.new named.boot

As explained in the in.named(1M) man page, you can then use the kill(1) command to send a SIGHUP signal to the in.named daemon, which causes it to re-read the file.
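
Sending the signal can be sketched as follows. The function name reload_innamed is hypothetical, and the default process ID file /etc/named.pid is an assumption; confirm the location in the in.named(1M) man page for your release.

```sh
# reload_innamed [PIDFILE]
# Send SIGHUP to the process whose ID is stored in PIDFILE.
# The default path /etc/named.pid is an assumption.
reload_innamed() {
	pidfile="${1:-/etc/named.pid}"
	if [ ! -s "$pidfile" ]; then
		return 1		# no process ID recorded
	fi
	kill -HUP `cat "$pidfile"`
}
```

Run reload_innamed on the physical host that currently masters the logical host, after the mv step has put the new named.boot file in place.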

Documenting HA-in.named

You must document the installation and configuration of the highly available data service. This documentation must explain how to configure any administrative files that live in the administrative file system, and how to install the data service's data on one or more of the logical host's file systems or raw partitions. You should also document administration history and updates for the HA version of your data service.

Fault Monitoring Methods for the in.named Data Service

Sun Cluster enables the author of an HA data service to write fault monitoring methods for the data service. As an example, one can write a modest fault monitor for in.named that queries in.named periodically using nslookup(1M). If the lookup times out, even with a very long timeout value, the fault monitor concludes that the in.named daemon is hung and must be killed and restarted.

Fault monitoring will be executed only on the physical host on which in.named is running, that is, on the host that masters the logical host used by in.named. The non-master physical hosts do not perform fault monitoring.

The fault monitor is started by the FM_START method and stopped by the FM_STOP method. It has no need for the FM_INIT method--HA-in.named would not register an FM_INIT method when calling hareg(1M).

The following is a sample FM_START method for the in.named data service.


#! /bin/sh
#
# Copyright 26 Oct 1996 Sun Microsystems, Inc.  All Rights Reserved.
#
#ident "@(#)innamed_fm_start.sh  1.1  96/04/13 SMI"
#
# HA in.named fm_start method
#
# Called back by Sun Cluster as the FM_START method for HA in.named.
#
ARGV0=`basename $0`

SYSLOG_FACILITY=`haget -f syslog_facility`
MASTERED_LOGICAL_HOSTS="$1"

if [ -z "$MASTERED_LOGICAL_HOSTS" ]; then
	# This physical host does not currently master any logical hosts.
	exit 0
fi

# Replace comma with space to form an sh word list:
MASTERED_LOGICAL_HOSTS="`echo $MASTERED_LOGICAL_HOSTS | tr ',' ' '`"

# Dynamically search the list of logical hosts which this physical
# host currently masters, to see if one of them is the logical host
# that HA-in.named uses.
MYLH=
for LH in $MASTERED_LOGICAL_HOSTS ; do
	# Map logical hostname to administrative file system name:
	PATHPREFIX_FS=`haget -f pathprefix $LH`
	CONFIG="${PATHPREFIX_FS}/hainnamed/hainnamed.config"
	if [ -f $CONFIG ]; then
		MYLH=$LH
		break
	fi
done

if [ -z "$MYLH" ]; then
	# This host does not currently master the logical host
	# that HA-in.named uses.
	exit 0
fi

# This host currently masters the logical host that HA in.named uses,
# $MYLH.
# Create an asynchronous process to periodically probe the in.named
# daemon, under the control of the process monitor facility.
# The asynchronous probe is in its own shell script:
#     hainnamed_fmprobe
# The asynchronous process will be terminated by the FM_STOP method.
pmfadm -c hainnamedfm hainnamed_fmprobe $MYLH

exit 0

The following is a sample FM_STOP method for the in.named data service.


#! /bin/sh
#
# Copyright 26 Oct 1996 Sun Microsystems, Inc.  All Rights Reserved.
#
#ident "@(#)innamed_fm_stop.sh  1.1  96/04/13 SMI"
#
# HA in.named fm_stop method
#
# Called back by Sun Cluster as the FM_STOP method for HA in.named.
#
# Stop the asynchronous fault monitoring process that was created
# earlier under the control of pmfd.
#
# Ignore errors when calling pmfadm just in case the hainnamed_fmprobe
# is already not running.  Reasons for it being already not running
# include the fact that it is started only on the physical host that
# currently masters the logical host, the fact that FM_STOP can be
# called even though FM_START has not been called, and the fact
# that it may have died an early death all by itself.
pmfadm -s hainnamedfm TERM >/dev/null 2>&1

exit 0

The following is a sample probe script, hainnamed_fmprobe, for the in.named data service. It is started under the control of the process monitor facility by the FM_START method.


#! /bin/sh
#
# Copyright 26 Oct 1996 Sun Microsystems, Inc.  All Rights Reserved.
#
#ident "@(#)hainnamed_fmprobe.sh  1.1  96/04/13 SMI"
#
# Usage: hainnamed_fmprobe logical_host
#
# Periodically probes the in.named running on the logical_host.
# If the probe times out, then this script will query the pmfd to
# see if the pmfd is still running in.named:
# (i) if so, this script assumes that in.named is hung and
# sends a KILL signal to the in.named process, causing it to
# die.  pmfd will restart in.named provided it has not used
# up its ration of restarts per time period.
# (ii) if not, this script will assume that in.named has exhausted
# its ration of restarts.  This script will call hactl -g to give up
# mastery of the logical host to some other new master physical host.
#
ARGV0=`basename $0`
LOGICAL_HOST="$1"
SYSLOG_FACILITY=`haget -f syslog_facility`
PROBE_INTERVAL_SECS=60
MIN_PROBE_SECS=`hactl -f min_probe_timeout_secs`
PROBE_TIMEOUT_SECS=`expr $MIN_PROBE_SECS + 180`
CLUSTER_KEY=`hactl -f cluster_key`
NSLOOKUP=/usr/sbin/nslookup

if [ ! -x $NSLOOKUP  -o  ! -s $NSLOOKUP ]; then
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: $NSLOOKUP does not exist or is not executable"
	exit 1
fi

while true; do
	# Call nslookup under a timeout, using hatimerun.
	# The -norecurse option tells in.named not to consult
	# other name service instances on other hosts beyond the
	# one on $LOGICAL_HOST.
	# The -retry=10000 is telling nslookup to take forever
	# retrying: this means that for a hung server, nslookup
	# will never itself give up; rather, the timeout on hatimerun
	# will expire first.
	hatimerun -t $PROBE_TIMEOUT_SECS \
		$NSLOOKUP -norecurse -retry=10000 $LOGICAL_HOST $LOGICAL_HOST
	if [ $? -ne 99 ]; then
		sleep $PROBE_INTERVAL_SECS
		continue
	fi

	# Here when the timeout occurred.
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: nslookup of in.named on $LOGICAL_HOST timed-out"
	if pmfadm -q hainnamed; then
		# The in.named process exists.  Kill it on the
		# assumption that it is hung.  Sleep a short time,
		# and if hainnamed still exists in the pmfd, assume
		# that pmfd is restarting it (it has not yet used
		# up its ration of restarts per time interval.)
		logger -p ${SYSLOG_FACILITY}.err \
			"${ARGV0}: KILLing hung in.named"
		pmfadm -k hainnamed KILL
		sleep 30
		if pmfadm -q hainnamed; then
			continue
		fi
	fi

	# Here when pmfadm -q says that hainnamed no longer
	# exists in pmfd.  Assume that the ration of restarts
	# was exhausted.  Also assume that something is amiss
	# that moving to a new master could improve.
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: in.named restarted too many times, not restarting"
	logger -p ${SYSLOG_FACILITY}.err \
		"${ARGV0}: giving up mastery of $LOGICAL_HOST"
	hactl -g -s hainnamed -k $CLUSTER_KEY -l $LOGICAL_HOST
done