CHAPTER 1

General Installation and Configuration

This chapter discusses general installation and configuration troubleshooting issues for the Sun StorageTek Availability Suite software.

The following topics are included:


Software Installation Status

During installation, three types of package are installed: CORE, Remote Mirror, and Point-in-Time Copy. At any time, you can verify that the necessary packages have been installed and are running.

The installation process installs the following CORE packages:

The installation process installs the following Remote Mirror packages:

The installation process installs the following Point-in-Time Copy packages:

The following command checks the installation status of the Availability Suite product set.


# pkgchk SUNWscmr SUNWscmu SUNWspsvr SUNWspsvu SUNWrdcr SUNWrdcu \
SUNWiir SUNWiiu

Alternatively, use pkginfo -l to display the detailed status of each individual package.


# pkginfo -l SUNWscmr SUNWscmu SUNWspsvr SUNWspsvu SUNWrdcr \
SUNWrdcu SUNWiir SUNWiiu
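
The package check can also be scripted so that only missing packages are reported. The following is a minimal sketch, assuming a POSIX shell and the standard pkginfo -q option (quiet mode, which reports the result through the exit status). The function name missing_packages is illustrative and is not part of the product.

```shell
# Package names taken from the pkgchk and pkginfo commands in this section.
AVS_PACKAGES="SUNWscmr SUNWscmu SUNWspsvr SUNWspsvu SUNWrdcr SUNWrdcu SUNWiir SUNWiiu"

# missing_packages (hypothetical helper): print each package that pkginfo
# does not report as installed; return non-zero if any package is missing.
missing_packages() {
    rc=0
    for pkg in $AVS_PACKAGES; do
        if ! pkginfo -q "$pkg"; then
            echo "$pkg"
            rc=1
        fi
    done
    return $rc
}
```

Running missing_packages as root prints nothing and returns zero when all eight packages are installed.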


State of Services

The Solaris Service Management Facility, smf(5), provides the system support to start and stop the Availability Suite services. The following five services are added to SMF during the installation of the Availability Suite packages. Each service in the following list depends on the service or services above it in the list.

Checking Status

To verify the status of Availability Suite services, run dscfgadm -i.

Verifying Service Status

When all services are running, you should see the following output:


# dscfgadm -i
SERVICE         STATE           ENABLED        
nws_scm         online          true           
nws_sv          online          true           
nws_ii          online          true           
nws_rdc         online          true           
nws_rdcsyncd    online          true           
 
Availability Suite Configuration:
Local configuration database: valid

If the services have never been started, or if they have been disabled by the administrator, dscfgadm -i should give the following output:


# dscfgadm -i
SERVICE         STATE           ENABLED        
nws_scm         disabled        false          
nws_sv          disabled        false          
nws_ii          disabled        false          
nws_rdc         disabled        false          
nws_rdcsyncd    disabled        false          
 
Availability Suite Configuration:
Local configuration database: valid
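
When the status check is part of a script, it can help to reduce the dscfgadm -i output to the services that need attention. The following is a minimal sketch, assuming a POSIX shell and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than the older /usr/bin/awk), and assuming the output layout shown in the examples above. The function name not_online is illustrative.

```shell
# not_online (hypothetical helper): read "dscfgadm -i" output on stdin and
# print "service state" for every service whose STATE is not "online".
not_online() {
    awk '
        # Service rows have three fields and end in true or false;
        # the header and the configuration summary do not.
        NF == 3 && ($3 == "true" || $3 == "false") && $2 != "online" {
            sub(/\*+$/, "", $1)   # drop a trailing *** warning marker, if any
            print $1, $2
        }'
}
```

For example, dscfgadm -i | not_online prints nothing when all five services are online.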

The following commands show the services upon which each Availability Suite service depends.

Note that the nws_scm service, upon which all other Availability Suite services depend, cannot start until the Solaris milestones milestone/devices and milestone/single-user have been reached.

 

Displaying Service Dependencies

The nws_sv dependency is correctly listed twice, since it is a dependency for both nws_ii and nws_rdc.


# svcs -d -o FMRI nws_scm
FMRI
svc:/milestone/devices:default
svc:/milestone/single-user:default

 


# svcs -d -o FMRI nws_sv
FMRI
svc:/system/nws_scm:default

 


# svcs -d -o FMRI nws_ii
FMRI
svc:/system/nws_sv:default

 


# svcs -d -o FMRI nws_rdc
FMRI
svc:/system/nws_sv:default
svc:/system/nws_ii:default

 


# svcs -d -o FMRI nws_rdcsyncd
FMRI
svc:/system/nws_rdc:default
svc:/milestone/multi-user:default
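
The dependencies listed above imply a single valid start order for the five services. The following is a minimal sketch that derives that order with the standard tsort utility, assuming a POSIX shell. The dependency pairs are transcribed by hand from the svcs -d output above (milestones omitted), and the function name start_order is illustrative.

```shell
# start_order (hypothetical helper): print the Availability Suite services
# in an order that satisfies the dependencies reported by "svcs -d".
# Each input pair reads "dependency service".
start_order() {
    tsort <<'EOF'
nws_scm nws_sv
nws_sv nws_ii
nws_sv nws_rdc
nws_ii nws_rdc
nws_rdc nws_rdcsyncd
EOF
}
```

Because each service depends on the one before it, the only possible result is nws_scm, nws_sv, nws_ii, nws_rdc, nws_rdcsyncd, one per line.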

The following commands show the services which depend upon each Availability Suite service.

 


# svcs -D -o FMRI nws_scm
FMRI
svc:/system/nws_sv:default
svc:/system/filesystem/local:default

 


# svcs -D -o FMRI nws_sv
FMRI
svc:/system/nws_ii:default
svc:/system/nws_rdc:default
svc:/system/filesystem/local:default

 


# svcs -D -o FMRI nws_ii
FMRI
svc:/system/nws_rdc:default
svc:/system/filesystem/local:default

 


# svcs -D -o FMRI nws_rdc
FMRI
svc:/system/nws_rdcsyncd:default
svc:/system/filesystem/local:default

Displaying File System Dependencies

If the Availability Suite services are enabled, the Solaris service filesystem/local is dependent on all of the Availability Suite services. This dependency is required since any local file system (but not the root (/) file system) can be configured as a Point-in-Time Copy, Remote Mirror, or both. If the Availability Suite services are enabled (dscfgadm -e), the filesystem/local dependency is set to the type require_all. If the services are disabled (dscfgadm -d), the filesystem/local dependency is set to the type optional_all.

dscfgadm -i displays the following if the filesystem/local dependency is not correctly configured:


# dscfgadm -i
SERVICE         STATE           ENABLED
nws_scm***      online          true
nws_sv***       online          true
nws_ii***       online          true
nws_rdc         disabled        false
nws_rdcsyncd    disabled        false

Availability Suite Configuration:
Local configuration database: valid

*** Warning: The services above have an incorrect dependency. To repair the problem, run "dscfgadm".

Running dscfgadm with no arguments will correct the dependency type:


# dscfgadm
Local configuration database is already initialized.
Warning: Fixing dependency for nws_scm.
Warning: Fixing dependency for nws_sv.
Warning: Fixing dependency for nws_ii.
The following Availability Suite services are enabled:
nws_scm nws_sv nws_ii
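
A script can detect the incorrect dependency by looking for the *** marker that dscfgadm -i appends to the affected service names. The following is a minimal sketch, assuming a POSIX shell and a POSIX awk, and assuming the marker always appears as shown above. The function name has_bad_dependency is illustrative.

```shell
# has_bad_dependency (hypothetical helper): read "dscfgadm -i" output on
# stdin; succeed if any service name carries the *** warning marker.
has_bad_dependency() {
    awk '$1 ~ /^nws_.*\*\*\*$/ { found = 1 } END { exit !found }'
}

# Example use (run as root):
#   dscfgadm -i | has_bad_dependency && dscfgadm
```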

Starting and Stopping Services

Start and stop Availability Suite services using dscfgadm -e (enable) and dscfgadm -d (disable). See dscfgadm(1M) for more information. Using svcadm to enable or disable Availability Suite services is not supported, because the service dependencies on svc:/system/filesystem/local will not be properly configured. See Checking Status for more information.

If svcadm has already been used to enable or disable the services, run dscfgadm with no arguments to correct the dependency types between the Availability Suite services and svc:/system/filesystem/local.

Maintenance State

If checking the status of a service shows that a service is in the maintenance state, try the following:

1. Run svcadm clear to clear the service from the maintenance state. See svcadm(1M) for more information.

2. If the service is still in the maintenance state, check the state of the local configuration database using dscfgadm -i. If the state is not valid, run dscfgadm with no arguments to reinitialize the configuration database. Then try clearing the service again, as described in step 1.

3. Check the logs for information that may indicate the source of the problem. See Log Files for more information about the logs.
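
Step 1 can be prepared with a short filter that turns service states into the matching svcadm clear commands. The following is a minimal sketch, assuming a POSIX shell and the standard svcs options -H (no header) and -o (column selection). The function name clear_commands is illustrative; it only prints the commands so that they can be reviewed before being run.

```shell
# clear_commands (hypothetical helper): read "STATE FMRI" pairs on stdin,
# as printed by "svcs -H -o STATE,FMRI", and print one "svcadm clear"
# command for each Availability Suite service in the maintenance state.
clear_commands() {
    awk '$1 == "maintenance" && $2 ~ /^svc:\/system\/nws_/ {
        print "svcadm clear " $2
    }'
}
```

For example, svcs -H -o STATE,FMRI | clear_commands lists the commands to run for any Availability Suite service that is in the maintenance state.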

Offline State

If checking the state of the services using dscfgadm -i shows that a service is in the offline state, it is likely that a dependency has not been satisfied. You can try the following:

1. Use svcs(1) to check the status of the services on which the offline service depends.

2. Check the logs for information that may point to the cause of the problem.

Be sure to notice any errors originating from both the offline service and the services on which it depends. See Log Files for more information regarding the logs.
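
The dependency check in step 1 can be narrowed to the dependencies that are not yet satisfied. The following is a minimal sketch, assuming a POSIX shell and the svcs options -d, -H, and -o. The function name unsatisfied is illustrative.

```shell
# unsatisfied (hypothetical helper): read "STATE FMRI" pairs on stdin, as
# printed by "svcs -d -H -o STATE,FMRI <service>", and print every
# dependency whose state is not "online".
unsatisfied() {
    awk 'NF == 2 && $1 != "online" { print $2 " is " $1 }'
}
```

For example, svcs -d -H -o STATE,FMRI nws_rdc | unsatisfied shows which of the services required by nws_rdc are holding it offline.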


State of Daemons

This section provides information on starting, stopping, and checking the status of daemons.

Checking Daemon Status

Enabled Availability Suite services make use of several daemons. To verify that the daemons are running when the services are enabled, you may issue the following commands.

For the nws_scm service:


# ps -ef | grep nskernd
    root 14245     1   0 13:16:53 ?           0:02 /usr/lib/nskernd
# ps -ef | grep dscfglockd
    root 14222     1   0 13:16:51 ?           0:01 /usr/lib/dscfglockd -f /etc/dscfg_lockdb

For Remote Mirror:


# ps -ef | grep sndr
    root 14330     1   0 13:17:02 ?           0:00 /usr/lib/sndrsyncd
    root 14322     1   0 13:17:02 ?           0:00 /usr/lib/sndrd
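
Because ps -ef | grep can also match the grep process itself, a scripted check may prefer pgrep. The following is a minimal sketch, assuming a POSIX shell and the standard pgrep -x option (exact name match). The daemon names come from the ps output above, and the function name missing_daemons is illustrative. The sndrd and sndrsyncd daemons are expected only when the Remote Mirror services are enabled.

```shell
# Daemon names taken from the ps output in this section.
AVS_DAEMONS="nskernd dscfglockd sndrd sndrsyncd"

# missing_daemons (hypothetical helper): print each daemon for which
# pgrep finds no process with exactly that name.
missing_daemons() {
    for d in $AVS_DAEMONS; do
        pgrep -x "$d" >/dev/null || echo "$d"
    done
}
```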

Starting and Stopping Daemons

Do not start or stop daemons manually. Enabling and disabling the services using dscfgadm will start and stop the daemons. See Starting and Stopping Services for more information.



Note - The sndrd and sndrsyncd daemons are started in the nws_rdcsyncd service, but are stopped in the nws_rdc service.




System Start-up

If the Availability Suite services are enabled but fail to come online during a reboot, the boot process stops and drops you into a minimal shell environment so that you can rectify the problem before the system continues booting.

If this situation occurs, try the following steps:

1. Run dscfgadm -i to see the state of the services.

2. If a service is in the maintenance state, follow the steps detailed in Maintenance State.

3. If a service is in the offline state, follow the steps detailed in Offline State.

If these steps fail to rectify the problem, refer to the section on SMF (Solaris Service Management Facility) services in the "System Administration Guide: Basic Administration" from the Solaris 10 System Administrator Collection for more information regarding troubleshooting a failed boot.


Configuration Files

This section provides information on configuration files and the Sun Cluster configuration database.

/etc/dscfg_local

The /etc/dscfg_local file contains all the configuration information for volumes under Availability Suite control that are not highly available as part of a Sun Cluster.

To check the status of the local configuration database, run dscfgadm -i. Ensure that the status of the local configuration database is valid. If it is not valid, and you have backed up the local configuration database, you may choose to restore it using the steps in Non-Cluster Environments. If you do not have a backup, run dscfgadm with no arguments to reinitialize the local configuration database.
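
In a script, the database state can be extracted from the dscfgadm -i output. The following is a minimal sketch, assuming a POSIX shell and the "Local configuration database:" line format shown in the dscfgadm -i examples in this chapter. The function name local_db_state is illustrative.

```shell
# local_db_state (hypothetical helper): read "dscfgadm -i" output on stdin
# and print the reported state of the local configuration database.
local_db_state() {
    sed -n 's/^Local configuration database: *//p'
}
```

For example, dscfgadm -i | local_db_state prints valid when the database is in a usable state.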

/etc/dscfg_cluster

The /etc/dscfg_cluster file contains the Sun Cluster device ID (DID) device specification of a partition (slice) that is 5.5 MB or larger. This fully specified DID device (for example, /dev/did/rdsk/d11s7) must be identical on all Sun Cluster nodes supporting the Availability Suite services.

Cluster Configuration Database

The Sun Cluster-specific Availability Suite configuration file contains all the configuration information for volumes under Availability Suite control that are highly available as part of a Sun Cluster.

To check the status of the cluster configuration database, run dscfgadm -i on all nodes of the Sun Cluster. Ensure that the status of the cluster configuration database is valid, and that the same database is used on all nodes of the Sun Cluster. If not, run dscfgadm -s on all nodes of the Sun Cluster to set and initialize the Sun Cluster configuration. If you have backed up the cluster configuration database, you may choose to restore that backup. See Cluster Environments for more information.

/etc/nsswitch.conf

If entries in the /etc/nsswitch.conf are not configured correctly, you might encounter the following problems:



Note - The services port number must be the same on all interconnected Remote Mirror host systems.



When the hosts: and services: entries are included in the /etc/nsswitch.conf file, ensure that files is placed before nis, nisplus, ldap, dns, or any other service the machine is using. For example, for systems using the network information system (NIS) naming service, the file must include:


hosts: files nis
services: files nis

If you need to edit the /etc/nsswitch.conf file, use a text editor. See nsswitch.conf(4) for more information.
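
The ordering rule for the hosts: and services: entries can be checked mechanically. The following is a minimal sketch, assuming a POSIX shell and a POSIX awk. The function name files_first is illustrative.

```shell
# files_first (hypothetical helper): read nsswitch.conf content on stdin and
# print any hosts: or services: entry whose first source is not "files".
# Succeeds only if no such entry is found.
files_first() {
    awk '($1 == "hosts:" || $1 == "services:") && $2 != "files" {
             print
             bad = 1
         }
         END { exit bad + 0 }'
}
```

For example, files_first < /etc/nsswitch.conf prints nothing and returns zero when both entries list files first.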


Log Files

/var/adm/ds.log

The /var/adm/ds.log file contains time-stamped messages about the Availability Suite software, including both error and informational messages. For example:


Mar 05 15:56:16 scm: scmadm cache enable succeeded
Mar 05 15:56:16 ii: iiboot resume cluster tag <none>

Because the invocation of most Availability Suite commands is logged in this file, it is a useful place to find out what Availability Suite administration activity has occurred recently.
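
To focus on one component, the log can be filtered on the tag that follows the time stamp. The following is a minimal sketch, assuming a POSIX shell and a POSIX awk, and assuming every entry follows the layout of the two example lines above (month, day, time, then a component tag such as scm: or ii:). The function name ds_log_for is illustrative.

```shell
# ds_log_for (hypothetical helper): read ds.log entries on stdin and print
# those logged by the given component (the fourth field, such as scm or ii).
ds_log_for() {
    awk -v tag="$1:" '$4 == tag'
}
```

For example, ds_log_for ii < /var/adm/ds.log prints only the Point-in-Time Copy entries.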

/var/adm/messages

Other errors and informational messages are also logged to the /var/adm/messages file. For example:


Mar 5 16:21:24 doubleplay pseudo: [ID 129642 kern.info] pseudo-device: ii0
Mar 5 16:21:24 doubleplay genunix: [ID 936769 kern.info] ii0 is /pseudo/ii@0

SMF Service Logs

SMF services are logged in the /var/svc/log directory. Each service has its own log file. The logs pertaining to Availability Suite services are:
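
SMF normally names each log file after the service FMRI, with the svc:/ prefix removed and each slash replaced by a hyphen. The following is a minimal sketch that derives the expected log path under that assumption, using a POSIX shell. The function name smf_log_path is illustrative, and the resulting path should be confirmed against the contents of /var/svc/log on the system.

```shell
# smf_log_path (hypothetical helper): derive the log file path for a
# service FMRI, assuming the usual SMF naming: drop the "svc:/" prefix,
# replace each "/" with "-", and append ".log".
smf_log_path() {
    echo "/var/svc/log/$(echo "$1" | sed -e 's|^svc:/||' -e 's|/|-|g').log"
}
```

For example, smf_log_path svc:/system/nws_scm:default prints /var/svc/log/system-nws_scm:default.log.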