CHAPTER 1
General Installation and Configuration
This chapter discusses general installation and configuration troubleshooting issues for the Sun StorageTek Availability Suite software.
The following topics are included:
During installation, three types of packages are installed: CORE, Remote Mirror, and Point-in-Time Copy. At any time, you can verify that the necessary packages are installed and that their services are running.
The installation process installs the following CORE packages:
The installation process installs the following Remote Mirror packages:
The installation process installs the following Point-In-Time Copy packages:
The following commands will check and display the installation status of the Availability Suite product set.
Alternatively, you may check each individual package name one at a time.
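A per-package check can be sketched as a small shell function. This is an illustrative sketch, not the guide's original listing: it assumes a POSIX shell (for example ksh or bash), the function name check_pkgs is hypothetical, and PKG_A and PKG_B are placeholders for the package names listed for your release. It relies only on pkginfo -q, which prints nothing and exits with status 0 when the named package is installed.

```shell
# check_pkgs PKG...: report whether each named package is installed.
# Returns non-zero if at least one package is missing.
check_pkgs() {
    missing=0
    for pkg in "$@"; do
        # pkginfo -q is silent; only its exit status is used here.
        if pkginfo -q "$pkg"; then
            echo "$pkg: installed"
        else
            echo "$pkg: NOT installed"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Usage (placeholder package names):
#   check_pkgs PKG_A PKG_B
```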
The Solaris Service Management Facility, smf(5), provides the system support to start and stop the Availability Suite services. The following five services are added to SMF during the installation of the Availability Suite packages. Each service in the following list depends on the service or services above it in the list.
To verify the status of Availability Suite services, run dscfgadm -i.
When all services are running, you should see the following output:
If the services have never been started, or if they have been disabled by the administrator, dscfgadm -i should give the following output:
The following commands show the services upon which each Availability Suite service depends.
It is worth noting that the nws_scm service, upon which all other Availability Suite services depend, cannot start up until the Solaris milestones of milestone/devices and milestone/single-user have been reached.
The nws_sv dependency is correctly listed twice, since it is a dependency for both nws_ii and nws_rdc.
The following commands show the services which depend upon each Availability Suite service.
# svcs -D -o FMRI nws_sv
FMRI
svc:/system/nws_ii:default
svc:/system/nws_rdc:default
svc:/system/filesystem/local:default

# svcs -D -o FMRI nws_rdc
FMRI
svc:/system/nws_rdcsyncd:default
svc:/system/filesystem/local:default
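As a cross-check on the dependency listings, the state of all five services can be gathered in one pass. The following is a minimal sketch, assuming a POSIX shell and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk). The variable AVS_SERVICES and the function avs_not_online are hypothetical names; the service names are the ones used in this chapter.

```shell
# Service names as used in this chapter, base service first.
AVS_SERVICES="nws_scm nws_sv nws_ii nws_rdc nws_rdcsyncd"

# avs_not_online: print "STATE FMRI" for every Availability Suite service
# that is not online. Returns non-zero if any such service is found.
avs_not_online() {
    # -H omits the header line; -o selects the columns to print.
    svcs -H -o STATE,FMRI $AVS_SERVICES | awk '
        $1 != "online" { print; bad = 1 }
        END { exit bad }'
}
```

An empty result with a zero exit status means every listed service is online. This does not replace dscfgadm -i, which also reports the state of the configuration database.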
If the Availability Suite services are enabled, the Solaris service filesystem/local is dependent on all of the Availability Suite services. This dependency is required since any local file system (but not the root (/) file system) can be configured as a Point-in-Time Copy, Remote Mirror, or both. If the Availability Suite services are enabled (dscfgadm -e), the filesystem/local dependency is set to the type require_all. If the services are disabled (dscfgadm -d), the filesystem/local dependency is set to the type optional_all.
dscfgadm -i displays the following if the filesystem/local dependency is not correctly configured:
Running dscfgadm with no arguments will correct the dependency type:
Start and stop Availability Suite services only with dscfgadm -e (enable) and dscfgadm -d (disable). See dscfgadm(1M) for more information. Using svcadm to enable or disable Availability Suite services is not supported, because it leaves the service dependencies on svc:/system/filesystem/local incorrectly configured. See Checking Status for more information.
If svcadm has already been used to enable or disable the services, run dscfgadm with no arguments so that it can correct the dependency types between the Availability Suite services and svc:/system/filesystem/local.
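One way to keep administrators on the supported path is a thin wrapper that only ever calls dscfgadm. This is a sketch under stated assumptions: a POSIX shell, the hypothetical function name avs_services, and a dscfgadm that may prompt for confirmation when run interactively.

```shell
# avs_services enable|disable: change service state through dscfgadm only,
# then display the resulting status.
avs_services() {
    case "$1" in
        enable)  dscfgadm -e ;;
        disable) dscfgadm -d ;;
        *) echo "usage: avs_services enable|disable" >&2; return 2 ;;
    esac || return 1
    # Show service states and configuration database status afterwards.
    dscfgadm -i
}
```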
If checking the status of a service shows that a service is in the maintenance state, try the following:
1. Run svcadm(1M) with the clear subcommand to clear the service from the maintenance state.
2. If the service is still in the maintenance state, check the state of the local configuration database using dscfgadm -i. If the state is not valid, run dscfgadm with no arguments to reinitialize the configuration database. Then try clearing the service again, as described in step 1.
3. Check the logs for information that may indicate the source of the problem. See Log Files for more information about the logs.
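The first two steps can be sketched as a shell function. This is an illustration only, assuming a POSIX shell; the function name clear_maintenance and the SETTLE_SECS variable (how long to wait for SMF to restart the service) are hypothetical. Reinitializing the database is deliberately left to the administrator.

```shell
# clear_maintenance FMRI: try to bring one service out of maintenance.
clear_maintenance() {
    fmri=$1
    svcadm clear "$fmri"                  # step 1: clear the maintenance state
    sleep "${SETTLE_SECS:-5}"             # give SMF time to restart the service
    state=$(svcs -H -o STATE "$fmri")
    if [ "$state" = "maintenance" ]; then
        echo "$fmri still in maintenance; configuration status:"
        dscfgadm -i                       # step 2: inspect the database state
        return 1
    fi
    echo "$fmri is now $state"
}

# Usage:
#   clear_maintenance svc:/system/nws_ii:default
```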
If checking the state of the services using dscfgadm -i shows that a service is in the offline state, it is likely that a dependency has not been satisfied. You can try the following:
1. Use svcs(1) to check the status of the services on which the offline service depends (svcs -d).
2. Refer to the logs for information that may point to the cause of the problem.
Look for errors originating from both the offline service and the services on which it depends. See Log Files for more information regarding the logs.
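The dependency check in step 1 can be narrowed to the dependencies that are not yet online. A minimal sketch, assuming a POSIX shell and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk); the function name unmet_deps is hypothetical.

```shell
# unmet_deps SERVICE: list the services that SERVICE depends on
# and that are not online, as "FMRI (state)".
unmet_deps() {
    # -d lists the services on which the given service depends.
    svcs -H -d -o STATE,FMRI "$1" |
        awk '$1 != "online" { print $2 " (" $1 ")" }'
}

# Usage:
#   unmet_deps nws_ii
```

For the service itself, svcs -x gives SMF's own explanation of why it is not running.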
This section provides information on starting, stopping, and checking the status of daemons.
Enabled Availability Suite services make use of several daemons. To verify that the daemons are running when the services are enabled, you may issue the following commands.
# ps -ef | grep nskernd
root 14245 1 0 13:16:53 ? 0:02 /usr/lib/nskernd
# ps -ef | grep dscfglockd
root 14222 1 0 13:16:51 ? 0:01 /usr/lib/dscfglockd -f /etc/dscfg_lockdb
# ps -ef | grep sndr
root 14330 1 0 13:17:02 ? 0:00 /usr/lib/sndrsyncd
root 14322 1 0 13:17:02 ? 0:00 /usr/lib/sndrd
Do not start or stop daemons manually. Enabling and disabling the services using dscfgadm will start and stop the daemons. See Starting and Stopping Services for more information.
Note - The sndrd and sndrsyncd daemons are started in the nws_rdcsyncd service, but are stopped in the nws_rdc service.
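The daemon check can also be scripted with pgrep, which avoids matching the grep process itself. This is a sketch, assuming a POSIX shell; AVS_DAEMONS and check_daemons are hypothetical names, and the daemon names are those shown in the ps output in this section. It only reports status and never starts or stops a daemon.

```shell
# Daemon names taken from the ps listings in this section.
AVS_DAEMONS="nskernd dscfglockd sndrd sndrsyncd"

# check_daemons: report whether each daemon has a running process.
# Returns non-zero if any daemon is not running.
check_daemons() {
    rc=0
    for d in $AVS_DAEMONS; do
        # -x requires the process name to match exactly.
        if pgrep -x "$d" >/dev/null; then
            echo "$d: running"
        else
            echo "$d: not running"
            rc=1
        fi
    done
    return $rc
}
```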
If the Availability Suite services are enabled but fail to come online during a reboot, the boot process stops and places you in a minimal shell environment so that you can correct the problem before the system continues booting.
If this situation occurs, try the following steps:
1. Run dscfgadm -i to see the state of the services.
2. If a service is in maintenance mode, follow the steps detailed in Maintenance State.
3. If a service is in offline mode, follow the steps detailed in Offline State.
If these steps fail to rectify the problem, refer to the section on SMF (Solaris Service Management Facility) services in the "System Administration Guide: Basic Administration" from the Solaris 10 System Administrator Collection for more information regarding troubleshooting a failed boot.
This section provides information on configuration files and the Sun Cluster configuration database.
The /etc/dscfg_local file contains all the configuration information for volumes under Availability Suite control that are not highly available as part of a Sun Cluster.
To check the status of the local configuration database, run dscfgadm -i. Ensure that the status of the local configuration database is valid. If it is not valid, and you have backed up the local configuration database, you may choose to restore it using the steps in Non-Cluster Environments. If you do not have a backup, run dscfgadm with no arguments to reinitialize the local configuration database.
The /etc/dscfg_cluster file contains the Sun Cluster device ID (DID) device specification of a partition (slice) that is 5.5 MB in size or larger. This fully specified DID device specification (for example, /dev/did/rdsk/d11s7) must be identical on all Sun Cluster nodes supporting the Availability Suite services.
The Sun Cluster-specific Availability Suite configuration file contains all the configuration information for volumes under Availability Suite control that are highly available as part of a Sun Cluster.
To check the status of the cluster configuration database, run dscfgadm -i on all nodes of the Sun Cluster. Ensure that the status of the cluster configuration database is valid, and that the same database is used on all nodes of the Sun Cluster. If either condition is not met, run dscfgadm -s on all nodes of the Sun Cluster to set and initialize the Sun Cluster configuration. If you have backed up the cluster configuration database, you may choose to restore that backup. See Cluster Environments for more information.
If entries in the /etc/nsswitch.conf are not configured correctly, you might encounter the following problems:
Note - The services port number must be the same on all interconnected remote mirror host systems.
When the hosts: and services: entries are included in the /etc/nsswitch.conf file, ensure that files is placed before nis, nisplus, ldap, dns, or any other service the machine is using. For example, for systems using the Network Information Service (NIS) naming service, the file must include:
If you need to edit the /etc/nsswitch.conf file, use a text editor. See nsswitch.conf(4) for the file format.
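The ordering rule for the hosts: and services: entries can be checked mechanically. The following sketch assumes a POSIX shell and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk). The function name files_first is hypothetical, and it checks only against the four name services named above; extend the pattern if the machine uses another one.

```shell
# files_first [FILE]: for the hosts: and services: entries, verify that
# "files" appears before nis, nisplus, ldap, and dns.
# Returns non-zero if either entry violates the rule.
files_first() {
    awk '
        $1 == "hosts:" || $1 == "services:" {
            f = 0; o = 0
            for (i = 2; i <= NF; i++) {
                if ($i == "files" && !f) f = i            # first "files"
                if ($i ~ /^(nis|nisplus|ldap|dns)$/ && !o) o = i
            }
            if (f && (!o || f < o)) print $1, "ok"
            else { print $1, "files must come before other name services"; bad = 1 }
        }
        END { exit bad }' "${1:-/etc/nsswitch.conf}"
}

# Usage:
#   files_first                  # checks /etc/nsswitch.conf
#   files_first /tmp/nsswitch    # checks a copy
```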
The /var/adm/ds.log file contains time-stamped messages about Availability Suite software, including both errors and information messages. For example:
Mar 05 15:56:16 scm: scmadm cache enable succeeded
Mar 05 15:56:16 ii: iiboot resume cluster tag <none>
Since the invocation of most Availability Suite commands is logged in this file, it is a useful place to determine what recent Availability Suite administration activity has occurred.
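To review recent activity for one component, the log can be filtered on the component tag. This is a minimal sketch, assuming a POSIX shell and awk, and assuming that every entry follows the layout of the sample lines (month, day, time, then a component tag ending in a colon). The function name ds_log_for is hypothetical.

```shell
# ds_log_for COMPONENT [FILE]: print the log entries written by one
# component, for example scm or ii.
ds_log_for() {
    # The fourth whitespace-separated field is the "component:" tag.
    awk -v tag="$1:" '$4 == tag' "${2:-/var/adm/ds.log}"
}

# Usage:
#   ds_log_for ii | tail -20
```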
Other errors and informational messages are also logged to the /var/adm/messages file. For example:
Mar 5 16:21:24 doubleplay pseudo: [ID 129642 kern.info] pseudo-device: ii0
Mar 5 16:21:24 doubleplay genunix: [ID 936769 kern.info] ii0 is /pseudo/ii@0
SMF services are logged in the /var/svc/log directory. Each service has its own log file. The logs pertaining to Availability Suite services are:
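SMF conventionally names each log file after the service FMRI, with the svc:/ prefix removed and each remaining slash replaced by a hyphen. Assuming that convention applies to the Availability Suite services, the log path for a service can be derived as in the sketch below. The function name smf_log_path is hypothetical, and a POSIX shell is assumed.

```shell
# smf_log_path FMRI: print the conventional SMF log path for a service.
smf_log_path() {
    # Strip the "svc:/" scheme, then turn every "/" into "-".
    name=$(echo "$1" | sed -e 's|^svc:/||' -e 's|/|-|g')
    echo "/var/svc/log/$name.log"
}

# Usage:
#   tail -20 "$(smf_log_path svc:/system/nws_ii:default)"
```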
Copyright © 2006, Sun Microsystems, Inc. All Rights Reserved.