StorageTek Automated Cartridge System Library Software High Availability 8.3 Cluster Installation, Configuration, and Operation
Release 8.3
E51939-02

12 Troubleshooting Tips

ACSLS HA 8.3 integrates the ACSLS application on a two-node Solaris 11 system, using IPMP and ZFS, under the control of Solaris Cluster 4.1.
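
Before investigating a specific symptom, it is often useful to get an overall view of the cluster and of the ACSLS resource group. A minimal sketch (acsls-rg is the resource group name used in the examples later in this chapter):

# cluster status (overall view of nodes, quorum, transport paths, and resource groups)
# clrg status acsls-rg (shows which node currently hosts the ACSLS resource group)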

Verifying that ACSLS is Running

To verify that the ACSLS services are running on the active node, switch to the acsss user and check their status:

# su - acsss
$ acsss status

If one or more services are disabled, enable them with $ acsss enable.

If the status display reveals that one or more of the ACSLS services is in maintenance mode, then run the command: $ acsss l-status.

Look for the path to the log file of the faulty service and view that log for hints that might explain why the service was placed in maintenance mode.
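
For example, once acsss l-status reports the location of the faulty service's log, you can inspect it directly (the path below is a placeholder for the path printed in that report):

$ acsss l-status
$ tail -50 <log file path reported by l-status>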

If one or more of the ACSLS services remains in maintenance mode, they can usually be cleared by shutting the services down and then enabling them again with the acsss command.

$ acsss shutdown
$ acsss enable

As root, you can also clear an individual service:

# svcadm clear <service name>

The service will not be cleared until the underlying fault has been corrected.
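
To see why a service was placed in maintenance before you attempt to clear it, the standard Solaris SMF diagnostic command summarizes the fault and points to the relevant log (a general Solaris sketch, not specific to ACSLS):

# svcs -xv (explains which services are in maintenance, why, and where their logs are)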

Reviewing specific operational logs can also reveal the source of a problem. Most of these logs are found in the $ACS_HOME/log directory.

The primary log to review is the acsss_event.log. This log records most events surrounding the overall operation of ACSLS.
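
For example, you can follow the event log while reproducing a problem. This sketch assumes $ACS_HOME is defined in your shell, as it normally is for the acsss user:

$ tail -f $ACS_HOME/log/acsss_event.log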

If the problem has to do with the ACSLS GUI or with logical library operation, the relevant logs are found in the $ACS_HOME/log/sslm directory.

For the ACSLS GUI and WebLogic, look for the AcslsDomain.log, the AdminServer.log, and the gui_trace.logs.

Installation problems surrounding WebLogic are found in the weblogic.log.

For logical library issues, once a logical library has been configured, you can consult the slim_event.logs and the smce_stderr.log.
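
A quick way to find the most recently updated log in that directory and to view the end of one of them (file names as listed above):

$ ls -lrt $ACS_HOME/log/sslm
$ tail -50 $ACS_HOME/log/sslm/smce_stderr.log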

Addressing Connection to the Shared Disk Resource

  1. Verify that the acsls-storage resource is online to the active cluster node.

    # clrs status acsls-storage
    
  2. If the acsls-storage resource is not online, verify whether the acslspool is mounted on the active node:

    # zpool status
    

    If the acslspool is not mounted on the active node, verify whether it is mounted on the standby node:

    # ssh <standby hostname> zpool status
    

    If the shared disk resource is mounted on the standby node, then switch cluster control to that node.

    # clrg switch -n <standby hostname> acsls-rg
    
  3. If the acslspool is not mounted on the active node and the acsls-storage resource is offline, verify whether the acslspool is visible to the active node.

    # zpool import (no argument)
    

    Note:

    This operation works only if acsls-storage is offline. To bring it offline, use the command clrs disable acsls-storage.

    If the acslspool is visible to the active node, then you can attempt to import it:

    # zpool import -f acslspool
    

    If the import operation succeeds, then bring the acsls-storage resource online to Solaris Cluster:

    # clrs enable acsls-storage
    

    If the acslspool is not visible to the active node, you must troubleshoot the physical connection to the shared disk, starting with the checks sketched below.
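
    As a first check at the operating-system level, the following sketch confirms whether the shared LUN is visible to Solaris at all (device names and controller paths vary by installation):

    # echo | format (lists the disks currently visible to the operating system)
    # cfgadm -al (shows the attachment-point state of storage controllers and attached devices)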

Determining Why You Cannot ping the Logical Host

  1. Verify that the logical hostname is registered with Solaris Cluster.

    # clrslh list
    
  2. Determine the active node:

    # clrg status | grep -i Online
    
  3. Verify that you can ping the active node.

    # ping <node name>
    
  4. Verify that the logical-host-name resource is online to the active node.

    # clrslh status
    

    If the logical host is not online, then enable it.

    # clrs enable <logical host>
    
  5. Verify the state of the IP interfaces assigned to the public group.

    # ipadm
    

    In the output display, verify that each member of the public IPMP group is in the ok state (see also the ipmpstat sketch after this list).

  6. For each interface in the public group (ipmp0), verify its physical state.

    # dladm show-phys
    
  7. Verify that the logical host is plumbed to one of the two interfaces in the public IPMP group (identified in Step 5).

    # arp <logical-hostname>
    # ifconfig net0
    # ifconfig net4
    

    This example assumes that net0 and net4 were assigned to the public IPMP group.

    The MAC address of one of the two interfaces should agree with the MAC address assigned to the logical hostname.
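
As a complement to Steps 5 and 6, the Solaris ipmpstat utility summarizes IPMP group and interface health in one place (a general sketch; ipmp0 is the group name used in the example above):

# ipmpstat -g (group state, failure-detection time, and member interfaces)
# ipmpstat -i (per-interface state, including which interface is currently active)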

Checking the Interconnection Between Nodes

If you suspect that cluster control is failing because Solaris Cluster has lost communication between the two nodes, check the private cluster interconnect as follows:

# cluster status -t interconnect
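
If that report shows a faulted path, these additional commands can help narrow the problem down (a minimal sketch):

# clinterconnect status (state of each configured transport path between the nodes)
# clnode status (confirms that both nodes are currently members of the cluster)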