12 Troubleshooting Tips

ACSLS HA 8.4 integrates the ACSLS application on a two-node Solaris 11.2 system with IPMP and ZFS, all under the control of Solaris Cluster 4.2.

Verifying that ACSLS is Running

To verify that the ACSLS services are running on the active node, switch to user acsss and check their status:

# su - acsss
$ acsss status

If one or more services are disabled, enable them with $ acsss enable.

If the status display reveals that one or more of the ACSLS services is in maintenance mode, run the command $ acsss l-status.

Look for the path to the log file of the faulty service and view that log for hints that might explain why the service was placed in maintenance mode.
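
For example, as user acsss (the log path below is a placeholder; acsss l-status reports the actual path for each service):

$ acsss l-status
$ tail -100 <path to service log>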

If one or more of the ACSLS services remain in maintenance mode, they can usually be cleared by shutting the services down and then enabling them again with the acsss command:

$ acsss shutdown
$ acsss enable

As root, use # svcadm clear <service name> to clear an individual service.

The service is not cleared until the underlying fault has been corrected.
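
A minimal sketch of that sequence using standard Solaris SMF tools (the service name is a placeholder); svcs -x summarizes why a service is in maintenance so the fault can be addressed before the clear:

# svcs -x
# svcadm clear <service name>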

Reviewing specific operational logs can also help reveal the source of a problem. Most of these logs are found in the $ACS_HOME/log directory.

The primary log to review is the acsss_event.log. This log records most events surrounding the overall operation of ACSLS.
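
For example, to review the most recent entries as user acsss (so that $ACS_HOME is defined):

$ tail -100 $ACS_HOME/log/acsss_event.log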

If the problem has to do with the ACSLS GUI or with logical library operation, the relevant logs are found in the $ACS_HOME/log/sslm directory.

For the ACSLS GUI and WebLogic, look for the AcslsDomain.log, the AdminServer.log, and the gui_trace.logs.

Installation problems surrounding WebLogic are found in the weblogic.log.

For logical library issues, once a logical library has been configured, consult the slim_event.logs and the smce_stderr.log.
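
To see which of these logs were updated most recently, list the directory sorted by modification time (as user acsss so that $ACS_HOME is defined):

$ ls -lrt $ACS_HOME/log/sslm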

Troubleshooting the Connection to the Shared Disk Resource

  1. Verify that the acsls-storage resource is online to the active cluster node.

    # clrs status acsls-storage
    
  2. If the acsls-storage resource is not online, verify whether the ZFS pool (acslspool) is mounted on the active node:

    # zpool status
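
    Illustrative output when the pool is mounted on the active node (the device names and pool layout below are placeholders and will differ on your system):

      pool: acslspool
     state: ONLINE
      scan: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            acslspool   ONLINE       0     0     0
              c0t0d0    ONLINE       0     0     0

    errors: No known data errors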
    

    If the acslspool is not mounted on the active node, check whether it is mounted on the standby node:

    # ssh <standby hostname> zpool status
    

    If the shared disk resource is mounted on the standby node, then switch cluster control to that node.

    # clrg switch -n <standby hostname> acsls-rg
    
  3. If the acslspool is not mounted on the active node and the acsls-storage resource is offline, check whether the acslspool is visible to the active node.

    # zpool import (no argument)
    

    Note:

    This operation works only if the acsls-storage resource is offline. To take it offline, use the command clrs disable acsls-storage, as shown below.
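
    For example, to take the resource offline before retrying the check:

    # clrs disable acsls-storage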

    If the acslspool is visible to the active node, attempt to import it:

    # zpool import -f acslspool
    

    If the import operation succeeds, bring the acsls-storage resource online to Solaris Cluster:

    # clrs enable acsls-storage
    

    If the acslspool is not visible to the active node, it is necessary to troubleshoot the physical connection to the shared disk.

When the Logical Host Cannot Be Pinged

  1. Verify that the logical hostname is registered with Solaris Cluster.

    # clrslh list
    
  2. Determine the active node:

    # clrg status | grep -i Online
    
  3. Verify the active node can be pinged.

    # ping <node name>
    
  4. Verify that the logical-host name resource is online to the active node.

    # clrslh status
    

    If the logical host is not online, then enable it.

    # clrs enable <logical host>
    
  5. Verify the state of the IP interfaces assigned to the public group.

    # ipadm
    

    In the output, verify that each member of the public IPMP group is in the ok state.
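
    As an optional cross-check, the standard Solaris ipmpstat utility summarizes the IPMP group and the state of each member interface:

    # ipmpstat -g
    # ipmpstat -i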

  6. For each interface in the public group (ipmp0), verify its physical state.

    # dladm show-phys
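
    Illustrative output (link names, devices, and speeds are placeholders); each member of the public group should report a STATE of up:

    LINK     MEDIA       STATE   SPEED  DUPLEX   DEVICE
    net0     Ethernet    up      1000   full     igb0
    net4     Ethernet    up      1000   full     igb4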
    
  7. Verify that the logical host is plumbed to one of the two interfaces in the public IPMP group (identified in step 5):

    # arp <logical-hostname>
    # ifconfig net0
    # ifconfig net4
    

    This example assumes that net0 and net4 were assigned to the public IPMP group.

    The MAC address of one of the two interfaces should agree with the MAC address assigned to the logical hostname.
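
    A trimmed, illustrative comparison (the IP address, MAC address, and interface details are placeholders); here the logical hostname resolves to the MAC address of net0:

    # arp <logical-hostname>
    <logical-hostname> (192.0.2.10) at 0:10:e0:aa:bb:cc
    # ifconfig net0
    net0: flags=... mtu 1500 index 2
            inet ...
            ether 0:10:e0:aa:bb:cc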

Checking the Interconnection Between Nodes

If it is suspected that cluster control is failing because Solaris Cluster has lost communication between the two nodes, check the private cluster interconnect as follows:

# cluster status -t interconnect
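
Illustrative output (node and adapter names are placeholders); each private transport path should report a status of Path online:

=== Cluster Transport Paths ===

Endpoint1               Endpoint2               Status
---------               ---------               ------
node1:net1              node2:net1              Path online
node1:net2              node2:net2              Path online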