Oracle® Auto Service Request
Release 3.5 for Oracle Linux and Solaris

ASR General Troubleshooting

This section provides a variety of troubleshooting procedures for the ASR software.


Note:

This section provides instructions for Solaris. When possible, corresponding Oracle Linux instructions are provided. See the appropriate Oracle Linux documentation for details on general administration commands.

ASR Diagnostic Utility

An ASR Diagnostic Utility is included to help analyze installation problems. The utility packages the data it collects and stores it in a consistent, configurable location for later retrieval, delivery, and analysis. You must send the resulting .zip file to Oracle manually.

Utility Contents

The ASR Diagnostic Utility consists of the following files:

  • diag-config.properties - property file for customizing the diagnostic utility configuration

  • asrDiagUtil.sh - shell script for invoking the utility method

  • com.sun.svc.asr.util.diag.jar - Java method for collecting diagnostic data

  • README – a “readme” text file describing the utility

At the command prompt, run ./asrDiagUtil.sh and follow the on-screen instructions, which indicate where the diagnostic data file is generated.

Delivery Model

The ASR Diagnostic Utility is available as part of the ASR software bundle. After you download the latest ASR software bundle, access and run the utility:

  1. Install SUNWswasr.<version>.<timestamp>.pkg (see "Install the ASR Package" for more information)

  2. Verify the ASR Diagnostic Utility files are located under /opt/SUNWswasr/util/diag

  3. Run the utility.


    Note:

    Support for Oracle Linux begins with ASR 2.7.

Configure the ASR Diagnostic Utility

The diag-config.properties file consists of a list of properties for specifying the location of the configuration and log directories. It also contains "toggle switches" for enabling and disabling the collection of a particular data set:

  • com.sun.svc.asr.util.diag.home.directory – The property for specifying where the diagnostic data .zip bundle will be generated. The default is the current directory, where the ASR Diagnostic Utility is located.

  • com.sun.svc.asr.util.diag.zip.file.prefix – The property for configuring the diagnostic data .zip file's name.

  • com.sun.svc.asr.util.diag.zip.recursive – The property for enabling traversal into subdirectories of any configuration or log directories.
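Before running the utility, it can help to confirm what a property is currently set to. The following is a minimal sketch, assuming the file uses the usual key=value properties layout. The get_prop function is a hypothetical helper, not part of the ASR bundle, and it does not handle line continuations or ":" separators.

```shell
# Print the value of one key from a key=value properties file.
# Usage: get_prop <key> <file>
# get_prop is a hypothetical helper, not something shipped with ASR.
get_prop() {
    key=$1
    file=$2
    awk -v k="$key" '
        /^[ \t]*[#!]/ { next }            # skip comment lines
        {
            pos = index($0, "=")
            if (pos == 0) next            # not a key=value line
            name = substr($0, 1, pos - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == k) {
                val = substr($0, pos + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
                exit
            }
        }' "$file"
}

# Example, using the directory named in "Delivery Model":
#   get_prop com.sun.svc.asr.util.diag.zip.file.prefix \
#       /opt/SUNWswasr/util/diag/diag-config.properties
```

An empty result means the key is absent or has no value, in which case the utility's own default presumably applies.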

Service Tools Bundle (STB) Troubleshooting

This section provides a variety of steps to check the state of the Service Tools Bundle (STB), which must be installed on most ASR systems. If issues arise during the installation and operation of ASR, STB may be part of the issue.

Check the STB Agent

  1. Open a browser window to the system you wish to check using the following URL. Be sure to include the / (slash) after agent.

    http://asr_system_hostname:6481/stv1/agent/

  2. A response similar to the following will be displayed:

    <st1:response>
    <agent>
    <agent_urn><agent urn number></agent_urn>
    <agent_version>1.1.4</agent_version>
    <registry_version>1.1.4</registry_version>
    <system_info>
    <system>SunOS</system>
    <host><your host name></host>
    <release>5.10</release>
    <architecture>sparc</architecture>
    <platform>SUNW,Sun-Fire-V215::Generic_137111-06</platform>
    <manufacturer>Sun Microsystems, Inc.</manufacturer>
    <cpu_manufacturer>Sun Microsystems, Inc.</cpu_manufacturer>
    <serial_number>0707FL2015</serial_number>
    <hostid><host ID number></hostid>
    </system_info>
    </agent>
    </st1:response>
    
  3. If you do not get a response from the Service Tags agent, consult the Service Tags man pages:

    man in.stlisten
    man stclient
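The same check can be made from a terminal instead of a browser. The sketch below pulls a single field out of the agent response. It assumes each element opens and closes on one line, as in the sample response above. The xml_tag function is a hypothetical helper and not a general XML parser.

```shell
# Print the text of the first <tag>...</tag> element read from stdin.
# Assumes the element sits on a single line, as in the sample response.
# xml_tag is a hypothetical helper name.
xml_tag() {
    sed -n "s:.*<$1>\\(.*\\)</$1>.*:\\1:p" | head -n 1
}

# Example (host name is a placeholder, as in step 1):
#   wget -q -O - http://asr_system_hostname:6481/stv1/agent/ | xml_tag agent_version
```

An empty result means the element was not found in the response, which is worth treating the same as no response at all.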
    

Check the Service Tags Version

Follow the procedure below to check the Service Tags version:

  1. Open a terminal window and log in as root to the ASR system you wish to check.

  2. Run the following command to get the Service Tags version:

    stclient -v

ASR requires Service Tags version 1.1.4 or later.
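Dotted version numbers do not compare correctly as plain strings or plain numbers, so a field-by-field comparison is safer when scripting this check. The following sketch compares two dotted versions. The version_ge function is a hypothetical helper. This section does not show the exact output format of stclient -v, so extracting the version number from that output is left to the reader.

```shell
# Succeed (exit status 0) when dotted version $1 is >= dotted version $2.
# Missing fields count as 0, so "1.2" is treated as "1.2.0".
# version_ge is a hypothetical helper name.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        k = (n > m) ? n : m
        for (i = 1; i <= k; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example, with the installed version already extracted into $installed:
#   if version_ge "$installed" 1.1.4; then echo "Service Tags version is sufficient"; fi
```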

Check Service Tags Probe

Follow the procedure below to determine that the Service Tag discovery probe is running:

  1. Open a terminal window and log in as root to the ASR system you wish to check.

  2. To determine that the Service Tag discovery probe is running, run the following command:

    svcs -l svc:/network/stdiscover

  3. If the probe is running correctly, the following information is displayed:

    fmri svc:/network/stdiscover:default
    name Service Tag discovery probe
    enabled true
    state online
    next_state none
    state_time Wed Sep 03 21:07:28 2008
    restarter svc:/network/inetd:default
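
When the check needs to be scripted, the two fields that matter in the svcs -l output are enabled and state. The sketch below reads that output from standard input. The function names are hypothetical, and the field layout is assumed to match the example above.

```shell
# Read "svcs -l" output on stdin and print the value of the "state" field.
svcs_l_state() {
    awk '$1 == "state" { print $2; exit }'
}

# Succeed only when the service described on stdin is enabled and online.
svcs_l_healthy() {
    awk '
        $1 == "enabled" { enabled = $2 }
        $1 == "state"   { state = $2 }
        END { exit !(enabled == "true" && state == "online") }'
}

# Example:
#   svcs -l svc:/network/stdiscover | svcs_l_healthy || echo "probe is not healthy"
```

The same functions apply unchanged to the svcs -l output of any other SMF service.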
    

Check Service Tags Listener

Follow the procedure below to determine that the Service Tags Listener is running:

  1. Open a terminal window and log in as root to the ASR system you wish to check.

  2. To determine if the Service Tags listener is running, run the following command:

    svcs -l svc:/network/stlisten

  3. If the listener is running correctly, the following information is displayed:

    fmri svc:/network/stlisten:default
    name Service Tag Discovery Listener
    enabled true
    state online
    next_state none
    state_time Wed Sep 03 21:07:28 2008
    restarter svc:/network/inetd:default
    

Unable to Contact Service Tags on Asset

This message indicates that the activation failed during Service Tags discovery. Either Service Tags is not installed on the ASR Asset, or it is installed but not running. The cause can also be a network connectivity problem between the ASR Manager and the ASR Asset. Complete the following checks:

  1. Check if Service Tags is installed and running on an ASR Asset. Run:

    stclient -x

    If you cannot run this command, Service Tags is either not installed or not online.

  2. Check if the Service Tags services are installed and online using the following command:

    svcs | grep reg

  3. The results should be similar to the following example:

    online Aug_23 svc:/application/stosreg:default
    online Aug_23 svc:/application/sthwreg:default
    
  4. If you cannot find these services, it means Service Tags is not installed on the ASR asset.

  5. If the Service Tags services are online, check if psncollector is online. Run:

    svcs | grep psncollector

  6. The results should be similar to the following example:

    online Sep_09 svc:/application/psncollector:default

  7. TCP Wrappers installed on the ASR asset can prevent Service Tags discovery. To check whether the Service Tags agent is reachable, run the following command from the ASR Manager system:

    wget http://<assetHostNameOrIPaddress>:6481/stv1/agent/

  8. If there are TCP Wrappers installed on the ASR asset, edit /etc/hosts.allow on the asset by adding:

    in.stlisten:<OASM host name>
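
If the hosts.allow entry is added by a script, it is worth avoiding duplicate lines when the script runs more than once. The following sketch appends the entry only when it is missing. The allow_stlisten function is a hypothetical helper. It assumes a grep that supports the POSIX -q, -x, and -F options, which on older Solaris releases may mean /usr/xpg4/bin/grep.

```shell
# Append "in.stlisten:<host>" to a hosts.allow-style file unless that exact
# line is already present.
# Usage: allow_stlisten <manager_host> [file]
# allow_stlisten is a hypothetical helper name.
allow_stlisten() {
    host=$1
    file=${2:-/etc/hosts.allow}
    entry="in.stlisten:$host"
    # -x matches the whole line, -F treats the entry as a fixed string.
    if grep -qxF "$entry" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$file"
}

# Example, run as root on the asset (the host name is a placeholder):
#   allow_stlisten <OASM host name>
```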

Service Tags on Asset Reports Unknown or Empty Serial Number/Product Name

If the serial number is empty or "unknown", complete the following steps:

  1. Input the correct serial number using the SNEEP command:

    /opt/SUNWsneep/bin/sneep -s <serial number>


    Note:

    SNEEP is part of the Service Tools Bundle that is a prerequisite of ASR (for more information, see "Install Service Tags").

  2. For versions of SNEEP older than 2.6, enter the following command:

    svcadm restart psncollector


    Note:

    If you are using SNEEP version 2.6 or later, it is not necessary to manually restart the psncollector after inputting the serial number.

  3. You can view the serial number using the following URL:

    http://<AgentipAddress>:6481/stv1/agent/

  4. If the product name is empty or "unknown", check if the Hardware Service Tags are installed and online. Run:

    svcs | grep sthwreg

  5. The results should be similar to the following example:

    online Aug_23 svc:/application/sthwreg:default

  6. If you cannot find this service, it means Hardware Service Tags are not installed on the ASR asset.

Activation Failed for Asset <asset name> Due to Data Error

This message indicates that the message creation failed because of bad or missing data. Most of the time, this error is the result of an incorrect or incomplete serial number or product name. To resolve this message, complete the following steps:

  1. Verify the serial number using the SNEEP command:

    sneep

  2. If the serial number is not correct, input the correct serial number using the following SNEEP command:

    /opt/SUNWsneep/bin/setcsn -c <serial number>


    Note:

    SNEEP is part of the Service Tools Bundle that is a prerequisite of ASR (for more information, see "Install Service Tags").

  3. For versions of SNEEP older than 2.6, run the following command:

    svcadm restart psncollector


    Note:

    If you are using SNEEP version 2.6 or later, it is not necessary to manually restart the psncollector after inputting the serial number.

  4. You can view the serial number using the following URL:

    http://<AgentipAddress>:6481/stv1/agent/

  5. Check if the Hardware Service Tags are installed and online. Run:

    svcs | grep sthwreg

  6. The results should be similar to the following example:

    online Aug_23 svc:/application/sthwreg:default

  7. If you cannot find this service, it means Hardware Service Tags are not installed on the ASR asset.

Cannot Retrieve the OASM IP Address

This error message indicates that the ASR Asset activation failed because the Oracle Automated Service Manager (OASM) IP address could not be retrieved. The final step for activating an ASR Asset includes this command:

asr activate_asset -i <host IP address>

When activation fails, the following error message displays (the message text uses SASM, the earlier name for OASM):

Cannot retrieve the SASM IP address, please add the SASM IP address to /etc/hosts

You must edit the /etc/hosts file to update the localhost entry. For example, as root, change an entry that looks like this:

127.0.0.1    hostname123.com hostname123 localhost.localdomain localhost

to this:

127.0.0.1    localhost.localdomain localhost
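
To find out whether a system is affected before editing the file, the loopback line can be inspected for names other than the localhost aliases. The following is a minimal sketch. The extra_loopback_names function is a hypothetical helper, and it only looks at the 127.0.0.1 entry.

```shell
# Read an /etc/hosts-style file on stdin and print every name on the
# 127.0.0.1 line that is not "localhost" or "localhost.<domain>".
# extra_loopback_names is a hypothetical helper name.
extra_loopback_names() {
    awk '
        /^[ \t]*#/ { next }               # skip comment lines
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break      # stop at a trailing comment
                if ($i != "localhost" && $i !~ /^localhost\./) print $i
            }
        }'
}

# Example:
#   extra_loopback_names < /etc/hosts
```

Any name printed is a candidate for removal from the loopback line, as in the example above. No output means the entry already has the expected form.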

Services are Disabled: stdiscover or stlisten

Service tag processes (stlisten and stdiscover) must be online in order to activate assets successfully.

  1. Check to determine if the stdiscover or stlisten services are disabled. Run the following command:

    svcs stlisten stdiscover

    If the services have been disabled, the output would look like this:

    STATE       STIME         FMRI
    disabled    12:20:14      svc:/network/stdiscover:default
    disabled    12:20:14      svc:/network/stlisten:default
    
  2. To enable the stdiscover and stlisten services, run the following command:

    svcadm enable stlisten stdiscover

  3. Verify the services are online:

    svcs stlisten stdiscover

    Once the services have been enabled, the output should look like this:

    STATE       STIME         FMRI
    online      12:20:14      svc:/network/stdiscover:default
    online      12:20:14      svc:/network/stlisten:default
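
The check-and-enable steps above can be combined so that only services that are not online are touched. The sketch below lists those services from the svcs output. The not_online function is a hypothetical helper that assumes the three-column STATE, STIME, FMRI layout shown in this procedure.

```shell
# Read "svcs" output on stdin and print the FMRI of every service whose
# STATE column is anything other than "online". The header line is skipped.
# not_online is a hypothetical helper name.
not_online() {
    awk 'NF >= 3 && $1 != "STATE" && $1 != "online" { print $3 }'
}

# Example, run as root:
#   svcs stlisten stdiscover | not_online | while read -r fmri; do
#       svcadm enable "$fmri"
#   done
```

A service in the maintenance state is also listed, but svcadm enable alone is unlikely to recover it, so review the output before acting on it.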
    

Check the State of the SMA Service

The SMA service needs to be online in order to support Solaris FMA enrichment data properly. Prior to configuring FMA, complete the following steps:

  1. To check that the state of the SMA service is online, run:

    svcs sma

  2. If SMA is online, the state should indicate online, as in the following example:

    STATE      STIME          FMRI
    online     15:40:31       svc:/application/management/sma:default
    
  3. If SMA is not online, run the following command to enable it:

    svcadm enable sma

  4. Repeat these steps to confirm SMA is online.

Check the State of ASR Bundles

For diagnostic purposes, it may be necessary to check the state of various application bundles installed on the ASR Manager system using the following procedure.

  1. Open a terminal window and log in as root to the ASR Manager.

  2. Enter the following command:

    asr diag

  3. Review the output of the command. The bundles and their states should resemble the following example:

    id State Bundle
    263 ACTIVE com.sun.svc.asr.sw_1.0.0 Fragments=264, 265
    264 RESOLVED com.sun.svc.asr.sw-frag_1.0.0 Master=263
    265 RESOLVED com.sun.svc.asr.sw-rulesdefinitions_1.0.0 Master=263
    266 ACTIVE com.sun.svc.ServiceActivation_1.0.0
    266 ACTIVE com.sun.svc.ServiceActivation_1.0.0
    
  4. The state of each bundle should be as follows:

    • com.sun.svc.asr.sw bundle should be ACTIVE

    • com.sun.svc.asr.sw-frag should be RESOLVED

    • com.sun.svc.asr.sw-rulesdefinitions should be RESOLVED

    • com.sun.svc.ServiceActivation should be ACTIVE

  5. If any of these states are incorrect, enter the following commands:

    asr stop
    asr start
    
  6. Repeat steps 1 to 3.

  7. To ensure everything is working properly, run the following commands:

    asr test_connection
    asr send_test
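
The comparison in step 4 can be automated. The following sketch reads the asr diag output and reports any bundle whose state differs from the expected values listed above. The check_bundles function is a hypothetical helper. It assumes each bundle line starts with a numeric id followed by the state and the bundle name, and that the name ends with a version suffix such as _1.0.0.

```shell
# Read "asr diag" output on stdin, compare each known bundle with its expected
# state, and print one line per mismatch. Exit status is non-zero on mismatch.
# check_bundles is a hypothetical helper name.
check_bundles() {
    awk '
        BEGIN {
            want["com.sun.svc.asr.sw"] = "ACTIVE"
            want["com.sun.svc.asr.sw-frag"] = "RESOLVED"
            want["com.sun.svc.asr.sw-rulesdefinitions"] = "RESOLVED"
            want["com.sun.svc.ServiceActivation"] = "ACTIVE"
        }
        $1 ~ /^[0-9]+$/ && NF >= 3 {
            name = $3
            sub(/_[0-9][0-9.]*$/, "", name)   # drop the version suffix
            seen[name] = $2
        }
        END {
            bad = 0
            for (n in want) {
                found = (n in seen) ? seen[n] : "missing"
                if (found != want[n]) {
                    printf "%s: expected %s, found %s\n", n, want[n], found
                    bad = 1
                }
            }
            exit bad
        }'
}

# Example, run as root on the ASR Manager:
#   asr diag | check_bundles || echo "restart with: asr stop; asr start"
```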
    

ASR Log Files

When you are troubleshooting ASR, you can change the level of information displayed in the logs, and increase or decrease the number of logs that are saved before being overwritten. The logs are written to the sw-asr.log files, which are located on the ASR Manager system in /var/opt/SUNWsasm/log.

There are four levels of logs:

  1. Fine: Displays the highest level of information. It contains fine, informational, warnings and severe messages.

  2. Info: Displays not only informational data, but also both warnings and severe messages. This is the default setting.

  3. Warning: Displays warnings and severe messages.

  4. Severe: Displays the least amount of information; severe messages only.

The default number of logs collected and saved is 5. Once that number is reached, ASR begins overwriting the oldest file. You have the option to change the number of logs collected and saved. If you are gathering as much information as possible in a short time, you might want to limit the number of logs saved to accommodate the larger files.

Set Log Level

Follow the procedure below to set logging levels:

  1. Open a terminal window and log in as root on the ASR Manager system.

  2. To view the current level of information being gathered, run:

    asr get_loglevel

  3. To change the logging level, run:

    asr set_loglevel <level>

    The choices for level are: Fine, Info, Warning, or Severe.

Set Log File Counts

Follow the procedure below to set log file counts:

  1. Open a terminal window and log in as root on the ASR Manager system.

  2. To view the current number of logs being saved, enter the following command:

    asr get_logfilecount

  3. To change the number of logs being saved, enter the following command:

    asr set_logfilecount <number>
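
When log settings are changed from a script, validating the level first avoids passing a mistyped value to the asr command. The sketch below is one way to do that. The function names and the DRY_RUN switch are hypothetical, and it assumes the level must be spelled exactly as listed in "Set Log Level".

```shell
# Succeed when $1 is one of the four log levels named in this section.
valid_loglevel() {
    case $1 in
        Fine|Info|Warning|Severe) return 0 ;;
        *) return 1 ;;
    esac
}

# Validate the level, then run "asr set_loglevel". With DRY_RUN=1 (a
# hypothetical switch for this sketch) the command is printed, not run.
set_loglevel_checked() {
    if ! valid_loglevel "$1"; then
        echo "unknown log level: $1" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "asr set_loglevel $1"
    else
        asr set_loglevel "$1"
    fi
}

# Example, run as root on the ASR Manager:
#   set_loglevel_checked Fine
```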

Installing ASR Manager on Blade Systems

Before installing ASR Manager on a blade system, make sure the svc:/milestone/multi-user-server service is online.

Installing ASR Manager on a Local Zone

If the ASR Manager is installed on a local zone, it is not possible to activate the ASR Manager as an ASR asset. An attempt to do so returns the error: Asset cannot be activated due to unknown product name or serial number. This is a known issue expected to be corrected in a future version of ASR.

Miscellaneous Errors and Resolution

This section provides a variety of error conditions and resolution steps.

Error Messages and Resolutions

Error Message: WARNING: Unable to retrieve fault details. For additional information and some insights into how to correct, please see the ASR Installation and Operations Guide - located at www.oracle.com/asr. See the ASR General Troubleshooting Section.
Resolution:
  1. Verify that the asset meets the minimum required Solaris version and patch level listed on the ASR qualified systems web page (see http://www.oracle.com/asr for more information).
  2. Review the community string properties on the asset. ASR Manager requires public as the value of the community string in order to retrieve FMA enrichment and additional fault details. (See "Enable M-Series XSCF Telemetry" for more details.)
  3. Review the FMA trap destination configuration file, and restart the sma and fmd SMF services.

Error Message: WARNING: This trap is rejected because the asset is disabled
Resolution: Enable the ASR Asset using one of the following commands:

asr enable_asset -i <ip> (where ip is the IP address of the ASR asset)

or

asr enable_asset -h <host> (where host is the hostname of the ASR asset)

Error Message: WARNING: this trap is rejected because OASM ASR Plug-in is not activated
Resolution: Enable the ASR Manager using one of the following commands:

asr activate_asset -i <ip> (where ip is the IP address of the ASR asset)

or

asr activate_asset -h <host> (where host is the hostname of the ASR asset)

Error Message: WARNING: this trap is rejected because the asset is not found
Resolution: Enable the ASR Asset using one of the following commands:

asr activate_asset -i <ip> (where ip is the IP address of the ASR asset)

or

asr activate_asset -h <host> (where host is the hostname of the ASR asset)

Error Message: SEVERE: Cannot attach snmp trap to snmp service!
Resolution: This indicates that there could be another process using port 162. Kill that process and then run:

svcadm restart sasm

Error Message: Failure to Register Errors
Resolution: The sasm.log has more detailed information and a Java stacktrace on what failed during registration. When a failure error is encountered, additional details can be found in:

/var/opt/SUNWsasm/configuration/sasm.log

Error Message: No Such Host Exception
Resolution: This error indicates that the host running ASR Manager cannot resolve the IP address for the Data Transport Service server. Refer to Section 5.5.5, "Test Connectivity from the ASR Manager to Oracle" to troubleshoot and resolve the problem.

Error Message: Not Authorized. The Sun Online Account provided could not be verified by the transport server
Resolution: This error indicates that the communication between the transport server and Oracle is down or busy. This can also indicate that the queue set-up is wrong or that the user does not have permissions to the queue.

Error Message: Socket Exception: Malformed reply from SOCKS server
Resolution: This error indicates one of the following:
  • The socks configuration in the config.ini file is incorrect or missing. Action: This usually indicates that you need to supply a user/password for the socks settings.

  • The socks server is not able to route to the transport server endpoint. Action: Add the correct http proxy information or socks settings. Refer to "Configure ASR to Send HTTPS Traffic Through a Proxy Server" to correct the information.


Only One Client Can Access Console at a Time

If you get this error message when running an ASR command on the ASR Manager system, it indicates that the OASM admin port accepts only one command at a time. Each command can hold the connection for a maximum of 60 seconds before the OASM console kills the connection. Try executing the command again after 60 seconds. If you still get the same message, do the following:

  1. Check if OASM is running:

    ps -ef | grep SUNWsasm

  2. Results:

    root 16817 1 0 16:09:49 ? 4:24 java -cp /var/opt/SUNWsasm/lib/com.sun.svc.container.ManagementTier.jar:/var/opt

  3. If OASM is running, kill the process using the following command:

    kill -9 <Process_ID>

  4. Restart the OASM using the following command:

    svcadm restart sasm
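
Picking the process ID out of the ps listing by eye is error-prone on a busy system. The following sketch extracts it. The oasm_pids function is a hypothetical helper that assumes the standard ps -ef layout, where the second column is the PID.

```shell
# Read "ps -ef" output on stdin and print the PID (second column) of each
# process whose command line mentions SUNWsasm, ignoring any grep process.
# oasm_pids is a hypothetical helper name.
oasm_pids() {
    awk '/SUNWsasm/ && !/grep/ { print $2 }'
}

# Example, run as root on the ASR Manager:
#   ps -ef | oasm_pids
```

Review the printed IDs before passing them to kill, since any process whose command line mentions SUNWsasm is matched.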

ASR Fault Rules Updates

ASR uses fault rules to filter the telemetry data sent from ASR Assets. This filtering is done to remove telemetry that contains no real fault data and general telemetry “noise.” The filtering process also ensures that telemetry that contains faults is reported. These fault rules can change as ASR improves its filtering and as new platforms and telemetry sources are supported by ASR. ASR installs a cron job on the ASR Manager system to periodically check Oracle's auto-update server for any new rules updates. When there are new rules, the ASR Manager automatically downloads and installs the latest rules bundle. If the cron job is not set to download the fault rules automatically, an e-mail is sent to:

For more information on fault rules, refer to:


Note:

If the asr heartbeat is disabled in crontab, you will not be notified, via e-mail, if your ASR fault rules are out of date with the most current release. To be sure your fault rules are current, you can run the asr update_rules command from the ASR Manager system.

ASR Manager Crashed, Move Assets to a New ASR Manager

In cases where an ASR Manager experiences a critical failure, you can set up a new ASR Manager and reconfigure ASR Assets to report to the new host. The following steps describe a sample scenario:

  1. An ASR Manager is set up (e.g., hostname: ASRHOST01, IP address: 10.10.10.1) and configured on the network. This ASR host is registered and activated to itself.

  2. All ASR assets are configured to report failures to the ASR Manager host (ASRHOST01), and all ASR assets are activated on the host.

  3. A critical failure occurs in the cabinet of ASRHOST01 (for example: a fire destroys the system and its data). The assets need to be attached to a different ASR Manager host (e.g., hostname: ASRHOST02).

  4. A new ASR Manager is set up (e.g., hostname: ASRHOST02, IP address: 10.10.10.2) and configured on the network. The new ASR host is registered and activated to itself.

  5. All ASR assets are now re-configured to report failures to the new ASR Manager host ASRHOST02, and the trap destination is changed to report failures to ASRHOST02.

  6. All ASR assets are now activated on ASRHOST02


Note:

To reduce the additional work of moving the ASR Manager to a different location (e.g., from ASRHOST01 to ASRHOST02), you can create an ASR backup on another host or on the existing host. Creating a backup is crucial when recovering from a crash (see "ASR Backup and Restore" for details on creating an ASR backup).

ASR - No Heartbeat

The ASR Manager must be configured correctly to run the daily cron job for asr heartbeat. If no heartbeat is received for 50 hours, the unit is marked as a 'Heartbeat Failure' unit.

If an ASR Manager is in Heartbeat Failure mode for 90 days, it is automatically deactivated at the ASR backend and in My Oracle Support. Any assets that are configured via that ASR Manager are also marked deactivated. This prevents any future events from creating automatic Service Requests.

You can check whether any ASR Manager or ASR Asset is in Heartbeat Failure by reviewing the ASR status in My Oracle Support.

If you believe that the ASR Manager is configured correctly, you can troubleshoot your ASR Manager hardware to resolve the problem. See MOS knowledge article 1346328.1 for instructions for your particular hardware:

https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=HOWTO&id=1346328.1

See Appendix A, "Heartbeat Failure Notification E-mail Examples" for an e-mail example you may receive should this problem occur.

ASR Assets for Solaris 11

If you are having issues configuring ASR on Solaris 11 assets using the asradm command, review the status of the asr-notify SMF service:

svcs asr-notify

Output should look like this:

STATE        STIME      FMRI
online       13:00:31   svc:/system/fm/asr-notify:default

Note:

If the asr-notify service is in maintenance mode, clear the maintenance state:

svcadm clear asr-notify

Then re-register the Solaris 11 asset with the ASR Manager.