13 Troubleshooting OSM

This chapter provides guidelines to help you troubleshoot problems with your Oracle Communications Order and Service Management (OSM) system.

Information You Need for Troubleshooting

When you are diagnosing and resolving problems, you must be able to obtain the following information:

  • Database AWR report for a particular period of time.

  • Database ASH report for a particular period of time.

  • WebLogic administration server logs and output files.

  • WebLogic managed server logs and output files.

  • WebLogic node manager's logs and output files (if configured).

  • JVM garbage collector logs (if collected).

  • JVM heap dumps (be aware that these files can be very large).

  • JVM thread dumps (several in succession).

  • OSM model and a single order extracted from the database schema. For more information, see "Exporting and Importing the OSM Model and a Single Order."
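
As an illustration of collecting several thread dumps in succession, the following is a minimal sketch that assumes the JDK jstack utility is on the PATH and that you run it as the OS user that owns the WebLogic server process. The function name, the file naming pattern, and the DUMP_CMD override are hypothetical and are not part of OSM or WebLogic Server.

```shell
#!/bin/sh
# Sketch: capture several JVM thread dumps in succession.
# Assumption: jstack (shipped with the JDK) is available on the PATH.

# collect_thread_dumps PID [COUNT] [INTERVAL_SECONDS] [OUTPUT_DIR]
collect_thread_dumps() {
    pid=$1
    count=${2:-3}
    interval=${3:-10}
    out_dir=${4:-.}
    # Hypothetical hook: set DUMP_CMD to substitute another command for jstack.
    dump_cmd=${DUMP_CMD:-jstack}

    mkdir -p "$out_dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        "$dump_cmd" "$pid" > "$out_dir/threaddump_${pid}_${i}.txt" || return 1
        # Pause between dumps so that they show how the threads change over time.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, collect_thread_dumps 12345 5 10 /tmp/dumps would write five dumps, ten seconds apart, for the process with ID 12345.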

General Checklist for Resolving Problems

If you have a problem with your OSM system, go through the following checklist before you contact Oracle Technical Support:

  • What exactly is the problem? Can you isolate it? For example, if an order causes a problem on one computer, does it give the same result on another computer?

    Oracle Technical Support needs a clear and concise description of the problem, including when it began to occur.

  • What do the log files say?

    This is the first thing that Oracle Technical Support asks for. Check the error log for the OSM component you are having problems with.

  • Have you read the documentation?

    Look through the list of common problems and their solutions in "Diagnosing Some Common Problems with OSM".

  • Has anything changed in the system? Did you install any new hardware or new software? Did the network change in any way? Does the problem resemble another one you had previously? Has your system usage recently jumped significantly?

  • Is the system otherwise operating normally? Has response time or the level of system resources changed? Are users complaining about additional or different problems?

Diagnosing Some Common Problems with OSM

This section describes common problems and their solutions.

Cannot Log in or Access Certain Functionality

If you cannot log in or access certain functionality, check the following possible causes:

  • Are you a valid user in the Oracle WebLogic Server security realm?

  • Is the OSM web application deployed?

  • Are all OSM Enterprise Java Beans (EJB) deployed?

  • Are the OSM database resources deployed?

  • Do you belong to the correct groups in the WebLogic Server security realm?

  • Do you belong to any OSM workgroup?

System Appears Slow

If the functionality of OSM appears to be present, but performance is slow, check the following possible causes:

  • The amount of memory being used (check the max memory configuration in the WebLogic server startup script on the workstation where you have deployed OSM)

  • The CPU and disk usage on the machine hosting the OSM database

  • The database connections

  • For slow worklist access, check the number of flexible headers on your worklist. The number of flexible headers has a direct negative effect on worklist performance.
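
To check the memory configuration in a startup script, you could list the lines that set the standard JVM heap options. This is a minimal sketch; the function name is hypothetical, and the path of the script depends on your domain.

```shell
#!/bin/sh
# Sketch: list the lines of a startup script that set JVM heap options.
# -Xms (initial heap) and -Xmx (maximum heap) are standard JVM options.

# show_heap_settings FILE
show_heap_settings() {
    # -n prefixes each matching line with its line number.
    grep -n -E -e '-Xm[sx][0-9]+[kKmMgG]?' "$1"
}
```

For example, show_heap_settings domain_home/bin/setDomainEnv.sh prints each matching line with its line number.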

Error: "java.lang.StackOverflowError" when Using Task Web Client

You may see the error "java.lang.StackOverflowError" in the log files when you use the plus ( + ) and minus ( - ) buttons to add or delete data elements in the OSM Task web client. If this happens, you can address the problem by tuning the thread stack size parameter in WebLogic Server as described below.

Note:

The procedures below set the value to 2 MB. This is a suggested starting value. Adjust it if necessary to suit your environment.

Solution

To increase the thread stack size setting for WebLogic servers on UNIX and Linux:

  1. Back up the domain_home/bin/setDomainEnv.sh file by saving a copy with a different name.

  2. Open the domain_home/bin/setDomainEnv.sh file in a text editor.

  3. Search for the following:

    USER_MEM_ARGS="
    
  4. Do one of the following:

    • If you find the search text, change the value of the variable so that the following option is set:

      -Xss2m
      
    • If you do not find the search text, do the following:

      1. Search for the following line:

        # IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
        
      2. Above the line that you searched for, add the USER_MEM_ARGS environment variable as follows:

        USER_MEM_ARGS="-Xss2m"
        
        # IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
        ...
        
  5. Save and close the file.
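
The UNIX and Linux procedure above could be scripted roughly as follows. This is only a sketch under stated assumptions: set_thread_stack_size is a hypothetical helper (not an OSM or WebLogic tool), it expects the USER_MEM_ARGS assignment to sit on a single line, and it writes the backup next to the original file with a .bak suffix.

```shell
#!/bin/sh
# Sketch: set -Xss in USER_MEM_ARGS in setDomainEnv.sh (hypothetical helper).

MARKER='# IF USER_MEM_ARGS the environment variable is set'

# set_thread_stack_size FILE [SIZE]   (SIZE defaults to 2m)
set_thread_stack_size() {
    file=$1
    size=${2:-2m}
    tmp="$file.tmp.$$"

    # Back up the file under a different name.
    cp -p "$file" "$file.bak" || return 1

    if grep -q '^[[:space:]]*USER_MEM_ARGS="[^"]*-Xss' "$file"; then
        # The variable already sets -Xss: replace the existing value.
        sed 's/^\([[:space:]]*USER_MEM_ARGS="[^"]*\)-Xss[0-9]*[kKmMgG]\{0,1\}/\1-Xss'"$size"'/' \
            "$file" > "$tmp" || return 1
    elif grep -q '^[[:space:]]*USER_MEM_ARGS="' "$file"; then
        # The variable exists without -Xss: append the option to its value.
        sed 's/^\([[:space:]]*USER_MEM_ARGS="[^"]*\)"/\1 -Xss'"$size"'"/' \
            "$file" > "$tmp" || return 1
    else
        # The variable is missing: add it above the marker comment.
        grep -q -F "$MARKER" "$file" || return 1
        awk -v marker="$MARKER" -v line="USER_MEM_ARGS=\"-Xss$size\"" '
            !done && index($0, marker) == 1 { print line; print ""; done = 1 }
            { print }
        ' "$file" > "$tmp" || return 1
    fi

    # Write back in place so that the original file permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, set_thread_stack_size domain_home/bin/setDomainEnv.sh 2m. Review the result against the .bak copy before restarting the server.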

To increase the thread stack size setting for WebLogic servers on Windows:

  1. Back up the domain_home\bin\setDomainEnv.cmd file by saving a copy with a different name.

  2. Open the domain_home\bin\setDomainEnv.cmd file in a text editor.

  3. Search for the line that begins with the following:

    set USER_MEM_ARGS
    
  4. Do one of the following:

    • If you find the search text, change the value of the variable so that the following option is set:

      -Xss2m
      
    • If you do not find the search text, do the following:

      1. Search for the following:

        @REM IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
        
      2. Above the line that you searched for, add the USER_MEM_ARGS environment variable as follows:

        set USER_MEM_ARGS=-Xss2m
        
        @REM IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
        ...
        
  5. Save and close the file.

Error: "Login failed. Please try again."

If the error "Login failed. Please try again" is displayed when trying to log in through the web client and you have entered the correct user name and password, you probably do not belong to the correct groups in the WebLogic Server security realm.

Solution

Log in to the WebLogic Administration Console using the administrator account. Make sure you have been added to the group OMS_client. Try to log in again.

No Users or Groups Are Displayed

After OSM installation, you do not see any users or groups on the Users and Groups tab. This is because non-dynamic changes have been made, and the WebLogic administration server (and managed server, if applicable) requires a restart.

Solution

  1. Restart the administration/managed server to clear the condition.

    If the condition does not clear, proceed with the steps below.

  2. Log in to the WebLogic Administration Console and select Domain.

  3. Select the Security tab.

  4. Select Advanced. If necessary, scroll down the page to find Advanced.

  5. Select the Allow Security Management Operations if Non-dynamic Changes have been Made check box.

  6. Click Save.

  7. Navigate to the Users and Groups tab.

    Your users and groups appear.

Automation Plug-ins Are Not Getting Called

If the custom automation plug-ins are not getting called, check the following possible causes:

  • Is the Automation configuration deployed properly?

  • Are the JMS resources deployed?

  • Are the JMS destinations, queues, and topics configured properly?

Too Many Open Files

If you have a large number of external clients connected to OSM and receive the error "java.net.SocketException: Too many open files", do the following:

  • From the WebLogic Administration Console, select Servers, then Server, then Protocols, and then HTTP. Reduce the value in Duration from the default 30 seconds to 15 or even 5 seconds. This will allow the WebLogic server to close idle HTTP connections and release more sockets.
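
Before and after changing the setting, it can be useful to compare the number of file descriptors that the server process has open with its limit. The following sketch assumes a Linux host with the /proc file system and a user with permission to read the entries for the server process. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch (Linux only): compare a process's open file descriptors with its limit.

# open_fd_count PID: number of file descriptors the process currently has open.
open_fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# max_open_files PID: soft limit on open files, read from /proc/PID/limits.
max_open_files() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}
```

For example, echo "$(open_fd_count 12345) of $(max_open_files 12345)" for a server process with ID 12345.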

Proxy Fails on a Clustered System

If a proxy fails on a clustered OSM system, all HTTP requests that would normally go through the proxy can no longer get to the OSM server. The problem could be with the physical host the server is running on, or it could be a problem with a standalone managed server that is not part of the cluster but is part of the domain.

To recover, restart the proxy.

Orders Are Not Being Created on a Clustered System

If messages are being successfully added to the JMS queue but the corresponding orders are not being created in OSM, first ensure that the servers are running. If they are running, check to see whether the address has been set for your cluster. The Cluster Address field is located in the General tab of the settings for your cluster. If the cluster address is not set, or does not contain the correct values for your managed servers, OSM will not pick up orders from the JMS queue. Generally, this value is set when the domain is created, but it can be changed or removed manually, which can cause this problem to occur.

For more information about the correct value for a cluster address, see the discussion about configuring the WebLogic Server Domain in the OSM Installation Guide chapter on installing OSM in a clustered environment.

Problems Displaying Gantt Charts on Solaris 5.10 Hosted Systems

On Solaris 5.10 only, if you use an X server to display Gantt charts in the Task web client, you must configure the Java settings for Oracle WebLogic Server. Otherwise, you might encounter display problems, system instability, and performance problems.

See the discussion about enabling graphical displays in the post-installation section of the OSM Installation Guide.

OSM Fails to Process Orders Because of Metadata Errors

Metadata errors can occur in any cartridge with orchestration model entities and can cause order processing failures. Search for the string Metadata Errors in the Console view of the Cartridge Management editor in Design Studio. If you are not using Design Studio to deploy cartridges, look in the WebLogic Server logs for the same string.

For more information, see the discussion of metadata errors in OSM Developer's Guide.
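
When you search the WebLogic Server logs, a recursive search of the log directory saves opening each file in turn. The following is a minimal sketch; the function name is hypothetical, and the log directory shown in the example is an assumption based on the default WebLogic layout.

```shell
#!/bin/sh
# Sketch: search every file under a log directory for the string "Metadata Errors".

# find_metadata_errors LOG_DIR
find_metadata_errors() {
    # /dev/null forces grep to print the file name even if only one file is found.
    find "$1" -type f -exec grep -n -F 'Metadata Errors' /dev/null {} +
}
```

For example, find_metadata_errors domain_home/servers/server_name/logs prints each match as file:line:text.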

Error: "No Backend Servers Available"

If the error "No Backend Servers Available" is displayed, you are likely disconnected from your servers. Ensure your servers are connected and functional before continuing with OSM operations.

Quick Fix Button Active During Order Template Conflicts in Design Studio

Conflicts can occur when order templates are created in Design Studio. Presently, Quick Fix does not work for order template conflicts, even if the Quick Fix button is active. All order template conflicts must be resolved manually.

Cannot Create New Orders on a New Cartridge Version

Order creation can fail on a new version of an existing cartridge, even after you have updated all required entities, and built and deployed the cartridge.

When the createOrder request fails, you receive a response like the following example:

<env:Envelope xmlns:env="http://schemas/soap/envelope/">
   <env:Header/>
   <env:Body>
      <env:Fault xmlns:ord="http://URL/communications/ordermanagement">
         <faultcode>ord:fault</faultcode>
         <faultstring>Failed to create and start the order due to java.lang.RuntimeException: OMSException: encountered error starting orchestration caused by:Cannot find task for notification id</faultstring>
         <faultactor>unknown</faultactor>
         <detail>
            <InvalidOrderSpecificationFault xmlns="http://URL/communications/ordermanagement">
               <Description>Failed to create and start the order due to java.lang.RuntimeException: OMSException: encountered error starting orchestration caused by:Cannot find task for notification id</Description>
            </InvalidOrderSpecificationFault>
         </detail>
      </env:Fault>
   </env:Body>
</env:Envelope>

Solution

  1. Open the solution cartridge.

  2. Click the Dependency tab of the model project.

  3. Remove all the dependencies that are displayed for the project.

  4. Re-add all the dependencies.

  5. Restart Design Studio.

Error: "exact fetch returns more than requested number of rows"

You may see the error "exact fetch returns more than requested number of rows" in the log files if memory issues related to very large orders cause contention in orchestration XQuery calls when multiple orchestration plans are running at the same time. The default orchestration plan concurrency level is 8. You can reduce this value as described below.

Solution

To decrease the orchestration plan concurrency level on UNIX and Linux:

  1. Back up the domain_home/bin/setDomainEnv.sh file by saving a copy with a different name.

  2. Open the domain_home/bin/setDomainEnv.sh file in a text editor.

  3. Search for the following:

    USER_MEM_ARGS="
    
  4. Search for the following line:

    # IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
    
  5. Above the line that you searched for, add the following Java option:

    export JAVA_OPTIONS="${JAVA_OPTIONS} -Doracle.communications.ordermanagement.orchestration.generation.model.ConcurrencyLevel=7"
    
    # IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
    ...
    

    You can set the value lower if the error message continues to appear.

  6. Save and close the file.
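
If you expect to lower the value more than once, the edit could be scripted. The following sketch uses the property name and the marker comment from the procedure above. The helper name set_concurrency_level and the .bak backup suffix are hypothetical. On the first run it adds the export line above the marker comment, and on later runs it changes only the number.

```shell
#!/bin/sh
# Sketch: add or update the orchestration plan concurrency level in setDomainEnv.sh.

PROP='oracle.communications.ordermanagement.orchestration.generation.model.ConcurrencyLevel'
MARKER='# IF USER_MEM_ARGS the environment variable is set'

# set_concurrency_level FILE LEVEL
set_concurrency_level() {
    file=$1
    level=$2
    tmp="$file.tmp.$$"

    # Back up the file under a different name.
    cp -p "$file" "$file.bak" || return 1

    if grep -q -F -e "-D$PROP=" "$file"; then
        # The option is already present: change only its numeric value.
        sed "s/\(-D$PROP=\)[0-9]*/\1$level/" "$file" > "$tmp" || return 1
    else
        # The option is missing: add it above the marker comment.
        grep -q -F "$MARKER" "$file" || return 1
        awk -v marker="$MARKER" \
            -v line="export JAVA_OPTIONS=\"\${JAVA_OPTIONS} -D$PROP=$level\"" '
            !done && index($0, marker) == 1 { print line; print ""; done = 1 }
            { print }
        ' "$file" > "$tmp" || return 1
    fi

    # Write back in place so that the original file permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, set_concurrency_level domain_home/bin/setDomainEnv.sh 7, followed by a restart of the affected servers.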

Invalid Entity View Fault Error

When you are using the GetEntity assets or accounts web service operation, you might receive a view fault error.

This error might be caused by any of the following:

  • Specified view name does not exist.

  • User does not have permission to use the specified view or entity. For information about permissions, see the topic about modeling assets and accounts in OSM Modeling Guide.

  • Specified Id and entity type do not match. For example, you search for an address type with Id 1000718. Although the Id number exists, it is not an address entity type.

For information about using OSM entity management web services for Asset and Account entities, see OSM Developer's Guide.

Getting Help with OSM Problems

If you cannot resolve your problems with OSM, contact Oracle Technical Support.

Before You Contact Support

Problems can often be fixed by shutting down OSM and restarting the computer that OSM runs on.

If that does not solve the problem, the first troubleshooting step is to look at the error log for the application or process that reported the problem. Consult "General Checklist for Resolving Problems" before reporting the problem to Oracle.

Reporting Problems

If "General Checklist for Resolving Problems" does not help you to resolve the problem, write down the pertinent information:

  • A clear and concise description of the problem, including when it began to occur.

  • Relevant portions of the log files.

  • Relevant configuration files, such as oms-config.xml.

  • Recent changes in your system, even if you do not think they are relevant.

  • List of all the OSM components and patches installed on your system.

When you are ready, report the problem to Oracle.
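
To gather the log and configuration files into a single file for the report, you could use a sketch such as the following. The function name, the archive name, and the file names in the example are hypothetical. Include only the files that are relevant to the problem.

```shell
#!/bin/sh
# Sketch: bundle log and configuration files into one compressed archive.

# bundle_support_files ARCHIVE FILE...
bundle_support_files() {
    archive=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: bundle_support_files ARCHIVE FILE..." >&2
        return 1
    fi
    # Stop early if any file is missing or unreadable, so the archive is complete.
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "cannot read: $f" >&2
            return 1
        fi
    done
    tar -czf "$archive" "$@"
}
```

For example, bundle_support_files osm_problem.tar.gz managed_server.log oms-config.xml.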