8 Troubleshooting OSM Installation Problems

This chapter describes some of the issues you may encounter during the Oracle Communications Order and Service Management (OSM) installation process and their solutions.

Artifacts Generated by the Installer

The OSM installer generates the following files and artifacts, which can be found in the $OSM_CONFIG_HOME/configuration/environment-name directory. If $OSM_CONFIG_HOME is not defined, then the directory is $HOME/.osm/configuration/environment-name.
  • configuration.properties file: This file is generated by the discover.sh script and holds the information you provided and the information that the script discovered about the target environment.
  • *Model.yaml: These files represent the last attempted domain configuration applied to the WebLogic installation in this environment. They are generated after running the configDomain.sh and configOSM.sh scripts.
  • osm_schema_installs or installer_schema_upgrades: This directory contains the following items, which are generated after running the configDB and configOSM scripts. The installer_schema_upgrades directory is created if you are migrating from a legacy installer schema:
    • InstallPlan-OMS-CORE.csv and InstallPlan-SEMELE-CORE.csv: These files contain data related to the DB InstallPlan actions along with the status and error messages, if any.
    • AnalysisReport.xml: This file contains the OSM schema migration analysis report.
    • staging: This directory has model files for the OSM schema as well as the Semele-related models, which have to be created or upgraded.
  • osm-wdt-app-archive.zip: This archive file represents the last attempted deployment of applications to the WebLogic installation in this environment. This is generated after running the configDomain and configOSM scripts.
  • update_domain_output: This directory contains files that provide information about servers and resources that need to be restarted. These files get generated after running the configDomain and configOSM scripts.
  • wdt_logs: This directory contains the updateDomain.log file, which holds logs related to the domain update. It is generated by the WDT tool after running the configDomain and configOSM scripts.

Apart from these, the OSM installer also generates logs for each installer script run. These can be found under the $HOME/osm-installer-log/ directory, where $HOME is the user home directory, in log files named osm-installer-log_YYYYMMDD_HHMMSS.log.
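Because the log file names embed a timestamp, they sort chronologically, which makes it easy to locate the most recent run. The following sketch demonstrates this against a hypothetical directory; for a real installation, substitute $HOME/osm-installer-log:

```shell
# Hypothetical log directory with invented file names; real logs live
# under $HOME/osm-installer-log/
logdir=/tmp/osm-installer-log
mkdir -p "$logdir"
touch "$logdir/osm-installer-log_20240101_120000.log" \
      "$logdir/osm-installer-log_20240102_090000.log"

# Timestamped names sort chronologically, so the last entry is the
# most recent installer run
latest=$(ls "$logdir" | sort | tail -1)
echo "latest log: $latest"
```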

Coherence Configuration Error: ORA-00001: unique constraint

The following errors can occur in the OSM WebLogic server logs when creating an order:

ORA-00001: unique constraint (ORDERMGMT4701.XPKOM_ORDER_FLOW_COORDINATOR) violated
ORA-00001: unique constraint (ORDERMGMT_OSMPRD.XPKOM_ORDER_HEADER) violated
ORA-00001: unique constraint (ORDERMGMT_OSMPRD.XPKOM_HIST$ORDER_INSTANCE) violated
ORA-00001: unique constraint (ORDERMGMT_OSMPRD.XPKOM_ORDER_INSTANCE) violated

These errors occur because incorrect or missing Coherence settings cause the nodes in the cluster to be unaware of each other. The servers are unaware that they must generate order IDs that take the other servers into consideration. This problem does not occur if the same server gets all of the createOrder requests. The problem occurs when any other server gets a request and uses the wrong formula to generate the order ID.

For more information about Coherence, see "Configuring Oracle Coherence for an OSM Cluster."

Coherence Not Able to Start in a Firewall Enabled Environment

When a firewall is configured between the servers, it blocks communication between the nodes of the Coherence cluster. The ports used for Coherence communication must be opened to allow Coherence traffic through. Open both the Coherence cluster port and each server's local Coherence listening port in the firewall. Each server's local Coherence listening port must be defined by you instead of being allocated by Coherence.

Note:

The port specified for the local Coherence listening port must not be the same as the unicast port used by the Coherence cluster.

You can configure the server's local Coherence listening port by passing a -D argument to the server JVM, for example:

-Dcoherence.localport=9000

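Before or after adjusting firewall rules, you can verify from another node whether a given Coherence port is reachable. A minimal sketch using bash's /dev/tcp feature; the host and port below are hypothetical placeholders, so substitute a real cluster node and the value you passed via -Dcoherence.localport:

```shell
# Hypothetical host and port; substitute a real cluster node address
# and your configured local Coherence listening port
host=127.0.0.1
port=9000

# bash can open a TCP connection via /dev/tcp; the timeout guards
# against firewalls that silently drop packets
if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
  echo "port $port reachable on $host"
else
  echo "port $port blocked or closed on $host"
fi
```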
For more details, refer to the following Knowledge Management Articles on My Oracle Support:
  • What Are All the Ports Needed to Be Opened for Coherence (Doc ID 1472388.1)

  • How To Set Up Coherence Cluster With Firewall Configured Between The Hosted Machines? (Doc ID 2423425.1)

Error About T3 After Initial OSM Startup

The first time you start the OSM server after installation, you may see an exception indicating T3 file attachment not found.

If this occurs, restart the server.

Node Manager Does Not Create IP Address for Whole Server Migration

When you start up a managed server that is configured for whole server migration, the managed server fails to start because node manager does not create the floating IP address for the managed server.

If this occurs, ensure that you selected Automatic Server Migration Enabled when you configured the managed server. Node manager does not allocate IP addresses to managed servers unless this value is selected. See "Configure Managed Servers for Whole Server Migration" for information about setting this value.

Handling an OSM Database Schema Installation Failure

When the installer fails during an installation, you receive an error message. Before you continue with the installation, you must find and resolve the issue that caused the failure. There are several places where you can look to find information about the issue.

The database installation action plan spreadsheet is a file that contains a summary of all the installation actions that are part of this OSM database schema installation or upgrade. The actions are listed in the order that they are performed. The spreadsheet includes actions that have not yet been completed. To find the action that caused the failure, do the following:
  1. Go to the $OSM_CONFIG_HOME/configuration/$osm_env_name/osm_schema_installs/YYYY-MM-DD-HHMMSS/ directory and look for the InstallPlan-OMS-CORE.csv and InstallPlan-SEMELE-CORE.csv files.
  2. Review the status column in these files. The failed action is the first action whose status is FAILED. The error_message column of that row contains the reason for the failure.
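The check in step 2 can be scripted with awk. The sketch below runs against a mocked-up install plan file; the rows and the assumed three-column layout (action, status, error_message) are invented for illustration, so adjust the field positions to match the actual column order in your CSV files:

```shell
# Hypothetical sample rows; real files are under the
# osm_schema_installs/<timestamp>/ directory
cat > /tmp/InstallPlan-OMS-CORE.csv <<'EOF'
action,status,error_message
CREATE_TABLE om_order_header,COMPLETED,
CREATE_INDEX xpk_om_order_header,FAILED,ORA-01652: unable to extend temp segment
CREATE_TABLE om_order_instance,NOT_STARTED,
EOF

# Print the first action whose status column is FAILED, along with
# its error message, then stop
awk -F',' '$2 == "FAILED" { print "Failed action: " $1; print "Reason: " $3; exit }' \
  /tmp/InstallPlan-OMS-CORE.csv
```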

The installation log file gives a more detailed description of all the installation actions that have been run for this installation. This log file is located in the $HOME/osm-installer-log/osm-installer-log_YYYYMMDD_HHMMSS.log file. The failed action is typically at the bottom, that is, the last action that was performed.

Once the issue is resolved, you can rerun the same installer script. It will continue from the point it failed. Remember to rerun the same configOSM.sh or configDB.sh installer scripts that you used earlier.

Also, the following database tables contain information about the database installation:

  • om_$install$plan_actions: This contains the same information as the database plan action spreadsheet. Compare this table with the spreadsheet in case of a database connection failure.

  • om_$install$plan: This contains a summary of the installation that has been performed on this OSM database schema.

Database Connection Problems During Installation

If you receive database connection errors, you can try the following options to fix the issue:
  1. If you have an issue while running the discover.sh script, verify the information that you provided in the script (using the back command as required). Correct any errors and try again. If the information provided is accurate, verify connectivity to the database from the host running the installer and rectify any issues. Use the back and next commands to trigger re-evaluation of the database information and to retry the database connection.
  2. If an issue arises while running the configDB or the configDomain script, validate that the database information in this environment's configuration.properties is accurate. You should also check for connectivity issues or database server availability issues. If configuration.properties is not accurate, you need to rerun the discover.sh script using the existing configuration properties and use this to update the database details. Once the issue is rectified, rerun the install script.

Database connection issues can also be related to latency in the database connection. Acceptable network latency is between 0.2 and 0.4 ms. Anything higher than 1 ms can substantially reduce OSM performance.

To verify network latency, do the following:

  1. Log in to the machine running the OSM server.

  2. Run the following command:

    #ping -s osm_database

    where osm_database is the host name or IP address of the machine running the OSM database server.

    The system responds with lines similar to the following:

    PING osm_database: 56 data bytes
    64 bytes from osm_database: icmp_seq=0. time=0.389 ms
    64 bytes from osm_database: icmp_seq=1. time=0.357 ms

    A time value of less than 0.4 ms indicates acceptable network latency. A value greater than 1.0 ms indicates excessive network latency.

    Note:

    Solaris uses the -s option. Linux and AIX do not require this option.
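The latency check above can be automated by averaging the time= values in the ping output. The sketch below parses a captured sample; the three measurements are invented for illustration, and on a real system you would pipe the output of ping (for example, ping -c 5 osm_database on Linux) into the same awk command:

```shell
# Hypothetical captured ping output; on a real system, capture it with
# something like: ping -c 5 osm_database > /tmp/ping_output.txt
cat > /tmp/ping_output.txt <<'EOF'
64 bytes from osm_database: icmp_seq=0. time=0.389 ms
64 bytes from osm_database: icmp_seq=1. time=0.357 ms
64 bytes from osm_database: icmp_seq=2. time=0.412 ms
EOF

# Average the time= values and warn when latency exceeds 1.0 ms
awk -F'time=' '/time=/ { split($2, t, " "); sum += t[1]; n++ }
  END { avg = sum / n; printf "avg=%.3f ms\n", avg;
        if (avg > 1.0) print "WARNING: excessive latency" }' /tmp/ping_output.txt
```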

JMS Server Connection Problems

After installation, when you restart the server, you may receive an error message from the JMS server connecting to the database. Many retries of the operation occur.

First, check the database connectivity as the database listener or database instance might be down. As a last resort, you may have to re-create the JMS server resource (not recommended) or re-run the OSM installation.

JDBC Errors When First Order Submitted

If you receive JDBC errors when the first order is submitted to OSM, you may need to turn on JDBC logging. Refer to the Oracle WebLogic Server documentation.

No Users or Groups Are Displayed

After OSM installation, you do not see any users or groups on the Users and Groups tab in the WebLogic Server Administration Console. This is because non-dynamic changes have been made, and the WebLogic administration server (and managed server, if applicable) requires a restart.

To resolve this issue:

  1. Restart the administration/managed server to clear the condition.

    If the condition does not clear, proceed with the steps below.

  2. Log in to the WebLogic Server Administration Console and select Domain.

  3. Select the Security tab.

  4. Select Advanced. If necessary, scroll down the page to find Advanced.

  5. Select the Allow Security Management Operations if Non-dynamic Changes have been Made check box.

  6. Click Save.

  7. Navigate to the Users and Groups tab.

    Your users and groups are displayed.

OSM and RCU Installers Are Slow to Run Database Tablespace Query

It can take an unusually long time for the OSM Installer and RCU Installer to run a database tablespace query. Purging the Oracle Database recycle bin ensures that the installers can run the database tablespace query more quickly.

To purge the Oracle Database recycle bin system wide:

  1. Log in to SQL*Plus as a user with sysdba privileges.

  2. Enter the following command:

    purge dba_recyclebin;
    

    The recycle bin is purged system wide.

To purge the Oracle Database recycle bin for a single user:

  1. Log in to SQL*Plus as the OSM installer database user.

  2. Enter the following command:

    purge recyclebin;
    

    The recycle bin for the database user is purged.

OSM Installer Issues

You may see the following error if you have outstanding WebLogic edit sessions while configuring OSM in the WebLogic domain:
WLSDPLY-09015: updateDomain deployment failed: Domain has outstanding edit session weblogic, deploy cannot proceed and will exit
To fix this issue:
  1. Ensure that there is no open configuration modification activity on the domain. Such activity could come from scripts invoking WLST or similar APIs, from the WebLogic Server Administration Console, or from Enterprise Manager or a similar user interface.
  2. Log out of the WebLogic console.
  3. Rerun the script to configure the domain.

Command for unpack.jar Fails with a Write Error

If you run the unpack.jar command and you receive a write error, you must provide a target application tag (-app_dir) while running the command.

For example:

./unpack.sh -template=/scratch/oracle/Middleware/user_projects/domains/osmprak_72251to730_upgddomain_final8may.jar -domain=/scratch/oracle/Middleware/user_projects/domains/osmprak_72251to730_upgddomain -app_dir=/scratch/oracle/Middleware/user_projects/applications/osmprak_72251to730_upgddomain

Managed Servers Are Unable to Form a Coherence Cluster

After restarting the managed servers upon successful installation of OSM, you may see the following warning in the managed server log file:
<Warning> <com.oracle.coherence> <BEA-000000> <2021-03-22 04:28:03.513/133.795 Oracle Coherence GE 192.0.2.1 
<Warning> (thread=Cluster, member=n/a): Delaying formation of a new cluster; 
TcpRing failed to connect to senior Member(Id=1, Timestamp=2021-03-22 04:24:17.89, Address=192.0.2.1:7777, MachineId=43781, 
Location=site:location.compute.example.com,machine:192.0.2.1,process:9670,member:M1, Role=c1); 
if this persists, it is likely the result of a local or remote firewall rule blocking connections to TCP-ring port 7777>
This issue occurs because the required ports are not enabled in the firewall. As a result, OSM managed servers cannot form a Coherence cluster and the load distribution may not happen properly.

Note:

This issue occurs mostly in private cloud environments.

To resolve the issue, enable the following ports in the firewall:
  • 7

  • 17991, 17992, and 17993

    In the setDomainEnv.sh file, enable these ports in JAVA_OPTIONS for all the machines as shown below:
    export JAVA_OPTIONS="${JAVA_OPTIONS} -Dcoherence.localport=17991 -Dcoherence.localport.adjust=17993"

Coherence uses these ports for forming the cluster.

For more details, see the "What Are All The Ports Needed To Be Opened For Coherence?" knowledge article (Doc ID: 1472388.1) on My Oracle Support.

Insufficient Partition Space Error on Solaris

When installing OSM on Solaris SPARC (64-bit) or on a Solaris VM, the installer displays the following warning on large partitions (more than 1 TB):

"WARNING: / partition has insufficient space to install the items selected.
539661.3 MB additional space would be needed to install the selected items."

To resolve this issue, reduce the free space in the partition to below 1 TB by running the mkfile command. After the installation is complete, delete the filler files to regain the free space.
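Before installing, you can check how much free space the partition reports. The sketch below reads the available space with df; the mkfile lines are Solaris-specific and shown as comments only, and the file names and sizes in them are hypothetical:

```shell
# Report available space (in KB) on the root partition; the Solaris
# installer warning appears when free space exceeds roughly 1 TB
free_kb=$(df -kP / | awk 'NR==2 { print $4 }')
echo "free KB on /: $free_kb"

# On Solaris, filler files shrink the free space (not runnable here;
# names and sizes are hypothetical):
#   mkfile 100g /filler1    # repeat until free space drops below 1 TB
# After installation completes, reclaim the space:
#   rm /filler*
```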