A Troubleshooting High Availability

This appendix describes common problems that you might encounter when deploying and managing Oracle Application Server in high availability configurations, and explains how to solve them. It contains the following topics:

Section A.1, "Troubleshooting OracleAS Disaster Recovery Topologies"
Section A.2, "Troubleshooting Middle-Tier Components"
Section A.3, "Need More Help?"

A.1 Troubleshooting OracleAS Disaster Recovery Topologies

This section describes common problems and solutions in OracleAS Disaster Recovery configurations. It contains the following topics:

Section A.1.1, "Changing the Default Oracle Data Guard Configuration Set Up by Oracle Application Server Guard"
Section A.1.2, "Failure to Bring Up Standby Instances After Failover or Switchover"
Section A.1.3, "Switchover Operation Fails At the Step dcmctl resyncInstance -force -script"
Section A.1.4, "An Oracle Application Server Guard asgctl verify Operation Does Not Check Temp Directories"
Section A.1.5, "Unable to Start Standalone OracleAS Web Cache Installations at the Standby Site"
Section A.1.6, "Standby Site Middle-tier Installation Uses Wrong Hostname"
Section A.1.7, "Failure of Farm Verification Operation with Standby Farm"
Section A.1.8, "Sync Farm Operation Returns Error Message"
Section A.1.9, "On Windows Systems Use of asgctl startup Command May Fail If the PATH Environment Variable Has Exceeded 1024 Characters"
Section A.1.10, "Adding an Instance from a Remote Client Adds an Instance on the Local Instance and Not on the Remote Instance"
Section A.1.11, "Oracle Application Server Guard Returns an Inappropriate Message When It Cannot Find the User Specified Database Identifier"
Section A.1.12, "Database Instance on Standby Site Must Be Shut Down Before Issuing an asgctl create standby database Command"
Section A.1.13, "Known Issue with Disaster Recovery Cloning on Windows"
Section A.1.14, "The asgctl shutdown topology Command Does Not Shut Down an MRCA Database That is Detected To Be of a repCA Type Database"
Section A.1.15, "Connecting to an Oracle Application Server Guard Server May Return an Authentication Error"
Section A.1.16, "Running Instantiate Topology Across Nodes After Executing a Failover Operation Results in an ORA-01665 Error"
Section A.1.17, "Oracle Application Server Guard Is Unable to Shutdown the Database Because More Than One Instance of Oracle RAC is Running"
Section A.1.18, "Create Standby Fails if Initiated on a Different ASGCTL Shell"
Section A.1.19, "Resolve Missing Archived Logs"
Section A.1.20, "Heartbeat Failure After Failover in Alert Logs"
Section A.1.21, "Create Standby Database Fails If Database Uses OMF Storage or ASM Storage"
Section A.1.22, "Database Already Exists Errors During Create Standby"
Section A.1.23, "Oracle Application Server Guard Add Instance Command Fails When Attempting to Add an Oracle RAC Database to the Topology"
Section A.1.24, "A Create Standby Database Operation Fails with an ASG_DGA-12500 Error Message on Windows"
Section A.1.25, "Use Fully Qualified Instance Names to Ensure Uniqueness"
Section A.1.26, "Misleading Message on JSSO Page"
Section A.1.27, "Instantiate Topology Fails if TNS Alias Includes Domain"
Section A.1.28, "ORA-32001 Errors during Create Standby Database"
Section A.1.29, "ORA-09925 Errors when Bringing Up RAC Database Manually after Switchover"
Section A.1.30, "Recommended Method of Patching an Oracle Application Server Disaster Recovery Site"

A.1.1 Changing the Default Oracle Data Guard Configuration Set Up by Oracle Application Server Guard

In some cases, you may want to change the default Data Guard configuration that is set up by Oracle Application Server Guard.

Problem

For example, if you are running Oracle BPEL Process Manager in an OracleAS Disaster Recovery topology, you want to ensure that:

The Oracle BPEL Process Manager dehydration data is stored in databases that are included in the OracleAS Disaster Recovery topology. This ensures that when a Disaster Recovery switchover or failover operation is performed, the database and related services are switched over or failed over in coordination with the Oracle Application Server services in the Disaster Recovery topology.
The Oracle BPEL Process Manager dehydration data stored in databases at the primary and standby sites is continuously synchronized.
When a switchover operation or failover operation occurs, Oracle BPEL Process Manager uses the database at the standby site.

Solution

To achieve this:

Store the dehydration data in a database.

Note:
If you create the standby database using the asgctl create standby database command, then the following two steps will be performed for you by the create standby database command.
Set the Oracle Data Guard data protection mode for the primary database to maximum availability mode (instead of maximum protection mode). Using maximum availability mode allows logs to be applied continuously at the standby site without shutting down the primary database if the standby database is taken offline.

Run this command on the primary database:
```
SQL> alter database set standby database to maximize availability;
```
Place the standby database in managed recovery mode. This puts the standby database in a constant state of media recovery. Configuring the standby database for managed recovery is not a requirement of maximum availability, but it provides for shorter failover times.

On the standby database, run the following command to place the standby database in managed recovery mode. Add the optional disconnect from session clause if you want to end the session after the command:
```
SQL> alter database recover managed standby database disconnect from session;
```

These steps change the Oracle Data Guard protection mode of the primary database from maximum performance to maximum availability. For details on the different Oracle Data Guard protection modes (maximum protection, maximum availability, and maximum performance), see Oracle Data Guard Concepts and Administration in the Oracle Database documentation set.

Running the primary database in maximum availability mode can cause the primary database to hang while it waits for an available online log file, because a primary database in maximum availability mode does not reuse an online log file until that file has been archived to the standby database. This can happen if the standby database is taken offline for a long time.

Only data with the same synchronization requirements should be stored in the same database. For example, the Oracle BPEL Process Manager dehydration store and the OracleAS Portal data should be stored in separate databases because the synchronization objectives of Oracle BPEL Process Manager and OracleAS Portal are different. The synchronization objective of the Oracle BPEL Process Manager dehydration store is to maintain consistency between the dehydration store and the BPEL process, while the synchronization objective of OracleAS Portal is to ensure that data and configuration maintained within the middle tier and database do not diverge.

Actions Performed by the sync topology Command

When the primary database is in maximum availability mode and the standby database is in managed recovery mode, the asgctl sync topology command does the following:

Performs a log switch at the primary and ensures that the log is shipped and archived.
Performs process management at the primary and standby sites.
Encapsulates the incremental changes for all the data in the Oracle homes.
Restores the standby peers to the configuration level of the primary.
Propagates the changes to all standby instances.
For standby databases:
- With managed recovery running, the sync topology command simply reports the sync SCN and the current database SCN of the standby database. For this configuration, the standby database SCN is guaranteed to be beyond the sync SCN. ASG logs the sync SCN level as it corresponds to the current SCN level of the standby database.
- Without managed recovery running, the sync topology command recovers the standby database to the sync SCN. It is equivalent to running the following command:
```
alter database recover managed standby database until change <sync-scn>
```
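The two standby-database cases above can be modeled in a few lines. The following Python sketch is illustrative only: the function name plan_standby_sync and the strings it returns are hypothetical and are not part of asgctl. It simply encodes the branch described in this section.

```python
def plan_standby_sync(managed_recovery_running: bool,
                      sync_scn: int,
                      standby_scn: int) -> str:
    """Model the standby-database branch of sync topology (hypothetical helper)."""
    if managed_recovery_running:
        # Managed recovery applies redo continuously, so sync topology
        # only reports the sync SCN and the current standby SCN.
        return f"report: sync SCN {sync_scn}, standby SCN {standby_scn}"
    # Without managed recovery, the standby is recovered up to the sync SCN.
    return ("alter database recover managed standby database "
            f"until change {sync_scn}")


if __name__ == "__main__":
    print(plan_standby_sync(True, 100, 120))
    print(plan_standby_sync(False, 100, 90))
```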

A.1.2 Failure to Bring Up Standby Instances After Failover or Switchover

Standby instances are not started after a failover or switchover operation.

Problem

IP addresses are used in instance configuration. OracleAS Disaster Recovery setup does not require identical IP addresses in peer instances between the production and standby site. OracleAS Disaster Recovery synchronization does not reconcile IP address differences between the production and standby sites. Thus, if you use explicit IP address xxx.xx.xxx.xx in your configuration, the standby configuration after synchronization will not work.

Solution

Avoid using explicit IP addresses. For example, in OracleAS Web Cache and Oracle HTTP Server configurations, use ANY or host names instead of IP addresses as listening addresses.
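Before synchronizing, it can help to scan configuration files for literal IP addresses. The following Python sketch is one possible way to do that, assuming the configuration is available as plain text. The function name find_explicit_ips and the sample directives are illustrative, and the check is a heuristic: any dotted token that parses as an IPv4 address is reported, including four-part version strings.

```python
import ipaddress
import re

# Candidate dotted-quad tokens; validity is checked with ipaddress below.
_DOTTED_QUAD = re.compile(r"(?<![\w.])(\d{1,3}(?:\.\d{1,3}){3})(?![\w.])")


def find_explicit_ips(config_text: str) -> list[tuple[int, str]]:
    """Return (line_number, address) for each literal IPv4 address found."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        for token in _DOTTED_QUAD.findall(line):
            try:
                ipaddress.IPv4Address(token)
            except ValueError:
                continue  # not a valid IPv4 address, for example 300.1.1.1
            hits.append((lineno, token))
    return hits


if __name__ == "__main__":
    # Hypothetical listening directives used only for illustration.
    sample = (
        "Listen 192.168.0.10:7777\n"
        "Listen *:7778\n"
        "ServerName node1.us.oracle.com\n"
    )
    for lineno, address in find_explicit_ips(sample):
        print(f"line {lineno}: explicit IP address {address}")
```

Any line that is reported is a candidate for replacement with ANY or a host name.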

A.1.3 Switchover Operation Fails At the Step dcmctl resyncInstance -force -script

The OracleAS Disaster Recovery asgctl switchover operation requires that the value of the TMP environment variable be defined the same in the opmn.xml file on both the primary and standby sites.

Problem

OracleAS Disaster Recovery switchover fails at the step dcmctl resyncInstance -force -script and displays a message that a directory could not be found.

Solution

During a switchover operation, the opmn.xml file is copied from the primary site to the standby site. For this reason, the value of the TMP variable must be defined the same in the opmn.xml file on both primary and standby sites; otherwise, the switchover operation will fail. Make sure the TMP variable is defined identically in the opmn.xml files and resolves to the same directory structure on both sites before attempting to perform an asgctl switchover operation.

For example, the following code snippets for a Windows and UNIX environment show a sample definition of the TMP variable.

Example in Windows Environment: 
------------------------------- 
.
.
.
<ias-instance id="infraprod.iasha28.us.oracle.com"> 
 <environment> 
 <variable id="TMP" value="C:\DOCUME~1\ntregres\LOCALS~1\Temp"/> 
 </environment> 
.
.
.
Example in UNIX Environment: 
---------------------------- 
.
.
.
<ias-instance id="infraprod.iasha28.us.oracle.com"> 
 <environment> 
 <variable id="TMP" value="/tmp"/> 
 </environment> 
.
.
.

A workaround to this problem is to change the value of the TMP variable in the opmn.xml file on the primary site, perform a dcmctl update config operation, then perform the asgctl switchover operation. This approach saves you having to reinstall the mid-tiers to make use of an altered TMP variable.
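To confirm that the TMP variable matches on both sites before a switchover, the two opmn.xml files can be compared programmatically. The following Python sketch assumes the element layout shown in the preceding examples (an ias-instance element containing an environment element with variable children). The function names and the sample documents, including the opmn root element and the /var/tmp value, are illustrative.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}variable'."""
    return tag.rsplit("}", 1)[-1]


def tmp_values(opmn_xml: str) -> dict[str, str]:
    """Map each ias-instance id to the value of its TMP variable, if defined."""
    result = {}
    for inst in ET.fromstring(opmn_xml).iter():
        if local_name(inst.tag) != "ias-instance":
            continue
        for env in inst:
            if local_name(env.tag) != "environment":
                continue
            for var in env:
                if local_name(var.tag) == "variable" and var.get("id") == "TMP":
                    result[inst.get("id", "")] = var.get("value", "")
    return result


def tmp_mismatches(primary_xml: str, standby_xml: str) -> dict:
    """Return {instance_id: (primary_value, standby_value)} where they differ."""
    primary, standby = tmp_values(primary_xml), tmp_values(standby_xml)
    return {
        inst: (primary.get(inst), standby.get(inst))
        for inst in sorted(primary.keys() | standby.keys())
        if primary.get(inst) != standby.get(inst)
    }


# Hypothetical, minimal documents modeled on the examples in this section.
PRIMARY = (
    '<opmn><ias-instance id="infraprod.iasha28.us.oracle.com">'
    '<environment><variable id="TMP" value="/tmp"/></environment>'
    "</ias-instance></opmn>"
)
STANDBY = PRIMARY.replace('value="/tmp"', 'value="/var/tmp"')

if __name__ == "__main__":
    print(tmp_mismatches(PRIMARY, STANDBY))
```

An empty result indicates that the TMP definitions agree. The sketch compares only the string values; it does not verify that the directories exist on each host.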

A.1.4 An Oracle Application Server Guard asgctl verify Operation Does Not Check Temp Directories

The same TEMP directory structure that exists on a primary site must be set up on the standby site.

Problem

DCM does not work properly when the same TEMP directory structure that exists on a primary site is not set up on the standby site. An Oracle Application Server Guard verify operation does not detect this problem.

Solution

Maintain the same TEMP directories on both the primary and standby sites. When creating environment variables for the standby site, ensure that each standby peer's environment is a replica of the production home. An area that is commonly forgotten or overlooked is the TEMP directory.

A.1.5 Unable to Start Standalone OracleAS Web Cache Installations at the Standby Site

OracleAS Web Cache cannot be started at the standby site, possibly due to misconfiguration. This is applicable only for 10.1.2.x and 10.1.4.x environments, and not for 10.1.3.x environments.

Problem

OracleAS Disaster Recovery synchronization does not synchronize standalone OracleAS Web Cache installations.

Solution

Use the standard Oracle Application Server full CD image to install the OracleAS Web Cache component.

A.1.6 Standby Site Middle-tier Installation Uses Wrong Hostname

A middle-tier installation in the standby site uses the wrong hostname even after the system's physical hostname is changed.

Problem

Depending on the Oracle Application Server installation, there are different methods of specifying a physical hostname. Before performing an Oracle Application Server installation, you must use the appropriate method or methods of specifying a physical hostname for that Oracle Application Server release to ensure that the installer uses the correct physical hostname.

Solution

Section 1.2.1.1, "Physical Hostnames" and its subsections provide the instructions for creating a physical hostname prior to installing an instance for a particular Oracle Application Server release. Follow the instructions for the Oracle Application Server release you are installing.

A.1.7 Failure of Farm Verification Operation with Standby Farm

When performing a verify farm with standby farm operation, the operation fails with an error message indicating that the middle-tier system instance cannot be found and that the standby farm is not symmetrical with the production farm.

Problem

The verify farm with standby farm operation is trying to verify that the production and standby farms are symmetrical to one another, that they are consistent, and conform to the requirements for disaster recovery.

One part of the verify operation is a check to confirm that hostname resolution is the same for the hosts at the production and standby site.

For example, suppose that the /etc/hosts file for node1 at the production site has this entry for node1:

123.45.67.890 node1.us.oracle.com node1 infra

In this case, the entries for node1 in other /etc/hosts files in the topology should also have node1.us.oracle.com in the second column of the entry. For example, this would be a valid entry for node1 in the /etc/hosts file for node1 at the standby site:

123.45.68.891 node1.us.oracle.com node1 infra

Solution

All of the /etc/hosts file entries for a particular host must have the same name in the second column of the /etc/hosts file entry for the host. Otherwise, the verify operation will not succeed.
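The second-column rule can be checked across several /etc/hosts files before running the verify operation. The following Python sketch is a minimal illustration that works on the file contents as text. The function names are hypothetical, and the sample entries are the ones shown in this section.

```python
from typing import Optional


def canonical_names(hosts_text: str) -> dict[str, str]:
    """Map every name on a hosts line to the name in the second column."""
    mapping = {}
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or an address with no names
        canonical = fields[1]
        for name in fields[1:]:
            mapping.setdefault(name, canonical)
    return mapping


def inconsistent_hosts(hosts_files: dict[str, str],
                       host: str) -> dict[str, Optional[str]]:
    """Return {file_label: second_column_name} if the files disagree for host."""
    seen = {label: canonical_names(text).get(host)
            for label, text in hosts_files.items()}
    return seen if len(set(seen.values())) > 1 else {}


if __name__ == "__main__":
    files = {
        "production": "123.45.67.890 node1.us.oracle.com node1 infra\n",
        "standby": "123.45.68.891 node1.us.oracle.com node1 infra\n",
    }
    print(inconsistent_hosts(files, "node1") or "second column is consistent")
```

The IP address column is intentionally not validated, because the rule concerns only the name in the second column.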

A.1.8 Sync Farm Operation Returns Error Message

A sync farm to operation returns the error message "Cannot Connect to asdb".

Problem

Some operations require that the connection to the asdb database be established before they run. Occasionally, an administrator may forget to set the primary database using the asgctl command line utility before performing such an operation. The following example shows this scenario for a sync farm to operation:

ASGCTL> connect asg hsunnab13 ias_admin/iastest2
Successfully connected to hsunnab13:7890
ASGCTL>  
.
.
.
<Other asgctl operations may follow, such as verify farm, dump farm, 
<and show operation history, and so forth that do not require the connection
<to the asdb database to be established or a time span may elapse of no activity
<and the administrator may miss performing this vital command.
.
.
.
ASGCTL> sync farm to usunnaa11
prodinfra(asr1012): Syncronizing each instance in the farm to standby farm
prodinfra: -->ASG_ORACLE-300: ORA-01031: insufficient privileges
prodinfra: -->ASG_DUF-3700: Failed in SQL*Plus executing SQL statement:  connect null/******@asdb.us.oracle.com as sysdba;.
prodinfra: -->ASG_DUF-3502: Failed to connect to database asdb.us.oracle.com.
prodinfra: -->ASG_DUF-3504: Failed to start database asdb.us.oracle.com.
prodinfra: -->ASG_DUF-3027: Error while executing Syncronizing each instance in the farm to standby farm at step - init step.

Solution

Perform the asgctl set primary database command. This command sets the connection parameters required to open the asdb database in order to perform the sync farm to operation. Note that the set primary database command must also precede the instantiate farm to command and switchover farm to command if the primary database has not been specified in the current connection session.

A.1.9 On Windows Systems Use of asgctl startup Command May Fail If the PATH Environment Variable Has Exceeded 1024 Characters

On Windows systems, the asgctl startup command may fail if the system PATH environment variable has exceeded the 1024 character limit. This can happen when many Oracle Application Server instances, many third party software installations, or both are installed on the system. The command fails because it starts the Oracle Application Server Guard server outside of OPMN and the system cannot resolve the directory path.

Problem

Occasionally, on Windows systems with many installations, Oracle Application Server instances or third party software, or both, the asgctl startup command, which is run outside of OPMN, may return a popup error stating it could not find a dynamic link library for a particular file, orawsec9.dll, followed by a DufException. For example:

C:\product\10.1.3\OC4J_1\dsa\bin> asgctl startup
<<Popup Error:>>
The dynamic link library *orawsec9.dll* could not be found.
<<The exception:>>
oracle.duf.DufException
        at oracle.duf.DufOsBase.constructInstance(DufOsBase.java:1331)
        at oracle.duf.DufOsBase.getDufOs(DufOsBase.java:122)
        at 
oracle.duf.DufHomeMgr.getCurrentHomePath(DufHomeMgr.java:582)
        at oracle.duf.dufclient.DufClient.main(DufClient.java:132)
stado42: -->ASG_SYSTEM-100: oracle.duf.DufException
-----------------------------------------------------------------------------

However, this dll does exist in the ORACLE_HOME\bin directory.

This error is not seen in the Oracle Application Server Guard standalone kit because the file orawsec9.dll exists in the ORACLE_HOME\dsa\bin folder.

Solution

The workaround is to either manually edit the system PATH variable with the required path information or manually override the PATH in the command prompt by specifying the relevant %PATH% variables. For example:

C:\set PATH=C:\product\10.1.3\OracleAS_OC4J_2\bin;
C:\product\10.1.3\OracleAS_OHS1\jre\1.4.2\bin\client;
C:\product\10.1.3\OracleAS_OHS1\jre\1.4.2\bin;
C:\product\10.1.3\OracleAS_OHS1\bin;C:\product\10.1.3\OC4J_1\bin

C:\product\10.1.3\OC4J_1\dsa\bin> asgctl startup
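To see whether a system is at risk, the length of the PATH value can be measured against the limit cited in this section. The following Python sketch is an illustration only. The function name path_report is hypothetical, and the semicolon separator is hard-coded because this issue is specific to Windows.

```python
import os

PATH_LIMIT = 1024  # character limit cited in this section


def path_report(path_value: str, limit: int = PATH_LIMIT) -> dict:
    """Summarize a Windows-style PATH value (entries separated by ';')."""
    entries = [entry for entry in path_value.split(";") if entry]
    return {
        "length": len(path_value),
        "entries": len(entries),
        "over_limit": len(path_value) > limit,
    }


if __name__ == "__main__":
    # On a Windows host, this reports on the PATH of the current session.
    print(path_report(os.environ.get("PATH", "")))
```

If over_limit is reported as true, shorten the PATH or override it in the command prompt as shown in the preceding example before running asgctl startup.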

A.1.10 Adding an Instance from a Remote Client Adds an Instance on the Local Instance and Not on the Remote Instance

When using the asgctl add instance command, the Oracle Application Server Guard client must be run from a system that is already included in the topology.

Problem

For example, when an Oracle Application Server Guard client is connected to the Oracle Application Server Guard server that is to be added to an existing topology, the following error is returned:

ASG_IAS-15785: ERROR: The topology is missing the instance that exists in the home where the ASG server is running.
You must first discover or add the instance in home
ASGCTL>

Solution

Use an Oracle Application Server Guard client from a system that is already included in the topology to perform the asgctl add instance command to add an instance to the topology.

A.1.11 Oracle Application Server Guard Returns an Inappropriate Message When It Cannot Find the User Specified Database Identifier

When adding an Oracle RAC instance to the topology using the Oracle Application Server Guard add instance command, if Oracle Application Server Guard cannot find the user specified identifier, an inappropriate error message is returned. If the user entered the database name rather than the Oracle instance SID, there is no indication that this is the problem.

Problem

If Oracle Application Server Guard is unable to locate the oratab entry (on UNIX) or the system registry service (on Windows) for the user specified database identifier, the following ASG_SYSTEM-100 message now precedes the existing ASG_DUF-3554 message and both messages will be displayed to the console:

On UNIX systems:

ASG_SYSTEM-100: An Oracle database is identified by its database unique name (db_name)
ASG_DUF-3554: The Oracle home that contains SID <user specified identifier> cannot be found

On Windows systems:

ASG_SYSTEM-100: An Oracle database is identified by its system identifier (SID)
ASG_DUF-3554: The Oracle home that contains SID <user specified identifier> cannot be found

Solution

When you encounter the message shown in the preceding example, be sure you entered the Oracle instance SID, not the database name.
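On UNIX systems, one way to confirm which identifiers Oracle Application Server Guard can discover is to list the entries in the oratab file. The following Python sketch assumes the usual colon-separated oratab layout, in which the identifier is the first field and the Oracle home is the second. The function name and the sample Oracle home path are illustrative.

```python
def oratab_identifiers(oratab_text: str) -> dict[str, str]:
    """Map each identifier in oratab text to its Oracle home."""
    entries = {}
    for line in oratab_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(":")
        if len(parts) >= 2 and parts[0]:
            entries[parts[0]] = parts[1]
    return entries


if __name__ == "__main__":
    # Hypothetical oratab content used only for illustration.
    sample = (
        "# sample oratab\n"
        "orcl1:/u01/app/oracle/product/10.2.0/db_1:N\n"
    )
    known = oratab_identifiers(sample)
    for candidate in ("orcl", "orcl1"):
        status = "found" if candidate in known else "not found"
        print(f"{candidate}: {status}")
```

If the identifier passed to the add instance command is not among the entries, Oracle Application Server Guard cannot locate the Oracle home for it.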

A.1.12 Database Instance on Standby Site Must Be Shut Down Before Issuing an asgctl create standby database Command

You must shut down a standby site database instance if it is running in order for the asgctl create standby database command to succeed.

Problem

If you run the asgctl create standby database command without shutting down the database on the standby site, the following error is returned:

ASG_DGA-12500: Standby database instance "<instance_name>" already exists on host "<hostname>"

Solution

Shut down the database on the standby site if it is up and running before issuing the asgctl create standby database command.

A.1.13 Known Issue with Disaster Recovery Cloning on Windows

For the Windows platform, you must add the directory that contains the jar utility to the PATH when installing a JDK on the standby system.

Problem

If you do not add the directory that contains the jar utility to the PATH when installing a JDK on the standby system, the ASG on the standby system cannot access the jar.exe utility, and you receive the following error while cloning:

standbynode: -->ASG_SYSTEM-100: operable program or batch file.
standbynode: -->ASG_DUF-4040: Error executing the external program or script.
The error code is "1"
standbynode: -->ASG_IAS-15690: Error running the restore script
standbynode: -->ASG_IAS-15698: Error during backup topology operation - copy step
standbynode: -->ASG_DUF-3027: Error while executing Clone Instance at step -
unpack step.

Solution

If you receive this error, add the directory that contains the jar utility to the PATH on the standby system and restart the ASG server.

A.1.14 The asgctl shutdown topology Command Does Not Shut Down an MRCA Database That is Detected To Be of a repCA Type Database

The asgctl shutdown topology command only handles non-database instances.

Problem

In a repCA environment when Oracle Application Server Guard detects an instance and determines it to be a repCA type database, its instance is ignored in a shutdown topology operation. Any repCA type database is considered to be managed outside of Oracle Application Server Guard. Therefore, within an environment where an MRCA database has been added to the topology, the database will not be handled by the asgctl shutdown topology command.

Solution

Shut down any repCA type database by a method other than the asgctl shutdown topology command.

A.1.15 Connecting to an Oracle Application Server Guard Server May Return an Authentication Error

An authentication error occurs when trying to connect to an Oracle Application Server Guard 10.1.2.0.2 or 10.1.2.1 server, even though the correct user name and password were entered.

Problem

A user who connects to an Oracle Application Server Guard server gets an authentication error even though the correct user name and password were entered.

Solution

Put the following flag in the dsa.conf file in the <ORACLE_HOME>/dsa directory and try the operation again:

_realm_override=1

Note:

This DSA configuration file parameter is not documented in the "Oracle Application Server Guard Configuration File Parameters" section of the Oracle Application Server Guard Release Information readme.txt file.
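When the same change must be made on several hosts, the edit can be scripted. The following Python sketch assumes that dsa.conf consists of plain name=value lines, which is an assumption and not something this guide states. The function name ensure_setting and the port entry in the usage example are hypothetical. The function works on the file contents as text and returns the updated text; it does not write to disk.

```python
def ensure_setting(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with exactly one 'key=value' line for key."""
    out, found = [], False
    for line in conf_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name == key:
            if not found:
                out.append(f"{key}={value}")  # replace the first occurrence
                found = True
            continue  # drop any later duplicates
        out.append(line)
    if not found:
        out.append(f"{key}={value}")  # append when the key is absent
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical existing content used only for illustration.
    print(ensure_setting("port=7890\n", "_realm_override", "1"), end="")
```

Back up the original file before replacing its contents.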

A.1.16 Running Instantiate Topology Across Nodes After Executing a Failover Operation Results in an ORA-01665 Error

After running the asgctl failover operation, you must first perform an asgctl create standby database command to create the standby database on the remote host before performing an asgctl instantiate topology operation.

Problem

If you attempt to perform an asgctl instantiate topology operation immediately following an asgctl failover operation, an "ORA-01665: control file is not a standby control file" error message is returned.

Solution

To work around this problem, you must first perform an asgctl create standby database command to create the standby database on the remote host.

A.1.17 Oracle Application Server Guard Is Unable to Shutdown the Database Because More Than One Instance of Oracle RAC is Running

When you are running Oracle Application Server Guard in an Oracle RAC environment, you should have only one Oracle RAC instance running while performing Oracle Application Server Guard operations.

Problem

If you have more than one Oracle RAC instance running while performing Oracle Application Server Guard operations, an error occurs where the primary database complains that it is mounted by more than one instance, which prevents a shutdown. As a result, the following error will be seen:

ASGCTL> create standby database orcl1 on stanb06v3
.
.
.
This operation requires the database to be shutdown. Do you want to
continue? Yes or No 
y
Database must be mounted exclusive
stanb06v1: -->ASG_DUF-4950: An error occurred on host "stanb06v1" with IP
"141.86.22.32" and port "7890"
stanb06v1: -->ASG_DUF-3514: Failed to stop database orcl1.us.oracle.com.
stanb06v1: -->ASG_DGA-13002: Error during Create Physical Standby:
Prepare-primary processing.
stanb06v1: -->ASG_DUF-3027: Error while executing Creating physical standby
database - prepare phase at step - primary processing step.

Solution

Be sure to have only one Oracle RAC instance running while performing Oracle Application Server Guard operations.

A.1.18 Create Standby Fails if Initiated on a Different ASGCTL Shell

The create standby database command fails if initiated by ASG clients from any node other than the source primary node where the database resides.

Problem

Suppose that you run the create standby database command from the production database to the standby database, where prodnode1 is the node name of the primary site database and standbynode1 is the node name of its standby database. The ASGCTL shell must always be invoked on and connected to prodnode1. If you run the ASGCTL shell from standbynode1 and connect to prodnode1, the create standby database command fails.

Solution

Run the create standby command from the same primary (source) node, where the database for the primary site resides.

A.1.19 Resolve Missing Archived Logs

The sync topology command in a RAC-RAC Linux environment returns missing archive logs errors.

Problem

The sync topology command in a RAC-RAC Linux environment fails and returns missing archive logs errors such as the following:

ASG_SYSTEM_-100: Please resolve missing archived logs and try again.

Solution

Ping the standby node using tnsping. If you are unable to ping the standby node, stop and restart the listener for that node and retry the tnsping.

A.1.20 Heartbeat Failure After Failover in Alert Logs

A warning appears in the alert logs of the database after a failover scenario.

Problem

The following warning appears in the alert logs of the database after a failover scenario, where the new primary database fails to tnsping its remote database instance.

Errors in file c:\oracle\product\10.2.0\admin\orcl\udump\orcl1_rfs_1816.trc:
ORA-16009: remote archive log destination must be a STANDBY database
.
Fri Sep 08 09:11:13 2006
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_arc1_496.trc:
ORA-16009: remote archive log destination must be a STANDBY database
.
Fri Sep 08 09:11:13 2006
PING[ARC1]: Heartbeat failed to connect to standby 'orcl1_remote1'. Error is 16009.
Fri Sep 08 09:11:50 2006
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[67]: Assigned to RFS process 628
RFS[67]: Database mount ID mismatch [0x4342404d:0x4341ffb0]
Fri Sep 08 09:11:50 2006
Errors in file c:\oracle\product\10.2.0\admin\orcl\udump\orcl1_rfs_628.trc:
ORA-16009: remote archive log destination must be a STANDBY database
.
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[68]: Assigned to RFS process 2488
RFS[68]: Database mount ID mismatch [0x4342404d:0x4341ffb0]
Fri Sep 08 09:12:05 2006
Errors in file c:\oracle\product\10.2.0\admin\orcl\udump\orcl1_rfs_2488.trc:
ORA-16009: remote archive log destination must be a STANDBY database
.
Fri Sep 08 09:12:14 2006
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_arc1_496.trc:
ORA-16009: remote archive log destination must be a STANDBY database

Solution

To avoid these error messages in the alert logs, null the log_archive_dest_2 parameter using the following commands:

alter system set log_archive_dest_2='SERVICE=null LGWR ASYNC REOPEN=60';
alter system set log_archive_dest_state_2='defer';
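After deferring the destination, you can confirm that the warnings have stopped by summarizing a recent portion of the alert log. The following Python sketch counts ORA- error codes and heartbeat failure lines in log text. The function name is hypothetical, and the matched strings are taken from the excerpt shown in this section.

```python
import re
from collections import Counter

ORA_CODE = re.compile(r"\bORA-(\d{5})\b")
HEARTBEAT_FAILURE = "Heartbeat failed to connect to standby"


def summarize_alert_log(alert_log_text: str) -> tuple[Counter, int]:
    """Return (count per ORA- code, number of heartbeat failure lines)."""
    ora_counts = Counter(ORA_CODE.findall(alert_log_text))
    heartbeat_failures = sum(
        1 for line in alert_log_text.splitlines() if HEARTBEAT_FAILURE in line
    )
    return ora_counts, heartbeat_failures


if __name__ == "__main__":
    excerpt = (
        "ORA-16009: remote archive log destination must be a STANDBY database\n"
        "PING[ARC1]: Heartbeat failed to connect to standby 'orcl1_remote1'. "
        "Error is 16009.\n"
    )
    ora_counts, heartbeat_failures = summarize_alert_log(excerpt)
    print(dict(ora_counts), heartbeat_failures)
```

Run the summary only on log entries written after the change, because earlier entries remain in the alert log.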

A.1.21 Create Standby Database Fails If Database Uses OMF Storage or ASM Storage

The create standby database command fails with ASG_ORACLE-300: ORA-01276 errors with some database storage options.

Problem

The create standby database command fails with ASG_ORACLE-300: ORA-01276 errors if the database storage option uses OMF (Oracle Managed Files) or ASM (Automatic Storage Management).

Solution

Create a new database instance using DBCA on the primary site with alternate storage options before running the create standby database command.

A.1.22 Database Already Exists Errors During Create Standby

Error messages appear when attempting to overwrite an existing database.

Problem

If you run a create standby database command and the database already exists at the target host, you get the following error messages:

Checking whether standby instance already exists
proddnode1: -->ASG_DUF-4950: An error occurred on host "proddnode1" with IP
"a.b.c.d" and port "7891"
standbynode1: -->ASG_DUF-4950: An error occurred on host "standbynode1" with IP
"e.f.g.h" and port "7891"
standbynode1: -->ASG_DGA-12500: Standby database instance "orcl" already exists
on host "standbynode1".
standbynode1: -->ASG_DGA-13001: Error during Create Physical Standby:
Prepare-check standby.
standbynode1: -->ASG_DUF-3027: Error while executing Creating physical standby
database - prepare phase at step - check standby step.

Solution

The create standby database command uses a database at a primary site to create a standby database at a standby site. It assumes that only the Oracle database software has been installed on the standby site peer host in the same Oracle database home directory as the primary database at the primary site; the command fails with the error messages in the preceding example if an actual database exists in the standby site peer host's Oracle database home.

Perform the following steps, then run the create standby database command again to create the standby database at the standby site peer host:

If the database at the standby site peer host is running, shut it down using the SQL*Plus shutdown immediate or shutdown abort command.
Remove references to the database from the standby site peer host as follows:

On Windows:

Run the oradim command to delete the Oracle SID:
```
> oradim -delete -sid <database-sid>
```
On UNIX:

Delete the database entry from the oratab file.

For non-Real Application Clusters databases, the entry has this format:
```
DBSID:oracle_home
```
For Real Application Clusters databases, the entry has this format:
```
DBuniqueName:oracle_home
```
Delete any initialization files in the Oracle database home at the standby site peer host.

A.1.23 Oracle Application Server Guard Add Instance Command Fails When Attempting to Add an Oracle RAC Database to the Topology

When using the add instance command to add an Oracle RAC database instance to the topology, the method of referring to the database is different on Linux systems than on Windows systems.

On Linux systems, the oratab entry is used for discovery of the home. For non-RAC installations the database SID is used in the oratab entry and for RAC installations the database unique name is used in the oratab entry.

On Windows systems, the system registry is used for discovery of the home, and the database SID located in the registry is used. Consequently, on Windows systems when adding an instance to the topology that is an Oracle RAC database, you must use the database SID instead of the database name when referring to the Oracle RAC database instance.

There must be an oratab entry (on Linux) or registry entry (on Windows) with the SID of the primary database instance that ASG attaches to with the asgctl add instance command.

See Section A.1.22 for examples of database SIDs for Windows and UNIX databases and of a database unique name for UNIX systems.

Problem

The Oracle RAC database installation on Windows does not store the Oracle RAC database name or the global database name anywhere in the registry. The workaround on Windows systems is to always use the database SID of the RAC database in the asgctl add instance command, and then proceed with the rest of the OracleAS Disaster Recovery cycle of operations, such as create standby database, instantiate topology, sync topology, and switchover topology. For example:

asgctl> add instance <database SID of RAC database on Windows> on <virtualhost>

asgctl> add instance orcl1 on asinfra.us.oracle.com

Solution

Use the database SID of an Oracle RAC database on Windows in asgctl commands.

A.1.24 A Create Standby Database Operation Fails with an ASG_DGA-12500 Error Message on Windows

An error occurs when Oracle Application Server Guard issues a create standby database command on Windows and the target standby database environment has not been cleaned up.

Problem

When Oracle Application Server Guard issues a create standby database command on Windows and the target standby database environment has not been cleaned up, the following error occurs:

ASG_DGA-12500: Standby database instance "db25" already exists on host <hostname>

The target environment might not be clean because a previous attempt to set up the standby database failed for a system reason, or because you are trying to reestablish an existing standby database.

Solution

Clean up the environment using the following command.

oradim -delete -sid db25

After cleaning up the environment, the asgctl create standby database command can be reissued.

A.1.25 Use Fully Qualified Instance Names to Ensure Uniqueness

When you add an instance to an OracleAS Disaster Recovery topology, the instance name must be unique within the topology. This condition is validated by Oracle Application Server Guard when the instance is being added. The instance name can be fully qualified with the host on which it is deployed to ensure uniqueness.

Problem

If the instance names of the Oracle Application Server tiers are not unique across all the Oracle homes on all the nodes in the primary site, an add instance command for a second instance that has the same name as an already added instance returns the following error:

ASGCTL> add instance ohs on mt1 
ASGCTL> add instance ohs on mt2
host2: -->ASG_IAS-15782: Error: Instance "ohs" already exists in the topology
ASGCTL>

Solution

Use fully qualified instance names to ensure that the instance names are unique within the topology, for example:

ASGCTL> add instance ohs.mt1.mycompany.com on mt1
ASGCTL> add instance ohs.mt2.mycompany.com on mt2
ASGCTL>

A.1.26 Misleading Message on JSSO Page

A misleading message appears under the Instances and Properties tab of the Java SSO Configuration page.

Problem

When you use Application Server Control to configure Java Single Sign-On (Java SSO), the following message appears at the top of the Java SSO Configuration page if no Java SSO applications are running in the cluster:

There are no active Java SSO applications in the cluster. At least one Java SSO application (javasso) must be running before you can configure Java SSO.

However, the following additional and misleading message appears under the Instances and Properties tab on the Java SSO Configuration page:

Java SSO is configured for this cluster.

Solution

When an error message appears at the top of the Java SSO Configuration page, ignore the message that appears under the Instances and Properties tab. In fact, Java SSO cannot be configured until at least one instance of the Java SSO application is running in the cluster.

For more information, see "OC4J Java Single Sign-On" in the Oracle Application Server Containers for J2EE Security Guide.

A.1.27 Instantiate Topology Fails if TNS Alias Includes Domain

Instantiate topology fails with error messages if the TNS alias entries for the standby database include the domain.

Problem

When the TNS alias entry for the standby database includes the domain, as in the following example, the instantiate topology operation returns an ORA-12514 error:

ORCL.ORACLE.COM =
 (DESCRIPTION = 
 (ADDRESS = (PROTOCOL = TCP)(HOST = standbynode.oracle.com)(PORT = 1521))
 (CONNECT_DATA = 
 (SERVER = DEDICATED)
 (SERVICE_NAME = ORCL.ORACLE.COM)
 ) 
 ) 

ASG_ORACLE-300: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Solution

Add the correct domain to the NAMES.DEFAULT_DOMAIN parameter in sqlnet.ora on the standby database before running the instantiate command.
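
For the alias shown in the problem description, the sqlnet.ora entry might look like the following fragment. The value oracle.com is taken from the ORCL.ORACLE.COM alias above and is only an illustration; substitute the domain used by your own TNS alias entries.

```
NAMES.DEFAULT_DOMAIN = oracle.com
```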

A.1.28 ORA-32001 Errors during Create Standby Database

In Windows operating systems, errors are returned after executing the create standby database command.

Problem

The create standby database command creates the SPFILE in the ORACLE_HOME/dbs directory on the standby site instead of in ORACLE_HOME/database. As a result, whenever the database is started on the standby site, it does not find the SPFILE in ORACLE_HOME/dbs and uses a PFILE instead. When the create standby database command is run again from the standby site (for role reversal), it fails because the database is not using an SPFILE.

standbynode1: -->ASG_DUF-4950: An error occurred on host "stada26" with IP "140.87.5.102" and port "7892"
standbynode1: -->ASG_ORACLE-300: ORA-32001: write to SPFILE requested but no SPFILE specified at startup
standbynode1: -->ASG_DUF-3700: Failed in SQL*Plus executing SQL statement: alter system set db_file_name_convert='C:\WORK\ORADATA\ASDB01','C:\WORK\ORADATA\ASDB01' SCOPE=SPFILE /* ASG_DGA */;.
standbynode1: -->ASG_DGA-13010: Error during Create Physical Standby: Finish-configure primary.
standbynode1: -->ASG_DUF-3027: Error while executing Creating physical standby database - finish phase at step - finish step.
ASG_ORACLE-300: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Solution

On Windows only, after executing the create standby database command, copy the SPFILE from ORACLE_HOME/dbs to ORACLE_HOME/database on the standby database site.

A.1.29 ORA-09925 Errors when Bringing Up RAC Database Manually after Switchover

ORA-09925 errors appear when bringing up a RAC database manually after a switchover operation.

Problem

The following errors appear after an ASG switchover operation when you bring up some of the RAC database instances manually:

SQL> startup;
ORA-09925: Unable to create audit trail file
Linux Error: 2: No such file or directory
Additional information: 9925

Solution

Make sure the directory pointed at by the audit_file_dest parameter in your init file exists.

For example:

mkdir <ORACLE_HOME>/admin/<dbname>/admin

When the asgctl create standby database command is used to create a RAC database at the standby site, the audit_file_dest init.ora parameter will be defined at the standby site database if it was defined for the production site database.
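
The directory check can also be done programmatically before you start the instance. The following Java code is an illustrative sketch only: the class name AuditDirCheck and the method ensureDirectory are hypothetical, and the path that you pass must be the actual value of the audit_file_dest parameter from your initialization file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of any Oracle product.
class AuditDirCheck {
    /**
     * Creates the directory named by audit_file_dest, including any missing
     * parent directories. Does nothing if the directory already exists.
     */
    public static Path ensureDirectory(String auditFileDest) {
        try {
            return Files.createDirectories(Paths.get(auditFileDest));
        } catch (IOException e) {
            // For example, the path exists but is not a directory,
            // or the operating system user lacks permission.
            throw new UncheckedIOException(e);
        }
    }
}
```

Run the check as the operating system user that owns the Oracle database home so that the directory is created with the correct ownership.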

A.1.30 Recommended Method of Patching an Oracle Application Server Disaster Recovery Site

This section describes how to apply an Oracle Application Server patch set to upgrade the Oracle homes that participate in an Oracle Application Server Disaster Recovery site.

Problem

You are unsure how to apply an Oracle Application Server patch set to upgrade the Oracle homes in your Oracle Application Server Disaster Recovery site.

Solution

The list in this section describes the steps for applying an Oracle Application Server patch set to upgrade the Oracle homes that participate in an Oracle Application Server Disaster Recovery site.

Note:

It is also possible to upgrade or update the version of Oracle Application Server Guard (OracleAS Guard) that is installed in the existing Oracle home for an Application Server instance. This OracleAS Guard-only upgrade only upgrades the OracleAS Guard (ASG) utility; it does not affect the runtime operation of the other components in the Application Server home. See Section 1.1.2, "Using Oracle Application Server Guard in an OracleAS Disaster Recovery Topology" for more information about how to upgrade OracleAS Guard in an Oracle Application Server home.

Use the following procedure to upgrade Oracle Application Server patch versions:

Perform a backup of the production site to ensure that the starting state is secured.
Perform an ASG sync topology operation using a mandatory policy to synchronize all the instances in the topology. This ensures that prior to patching the configuration is updated at the standby site.
Perform an ASG failover operation, but do not perform a DNS switchover for the topology. This breaks the production/standby relationship of the topology and forms two sites. Starting with this step, the backup operation is the last resort recovery of the site prior to the upgrade procedure.
Perform the upgrade at the former standby site. The upgrade of the former standby site is a test that the upgrade will be smooth and successful. Because a DNS switchover was not performed in the previous step, access to the site is still maintained at the former production site. Your recovery point is effectively the point of the backup.
If problems occur in the previous step, you will remedy them when upgrading the former production site.
When the standby site upgrade is complete, upgrade the former production site.
Perform an ASG discover topology operation at the former production site.
Perform an ASG instantiate topology operation at the production site to establish the relationship between the production and standby sites, mirror the configuration, and synchronize the standby site with the production site.
The upgrade is now complete. Your Disaster Recovery topology is ready to resume processing.

A.2 Troubleshooting Middle-Tier Components

This section describes common problems and solutions for middle-tier components in high availability configurations. It contains the following topics:

Section A.2.1, "Using Multiple NICs with OracleAS Cluster (OC4J-EJB)"
Section A.2.2, "Performance Is Slow When Using the "opmn:" URL Prefix"

A.2.1 Using Multiple NICs with OracleAS Cluster (OC4J-EJB)

Problem

If you are running OracleAS Cluster (OC4J-EJB) on computers with two NICs (network interface cards) and you are using one NIC for connecting to the network and the second NIC for connecting to the other node in the cluster, multicast messages may not be sent or received correctly. This means that session information does not get replicated between the nodes in the cluster.

Figure A-1 OracleAS Cluster (OC4J-EJB) Running on Computers with Two NICs

Solution

You must start up the OC4J instances by setting the oc4j.multicast.bindInterface parameter to the name or IP address of the other NIC on the node.

For example, using the values shown in Figure A-1, you would start up the OC4J instances with these parameters:

On node 1, configure the OC4J instance to start up with this parameter:

-Doc4j.multicast.bindInterface=123.45.67.21

On node 2, configure the OC4J instance to start up with this parameter:

-Doc4j.multicast.bindInterface=123.45.67.22

You specify this parameter and its value in the "Java Options" field in the "Command Line Options" section in the Server Properties page in the Application Server Control Console (Figure A-2).

Figure A-2 Server Properties Page in Application Server Control Console
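
To find the value to use for oc4j.multicast.bindInterface on each node, you can list the network interfaces of the node and their IPv4 addresses. The following sketch uses only the standard java.net API; the class name ListNics is hypothetical, and deciding which of the listed addresses belongs to the NIC that connects to the other cluster node remains up to you.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper for illustration; not part of any Oracle product.
class ListNics {
    /** Returns one "interfaceName address" string per IPv4 address on this host. */
    public static List<String> describeInterfaces() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces reported
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    if (addr instanceof Inet4Address) {
                        result.add(nic.getName() + " " + addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            // The interfaces could not be queried; report the problem and return what we have.
            System.err.println("Could not list network interfaces: " + e.getMessage());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describeInterfaces()) {
            System.out.println(line);
        }
    }
}
```

Run the class on each node and use the address of the cluster-facing interface as the value of the parameter on that node.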

A.2.2 Performance Is Slow When Using the "opmn:" URL Prefix

Problem

If you have applications that use the "opmn:" prefix in their Context.PROVIDER_URL property, you may experience slow performance when the InitialContext is created.

The following sample code sets the PROVIDER_URL to a URL with an opmn: prefix.

Hashtable env = new Hashtable();
env.put(Context.PROVIDER_URL, "opmn:ormi://hostname:port/cmpapp");
// ... set other properties ...
Context context = new InitialContext(env);

If the host specified in PROVIDER_URL is down, the application has to make a network connection to OPMN to locate another host. Going through the network to OPMN takes time.

Solution

To avoid making another network connection to OPMN to get another host, set the oracle.j2ee.naming.cache.timeout property so that the values returned from OPMN the first time are cached, and the application can use the values in the cache.

The following sample code sets the oracle.j2ee.naming.cache.timeout property.

Hashtable env = new Hashtable();
env.put(Context.PROVIDER_URL, "opmn:ormi://hostname:port/cmpapp");

// set the cache value
env.put("oracle.j2ee.naming.cache.timeout", "30");

// ... set other properties ...

Context context = new InitialContext(env);

Table A-1 shows valid values for the oracle.j2ee.naming.cache.timeout property:

Table A-1 Values for the oracle.j2ee.naming.cache.timeout Property

Value	Meaning
`-1`	No caching.
`0`	Cache only once, without any refreshing.
Greater than `0`	Number of seconds after which the cache can be refreshed. Note that this is not automatic; the refresh occurs only when you invoke `new InitialContext()` again. If the property is not set, the default value is 60.

With the property set, you will still see some delay on the first "new InitialContext()" call, but subsequent calls should be faster because they are retrieving data from the cache instead of making a network connection to OPMN.
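
The values in Table A-1 can be summarized as a small decision rule. The following Java code is a model of the documented behavior for illustration, not OC4J code: the class NamingCachePolicy and its method are hypothetical, and treating cached values as refreshable once their age is greater than or equal to the timeout is an assumption.

```java
// Hypothetical model of Table A-1 for illustration; not OC4J code.
class NamingCachePolicy {
    /**
     * Returns true if a new InitialContext() call would need to contact OPMN.
     *
     * @param timeoutSeconds value of oracle.j2ee.naming.cache.timeout
     * @param cached         whether values from OPMN are already cached
     * @param ageSeconds     seconds since the cache was last filled
     */
    public static boolean contactsOpmn(int timeoutSeconds, boolean cached, long ageSeconds) {
        if (timeoutSeconds < 0) {
            return true;  // -1: no caching, so every call goes to OPMN
        }
        if (!cached) {
            return true;  // the first call always has to fill the cache
        }
        if (timeoutSeconds == 0) {
            return false; // 0: cache once and never refresh
        }
        // Greater than 0: the cache can be refreshed after the timeout (assumed inclusive).
        return ageSeconds >= timeoutSeconds;
    }
}
```

For example, with a value of 30, a call made 10 seconds after the cache was filled uses the cache, and a call made 30 or more seconds afterward contacts OPMN again.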

Note that for optimal performance, you should also set Dedicated.Connection to either YES or DEFAULT, and set Dedicated.RMIcontext to FALSE.

A.3 Need More Help?

In case the information in the previous section is not sufficient, you can find more solutions on Oracle MetaLink, https://metalink.oracle.com. If you do not find a solution for your problem, log a service request.