A Troubleshooting High Availability

This appendix describes common problems that you might encounter when deploying and managing Oracle Application Server in high availability configurations, and explains how to solve them. It contains the following topics:

Section A.1, "Troubleshooting OracleAS Cold Failover Cluster Configurations"
Section A.2, "Troubleshooting OracleAS Cluster (Identity Management) Configurations"
Section A.3, "Troubleshooting OracleAS Disaster Recovery Configurations"
Section A.4, "Troubleshooting Middle-Tier Components"
Section A.5, "Troubleshooting Backup and Recovery"
Section A.6, "Troubleshooting Real Application Clusters"
Section A.7, "Need More Help?"

A.1 Troubleshooting OracleAS Cold Failover Cluster Configurations

This section describes common problems and solutions in OracleAS Cold Failover Cluster configurations. It contains the following topics:

Section A.1.1, "OracleAS Web Cache Does Not Fail Over"
Section A.1.2, "Unable to Perform Online Database Backup and Restore in OracleAS Cold Failover Cluster Environment"
Section A.1.3, "Cannot Connect to Database for Restoration (Windows)"

A.1.1 OracleAS Web Cache Does Not Fail Over

Problem

OracleAS Web Cache does not fail over in an OracleAS Cold Failover Cluster environment (that is, it does not start up on the standby node). It writes the following error in the log file:

[26/Apr/2005:14:36:08 -0700] [error 13079] [ecid: -] No matching CACHE
element found in webcache.xml for current hostname (hostname) and
ORACLE_HOME (/path/to/oracle/home)

Solution

You need to perform these steps for OracleAS Web Cache to fail over in an OracleAS Cold Failover Cluster environment:

Create a two-node OracleAS Web Cache cluster using Application Server Control Console. For the host name, use the physical hostnames of the nodes in the OracleAS Cold Failover Cluster.
Keep both of these cache entries (the CACHE element in webcache.xml) in sync, except for the host name.

For details on OracleAS Web Cache clusters, see chapter 10, "Configuring Cache Clusters", in the Oracle Application Server Web Cache Administrator's Guide.

A.1.2 Unable to Perform Online Database Backup and Restore in OracleAS Cold Failover Cluster Environment

Issues with online database backup and restore are noted here. This information pertains to the OracleAS Cold Failover Cluster environment.

Problem

Online recovery of the Infrastructure database fails because of resource dependencies and because the cluster administrator attempts to bring the database down and back up during the recovery phase of the Backup and Recovery Tool.

Solution 1

To perform a clean recovery, use the following steps:

Bring all resources offline using the cluster administrator (for Windows, use Oracle Fail Safe).
Perform a normal shutdown of the Infrastructure database.
Start only the database service using the following command:

net start OracleService<SID>

Run the Backup and Recovery Tool to perform the recovery of the database.

Solution 2

For Windows, the following steps can be used to perform a recovery:

In Oracle Fail Safe, under "Cluster Resources", select "ASDB(DB Resource)" in the "Database" tab.
For "Database Polling", select "Disabled" from the drop down list.
Using the Backup and Recovery Tool, perform an online restore of the Infrastructure database.

The database is not accessible for a brief period while the Backup and Recovery Tool stops and starts the database. Once the database starts up, it can be accessed by middle-tier and Infrastructure components.

A.1.3 Cannot Connect to Database for Restoration (Windows)

You are unable to connect to an idle OracleAS Metadata Repository database to restore it after it is shut down using Microsoft Cluster Administrator.

Problem

When you stop the OracleAS Metadata Repository database using Microsoft Cluster Administrator, Microsoft Cluster Administrator performs the strictest and fastest abort to shut down the database service. After the shutdown, you are unable to connect to the database.

The following steps illustrate the problem:

Access an OracleAS Metadata Repository that is used for testing.
Corrupt a database file (note: do not modify the ts$ table).
Issue a SQL query to ensure that the database is corrupted.
Using Microsoft Cluster Administrator, verify that the database is online.
Using Oracle Fail Safe Manager, disable database polling.
Using Microsoft Cluster Administrator, take the database offline. This also takes OPMN and Application Server Control Console offline as they are dependencies of the database.
Try connecting as sysdba. The connection should fail.

Solution

Use the Oracle Fail Safe Manager to shut down the database. To do so:

In the Oracle Fail Safe Manager, right-click the "ASDB" resource (default if not changed), and select "Immediate".
Start the database service using Windows Service Manager.
Connect to the database as sysdba. The connection should be successful.

A.2 Troubleshooting OracleAS Cluster (Identity Management) Configurations

This section describes common problems and solutions in OracleAS Cluster (Identity Management) configurations. It contains the following topics:

Section A.2.1, "Logging into OracleAS Single Sign-On Takes a Long Time"
Section A.2.2, "Oracle Internet Directory Does Not Start Up on One of the Nodes"
Section A.2.3, "Unable to Connect to Oracle Internet Directory, and Oracle Internet Directory Cannot Be Restarted"
Section A.2.4, "Cluster Configuration Assistant Fails During Installation"
Section A.2.5, "Oracle Ultra Search Configuration Assistant is Unable to Connect to Oracle Internet Directory During High Availability Infrastructure Installation"
Section A.2.6, "odisrv Process Does Not Fail Over After "opmnctl stopall""
Section A.2.7, "Unpredictable Behavior from OracleAS Cluster (Identity Management) Configuration When System Time on All Nodes Is Not Synchronized"
Section A.2.8, "Wrong Name Specified for Load Balancer"

Problems and solutions related to multimaster replication and other Oracle Internet Directory features are documented in the troubleshooting section of Oracle Internet Directory Administrator's Guide.

A.2.1 Logging into OracleAS Single Sign-On Takes a Long Time

Problem

Logging into OracleAS Single Sign-On might take a long time if you are running OracleAS Single Sign-On and Oracle Internet Directory on opposite sides of a firewall (OracleAS Single Sign-On is running outside the firewall and Oracle Internet Directory inside the firewall) and if the firewall is configured to drop idle connections or recycle connections after the configured timeout period has elapsed.

Solution

Set the timeout on OracleAS Single Sign-On connections to a value smaller than the firewall and load balancer timeout values. The OracleAS Single Sign-On server will remove connections that are idle for longer than the specified value.

You specify this value (in minutes) using the connectionIdleTimeout parameter in the ORACLE_HOME/sso/conf/policy.properties file. For example, the following line sets the timeout value to 20 minutes. The OracleAS Single Sign-On server will remove connections that are idle for longer than 20 minutes.
```
connectionIdleTimeout = 20
```
Restart the OC4J server (OC4J_SECURITY) that is running the OracleAS Single Sign-On server for the new value to take effect.
Set the timeout for database connections in the SQLNET.EXPIRE_TIME parameter in the ORACLE_HOME/network/admin/sqlnet.ora file. Set this parameter, too, to a value smaller than the firewall and load balancer timeout values.

This parameter specifies how often the database server sends a probe packet to the client (which is the OracleAS Single Sign-On server). This periodic activity by the probe packet enables the OracleAS Single Sign-On server-to-database connections to stay active.

The value is specified in minutes. In the following example, the database server sends the probe packet every 20 minutes to the client.
```
SQLNET.EXPIRE_TIME = 20
```
Restart the database for the new value to take effect.

Explanation: The firewall or load balancer might drop connections to Oracle Internet Directory and the database if the connections are idle for a certain time. When the firewall or load balancer drops a connection, it might not send a TCP close notification to the OracleAS Single Sign-On server. The OracleAS Single Sign-On server is then unaware that the connection is no longer valid and tries to use it to perform Oracle Internet Directory or database operations. When the OracleAS Single Sign-On server does not get a response, it tries the next connection. Eventually it tries all the connections in the pool before making fresh connections to Oracle Internet Directory or to the database.

By setting the timeout on the OracleAS Single Sign-On server and on the database to a value smaller than the timeout on the firewall or load balancer, you ensure that the connections are valid.
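The relationship between these timeouts can be checked mechanically. The following Java sketch is illustrative only: the class IdleTimeoutCheck, its method names, and the sample firewall timeout of 30 minutes are assumptions made for this example and are not part of Oracle Application Server. It reads connectionIdleTimeout from the text of a policy.properties file and reports whether the value is smaller than the idle timeout of the firewall or load balancer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not an Oracle API: checks that the OracleAS Single
// Sign-On idle timeout is smaller than the firewall or load balancer timeout.
final class IdleTimeoutCheck {

    // Reads connectionIdleTimeout (in minutes) from policy.properties text.
    static int readIdleTimeoutMinutes(String policyProperties) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(policyProperties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("connectionIdleTimeout");
        if (value == null) {
            throw new IllegalArgumentException("connectionIdleTimeout is not set");
        }
        return Integer.parseInt(value.trim());
    }

    // True when the server drops idle connections before the network device does.
    static boolean isSafe(int serverTimeoutMinutes, int deviceTimeoutMinutes) {
        return serverTimeoutMinutes < deviceTimeoutMinutes;
    }

    public static void main(String[] args) {
        int sso = readIdleTimeoutMinutes("connectionIdleTimeout = 20\n");
        int firewall = 30; // assumed firewall idle timeout, in minutes
        System.out.println("SSO timeout safe: " + isSafe(sso, firewall));
    }
}
```

The same comparison applies to the SQLNET.EXPIRE_TIME value: pass it as the first argument to isSafe.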

A.2.2 Oracle Internet Directory Does Not Start Up on One of the Nodes

Problem

If the time difference between the nodes in the OracleAS Cluster (Identity Management) is greater than 250 seconds, the Oracle Internet Directory Monitor (oidmon) will stop Oracle Internet Directory on the node that is behind. For example, if the time on node A is ahead of node B's by more than 250 seconds, then oidmon will stop Oracle Internet Directory processes on node B. This is because the oidmon process on each node updates the database every 10 seconds to tell the other nodes that it is running. If a node does not respond for 250 seconds, then the other nodes treat that node as a failed node.

Solution

Synchronize the time on all nodes to within 250 seconds of each other.

A.2.3 Unable to Connect to Oracle Internet Directory, and Oracle Internet Directory Cannot Be Restarted

Problem

This issue applies only to Windows 2000 platforms. This issue has two symptoms:

Symptom #1: If you have configured your load balancer to monitor the Oracle Internet Directory ports using TCP port monitoring, you might see the "maximum number of connections reached" error in the Oracle Internet Directory log file. This means that clients are unable to connect to Oracle Internet Directory.

Symptom #2: If Oracle Internet Directory terminates, you are not able to restart it. When you try to restart it, you get a message that Oracle Internet Directory is unable to access its ports because the System Idle Process is already using them. Oracle Internet Directory needs exclusive access to its ports.

Solution

This problem is caused by an application (in this case, the load balancer) that performs TCP port monitoring on the Oracle Internet Directory ports. In TCP port monitoring, the application opens and closes connections to the Oracle Internet Directory ports. In Windows 2000, the connection is not closed properly; this is why you reach the maximum number of connections.

The workaround is not to use TCP port monitoring for the Oracle Internet Directory ports. Instead, use LDAP or HTTP port monitoring.

A.2.4 Cluster Configuration Assistant Fails During Installation

Problems encountered during the clustering of components using the Cluster Configuration Assistant are addressed here.

Problem

During the installation of distributed Oracle Identity Management configurations, the OracleAS Single Sign-On and Oracle Delegated Administration Services components are installed on two nodes of their own, separate from the other Oracle Identity Management components. The Cluster Configuration Assistant may attempt to cluster the two resulting OracleAS Single Sign-On/Oracle Delegated Administration Services instances together. However, the error message "Instances containing disabled components cannot be added to a cluster" may appear. This message appears because Enterprise Manager cannot cluster instances with disabled components.

Solution

If the Cluster Configuration Assistant fails, you can cluster the instance after installation. In this case, to cluster the instance, you must use the "dcmctl joincluster" command instead of Application Server Control Console. You cannot use Application Server Control Console in this case because it cannot cluster instances that contain disabled components. In this case, the "home" OC4J instance is disabled.

A.2.5 Oracle Ultra Search Configuration Assistant is Unable to Connect to Oracle Internet Directory During High Availability Infrastructure Installation

During high availability Infrastructure installation, the Oracle Ultra Search Configuration Assistant cannot connect to an Oracle Internet Directory instance at port 3060 of the virtual hostname provided in the virtual hostname addressing screen.

Problem

A common mistake can be made when virtual hostname addressing is used during Infrastructure installation. The load balancer virtual server name is entered, and the load balancer is set up correctly to assume this name. However, the Infrastructure node is not set up correctly to resolve this name. Thus, when the Oracle Ultra Search Configuration Assistant on the Infrastructure node tries to connect to the load balancer virtual server name, the Configuration Assistant cannot find the load balancer.

Solution

The solution is to set up name resolution correctly on the Infrastructure machine for the load balancer virtual server name. This procedure is platform dependent. Check your operating system manual for an accurate procedure. In Unix, this usually involves editing the /etc/hosts file and making sure this file is used for name resolution by editing the /etc/nsswitch.conf file. In Windows, this usually involves editing the C:\WINDOWS\system32\drivers\etc\hosts file.
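Before rerunning the Configuration Assistant, you can confirm from the Infrastructure machine that the name now resolves. The following Java sketch is one way to do this; the class NameResolutionCheck and its resolve method are hypothetical names introduced for this example and are not an Oracle tool. It relies on the standard java.net.InetAddress lookup, which uses the name resolution configured for the operating system.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not an Oracle tool: reports whether this machine can
// resolve a given host name, such as the load balancer virtual server name.
final class NameResolutionCheck {

    // Returns the resolved IP address, or null if the name cannot be resolved.
    static String resolve(String hostName) {
        try {
            return InetAddress.getByName(hostName).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Pass the virtual server name as the first argument; "localhost"
        // is only a stand-in so that the sketch runs as written.
        String name = args.length > 0 ? args[0] : "localhost";
        String address = resolve(name);
        System.out.println(address == null
                ? name + " does not resolve on this machine"
                : name + " resolves to " + address);
    }
}
```

Run the class on the Infrastructure node itself, because the result depends on that machine's own hosts file and resolver settings.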

A.2.6 odisrv Process Does Not Fail Over After "opmnctl stopall"

Issues with odisrv process failover between nodes are documented here.

Problem

In any OracleAS Cluster (Identity Management) solution, when opmnctl stopall is executed to stop all OPMN-managed processes on that node, odisrv is not started automatically on the second node because opmnctl stopall is a normal administrative shutdown, not an actual node failure. In a true node failure, odisrv is started on the remaining node upon death detection of the original odisrv process.

Solution

If planned maintenance is required for an OracleAS Cluster (Identity Management), use the oidctl command to explicitly stop and start odisrv.

On the node where odisrv is running, use the following command to stop it:

ORACLE_HOME/bin/oidctl connect=<dbConnect> server=odisrv inst=1 stop

On the remaining active node, start odisrv using the following command:

ORACLE_HOME/bin/oidctl connect=<dbConnect> server=odisrv instance=1
     flags="host=OIDhost port=OIDport" start

A.2.7 Unpredictable Behavior from OracleAS Cluster (Identity Management) Configuration When System Time on All Nodes Is Not Synchronized

OracleAS Cluster (Identity Management) nodes can behave unpredictably if the system time on all nodes is not synchronized.

Problem

In an OracleAS Cluster (Identity Management) configuration, the Oracle Internet Directory Monitor (OIDMON) on each node updates the directory database every 10 seconds with metadata. At the same time, it queries the database to verify that all other directory servers are running.

If an OIDMON does not update the database for 250 seconds, the other nodes assume that that node has failed. A node whose system clock differs from the clocks of the other nodes by more than 250 seconds can erroneously appear to have such a delay. When this happens, OIDMON on one of the other nodes initiates failover operations, which include locally bringing up the processes that were running on the failed node. The node where these processes are started continues processing the operations that were underway on the failed node.

As an example, assume an OracleAS Cluster (Identity Management) configuration with nodes A and B. The system clock in node B is 300 seconds behind node A's clock. Node B updates its metadata in the directory database, which includes the system clock value. Node A queries the database for active Oracle Internet Directory servers and determines that node B has failed because its time value is 300 seconds behind. Node A then initiates failover operations by locally starting all Oracle Internet Directory server processes that were running on node B.

Solution

The system clock value on all nodes in the OracleAS Cluster (Identity Management) configuration should be synchronized using Greenwich Mean Time so that there is a discrepancy of no more than 250 seconds between them.

Refer to the chapters on Rack-Mounted directory server configurations in the Oracle Internet Directory Administrator's Guide.
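The 250-second rule can be expressed as a simple comparison. The following Java sketch is illustrative only: the class ClockSkewCheck and its method are hypothetical, and how you collect each node's clock reading (for example, as epoch seconds) is outside the scope of the example. It flags every node whose clock lags the most advanced clock in the cluster by more than 250 seconds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Oracle Internet Directory: flags nodes whose
// clock lags the most advanced clock in the cluster by more than 250 seconds.
final class ClockSkewCheck {

    static final long MAX_SKEW_SECONDS = 250;

    // clocks maps a node name to that node's current time in epoch seconds.
    static List<String> laggingNodes(Map<String, Long> clocks) {
        List<String> lagging = new ArrayList<>();
        if (clocks.isEmpty()) {
            return lagging;
        }
        long latest = Collections.max(clocks.values());
        for (Map.Entry<String, Long> entry : clocks.entrySet()) {
            if (latest - entry.getValue() > MAX_SKEW_SECONDS) {
                lagging.add(entry.getKey());
            }
        }
        return lagging;
    }

    public static void main(String[] args) {
        Map<String, Long> clocks = new TreeMap<>();
        clocks.put("A", 1_000_300L); // sample values only
        clocks.put("B", 1_000_000L); // 300 seconds behind node A
        System.out.println("Nodes at risk of being treated as failed: "
                + laggingNodes(clocks));
    }
}
```

With the sample values, which mirror the nodes A and B scenario in this section, node B is flagged. A skew of exactly 250 seconds is not flagged, which matches the "no more than 250 seconds" requirement.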

A.2.8 Wrong Name Specified for Load Balancer

If a load balancer is deployed in front of Oracle Application Server instances that are clustered together, configuration files of the instances may not have the correct load balancer virtual server name specified.

Problem

For a cluster of Oracle Application Server instances front-ended by a load balancer, a redirect back to the cluster may not contain the load balancer virtual server name. Dynamic pages created by a servlet or JSP may also not use the correct load balancer virtual server name. In both cases, the local hostname is most likely used instead.

To correctly specify the load balancer virtual server name to be used, modifications have to be made to the httpd.conf and default-web-site.xml files for each instance.

Solution

For each Oracle Application Server instance, perform the following steps:

Perform the following steps for Oracle HTTP Server:
1. Stop the Oracle HTTP Server using the following command:
  
  opmnctl stopproc ias_component=HTTP_Server
2. In Oracle HTTP Server's httpd.conf file, change the value for the directive ServerName to the virtual server name of your load balancer. For example, if you use "localhost", change it to the virtual server name of your load balancer.
3. In the same httpd.conf file, change the value of the Port directive to the port number your load balancer is configured with for incoming requests. For example, if the port number specified is 7777, change it to port 80 if that is configured on your load balancer.
4. Execute the following command to update the DCM repository with the above changes:
  
  dcmctl updateConfig -ct ohs
5. Start the Oracle HTTP Server using the following command:
  
  opmnctl startproc ias_component=HTTP_Server
Perform the following steps for OC4J:
1. Stop the OC4J processes for each OracleAS instance using the following command:
  
  opmnctl stopproc ias_component=OC4J
2. Edit the file default-web-site.xml to include the following line:
  
  <frontend host="load_balancer_name" port="port_number" />
  
  Replace "load_balancer_name" with the virtual server name of your load balancer and "port_number" with the port number that is configured for incoming requests in your load balancer (these values are similar to those you entered for httpd.conf above).
3. Execute the following command to update the DCM repository with the changes you made in the default-web-site.xml file:
  
  dcmctl updateconfig -ct oc4j
4. Start the OC4J instances using the following command:
  
  opmnctl startproc ias_component=OC4J

A.3 Troubleshooting OracleAS Disaster Recovery Configurations

This section describes common problems and solutions in OracleAS Disaster Recovery configurations. It contains the following topics:

Section A.3.1, "Standby Site Not Synchronized"
Section A.3.2, "Failure to Bring Up Standby Instances After Failover or Switchover"
Section A.3.3, "Unable to Start Standalone OracleAS Web Cache Installations at the Standby Site"
Section A.3.4, "Standby Site Middle-tier Installation Uses Wrong Hostname"
Section A.3.5, "Failure of Farm Verification Operation with Standby Farm"
Section A.3.6, "Sync Farm Operation Returns Error Message"

A.3.1 Standby Site Not Synchronized

In the OracleAS Disaster Recovery standby site, you may find that the site's OracleAS Metadata Repository is not synchronized with the OracleAS Metadata Repository in the primary site.

Problem

The OracleAS Disaster Recovery solution requires manual configuration and shipping of data files from the primary site to the standby site. Also, the data files (archived database log files) are not applied automatically in the standby site, that is, OracleAS Disaster Recovery does not use managed recovery in Oracle Data Guard.

Solution

The archive log files have to be applied manually. The steps to perform this task are found in Chapter 13, "OracleAS Disaster Recovery".

A.3.2 Failure to Bring Up Standby Instances After Failover or Switchover

Standby instances are not started after a failover or switchover operation.

Problem

IP addresses are used in instance configuration. OracleAS Disaster Recovery setup does not require identical IP addresses in peer instances between the production and standby site. OracleAS Disaster Recovery synchronization does not reconcile IP address differences between the production and standby sites. Thus, if you use explicit IP address xxx.xx.xxx.xx in your configuration, the standby configuration after synchronization will not work.

Solution

Avoid using explicit IP addresses. For example, in OracleAS Web Cache and Oracle HTTP Server configurations, use ANY or host names instead of IP addresses as listening addresses.
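To find explicit IP addresses that are already present in a configuration file, you can scan the file text for dotted-quad literals. The following Java sketch shows the idea; the class ExplicitIpFinder is a hypothetical name introduced for this example, and the sample directive lines use a documentation placeholder address. The pattern does not validate octet ranges, so it can also match other dotted numbers, such as four-part version strings, and its results need to be reviewed manually.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Oracle tool: lists dotted-quad IPv4 literals
// found in configuration text so that they can be replaced with host names.
final class ExplicitIpFinder {

    // Four groups of 1 to 3 digits separated by dots; octet ranges are not validated.
    private static final Pattern IPV4 =
            Pattern.compile("\\b\\d{1,3}(?:\\.\\d{1,3}){3}\\b");

    static List<String> find(String configText) {
        List<String> found = new ArrayList<>();
        Matcher matcher = IPV4.matcher(configText);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample directive lines; the address is a documentation placeholder.
        String sample = "Listen 192.0.2.10:7777\nListen 7778\n";
        System.out.println(find(sample));
    }
}
```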

A.3.3 Unable to Start Standalone OracleAS Web Cache Installations at the Standby Site

OracleAS Web Cache cannot be started at the standby site possibly due to misconfigured standalone OracleAS Web Cache after failover or switchover.

Problem

OracleAS Disaster Recovery synchronization does not synchronize standalone OracleAS Web Cache installations.

Solution

Use the standard Oracle Application Server full CD image to install the OracleAS Web Cache component.

A.3.4 Standby Site Middle-tier Installation Uses Wrong Hostname

A middle-tier installation in the standby site uses the wrong hostname even after the machine's physical hostname is changed.

Problem

Besides modifying the physical hostname, you also need to put it as the first entry in the /etc/hosts file. Failure to do the latter will cause the installer to use the wrong hostname.

Solution

Put the physical hostname as the first entry in the /etc/hosts file. See Section 13.2.2, "Configuring Hostname Resolution" for more information.

A.3.5 Failure of Farm Verification Operation with Standby Farm

When performing a verify farm with standby farm operation, the operation fails with an error message indicating that the middle-tier machine instance cannot be found and that the standby farm is not symmetrical with the production farm.

Problem

The verify farm with standby farm operation is trying to verify that the production and standby farms are symmetrical to one another, that they are consistent, and conform to the requirements for disaster recovery.

The verify operation fails because it sees the middle-tier instance as mid_tier.<hostname> and not as mid_tier.<physical_hostname>. You might suspect a problem with the _CLUSTER_NETWORK_NAME_ environment variable, which is set during installation, but in this case a check shows that its setting is correct. A check of the /etc/hosts file, however, shows that the entries for the middle tier in question are incorrect: all middle-tier installations take the hostname from the second column of the /etc/hosts file.

For example, assume the following scenario:

Two environments are used: examp1 and examp2
OracleAS Infrastructure (Oracle Identity Management and OracleAS Metadata Repository) is first installed on examp1 and examp2 as host infra
OracleAS middle-tier (OracleAS Portal and OracleAS Wireless) is then installed on examp1 and examp2 as host node1
Basically, these are two installations (OracleAS Infrastructure and OracleAS middle-tier) on a single node
Updated the latest duf.jar and backup_restore files on all four Oracle homes
Started OracleAS Guard (asgctl) on all four Oracle homes (OracleAS Infrastructure and OracleAS middle-tier on two nodes)
Performed asgctl operations: connect asg, set primary, dump farm
Performed asgctl verify farm with standby farm operation, but it fails because it sees the instance as mid_tier.examp1 and not as mid_tier.node1.us.oracle.com

A check of the /etc/hosts file shows the following entry:

123.45.67.890 examp1 node1.us.oracle.com node1 infra

Then ias.properties and farms shows the following and the verify operation is failing:

IASname=midtier_inst.examp1

However, the /etc/hosts file should actually be the following:

123.45.67.890 node1.us.oracle.com node1 infra

Then ias.properties and farms shows the following and the verify operation succeeds:

IASname=midtier_inst.node1.us.oracle.com

Solution

Check and change the second column entry in your /etc/hosts file to match the hostname of the middle-tier node in question as described in the previous explanation.
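The check described above amounts to reading the second column of the relevant /etc/hosts entry. The following Java sketch illustrates it with the two sample entries from this section; the class HostsEntry and its method are hypothetical names introduced for this example, not an Oracle tool.

```java
// Hypothetical helper, not an Oracle tool: returns the host name in the second
// column of an /etc/hosts entry, which is the name the installer picks up.
final class HostsEntry {

    // Returns null for blank lines, comments, and entries without a host name.
    static String secondColumn(String line) {
        int comment = line.indexOf('#');
        String content = (comment >= 0 ? line.substring(0, comment) : line).trim();
        if (content.isEmpty()) {
            return null;
        }
        String[] columns = content.split("\\s+");
        return columns.length > 1 ? columns[1] : null;
    }

    public static void main(String[] args) {
        String wrong = "123.45.67.890 examp1 node1.us.oracle.com node1 infra";
        String right = "123.45.67.890 node1.us.oracle.com node1 infra";
        System.out.println(secondColumn(wrong)); // examp1
        System.out.println(secondColumn(right)); // node1.us.oracle.com
    }
}
```

If the value returned for the middle-tier entry is not the hostname that you expect in the instance name, correct the /etc/hosts file before running the verify operation again.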

A.3.6 Sync Farm Operation Returns Error Message

A sync farm to operation returns the error message: "Cannot Connect to asdb"

Problem

Occasionally, an administrator forgets to set the primary database with the asgctl command-line utility before performing an operation that requires a connection to the asdb database. The following example shows this scenario for a sync farm to operation:

ASGCTL> connect asg hsunnab13 ias_admin/iastest2
Successfully connected to hsunnab13:7890
ASGCTL>  
.
.
.
<Other asgctl operations may follow, such as verify farm, dump farm, 
<and show operation history, and so forth that do not require the connection
<to the asdb database to be established or a time span may elapse of no activity
<and the administrator may miss performing this vital command.
.
.
.
ASGCTL> sync farm to usunnaa11
prodinfra(asr1012): Syncronizing each instance in the farm to standby farm
prodinfra: -->ASG_ORACLE-300: ORA-01031: insufficient privileges
prodinfra: -->ASG_DUF-3700: Failed in SQL*Plus executing SQL statement:  connect null/******@asdb.us.oracle.com as sysdba;.
prodinfra: -->ASG_DUF-3502: Failed to connect to database asdb.us.oracle.com.
prodinfra: -->ASG_DUF-3504: Failed to start database asdb.us.oracle.com.
prodinfra: -->ASG_DUF-3027: Error while executing Syncronizing each instance in the farm to standby farm at step - init step.

Solution

Perform the asgctl set primary database command. This command sets the connection parameters required to open the asdb database in order to perform the sync farm to operation. Note that the set primary database command must also precede the instantiate farm to command and switchover farm to command if the primary database has not been specified in the current connection session.

A.4 Troubleshooting Middle-Tier Components

This section describes common problems and solutions for middle-tier components in high availability configurations. It contains the following topics:

Section A.4.1, "Using Multiple NICs with OracleAS Cluster (OC4J-EJB)"
Section A.4.2, "Performance Is Slow When Using the "opmn:" URL Prefix"

A.4.1 Using Multiple NICs with OracleAS Cluster (OC4J-EJB)

Problem

If you are running OracleAS Cluster (OC4J-EJB) on computers with two NICs (network interface cards) and you are using one NIC for connecting to the network and the second NIC for connecting to the other node in the cluster, multicast messages may not be sent or received correctly. This means that session information does not get replicated between the nodes in the cluster.

Figure A-1 OracleAS Cluster (OC4J-EJB) Running on Computers with Two NICs

Solution

You need to start up the OC4J instances by setting the oc4j.multicast.bindInterface parameter to the name or IP address of the other NIC on the node.

For example, using the values shown in Figure A-1, you would start up the OC4J instances with these parameters:

On node 1, configure the OC4J instance to start up with this parameter:

-Doc4j.multicast.bindInterface=123.45.67.21

On node 2, configure the OC4J instance to start up with this parameter:

-Doc4j.multicast.bindInterface=123.45.67.22

You specify this parameter and its value in the "Java Options" field in the "Command Line Options" section in the Server Properties page in the Application Server Control Console (Figure A-2).

Figure A-2 Server Properties Page in Application Server Control Console

A.4.2 Performance Is Slow When Using the "opmn:" URL Prefix

Problem

If you have applications that use the "opmn:" prefix in their Context.PROVIDER_URL property, you may experience slow performance in the InitialContext method.

The following sample code sets the PROVIDER_URL to a URL with an opmn: prefix.

Hashtable env = new Hashtable();
env.put(Context.PROVIDER_URL, "opmn:ormi://hostname:port/cmpapp");
// ... set other properties ...
Context context = new InitialContext(env);

If the host specified in PROVIDER_URL is down, the application has to make a network connection to OPMN to locate another host. Going through the network to OPMN takes time.

Solution

To avoid making another network connection to OPMN to get another host, set the oracle.j2ee.naming.cache.timeout property so that the values returned from OPMN the first time are cached, and the application can use the values in the cache.

The following sample code sets the oracle.j2ee.naming.cache.timeout property.

Hashtable env = new Hashtable();
env.put(Context.PROVIDER_URL, "opmn:ormi://hostname:port/cmpapp");

// set the cache value
env.put("oracle.j2ee.naming.cache.timeout", "30");

// ... set other properties ...

Context context = new InitialContext(env);

Table A-1 shows valid values for the oracle.j2ee.naming.cache.timeout property:

Table A-1 Values for the oracle.j2ee.naming.cache.timeout Property

Value	Meaning
`-1`	No caching.
`0`	Cache only once, without any refreshing.
Greater than `0`	Number of seconds after which the cache can be refreshed. Note that this is not automatic; the refresh occurs only when you invoke "`new` `InitialContext()`" again. If the property is not set, the default value is 60.

With the property set, you will still see some delay on the first "new InitialContext()" call, but subsequent calls should be faster because they are retrieving data from the cache instead of making a network connection to OPMN.

Note that for optimal performance, you should also set Dedicated.Connection to either YES or DEFAULT, and set Dedicated.RMIcontext to FALSE.

A.5 Troubleshooting Backup and Recovery

Section A.5.1, "Unable to Restore OracleAS Metadata Repository to a Different Host"

A.5.1 Unable to Restore OracleAS Metadata Repository to a Different Host

The backing up and restoration of an OracleAS Metadata Repository using the Backup and Recovery Tool from one host to another fails if the ORACLE SID in the new host is different from that of the old host.

Problem

The Backup and Recovery Tool does not work with different ORACLE SID values.

The following is an example of the error message that appears when the restoration fails due to an inconsistent ORACLE SID:

Assume two nodes: A and B. The OracleAS Metadata Repository in machine A is backed up using the Backup and Recovery Tool. When attempting to restore it on machine B using the same tool, the following message appears:

Oracle instance started
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00579: the following error occurred at 09/08/2003 16:29:15
RMAN-06003: ORACLE error from target database: ORA-01103: database name 'M16REP1' in controlfile is not 'M16MR2'
RMAN-06097: text of failing SQL statement: alter database mount
RMAN-06099: error occurred in source file: krmk.pc, line: 4124

Note that "M16REP1" is the ORACLE SID of the database that was backed up.

Solution

None at this time. Restoring the OracleAS Metadata Repository to a database with a different ORACLE SID is currently not supported.

A.6 Troubleshooting Real Application Clusters

Section A.6.1, "Oracle Ultra Search Web Crawler Does Not Failover"

A.6.1 Oracle Ultra Search Web Crawler Does Not Failover

For Real Application Clusters that do not use a cluster file system, the Oracle Ultra Search web crawler does not fail over to an available node.

Problem

Currently, the Oracle Ultra Search web crawler is configured so that it can be run from only one node in a Real Application Clusters configuration. If that node (or the database) goes down, the web crawler does not start up on an available node. This situation occurs for Real Application Clusters that do not use a Cluster File System.

Solution

When Real Application Clusters use a Cluster File System, Oracle Ultra Search crawler can be launched from any of the Real Application Clusters nodes. At least one node has to be running.

When a Cluster File System is not used, the Oracle Ultra Search crawler always runs on a specified node. If this node stops operating, you must run the wk0reconfig.sql script to move Oracle Ultra Search to another Real Application Clusters node. This script can be run as follows:

> sqlplus wksys/wksys_passwd
SQL> @ORACLE_HOME/ultrasearch/admin/wk0reconfig.sql <instance_name> <connect_url>

<instance_name> is the name of the Real Application Clusters instance that Oracle Ultra Search uses for crawling. This name can be obtained by using the following SQL statement after connecting to the database:

SELECT instance_name FROM v$instance

<connect_url> is the JDBC connection string that guarantees a connection only to the specified instance, such as:

(DESCRIPTION=
  (ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=TCP)
      (HOST=<nodename>)
      (PORT=<listener_port>)))
  (CONNECT_DATA=(SERVICE_NAME=<service_name>)))

Note that when Oracle Ultra Search is switched from one Real Application Clusters node to another, the contents of the cache will be lost. After switching instances, force a re-crawl of the documents to re-populate the cache.
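Because the connect descriptor shown earlier in this section must be typed without mismatched parentheses, it can help to generate it. The following Java sketch assembles the same single-instance descriptor on one line; the class ConnectDescriptor and the sample host, port, and service name are assumptions made for this example and are not Oracle-supplied values.

```java
// Hypothetical helper, not an Oracle API: assembles the single-instance
// connect descriptor from its three variable parts.
final class ConnectDescriptor {

    static String build(String host, int port, String serviceName) {
        return "(DESCRIPTION="
                + "(ADDRESS_LIST="
                + "(ADDRESS=(PROTOCOL=TCP)(HOST=" + host + ")(PORT=" + port + ")))"
                + "(CONNECT_DATA=(SERVICE_NAME=" + serviceName + ")))";
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your node name, listener port,
        // and service name.
        System.out.println(build("node1.example.com", 1521, "asdb.example.com"));
    }
}
```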

A.7 Need More Help?

If the information in the previous sections is not sufficient, you can find more solutions on Oracle MetaLink, http://metalink.oracle.com. If you do not find a solution for your problem, log a service request.