Oracle® Application Server High Availability Guide
10g Release 3 (10.1.3) B15977-02
This appendix describes common problems that you might encounter when deploying and managing Oracle Application Server in high availability configurations, and explains how to solve them. It contains the following topics:
This section describes common problems and solutions in OracleAS Disaster Recovery configurations. It contains the following topics:
Section A.1.2, "Failure to Bring Up Standby Instances After Failover or Switchover"
Section A.1.3, "Switchover Operation Fails At the Step dcmctl resyncInstance -force -script"
Section A.1.4, "Unable to Start Standalone OracleAS Web Cache Installations at the Standby Site"
Section A.1.5, "Standby Site Middle-tier Installation Uses Wrong Hostname"
Section A.1.6, "Failure of Farm Verification Operation with Standby Farm"
In the OracleAS Disaster Recovery standby site, you may find that the site's OracleAS Metadata Repository is not synchronized with the OracleAS Metadata Repository in the primary site.
Problem
The OracleAS Disaster Recovery solution requires manual configuration and shipping of data files from the primary site to the standby site. Also, the data files (archived database log files) are not applied automatically in the standby site, that is, OracleAS Disaster Recovery does not use managed recovery in Oracle Data Guard.
Solution
The archive log files have to be applied manually. The steps to perform this task are found in Chapter 5, "OracleAS Disaster Recovery".
Standby instances are not started after a failover or switchover operation.
Problem
IP addresses are used in instance configuration. OracleAS Disaster Recovery setup does not require identical IP addresses in peer instances between the production and standby site. OracleAS Disaster Recovery synchronization does not reconcile IP address differences between the production and standby sites. Thus, if you use explicit IP address xxx.xx.xxx.xx in your configuration, the standby configuration after synchronization will not work.
Solution
Avoid using explicit IP addresses. For example, in OracleAS Web Cache and Oracle HTTP Server configurations, use ANY or host names instead of IP addresses as listening addresses.
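One way to spot explicit IP addresses before synchronization is to scan configuration text for dotted-quad literals. The following is a minimal sketch under stated assumptions: the class name `IpLiteralScan`, the method `findIpLiterals`, and the sample configuration lines are hypothetical and are not part of any Oracle tool. It only flags IPv4-looking strings and does not interpret any particular configuration file format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: flags dotted-quad IPv4 literals in configuration text. */
final class IpLiteralScan {
    // Exactly four groups of 1-3 digits separated by dots, not part of a longer dotted number.
    private static final Pattern IPV4 =
        Pattern.compile("(?<![\\d.])(?:\\d{1,3}\\.){3}\\d{1,3}(?![\\d.])");

    /** Returns every IPv4-looking literal found in the text, in order of appearance. */
    static List<String> findIpLiterals(String configText) {
        List<String> hits = new ArrayList<>();
        Matcher m = IPV4.matcher(configText);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        String sample = "listen=10.1.2.3:7777\nlisten=ANY:7778\nversion=10.1.3";
        for (String ip : findIpLiterals(sample)) {
            System.out.println("Explicit IP address found: " + ip);
        }
    }
}
```

Any address the scan reports is a candidate for replacement with ANY or a host name, as described in the solution above.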
The OracleAS Disaster Recovery asgctl switchover operation requires that the value of the TMP variable be defined the same in the opmn.xml file on both the primary and standby sites.
Problem
OracleAS Disaster Recovery switchover fails at the step dcmctl resyncInstance -force -script and displays a message that a directory could not be found.
Solution
During a switchover operation, the opmn.xml file is copied from the primary site to the standby site. For this reason, the value of the TMP variable must be defined the same in the opmn.xml file on both primary and standby sites; otherwise, the switchover operation will fail. Make sure the TMP variable is defined identically in the opmn.xml files and resolves to the same directory structure on both sites before attempting to perform an asgctl switchover operation.
For example, the following code snippets for a Windows and UNIX environment show a sample definition of the TMP variable.
Example in Windows Environment:

    . . .
    <ias-instance id="infraprod.iasha28.us.oracle.com">
      <environment>
        <variable id="TMP" value="C:\DOCUME~1\ntregres\LOCALS~1\Temp"/>
      </environment>
    . . .

Example in UNIX Environment:

    . . .
    <ias-instance id="infraprod.iasha28.us.oracle.com">
      <environment>
        <variable id="TMP" value="/tmp"/>
      </environment>
    . . .
A workaround to this problem is to change the value of the TMP variable in the opmn.xml file on the primary site, perform a dcmctl update config operation, then perform the asgctl switchover operation. This approach saves you having to reinstall the mid-tiers to make use of an altered TMP variable.
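Before a switchover, the TMP definitions in the two opmn.xml files can be compared programmatically. The sketch below is illustrative only: the class `TmpVariableCheck` and its methods are hypothetical names, it reads the XML from strings rather than from real Oracle homes, and it assumes that looking at the first `<variable id="TMP">` element is enough. A real opmn.xml may define TMP in more than one place, and this check does not verify that the directory exists on either site.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: compares the TMP variable in two opmn.xml documents. */
final class TmpVariableCheck {

    /** Returns the value of the first <variable id="TMP"> element, or null if absent. */
    static String tmpValue(String opmnXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(opmnXml)));
            NodeList vars = doc.getElementsByTagName("variable");
            for (int i = 0; i < vars.getLength(); i++) {
                Element v = (Element) vars.item(i);
                if ("TMP".equals(v.getAttribute("id"))) {
                    return v.getAttribute("value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse opmn.xml content", e);
        }
    }

    /** True only when both documents define TMP and the two values are identical. */
    static boolean sameTmp(String primaryXml, String standbyXml) {
        String primary = tmpValue(primaryXml);
        return primary != null && primary.equals(tmpValue(standbyXml));
    }
}
```

If `sameTmp` returns false, align the TMP values on both sites before attempting the asgctl switchover operation.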
OracleAS Web Cache cannot be started at the standby site after a failover or switchover, possibly because the standalone OracleAS Web Cache installation is misconfigured.
Problem
OracleAS Disaster Recovery synchronization does not synchronize standalone OracleAS Web Cache installations.
Solution
Use the standard Oracle Application Server full CD image to install the OracleAS Web Cache component.
A middle-tier installation in the standby site uses the wrong hostname even after the machine's physical hostname is changed.
Problem
Besides modifying the physical hostname, you also need to list it as the first entry in the /etc/hosts file. Otherwise, the installer uses the wrong hostname.
Solution
Put the physical hostname as the first entry in the /etc/hosts file. See Section 5.2.2, "Configuring Hostname Resolution" for more information.
When performing a verify farm with standby farm operation, the operation fails with an error message indicating that the middle-tier machine instance cannot be found and that the standby farm is not symmetrical with the production farm.
Problem
The verify farm with standby farm operation is trying to verify that the production and standby farms are symmetrical to one another, that they are consistent, and conform to the requirements for disaster recovery.
The verify operation is failing because it sees the middle-tier instance as mid_tier.<hostname> and not as mid_tier.<physical_hostname>. You might suspect a problem with the environment variable _CLUSTER_NETWORK_NAME_, which is set during installation. In this case, however, a check of the _CLUSTER_NETWORK_NAME_ setting finds it to be correct. A check of the contents of the /etc/hosts file, on the other hand, indicates that the entries for the middle tier in question are incorrect. That is, all middle-tier installations take the hostname from the second column of the /etc/hosts file.
For example, assume the following scenario:
Two environments are used: examp1 and examp2
OracleAS Infrastructure (Oracle Identity Management and OracleAS Metadata Repository) is first installed on examp1 and examp2 as host infra
OracleAS middle-tier (OracleAS Portal and OracleAS Wireless) is then installed on examp1 and examp2 as host node1
Basically, these are two installations (OracleAS Infrastructure and OracleAS middle-tier) on a single node
Updated the latest duf.jar and backup_restore files on all four Oracle homes
Started OracleAS Guard (asgctl) on all four Oracle homes (OracleAS Infrastructure and OracleAS middle-tier on two nodes)
Performed asgctl operations: connect asg, set primary, dump farm
Performed asgctl verify farm with standby farm operation, but it fails because it sees the instance as mid-tier.examp1 and not as mid_tier.node1.us.oracle.com
A check of the /etc/hosts file shows the following entry:
123.45.67.890 examp1 node1.us.oracle.com node1 infra
Then ias.properties and farms show the following, and the verify operation fails:
IASname=midtier_inst.examp1
However, the /etc/hosts file should actually be the following:
123.45.67.890 node1.us.oracle.com node1 infra
Then ias.properties and farms show the following, and the verify operation succeeds:
IASname=midtier_inst.node1.us.oracle.com
Solution
Check and change the second column entry in your /etc/hosts file to match the hostname of the middle-tier node in question as described in the previous explanation.
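The second-column check described above can be sketched in a few lines. This is an illustration, not an Oracle utility: the class `HostsSecondColumn` and the method `secondColumn` are hypothetical names, and the code simply splits one hosts entry on whitespace after discarding any trailing comment.

```java
/** Hypothetical helper: extracts the second column of one /etc/hosts entry. */
final class HostsSecondColumn {

    /** Returns the second whitespace-separated field of the entry, or null if there is none. */
    static String secondColumn(String hostsLine) {
        // Drop a trailing "#" comment, then split on runs of whitespace.
        int hash = hostsLine.indexOf('#');
        String body = (hash >= 0 ? hostsLine.substring(0, hash) : hostsLine).trim();
        if (body.isEmpty()) {
            return null;
        }
        String[] fields = body.split("\\s+");
        return fields.length >= 2 ? fields[1] : null;
    }

    public static void main(String[] args) {
        // The two entries from the scenario above.
        System.out.println(secondColumn("123.45.67.890 examp1 node1.us.oracle.com node1 infra"));
        // prints "examp1"
        System.out.println(secondColumn("123.45.67.890 node1.us.oracle.com node1 infra"));
        // prints "node1.us.oracle.com"
    }
}
```

For the failing entry the second column is examp1, which is the name that ends up in IASname; for the corrected entry it is node1.us.oracle.com.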
A sync farm to operation returns the error message: "Cannot Connect to asdb".
Problem
Occasionally, an administrator may forget to set the primary database using the asgctl command-line utility before performing an operation that requires a connection to the asdb database. The following example shows this scenario for a sync farm to operation:
    ASGCTL> connect asg hsunnab13 ias_admin/iastest2
    Successfully connected to hsunnab13:7890
    ASGCTL>
    .
    .
    .
    <Other asgctl operations may follow, such as verify farm, dump farm,
    <and show operation history, and so forth that do not require the connection
    <to the asdb database to be established or a time span may elapse of no activity
    <and the administrator may miss performing this vital command.
    .
    .
    .
    ASGCTL> sync farm to usunnaa11
    prodinfra(asr1012): Syncronizing each instance in the farm to standby farm
    prodinfra: -->ASG_ORACLE-300: ORA-01031: insufficient privileges
    prodinfra: -->ASG_DUF-3700: Failed in SQL*Plus executing SQL statement: connect null/******@asdb.us.oracle.com as sysdba;.
    prodinfra: -->ASG_DUF-3502: Failed to connect to database asdb.us.oracle.com.
    prodinfra: -->ASG_DUF-3504: Failed to start database asdb.us.oracle.com.
    prodinfra: -->ASG_DUF-3027: Error while executing Syncronizing each instance in the farm to standby farm at step - init step.
Solution
Perform the asgctl set primary database command. This command sets the connection parameters required to open the asdb database in order to perform the sync farm to operation. Note that the set primary database command must also precede the instantiate farm to command and the switchover farm to command if the primary database has not been specified in the current connection session.
On Windows systems, if the system PATH environment variable exceeds the 1024 character limit because many OracleAS instances, many third-party software installations, or both are installed on the system, the asgctl startup command may fail. The failure occurs because the OracleAS Guard server is started outside of OPMN and the system cannot resolve the directory path.
Problem
Occasionally, on Windows systems with many OracleAS instances, third-party software installations, or both, the asgctl startup command, which is run outside of OPMN, may display a popup error stating that it could not find the dynamic link library orawsec9.dll, followed by a DufException. For example:
    C:\product\10.1.3\OC4J_1\dsa\bin> asgctl startup

    <<Popup Error:>>
    The dynamic link library *orawsec9.dll* could not be found.

    <<The exception:>>
    oracle.duf.DufException
        at oracle.duf.DufOsBase.constructInstance(DufOsBase.java:1331)
        at oracle.duf.DufOsBase.getDufOs(DufOsBase.java:122)
        at oracle.duf.DufHomeMgr.getCurrentHomePath(DufHomeMgr.java:582)
        at oracle.duf.dufclient.DufClient.main(DufClient.java:132)
    stado42: -->ASG_SYSTEM-100: oracle.duf.DufException
However, this DLL does exist in the ORACLE_HOME\bin directory.
This error is not seen in the OracleAS Guard standalone kit because the file orawsec9.dll exists in the ORACLE_HOME\dsa\bin folder.
Solution
The workaround is to either manually edit the system PATH variable with the required path information or manually override the PATH in the command prompt by specifying the relevant %PATH% variables. For example:
    C:\> set PATH=C:\product\10.1.3\OracleAS_OC4J_2\bin;C:\product\10.1.3\OracleAS_OHS1\jre\1.4.2\bin\client;C:\product\10.1.3\OracleAS_OHS1\jre\1.4.2\bin;C:\product\10.1.3\OracleAS_OHS1\bin;C:\product\10.1.3\OC4J_1\bin

    C:\product\10.1.3\OC4J_1\dsa\bin> asgctl startup
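To confirm that the PATH length is the likely cause before editing it, the length can be measured directly. The sketch below is a hedged illustration: `PathLengthCheck` and `exceedsLimit` are hypothetical names, and the 1024 character threshold is taken from the description above rather than queried from the operating system.

```java
/** Hypothetical helper: reports whether a PATH value is longer than the limit cited above. */
final class PathLengthCheck {
    // Limit taken from the problem description; not read from the operating system.
    static final int LIMIT = 1024;

    /** True when the given PATH value has more characters than LIMIT. */
    static boolean exceedsLimit(String pathValue) {
        return pathValue != null && pathValue.length() > LIMIT;
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        int length = (path == null) ? 0 : path.length();
        System.out.println("PATH length: " + length + " characters");
        if (exceedsLimit(path)) {
            System.out.println("PATH is longer than " + LIMIT
                + " characters; consider overriding it before running asgctl startup.");
        }
    }
}
```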
This section describes common problems and solutions for middle-tier components in high availability configurations. It contains the following topics:
Section A.2.1, "Using Multiple NICs with OracleAS Cluster (OC4J-EJB)"
Section A.2.2, "Performance Is Slow When Using the "opmn:" URL Prefix"
Problem
If you are running OracleAS Cluster (OC4J-EJB) on computers with two NICs (network interface cards) and you are using one NIC for connecting to the network and the second NIC for connecting to the other node in the cluster, multicast messages may not be sent or received correctly. This means that session information does not get replicated between the nodes in the cluster.
Figure A-1 OracleAS Cluster (OC4J-EJB) Running on Computers with Two NICs
Solution
You need to start up the OC4J instances by setting the oc4j.multicast.bindInterface parameter to the name or IP address of the other NIC on the node.
For example, using the values shown in Figure A-1, you would start up the OC4J instances with these parameters:
On node 1, configure the OC4J instance to start up with this parameter:
-Doc4j.multicast.bindInterface=123.45.67.21
On node 2, configure the OC4J instance to start up with this parameter:
-Doc4j.multicast.bindInterface=123.45.67.22
You specify this parameter and its value in the "Java Options" field in the "Command Line Options" section in the Server Properties page in the Application Server Control Console (Figure A-2).
Figure A-2 Server Properties Page in Application Server Control Console
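If you are unsure which addresses are available on a node, the standard `java.net.NetworkInterface` API can list them. The following sketch makes several assumptions: `MulticastBindOption` and its methods are hypothetical names, and the code only formats candidate values for the oc4j.multicast.bindInterface parameter. It cannot tell which NIC connects to the other cluster node, so that choice remains yours.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical helper: formats candidate values for the bindInterface parameter. */
final class MulticastBindOption {

    /** Builds the Java option string for the given interface name or IP address. */
    static String jvmOption(String nameOrIp) {
        return "-Doc4j.multicast.bindInterface=" + nameOrIp;
    }

    /** Lists one option per address on interfaces that are up, non-loopback, and multicast-capable. */
    static List<String> candidateOptions() {
        List<String> options = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return options;
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp() || nic.isLoopback() || !nic.supportsMulticast()) {
                    continue;
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    options.add(jvmOption(addr.getHostAddress()));
                }
            }
        } catch (SocketException e) {
            // Interface details could not be read; return what was collected so far.
        }
        return options;
    }

    public static void main(String[] args) {
        for (String option : candidateOptions()) {
            System.out.println(option);
        }
    }
}
```

Pick the line whose address belongs to the NIC used for cluster traffic and enter it in the "Java Options" field.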
Problem
If you have applications that use the "opmn:" prefix in their Context.PROVIDER_URL property, you may experience slow performance in the InitialContext constructor.
The following sample code sets the PROVIDER_URL to a URL with an opmn: prefix.
    Hashtable env = new Hashtable();
    env.put(Context.PROVIDER_URL, "opmn:ormi://hostname:port/cmpapp");
    // ... set other properties ...
    Context context = new InitialContext(env);
If the host specified in PROVIDER_URL
is down, the application has to make a network connection to OPMN to locate another host. Going through the network to OPMN takes time.
To avoid making another network connection to OPMN to get another host, set the oracle.j2ee.naming.cache.timeout property so that the values returned from OPMN the first time are cached, and the application can use the values in the cache.
The following sample code sets the oracle.j2ee.naming.cache.timeout property.
    Hashtable env = new Hashtable();
    env.put(Context.PROVIDER_URL, "opmn:ormi://hostname:port/cmpapp");
    // set the cache value
    env.put("oracle.j2ee.naming.cache.timeout", "30");
    // ... set other properties ...
    Context context = new InitialContext(env);
Table A-1 shows valid values for the oracle.j2ee.naming.cache.timeout property:

Table A-1 Values for the oracle.j2ee.naming.cache.timeout Property

Value | Meaning |
---|---|
-1 | No caching. |
0 | Cache only once, without any refreshing. |
Greater than 0 | Number of seconds after which the cache can be refreshed. Note that this is not automatic; the refresh occurs only when you invoke "new InitialContext()". If the property is not set, the default value is 60. |
With the property set, you will still see some delay on the first "new InitialContext()" call, but subsequent calls should be faster because they retrieve data from the cache instead of making a network connection to OPMN.
Note that for optimal performance, you should also set Dedicated.Connection to either YES or DEFAULT, and set Dedicated.RMIcontext to FALSE.
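The settings discussed in this section can be collected in one place so that every lookup uses the same environment. The sketch below is a hedged illustration, not Oracle sample code: `OpmnLookupEnv` and `build` are hypothetical names, and it assumes that Dedicated.Connection and Dedicated.RMIcontext are supplied as environment entries in the same table as the other properties, using the spellings given above. Check that assumption against the OC4J documentation for your release.

```java
import java.util.Hashtable;
import javax.naming.Context;

/** Hypothetical helper: builds a JNDI environment with the settings discussed above. */
final class OpmnLookupEnv {

    /** Returns an environment table for the given provider URL and cache timeout in seconds. */
    static Hashtable<String, Object> build(String providerUrl, int cacheTimeoutSeconds) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.PROVIDER_URL, providerUrl);
        // Cache the values returned from OPMN so later lookups avoid another network trip.
        env.put("oracle.j2ee.naming.cache.timeout", String.valueOf(cacheTimeoutSeconds));
        // Assumption: these two settings are passed as environment entries like the others.
        env.put("Dedicated.Connection", "DEFAULT");
        env.put("Dedicated.RMIcontext", "FALSE");
        return env;
    }

    public static void main(String[] args) {
        // Placeholder URL copied from the sample code above; replace hostname and port.
        Hashtable<String, Object> env = build("opmn:ormi://hostname:port/cmpapp", 30);
        System.out.println(env);
        // In an application you would then call: new javax.naming.InitialContext(env)
    }
}
```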
If the information in the previous sections is not sufficient, you can find more solutions on Oracle MetaLink, http://metalink.oracle.com. If you do not find a solution for your problem, log a service request.