Oracle® Collaboration Suite Administrator's Guide
10g Release 1 (10.1.2) for Windows or UNIX

Part Number B25490-05

F Troubleshooting the Oracle Collaboration Suite Recovery Manager

This appendix describes common problems you may encounter when using the Oracle Collaboration Suite Recovery Manager, and explains how to solve them.

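This appendix contains the following sections:

  • Troubleshooting Overview

  • Troubleshooting the Installation and Configuration

  • Troubleshooting Instance Backups

  • Troubleshooting Database Backups

  • Troubleshooting Configuration Backups

  • Troubleshooting Instance Restore

  • Troubleshooting Database Restores

  • Troubleshooting Configuration Restores

  • Troubleshooting Miscellaneous Problems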

Troubleshooting Overview

Troubleshooting the Oracle Collaboration Suite Recovery Manager may involve Oracle Collaboration Suite, Oracle Application Server and Oracle Database.

Using Log Files

The log files often provide valuable information for troubleshooting the Oracle Collaboration Suite Recovery Manager, although investigating these problems requires other resources in addition to log files.

You can find the locations of log files from the configuration parameters in the Oracle Collaboration Suite Recovery Manager configuration file. The default configuration file is ORACLE_HOME/backup_restore/config/config.inp. The following parameters specify the log and backup paths:

  • LOG_PATH is the directory path where the Oracle Collaboration Suite Recovery Manager saves logs generated during backups and restores of database, configuration files, and Oracle Calendar files. The default LOG_PATH is ORACLE_HOME/backup_restore/logs.

  • CONFIG_BACKUP_PATH is the directory path where the Oracle Collaboration Suite Recovery Manager stores backups of configuration files and the Oracle Calendar.

  • DATABASE_BACKUP_PATH is the directory path where database backups are stored. This path is required for Oracle Collaboration Suite Database installations.
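The following is a minimal sketch of how these parameters could be read from a script. It assumes that config.inp stores each parameter as a single name=value line, which you should confirm against your own file. The function name get_config_param is hypothetical and is not part of the Oracle Collaboration Suite Recovery Manager.

```shell
# Hypothetical helper (not part of the Recovery Manager): print the value of
# one parameter from config.inp.
# Assumption: the file holds one "name=value" pair per line.
get_config_param() {
    config_file=$1
    param_name=$2
    awk -F= -v name="$param_name" '
        {
            key = $1
            gsub(/[ \t]/, "", key)        # ignore stray blanks around the name
        }
        tolower(key) == tolower(name) {   # match the name case-insensitively
            sub(/^[^=]*=[ \t]*/, "")      # drop "name=" and keep the whole value
            print
            exit
        }
    ' "$config_file"
}
```

For example, get_config_param "$ORACLE_HOME/backup_restore/config/config.inp" LOG_PATH would print the configured log directory, if the file follows the assumed layout.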

In addition to the log files for Oracle Collaboration Suite Recovery Manager, other log files may help you to debug specific problems.

  • For DCM related problems, you should check the log files under:

    ORACLE_HOME/dcm/logs
    
    
  • For OPMN related problems, you should check the log files under:

    ORACLE_HOME/opmn/logs
    
    
  • For problems specific to Oracle Calendar, you should check log files under these two directories:

    ORACLE_HOME/ocas/logs
    ORACLE_HOME/ocal/log
    
    
  • For issues relating to the database, you should check for log files and trace files under the directories referenced by the following database parameters: CORE_DUMP_DEST, BACKGROUND_DUMP_DEST, and USER_DUMP_DEST.
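When several of these directories have to be checked, it can help to look at the most recently written files first. The sketch below is one possible way to do that on UNIX or Linux. The function name newest_logs is hypothetical, and the default of five files is an arbitrary choice.

```shell
# Hypothetical helper: print the names of the most recently modified files
# in a log directory, newest first.
newest_logs() {
    log_dir=$1
    count=${2:-5}                 # number of files to show (assumed default: 5)
    # ls -t sorts by modification time, newest first.
    ls -t -- "$log_dir" | head -n "$count"
}
```

For example, newest_logs "$ORACLE_HOME/opmn/logs" 3 lists the three OPMN log files that were written last.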

Screen Output

Screen output from Oracle Collaboration Suite Recovery Manager also provides useful information for troubleshooting. This includes the steps the operation has been executing, the status of the operation, error messages, and location of the log files that include more detailed error and warning messages.

To obtain further information, you can invoke the Oracle Collaboration Suite Recovery Manager in verbose mode by specifying the -v option on the command line.

E-mail Notification

For each invocation of the ocs_bkp_restore.sh command for UNIX or Linux, you can receive an e-mail notification about the run. This notification includes the following information:

  • type of the operation, such as backup_instance or restore_instance

  • status of the operation, which is either success or failure

  • date

  • host name

  • ORACLE_HOME value

  • log path

  • configuration backup path

  • database backup path

  • install type

E-mail notification is optional. To receive e-mail notifications, choose yes for "Enable Email notification" when configuring the Oracle Collaboration Suite Recovery Manager. The notification is sent to the "To" addresses specified in the configuration.

Error Messages

Error messages generated during an operation can be obtained from the log files, screen output, and e-mail notifications previously discussed.

Prerequisite Check

You should perform prerequisite checks before running certain operations. Some prerequisites apply to all operations, while others are specific to only a few of them.

  1. Enable the ARCHIVELOG mode before making a database backup.

  2. The database listener should be up even when performing a cold database backup.

  3. Oracle Internet Directory should be up for all Applications tier backups.

  4. Ensure that there is enough space in the backup directories.

  5. All OPMN managed processes should be in a consistent state for all online and cold backups and restores. The PID of an OPMN managed process cannot be 0. You can find the PIDs by checking the status of the OPMN processes.

  6. Make sure that the Applications tier processes are down when the Oracle Collaboration Suite Database is not up, as Applications tier processes constantly try to communicate with the Oracle Collaboration Suite Database.

  7. If OPMN cannot start any of its managed processes, use the following command:

    ORACLE_HOME/opmn/bin/opmnctl startproc ias-component=component_name
    
    

    In some cases, the OPMN may not be able to start the dcm-daemon; use the following command:

    ORACLE_HOME/opmn/bin/opmnctl startproc ias-component=dcm-daemon
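The PID check in the list above can be sketched as a small filter over the output of opmnctl status. This example assumes the usual tabular output, in which each row has the form "ias-component | process-type | pid | status". If your output is formatted differently, adjust the field positions. The function name find_zero_pid_components is hypothetical.

```shell
# Hypothetical helper: read "opmnctl status" output on standard input and
# print the ias-component name of every row whose pid column is exactly 0.
# Assumption: rows look like "ias-component | process-type | pid | status".
find_zero_pid_components() {
    awk -F'|' '
        NF >= 4 {
            pid = $3
            gsub(/[ \t]/, "", pid)        # strip the column padding
            if (pid == "0") {
                name = $1
                gsub(/[ \t]/, "", name)
                print name
            }
        }
    '
}
```

For example, ORACLE_HOME/opmn/bin/opmnctl status | find_zero_pid_components prints nothing when no managed process reports a PID of 0.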
    
    

There are some other tips that will help you troubleshoot the Oracle Collaboration Suite Recovery Manager.

Troubleshooting the Installation and Configuration

The following problems can be found during the installation and configuration of the Oracle Collaboration Suite Recovery Manager.

Oracle Collaboration Suite Recovery Manager Not Found

Description

The following command returns a "command not found" error:

On Unix and Linux,

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m configure

On Windows,

ORACLE_HOME\backup_restore\ocs_bkp_restore -m configure

Cause

The problem occurs either because the Oracle Collaboration Suite Recovery Manager is not installed, or because the command is not on the execution PATH.

Solution

If the Oracle Collaboration Suite Recovery Manager is not installed, install it. If the Oracle Collaboration Suite Recovery Manager is already installed, ensure that the ORACLE_HOME/backup_restore directory is on the execution PATH. Alternatively, you can invoke the Oracle Collaboration Suite Recovery Manager from the ORACLE_HOME/backup_restore directory.
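On UNIX and Linux, adding the directory to the execution PATH could look like the following sketch. The function name add_to_path is hypothetical. It appends the directory only if it is not already present, so it is safe to call more than once from a login script.

```shell
# Hypothetical helper: append a directory to PATH unless it is already there.
add_to_path() {
    dir=$1
    case ":$PATH:" in
        *":$dir:"*) ;;                # already on PATH, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
    export PATH
}
```

For example, add_to_path "$ORACLE_HOME/backup_restore" makes ocs_bkp_restore.sh available from any directory in the current shell session.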

No Permission for Executing Oracle Collaboration Suite Recovery Manager

Description

The following command returns a "permission denied" error:

On Unix only,

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh

Cause

The Oracle Collaboration Suite Recovery Manager scripts ocs_bkp_restore.sh and ocs_bkp_restore.pl do not have execute permissions.

Solution

Grant execution permissions to ocs_bkp_restore.pl and ocs_bkp_restore.sh:

chmod 755 ORACLE_HOME/backup_restore/ocs_bkp_restore.pl
chmod 755 ORACLE_HOME/backup_restore/ocs_bkp_restore.sh

Unnecessary Input Required for Applications Tier Configuration

Description

When performing Oracle Collaboration Suite Recovery Manager configuration on the Applications tier using this command,

On Unix and Linux,

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m configure

you will be prompted to enter configuration parameters. Two of the parameters, "Database backup path" and ORACLE_SID, do not apply to the Applications tier.

Cause

Oracle Collaboration Suite Recovery Manager goes through the same set of configuration steps for the Oracle Collaboration Suite Database and the Applications tier. Although "Database backup path" and ORACLE_SID are not required by the Applications tier, they still show up in the prompt.

Solution

You can ignore these two parameters for the Applications tier configuration; press the Enter key for the next prompt.

Unable to Get dbid From the Database

Description

When performing one of the following configuration commands, the command fails with the error message "Unable to get dbid from the database":

On UNIX:

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m configure

On Windows:

ORACLE_HOME\backup_restore\ocs_bkp_restore -m configure

Cause

This problem occurs because the Oracle Collaboration Suite Database or the Listener is down. Both the Oracle Collaboration Suite Database and the Listener must be up in order to configure the Oracle Collaboration Suite Recovery Manager.

Solution

Make sure you can access the Oracle Collaboration Suite Database. If you cannot, start the database and the Listener, or troubleshoot the connection problem, and then run the configuration command again.

Troubleshooting Instance Backups

The following problems can be found during instance backups made with the Oracle Collaboration Suite Recovery Manager.

Cold Backup on Oracle Collaboration Suite Database Hangs

Description

While making a cold backup of the Oracle Collaboration Suite Database when the Applications tier is up, the resources of the host machine are used up, resulting in a hang. The following scenario describes how this problem may occur:

  1. Start the Oracle Collaboration Suite Database.

  2. Start the Applications tier.

  3. Perform a cold backup of the Oracle Collaboration Suite Database:

    On Unix and Linux,

    ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m backup_cold
    
    

    On Windows,

    ORACLE_HOME\backup_restore\ocs_bkp_restore.bat -m backup_cold 
    
    

Cause

After some time, the Oracle Collaboration Suite Database shuts down, which is a normal operation for cold backup. However, because the Applications tier is still up, it will continue to consume system resources on the host machine by making repeated attempts to connect to the Oracle Collaboration Suite Database (I/O slows down and CPU usage increases).

This problem can also occur outside the Oracle Collaboration Suite Recovery Manager, if you shut down the Oracle Collaboration Suite Database before shutting down the Applications tier.

Solution

You should make sure that all Applications tier processes are down before performing cold backup on the Oracle Collaboration Suite Database. You may have to reboot the host machine.
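One way to enforce this order on UNIX or Linux is a small wrapper that stops the Applications tier first and starts the cold backup only if the stop succeeded. This is a sketch under stated assumptions, not a supported procedure. The function name cold_backup_with_apps_down is hypothetical, it assumes both Oracle homes are reachable from the host where it runs, and it uses opmnctl stopall to stop the Applications tier processes.

```shell
# Hypothetical wrapper: stop the Applications tier, then run the cold backup.
# $1 = ORACLE_HOME of the Applications tier
# $2 = ORACLE_HOME of the Oracle Collaboration Suite Database
cold_backup_with_apps_down() {
    apps_home=$1
    db_home=$2
    # Stop all OPMN managed processes on the Applications tier first.
    "$apps_home/opmn/bin/opmnctl" stopall || {
        echo "Applications tier did not stop; skipping cold backup" >&2
        return 1
    }
    # Only now start the cold backup of the database.
    "$db_home/backup_restore/ocs_bkp_restore.sh" -m backup_cold
}
```

If the two tiers are on different hosts, run the two commands separately on their own hosts in the same order.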

OPMN Restart Failure at Cold Instance Backup

Description

When performing cold instance backup on the Applications tier using the following command, the OPMN may fail to restart:

On Unix and Linux,

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m backup_instance_cold

On Windows,

ORACLE_HOME\backup_restore\ocs_bkp_restore.bat -m backup_instance_cold

A screen output may look like this:

ocs_bkp_restore.sh -m backup_instance_cold -n 
Stopping all opmn managed processes ... 
Starting opmn process ... 
Checking the process status ... 
Stopping all opmn managed processes ... 
Checking the process status ... 
The command /scratch/ocs_home/apps/opmn/bin/opmnctl status -fmt 
   %cmp%sta  -noheaders  >  /scratch/ocs_home/apps/backup_restore/logs/
   2005-07-08_10-06-46_output.log 
failed with return code 2 
Unable to check the status of processes.  
. 
Problem running command (Returned 255) 
 /scratch/ocs_home/apps/backup_restore/bkp_restore.sh   -T 
2005-07-08_10-06-46 -m backup_instance_cold -n 

Cause

From the screen output we know that:

  • Configuration backup has completed.

  • OPMN restart has failed.

  • Oracle Calendar Server backup has not been done.

The bkp_restore.sh command shown in the screen output is the Application Server Backup and Recovery Tool, which performs the configuration backup and manages OPMN processes. The Oracle Collaboration Suite Recovery Manager performs the Oracle Calendar backup only after the Application Server Backup and Recovery Tool command has completed successfully. If the OPMN restart fails, the Oracle Collaboration Suite Recovery Manager exits abnormally. As a result, it does not perform the Oracle Calendar Server backup.

Solution

All OPMN managed processes should be in a consistent state for backups and restores. If the OPMN restart has failed, you should manually start OPMN after the backup_instance_cold operation.

Although the cold instance backup operation has failed due to the OPMN restart failure, a configuration backup has completed successfully. A *.jar file for the configuration backup can be seen under the configuration backup directory, and the backup timestamp is recorded in the backup and restore catalog.


See Also:

  • The troubleshooting appendix in the Oracle Process Manager and Notification Server Administrator's Guide, which is part of the Oracle Application Server library

  • Oracle Calendar Reference Manual for instructions on how to use the unidbbackup command


Troubleshooting Database Backups

Oracle recommends that you do not perform a database or repository-only backup on Oracle Collaboration Suite Database. Whenever possible, you should perform an instance backup because it consists of a database or Oracle Collaboration Suite Database backup and a backup of all the configuration files. In contrast, a database-only backup without the configuration backup may result in configuration data inconsistency. For database-only and Oracle Collaboration Suite component-only database install types, use backup_cold or backup_online options instead of backup_instance* options.

Oracle Collaboration Suite Database Has a Portal Validation Warning

You can ignore this warning. To suppress the Portal validation, use the -z switch when running the Oracle Collaboration Suite Recovery Manager.

Troubleshooting Configuration Backups

A configuration-only backup on the Oracle Collaboration Suite Database is not recommended. Whenever possible, Oracle recommends that you perform an instance backup.

Troubleshooting Instance Restore

The following problems can be found during instance restore operations performed with the Oracle Collaboration Suite Recovery Manager.

Instance Restore Hang

Description

When performing an instance restore on the Applications tier using the following command, the restore operation may hang at the configuration file restore. This can happen if the restore is based on a cold instance backup in which the configuration backup completed but OPMN failed to restart, as described in "OPMN Restart Failure at Cold Instance Backup".

On Unix and Linux,

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m restore_instance

On Windows,

ORACLE_HOME\backup_restore\ocs_bkp_restore.bat -m restore_instance

This can be observed from the screen output of this command.

Oracle Application Server Backup/Recovery Tool 10g (10.1.2.0.2)
Copyright (c) 2004, 2005, Oracle. All rights reserved.
 
Stopping all opmn managed processes ...
Starting opmn process ...
Performing file system restore ...
Starting opmn process ...
Checking the process status ...
OPMN managed processes shutdown successfully.
Performing configuration restore ...

If the system did not hang, you would have seen the "Configuration restore completed successfully" message and the echo of other instance restore procedures as they complete successfully.

Cause

As part of the configuration backup and restore, the backup and restore operation will backup and restore the configuration files and some OPMN and DCM managed files. The backup and restore may not function properly if OPMN is in an inconsistent state.

Solution

All OPMN managed processes should be in a consistent state for all online backups and restores. Additionally, if a DCM message "ADMN-404040 ... the archive timestamp was not found in the repository" appears in the timestamp_restore_config.log file, ignore it because it represents part of a cleanup operation.

No Oracle Calendar and Non-DCM Files Restore for an Instance Restore

Description

When performing instance restore on the Applications tier using the following command, and if OPMN has failed to be restarted, the restore may only be performed on the configuration files, but not on the non-DCM files and Oracle Calendar Server.

On Unix and Linux,

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m restore_instance

On Windows,

ORACLE_HOME\backup_restore\ocs_bkp_restore.bat -m restore_instance

Cause

Instance restore will not restore non-DCM files and the Oracle Calendar server, because the non-DCM files restore and the Oracle Calendar Server restore processes are performed after the OPMN restart. If OPMN restart has failed, Oracle Collaboration Suite Recovery Manager will exit without performing the restore on non-DCM files and Oracle Calendar Server.

Instance restore will not restore the Oracle Calendar because its backup file cannot be found in the backup directory.

Solution

During an instance restore, if Oracle Collaboration Suite Recovery Manager has exited upon the OPMN restart failure, and as a consequence non-DCM files and Oracle Calendar Server are not restored, you should use the following commands to restore non-DCM files and Oracle Calendar Server:

On UNIX and Linux:

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m restore_nondcm
ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m restore_calendar

On Windows:

ORACLE_HOME\backup_restore\ocs_bkp_restore.bat -m restore_nondcm
ORACLE_HOME\backup_restore\ocs_bkp_restore.bat -m restore_calendar

Oracle Calendar Server Down Error

Description

When performing an instance restore on an Applications tier that contains the Oracle Calendar Server, using the command:

ORACLE_HOME/backup_restore/ocs_bkp_restore.sh -m restore_instance

the screen output shows this error message:

Stopping CalendarServer process ... 
opmnctl: stopping opmn managed processes... 
  Problem running command (Returned 150) 
  /scratch/ocs_apps_home/apps/opmn/bin/opmnctl stopproc ias-component=CalendarServer 
The process CalendarServer is already DOWN. 

Cause

This error occurs if the Oracle Calendar Server is already down before the operation, and if you do not manually restart the Oracle Calendar Server before the instance restore operation.

Solution

Ignore this problem because the instance restore operation will proceed to completion, despite the error message. To eliminate this message, manually restart the Oracle Calendar Server if it is down before the instance restore operation.

A similar error appears during a configuration restore operation.

'ORA-01276: Cannot add file' or 'ORA-25153: Temporary Tablespace is Empty' Errors

Description

During a database restore operation, you may encounter the ORA-01276: Cannot add file or ORA-25153: Temporary Tablespace is Empty errors.

Cause

The restore of database temp files has failed.

Solution

You need to add temp files to any temporary tablespaces in the database. Follow the instructions in "Validating Database Temporary Tablespaces Have Temp Files".

Troubleshooting Database Restores

Database-only restore operations are not recommended, especially when the backup used for the restore is from an instance backup. Oracle recommends that you perform an instance backup and restore instead.

Suppose that after the instance backup you made some new configuration changes, such as deploying a new application. You then realize that some mistakes have been made in the configuration changes, and decide to roll back the deployment. Since the deployed application has configuration data in some local configuration files and in the repository database, performing a restore_instance will restore both the configuration files and the repository database to the state they had before deployment of the application. However, performing a restore_repos operation will only restore the repository database and not the configuration files. This leaves the two sets of data mutually inconsistent.

Troubleshooting Configuration Restores

Configuration-only restore operations are not recommended, especially when the backup used for the restore is from an instance backup. Oracle recommends that you perform an instance backup and restore instead.

Troubleshooting Miscellaneous Problems

You may also encounter the following additional problems.

OPMN Fails to Start Oracle Mail Processes

Description

OPMN may have failed to start e-mail processes. This can be seen by checking the status of OPMN processes with the following command.

On Unix and Linux,

ORACLE_HOME/opmn/bin/opmnctl status

On Windows,

ORACLE_HOME\opmn\bin\opmnctl status

Cause

The Oracle Mail SMTP port that is used for the installation is port 25 by default, which conflicts with the SENDMAIL processes. The port number cannot be changed after the installation. Even if you kill the SENDMAIL processes before the installation, if you reboot the hosting machine or send e-mail through the Oracle Collaboration Suite "mail" command, the SENDMAIL processes will be reactivated.

Solution

Check if there are any SENDMAIL processes on your host, and kill them.
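A minimal sketch of the check on UNIX or Linux follows. It assumes that ps -eo pid,comm reports the process name as exactly "sendmail". On some systems the command column shows a full path or a different name, so treat the match as an assumption. The function name sendmail_pids is hypothetical.

```shell
# Hypothetical helper: read "ps -eo pid,comm" output on standard input and
# print the PID of every process whose command name is exactly "sendmail".
sendmail_pids() {
    awk '$2 == "sendmail" { print $1 }'
}
```

For example, ps -eo pid,comm | sendmail_pids lists the PIDs to review before you stop the processes with the operating system kill command.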

OPMN Fails to Start Some Processes on Slow Hosts

Description

OPMN may have failed to start some processes on the Applications tier. This can be seen by checking the status of OPMN processes with the following command.

On Unix and Linux,

ORACLE_HOME/opmn/bin/opmnctl status

On Windows,

ORACLE_HOME\opmn\bin\opmnctl status

Cause

OPMN may fail to start some processes on slow hosts because of default timeout settings. There is a default "start timeout" for each process in the OPMN configuration that defines the maximum time allowed for OPMN to start a particular process, and correspondingly a default "stop timeout" that defines the maximum time allowed for OPMN to stop a process. On a slow host, some processes may not start within the default timeout, in which case OPMN gives up.

Solution

You can increase the default "start timeout" and "stop timeout" in the OPMN configuration file opmn.xml. On Unix and Linux, opmn.xml is under ORACLE_HOME/opmn/conf. On Windows, opmn.xml is under ORACLE_HOME\opmn\conf.

Here is an example of how to increase the default "start timeout" and "stop timeout" in opmn.xml:

Before the change:

<process-type id="messaging_server" module-id="messaging">
              <start timeout="420"/>
              <stop timeout="420"/>
              <process-set id="messaging_gtwy_1000" numprocs="1">
                 <start timeout="420"/>
                 <stop timeout="420"/>
                 <restart timeout="420"/>
              </process-set>
           </process-type>

After the change:

<process-type id="messaging_server" module-id="messaging">
              <start timeout="1800"/>
              <stop timeout="1800"/>
              <process-set id="messaging_gtwy_1000" numprocs="1">
                 <start timeout="1800"/>
                 <stop timeout="1800"/>
                 <restart timeout="420"/>
              </process-set>
           </process-type>

If RTCPM has failed to start, you should add a "start timeout" and a "stop timeout" for RTCPM, because these are not defined in opmn.xml:

<process-type id="rtcpm" module-id="CUSTOM">
              <start timeout="1800"/>
              <stop timeout="1800"/>
              <process-set id="rtcpm" restart-on-death="true" numprocs="1">
              .
              .
              .
              </process-set>
           </process-type>

If OPMN has failed to start some processes, you should also make sure that there are no Oracle Collaboration Suite processes left running on the Applications tier before executing opmnctl startall.
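Before editing opmn.xml, it can be useful to see which start and stop timeouts are still below the value you intend to use. The sketch below only reports the lines and does not modify the file. It assumes the timeouts are written as <start timeout="N"/> and <stop timeout="N"/> on their own lines, as in the examples above. The function name short_timeouts is hypothetical.

```shell
# Hypothetical helper: list the start/stop timeout lines in opmn.xml whose
# value is lower than a given threshold. The file is only read, never changed.
# $1 = path to opmn.xml, $2 = threshold
short_timeouts() {
    awk -v min="$2" '
        match($0, /<(start|stop) timeout="[0-9]+"/) {
            value = substr($0, RSTART, RLENGTH)
            gsub(/[^0-9]/, "", value)     # keep only the digits of the timeout
            if (value + 0 < min + 0)
                print NR ": " $0          # line number, then the original line
        }
    ' "$1"
}
```

For example, short_timeouts "$ORACLE_HOME/opmn/conf/opmn.xml" 1800 shows every start or stop timeout that is still below 1800.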

No File Found In Path Warning

Description

While performing any type of backup operation, you may encounter the following type of warning:

"Warning(s) during backup - please check <backup log file location>"

On an Applications tier backup, you may find the following lines in the indicated log file:

backup_config completed with warning !!! No file found in path: /<ORACLE_HOME>/sysman/emd/collection/*.xml
backup_config completed with warning !!! No file found in path: /<ORACLE_HOME>/ocal/log/
backup_config completed with warning !!! No file found in path: /<ORACLE_HOME>/ocas/linkdb

On an Infrastructure tier backup, you may find the following lines in the indicated log file:

backup_config completed with warning !!! No file found in path: /<ORACLE_HOME>/sso/plugin
backup_config completed with warning !!! No file found in path: /<ORACLE_HOME>/sysman/emd/collection/*.xml

Cause

These directories either are not used, or have not yet been used.

Solution

You can safely ignore warning messages for the specific files listed here. If there are other files not found, you should investigate further.
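To separate the warnings listed here from any others, you could filter the backup log as in the following sketch. It assumes that each warning line ends with the path, as in the examples above, and it treats only the four paths shown in this section as safe to ignore. The function name unexpected_missing_paths is hypothetical.

```shell
# Hypothetical helper: print the paths from "No file found in path" warnings
# in a backup log, leaving out the paths this section says can be ignored.
unexpected_missing_paths() {
    grep 'No file found in path:' "$1" |
        sed -e 's/.*No file found in path: *//' -e 's/[[:space:]]*$//' |
        grep -v -E '/(sysman/emd/collection/\*\.xml|ocal/log/?|ocas/linkdb|sso/plugin)$'
}
```

Any path that this function prints is outside the list above and is worth investigating further.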

Cannot Delete Unwanted Backups

Description

You may want to delete backups that are no longer needed, but cannot do it.

Cause

The Oracle Collaboration Suite Recovery Manager does not support backup deletion.

Solution

To delete configuration backups, you can delete the configuration files from the configuration backup directory using an operating system delete command. To delete database backups, you can use RMAN.
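For configuration backups, a cautious approach is to list the candidate files first and delete them only after review. The sketch below lists regular files in the configuration backup directory that were last modified more than a given number of days ago. It does not delete anything. The function name list_old_config_backups is hypothetical, and choosing files by age is an assumption about your retention policy.

```shell
# Hypothetical helper: list files in the configuration backup directory that
# were last modified more than N days ago. Nothing is deleted.
# $1 = configuration backup directory, $2 = minimum age in days
list_old_config_backups() {
    backup_dir=$1
    min_age_days=$2
    find "$backup_dir" -type f -mtime +"$min_age_days" -print
}
```

After reviewing the list, you can remove the files with the operating system delete command, as described above.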

Receiving restore_config Operation Fails Error

Description

A restore_config operation fails.

Cause

A restore_config operation fails with the following error:

/oracle_home/dcmctl.bat applyarchiveto -archive
2004-11-29_11-23-18 -script
ADMN-906025
Base Exception:
The exception, 100999, occurred at Oracle Application Server instance
"im_1128.stajx14.us.oracle.com"
"See base exception for details."

Solution

Resolve the indicated problem at the Oracle Application Server instance where the problem originated, and then resynchronize the instance. The base exception identifies the underlying problem, for example:

java.lang.Exception: Could not delete file
/oracle_home/j2ee/OC4J_SECURITY/applicationdeployments/wirelesssso\jazn-data.xml. Please check file permissions.
at oracle.security.jazn.smi.JAZNPlugin.commit(Unknown Source)
at oracle.ias.sysmgmt.repository.DcmPlugin.commit(Unknown Source) 

Receiving Missing Files Messages During restore_config Operation

Description

A restore_config operation generates missing file messages.

Cause

During a restore_config operation, you receive messages indicating that files are missing, for example:

Could not copy file /oracle_home/Devkit_1129/testdir/ to
   /oracle_home/Devkit_1129/backup_restore/cfg_bkp/2004-12-01_03-26-22.

Solution

During a restore_config operation, a temporary configuration backup is taken so that, if the restore fails, the temporary backup can be restored, returning the instance to the same state as before the restore.

If some files are deleted (including files and directories specified in config_misc_files.inp) before a restore operation, then during the temporary backup messages are displayed indicating that certain files are missing. These error and warning messages should be ignored since the missing files are restored as part of the restore_config operation.

Failure Due to Loss or Corruption of the opmn.xml File

Description

The loss or corruption of the opmn.xml file is causing a failure.

Cause

The loss or corruption of the opmn.xml file causes the following error:

ADMN-906025 
Base Exception:
The exception, 100999, occurred at Oracle Application Server instance
"J2EE_1123.stada07.us.oracle.com"

Solution

Perform the following steps to restore the opmn.xml file:

  1. Run

    ocs_bkp_restore.sh -m restore_config -t timestamp
    
    
  2. If that command fails, stop the OC4J processes.

  3. Rerun

    ocs_bkp_restore.sh -m restore_config -t timestamp
    
    

Timeout Occurs While Trying to Stop Processes Using the "opmnctl stopall" Command

Description

During backup_instance_cold, backup_instance_cold_incr and restore_instance operations, a timeout may occur while trying to stop processes using the opmnctl stopall command.

Cause

This can occur because of heavy machine load, or because a process is taking a long time to shut down. Under these conditions, you may receive an error message similar to the following:

Oracle Application Server instance backup failed.
Stopping all opmn managed processes ... 

Failure : backup_instance_cold_incr failed 

Unable to stop opmn managed processes !!! 

Solution

Run opmnctl stopall a second time.
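If you script this step on UNIX or Linux, a simple retry wrapper is one option. The following sketch runs a command up to a fixed number of times and stops at the first success. The function name retry is hypothetical, and it retries immediately without waiting between attempts.

```shell
# Hypothetical helper: run a command up to N times, stopping at the first
# success. Returns 0 on success and 1 if every attempt failed.
# $1 = maximum number of attempts, remaining arguments = command to run
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        "$@" && return 0
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example, retry 2 "$ORACLE_HOME/opmn/bin/opmnctl" stopall runs the stop command a second time only if the first attempt fails.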