Oracle® Enterprise Manager Cloud Control Administrator's Guide
12c Release 1 (12.1.0.2)

Part Number E24473-16

28 Backing Up and Recovering Enterprise Manager

Because Enterprise Manager is the monitoring and management framework for your ecosystem, an important part of your high availability strategy is to ensure that it is regularly backed up so that it can be restored in the event of failure.

This chapter covers the following topics:

28.1 Backing Up Your Deployment

Although Enterprise Manager functions as a single entity, technically, it is built on a distributed, multi-tier software architecture composed of the following software components:

Each component, being uniquely different in composition and function, requires different approaches to backup and recovery. For this reason, the backup strategies are discussed on a per-tier basis in this chapter. For an overview of Enterprise Manager architecture, refer to the Oracle® Enterprise Manager Cloud Control Basic Installation Guide.

28.2 Management Repository Backup

The Management Repository is the storage location for all the information collected by the Management Agent. It consists of objects such as database jobs, packages, procedures, views, and tablespaces. Because it is configured in an Oracle Database, the backup and recovery strategies for the Management Repository are essentially the same as those for the Oracle Database. Backup procedures for the database are well established and can be implemented using the RMAN backup utility, which can be accessed via the Cloud Control console.

Management Repository Backup

Oracle recommends using High Availability Best Practices for protecting the Management Repository database against unplanned outages. As such, use the following standard database backup strategies.

Adhering to these strategies will create a full backup and then create incremental backups on each subsequent run. The incremental changes will then be rolled up into the baseline, creating a new full backup baseline.

Using the Recommended Backup Strategy also takes advantage of the capabilities of Enterprise Manager to execute the backups: Jobs will be automatically scheduled through the Job sub-system of Enterprise Manager. The history of the backups will then be available for review and the status of the backup will be displayed on the repository database target home page. This backup job along with archiving and flashback technologies will provide a restore point in the event of the loss of any part of the repository. This type of backup, along with archive and online logs, allows the repository to be recovered to the last completed transaction.

You can view when the last repository backup occurred on the Management Services and Repository Overview page under the Repository details section.

A thorough summary of how to configure backups using Enterprise Manager is available in the Oracle Database 2 Day DBA guide. For additional information on Database high availability best practices, review the Oracle Database High Availability Best Practices documentation.

28.3 Oracle Management Service Backup

The Oracle Management Service (OMS) orchestrates with Management Agents to discover targets, monitor and manage them, and store the collected information in a repository for future reference and analysis. The OMS also renders the Web interface for the Enterprise Manager console.

Backing Up the OMS

The OMS is generally stateless. Some configuration data is stored on the OMS file system.

A snapshot of OMS configuration can be taken using the emctl exportconfig oms command.

$ <OMS_HOME>/bin/emctl exportconfig oms [-sysman_pwd <sysman password>]
[-dir <backup dir>] Specify directory to store backup file
[-keep_host] Specify this parameter if the OMS was installed using a virtual hostname (using
ORACLE_HOSTNAME=<virtual_hostname>)

Running exportconfig captures a snapshot of the OMS at a given point in time, thus allowing you to back up the most recent OMS configuration on a regular basis. exportconfig should always be run on the OMS running the WebLogic Admin Server. If required, the most recent snapshot can then be restored on a fresh OMS installation on the same or different host.
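Regular snapshots can be scripted. The following is a minimal sketch, not an Oracle-supplied script: the function name, the default OMS home path, and the dated directory layout are illustrative assumptions, and only the emctl exportconfig oms command and its -dir option come from the syntax above. The sketch prints the command line instead of running it, so it can be reviewed before it is handed to a scheduler.

```shell
#!/bin/sh
# Sketch only: build (but do not run) an exportconfig command line.
# The OMS home and backup root below are placeholders for your environment.

build_exportconfig_cmd() {
    oms_home=$1      # OMS Oracle Home (hypothetical path supplied by the caller)
    backup_root=$2   # parent directory for snapshots (assumption)
    stamp=$3         # date stamp, for example the output of: date +%Y%m%d
    printf '%s/bin/emctl exportconfig oms -dir %s/%s\n' \
        "$oms_home" "$backup_root" "$stamp"
}

# Example: print the command for today's snapshot.
build_exportconfig_cmd "${OMS_HOME:-/u01/app/oracle/Middleware/oms}" \
    /backup/oms "$(date +%Y%m%d)"
```

The optional -sysman_pwd argument is left out so that no password is written into the script; how the password is supplied in an unattended run is left open here.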

Backup strategies for the OMS components are as follows:

28.4 Management Agent Backup

The Management Agent is an integral software component that is deployed on each monitored host. It is responsible for monitoring all the targets running on that host, communicating that information to the middle-tier OMS, and managing and maintaining the host and its targets.

Backing Up Management Agents

There are no special considerations for backing up Management Agents. As a best practice, reference Management Agent installs should be maintained for different platforms and kept up-to-date in terms of customizations in the emd.properties file and patches applied. Use Deployment options from the Cloud Control console to install and maintain reference Agent installs.

If a Management Agent is lost, it should be reinstalled by cloning from a reference install.

28.5 Recovery of Failed Enterprise Manager Components

Recovering Enterprise Manager means restoring any of the three fundamental components of the Enterprise Manager architecture.

28.5.1 Repository Recovery

Recovery of the Repository database must be performed using RMAN since Cloud Control will not be available when the repository database is down. There are two recovery cases to consider:

  • Full Recovery: No special consideration is required for Enterprise Manager.

  • Point-in-Time/Incomplete Recovery: Recovered repository may be out of sync with Agents because of lost transactions. In this situation, some metrics may show up incorrectly in the Cloud Control console unless the repository is synchronized with the latest state available on the Agents.

A repository resync feature allows you to automate the process of synchronizing the Enterprise Manager repository with the latest state available on the Management Agents.

To resynchronize the repository with the Management Agents, use the resync repos command of the Enterprise Manager command-line utility (emctl):

emctl resync repos -full -name "<descriptive name for the operation>"

You must run this command from the OMS Oracle Home AFTER restoring the Management Repository, but BEFORE starting the OMS. After submitting the command, start up all OMS instances and monitor the progress of repository resynchronization from the Enterprise Manager console's Repository Resynchronization page, as shown in the following figure.

Figure 28-1 Repository Synchronization Page


Management Repository recovery is complete when the resynchronization jobs complete on all Management Agents.

Oracle strongly recommends that the Management Repository database be run in archivelog mode so that in case of failure, the database can be recovered to the latest transaction. If the database cannot be recovered to the last transaction, Repository Synchronization can be used to restore monitoring capabilities for targets that existed when the last backup was taken. Actions taken after the backup will not be recovered automatically. Some examples of actions that will not be recovered automatically by Repository Synchronization are:

  • Incident Rules

  • Preferred Credentials

  • Groups, Services, Systems

  • Jobs/Deployment Procedures

  • Custom Reports

  • New Agents

28.5.2 Recovery Scenarios

A prerequisite for repository (or any database) recovery is to have a valid, consistent backup of the repository. Using Enterprise Manager to automate the backup process ensures regular, up-to-date backups are always available if repository recovery is ever required. Recovery Manager (RMAN) is a utility that backs up, restores, and recovers Oracle Databases. The RMAN recovery job syntax should be saved to a safe location. This allows you to perform a complete recovery of the Enterprise Manager repository database. In its simplest form, the syntax appears as follows:

run {
    restore database;
    recover database;
}

Actual syntax will vary in length and complexity depending on your environment. For more information on extracting syntax from an RMAN backup and recovery job, or using RMAN in general, see the Oracle Database Backup and Recovery Advanced User's Guide.
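One way to keep the recovery syntax in a safe location is to store it as an RMAN command file. The sketch below is illustrative only: the function name and file path are assumptions, and the file holds just the simplest form shown above. The rman invocation in the usage comment assumes operating-system authentication on the repository host.

```shell
#!/bin/sh
# Sketch only: keep the simplest recovery syntax in a command file.
# Real recovery scripts are usually longer and environment-specific.

write_recovery_script() {
    target=$1   # path of the command file to create (assumption)
    cat > "$target" <<'EOF'
run {
  restore database;
  recover database;
}
EOF
}

# Usage (hypothetical path):
#   write_recovery_script /secure/location/em_repos_recover.rman
#   rman target / cmdfile=/secure/location/em_repos_recover.rman
```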

The following scenarios illustrate various repository recovery situations along with the recovery steps.

28.5.2.1 Full Recovery on the Same Host

Repository database is running in archivelog mode. Recent backup, archive log files and redo logs are available. The repository database disk crashes. All datafiles and control files are lost.

Resolution:

  1. Stop all OMS instances using emctl stop oms.

  2. Recover the database using RMAN.

  3. Bring the site up using the command emctl start oms on all OMS instances.

  4. Verify that the site is fully operational.

28.5.2.2 Incomplete Recovery on the Same Host

Repository database is running in noarchivelog mode. Full offline backup is available. The repository database disk crashes. All datafiles and control files are lost.

Resolution:

  1. Stop the OMS instances using emctl stop oms.

  2. Recover the database using RMAN.

  3. Initiate Repository Resync using emctl resync repos -full -name "<resync name>" from one of the OMS Oracle Homes.

  4. Start the OMS instances using emctl start oms.

  5. Log into Cloud Control. Navigate to the Management Services and Repository Overview page. Click Repository Synchronization under Related Links. Monitor the status of resync jobs. Resubmit failed jobs, if any, after fixing the error.

  6. Verify that the site is fully operational.
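The ordering in the steps above matters: the resync is submitted after the restore but before the OMS is started. As a hedged illustration, the following sketch prints the commands in that order without executing anything. The function name and the OMS home path are assumptions, the RMAN step is left as a comment because it is site-specific, and the stop and start commands must be run on every OMS instance.

```shell
#!/bin/sh
# Sketch only: print the command order for an incomplete recovery.
# Nothing is executed; review the output and run each step by hand.

plan_incomplete_recovery() {
    oms_home=$1      # OMS Oracle Home (placeholder)
    resync_name=$2   # descriptive name for the resync operation
    echo "$oms_home/bin/emctl stop oms"
    echo "# restore the repository database with RMAN here"
    echo "$oms_home/bin/emctl resync repos -full -name \"$resync_name\""
    echo "$oms_home/bin/emctl start oms"
}

plan_incomplete_recovery /u01/app/oracle/Middleware/oms "resync after restore"
```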

28.5.2.3 Full Recovery on a Different Host

The Management Repository database is running on host "A" in archivelog mode. Recent backup, archive log files and redo logs are available. The repository database crashes. All datafiles and control files are lost.

Resolution:

  1. Stop the OMS instances using the command emctl stop oms.

  2. Recover the database using RMAN on a different host (host "B").

  3. Correct the connect descriptor for the repository by running the following command on each OMS.

    $ emctl config oms -store_repos_details -repos_conndesc <connect descriptor> -repos_user sysman
    
  4. Start the OMS instances using the command emctl start oms.

  5. Relocate the Management Repository database target to the Agent running on host "B" by running the following command from the OMS:

    $ emctl config repos -host <hostB> -oh <OH of repository on hostB> -conn_desc "<TNS connect descriptor>"
    

    Note:

    This command can only be used to relocate the repository database under the following conditions:
    • An Agent is already running on this machine.

    • No database on host "B" has been discovered.

  6. Change the monitoring configuration for the OMS and Repository target by running the following command from the OMS:

    $ emctl config emrep -conn_desc "<TNS connect descriptor>"
    
  7. Verify that the site is fully operational.
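Several of the commands above take a TNS connect descriptor for the relocated repository. The following sketch assembles one in the usual Oracle Net format and prints an example command. The function name, host name, port, and SID are placeholders, and the use of SID rather than SERVICE_NAME is an assumption that may not match your site.

```shell
#!/bin/sh
# Sketch only: assemble a TNS connect descriptor for the relocated repository.
# Host, port, and SID values are placeholders supplied by the caller.

build_conn_desc() {
    host=$1
    port=$2
    sid=$3
    printf '(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=%s)(PORT=%s)))(CONNECT_DATA=(SID=%s)))\n' \
        "$host" "$port" "$sid"
}

# Example: print (not run) the emrep reconfiguration command.
desc=$(build_conn_desc hostB.example.com 1521 emrep)
echo "emctl config emrep -conn_desc \"$desc\""
```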

28.5.2.4 Incomplete Recovery on a Different Host

The Management Repository database is running on host "A" in noarchivelog mode. Full offline backup is available. Host "A" is lost due to hardware failure. All datafiles and control files are lost.

Resolution:

  1. Stop the OMS instances using emctl stop oms.

  2. Recover the database using RMAN on a different host (host "B").

  3. Correct the connect descriptor for the repository in credential store.

    $ emctl config oms -store_repos_details -repos_conndesc <connect descriptor> -repos_user sysman
    
  4. Initiate Repository Resync:

    $ emctl resync repos -full -name "<resync name>"

    from one of the OMS Oracle Homes.

  5. Start the OMS using the command emctl start oms.

  6. Run the command to relocate the repository database target to the Management Agent running on host "B":

    $ emctl config repos -agent <agent on host B> -host <hostB> -oh <OH of repository on hostB> -conn_desc "<TNS connect descriptor>"

  7. Run the command to change monitoring configuration for the OMS and Repository target:

    emctl config emrep -conn_desc "<TNS connect descriptor>"

  8. Log in to Cloud Control. Navigate to Management Services and Repository Overview page.

  9. Click Repository Synchronization under Related Links. Monitor the status of resync jobs. Resubmit failed jobs, if any, after fixing the error mentioned.

  10. Verify that the site is fully operational.

28.5.3 Recovering the OMS

If an Oracle Management Service instance is lost, recovering it essentially consists of two steps: Recovering the Software Homes, then configuring the Instance Home.

When restoring on the same host, the software homes can be restored from filesystem backup. If a backup does not exist, or if you are installing to a different host, the software homes can be reconstructed using the “Install Software Only” option from the Cloud Control software distribution. Take care to select and install all Management Plug-ins that existed in your environment prior to the crash.

The following SQL command can be run against the repository database as the “sysman” user to obtain the list of plug-ins already deployed:

SELECT epv.display_name, epv.plugin_id, epv.version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ( 'BUILT_IN_TARGET_TYPE' , 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id;

Ensure that you select the same set of plug-ins as the ones on the first OMS.

If you had installed additional plug-ins that were not part of the Enterprise Manager Cloud Control software you used for installing the first OMS, then follow these steps:

  1. Connect to the Management Repository and run the following SQL query to retrieve a list of plug-ins installed:

    SELECT epv.plugin_id, epv.version, epv.rev_version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ('BUILT_IN_TARGET_TYPE', 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id
    
  2. Make a note of the additional plug-ins you installed.

  3. Download the additional plug-ins from the following URL:

    http://www.oracle.com/technetwork/oem/grid-control/downloads/oem-upgrade-console-502238.html

  4. Invoke the installer with the -pluginLocation argument, and pass the absolute path to the directory where you downloaded the additional plug-ins:

    runInstaller -pluginLocation <absolute_path_to_plugin_dir>
    

Note that some plug-ins might not have shipped with Cloud Control and might not be present in the install media. Running the SQL query above returns a list of required plug-ins. These plug-ins should be downloaded from OTN and installed as a software-only install. You must then run the PluginInstall.sh script (OMS_HOME/sysman/install/PluginInstall.sh) to install plug-ins. Once the plug-ins have been installed, you can perform the OMS recovery.

Note:

Recovery will fail if all required plug-ins have not been installed.

After running the installer in software-only mode, all patches that were installed prior to the crash must be re-applied. Assuming the Management Repository is intact, the post-scripts that run SQL against the repository can be skipped as the repository already has those patches applied.

As stated earlier, the location of the OMS Oracle Home is fixed and cannot be changed. Hence, ensure that the OMS Oracle Home is restored in the same location that was used previously.

Once the Software Homes are recovered, the instance home can be reconstructed using the omsca command in recovery mode:

omsca recover -as -ms -nostart -backup_file <exportconfig file>

Use the export file generated by the emctl exportconfig command shown in the previous section.
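Because recovery should use the most recent export file, it can help to select that file programmatically. The sketch below picks the newest regular file in a backup directory by modification time. The function name and directory are hypothetical, and it assumes the directory contains nothing but exportconfig output.

```shell
#!/bin/sh
# Sketch only: find the newest file in the exportconfig backup directory.
# Assumes the directory holds nothing but exportconfig output.

latest_backup_file() {
    dir=$1      # directory that was passed to exportconfig -dir (placeholder)
    newest=
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # -nt is true when $f was modified more recently than $newest
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage (prints the command rather than running it):
#   echo "omsca recover -as -ms -nostart -backup_file $(latest_backup_file /backup/oms)"
```

The function prints nothing and returns a non-zero status when the directory is empty, so a calling script can stop before invoking omsca.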

28.5.4 OMS Recovery Scenarios

The following scenarios illustrate various OMS recovery situations along with the recovery steps.

Important:

A prerequisite for OMS recovery is to have recent, valid OMS configuration backups available. Oracle recommends that you back up the OMS using the emctl exportconfig oms command whenever an OMS configuration change is made. This command must be run on the primary OMS running the WebLogic AdminServer.

Alternatively, you can run this command on a regular basis using the Enterprise Manager Job system.

Each of the following scenarios covers the recovery of the software homes using either a filesystem backup (when available and only when recovering to the same host) or the Software only option from the installer. In either case, the best practice is to recover the instance home (gc_inst) using the omsca recover command, rather than from a filesystem backup. This guarantees that the instance home is valid and up to date.

28.5.4.1 Single OMS, No Server Load Balancer (SLB), OMS Restored on the same Host

Site hosts a single OMS. No SLB is present. The OMS configuration was backed up using the emctl exportconfig oms command on the primary OMS running the AdminServer. The OMS Oracle Home is lost.

Resolution:

  1. Perform cleanup on failed OMS host.

    Make sure there are no processes still running from the Middleware home using a command similar to the following:

    ps -ef | grep -i -P "(Middleware|gc_inst)" | grep -v grep | awk '{print $2}' | xargs kill -9
    

    Note:

    Change Middleware|gc_inst to strings that match your own middleware and instance homes.

    If recovering the software homes using the software only install method, first de-install the existing Oracle Homes using the Cloud Control software distribution installer. This is required even if the software homes are no longer available as it is necessary to remove any record of the lost Oracle Homes from the Oracle inventory.

    If they exist, remove the 'Middleware' and 'gc_inst' directories.

  2. Ensure that software library locations are still accessible and valid. If a Software library is accessible but corrupt, it will affect OMSCA recovery.

  3. Restore the Software Homes.

    If restoring from a filesystem backup, delete the file OMS_HOME/sysman/config/emInstanceMapping.properties and any gc_inst directory that may have been restored, if they exist.

    Alternatively, if a backup does not exist, use the software only install method to reconstruct the software homes:

    1. Select the 'Install Software Only' option from the 'Install Types' step page within the Cloud Control software installer.

    2. Ensure all previously deployed plug-ins are selected on the 'Select Plug-ins' step page.

      It is possible to determine which plug-ins were deployed previously by running the following SQL against the repository database:

      SELECT epv.display_name, epv.plugin_id, epv.version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ( 'BUILT_IN_TARGET_TYPE' , 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id;
      

      Note:

      At the end of the Software only installation, do NOT run ConfigureGC.sh when told to do so by the installer. This step should only be performed as part of a fresh install, not as part of a recovery operation.

      Ensure that you select the same set of plug-ins as the ones on the first OMS.

      If you had installed additional plug-ins that were not part of the Enterprise Manager Cloud Control software you used for installing the first OMS, then follow these steps:

      Step 1: Connect to the Management Repository and run the following SQL query to retrieve a list of plug-ins installed:

      SELECT epv.plugin_id, epv.version, epv.rev_version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ('BUILT_IN_TARGET_TYPE', 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id
      

      Step 2: Make a note of the additional plug-ins you installed.

      Step 3: Download the additional plug-ins from the following URL:

      http://www.oracle.com/technetwork/oem/grid-control/downloads/oem-upgrade-console-502238.html

      Step 4: Invoke the installer with the -pluginLocation argument, and pass the absolute path to the directory where you downloaded the additional plug-ins:

      runInstaller -pluginLocation <absolute_path_to_plugin_dir>
      
    3. Apply any patches that were previously applied to the OMS software homes.

  4. Run omsca in recovery mode specifying the export file taken earlier to configure the OMS:

    <OMS_HOME>/bin/omsca recover -as -ms -nostart -backup_file <exportconfig file>
    

    Note:

    The -backup_file to be passed must be the latest file generated from emctl exportconfig oms command.
  5. Start the OMS.

    <OMS_HOME>/bin/emctl start oms
    
  6. Recover the Agent (if necessary).

    If the Management Agent Software Home was recovered along with the OMS Software Homes (as is likely in a single OMS install recovery where the Management Agent and agent_inst directories are commonly under the Middleware home), the Management Agent instance directory should be recreated to ensure consistency between the Management Agent and OMS.

    1. Remove the agent_inst directory if it was restored from backup.

    2. Use agentDeploy.sh to configure the agent:

      <AGENT_HOME>/core/12.1.0.0.0/sysman/install/agentDeploy.sh AGENT_BASE_DIR=<AGENT_BASE_DIR> AGENT_INSTANCE_HOME=<AGENT_INSTANCE_HOME> ORACLE_HOSTNAME=<AGENT_HOSTNAME> AGENT_PORT=<AGENT_PORT> -configOnly OMS_HOST=<oms host> EM_UPLOAD_PORT=<OMS_UPLOAD_PORT> AGENT_REGISTRATION_PASSWORD=<REG_PASSWORD>
      
    3. The OMS automatically blocks the Management Agent. Resync the Management Agent from the Management Agent homepage.

    If the Management Agent software home was not recovered along with the OMS but the Agent still needs to be recovered, follow the instructions in section Agent Reinstall Using the Same Port.

    Note:

    This is only likely to be needed in the case where a filesystem recovery has been performed that did not include a backup of the Agent software homes. If the OMS software homes were recovered using the Software only install method, this step will not be required because a Software only install installs an Agent software home under the Middleware home.
  7. Verify that the site is fully operational.
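The cleanup command in step 1 above sends kill -9 to every matching process in one pass. A more cautious variation, sketched below, only lists the candidate PIDs so they can be reviewed first. The function name is hypothetical, and grep -E is used here in place of grep -P as a portability choice; the pattern should be changed to match your own middleware and instance homes.

```shell
#!/bin/sh
# Sketch only: list candidate PIDs instead of killing them immediately.
# Reads `ps -ef` style lines on stdin; the pattern is an extended regex.

matching_pids() {
    pattern=$1   # for example 'Middleware|gc_inst' (adjust to your homes)
    # Field 2 of ps -ef output is the PID.
    grep -i -E "$pattern" | grep -v grep | awk '{print $2}'
}

# Usage: review the list before running any kill command.
#   ps -ef | matching_pids 'Middleware|gc_inst'
```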

28.5.4.2 Single OMS, No SLB, OMS Restored on a Different Host

Site hosts a single OMS. The OMS is running on host "A." No SLB is present. The OMS configuration was backed up using the emctl exportconfig oms command. Host "A" is lost.

Resolution:

  1. Ensure that software library locations are accessible from “Host B”.

  2. Restore the software homes on “Host B”.

    Oracle does not support restoring OMS Oracle Homes from filesystem backup across different hosts. Use the software-only install method to reconstruct the software homes:

    1. Select the 'Install Software Only' option from the 'Install Types' step page within the Cloud Control software installer.

    2. Ensure all previously deployed plug-ins are selected on the 'Select Plug-ins' step page.

      It is possible to determine which plug-ins were deployed previously by running the following SQL against the repository database:

      SELECT epv.display_name, epv.plugin_id, epv.version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ( 'BUILT_IN_TARGET_TYPE' , 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id;
      

      Note:

      At the end of the Software only installation, do NOT run ConfigureGC.sh when told to do so by the installer. This step should only be performed as part of a fresh install, not as part of a recovery operation.

      Ensure that you select the same set of plug-ins as the ones on the first OMS.

      If you had installed additional plug-ins that were not part of the Enterprise Manager Cloud Control software you used for installing the first OMS, then follow these steps:

      Step 1: Connect to the Management Repository and run the following SQL query to retrieve a list of plug-ins installed:

      SELECT epv.plugin_id, epv.version, epv.rev_version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ('BUILT_IN_TARGET_TYPE', 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id
      

      Step 2: Make a note of the additional plug-ins you installed.

      Step 3: Download the additional plug-ins from the following URL:

      http://www.oracle.com/technetwork/oem/grid-control/downloads/oem-upgrade-console-502238.html

      Step 4: Invoke the installer with the -pluginLocation argument, and pass the absolute path to the directory where you downloaded the additional plug-ins:

      runInstaller -pluginLocation <absolute_path_to_plugin_dir>
      
    3. Apply any patches that were previously applied to the OMS software homes.

  3. Run omsca in recovery mode specifying the export file taken earlier to configure the OMS:

    <OMS_HOME>/bin/omsca recover -as -ms -nostart -backup_file <exportconfig file>
    

    Note:

    The -backup_file to be passed must be the latest file generated from emctl exportconfig oms command.
  4. Start the OMS.

    <OMS_HOME>/bin/emctl start oms
    

  5. Configure the Agent.

    An agent is installed as part of the Software only install and needs to be configured using the agentDeploy.sh command:

    <AGENT_HOME>/core/12.1.0.0.0/sysman/install/agentDeploy.sh AGENT_BASE_DIR=<AGENT_BASE_DIR> AGENT_INSTANCE_HOME=<AGENT_INSTANCE_HOME> ORACLE_HOSTNAME=<AGENT_HOSTNAME> AGENT_PORT=<AGENT_PORT> -configOnly OMS_HOST=<oms host> EM_UPLOAD_PORT=<OMS_UPLOAD_PORT> AGENT_REGISTRATION_PASSWORD=<REG_PASSWORD>
    

    The OMS automatically blocks the Management Agent. Resync the Management Agent from the Management Agent homepage.

  6. Relocate the oracle_emrep target to the Management Agent of the new OMS host using the following commands:

    <OMS_HOME>/bin/emcli login -username=sysman
    <OMS_HOME>/bin/emcli sync
    <OMS_HOME>/bin/emctl config emrep -agent <agent on host "B", e.g. myNewOMSHost.example.com:3872>
    
  7. In the Cloud Control console, locate the 'WebLogic Domain' target for the Cloud Control Domain. Go to 'Monitoring Credentials' and update the adminserver host to host B. Then perform a Refresh WebLogic Domain to reconfigure the domain with the new hosts.

  8. Locate duplicate targets from the Management Services and Repository Overview page of the Enterprise Manager console. Click the Duplicate Targets link to access the Duplicate Targets page. To resolve duplicate target errors, the duplicate target must be renamed on the conflicting Agent. Relocate duplicate targets from Agent "A" to Agent "B".

  9. Change the OMS to which all Management Agents point and then resecure all Agents.

    Because the new machine is using a different hostname from the one originally hosting the OMS, all Agents in your monitored environment must be told where to find the new OMS. On each Management Agent, run the following command:

    <AGENT_INST_DIR>/bin/emctl secure agent -emdWalletSrcUrl "http://hostB:<http_port>/em"
    
  10. Assuming the original OMS host is no longer in use, remove the Host target (including all remaining monitored targets) from Cloud Control by selecting the host on the Targets > Hosts page and clicking 'Remove'. You will be presented with an error that informs you to remove all monitored targets first. Remove those targets then repeat the step to remove the Host target successfully.

  11. Verify that the site is fully operational.
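Step 9 above has to be repeated on every Management Agent, which is tedious in a large site. As a rough sketch, the following function reads Agent host names from standard input and prints the re-secure command for each. The function name and the host list are hypothetical, and it assumes the agent instance directory is the same on every host. The commands are only printed, so how they are delivered to each host is left to your own tooling.

```shell
#!/bin/sh
# Sketch only: print the re-secure command for each Agent host listed on stdin.
# Nothing is executed on any host.

print_secure_cmds() {
    agent_inst=$1   # agent instance directory (placeholder, assumed identical everywhere)
    oms_url=$2      # for example http://hostB:<http_port>/em
    while IFS= read -r host; do
        [ -n "$host" ] || continue   # skip blank lines in the host list
        printf '%s: %s/bin/emctl secure agent -emdWalletSrcUrl "%s"\n' \
            "$host" "$agent_inst" "$oms_url"
    done
}

# Usage (agent_hosts.txt is a hypothetical file with one host name per line):
#   print_secure_cmds /u01/agent_inst "http://hostB:<http_port>/em" < agent_hosts.txt
```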

28.5.4.3 Single OMS, No SLB, OMS Restored on a Different Host using the Original Hostname

Site hosts a single OMS. The OMS is running on host "A." No SLB is present. The OMS configuration was backed up using the emctl exportconfig oms command. Host "A" is lost. Recovery is to be performed on “Host B” but retaining the use of “Hostname A”.

Resolution:

  1. Ensure that the software library location is accessible from Host "B".

    Oracle does not support restoring OMS Oracle Homes from filesystem backup across different hosts. Use the software-only install method to reconstruct the software homes:

    1. Select the 'Install Software Only' option from the 'Install Types' step page within the Cloud Control software installer.

    2. Ensure all previously deployed plug-ins are selected on the 'Select Plug-ins' step page.

      It is possible to determine which plug-ins were deployed previously by running the following SQL against the Management Repository database:

      SELECT epv.display_name, epv.plugin_id, epv.version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ( 'BUILT_IN_TARGET_TYPE' , 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id; 
      

      Note:

      At the end of the Software only installation, do NOT run ConfigureGC.sh when told to do so by the installer. This step should only be performed as part of a fresh install, not as part of a recovery operation.

      Ensure that you select the same set of plug-ins as the ones on the first OMS.

      If you had installed additional plug-ins that were not part of the Enterprise Manager Cloud Control software you used for installing the first OMS, then follow these steps:

      Step 1: Connect to the Management Repository and run the following SQL query to retrieve a list of plug-ins installed:

      SELECT epv.plugin_id, epv.version, epv.rev_version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ('BUILT_IN_TARGET_TYPE', 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id
      

      Step 2: Make a note of the additional plug-ins you installed.

      Step 3: Download the additional plug-ins from the following URL:

      http://www.oracle.com/technetwork/oem/grid-control/downloads/oem-upgrade-console-502238.html

      Step 4: Invoke the installer with the -pluginLocation argument, and pass the absolute path to the directory where you downloaded the additional plug-ins:

      runInstaller -pluginLocation <absolute_path_to_plugin_dir>
      
    3. Apply any patches that were previously applied to the OMS software homes.

  2. Modify the network configuration such that “Host B” also responds to hostname of “Host A”. Specific instructions on how to configure this are beyond the scope of this document. However, some general configuration suggestions are:

    Modify your DNS server such that both “Hostname B” and “Hostname A” network addresses resolve to the physical IP of “Host B”.

    Multi-home “Host B”. Configure an additional IP on “Host B” for the IP address that “Hostname A” resolves to. For example, on “Host B” run the following commands:

    ifconfig eth0:1 <IP assigned to “Hostname A”> netmask <netmask>
    /sbin/arping -q -U -c 3 -I eth0 <IP of HostA>
    
  3. Run omsca in recovery mode specifying the export file taken earlier to configure the OMS:

    <OMS_HOME>/bin/omsca recover -as -ms -nostart -backup_file <exportconfig file> -AS_HOST <hostA> -EM_INSTANCE_HOST <hostA>
    

    Note:

    The -backup_file to be passed must be the latest file generated from emctl exportconfig oms command.
  4. Start the OMS.

    <OMS_HOME>/bin/emctl start oms
    
  5. Configure the agent.

    An agent is installed as part of the Software only install and needs to be configured using the agentDeploy.sh command:

    <AGENT_HOME>/core/12.1.0.0.0/sysman/install/agentDeploy.sh AGENT_BASE_DIR=<AGENT_BASE_DIR> AGENT_INSTANCE_HOME=<AGENT_INSTANCE_HOME> ORACLE_HOSTNAME=<AGENT_HOSTNAME> AGENT_PORT=<AGENT_PORT> -configOnly OMS_HOST=<oms host> EM_UPLOAD_PORT=<OMS_UPLOAD_PORT> AGENT_REGISTRATION_PASSWORD=<REG_PASSWORD>
    
  6. The OMS automatically blocks the Management Agent. Resync the Management Agent from the Management Agent homepage.

    Run the command to relocate Management Services and Management Repository target to Management Agent "B":

    emctl config emrep -agent <agent on host B>
    
  7. In the Cloud Control console, locate the 'WebLogic Domain' target for the Cloud Control Domain. Go to 'Monitoring Credentials' and update the adminserver host to host B. Then run Refresh WebLogic Domain to reconfigure the domain with the new hosts.

  8. Locate duplicate targets from the Management Services and Repository Overview page of the Enterprise Manager console. Click the Duplicate Targets link to access the Duplicate Targets page. To resolve duplicate target errors, the duplicate target must be renamed on the conflicting Management Agent. Relocate duplicate targets from Management Agent "A" to Management Agent "B".

  9. Verify that the site is fully operational.

28.5.4.4 Multiple OMS, Server Load Balancer, Primary OMS Recovered on the Same Host

Site hosts multiple OMS instances. All OMS instances are fronted by a Server Load Balancer. OMS configuration backed up using the emctl exportconfig oms command on the primary OMS running the WebLogic AdminServer. The primary OMS is lost.

Resolution:

  1. Perform a cleanup on the failed OMS host.

    Make sure there are no processes still running from the Middleware home using a command similar to the following:

    ps -ef | grep -i -P "(Middleware|gc_inst)" | grep -v grep | awk '{print $2}' | xargs kill -9
    

    Note:

    Change Middleware|gc_inst to strings that match your own middleware and instance homes.

    If recovering the software homes using the software only install method, first de-install the existing Oracle Homes using the Cloud Control software distribution installer. This is required even if the software homes are no longer available as it is necessary to remove any record of the lost Oracle Homes from the Oracle inventory.

    If they exist, remove the 'Middleware' and 'gc_inst' directories.

  2. Ensure that software library locations are still accessible.

  3. Restore the software homes.

    If restoring from a filesystem backup, delete the file <OMS_HOME>/sysman/config/emInstanceMapping.properties and any gc_inst directory that may have been restored, if they exist. Alternatively, if a backup does not exist, use the software only install method to reconstruct the software homes:

    1. Select the 'Install Software Only' option from the 'Install Types' step page within the Cloud Control software installer.

    2. Ensure all previously deployed plug-ins are selected on the 'Select Plug-ins' step page.

      To determine which plug-ins were previously deployed, run the following SQL against the Management Repository database:

      SELECT epv.display_name, epv.plugin_id, epv.version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ( 'BUILT_IN_TARGET_TYPE' , 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id;
      

      Note:

      At the end of the Software only installation, do NOT run ConfigureGC.sh when told to do so by the installer. This step should only be performed as part of a fresh install, not as part of a recovery operation.
    3. Apply any patches that were previously applied to the OMS software homes.

  4. Run omsca in recovery mode specifying the export file taken earlier to configure the OMS:

    <OMS_HOME>/bin/omsca recover -as -ms -nostart -backup_file <exportconfig file>
    

    Note:

    The file passed to -backup_file must be the latest one generated by the emctl exportconfig oms command.
  5. Start the OMS.

    <OMS_HOME>/bin/emctl start oms
    
  6. Recover the Management Agent.

    If the Management Agent software home was recovered along with the OMS software homes (as is likely in a Primary OMS install recovery where the agent and agent_inst directories are commonly under the Middleware home), the Management Agent instance directory should be recreated to ensure consistency between the Management Agent and OMS.

    1. Remove the agent_inst directory if it was restored from backup.

    2. Use agentDeploy.sh to configure the Management Agent:

      <AGENT_HOME>/core/12.1.0.0.0/sysman/install/agentDeploy.sh AGENT_BASE_DIR=<AGENT_BASE_DIR> AGENT_INSTANCE_HOME=<AGENT_INSTANCE_HOME> ORACLE_HOSTNAME=<AGENT_HOSTNAME> AGENT_PORT=<AGENT_PORT> -configOnly OMS_HOST=<oms host> EM_UPLOAD_PORT=<OMS_UPLOAD_PORT> AGENT_REGISTRATION_PASSWORD=<REG_PASSWORD>
      
    3. The OMS automatically blocks the Management Agent. Resync the Management Agent from the Management Agent homepage.

    If the Management Agent software home was not recovered along with the OMS but the Management Agent still needs to be recovered, follow the instructions in Section 28.5.6.1, "Management Agent Reinstall Using the Same Port."

    Note:

    This is likely to be needed only when a filesystem recovery has been performed that did not include a backup of the Management Agent software homes. If the OMS software homes were recovered using the Software only install method, this step is not required because a Software only install installs a Management Agent software home under the Middleware home.
  7. Re-enroll the additional OMS, if any, with the recovered Administration Server by running emctl enroll oms on each additional OMS.

  8. Verify that the site is fully operational.
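
Because the cleanup command in step 1 ends in kill -9, it can help to preview which processes the pattern matches before terminating anything. The following is a minimal sketch and not part of the Oracle tooling: matching_pids is a hypothetical helper name, the sample process list is invented for illustration, and the pattern is the same example string used in step 1, which you should change to match your own middleware and instance homes.

```shell
# Hypothetical helper (not an Oracle utility): read "PID COMMAND..." lines
# on stdin and print the PID of every line that matches the given extended
# regular expression, ignoring case.
matching_pids() {
    awk -v pat="$1" 'tolower($0) ~ tolower(pat) { print $1 }'
}

# Demonstration on an invented process list (paths are sample values only).
printf '%s\n' \
    '101 /u01/app/Middleware/oms/bin/java' \
    '202 /usr/sbin/sshd' \
    '303 /u01/app/gc_inst/em/EMGC_OMS1' |
    matching_pids '(Middleware|gc_inst)'
# prints 101 and 303, one per line

# On a real host, take the snapshot first so the filter is not in the list:
#   snapshot=$(ps -e -o pid= -o args=)
#   printf '%s\n' "$snapshot" | matching_pids '(Middleware|gc_inst)'
```

Only after reviewing the previewed PIDs would you pass them to kill. Filtering a saved snapshot also removes the need for the grep -v grep step.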

28.5.4.5 Multiple OMS, Server Load Balancer Configured, Primary OMS Recovered on a Different Host

Site hosts multiple OMS instances. OMS instances fronted by a Server Load Balancer. OMS Configuration backed up using emctl exportconfig oms command. Primary OMS on host "A" is lost and needs to be recovered on Host "B".

  1. If necessary, perform cleanup on failed OMS host.

    Make sure there are no processes still running from the Middleware home using a command similar to the following:

    ps -ef | grep -i -P "(Middleware|gc_inst)" | grep -v grep | awk '{print $2}' | xargs kill -9
    
  2. Ensure that software library locations are accessible from “Host B”.

  3. Restore the software homes on “Host B”.

    Oracle does not support restoring OMS Oracle Homes from filesystem backup across different hosts. Use the software-only install method to reconstruct the software homes:

    1. Select the 'Install Software Only' option from the 'Install Types' step page within the Cloud Control software installer.

    2. Ensure all previously deployed plug-ins are selected on the 'Select Plug-ins' step page.

      To determine which plug-ins were previously deployed, run the following SQL against the Management Repository database:

      SELECT epv.display_name, epv.plugin_id, epv.version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ( 'BUILT_IN_TARGET_TYPE' , 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id;
      

      Note:

      At the end of the Software only installation, do NOT run ConfigureGC.sh when told to do so by the installer. This step should only be performed as part of a fresh install, not as part of a recovery operation.

      Ensure that you select the same set of plug-ins as the ones on the first OMS.

      If you had installed additional plug-ins that were not part of the Enterprise Manager Cloud Control software you used for installing the first OMS, then follow these steps:

      Step 1: Connect to the Management Repository and run the following SQL query to retrieve a list of plug-ins installed:

      SELECT epv.plugin_id, epv.version, epv.rev_version FROM em_plugin_version epv, em_current_deployed_plugin ecp WHERE epv.plugin_type NOT IN ('BUILT_IN_TARGET_TYPE', 'INSTALL_HOME') AND ecp.dest_type='2' AND epv.plugin_version_id = ecp.plugin_version_id;
      

      Step 2: Make a note of the additional plug-ins you installed.

      Step 3: Download the additional plug-ins from the following URL:

      http://www.oracle.com/technetwork/oem/grid-control/downloads/oem-upgrade-console-502238.html

      Step 4: Invoke the installer with the -pluginLocation argument, and pass the absolute path to the directory where you downloaded the additional plug-ins:

      runInstaller -pluginLocation <absolute_path_to_plugin_dir>
      
    3. Apply any patches that were previously applied to the OMS software homes.

  4. Run omsca in recovery mode specifying the export file taken earlier to configure the OMS:

    <OMS_HOME>/bin/omsca recover -as -ms -nostart -backup_file <exportconfig file>
    

    Note:

    The file passed to -backup_file must be the latest one generated by the emctl exportconfig oms command.
  5. Start the OMS.

    <OMS_HOME>/bin/emctl start oms
    
  6. Configure the Management Agent.

    An agent is installed as part of the Software only install and needs to be configured using the agentDeploy.sh command:

    <AGENT_HOME>/core/12.1.0.0.0/sysman/install/agentDeploy.sh AGENT_BASE_DIR=<AGENT_BASE_DIR> AGENT_INSTANCE_HOME=<AGENT_INSTANCE_HOME> ORACLE_HOSTNAME=<AGENT_HOSTNAME> AGENT_PORT=<AGENT_PORT> -configOnly OMS_HOST=<oms host> EM_UPLOAD_PORT=<OMS_UPLOAD_PORT> AGENT_REGISTRATION_PASSWORD=<REG_PASSWORD>
    

    The OMS automatically blocks the Management Agent. Resync the Management Agent from the Management Agent homepage.

  7. Add the new OMS to the SLB virtual server pools and remove the old OMS.

  8. Relocate the oracle_emrep target to the Management Agent of the new OMS host using the following commands:

    <OMS_HOME>/bin/emcli sync
    <OMS_HOME>/bin/emctl config emrep -agent <agent on host "B", e.g. myNewOMSHost.example.com:3872>
    
  9. In the Cloud Control console, locate the 'WebLogic Domain' target for the Cloud Control Domain. Go to 'Monitoring Credentials' and update the adminserver host to host B. Then run Refresh WebLogic Domain to reconfigure the domain with the new hosts.

  10. Locate duplicate targets from the Management Services and Repository Overview page of the Enterprise Manager console. Click the Duplicate Targets link to access the Duplicate Targets page. To resolve duplicate target errors, the duplicate target must be renamed on the conflicting Management Agent. Relocate duplicate targets from Management Agent "A" to Management Agent "B".

  11. Assuming the original OMS host is no longer in use, remove the Host target (including all remaining monitored targets) from Cloud Control by selecting the host on the Targets > Hosts page and clicking 'Remove'. You will be presented with an error that informs you to remove all monitored targets first. Remove those targets then repeat the step to remove the Host target successfully.

  12. Verify that the site is fully operational.
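
The note in step 4 requires the most recent export file. Where several exports have accumulated in one directory, the newest can be selected by modification time. The following is a minimal sketch: latest_backup is a hypothetical helper name, and the .bka suffix is an assumption about how your export files are named, so adjust the glob if yours differ.

```shell
# Hypothetical helper (not an Oracle utility): print the most recently
# modified *.bka file in the given directory, or nothing if none exist.
# Assumes file names contain no newline characters.
latest_backup() {
    dir=$1
    # ls -t lists newest first; the error for "no match" is discarded.
    ls -t "$dir"/*.bka 2>/dev/null | head -n 1
}

# Example use (the directory is a placeholder for your own backup location):
#   <OMS_HOME>/bin/omsca recover -as -ms -nostart \
#       -backup_file "$(latest_backup <backup_dir>)"
```

Selection by modification time is only as reliable as the file timestamps. If the files were copied between hosts without preserving timestamps, confirm the choice manually.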

28.5.4.6 Multiple OMS, SLB configured, additional OMS recovered on same or different host

Multiple OMS site where the OMS instances are fronted by an SLB. OMS configuration backed up using the emctl exportconfig oms command on the first OMS. Additional OMS is lost and needs to be recovered on the same or a different host.

  1. If recovering to the same host, ensure cleanup of the failed OMS has been performed:

    Make sure there are no processes still running from the Middleware home using a command similar to the following:

    ps -ef | grep -i -P "(Middleware|gc_inst)" | grep -v grep | awk '{print $2}' | xargs kill -9
    

    First de-install the existing Oracle Homes using the Cloud Control software distribution installer. This is required even if the software homes are no longer available as it is necessary to remove any record of the lost Oracle Homes from the Oracle inventory.

    If they exist, remove the 'Middleware' and 'gc_inst' directories.

  2. Ensure that shared software library locations are accessible.

  3. Install a Management Agent on the required host (the same host or a different one, as the case may be).

  4. Use the Additional OMS deployment procedure to configure a new additional OMS. See the Oracle® Enterprise Manager Cloud Control Basic Installation Guide for more information.

  5. Verify that the site is fully operational.

28.5.5 Recovering Management Agents

If a Management Agent is lost, it should be reinstalled by cloning from a reference install. Cloning from a reference install is often the fastest way to recover a Management Agent install because it is not necessary to track and reapply customizations and patches. Take care to reinstall the Management Agent using the same port. Using Enterprise Manager's Management Agent Resynchronization feature, a reinstalled Management Agent can be reconfigured using target information present in the Management Repository. When the Management Agent is reinstalled using the same port, the OMS detects that it has been re-installed and blocks it temporarily to prevent the auto-discovered targets in the re-installed Management Agent from overwriting previous customizations.

Blocked Management Agents:

This is a condition in which the OMS rejects all heartbeat or upload requests from the blocked Management Agent. Hence, a blocked Agent will not be able to upload any alerts or metric data to the OMS. However, blocked Management Agents continue to collect monitoring data.

The Management Agent can be resynchronized and unblocked from the Management Agent homepage by clicking on the Resynchronize Agent button. Resynchronization pushes all targets from the Management Repository to the Management Agent and then unblocks the Agent.

28.5.6 Management Agent Recovery Scenarios

The following scenarios illustrate various Management Agent recovery situations along with the recovery steps. The Management Agent resynchronization feature requires that a reinstalled Management Agent use the same port and location as the previous Management Agent that crashed.

28.5.6.1 Management Agent Reinstall Using the Same Port

A Management Agent is monitoring multiple targets. The Agent installation is lost.

  1. De-install the Agent Oracle Home using the Oracle Universal Installer.

    Note:

    This step is necessary in order to clean up the inventory.
  2. Install a new Management Agent or use the Management Agent clone option to reinstall the Management Agent through Enterprise Manager. Specify the same port that was used by the crashed Agent. The location of the install must be the same as the previous install.

    The OMS detects that the Management Agent has been re-installed and blocks the Management Agent.

  3. Initiate Management Agent Resynchronization from the Management Agent homepage.

    All targets in the Management Repository are pushed to the new Management Agent. The Agent is instructed to clear backlogged files and then do a clearstate. The Agent is then unblocked.

  4. Reconfigure User-defined Metrics if the location of the User-defined Metric scripts has changed.

  5. Verify that the Management Agent is operational and all target configurations have been restored using the following emctl commands:

    emctl status agent 
    emctl upload agent 
    

    There should be no errors and no XML files in the backlog.
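
The backlog check in step 5 can also be done by counting the XML files waiting in the Management Agent's upload directory. The following is a minimal sketch: count_xml_files is a hypothetical helper name, and the assumption that pending files sit under <AGENT_INSTANCE_HOME>/sysman/emd/upload should be confirmed for your installation.

```shell
# Hypothetical helper (not an Oracle utility): print the number of regular
# *.xml files directly inside the given directory.
count_xml_files() {
    n=0
    for f in "$1"/*.xml; do
        # When nothing matches, the glob stays literal and fails this test.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example use (the path is an assumption, see above):
#   count_xml_files <AGENT_INSTANCE_HOME>/sysman/emd/upload
```

A result of 0 after emctl upload agent completes is consistent with an empty backlog. The output of emctl status agent remains the authoritative check.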

28.5.6.2 Management Agent Restore from Filesystem Backup

A single Management Agent is monitoring multiple targets. File system backup for the Agent Oracle Home exists. The Agent install is lost.

  1. Restore the Management Agent from the filesystem backup then start the Management Agent.

    The OMS detects that the Management Agent has been restored from backup and blocks the Management Agent.

  2. Initiate Management Agent Resynchronization from the Management Agent homepage.

    All targets in the Management Repository are pushed to the new Management Agent. The Agent is instructed to clear backlogged files and performs a clearstate. The Management Agent is unblocked.

  3. Verify that the Management Agent is functional and all target configurations have been restored using the following emctl commands:

    emctl status agent
    emctl upload agent
    

    There should be no errors and no XML files in the backlog.

28.6 Recovering from a Simultaneous OMS-Management Repository Failure

When both OMS and Management Repository fail simultaneously, the recovery situation becomes more complex depending upon factors such as whether the OMS and Management Repository recovery has to be performed on the same or different host, or whether there are multiple OMS instances fronted by an SLB. In general, the order of recovery for this type of compound failure should be Management Repository first, followed by OMS instances following the steps outlined in the appropriate recovery scenarios discussed earlier. The following scenarios illustrate two OMS-Management Repository failures and the requisite recovery steps.

28.6.1 Collapsed Configuration: Incomplete Management Repository Recovery, Primary OMS on the Same Host

The Management Repository and the primary OMS are installed on the same host (host "A"). The Management Repository database is running in noarchivelog mode. A full cold backup is available. A recent OMS backup file exists (emctl exportconfig oms). The Management Repository, OMS, and the Management Agent crash.

  1. Follow the Management Repository recovery procedure shown in Section 28.5.2.2, "Incomplete Recovery on the Same Host," with the following exception:

    Since the OMS Oracle Home is not available and Management Repository resynchronization has to be initiated before starting an OMS against the restored Management Repository, submit the resync via the following PL/SQL block. Log in to the Management Repository as SYSMAN using SQL*Plus and run:

    begin emd_maintenance.full_repository_resync('<resync name>'); end;
    
  2. Follow the OMS recovery procedure shown in Section 28.5.4.1, "Single OMS, No Server Load Balancer (SLB), OMS Restored on the same Host."

  3. Verify that the site is fully operational.
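
When the resync block from step 1 is run in SQL*Plus, an anonymous PL/SQL block is executed only once a line containing a single slash (/) follows it. The following is a minimal sketch that generates the block with that terminator: resync_block is a hypothetical helper name and resync_after_restore is a made-up resync name.

```shell
# Hypothetical helper (not an Oracle utility): print the resync PL/SQL block
# for the given resync name, followed by the "/" line that tells SQL*Plus to
# execute it. Names that are empty or contain a single quote are rejected.
resync_block() {
    case $1 in
        ''|*"'"*)
            echo "resync name must be non-empty and contain no quotes" >&2
            return 1
            ;;
    esac
    printf "begin emd_maintenance.full_repository_resync('%s'); end;\n/\n" "$1"
}

# Example with a made-up resync name.
resync_block "resync_after_restore"
```

The output can be saved to a file, for example resync.sql, and run with @resync.sql from a SQL*Plus session connected as SYSMAN.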

28.6.2 Distributed Configuration: Incomplete Management Repository Recovery, Primary OMS and additional OMS on Different Hosts, SLB Configured

The Management Repository, primary OMS, and additional OMS all reside on different hosts. The Management Repository database was running in noarchivelog mode. An OMS backup file from a recent backup exists (emctl exportconfig oms). A full cold backup of the database exists. All three hosts are lost.

  1. Follow the Management Repository recovery procedure shown in Section 28.5.2.2, "Incomplete Recovery on the Same Host," with the following exception:

    Since the OMS Oracle Home is not yet available and Management Repository resynchronization has to be initiated before starting an OMS against the restored Management Repository, submit the resync via the following PL/SQL block. Log in to the Management Repository as SYSMAN using SQL*Plus and run the following:

    begin emd_maintenance.full_repository_resync('<resync name>'); end;
    
  2. Follow the OMS recovery procedure shown in Section 28.5.4.5, "Multiple OMS, Server Load Balancer Configured, Primary OMS Recovered on a Different Host" with the following exception:

    Override the Management Repository connect description present in the backup file by passing the additional omsca parameter:

    -REPOS_CONN_STR <restored repos descriptor>
    

    This needs to be added along with other parameters listed in Section 28.5.4.5, "Multiple OMS, Server Load Balancer Configured, Primary OMS Recovered on a Different Host."

  3. Follow the OMS recovery procedure shown in Section 28.5.4.6, "Multiple OMS, SLB configured, additional OMS recovered on same or different host."

  4. Verify that the site is fully operational.