Prerequisites
Oracle Data Guard must be installed and configured on the Oracle RDBMS instances at each site, such that one site is Primary and the other site(s) are Standby. See the “Getting Started with Oracle Data Guard” chapter of Oracle Data Guard Concepts and Administration. Make sure that the fast recovery area on every database instance in the Oracle Data Guard configuration is configured with enough space to cover the worst-case gap in applying archive logs to the standby database(s). The size of the fast recovery area is set by the db_recovery_file_dest_size initialization parameter.
Note: If practical, Oracle “Active” Data Guard is recommended over standard Data Guard. Active Data Guard allows the replicated RDBMS instance to remain open in read-only mode and supports a faster transition to the active role. Running the alternate NMS site RDBMS instance(s) in read-only mode allows NMS to run “Warm Services” on the alternate NMS sites, which reduces recovery time when an NMS site switchover or failover occurs.
If failover patching is to be used, then Oracle Flashback Database must be enabled in Oracle RDBMS at all sites.
A database user with the SYSDG privilege must be available.
Site-specific parameters in CES_PARAMETERS must be configured for each NMS site. For details, see the Environment Configuration chapter in this guide and the System Installation chapter of the Oracle Utilities Network Management System Installation Guide.
SSH User Equivalence (i.e., SSH password-less login) for the NMS administrative user must be configured between the NMS servers on each site.
nms-wls-config must be run in each environment to configure starting and stopping the WebLogic managed servers. See Starting and Stopping Services for more information.
WebLogic 14.1.2 must be installed with at least one machine running node manager at each NMS site.
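Two of the prerequisites above lend themselves to a quick pre-flight check: the fast recovery area sizing and SSH user equivalence. The following Python sketch is illustrative only and is not part of NMS. The function names, the 20% headroom factor, and the example figures are assumptions. The sizing estimate covers archived redo only; flashback logs and backups stored in the fast recovery area need additional space. The SSH check relies on the standard OpenSSH BatchMode and ConnectTimeout options, which make ssh fail rather than prompt for a password.

```python
import math
import subprocess
from typing import List


def fra_size_estimate_gb(redo_gb_per_hour: float, worst_gap_hours: float,
                         headroom: float = 1.2) -> int:
    """Estimate the GB of archived redo to retain during the worst-case apply gap.

    The headroom factor (assumed 20% here) is a safety margin, not an Oracle rule.
    """
    if redo_gb_per_hour < 0 or worst_gap_hours < 0 or headroom < 1:
        raise ValueError("rates and durations must be >= 0 and headroom >= 1")
    return math.ceil(redo_gb_per_hour * worst_gap_hours * headroom)


def ssh_batch_command(user: str, host: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # never fall back to a password prompt
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly on unreachable hosts
        f"{user}@{host}",
        "true",                               # no-op remote command
    ]


def has_passwordless_ssh(user: str, host: str) -> bool:
    """Return True if the no-op remote command succeeds without a password."""
    result = subprocess.run(ssh_batch_command(user, host), capture_output=True)
    return result.returncode == 0
```

For example, with an assumed peak redo rate of 4 GB per hour and a worst-case apply gap of 12 hours, fra_size_estimate_gb(4, 12) suggests reserving 58 GB for archived redo. The value actually set in db_recovery_file_dest_size should also account for anything else kept in the fast recovery area. Run has_passwordless_ssh in both directions between the NMS servers, using the NMS administrative user on each site.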
The main components involved in the failover and recovery process are defined below.
NMS Agent
The NMS Agent is a Java process that runs on each NMS Services server. It is responsible for monitoring NMS back-end services and reporting their state to the NMS Monitor module. In a dual-environment configuration, each administrative user has its own NMS Agent instance.
NMS Monitor
The NMS Monitor runs on a WebLogic 14.1.2 cluster. It is responsible for periodically requesting the status of each site from the NMS Agents, WebLogic managed servers, and databases, and for storing that status in a HAMI (Highly Available Metadata Infrastructure) ensemble. NMS Monitor also provides a browser-based UI that displays the reported status of the NMS sites.
Note: The NMS Monitor runs in its own WebLogic domain, separate from the WebLogic domain that supports NMS operational end users. It is acceptable (but not necessary) to run the NMS Monitor domain on the same servers that support NMS end users, because this WebLogic instance has a very modest resource usage profile.
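The polling pattern described for the NMS Monitor can be pictured with a small Python sketch. This is a conceptual illustration only, not the NMS Monitor implementation. The probe callables, the "UP" and "UNREACHABLE" status strings, the "site/component" key layout, and the plain dictionary standing in for the key-value store are all hypothetical.

```python
from typing import Callable, Dict

# A probe returns a short status string; "UP" is a hypothetical healthy value.
Probe = Callable[[], str]


def poll_site(site: str, probes: Dict[str, Probe], store: Dict[str, str]) -> None:
    """Run every probe for one site and record each result in the store."""
    for component, probe in probes.items():
        try:
            status = probe()
        except Exception:
            # A failed probe is recorded instead of aborting the polling cycle.
            status = "UNREACHABLE"
        # Hypothetical key layout: "<site>/<component>".
        store[f"{site}/{component}"] = status


def site_is_healthy(site: str, store: Dict[str, str]) -> bool:
    """Return True only if the site has recorded statuses and all are "UP"."""
    prefix = f"{site}/"
    statuses = [value for key, value in store.items() if key.startswith(prefix)]
    return bool(statuses) and all(status == "UP" for status in statuses)
```

In this sketch, a monitor would call poll_site for each site on a fixed interval, with one probe per NMS Agent, WebLogic managed server, and database. Recording a failed probe as a status, rather than raising, keeps one unreachable component from hiding the state of the others.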
HAMI
An Oracle HAMI (Highly Available Metadata Infrastructure) ensemble is used to store the NMS site status information collated by NMS Monitor. HAMI is essentially a highly available, distributed key-value store that is expected to run on at least one server at each NMS site.
CESEJB, NMS-WS
The CESEJB and NMS-WS deployments provide REST APIs to allow the NMS Monitor to request the status of their deployment.
Database Server
The NMS Monitor interrogates the databases to determine the active and staging environment for each site.