Lifecycle Tasks

Regular lifecycle maintenance is needed to keep the secondary site in sync with the primary site. Throughout the lifecycle, you'll want to be able to perform a planned switchover that swaps the roles of the primary and secondary sites, and to respond to unplanned events.

About Configuration Replication

An initial replication of the content that resides in the file systems was performed during the DR setup. You must repeat the file system replication on a regular basis to keep the secondary site up-to-date with the primary site.

You can reuse the scripts that you created during the DR setup (see Replicate the File System Artifacts to OCI) and schedule the file system replication, with the following considerations for each artifact:

  • Replication of the Oracle Homes during the lifecycle

    This is a static artifact. It does not change frequently, so there is no need to replicate it on a regular basis. You need to replicate it only after you modify the Oracle Home (for example, after patching).

  • Replication of the WebLogic domain shared configuration during the lifecycle

    This is a dynamic artifact. Among other things, it contains the ASERVER_HOME, which is the source-of-the-truth of the SOA domain configuration, and the APPLICATION_HOME, which is updated every time an application is deployed, undeployed, updated, and so on.

    The WebLogic domain shared configuration is expected to change frequently. Schedule a regular replication of this artifact, with a frequency that matches how often configuration changes occur in your system. A more controlled alternative is to replicate every time you make a configuration change in the primary.

  • Replication of the WebLogic domain private configuration during the lifecycle

    This is also a dynamic artifact. It contains the MSERVER_HOME and the NM_HOME. The Node Manager home is not expected to change frequently after the initial setup. The content of the MSERVER_HOME changes as frequently as the ASERVER_HOME, because it contains the domain folder used by the managed servers. However, most of its content (the ASERVER_HOME/config) is refreshed and downloaded from the AdminServer when the managed servers start and when configuration changes are applied using the WebLogic Scripting Tool (WLST) or the Oracle WebLogic Server Administration Console. For this reason, it is not as critical to replicate this artifact as frequently as the shared configuration. Replication is mandatory only when modifications are made to other folders in the MSERVER_HOME (for example, a modification in the MSERVER_HOME/bin folder).

  • Replication of the shared runtime folder

    If you store any runtime artifacts in this folder, schedule their replication to the standby according to your business needs.

    Instead of using an Oracle Cloud Infrastructure File Storage file system and replicating it with rsync, you can use an Oracle Database File System (DBFS) mount for the shared runtime contents. This way, the content resides in the database and is automatically replicated to the secondary by the underlying Oracle Data Guard replication. See About Oracle Database File System in Learn More for details about using DBFS.

The following table is a summary of the recommendations for file system artifacts replication during the lifecycle.

| Artifact | Contains | Recommendation |
| Oracle Homes | FMW home, JDK, inventory | Replicate only on demand (for example, after patching). |
| WebLogic Domain Shared Configuration | ASERVER_HOME, applications, deployment plans, keystores | Schedule replication. High frequency may be required, depending on how often configuration changes are made to the SOA system. |
| WebLogic Domain Private Configuration | MSERVER_HOME, Node Manager configuration | Schedule replication. High frequency is not normally required. |
| Shared Runtime | Customer-specific runtime artifacts (not JMS, not TLOGS) | Determined by your requirements. If this is a DBFS mount, then the content is replicated automatically by Oracle Data Guard. |
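
As a rough illustration of how the recommendations above could be scheduled, the following shell sketch builds an rsync command for one artifact and a crontab line to run it. It assumes rsync over SSH, in line with the rsync-based replication mentioned earlier. The host name, paths, and wrapper script name used in the usage note are hypothetical placeholders, and the functions only print commands, so nothing is copied until you run the output yourself.

```shell
# Sketch: build replication commands for one file system artifact.
# Nothing is executed here; each function only prints a command line.

# Print the rsync command that would copy SRC_DIR to DEST_HOST:DEST_DIR.
# -a preserves permissions and timestamps, -z compresses in transit, and
# --delete removes files on the target that no longer exist on the source.
build_rsync_cmd() {
  local src_dir="$1" dest_host="$2" dest_dir="$3"
  # A trailing slash makes rsync copy the directory contents, not the directory.
  printf 'rsync -az --delete %s/ %s:%s/\n' "${src_dir%/}" "$dest_host" "${dest_dir%/}"
}

# Print a crontab line that runs SCRIPT every N minutes (N between 1 and 59).
cron_every_minutes() {
  local minutes="$1" script="$2"
  printf '*/%s * * * * %s\n' "$minutes" "$script"
}
```

For example, `build_rsync_cmd /u01/oracle/config oci-soahost1 /u01/oracle/config` prints a command for the shared configuration, and `cron_every_minutes 30 /home/oracle/scripts/replicate_config.sh` prints a crontab entry for a hypothetical wrapper script. Because --delete removes files on the target, it may be worth adding rsync's --dry-run option on a first pass.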

Perform a Switchover

A switchover is a planned operation where an administrator reverses the roles of the two sites. After a switchover, the primary system becomes secondary and the secondary system becomes primary. Performing a switchover will cause downtime in the primary site.

Before performing a switchover in a SOA Hybrid DR configuration, propagate any pending configuration changes to the secondary site and confirm that no replication is still pending.
  1. Disable any scheduled replication while the switchover is performed, since it may fail and interfere with the switchover operation itself.
  2. Stop the Oracle HTTP Server systems in the primary site.
  3. Stop the servers in the primary site.
    Use the WebLogic Administration Server Console or scripts to stop the WebLogic servers in the primary site.

    Note:

    The Admin server in the primary site can remain up during the switchover. However, it is recommended to stop it while the site is in the standby role, because the domain configuration in the standby site is expected to be overwritten by the primary configuration during the lifecycle. If the Admin server is up while this happens, it will be running with a stale configuration.
  4. Switch over the front-end DNS name.

    Perform the required DNS push in the DNS server hosting the names used by the system or alter the file host resolution in clients to point the front-end virtual name of the system to the public IP used by the Load Balancer in the secondary site.

    For scenarios where DNS is used for the external front-end resolution (such as OCI DNS or commercial DNS), you can use an API to push the change. To see an example that pushes this change in an OCI DNS, go to GitHub for example scripts.

    Note that the TTL value of the DNS entry affects the RTO of the switchover: if the TTL is high (for example, 20 min), the DNS change can take that long to become effective in the clients. Lower TTL values make the change effective faster; however, they add overhead because clients query the DNS more frequently instead of using cached names. A good approach is to set the TTL to a low value temporarily (for example, 1 min) before the change in the DNS. Then, perform the change, and once the switchover procedure is completed, revert the TTL to its original value.

  5. As the oracle user, use Oracle Data Guard broker on the primary database host to perform the database switchover.
    You'll need the SYS user password and the unique name of your primary database.
    [oracle@dbhost1 ~]$ dgmgrl sys/your_sys_password@primary_db_unqname
    DGMGRL> switchover to secondary_db_unqname
  6. If they are not already up, start the Oracle HTTP Server systems in the secondary site (new primary).
  7. Start the Admin Server in the secondary site (new primary), or restart the server if it was already started.
    Starting the Admin Server enables configuration changes that were replicated while this was a standby to take effect.
  8. Start the secondary managed servers in the secondary site (new primary).
    Use the WebLogic Console or scripts to start the secondary managed servers.
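
The database step of the switchover can be preceded by read-only broker checks. The sketch below generates two small DGMGRL command scripts: one with checks to review, and one with the role change itself. It assumes that your Data Guard broker release supports the validate database command (12.1 or later) and that dgmgrl accepts commands on standard input. The function names are illustrative and not part of any Oracle tooling.

```shell
# Sketch: generate DGMGRL command scripts for a planned switchover.
# The functions only print text; nothing here connects to a database.

# Read-only checks to review before the role change.
# "validate database" is assumed to be available in your broker release.
dgmgrl_precheck() {
  local target="$1"
  printf 'show configuration;\n'
  printf 'validate database %s;\n' "$target"
}

# The role change itself, followed by a status check.
dgmgrl_switchover() {
  local target="$1"
  printf 'switchover to %s;\n' "$target"
  printf 'show configuration;\n'
}
```

A possible usage is `dgmgrl_precheck secondary_db_unqname | dgmgrl sys/your_sys_password@primary_db_unqname`, followed by a manual review of the output, and only then the same pipeline with `dgmgrl_switchover`. Keeping the two scripts separate is deliberate: piping them together would run the switchover even if the validation reported a problem.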

Perform a Failover

A failover operation is commonly an unplanned operation that is performed when the primary site becomes unavailable. You can role-transition a standby database to primary database role when the original primary database fails and there is no possibility of recovering it in a timely manner. There might be data loss, depending upon whether your primary and target standby databases were consistent at the time of the primary database failure.
  1. Propagate any pending configuration changes, if possible.
    See Replicate the File System Artifacts to OCI to replicate changes to secondary site.
  2. Disable any scheduled replication while the failover is performed, since it may fail and interfere with the failover operation itself.
  3. Stop the Oracle HTTP Server systems in the primary site.
  4. Stop the managed servers in the primary site, if possible.

    Use the WebLogic Administration Server Console or scripts to stop managed servers in primary.

  5. Switch over the front-end DNS name.

    Perform the required DNS push in the DNS server hosting the names used by the system or alter the file host resolution in clients to point the front-end virtual name of the system to the public IP used by the Load Balancer in the secondary site.

    For scenarios where DNS is used for the external front-end resolution (such as OCI DNS or commercial DNS), use the appropriate API to push the change. To see an example that pushes this change in an OCI DNS, go to GitHub for example scripts.

    Note:

    The TTL value of the DNS entry affects the RTO of the failover. If the TTL is high (for example, 20 min), then the DNS change can take that long to become effective in the clients. Lower TTL values make the change effective faster; however, they add overhead because clients query the DNS more frequently instead of using cached names. A good approach is to set the TTL to a low value temporarily (for example, 1 min) before the change in the DNS. Then, perform the change, and once the failover procedure is completed, revert the TTL to its original value.
  6. As the oracle user, use Oracle Data Guard broker on the secondary database host to perform the failover.
    You'll need the SYS user password and the unique name of your secondary database.
    [oracle@hydrdb1 ~]$ dgmgrl sys/your_sys_password@secondary_db_unqname
    DGMGRL> failover to secondary_db_unqname
  7. If they are not already up, start the Oracle HTTP Server systems in the secondary site (new primary).
  8. Start the Admin Server in the secondary site (new primary), or restart the server if it was already started.
    Starting the Admin Server enables configuration changes that were replicated while this was a standby to take effect.
  9. Start the secondary managed servers in the secondary site (new primary).
    Use the WebLogic Console or scripts to start the secondary managed servers.
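
To check the DNS step of either procedure, it can help to look at the TTL and address that the front-end name currently returns. The helpers below are a minimal sketch that parses the output of `dig +noall +answer <name> A`, where each answer line has the form `<name> <ttl> IN A <address>`. The function names and the example name and addresses are hypothetical.

```shell
# Sketch: inspect a DNS answer for the front-end name.
# Input on stdin is the output of: dig +noall +answer <name> A

# Print the TTL (in seconds) of the first A record.
answer_ttl() {
  awk '$4 == "A" { print $2; exit }'
}

# Print the address of the first A record.
answer_ip() {
  awk '$4 == "A" { print $5; exit }'
}

# Succeed if the first A record matches the expected address.
points_to() {
  [ "$(answer_ip)" = "$1" ]
}
```

For example, `dig +noall +answer soa.example.com A | answer_ttl` prints the TTL, and `dig +noall +answer soa.example.com A | points_to 203.0.113.20` succeeds once the name resolves to the (placeholder) secondary load balancer address. A caching resolver reports the remaining TTL rather than the configured one, so query the authoritative name server if you need the configured value.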

Open the Secondary for Validation

You can validate the standby site without performing a complete switchover by converting the standby database to a snapshot standby. This allows the secondary SOA servers to be started in the standby site so that you can verify the secondary system. Any change made in the standby site database while it is in snapshot standby mode is discarded when it is converted back to a physical standby. Primary data isn't affected by secondary site validations.

Note:

This operation must be done with caution. If there are pending messages or composites in the database when it is converted to a snapshot standby, the standby site’s SOA servers will process them when they start. Check that there are no pending actions in the primary database before converting to snapshot standby. Otherwise, remove the records from the SOA runtime tables in the standby database after it is converted to a snapshot standby and before starting the secondary site’s SOA servers. See Removing Records from the Runtime Tables Without Dropping the Tables.

To validate the standby site without performing a switchover:
  1. As the oracle user, use Oracle Data Guard broker on the primary database host to convert the secondary database into a snapshot standby.
    [oracle@dbhost1 ~]$ dgmgrl sys/your_sys_password@primary_db_unqname
    DGMGRL> convert database secondary_db_unqname to snapshot standby

    Use the show configuration command to verify that the conversion has been performed correctly.
  2. Verify that there are no pending actions in the secondary environment.
    If there are pending actions (transactions, messages) in the primary database when the standby is converted to a snapshot standby, then the secondary SOA servers will try to process them when they start. You can use the SOA truncate script to remove the records from the SOA runtime tables in the secondary database and clean the runtime data before starting the secondary servers. Run this action with CAUTION; do not truncate tables in the primary database. See Removing Records from the Runtime Tables Without Dropping the Tables.
  3. If they are not already up, start the Oracle HTTP Server systems in the secondary site.
  4. Start the Admin Server in the secondary site.
  5. Start the secondary managed servers in the secondary site.
    Use the WebLogic Console or scripts to start the secondary managed servers.
  6. Validate the secondary site.

    As this is not a switchover and the primary site is still active, the virtual front-end name will resolve to the primary site’s load balancer IP address, so any browser access will, by default, be redirected to the active primary site.

    To directly access the secondary site’s SOA services, you must update the /etc/hosts file in a controlled client (for example, a laptop), set the virtual front-end name to resolve to the secondary site’s front-end load balancer IP address, and run any validation from this client.

    Note:

    Verify that the client used for validations does not access the system through an HTTP proxy, because the HTTP proxy may continue to resolve the virtual front-end name with the primary site’s load balancer IP address regardless of which name is in the /etc/hosts of the client.

    Non-Linux clients may require a reset of their local DNS cache before a browser will resolve the IP address using the customized host file entry.

    Once the secondary site has been validated, go to the next step to revert it back to the standby role.

    Note:

    It might take time to validate the secondary site.
  7. Stop the managed servers and admin servers in the secondary site.
    Use the secondary WebLogic Console to shut down the managed servers and Admin server in the secondary site.
  8. As the oracle user, use Oracle Data Guard broker on the primary database host to convert the secondary database back to a physical standby.
    You'll need the SYS user password and the unique name of your primary database.
    [oracle@dbhost1 ~]$ dgmgrl sys/your_sys_password@primary_db_unqname
    DGMGRL> convert database secondary_db_unqname to physical standby
    Use the show configuration command to verify the conversion.
  9. Revert any updated /etc/hosts files.
    If you updated any /etc/hosts files in a client to point to the secondary site for validations, then revert them so the virtual front-end name points to the primary front-end IP address again.
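
The client-side part of the validation can be sketched with two small helpers. The first prints the /etc/hosts line described in the steps above. The second prints a curl command that pins the front-end name to the secondary load balancer for a single request, which is an alternative to editing /etc/hosts rather than part of the documented procedure. The sketch assumes an HTTPS front end; the name, address, port, and URL path are placeholders.

```shell
# Sketch: helpers for validating the secondary site from a controlled client.
# Both functions only print text; nothing is modified or contacted.

# Print the /etc/hosts line that maps the front-end name to the secondary
# load balancer address. Add it manually as root and remove it afterwards.
hosts_entry() {
  local lb_ip="$1" frontend_name="$2"
  printf '%s %s\n' "$lb_ip" "$frontend_name"
}

# Print a curl command that reaches the secondary load balancer without
# editing /etc/hosts: --resolve pins name:port to an address for this one
# request, and --noproxy '*' bypasses any configured HTTP proxy.
curl_probe_cmd() {
  local lb_ip="$1" frontend_name="$2" port="$3" path="$4"
  printf "curl -sS --noproxy '*' --resolve %s:%s:%s https://%s:%s%s\n" \
    "$frontend_name" "$port" "$lb_ip" "$frontend_name" "$port" "$path"
}
```

For example, `hosts_entry 203.0.113.20 soa.example.com` prints the line to add to the client's hosts file, and `curl_probe_cmd 203.0.113.20 soa.example.com 443 /soa-infra` prints a one-off probe command. The curl variant leaves no hosts file entry to revert afterwards, but it only covers command-line checks, not browser-based validation.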

Note:

ORA-01403: no data found ORA-06512 errors

While validating the secondary site as described here (that is, opening the standby in snapshot standby mode without performing a complete switchover), “ORA-01403: no data found ORA-06512” errors may appear in the logs of the standby SOA servers. These errors are related to the SOA auto purge job. They arise because database jobs can have database role dependencies (they are defined to be enabled only when the database is in the primary role). This is expected and desired behavior that prevents jobs from being executed twice (once in the primary and once in the standby). The SOA auto purge job is defined with the primary role, so it is not shown in the DBA_SCHEDULER_JOBS view when the database is in snapshot standby mode. The database role defined for each job can be seen in the DBA_SCHEDULER_JOB_ROLES view. In summary, these errors can be ignored as long as they appear in the standby system. The scheduler job for SOA auto purge runs on the database if, and only if, the instance changes its role to PRIMARY.
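
To see the role dependency described in this note, you can list the scheduler jobs together with their database role. The sketch below prints a query against the DBA_SCHEDULER_JOB_ROLES data dictionary view. The default filter assumes that the SOA schema owner name ends in SOAINFRA, which depends on the prefix chosen at installation, so treat it as a placeholder.

```shell
# Sketch: print a query that lists scheduler jobs with their database role.
# Assumption: the SOA schema owner ends in SOAINFRA; pass another LIKE
# pattern as the first argument if your schema is named differently.
job_roles_query() {
  local owner_pattern="${1:-%SOAINFRA}"
  cat <<EOF
SELECT owner, job_name, database_role
FROM dba_scheduler_job_roles
WHERE owner LIKE '${owner_pattern}'
ORDER BY owner, job_name;
EOF
}
```

One way to run it, assuming local SYSDBA access on the database host, is `job_roles_query | sqlplus -s / as sysdba`. Jobs listed with the PRIMARY role are the ones that stay inactive while the database is a snapshot standby.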

Local Failover of the Administration Server on OCI

You can start the Administration Server in a different node in the same site when there's a failure in the host where the Administration Server was originally running. A complete switchover of the system to the other site is not needed.

Note:

This lifecycle task is applicable only when the WebLogic Administration Server uses a VIP for local high availability purposes and the Administration Server configuration folder (ASERVER_HOME) is in a shared location.

The procedure to do this is explained in Verifying Manual Failover of the Administration Server. This provides local failover protection for the Administration server. Note this is not needed for the managed servers, which have local high availability protection based on the Automatic Service Migration feature.

If you need to perform a failover of the Administration Server to another host when the primary is running in the OCI site, then you can follow that procedure. However, additional action is required related to the “Migrate the ADMINVHN virtual IP address to the second host” step.

Perform the following steps to detach the VIP from the SOA host where the Administration Server was running and to attach it to the SOA host where the Administration Server is being moved (detach the VIP from SOAHOST1 and attach it to SOAHOST2 on the OCI site):

  1. As the root user, perform the following steps on SOAHOST1 to remove the Administration Server’s VIP from the network interface.
    1. Stop the Administration Server if it is still running.
    2. Confirm that the VIP is configured on the network interface.
      ip addr show dev ens3
    3. Remove the IP from the network interface.
      ip addr del 100.70.8.120/20 dev ens3
  2. Detach the Administration Server’s VIP from SOAHOST1.
    1. Connect to the OCI Console and select the appropriate region and compartment.
    2. Navigate to the compute instance. Click Compute, Instances, then click SOAHOST1.
    3. Click Attached VNICs, then select the VNIC to which the Administration Server VIP is attached.
    4. Click IPv4 Addresses and edit the VIP that the Administration Server uses.
    5. Save the VIP’s IP address and FQDN in a note (for example: 100.70.8.120, hydrsoa-vip.midtiersubnet.hydrvcn.oraclevcn.com).
    6. Click Delete Private IP.
  3. Attach the Administration Server’s VIP to SOAHOST2.
    1. Navigate to the compute instance. Click Compute, Instances, then click SOAHOST2.
    2. Click Attached VNICs, then select the VNIC to which the Administration Server VIP will be attached.
    3. Click IPv4 Addresses, then click Assign secondary private IP address.
    4. Enter the private IP address and host name values that were used before. For example: 100.70.8.120 for the IP and hydrsoa-vip as the host name.
  4. Log in to SOAHOST2 as the root user, then run the following commands to attach the Administration Server’s VIP to the network interface.
    1. Check the current addresses on the network interface.
      ip addr show dev ens3
    2. Add the Administration Server’s VIP to the network interface.
      ip addr add 100.70.8.120/20 dev ens3 label ens3:1
  5. Perform the rest of the steps as described in Verifying Manual Failover of the Administration Server.
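
The operating system side of the VIP move (steps 1 and 4 above) can be sketched as a few helpers that print the ip commands and check the result. They reuse the example address and interface from the steps (100.70.8.120/20 on ens3); the function names are illustrative, and the functions print commands rather than run them, so the output must be executed as root on the appropriate host.

```shell
# Sketch: OS-level commands for moving the Administration Server VIP.
# The *_cmd functions only print commands; they do not change any interface.

# Command for SOAHOST1: remove the VIP from the interface.
vip_del_cmd() {
  local vip_cidr="$1" dev="$2"
  printf 'ip addr del %s dev %s\n' "$vip_cidr" "$dev"
}

# Command for SOAHOST2: add the VIP with a labelled alias (<dev>:1).
vip_add_cmd() {
  local vip_cidr="$1" dev="$2"
  printf 'ip addr add %s dev %s label %s:1\n' "$vip_cidr" "$dev" "$dev"
}

# Succeed if the VIP (without prefix length) appears in the output of
# "ip -o -4 addr show dev <dev>" read on stdin.
vip_present() {
  grep -qF " inet $1/"
}
```

For example, `vip_add_cmd 100.70.8.120/20 ens3` prints the command from step 4, and `ip -o -4 addr show dev ens3 | vip_present 100.70.8.120` can confirm on SOAHOST2 that the address is in place before the Administration Server is started.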