5 Managing Lifecycle Operations
This chapter describes additional maintenance tasks that have specific implications in a disaster protection system, such as ongoing synchronizations, patching, and scale-out operations.
Scheduling Ongoing Replication Between the Primary and the Secondary Sites
Changes in the primary site need to be propagated to the secondary site to maintain consistent behavior in case of a switchover or failover scenario. This is needed to guarantee that validations and verifications use realistic data, without introducing any modifications in the production site.
-
For more information about different approaches and how to determine the replication frequency for each type of data used by the Fusion Middleware System, see Planning a File System Replication Strategy.
-
During normal operations, the secondary site periodically receives copies transferred from the production site storage. As explained, this can be done through storage snapshots, rsync operations, or DBFS. After the copies are available, the secondary site storage contains all the data up to the point of the last transfer from the production site before the failover or switchover.
-
When storage replication is used in asynchronous mode, at the requested frequency (most vendors provide manual, scheduled, or continuous options) the data blocks that changed at the production site shared storage since the previous snapshot become the new snapshot copy. This snapshot copy is then transferred to the secondary site shared storage.
-
When using rsync or DBFS, ensure that cron jobs or other scheduler software triggers the copy periodically so that it can be used by the secondary WebLogic domain. The same scripts that were used in the initial setup should be scheduled to repeat the synchronization on an ongoing basis. Both DBFS and rsync with a staging location require dual scheduling: the copy from the primary nodes to the staging location, and the copy from the staging location to the secondary nodes. In the rsync peer-to-peer approach, the copy is done in a single step, so only one operation needs to be scheduled.
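For example, in the staging model a dual schedule could look like the following crontab sketch. The IPs, paths, and rsync options here are only illustrative placeholders; the complete, parameterized examples used in this guide are shown in Scheduling Ongoing Replication With Rsync Scripts.
# On the staging node: pull the shared configuration from the primary at midnight
0 0 * * * su oracle -c "rsync -avz -e 'ssh -i /home/oracle/my_keys/SSH_KEY.priv' oracle@PRIMARY_ADMIN_IP:/u01/oracle/config/ /staging_location/midtier/wls_shared_config/"
# On the secondary node: pull from the staging node two hours later
0 2 * * * su oracle -c "rsync -avz -e 'ssh -i /home/oracle/my_keys/SSH_KEY.priv' oracle@STAGING_NODE_IP:/staging_location/midtier/wls_shared_config/ /u01/oracle/config/"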
-
There is a trade-off between the frequency of the copy and the overhead it causes. Your system's RTO and RPO requirements should drive how you schedule the synchronization operations.
-
Make sure to force a synchronization when you introduce a change in the middle tier at the production site, for example, when you deploy a new application or module, or when you change the configuration of data sources or JMS destinations. Follow the vendor-specific instructions to force a synchronization when using a storage replication technology, or trigger the pertinent rsync copy to the secondary site. When using DBFS, ensure that the DBFS staging location is also updated with the changes from the primary.
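For example, with the rsync approach you can run the same script used by the cron jobs on demand right after a deployment or configuration change. As the oracle user in the secondary WebLogic Administration Server node (using the same example IP, paths, and key as in the rest of this chapter):
/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u01/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv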
-
After a switchover or failover, you need to reverse the direction in which the replication takes place. When the secondary location becomes the active site, the ongoing replication of configuration changes must be reversed so that the copy occurs from this secondary region to the original primary; otherwise, you may overwrite the active system with obsolete changes from the original primary.
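For example, in the rsync peer-to-peer model, reversing the direction can be as simple as commenting out the pull jobs in the crontab of the new primary (former secondary) nodes and adding equivalent jobs in the crontab of the former primary nodes so that they pull from the new primary. This is only a sketch; NEW_PRIMARY_ADMIN_IP is a placeholder for the IP of the node now running the active WebLogic Administration Server.
# On the new primary (former secondary) Administration Server node: disable the pull
#0 0 * * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u01/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
# On the former primary Administration Server node: pull from the new primary
0 0 * * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh NEW_PRIMARY_ADMIN_IP /u01/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"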
Scheduling Ongoing Replication With Rsync Scripts
Once you have determined the frequency of replication for each data type (see Planning a File System Replication Strategy), you can use scheduler software to program the replication operations. In Linux, you can use cron jobs for this replication.
If you are using a peer-to-peer model, you can use the following steps as a guideline:
-
As the root user in the secondary’s WebLogic Administration Server node, execute the following:
crontab -e
-
Edit the cron file and, using the example scripts at https://github.com/oracle-samples/maa/tree/main/1412EDG, add the following lines:
Note:
172.11.2.111 is the IP of the primary node running the WebLogic Administration Server in the examples provided in this document. Replace it with the precise IP in your case.
0 0 * * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u01/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 * * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u02/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 */7 * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u01/oracle/products/ /home/oracle/my_keys/SSH_KEY.priv"
-
As the root user in each of the secondary's WebLogic Managed Server nodes, execute the following:
crontab -e
-
Edit the cron file and, using the example scripts at https://github.com/oracle-samples/maa/tree/main/1412EDG, add the following lines:
Note:
172.11.2.112 is the IP of the primary node running the second WebLogic Managed Server in the examples provided in this document. Replace it with the precise IP in your case.
0 0 * * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.112 /u02/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 * * * su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.112 /u01/oracle/products/ /home/oracle/my_keys/SSH_KEY.priv"
Note:
Since the EDG uses only two physical locations for the Oracle Homes (on shared storage), if you have more than two nodes, it is sufficient to schedule the /u01/oracle/products copy from the first two.
With this you have scheduled the Admin Server's domain directory and the Managed Servers' domain directory to be pulled from primary every day at midnight, and the Oracle Homes and JDK installation to be copied every week. Adjust the frequency according to your RTO, RPO, and change size needs. Similarly, create cron jobs in your OHS nodes (as explained in previous sections, the OHS configuration changes much less frequently than the WebLogic domains, so you may schedule less frequent copies), for example as sketched below.
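For example, assuming the rsync_for_WLS.sh example script can also be used to copy the OHS configuration directory (the primary OHS node IP and the configuration path below are placeholders to adapt to your environment), a weekly entry in the crontab of a secondary OHS node could look like this:
0 0 * * 0 su oracle -c "/home/oracle/maa/1412EDG/rsync_for_WLS.sh PRIMARY_OHS_NODE_IP /u02/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"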
If you are using a staging model, you can use the following steps as a guideline:
-
As the root user in the staging node, and using the directory structure explained in Preparing the Primary Storage for Rsync Replication, execute the following:
crontab -e
-
Edit the cron file and using the example scripts at https://github.com/oracle-samples/maa/tree/main/1412EDG, add the following lines:
Note:
172.11.2.111 is the IP of the primary node running the WebLogic Administration Server and 172.11.2.112 is the IP of the primary node running the second WebLogic Managed Server in the examples provided in this document. Replace them with the precise IPs in your case.
0 0 * * * su oracle -c "export DEST_FOLDER=/staging_location/midtier/wls_shared_config/;/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u01/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 * * * su oracle -c "export DEST_FOLDER=/staging_location/midtier/wls_private_config/wlsnode1_private_config;/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u02/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 * * * su oracle -c "export DEST_FOLDER=/staging_location/midtier/wls_private_config/wlsnode2_private_config;/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.112 /u02/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 */7 * * su oracle -c "export DEST_FOLDER=/staging_location/midtier/wls_products_home/wls_products_home1;/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.111 /u01/oracle/products/ /home/oracle/my_keys/SSH_KEY.priv"
0 0 */7 * * su oracle -c "export DEST_FOLDER=/staging_location/midtier/wls_products_home/wls_products_home2;/home/oracle/maa/1412EDG/rsync_for_WLS.sh 172.11.2.112 /u01/oracle/products/ /home/oracle/my_keys/SSH_KEY.priv"
Note:
It is recommended to have copies of the Oracle Homes from two different nodes in case one of them gets corrupted.
With this you have a backup in the staging node/location with the copies from primary. You still need to schedule the transfer from the staging node to the WebLogic nodes in secondary. Since there is a lag in the transfer from primary to the staging node, it is recommended to delay this second transfer, for example, to 2 AM every day.
-
As the root user in the secondary’s WebLogic Administration Server node, execute the following:
crontab -e
-
Edit the cron file and using the example scripts at https://github.com/oracle-samples/maa/tree/main/1412EDG, add the following lines:
Note:
staging_node_ip is the IP of the staging node.
0 2 * * * su oracle -c "export DEST_FOLDER=/u01/oracle/config/;/home/oracle/maa/1412EDG/rsync_for_WLS.sh staging_node_ip /staging_location/midtier/wls_shared_config/ /home/oracle/my_keys/SSH_KEY.priv"
0 2 * * * su oracle -c "export DEST_FOLDER=/u02/oracle/config/;/home/oracle/maa/1412EDG/rsync_for_WLS.sh staging_node_ip /staging_location/midtier/wls_private_config/wlsnode1_private_config /home/oracle/my_keys/SSH_KEY.priv"
0 2 * * * su oracle -c "export DEST_FOLDER=/u01/oracle/products;/home/oracle/maa/1412EDG/rsync_for_WLS.sh staging_node_ip /staging_location/midtier/wls_products_home/wls_products_home1 /home/oracle/my_keys/SSH_KEY.priv"
-
As the root user in each of the secondary's WebLogic Managed Server nodes, execute the following:
crontab -e
-
Edit the cron file and using the example scripts at https://github.com/oracle-samples/maa/tree/main/1412EDG, add the following lines:
Note:
staging_node_ip is the IP of the staging node.
0 2 * * * su oracle -c "export DEST_FOLDER=/u02/oracle/config;/home/oracle/maa/1412EDG/rsync_for_WLS.sh staging_node_ip /staging_location/midtier/wls_private_config/wlsnode2_private_config /home/oracle/my_keys/SSH_KEY.priv"
0 2 * * * su oracle -c "export DEST_FOLDER=/u01/oracle/products;/home/oracle/maa/1412EDG/rsync_for_WLS.sh staging_node_ip /staging_location/midtier/wls_products_home/wls_products_home2 /home/oracle/my_keys/SSH_KEY.priv"
Note:
Since the EDG uses only two physical locations for the Oracle Homes (on shared storage), if you have more than two nodes, it is sufficient to schedule the /u01/oracle/products copy from the first two.
With this you have scheduled the Admin Server's domain directory and the Managed Servers' domain directory to be pulled from primary every day at midnight (and pushed at 2 AM to the secondary nodes), and the Oracle Homes and JDK installation to be copied every week. Adjust the frequency according to your RTO, RPO, and change size needs. Similarly, create cron jobs in your OHS nodes (as explained in previous sections, the OHS configuration changes much less frequently than the WebLogic domains, so you may schedule less frequent copies).
In all cases (whether using the peer-to-peer or staging model), check your system's cron log on a regular basis. For example:
grep rsync /var/log/cron
If you are using the scripts provided at https://github.com/oracle-samples/maa/tree/main/1412EDG, you can also check the logs directory reported by the cron jobs to see the result of each periodic copy.
Besides verifying that the cron jobs execute properly, it is recommended to test and validate the secondary site on a regular basis when periodic copies are scheduled.
sudo grep rsync /var/log/cron | grep log
May 6 00:00:02 bastion-vcnpho80 CROND[497562]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u02_oracle_products__06-06-2025-00-00-02.log))
May 6 00:00:02 bastion-vcnpho80 CROND[497563]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u02_oracle_config__06-06-2025-00-00-02.log))
May 6 00:00:02 bastion-vcnpho80 CROND[497564]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u02_oracle_config__06-06-2025-00-00-02.log))
May 6 00:00:02 bastion-vcnpho80 CROND[497565]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u02_oracle_config__06-06-2025-00-00-02.log))
May 6 00:00:02 bastion-vcnpho80 CROND[497566]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u02_oracle_config__06-06-2025-00-00-02.log))
May 6 00:00:02 bastion-vcnpho80 CROND[497561]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u02_oracle_products__06-06-2025-00-00-02.log))
May 6 00:00:02 bastion-vcnpho80 CROND[497567]: (root) CMDOUT ((You can check rsync command and exclude list in /home/oracle/maa/1412EDG/logs/rsync_u01_oracle_config__06-06-2025-00-00-02.log))
To avoid undesired overwrites in the application tier (such as accidentally syncing configuration from primary to secondary when secondary is the active site), it is a good practice to base the direction of the replication (whether to copy configuration, binaries, and runtime from the original primary to secondary or the other way around) on the database role in each site. The web tier's configuration changes less frequently than the application tier's. It is also more sensitive to provide access to the database from the web tier than from the application tier. Hence, it is recommended to add logic to the application tier's rsync scripts that bases the direction of the copy on the role detected in the database. If the local database is in the PHYSICAL STANDBY or SNAPSHOT STANDBY role, the scripts should pull information from their remote peer; if it is in the PRIMARY role, the scripts should push information to that remote peer. Refer to the scripts at https://github.com/oracle-samples/maa/tree/main/app_dr_common for examples to implement this logic.
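The following is a minimal sketch of that role check, not the actual scripts in the repository. It assumes that DB_CONNECT_STRING holds a connect string with privileges to query V$DATABASE, and that PRIMARY_PEER_IP and the paths are placeholders to adapt to your environment.
#!/bin/bash
# Sketch only: choose the rsync direction based on the role of the local site's database.
ROLE=$(sqlplus -s "$DB_CONNECT_STRING" <<'EOF' | tr -d '[:space:]'
set heading off feedback off pagesize 0
select database_role from v$database;
exit;
EOF
)
if [[ "$ROLE" == "PHYSICALSTANDBY" || "$ROLE" == "SNAPSHOTSTANDBY" ]]; then
  # This site is the standby: pull the domain configuration from the primary peer
  /home/oracle/maa/1412EDG/rsync_for_WLS.sh PRIMARY_PEER_IP /u01/oracle/config/ /home/oracle/my_keys/SSH_KEY.priv
else
  # This site is the primary: do not overwrite it with data from the remote peer
  echo "Local database role is $ROLE; not pulling from the remote peer."
fi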
Patching an Oracle Fusion Middleware Disaster Recovery Site
An appropriate disaster recovery strategy needs to address how to apply Oracle Fusion Middleware patches to upgrade the Oracle Homes that participate in an Oracle Fusion Middleware Disaster Recovery site.
The Oracle Central Inventory for any Oracle Fusion Middleware instance that you are patching in the primary system should be located on the production site shared storage, or on a location covered by the rsync copies, so that the Oracle Central Inventory for the patched instance is replicated to the secondary site.
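For example, you can check which Central Inventory location each node uses by looking at the oraInst.loc file (the inventory path shown in this output is only illustrative; your location may differ):
cat /etc/oraInst.loc
inventory_loc=/u01/oracle/oraInventory
inst_group=oinstall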
Perform the following steps to apply an Oracle Fusion Middleware patch:
Note:
Patches must be applied only at the production site for an Oracle Fusion Middleware Disaster Recovery topology. If a patch is for an Oracle Fusion Middleware instance or for the Oracle Central Inventory, the patch is copied when the production site shared storage is replicated to the secondary site shared storage. A synchronization operation should be performed when a patch is installed at the production site.
When patching the database, check the documentation for that specific patch for information on how to apply the patch in a Data Guard topology.
A disaster recovery topology can, in some cases, reduce the patching downtime of the primary system. The procedure differs depending on the components affected by the patch.
-
Database Patches
Oracle Fusion Middleware Disaster Recovery uses Data Guard. The advantage of using Data Guard, instead of having only a primary DB system, is that you can patch one site first and then the other. However, not all database patches allow this approach. The downtime and the procedure to patch the database depend on the type of patch. Database patches are of the following types:
-
Data Guard Standby-First
These can be applied first on the standby and then on the primary. There are various options available for applying this type of patch. See Oracle Patch Assurance - Data Guard Standby-First Patch Apply (Doc ID 1265700.1).
-
Non Data Guard Standby-First
These patches must be applied on both the primary and standby databases at the same time and require a shutdown.
If the database patch is standby-first applicable, the downtime can be minimized or reduced to a switchover. If not, it requires a shutdown of both the primary and the standby and must be applied to both.
-
-
Mid-tier-only Patches (patches modifying only mid-tier binaries)
-
A few Fusion Middleware patches are marked as FMW_ROLLING_ORACLE_HOME in their readme. This type of patch does not incur any downtime, regardless of whether a disaster recovery topology is used.
-
Other patches are not FMW_ROLLING_ORACLE_HOME enabled and require a mid-tier shutdown. In these cases, a disaster recovery topology helps minimize downtime by using the following procedure:
-
Convert the secondary database to snapshot standby.
-
Patch the secondary mid-tier domain first.
-
Test the secondary domain with the patch.
-
After everything is validated on the secondary, convert the secondary database back to physical standby.
-
Switch over to the secondary site (at this point, the secondary region becomes your primary and runs the business).
-
Convert the old primary database to snapshot standby.
-
Patch the old primary mid-tier and test it.
-
Convert the database back to physical standby.
-
Switch back to the original site.
In these cases, the downtime is only the time spent on the switchover operation. Without a standby system the downtime would include the patching time and the time to stop and start the system.
-
-
-
Mid-tier Patches that Include DB Schema Changes
If the patch is not FMW_ROLLING_ORACLE_HOME enabled, the approach is slightly different to avoid losing the database changes (DB schema changes require patching the mid-tier and the database at the same time). Perform the following steps:
-
Convert the secondary database to snapshot standby.
-
Patch the secondary mid-tier domain first.
-
Test the secondary domain with the patch.
-
After everything is validated on the secondary, convert the secondary database back to physical standby. At this point, the secondary WebLogic domain is misaligned: the mid-tier has one version but the schemas are in the older version.
-
Patch the primary.
The downtime with this approach is the same as without a secondary, but the procedure above has the advantage that it allows you to verify the patch's behavior on the secondary before applying it to the primary. A sketch of the Data Guard broker commands used for the role conversions and switchovers in these procedures is provided below.
-
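The role conversions and switchovers used in the mid-tier patching procedures above can be driven with the Data Guard broker. The following is a minimal DGMGRL sketch, assuming the broker is configured and that secondary_db is a placeholder for the DB_UNIQUE_NAME of the standby database:
DGMGRL> CONVERT DATABASE 'secondary_db' TO SNAPSHOT STANDBY;
DGMGRL> CONVERT DATABASE 'secondary_db' TO PHYSICAL STANDBY;
DGMGRL> SWITCHOVER TO 'secondary_db';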