High Availability

Introduction

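The subject of High Availability covers a range of features and options that can help to minimize planned and unplanned downtime, or facilitate recovery after a period of downtime. They include patching hints and tips, Maintenance Mode, a shared application tier file system, Distributed AD, a staged Applications system, nologging operations, and disaster recovery.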

This section will provide a high-level guide to the key features that can help make an Oracle E-Business Suite highly available, with the emphasis on guidelines for making the correct decisions when planning a new installation or upgrade.

Patching Hints and Tips

Patch application is a key activity undertaken by Oracle E-Business Suite DBAs. If you need to apply a large number of patches, the required downtime can be significant. However, there are several simple ways of minimizing this downtime. These strategies include:

Where applicable, these strategies are described further below.

Note: For full details of carrying out patching and maintenance operations, see Oracle E-Business Suite Maintenance Procedures, Oracle E-Business Suite Maintenance Utilities, and Oracle E-Business Suite Patching Procedures.

Maintenance Mode

Maintenance Mode is a mode of operation in which the Oracle E-Business Suite system is made accessible only for patching activities. This provides optimal performance for AutoPatch sessions, and minimizes the downtime needed.

Note: Maintenance Mode is only needed for AutoPatch sessions. Other AD utilities do not require Maintenance Mode to be enabled.

Administrators can schedule system downtime using Oracle Applications Manager, and send alert messages to users about the impending downtime. When Maintenance Mode is entered, users attempting to log on to Oracle E-Business Suite are redirected to a system downtime URL.

There are several practical points relating to the use of Maintenance Mode:

Shared Application Tier File System

A traditional multi-node installation of Oracle E-Business Suite required each application tier node to maintain its own file system. Installation and migration options were subsequently introduced to enable a single APPL_TOP to be shared between all the application tier nodes of a multi-node system. This was referred to as a Shared APPL_TOP File System, usually abbreviated to Shared APPL_TOP.

A further capability that was introduced was the option to merge the APPL_TOPs of multiple nodes, each with its own set of application tier services, to give a single APPL_TOP that could then be shared between them all.

These concepts were subsequently extended to enable sharing of the application tier technology stack file system as well, the result being known as a Shared Application Tier File System.

This section describes the benefits of using a shared application tier file system in an Oracle E-Business Suite Release 12 environment. Current restrictions are also noted where applicable.

Shared Application Tier File System Features

In a shared application tier file system, all application tier files are installed on a single shared disk resource that is mounted on each application tier node. Any application tier node can be configured to perform any of the standard application tier services, such as serving forms or web pages, and all changes made to the shared file system are immediately visible on all the application tier nodes.

Benefits of using a shared application tier file system include:

Current restrictions on using a shared application tier file system include:

Shared Disk Resources

A shared application tier file system can reside on any standard type of shared disk resource, such as a remote NFS-mounted disk or part of a RAID array. However, you should ensure that performance of the chosen disk resource is adequate to meet peak demand. For example, NFS-mounted disks may give inadequate read or write performance when there is a large amount of network traffic, and RAID arrays must be implemented carefully to strike the appropriate balance between high availability, performance and cost.

Creating a Shared Application Tier File System

By default, the Release 12 Rapid Install will configure a multi-node application tier environment to use a shared application tier file system.

Note: For further details of using a shared application tier file system, see My Oracle Support Knowledge Document 384248.1, Sharing the Application Tier File System in Oracle E-Business Suite Release 12.

High Availability Features of Shared Application Tier File System

Utilizing a shared application tier file system improves high availability in the following ways:

Distributed AD

Many deployments utilize large database servers and multiple, smaller application (middle) tier systems. With the increasing deployment of low-cost Linux-based systems, this configuration is becoming more common.

AD has always utilized a job system, where multiple workers are assigned jobs. Information for the job system is stored in the database, and workers receive their assignments based on the contents of the relevant tables. The Distributed AD feature offers improved scalability, performance, and resource utilization, by allowing workers of the same AD session to be started on multiple application tier nodes, utilizing available resources to complete their assigned jobs more efficiently.

Requirements for Distributed AD

Because the AD workers create and update file system objects as well as database objects, a shared application tier file system (shared APPL_TOP in earlier releases) must be employed to ensure the files are created in a single, centralized location.

Using Distributed AD

On one of your shared application tier nodes, you start your AutoPatch or AD Administration session, specifying the number of local workers and the total number of workers.
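
As a rough illustration of how a total worker count might be divided between nodes, the following Python sketch assigns a contiguous range of worker numbers to the local node and to each remote node. The `split_workers` helper, the node names, and the assumption that workers are numbered from 1 with the lowest numbers running locally are all hypothetical; see Oracle E-Business Suite Maintenance Utilities for the actual arguments accepted by the AD utilities.

```python
def split_workers(total_workers, local_workers, remote_nodes):
    """Assign contiguous worker-number ranges to the local and remote nodes.

    Assumption (not taken from the text above): workers are numbered
    1..total_workers, and the local node takes the lowest numbers.
    """
    if not 0 < local_workers <= total_workers:
        raise ValueError("local_workers must be between 1 and total_workers")
    remaining = total_workers - local_workers
    if remaining and not remote_nodes:
        raise ValueError("non-local workers need at least one remote node")

    plan = {"local": (1, local_workers)}
    next_worker = local_workers + 1
    for index, node in enumerate(remote_nodes):
        # Spread the remaining workers as evenly as possible.
        share = remaining // len(remote_nodes)
        if index < remaining % len(remote_nodes):
            share += 1
        if share:
            plan[node] = (next_worker, next_worker + share - 1)
            next_worker += share
    return plan


if __name__ == "__main__":
    # Hypothetical example: 8 workers in total, 3 of them local.
    print(split_workers(8, 3, ["node2"]))
```

A plan of this kind only documents the intended split; the workers themselves are still started through the AD utilities on each node.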

While using AutoPatch or AD Administration, you can start a normal AD Controller session from any of the nodes in the shared APPL_TOP environment to perform any standard AD Controller operations, using both local and non-local workers. This is possible because the job system can be invoked multiple times during AutoPatch and AD Administration runs. Each time an individual invocation of the job system completes, distributed AD Controller sessions will wait until either the job system is invoked again (at which point it will once again start the local workers) or until the AD utility session ends (at which point distributed AD Controller will exit).

Note: See Oracle E-Business Suite Maintenance Utilities for further details of Distributed AD and AD Controller.

AD Controller Log Files

AD Controller creates its log file on the node where the AD Controller session is started. This is to prevent file locking issues on certain platforms. It is therefore recommended that the name of the AD Controller log file include the name of the node from which the AD Controller session is invoked.
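
One way to follow this recommendation consistently is to derive the log file name from the host name of the invoking node. The Python sketch below is a minimal example; the `adctrl_log_name` helper and the `<prefix>_<node>.log` naming pattern are hypothetical conventions, not something AD Controller requires.

```python
import socket


def adctrl_log_name(node_name=None, prefix="adctrl"):
    """Build a log file name that embeds the invoking node's short name.

    The "<prefix>_<node>.log" pattern is a hypothetical convention.
    """
    node = node_name or socket.gethostname()
    # Keep only the short host name and make it safe for use in a file name.
    short = node.split(".")[0].lower()
    safe = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in short)
    return f"{prefix}_{safe}.log"


if __name__ == "__main__":
    print(adctrl_log_name("appnode1.example.com"))
```

The resulting name can then be entered when AD Controller prompts for its log file.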

Staged Applications System

A staged Applications system represents an exact copy (clone) of your production system, including all APPL_TOPs as well as a copy of the production database. You can apply patches to a staged system while the production system remains in operation. Then you connect the staged system to the production system, update the database, and synchronize the APPL_TOPs. The downtime for the production system begins only after all patches have been successfully applied to the staged system, and you have tested the newly patched environment.

Important: A staged Applications system must duplicate the topology of your production system. For example, each physical APPL_TOP of your production system must exist in the staged system.

After the patches are applied to the staged system, and the production system is updated, you must export applied patches information from the staged system and import it to the production system. This ensures that the OAM patch history database in the production system is up-to-date and that you can continue to use patch-related features.

Note: For more information, see My Oracle Support Knowledge Document 734025.1, Using a Staged Applications System to Reduce Patching Downtime in Oracle Applications Release 12.

Nologging Operations

The nologging Oracle database feature is used to enhance performance in certain areas of Oracle E-Business Suite. For example, it may be used during patch installation, and when building summary data for Business Intelligence.

Use of nologging in an operation means that the database redo logs will contain incomplete information about the changes made, with any data blocks that have been updated during the nologging operation being marked as invalid. As a result, a database restoration to a point in time (whether from a hot backup or a cold backup) may require additional steps in order to bring the affected data blocks up-to-date, and make the restored database usable. These additional steps may involve taking new backups of the associated datafiles, or dropping and rebuilding the affected objects. The same applies to activation of a standby database.

Note: Oracle Database 11g also allows logging to be forced to take place, ensuring all data changes are written to the database redo logs in a way that can be recreated in a restored backup, or propagated to a standby database. See Oracle Data Guard Concepts and Administration 11g for details of the force logging clause for database and tablespace commands.

Nologging Principles

At certain times, Oracle E-Business Suite uses the database nologging feature to perform resource-intensive work more efficiently. When an operation uses nologging, blocks of data are written directly to their data file, rather than going through the buffer cache in the System Global Area (SGA).

Instance recovery uses the online redo logs to reconstruct the SGA after a crash, rolling forward through any committed changes in order to ensure the data blocks are valid. Use of nologging does not affect instance recovery.

Database recovery requires rolling forward through the redo logs to recreate the requisite changes, and hence restore the database to the desired point in time. Since nologging operations write directly to the data files, bypassing the redo logs, the redo logs will not contain enough data to roll forward during media recovery. Instead, they will only contain enough information to mark the new blocks as invalid. Rolling forward through a nologging operation would therefore result in invalid blocks in the restored database. The same problems will potentially occur upon activating a standby database.

To make the restored backup or activated standby database usable after a nologging operation is carried out, a mechanism other than database recovery must be used to get or create current copies of the affected blocks.

There are two options, either of which may be appropriate depending on the specific circumstances:

Nologging Usage

Nologging is used in the following situations in the Oracle E-Business Suite:

Actions Needed

To monitor nologging activity in your environment, you should periodically query your production database to identify any datafiles that have experienced nologging operations. You should also run the query before and after applying an Oracle E-Business Suite patch, to determine whether any nologging activity was carried out.

A suitable query can be run via monitoring software such as Oracle Enterprise Manager. Alternatively, you can construct a query based on the unrecoverable_change# and unrecoverable_time columns of the dynamic performance view v$datafile. These are updated every time an unrecoverable or nologging operation marks blocks as invalid in the datafile.

The results of a query can be saved as a snapshot and compared to the last snapshot. You can then identify each occasion when nologging operations have been carried out in the database, and hence when you need to refresh backup datafiles with new copies that will be usable in the event of restoration being needed.
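
The snapshot comparison described above can be sketched in Python as follows. The query text uses the v$datafile columns already mentioned; the `changed_datafiles` helper, the dictionary layout of a snapshot, and the datafile names in the usage example are illustrative assumptions. The sketch also assumes that unrecoverable_change# is 0 for a datafile with no recorded nologging activity.

```python
# Query text based on the v$datafile columns named above.
NOLOGGING_QUERY = """
SELECT file#, name, unrecoverable_change#, unrecoverable_time
FROM v$datafile
"""


def changed_datafiles(previous, current):
    """Return names of datafiles whose unrecoverable_change# has advanced.

    Each snapshot maps a datafile name to its unrecoverable_change# value
    (assumed to be 0 when no nologging operation has been recorded).
    """
    changed = []
    for name, change_no in sorted(current.items()):
        # A file absent from the previous snapshot is compared against 0.
        if change_no > previous.get(name, 0):
            changed.append(name)
    return changed


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after applying a patch.
    before = {"/u01/oradata/a_txn_data01.dbf": 0, "/u01/oradata/a_summ01.dbf": 5100}
    after = {"/u01/oradata/a_txn_data01.dbf": 0, "/u01/oradata/a_summ01.dbf": 7420}
    print(changed_datafiles(before, after))
```

Any datafile reported by such a comparison is a candidate for a fresh backup.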

Disaster Recovery

A significant problem that strikes an Oracle E-Business Suite installation could put the viability of the organization at risk. Such a problem could be:

This section gives an overview of the area of disaster recovery, which can be considered as the final component of a high availability strategy. Disaster recovery involves taking steps to protect the database and its environment to ensure that they can still operate in the face of major problems. Oracle provides features such as Oracle Data Guard and Flashback Database.

You must also install any other hardware and software required to run your standby environment as a production environment after a failover, ensuring that any changes on the primary are matched on the standby. Examples include tape backup equipment and software, system management and monitoring software, and other applications.

Data Guard and Release 12

Oracle Data Guard provides mechanisms for propagating changes from one database to another, to avoid possible loss of data if one site fails. The two main variants of a Data Guard configuration are Redo Apply (often referred to as Physical Standby) and SQL Apply (often referred to as Logical Standby). Both of these use the primary database’s redo information to propagate changes to the standby database.

The secondary environment should be physically separate from the primary environment, to protect against disasters that affect the entire primary site. This necessitates having a reliable network connection between the two data centers, with sufficient bandwidth (capacity) for peak redo traffic. The other requirement is that the servers at the secondary site are the same type as at the primary site, in sufficient numbers to provide the required level of service; depending on your organization’s needs, this could either be a minimal level of service (supporting fewer users), or exactly the same level of service as you normally provide.
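
When sizing the network link, a first estimate can be obtained by converting the peak redo generation rate into a line speed. The Python sketch below shows the arithmetic only; the `required_bandwidth_mbps` helper, the 0.7 utilization factor, and the sample redo rate are planning assumptions rather than figures from this guide, and should be replaced with values measured in your own environment.

```python
def required_bandwidth_mbps(peak_redo_bytes_per_sec, utilization=0.7):
    """Estimate the bandwidth (in Mbps) needed to ship redo at the peak rate.

    utilization is an assumed fraction of the link that redo traffic may
    use; it is a planning assumption, not a figure from this guide.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be greater than 0 and at most 1")
    bits_per_sec = peak_redo_bytes_per_sec * 8
    return bits_per_sec / utilization / 1_000_000


if __name__ == "__main__":
    # Hypothetical peak of 2 MB of redo generated per second.
    print(round(required_bandwidth_mbps(2_000_000), 1), "Mbps")
```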

Data Guard’s reliance on redo generated from the production database has significant implications for operations in which Oracle E-Business Suite uses the nologging feature (described previously) to perform some resource-intensive tasks with faster throughput. Oracle recommends turning on the force logging feature at the database level to simplify your backup and recovery, and standby database maintenance procedures. In cases where the nologging feature is used in Release 12, and you have chosen not to use force logging, insufficient redo information will be generated to make the corresponding changes on the standby database. You may then be required to take manual steps to refresh the standby (or recreate the relevant objects) to ensure it will remain usable.
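
Whether force logging is currently enabled can be checked from the force_logging column of v$database. The following Python sketch assumes a DB-API 2.0 cursor obtained from whichever Oracle driver you use; the `force_logging_enabled` helper and the `FakeCursor` stand-in are hypothetical and exist only to make the example self-contained.

```python
FORCE_LOGGING_QUERY = "SELECT force_logging FROM v$database"
# Statement a DBA could run to turn the feature on at the database level.
ENABLE_FORCE_LOGGING = "ALTER DATABASE FORCE LOGGING"


def force_logging_enabled(cursor):
    """Return True if the database reports force logging as enabled.

    cursor is any DB-API 2.0 cursor connected with sufficient privileges
    to read v$database.
    """
    cursor.execute(FORCE_LOGGING_QUERY)
    row = cursor.fetchone()
    return row is not None and str(row[0]).strip().upper() == "YES"


class FakeCursor:
    """Stand-in cursor so the sketch can be exercised without a database."""

    def __init__(self, value):
        self._value = value
        self.last_sql = None

    def execute(self, sql):
        self.last_sql = sql

    def fetchone(self):
        return (self._value,)


if __name__ == "__main__":
    print(force_logging_enabled(FakeCursor("NO")))
```

Running such a check before patching indicates whether nologging operations in the patch could leave the standby database needing a manual refresh.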

Finally, based on your organization’s business requirements, choose one of the following protection modes:

Flashback Database

Oracle recommends you enable the Flashback Database feature, to:

Flashback Database enables you to rewind the database to a previous point in time without restoring backup copies of the data files. This is accomplished during normal operation by Flashback Database buffering and writing before images of data blocks into the flashback logs, which reside in the flash recovery area.
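
The idea of rewinding by means of before images can be illustrated with a toy model. The Python sketch below is not how Oracle implements Flashback Database; the `ToyFlashbackStore` class, its logical clock, and its in-memory log are simplifications invented for this example. It shows only the principle that saving a block's previous contents before each change allows the store to be returned to an earlier point in time without restoring a backup.

```python
class ToyFlashbackStore:
    """Toy block store that records before images so changes can be rewound."""

    def __init__(self):
        self.blocks = {}          # block id -> current contents
        self.flashback_log = []   # (time, block id, before image) entries
        self.clock = 0            # simple logical clock, one tick per write

    def write(self, block_id, contents):
        """Change a block, logging its previous contents first."""
        self.clock += 1
        before = self.blocks.get(block_id)  # None means the block was absent
        self.flashback_log.append((self.clock, block_id, before))
        self.blocks[block_id] = contents
        return self.clock

    def flashback_to(self, target_time):
        """Undo every change made after target_time, newest first."""
        while self.flashback_log and self.flashback_log[-1][0] > target_time:
            _, block_id, before = self.flashback_log.pop()
            if before is None:
                self.blocks.pop(block_id, None)
            else:
                self.blocks[block_id] = before
        self.clock = min(self.clock, target_time)


if __name__ == "__main__":
    store = ToyFlashbackStore()
    good_time = store.write("block-1", "original row data")
    store.write("block-1", "mistaken update")
    store.flashback_to(good_time)
    print(store.blocks)
```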

Flashback Database can also be used to flash back a primary or standby database to a point in time prior to a Data Guard role transition. In addition, a Flashback Database operation can be performed to a point in time prior to a resetlogs operation, which allows administrators more flexibility to detect and correct human errors.