Oracle® Database High Availability Best Practices
11g Release 1 (11.1)

B28282-02

2.2 Configuring Oracle Database 11g

The best practices discussed in this section apply to the Oracle Database release 11g architectures that are described in Oracle Database High Availability Overview.

The recommendations described in this section are identical for the primary and standby databases in Oracle Data Guard configurations. Adopt these practices to reduce or avoid outages, reduce the risk of corruption, and improve recovery performance.

This section contains general best practices for configuring the database:

2.2.1 Recommendations for High Availability and Fast Recoverability

Use the following best practices to reduce recovery time and increase database availability and redundancy:

2.2.1.1 Enable ARCHIVELOG Mode

ARCHIVELOG mode enables online database backup and is necessary to recover the database to a point in time later than what has been restored. Features such as Oracle Data Guard and Flashback Database require that the production database run in ARCHIVELOG mode.
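
As a sketch, you can check the current log mode and switch a NOARCHIVELOG database to ARCHIVELOG mode from SQL*Plus; the database must be cleanly shut down and mounted (not open) for the switch:

```sql
-- Check the current log mode
SELECT log_mode FROM v$database;

-- Switch to ARCHIVELOG mode from a clean mount
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
```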

See Also:

Oracle Database Administrator's Guide for more information about using automatic archiving

2.2.1.2 Configure the Size of Redo Log Files and Groups Appropriately

Use Oracle log multiplexing to create multiple redo log members in each redo group, one in the data area and one in the flash recovery area. This protects against a failure involving the redo log, such as a disk or I/O failure for one member, or a user error that accidentally removes a member through an operating system command. If at least one redo log member is available, then the instance can continue to function.

Note:

Do not multiplex the standby redo logs.

All online redo log files should be the same size and configured to switch approximately once an hour during normal activity. They should not switch more frequently than every 20 minutes during peak activity.

There should be a minimum of four online log groups to prevent the logwriter process from waiting for a group to be available following a log switch. A group might be unavailable because a checkpoint has not yet completed or because the group has not yet been archived.
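
The multiplexing and sizing guidelines above can be sketched as follows; the file paths, group numbers, and sizes are illustrative and must be adapted to your layout:

```sql
-- Add a second member to an existing group, placed in the flash recovery area
ALTER DATABASE ADD LOGFILE MEMBER '/u02/fra/redo01b.log' TO GROUP 1;

-- Create an additional group with one member in the data area
-- and one in the flash recovery area; all groups should be the same size
ALTER DATABASE ADD LOGFILE GROUP 5
  ('/u01/oradata/redo05a.log', '/u02/fra/redo05b.log') SIZE 1G;
```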

2.2.1.3 Use a Flash Recovery Area

The flash recovery area is Oracle managed disk space that provides a centralized disk location for backup and recovery files.

The flash recovery area is defined by setting the following database initialization parameters:

  • DB_RECOVERY_FILE_DEST

    This parameter specifies the default location for the flash recovery area.

  • DB_RECOVERY_FILE_DEST_SIZE

    This parameter specifies (in bytes) the hard limit on the total space to be used by database recovery files created in the recovery area location.

The Oracle Suggested Backup Strategy described in the Oracle Database 2 Day DBA recommends using the flash recovery area as the primary location for recovery. When the flash recovery area is properly sized, files needed for repair are readily available. The minimum recommended disk limit is the combined size of the database, incremental backups, all archived redo logs that have not been copied to tape, and flashback logs.
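
A minimal configuration sketch follows; the size and the disk group name are illustrative. Set the size limit before the location, because DB_RECOVERY_FILE_DEST cannot be set while DB_RECOVERY_FILE_DEST_SIZE is unset:

```sql
-- Size the recovery area first, then set its location
ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE = 200G SCOPE=BOTH SID='*';
ALTER SYSTEM SET DB_RECOVERY_FILE_DEST = '+FRA' SCOPE=BOTH SID='*';
```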

Note:

Do not configure the archived redo log files for Oracle Streams capture to reside solely in the flash recovery area. Instead, configure a separate log archive destination that is independent of the flash recovery area for the Oracle Streams capture process for the database.

This is necessary because the archived redo log files in the flash recovery area may be removed automatically due to lack of disk space or RMAN may remove logs that no longer meet the backup retention policies, even though the log files are still required by Oracle Streams.

See Also:

Oracle Database Backup and Recovery User's Guide for detailed information about sizing the flash recovery area and setting the retention period

2.2.1.4 Enable Flashback Database

Flashback Database provides an efficient alternative to point-in-time recovery for reversing unwanted database changes. Flashback Database enables you to rewind an entire database backward in time, reversing the effects of database changes within a time window. The effects are similar to database point-in-time recovery (DBPITR). You can flash back a database by issuing a single RMAN command or a SQL*Plus statement instead of using a complex procedure.

When configuring and enabling Flashback Database:

  • Ensure that the production database is running in ARCHIVELOG mode.

  • Ensure there is sufficient I/O bandwidth available to the flash recovery area to maintain flashback write throughput.

    During normal run-time activities, Flashback Database buffers and writes the before images of data blocks into the flashback logs that reside in the flash recovery area. If flashback writes are slow (as indicated by the flashback free buffer waits wait event), then database throughput is affected. The amount of disk writes caused by Flashback Database varies depending on the workload and application profile. For a typical OLTP workload that is using a flash recovery area with sufficient disk spindles and I/O throughput, the overhead incurred by Flashback Database is less than two percent.

  • If you have a standby database, then set the DB_FLASHBACK_RETENTION_TARGET initialization parameter to the same value on both the primary and standby databases.

  • For large databases, set the LOG_BUFFER initialization parameter to at least 8 MB to ensure the database allocates maximum memory (typically 16 MB) for writing Flashback Database logs.

Flashback Database can return a primary or standby database to a point in time before a role transition. In addition, you can use guaranteed restore points to flash back a database to a point in time before a RESETLOGS operation, thus providing more flexibility in detecting and correcting human errors.
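
A configuration sketch, assuming a flash recovery area is already set up; the retention target and restore point name are illustrative, and depending on your release the FLASHBACK ON step may require the database to be mounted rather than open:

```sql
-- Set the flashback window (in minutes; 1440 = 24 hours), then enable Flashback Database
ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET = 1440 SCOPE=BOTH SID='*';
ALTER DATABASE FLASHBACK ON;

-- Create a guaranteed restore point, for example before an upgrade or RESETLOGS operation
CREATE RESTORE POINT before_upgrade GUARANTEE FLASHBACK DATABASE;
```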

Flashback Database or guaranteed restore points are required when using:

  • Fast-start failover—Requires Flashback Database so that the broker can automatically reinstate the primary database following an automatic failover. Bystander standby databases that may be disabled after a failover can only be reinstated when Flashback Database is enabled.

  • Snapshot standby database—Requires a guaranteed restore point to convert the snapshot standby database back to a physical standby database.

Flashback Database is optional but recommended when performing rolling database upgrades. You should create a guaranteed restore point before performing an upgrade to fall back in case the upgrade fails. Using this method to restore the database to the pre-upgrade state is substantially quicker than using the downgrade procedure. However, note that flashing back the database to the pre-upgrade state is practical only when no application data changes have been made.

In general, the performance effect of enabling Flashback Database is minimal. However, there are some application profiles that may require special tuning or additional considerations. See support note 565535.1 at http://support.oracle.com/ for additional Flashback Database considerations and specific application use cases.

There are several data analysis methods for monitoring the Flashback Database workload on your system such as using Automatic Workload Repository (AWR), or querying the V$FLASHBACK_DATABASE_STAT or the V$SYSSTAT views. For example, you can use AWR to compare AWR snapshots collected before and after the time that you enabled Flashback Database. You can also review AWR snapshots to pinpoint system usage caused by flashback logging. See the "Monitoring Flashback Database Performance Impact" section in the Oracle Database Backup and Recovery User's Guide for more monitoring and tuning techniques.

2.2.1.5 Use Fast-Start Fault Recovery to Control Instance Recovery Time

The fast-start fault recovery feature reduces the time required to recover from a crash. It also makes the recovery bounded and predictable by limiting the number of dirty buffers and the number of redo records generated between the most recent redo record and the last checkpoint.

With this feature, the FAST_START_MTTR_TARGET initialization parameter simplifies the configuration of recovery time from instance or system failure. This parameter specifies a target for the expected recovery time objective (RTO), which is the time (in seconds) that it should take to start the instance and perform cache recovery. When you set this parameter, the database manages incremental checkpoint writes in an attempt to meet the target. If you have chosen a practical value for this parameter, then you can expect your database to recover, on average, in approximately the number of seconds you have chosen.

Outage testing (such as for node or instance failures) during peak loads is recommended.
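
For example, to target roughly 60 seconds of startup and cache recovery (an illustrative value), set the parameter and then compare the target against the database's current estimate:

```sql
-- Target about 60 seconds of instance recovery time
ALTER SYSTEM SET FAST_START_MTTR_TARGET = 60 SCOPE=BOTH SID='*';

-- Compare the target with the current estimate
SELECT target_mttr, estimated_mttr FROM v$instance_recovery;
```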

2.2.1.6 Configure to Protect from Data Corruption

By default, Oracle always validates the data blocks that it reads from disk. You should also consider one or more of the following methods to provide additional prevention and detection against corruptions caused by underlying disks, storage systems, or the I/O system:

Use Data Guard and Configure the DB_ULTRA_SAFE Initialization Parameter

Use Data Guard and configure the DB_ULTRA_SAFE initialization parameter on both the primary and standby systems for the most comprehensive data corruption prevention and detection.

  • On the primary database, set the DB_ULTRA_SAFE=DATA_AND_INDEX initialization parameter to prevent and detect data corruptions in a timely manner, and thus provide critical data protection and high availability for the Oracle Database.

    The DB_ULTRA_SAFE initialization parameter also controls other data protection behavior in Oracle Database, such as requiring ASM to perform sequential mirror write I/Os.

    Table 2-1 describes the values that the DB_ULTRA_SAFE parameter automatically assigns to the DB_BLOCK_CHECKING, DB_BLOCK_CHECKSUM, or DB_LOST_WRITE_PROTECT parameters.

    Table 2-1 Parameter Values Set by the DB_ULTRA_SAFE Initialization Parameter

    When you set DB_ULTRA_SAFE to DATA_AND_INDEX (recommended), then:

    • DB_BLOCK_CHECKING is set to FULL.

    • DB_LOST_WRITE_PROTECT is set to TYPICAL.

    • DB_BLOCK_CHECKSUM is set to FULL.

    When you set DB_ULTRA_SAFE to DATA_ONLY, then:

    • DB_BLOCK_CHECKING is set to MEDIUM.

    • DB_LOST_WRITE_PROTECT is set to TYPICAL.

    • DB_BLOCK_CHECKSUM is set to FULL.


    Note:

    When you set the DB_ULTRA_SAFE parameter, it automatically integrates and controls the behavior (described in Table 2-1) of the following initialization parameters:
    • DB_BLOCK_CHECKING detects and prevents data block corruptions.

      Block checking prevents memory and data corruptions, but it incurs some performance overhead on every block change. For many applications, the block changes are a small percentage compared to the blocks read (typically less than five percent), so the overall effect of enabling block checking is small.

    • DB_BLOCK_CHECKSUM detects redo and data block corruptions and can prevent most corruptions from happening on the physical standby database.

      Redo and data block checksums detect corruptions on the primary database and protect the standby database. This parameter requires minimal CPU resources.

    • DB_LOST_WRITE_PROTECT detects stray and lost writes.

      Lost write protection enables a physical standby database to detect lost write corruptions on both the primary and physical standby database.

    However, if you explicitly set the DB_BLOCK_CHECKING, DB_BLOCK_CHECKSUM, or DB_LOST_WRITE_PROTECT parameters in the initialization parameter file, then the DB_ULTRA_SAFE parameter has no effect and no changes are made to the parameter values. Thus, if you specify the DB_ULTRA_SAFE parameter, do not explicitly set these underlying parameters.

  • On physical standby databases, specify the DB_BLOCK_CHECKSUM and DB_LOST_WRITE_PROTECT parameters:

    • Set DB_BLOCK_CHECKSUM=FULL

      If DB_BLOCK_CHECKSUM is set to FULL, then both disk corruption and in-memory corruption are detected and the block is not written to disk, thus preserving the integrity of the physical standby database. This parameter has minimal effect on Redo Apply performance.

    • Set DB_LOST_WRITE_PROTECT=TYPICAL

      Lost write protection prevents corruptions—due to stray or lost writes on the primary—from being propagated and applied to the standby database. Setting this parameter has a negligible effect on the standby database. Moreover, setting the DB_LOST_WRITE_PROTECT initialization parameter is recommended over employing the HARD solution, because HARD does not provide full stray and lost write protection and redo application validation.

    A standby database is a database that is decoupled from the primary database and on which redo data is checked and verified. Redo Apply and SQL Apply processes perform another layer of validation on the standby database that can detect stray or lost writes and corrupted blocks caused by hardware, software, or network issues. Most of these issues cannot be detected on the primary database or may remain hidden on the primary database for a long period.

  • Enable the DB_BLOCK_CHECKING initialization parameter.

    Consider setting the DB_BLOCK_CHECKING parameter only on the primary database. Enabling DB_BLOCK_CHECKING on the standby database incurs a much higher overhead and can dramatically reduce Redo Apply performance. Testing is recommended to measure the effect on your environment.

    Note:

    Although enabling the DB_BLOCK_CHECKING parameter is recommended, doing so can significantly reduce the throughput of Redo Apply processes by as much as 50%.

    During MAA internal testing, Redo Apply throughput doubled in Oracle Database 11g, reaching 100 MB/sec for batch workloads and 50 MB/sec for OLTP workloads. Throughput dropped by 50% after enabling the DB_BLOCK_CHECKING parameter. However, in cases where the reduced throughput still surpasses the peak redo rate on the primary database, enabling the DB_BLOCK_CHECKING parameter is still advised to provide additional data corruption protection.
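
The recommendations above can be applied as in the following sketch. DB_ULTRA_SAFE is a static parameter, so the SPFILE change takes effect at the next restart; the standby parameters are dynamic:

```sql
-- Primary database: most comprehensive corruption prevention and detection
ALTER SYSTEM SET DB_ULTRA_SAFE = DATA_AND_INDEX SCOPE=SPFILE SID='*';

-- Physical standby database: detect corruptions and lost writes during Redo Apply
ALTER SYSTEM SET DB_BLOCK_CHECKSUM = FULL SCOPE=BOTH SID='*';
ALTER SYSTEM SET DB_LOST_WRITE_PROTECT = TYPICAL SCOPE=BOTH SID='*';
```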

Configure Data Recovery Advisor

Configure Data Recovery Advisor to quickly diagnose and repair data failures for non-Oracle RAC primary databases. Data Recovery Advisor periodically scans for data corruptions. See "Use the Data Recovery Advisor".

Configure Oracle Recovery Manager (RMAN)

Configure Oracle Recovery Manager (RMAN) to automate the backup and management of recovery-related files, calculate checksums when taking backups to ensure that all blocks being backed up are validated, and detect physical and logical corruptions. Periodically use the RMAN BACKUP VALIDATE CHECK LOGICAL... scan to detect corruptions. See "Configuring Backup and Recovery".
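
A periodic validation run might look like the following; the BACKUP VALIDATE command reads and checks blocks without producing backup pieces, and any corruptions it finds are recorded for later review:

```sql
RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;

-- Review any corruptions detected by the validation run
SQL> SELECT * FROM v$database_block_corruption;
```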

Configure Oracle Secure Backup

Configure Oracle Secure Backup to integrate tape backup and management into your environment to provide local and remote data protection. See "Create Fast Tape Backups Using Oracle Secure Backup".

Use ASM Redundancy

Use ASM redundancy for disk groups to provide mirrored extents that can be used by the database in the event an I/O error or corruption is encountered. For continued protection in the event of a failure, ASM redundancy provides the ability to move an extent to a different area on a disk if an I/O error occurs. The ASM redundancy mechanism is useful if you have some bad sectors returning media sense errors.

2.2.1.7 Use the Data Recovery Advisor

Use the Data Recovery Advisor for non-Oracle RAC primary databases to quickly diagnose data failures, determine and present appropriate repair options, and execute repairs at the user's request. Data Recovery Advisor reduces downtime by eliminating confusion and automating detection and repair. It can diagnose failures based on symptoms, such as:

  • Components that are not accessible because they do not exist, do not have the correct access permissions, are taken offline, and so on

  • Physical corruptions such as block checksum failures, invalid block header field values, and so on

  • Logical corruptions caused by software bugs

  • Incompatibility failures caused by an incorrect version of a component

  • I/O failures such as a limit on the number of open files exceeded, channels inaccessible, network or I/O errors, and so on

  • Configuration errors such as an incorrect initialization parameter value that prevents the opening of the database

If failures are diagnosed, then they are recorded in the Automatic Diagnostic Repository (ADR). Data Recovery Advisor intelligently determines recovery strategies by:

  • Generating repair advice and repairing failures only after failures have been detected by the database and stored in ADR

  • Aggregating failures for efficient recovery

  • Presenting only feasible recovery options

  • Indicating any data loss for each option

Typically, Data Recovery Advisor presents both automated and manual repair options. If appropriate, you can choose to have Data Recovery Advisor automatically perform a repair, verify the repair success, and close the relevant repaired failures.

See Also:

The chapter about diagnosing and repairing failures with Data Recovery Advisor in the Oracle Database Backup and Recovery User's Guide

2.2.1.8 Set DISK_ASYNCH_IO

Under most circumstances, Oracle Database automatically detects if asynchronous I/O is available and appropriate for a particular platform, and enables asynchronous I/O through the DISK_ASYNCH_IO initialization parameter. However, for optimal performance, it is always a best practice to ensure that asynchronous I/O is actually being used. Query the V$IOSTAT_FILE view to determine whether asynchronous I/O is used:

SQL> SELECT FILE_NO, FILETYPE_NAME, ASYNCH_IO FROM V$IOSTAT_FILE;

To explicitly enable asynchronous I/O, set the DISK_ASYNCH_IO initialization parameter to TRUE:

ALTER SYSTEM SET DISK_ASYNCH_IO=TRUE SCOPE=SPFILE SID='*';

Note that if you are using ASM, it performs I/O asynchronously by default.

2.2.1.9 Set LOG_BUFFER to a Minimum of 8 MB

For large production databases, set the LOG_BUFFER initialization parameter to a minimum of 8 MB. This setting ensures the database allocates maximum memory for writing Flashback Database logs. If the database is configured to transport redo data to a standby database asynchronously, then you should size the LOG_BUFFER parameter large enough to accommodate the processes involved in the network send.
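
LOG_BUFFER is a static parameter, so the change is made in the SPFILE and takes effect at the next restart; the value below is the recommended minimum:

```sql
-- 8 MB minimum for large production databases (8388608 bytes)
ALTER SYSTEM SET LOG_BUFFER = 8388608 SCOPE=SPFILE SID='*';
```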

2.2.1.10 Use Automatic Shared Memory Management

Use Automatic Shared Memory Management (ASMM) to improve memory management. By setting the SGA_TARGET parameter to a nonzero value, the shared pool, large pool, Java pool, Streams pool, and buffer cache are automatically and dynamically resized as needed. See the Oracle Database Administrator's Guide for more information.
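
For example, to let the database manage SGA components within a 4 GB target (an illustrative size; SGA_TARGET can be raised dynamically only up to SGA_MAX_SIZE):

```sql
-- Enable ASMM with a 4 GB total SGA target
ALTER SYSTEM SET SGA_TARGET = 4G SCOPE=BOTH SID='*';
```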

2.2.1.11 Disable Parallel Recovery for Instance Recovery

When the value of RECOVERY_ESTIMATED_IOS in the V$INSTANCE_RECOVERY view is small (for example, < 5000), then the overhead of parallel recovery may outweigh any benefit. This typically occurs with a very aggressive setting of FAST_START_MTTR_TARGET. In this case, set RECOVERY_PARALLELISM to 1 to disable parallel recovery.
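
A sketch of the check and the change; RECOVERY_PARALLELISM is a static parameter, so the new value takes effect at the next restart:

```sql
-- Check the estimated I/Os for instance recovery
SELECT recovery_estimated_ios FROM v$instance_recovery;

-- If the estimate is small (for example, under 5000), disable parallel recovery
ALTER SYSTEM SET RECOVERY_PARALLELISM = 1 SCOPE=SPFILE SID='*';
```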

2.2.2 Recommendations to Improve Manageability

Use the following best practices to improve Oracle Database manageability:

2.2.2.1 Use Data Recovery Advisor to Detect, Analyze, and Repair Data Failures

Data Recovery Advisor automatically diagnoses data failures, determines and presents appropriate repair options, and executes repairs at the user's request. In this context, a data failure is a corruption or loss of persistent data on disk. By providing a centralized tool for automated data repair, Data Recovery Advisor improves the manageability and reliability of an Oracle database and thus helps reduce the MTTR.

Note:

In the current release, Data Recovery Advisor only supports single-instance databases. Oracle RAC and Oracle Data Guard databases are not supported.

See Also:

The chapter about "Diagnosing and Repairing Failures with Data Recovery Advisor" in the Oracle Database Backup and Recovery User's Guide

2.2.2.2 Use Automatic Performance Tuning Features

Effective data collection and analysis is essential for identifying and correcting performance problems. Oracle provides several tools that gather information regarding database performance.

The Oracle Database automatic performance tuning features include:

  • Automatic Workload Repository (AWR)

  • Automatic Database Diagnostic Monitor (ADDM)

  • SQL Tuning Advisor

  • SQL Access Advisor

  • Active Session History Reports (ASH)

When using AWR, consider the following best practices:

  • Set the AWR automatic snapshot interval to 10-20 minutes to capture performance peaks during stress testing or to diagnose performance issues.

  • Under usual workloads, a 60-minute interval is sufficient.
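
The snapshot interval (and retention) can be adjusted with the DBMS_WORKLOAD_REPOSITORY package; the values below are illustrative:

```sql
-- Take snapshots every 15 minutes and keep 30 days of history
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    interval  => 15,        -- minutes between snapshots
    retention => 43200);    -- minutes of history retained (30 days)
END;
/
```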

2.2.2.3 Use a Server Parameter File

The server parameter file (SPFILE) enables a single, central parameter file to hold all database initialization parameters associated with all instances of a database. This provides a simple, persistent, and robust environment for managing database parameters. An SPFILE is required when using the broker.
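
If the database currently starts from a text parameter file, an SPFILE can be created from it and is picked up at the next restart; you can then confirm it is in use:

```sql
-- Create an SPFILE from the existing text parameter file
CREATE SPFILE FROM PFILE;

-- After restart: returns the SPFILE path, or NULL if none is in use
SELECT value FROM v$parameter WHERE name = 'spfile';
```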

2.2.2.4 Use Automatic Undo Management

With automatic undo management, the Oracle Database server effectively and efficiently manages undo space, leading to lower administrative complexity and cost. When Oracle Database internally manages undo segments, undo block and consistent read contention are eliminated because the size and number of undo segments are automatically adjusted to meet the current workload requirement.

To use automatic undo management, set the following initialization parameters:

  • UNDO_MANAGEMENT

    Set this parameter to AUTO.

  • UNDO_RETENTION

    Specify the desired time in seconds to retain undo data. Set this parameter to the same value on all instances.

  • UNDO_TABLESPACE

    Specify a unique undo tablespace for each instance.
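
The three parameters above can be set as in the following sketch; the tablespace name, instance name, and retention period are illustrative, and UNDO_MANAGEMENT is static, so that change takes effect at the next restart:

```sql
-- Enable automatic undo management (static parameter)
ALTER SYSTEM SET UNDO_MANAGEMENT = AUTO SCOPE=SPFILE SID='*';

-- Retain undo for 900 seconds; use the same value on all instances
ALTER SYSTEM SET UNDO_RETENTION = 900 SCOPE=BOTH SID='*';

-- One undo tablespace per instance
ALTER SYSTEM SET UNDO_TABLESPACE = undotbs1 SCOPE=BOTH SID='inst1';
```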

Advanced object recovery features, such as Flashback Query, Flashback Version Query, Flashback Transaction Query, and Flashback Table, require automatic undo management. The success of these features depends on the availability of undo information to view data as of a previous point in time.

By default, Oracle Database automatically tunes undo retention by collecting database usage statistics and estimating undo capacity needs. Unless you enable retention guarantee for the undo tablespace (by specifying the RETENTION GUARANTEE clause on either the CREATE DATABASE or the CREATE UNDO TABLESPACE statement), Oracle Database may reduce the undo retention below the specified UNDO_RETENTION value.

Note:

By default, ongoing transactions can overwrite undo data even if the UNDO_RETENTION parameter setting specifies that the undo data should be maintained. To guarantee that unexpired undo data is not overwritten, you must enable RETENTION GUARANTEE for the undo tablespace.

If there is a requirement to use Flashback technology features, then the best practice recommendation is to enable RETENTION GUARANTEE for the undo tablespace and to set a value for UNDO_RETENTION based on the following guidelines:

  1. Establish how long it would take to detect when erroneous transactions have been carried out. Multiply this value by two.

  2. Use the Undo Advisor to compute the minimum undo tablespace size based on setting UNDO_RETENTION to the value recommended in step 1.

  3. If the undo tablespace has the AUTOEXTEND option disabled, allocate enough space as determined in step 2 or reduce the value of the UNDO_RETENTION parameter.

  4. If the undo tablespace has the AUTOEXTEND option enabled, make sure there is sufficient disk space available to extend the datafiles to the size determined in step 2. Make sure the autoextend MAXSIZE value you specified is large enough.
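
Once the retention value and tablespace size are settled, the guarantee itself is a single statement (the tablespace name is illustrative):

```sql
-- Guarantee that unexpired undo is never overwritten by ongoing transactions
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;
```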

See Also:

The section about "Computing the Minimum Undo Tablespace Size Using the Undo Advisor" in Oracle Database 2 Day DBA

With the RETENTION GUARANTEE option, if the tablespace is configured with less space than the transaction throughput requires, then the following sequence of events occurs:

  1. If you have an autoextensible file, then the file automatically grows to accommodate the retained undo data.

  2. A warning alert reports the disk is at 85% full.

  3. A critical alert reports the disk is at 97% full.

  4. Transactions receive an out-of-space error.

See Also:

Oracle Database Administrator's Guide for more information about the UNDO_RETENTION setting and the size of the undo tablespace

2.2.2.5 Use Locally Managed Tablespaces

Locally managed tablespaces perform better than dictionary-managed tablespaces, are easier to manage, and eliminate space fragmentation concerns. Locally managed tablespaces use bitmaps stored in the data file headers and, unlike dictionary-managed tablespaces, do not contend for centrally managed resources for space allocations and de-allocations.

See Also:

Oracle Database Administrator's Guide for more information about locally managed tablespaces

2.2.2.6 Use Automatic Segment Space Management

Automatic segment space management simplifies space administration tasks, thus reducing the chance of human error. An added benefit is the elimination of performance tuning related to space management. It facilitates management of free space within objects such as tables or indexes, improves space utilization, and provides significantly better performance and scalability with simplified administration. The automatic segment space management feature is enabled by default for all tablespaces created using default attributes.

See Also:

Oracle Database Administrator's Guide for more information on segment space management

2.2.2.7 Use Temporary Tablespaces and Specify a Default Temporary Tablespace

Temporary tablespaces improve the concurrency of multiple sort operations, reduce sort operation overhead, and avoid data dictionary space management operations. This is a more efficient way of handling temporary segments, from the perspective of both system resource usage and database performance.

The best practice is to specify a default temporary tablespace for the entire database to ensure that temporary segments are used for the most efficient sort operations, whether or not individual users have been assigned a temporary tablespace.

To specify a default temporary tablespace when creating the database, use the DEFAULT TEMPORARY TABLESPACE clause of the CREATE DATABASE statement. To specify one after database creation, use the ALTER DATABASE statement.

Using the default temporary tablespace ensures that all disk sorting occurs in a temporary tablespace and that other tablespaces are not mistakenly used for sorting.
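
For an existing database, the sketch below creates a temporary tablespace and makes it the database default; the tablespace name, file path, and size are illustrative:

```sql
-- Create a temporary tablespace and make it the database default
CREATE TEMPORARY TABLESPACE temp_ts
  TEMPFILE '/u01/oradata/temp01.dbf' SIZE 4G;
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_ts;
```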

See Also:

Oracle Database Administrator's Guide for more information about managing tablespaces

2.2.2.8 Use Resumable Space Allocation

Resumable space allocation provides a way to suspend and later resume database operations if there is a space allocation failure. Instead of returning an error, the database suspends the affected operation; no processes need to be restarted. When the space problem is resolved, the suspended operation resumes automatically.

To use resumable space allocation, you can set it at the system level with the RESUMABLE_TIMEOUT initialization parameter, or enable it at the session level using clauses of the ALTER SESSION statement (for example, issue the ALTER SESSION ENABLE RESUMABLE statement). The default for a new session is resumable mode disabled, unless you explicitly set the RESUMABLE_TIMEOUT initialization parameter to a nonzero value.
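
Both approaches are shown in the following sketch; the timeout values and operation name are illustrative:

```sql
-- System level: suspend operations for up to 2 hours on space errors
ALTER SYSTEM SET RESUMABLE_TIMEOUT = 7200 SCOPE=BOTH SID='*';

-- Session level: enable resumable mode with a 1-hour timeout
-- and a name that makes the operation easy to find in DBA_RESUMABLE
ALTER SESSION ENABLE RESUMABLE TIMEOUT 3600 NAME 'nightly_load';
```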

See Also:

Oracle Database Administrator's Guide for more information about managing resumable space allocation

2.2.2.9 Use Database Resource Manager

The Database Resource Manager gives database administrators more control over resource management decisions, so that resource allocation can be aligned with the business objectives of an enterprise. It provides the ability to prioritize work within the Oracle Database server. Availability of the database encompasses both its functionality and performance: if the database is available but users are not getting the level of performance they need, then availability and service-level objectives are not being met. Application performance is affected, to a large extent, by how resources are distributed among the applications that access the database. By giving the Oracle Database server control over resource management decisions, the Database Resource Manager circumvents problems resulting from inefficient operating system management and operating system resource managers.
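
A minimal sketch using the DBMS_RESOURCE_MANAGER package follows: it creates a plan that favors an OLTP consumer group over all other sessions, then activates it. The plan name, group name, and CPU percentages are illustrative; a directive for OTHER_GROUPS is mandatory in every plan:

```sql
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
    plan => 'DAYTIME', comment => 'Favor OLTP work during business hours');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'OLTP', comment => 'Interactive OLTP sessions');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan => 'DAYTIME', group_or_subplan => 'OLTP',
    comment => 'High priority', mgmt_p1 => 80);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan => 'DAYTIME', group_or_subplan => 'OTHER_GROUPS',
    comment => 'Everything else', mgmt_p1 => 20);
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

-- Activate the plan
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DAYTIME' SCOPE=BOTH SID='*';
```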

See Also:

Oracle Database Administrator's Guide for more information about Database Resource Manager