Oracle® Database High Availability Overview
12c Release 1 (12.1)
This chapter describes the Oracle Database features used in MAA solutions.
Oracle Data Guard ensures high availability, data protection, and disaster recovery for enterprise data. Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable Oracle databases to survive outages of any kind, including natural disasters and data corruptions. A Data Guard standby database is an exact replica of the production database and thus can be transparently utilized in combination with traditional backup, restoration, flashback, and cluster techniques to provide the highest possible level of data protection and data availability. Data Guard is included in Oracle Enterprise Edition.
A Data Guard configuration consists of one primary database and one or more standby databases. A primary database can be either a single-instance Oracle database or an Oracle RAC database. Similar to a primary database, a standby database can be either a single-instance Oracle database or an Oracle RAC database. Using a backup copy of the primary database, you can create up to 30 standby databases that receive redo directly from the primary database. Optionally you can use a cascaded standby to create Data Guard configurations where the primary transmits redo to a single remote destination, and that destination forwards redo to multiple standby databases. This enables a primary database to efficiently synchronize many more than 30 standby databases if desired.
Note: Oracle Active Data Guard is an extension of basic Data Guard that provides advanced features to off-load various types of processing from a production database, extend zero data loss protection over any distance, and enhance high availability. Oracle Active Data Guard is licensed separately from Oracle Database Enterprise Edition. Oracle Active Data Guard is discussed more completely in Section 3.1.1, "Oracle Active Data Guard."
There are several types of standby databases. Data Guard physical standby database is the MAA best practice for data protection and disaster recovery and is the most common type of standby database used. A physical standby database uses Redo Apply (an extension of Oracle media recovery) to maintain an exact, physical replica of the production database. When configured using MAA best practices, Redo Apply uses multiple Oracle-aware validation checks to prevent corruptions that can impact a primary database from impacting the standby. Other types of Data Guard standby databases include: snapshot standby (a standby open read/write for test or other purposes) and logical standby (used to reduce planned downtime).
Continuous Oracle-aware validation of all changes using multiple checks for physical and logical consistency of structures within an Oracle data block and redo, before updates are applied to a standby database. This isolates the standby database and prevents it from being impacted by data corruptions that can occur on the primary system.
Transparent operation: There are no restrictions on the use of Data Guard physical standby for data protection. Redo Apply supports all data and storage types, all DDL operations, and all applications (custom and packaged applications), and guarantees data consistency across primary and standby databases.
Highest performance: Fast redo transport for best recovery point objective, fast apply performance for best recovery time objective.
Fast failover to a standby database to maintain availability should the primary database fail for any reason. Failover is either a manual or automatic operation depending on how Data Guard is configured.
Integrated client notification framework to enable application clients to connect to a new primary database after a failover occurs.
Automatic or automated (depending upon configuration) resynchronization of a failed primary database, quickly converting it to a synchronized standby database after a failover occurs.
Choice of flexible data protection levels to support all network configurations, availability and performance SLAs, and business requirements.
Management of a primary and all of its standby databases as a single configuration to simplify management and monitoring using either the Data Guard Broker command-line interface or Oracle Enterprise Manager Cloud Control.
Data Guard Broker 12c greatly improves manageability with additional features for comprehensive configuration health checks, resumable switchover operations, streamlined role transitions, support for cascaded standby configurations, and user-configurable thresholds for transport and apply lag to automatically monitor the ability of the configuration to support SLAs for recovery point and recovery time objectives at any instant in time.
Efficient transport to multiple remote destinations using a single redo stream originating from the primary production database and forwarded by a cascading standby database.
Snapshot Standby enables a physical standby database to be open read/write for testing or any activity that requires a read/write replica of production data. A snapshot standby continues to receive but does not apply updates generated by the primary. When testing is complete, a snapshot standby is converted back into a synchronized physical standby database by first discarding the changes made during the open read/write, and then applying the redo received from the primary database. Primary data is always protected. Snapshot standby is particularly useful when used in conjunction with Oracle Real Application Testing (workload is captured at the production database for replay and subsequent performance analysis at the standby database--an exact replica of production).
Reduction of planned downtime by utilizing a standby database to perform maintenance in rolling fashion. The only downtime is the time required to perform a Data Guard switchover; applications remain available while the maintenance is being performed. (See Section 3.3.3, "When to Use Oracle Active Data Guard and Oracle GoldenGate Together" and Table 5-7, "Oracle High Availability Solutions for System and Software Maintenance" for more details).
Increased flexibility for Data Guard configurations where the primary and standby systems may have different CPU architectures or operating systems subject to limitations defined in My Oracle Support note 413484.1 at
Efficient disaster recovery for an Oracle Database 12c container database (CDB). Data Guard failover and switchover completes using a single command at a CDB level regardless of how many databases (pluggable databases or PDBs) are consolidated within the CDB.
Data Guard 12c enables a specific administration privilege, SYSDG, to handle standard administration duties for Data Guard. This new privilege is based on the least privilege principle, in which a user is granted only the necessary privileges required to perform a specific function and no more. The SYSDBA privilege continues to work as in previous releases.
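The snapshot standby behavior described in the list above (redo keeps arriving but is not applied, test changes are discarded, then the queued redo is applied) can be pictured with a small toy model. This is only a sketch under stated assumptions: the class, its method names, and the dictionary that stands in for database contents are all hypothetical and are not an Oracle interface. In a real configuration the conversions are performed with the `ALTER DATABASE CONVERT TO SNAPSHOT STANDBY` and `ALTER DATABASE CONVERT TO PHYSICAL STANDBY` statements or through the Data Guard Broker.

```python
class SnapshotStandbyModel:
    """Toy model of a standby that can be opened read/write and later resynchronized."""

    def __init__(self, data):
        self.data = dict(data)      # contents as of the last applied redo
        self.role = "PHYSICAL"      # "PHYSICAL" or "SNAPSHOT"
        self.pending_redo = []      # redo received but not yet applied
        self._restore_image = None  # contents saved at conversion time

    def receive_redo(self, key, value):
        """Redo always arrives; it is applied only while in the PHYSICAL role."""
        self.pending_redo.append((key, value))
        if self.role == "PHYSICAL":
            self._apply_pending()

    def convert_to_snapshot(self):
        self._restore_image = dict(self.data)
        self.role = "SNAPSHOT"

    def local_write(self, key, value):
        if self.role != "SNAPSHOT":
            raise RuntimeError("a physical standby is not open read/write")
        self.data[key] = value

    def convert_to_physical(self):
        self.data = dict(self._restore_image)  # discard changes made while open read/write
        self._restore_image = None
        self.role = "PHYSICAL"
        self._apply_pending()                  # catch up with the primary

    def _apply_pending(self):
        for key, value in self.pending_redo:
            self.data[key] = value
        self.pending_redo.clear()


standby = SnapshotStandbyModel({"order_1": "open"})
standby.convert_to_snapshot()
standby.local_write("order_1", "test-cancelled")  # test activity on the snapshot
standby.local_write("scratch", "temp")
standby.receive_redo("order_2", "open")           # the primary keeps working
queued_during_test = len(standby.pending_redo)
standby.convert_to_physical()
print(standby.data)
```

After the conversion back, the model holds only primary data: the test rows are gone and the redo received during testing has been applied.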
Oracle Active Data Guard is Oracle's strategic solution for real-time data protection and disaster recovery for the Oracle database using a physical replication process. Oracle Active Data Guard also provides high return on investment in disaster recovery systems by enabling a standby database to be open read-only while it applies changes received from the primary database. Oracle Active Data Guard is a separately licensed product whose advanced features greatly expand the Data Guard capabilities included with Oracle Enterprise Edition.
Oracle Active Data Guard enables administrators to improve performance by offloading processing from the primary database to a physical standby database that is open read-only while it applies updates received from the primary database. Offload capabilities of Oracle Active Data Guard 12c were enhanced to include read-only reporting and ad-hoc queries (including DML to global temporary tables and unique global or session sequences), data extracts, fast incremental backups, redo transport compression, efficient servicing of multiple remote destinations, and the ability to extend zero data loss protection to a remote standby database without impacting primary database performance. Oracle Active Data Guard also increases high availability by performing automatic block repair and enabling High Availability Upgrades (new automation in Oracle Database 12c for more easily implementing database rolling upgrades).
Note: Oracle Active Data Guard is licensed separately as a database option for Oracle Database Enterprise Edition. All Oracle Active Data Guard capabilities are also included in an Oracle GoldenGate license for Oracle Enterprise Edition. This gives customers the choice of a standalone license for Oracle Active Data Guard, or licensing Oracle GoldenGate to acquire access to all advanced Oracle replication capabilities.
Oracle Active Data Guard inherits all of the benefits previously listed for Data Guard, plus the following:
Improves primary database performance: Production-offload to an Oracle Active Data Guard standby database of read-only applications, reporting, and ad hoc queries. Any application compatible with a read-only database can run on an Oracle Active Data Guard standby. Oracle also provides integration that enables the offloading of many Oracle E-Business Suite Reports, PeopleTools reporting, Oracle Business Intelligence Enterprise Edition (OBIEE), and Oracle TopLink applications to an Oracle Active Data Guard standby database.
Oracle Active Data Guard 12c provides new support for DML to global temporary tables and the use of sequences at the standby database. This significantly expands the number of read-only applications that can be off-loaded from production databases to an Oracle Active Data Guard standby database.
The unique ability to easily scale read performance using multiple Oracle Active Data Guard standby databases, also referred to as a Reader Farm.
Production-offload of data extracts using Oracle Data Pump or other methods that read directly from the source database.
Production-offload of the performance impact from network latency in a synchronous, zero data loss configuration where primary and standby databases are separated by hundreds or thousands of miles. Oracle Active Data Guard 12c far sync utilizes a lightweight instance (control file and archive log files, but no recovery and no data files), deployed on a system independent of the primary database. For optimal protection, the far sync instance is ideally located at the greatest distance from the primary system at which the application can still tolerate the performance impact of synchronous transport. Data Guard transmits redo synchronously to the far sync instance, and far sync forwards the redo asynchronously to a remote standby database that is the ultimate failover target. If the primary database fails, a zero data loss failover to the remote destination is executed by the same failover command used for any Data Guard configuration, by a mouse click in Oracle Enterprise Manager 12c, or by automatic failover using Data Guard Fast-Start Failover. This transparently extends zero data loss protection to a remote standby database just as if it were receiving redo directly from the primary database, while avoiding the performance impact to the primary database of WAN network latency in a synchronous configuration.
Production-offload of the overhead of servicing multiple remote standby destinations using far sync. In a far sync configuration, the primary database ships a single stream of redo to a far sync instance using synchronous or asynchronous transport. The far sync instance is able to forward redo asynchronously to as many as 29 remote destinations with zero incremental overhead on the source database.
Production-offload of CPU cycles required to perform redo transport compression. Redo transport compression can be performed by the far sync instance if the Data Guard configuration is licensed for Oracle Advanced Compression. This conserves bandwidth with zero incremental overhead on the primary database.
Production-offload and increased backup performance by moving fast incremental backups off of the primary database and to the standby database by utilizing Oracle Active Data Guard support for RMAN block change tracking.
Increased high availability using Oracle Active Data Guard automatic block repair to repair block corruptions detected at either the primary or standby, transparent to applications and users.
Increased high availability by reducing planned downtime for upgrading to new Oracle Database patch sets and database releases using the additional automation provided by high availability Upgrade, new with Oracle Active Data Guard 12c (described in Section 18.104.22.168, "Rolling Upgrade Using Oracle Active Data Guard.")
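The far sync flow described in the list above (synchronous transport to a nearby far sync instance, asynchronous forwarding to the remote standby, and a zero data loss failover) can be sketched as a toy simulation. Everything here is an illustrative assumption: the class names, the `forward_async` method, and the idea of representing redo as a list of strings are hypothetical and do not reflect how Oracle implements redo transport.

```python
class RedoReceiver:
    """Anything that stores redo records in order: a far sync instance or a standby."""

    def __init__(self, name):
        self.name = name
        self.redo = []


class FarSyncModel(RedoReceiver):
    def __init__(self, name, remote_standby):
        super().__init__(name)
        self.remote = remote_standby

    def receive_sync(self, record):
        # Synchronous hop: the primary waits for this call to return (the acknowledgement).
        self.redo.append(record)
        return True

    def forward_async(self, max_records=None):
        """Ship redo that the remote standby has not received yet; return how many were sent."""
        missing = self.redo[len(self.remote.redo):]
        if max_records is not None:
            missing = missing[:max_records]
        self.remote.redo.extend(missing)
        return len(missing)


class PrimaryModel:
    def __init__(self, far_sync):
        self.far_sync = far_sync
        self.committed = []

    def commit(self, record):
        # A commit completes only after the far sync instance has acknowledged the redo.
        if not self.far_sync.receive_sync(record):
            raise RuntimeError("no acknowledgement from far sync")
        self.committed.append(record)


remote = RedoReceiver("remote_standby")
far_sync = FarSyncModel("far_sync", remote)
primary = PrimaryModel(far_sync)

for n in range(1, 6):
    primary.commit(f"txn-{n}")
far_sync.forward_async(max_records=3)  # WAN lag: the remote standby has 3 of 5 records
lag_before_failover = len(primary.committed) - len(remote.redo)

# The primary fails. The far sync instance still holds every committed record,
# so the remote standby can receive the remainder before it becomes the primary.
far_sync.forward_async()
lost = [record for record in primary.committed if record not in remote.redo]
print(lag_before_failover, lost)
```

The point of the sketch is that the primary only ever waits on the short synchronous hop, while the remote standby can still be brought fully up to date from the far sync instance at failover time.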
Data Guard standby databases can be used to reduce planned downtime by performing maintenance in a rolling fashion. Changes are implemented first at the standby database. The configuration is allowed to run with the primary at the old version and standby at the new version until there is confidence that the new version is ready for production. A Data Guard switchover is then performed, transitioning production to the new version. The only database downtime is the time required to perform the switchover.
There are several approaches to performing maintenance in a rolling fashion using a Data Guard standby. Customer requirements and preferences determine which approach is used. The following approaches are discussed in this document:
Beginning with Oracle Database 10g, there has been increased flexibility in cross-platform support using Data Guard Redo Apply. In certain Data Guard configurations, primary and standby databases are able to run on systems having different operating systems (for example, Windows and Linux), word sizes (32-bit/64-bit), or hardware architectures. Redo Apply can also be used to migrate to Oracle Automatic Storage Management (ASM), to move from single-instance Oracle databases to Oracle RAC, to perform technology refresh, or to move from one data center to the next.
Beginning with Oracle Database 11g release 2 (11.2), Standby-First Patch Apply (physical standby using Redo Apply) can support different software patch levels between a primary database and its physical standby database for the purpose of applying and validating Oracle patches in a rolling fashion. Patches eligible for Standby-First patching include:
Database Patch Set Update (PSU)
Database Critical Patch Update (CPU)
Database bundled patch
Oracle Exadata Database Machine bundled patch
Exadata Storage Server Software patch
Any operating system, system firmware, or system changes compatible with the existing Oracle database version
Standby-First Patch Apply is supported for certified software patches for Oracle Database Enterprise Edition 11g release 2 (11.2) and later.
In each of the types of planned maintenance previously described, the configuration begins with a primary and physical standby database (in the case of migration to a new platform, or to ASM or Oracle RAC, the standby is created on the new platform). After all changes are implemented at the physical standby database, Redo Apply (physical replication) is used to synchronize the standby with the primary. A Data Guard switchover is used to transfer production to the standby (the new environment).
See Also:
My Oracle Support Note 413484.1 at http://support.oracle.com/ for information about mixed platform combinations supported in a Data Guard configuration.
My Oracle Support Note 1265700.1 at http://support.oracle.com/ for more information about Standby-First Patch Apply, and the README for each patch to determine if a target patch is certified as being a Standby-First Patch.
There are numerous types of maintenance tasks that are unable to use Redo Apply (physical replication) to synchronize the original version of a database with the changed or upgraded version. These tasks include:
Database patches or upgrades that are not Standby-First Patch Apply-eligible. This includes database patch-sets (22.214.171.124 to 126.96.36.199) and upgrade to new Oracle Database releases (188.8.131.52 to 12.1).
Maintenance that modifies the physical structure of a database and would otherwise require downtime (for example, adding partitioning to non-partitioned tables, changing Basicfile LOBs to Securefile LOBs, changing XML-CLOB to Binary XML, or altering a table to be OLTP-compressed).
All of the previous types of maintenance can be performed in a rolling fashion using a Data Guard standby database by using Data Guard SQL Apply (logical replication) to synchronize the old and new versions of the database. Prior to Oracle Database 11g this required creating a logical standby database, performing the maintenance on the logical standby, resynchronizing the standby with the primary, and then switching over. Additionally if a physical standby was being used for disaster recovery, then a new physical standby database would have to be created from a backup of the production database at the new version. This represented a number of logistical and cost challenges when upgrading a multi-terabyte database.
Beginning with Oracle Database 11g, database rolling upgrades can use a new procedure called Transient Logical that begins and ends with a physical standby database. SQL Apply is only used during the phase when Data Guard is synchronizing across old and new versions. A new logical standby database does not need to be created if there is already a physical standby in place. A new physical standby database does not need to be created from a backup of the production database at the new version after the maintenance is complete. Similar to the traditional process of upgrading a Data Guard configuration having an in-place physical standby, the original primary is upgraded or changed using redo from the new primary database and Redo Apply (a single catalog upgrade migrates both primary and standby databases to the new Oracle release).
Transient Logical upgrades require that the primary database be at Oracle Database 11g release 1 (11.1) or later and that the database meet the pre-requisites of SQL Apply.
Oracle provides a Bourne shell script that automates a number of the manual steps required by the Transient Logical rolling upgrade process. See the MAA Best Practice Paper "Database Rolling Upgrades Made Easy" at http://www.oracle.com/technetwork/database/features/availability/maa-wp-11g-upgrades-made-easy-131972.pdf for more information.
Oracle Database 12c introduces rolling upgrade using Oracle Active Data Guard, which provides a simpler, more automated, and more easily repeatable method for reducing planned downtime than the manual Transient Logical rolling upgrade procedure previously described. Rolling upgrade using Oracle Active Data Guard replaces the 42 or more steps required by the manual procedure with a small number of easy-to-use procedures in the DBMS_ROLLING PL/SQL package.
A rolling upgrade using Oracle Active Data Guard uses the following steps:
Generate an upgrade plan with a configuration-specific set of instructions to guide the administrator through the upgrade process
Modify parameters of the rolling upgrade
Install new software at all databases participating in the upgrade
Configure primary and standby databases participating in the upgrade
Upgrade or make changes to the standby database
Switch over to move production to the new version
Switchover is the only downtime required
Restart the former primary using the new binaries, if appropriate
Complete the upgrade of the old primary and any additional standby databases in the Data Guard configuration and resynchronize them with the new primary
Rolling upgrade using Oracle Active Data Guard has the following benefits:
Provides a simple specify-compile-execute protocol
Catches configuration errors at the compilation step
Runtime errors are detected during execution
The state is kept in the database
Enables a reliable, repeatable process
Runtime steps are constant regardless of how many databases are involved
Handles failure at the original primary database
Enables data protection for the upgraded primary at all times
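The specify-compile-execute protocol named in the benefits above can be illustrated with a short toy model. It is a sketch only: the `RollingUpgradePlan` class, its phase names, the database names, and the parameter name are hypothetical, and the code does not call or reproduce the DBMS_ROLLING package. The real package exposes procedures such as INIT_PLAN, SET_PARAMETER, BUILD_PLAN, START_PLAN, SWITCHOVER, and FINISH_PLAN; consult Oracle Data Guard Concepts and Administration for their actual arguments and behavior.

```python
class RollingUpgradePlan:
    """Toy specify-compile-execute plan; not the DBMS_ROLLING implementation."""

    PHASES = ("START", "SWITCHOVER", "FINISH")

    def __init__(self, databases, future_primary):
        # Specify: name the participants and the standby that will become the primary.
        self.databases = list(databases)
        self.future_primary = future_primary
        self.parameters = {}
        self.built = False
        self.completed = []  # stands in for the state that is kept in the database

    def set_parameter(self, name, value):
        self.parameters[name] = value
        self.built = False  # any change means the plan must be rebuilt

    def build(self):
        # Compile: catch configuration errors before anything is executed.
        errors = []
        if len(self.databases) < 2:
            errors.append("need a primary and at least one standby")
        if self.future_primary not in self.databases:
            errors.append(f"unknown future primary: {self.future_primary}")
        if errors:
            raise ValueError("; ".join(errors))
        self.built = True

    def run_next(self):
        # Execute: run the next unfinished phase, so a rerun resumes where it stopped.
        if not self.built:
            raise RuntimeError("build the plan before executing it")
        for phase in self.PHASES:
            if phase not in self.completed:
                self.completed.append(phase)
                return phase
        return None


# A configuration error surfaces at build time, before any phase runs.
bad_plan = RollingUpgradePlan(["chicago", "boston"], future_primary="denver")
try:
    bad_plan.build()
    build_failed = False
except ValueError:
    build_failed = True

# The runtime steps are the same three calls however many databases take part.
plan = RollingUpgradePlan(["chicago", "boston", "dallas", "austin"], future_primary="boston")
plan.set_parameter("hypothetical_lag_limit_seconds", 30)
plan.build()
phases_run = [plan.run_next() for _ in plan.PHASES]
print(phases_run)
```

Separating the build step from execution is what lets configuration mistakes be reported before any database is touched, and recording completed phases is what makes the process repeatable and resumable.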
Rolling upgrade using Oracle Active Data Guard requires an Oracle Active Data Guard license and that the primary database be at Oracle Database 12c release 1 (12.1) or later and that the database satisfy prerequisites of SQL Apply. If the primary database is on an earlier Oracle Database release, see the MAA white paper "Database Rolling Upgrades Made Easy" at
See Also: Oracle Data Guard Concepts and Administration "Using DBMS_ROLLING to Perform a Rolling Upgrade."
Oracle GoldenGate is Oracle's strategic logical replication solution for data distribution and data integration. Unlike replication solutions from other vendors, Oracle GoldenGate is more closely integrated with Oracle Database while also providing an open, modular architecture ideal for replication across heterogeneous database management systems. This combination of attributes eliminates compromise, making Oracle GoldenGate the preferred replication solution for addressing requirements that span Oracle Database and non-Oracle Database environments.
Oracle GoldenGate offers a real-time, log-based change data capture and replication software platform. The software provides capture, routing, transformation, and delivery of transactional data across heterogeneous databases in real time.
A typical environment includes a capture, pump, and delivery process. Each of these processes can run on most of the popular operating systems and databases, including Oracle Database and non-Oracle databases. All or a portion of the data may be replicated, and the data within any of these processes can be manipulated for not only heterogeneous environments but also different database schemas. Oracle GoldenGate supports multimaster replication, hub-and-spoke deployment, and data transformation.
Oracle GoldenGate 11g Release 2 offers significant new features that greatly enhance its replication capabilities and integration with Oracle Database. The new features include:
GoldenGate Integrated Capture for Oracle Database. The Integrated Capture mechanism relies on Oracle Database internal log parsing and processing to capture DML transactions. By moving closer to the Oracle database engine, Oracle GoldenGate can take advantage of new Oracle Database features and functionality more quickly. Integrated capture also supports two different deployment models: Local capture which runs on the source database and Downstream capture which runs on a separate server. With Integrated Capture, Oracle GoldenGate now supports all compression used by Oracle Database and Oracle Exadata, including support for Hybrid Columnar Compression (HCC), OLTP, and Segment compression. Integrated Capture also adds distributed transaction support for XA (distributed) and PDML (parallel DML) transactions on Oracle RAC. Finally, XML Object Relational and XML Binary data types are supported along with LOB full and partial reads (selective update) from the redo log.
Intelligent conflict detection and resolution for active-active or multimaster implementations. Setting up conflict detection and resolution takes less time to implement using prebuilt functions to identify the conditions under which an error occurs and how to handle the record when the condition occurs.
Expanded globalization capability for international customers including multibyte support and character set conversions.
Improved security with added support for the industry encryption standard Federal Information Processing Standards (FIPS).
Improved performance with reduction of acknowledgement messages during data replication.
Enhanced manageability and monitoring for customers wanting to take advantage of Oracle Enterprise Manager.
Expanded heterogeneity with new real-time capture support for IBM i-Series (AS/400) and delivery support for Postgres.
Oracle GoldenGate logical replication enables all databases in an Oracle GoldenGate configuration, both source and target databases, to be open read/write. This fact combined with Oracle GoldenGate advanced replication features make it a key component of MAA for addressing a broad range of high availability challenges for zero downtime maintenance, cross platform migration, and continuous data availability, specifically:
Zero or near-zero downtime maintenance. In this regard Oracle GoldenGate provides greater flexibility than the basic capabilities provided by Data Guard. Oracle GoldenGate source and target databases can have a different physical and logical structure, can reside on different hardware and operating system architectures, can span wide differences in Oracle Database releases (for example, 9i to 12c), or be a mix of Oracle and non-Oracle systems. Maintenance is first performed on a target database while production runs on the source. After the maintenance is complete, production can be moved to the target all at once, similar to a Data Guard switchover. Optionally, bidirectional replication can be used to gradually move users over to the new system to create the perception of zero downtime. In either case, Oracle GoldenGate replication can be enabled in the reverse direction to keep the original source database synchronized during a transition period, making it simple to effect a planned fall-back to the previous version if needed with minimal downtime and no data loss.
Zero or near-zero downtime application upgrades. Application upgrades that modify back-end database objects typically result in significant planned downtime while maintenance is being performed. Oracle GoldenGate replication enables data transformations that map database objects used by a previous version of an application to objects modified by the new version of an application. This enables database maintenance to be performed on a separate copy of the production database without impacting the availability of the application. After the maintenance is complete and Oracle GoldenGate has finished synchronizing old and new versions, users can be switched to the new version of the application.
Oracle GoldenGate enables read/write access to a replica database while it is being synchronized with its source database. This is most often used to offload reporting to a copy of a production database when the reporting application requires a read/write connection to the database in order to function. This is also relevant, however, to certain disaster recovery environments where the nature of the technology used for the application tier requires an active read/write connection to the DR database at all times in order to meet recovery time objectives. Oracle GoldenGate would be used in this latter case in place of Oracle Active Data Guard, trading the additional data protection, simplicity, and transparency offered by an Oracle Active Data Guard standby for a failover target that is always open read/write.
Multimaster and bidirectional replication architectures where there are multiple databases that all contain the same data and are synchronized by Oracle GoldenGate. An update at any database is immediately replicated to all other databases. Update conflicts are handled by the application, by conflict handlers configured using Oracle GoldenGate, or by manual resolution. This is an architecture that emphasizes workload balancing and data availability over simplicity of operation. Each Oracle GoldenGate source can also be protected by a Data Guard standby database for optimal disaster protection. Optionally, due to cost considerations, each Oracle GoldenGate replica can be used to provide both data availability and DR protection to avoid the additional cost of a Data Guard physical standby database.
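To make the update conflicts mentioned for multimaster replication concrete, the following toy sketch detects a conflict by comparing the row image the source saw before its update with the row currently at the target, and resolves it with a latest-timestamp-wins rule. This is only one possible policy, chosen here for illustration. The types, field names, and the `apply_change` function are hypothetical; they are not Oracle GoldenGate parameters or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowVersion:
    value: str
    modified_at: int  # hypothetical logical timestamp


@dataclass(frozen=True)
class Change:
    key: str
    before: RowVersion  # row image the source saw before its update
    after: RowVersion   # row image after the source's update


def apply_change(table, change):
    """Apply a replicated update to `table` and report what happened."""
    current = table[change.key]
    if current == change.before:
        table[change.key] = change.after
        return "applied"
    # Conflict: the target row was changed independently of the source.
    if change.after.modified_at > current.modified_at:
        table[change.key] = change.after
        return "conflict-took-remote"
    return "conflict-kept-local"


# Two sites start with the same row, then each updates it before hearing from the other.
site_a = {"acct": RowVersion("100", 1)}
site_b = {"acct": RowVersion("100", 1)}

a_change = Change("acct", site_a["acct"], RowVersion("150", 10))
site_a["acct"] = a_change.after
b_change = Change("acct", site_b["acct"], RowVersion("90", 12))
site_b["acct"] = b_change.after

# Each site now receives the other's change.
result_at_b = apply_change(site_b, a_change)
result_at_a = apply_change(site_a, b_change)
print(result_at_a, result_at_b, site_a == site_b)
```

Because both sites apply the same deterministic rule, they converge on the same row even though each saw the conflicting updates in a different order.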
See Also: Oracle GoldenGate Documentation
While Oracle Active Data Guard and Oracle GoldenGate are each capable of maintaining a synchronized copy of an Oracle database, each has unique characteristics that result in high availability architectures that can use one technology or the other, or both at the same time, depending upon requirements. Examples of MAA Best Practice guidelines for use cases relevant to Oracle Database 12c are as follows:
Use Oracle Active Data Guard when the emphasis is on simplicity, data protection, and availability:
Simplest, fastest, one-way replication of a complete Oracle database.
No restrictions: Data Guard Redo Apply supports all data and storage types and Oracle features; transparent replication of DDL
Features optimized for data protection: Detects silent corruptions that can occur on source or target; automatically repairs corrupt blocks
Synchronized standby open read-only provides simple read-only offloading for maximum ROI
Transparency of backups: A Data Guard primary and standby are physically exact copies of each other; RMAN backups are completely interchangeable
Zero data loss protection at any distance, without impacting database performance
Minimizing planned downtime and risk using standby first patching, database rolling upgrades, and select platform migrations
Reduce risk of introducing change by dual purposing a DR system for testing using Data Guard Snapshot Standby
Integrated automatic database and client failover
Integrated management of a complete configuration: Data Guard Broker command line interface or Oracle Enterprise Manager Cloud Control
Use Oracle GoldenGate when the emphasis is on advanced replication requirements not addressed by Oracle Active Data Guard:
Any requirement where the replica database must be open read/write while synchronizing with the primary database
Any advanced replication requirements such as multimaster and bidirectional replication, subset replication, many-to-one replication, cross-endian replication, and data transformations.
Maintenance and migrations where zero downtime or near zero downtime is required. Oracle GoldenGate can be used to migrate between application versions, for example, from Application 1.0 to Application 2.0 without downtime.
Database rolling upgrades where it is desired to replicate from new version down to the old version for the purpose of fast fall-back if something is wrong with the upgrade.
Zero downtime planned maintenance where bidirectional replication is used to gradually migrate users to the new version, creating the perception of zero downtime. Note that bidirectional replication requires avoiding or resolving update conflicts that can occur on disparate databases.
Oracle Active Data Guard and Oracle GoldenGate are not mutually exclusive. The following are use cases of high availability architectures that include the simultaneous use of Oracle Active Data Guard and Oracle GoldenGate:
An Oracle Active Data Guard standby is utilized for disaster protection and database rolling upgrades for a mission critical OLTP database. At the same time, Oracle GoldenGate is used to extract data from the Data Guard primary database (or from the standby database using Oracle GoldenGate ALO mode) for ETL update of an enterprise data warehouse.
Oracle GoldenGate subset replication is used to create an operational data store (ODS) that extracts, transforms, and aggregates data from numerous data sources. The ODS supports mission critical application systems that generate significant revenue for the company. An Oracle Active Data Guard standby database is used to protect the ODS, providing optimal data protection and availability.
Oracle GoldenGate bidirectional replication is utilized to synchronize two databases separated by thousands of miles. User workload is distributed across each database based upon geography, workload, and service level using Oracle 12c Global Data Services (GDS). Each Oracle GoldenGate copy has its own local synchronous Data Guard standby database that enables zero data loss failover if an outage occurs. Oracle GoldenGate capture and apply processes are easily restarted on the new primary database following a failover because the primary and standby are exact, up-to-date replicas of each other.
An Oracle Active Data Guard standby database that is used in steady state for disaster protection is temporarily converted into an Oracle GoldenGate target for the purpose of performing planned maintenance not supported by Data Guard. In this example, assume a Siebel application upgrade requires modification of back-end database objects. The process used is as follows:
1. Before maintenance is performed, the Data Guard apply process is suspended and a guaranteed restore point is set on both the primary and standby database. For clarity, the original standby database is designated as the upgrade target.
2. The upgrade target is activated as a primary database, and all database changes required by the new version of the application are implemented.
3. Meanwhile, the original version of the application is available without disruption to the primary database.
4. Data Guard transport continues to ship redo for primary transactions to the upgrade target. Redo is not applied and instead is retained in archive logs for protection if something is wrong with the upgrade or if a failover is required before the upgrade is complete. In such a case, the upgrade target can be quickly flashed back to the guaranteed restore point set in Step 1 and converted back into a synchronized standby of the primary database.
5. Oracle GoldenGate capture is also started on the original primary starting at the guaranteed restore point set in Step 1, but replication to the upgrade target is deferred until the maintenance is complete.
6. When all maintenance at the upgrade target is complete, Oracle GoldenGate replication is enabled, and the upgrade target is synchronized with all of the transactions that occurred at the original primary while maintenance was performed.
7. When synchronization is complete, production is switched to the upgrade target, making it the new primary database.
8. The original primary is flashed back to the guaranteed restore point set in Step 1 and converted into a physical standby database. Data Guard transport is restarted pointing from the new primary operating at the new version, back to the original primary, which is now a physical standby database.
9. The physical standby is then upgraded using the redo stream received from the new primary. Both systems, primary and physical standby are now at the new version. The only downtime experienced by users is the time required for switchover in Step 7.
Oracle GoldenGate is used for zero downtime or near-zero downtime maintenance not supported by Data Guard in a configuration that operates steady state with a primary database and an Oracle Active Data Guard standby. This differs from the previous scenario by creating a parallel primary/standby environment running on the new platform and version that is completely separate from production. While production continues to run unaffected on the original primary/standby systems, Oracle GoldenGate one-way replication (near-zero downtime) or bidirectional replication (zero downtime) is configured between old and new environments. When Oracle GoldenGate has completed synchronizing old and new environments, production is switched to the new environment and the old environment is decommissioned. This provides zero or minimal downtime depending upon configuration, eliminates risk by providing complete isolation between the old and new environment, and avoids any impact to data protection and availability SLAs if problems are encountered during the upgrade process.
Recovery Manager (RMAN) is an Oracle Database utility to manage database backup and, more importantly, the recovery of the database. RMAN eliminates operational complexity while providing superior performance and availability of the database.
RMAN determines the most efficient method of executing the requested backup, restoration, or recovery operation and then submits these operations to the Oracle Database server for processing. RMAN and the server automatically identify modifications to the structure of the database and dynamically adjust the required operation to adapt to the changes.
Support for cross-platform backup and restore (new in Oracle Database 12c)
Network-enabled restoration allows the RESTORE operations to copy data files directly from one database to another over the network (new in Oracle Database 12c)
Simplified table restoration with the RECOVER TABLE command (new in Oracle Database 12c)
Automatic channel failover on backup and restore operations
Automatic failover to a previous backup when the restore operation discovers a missing or corrupt backup
Automatic creation of new database files and temporary files during recovery
Automatic recovery through a previous point-in-time recovery—recovery through reset logs
Block media recovery, which enables the data file to remain online while fixing the block corruption
Fast incremental backups using block change tracking
Fast backup and restore operations with intrafile and interfile parallelism
Lower space consumption when creating a database over the network by eliminating staging areas
Merger of incremental backups into image copies in the background, providing up-to-date recoverability
Optimized backup and restoration of required files only
Retention policy to ensure that relevant backups are retained
Ability to resume backup and restoration of previously failed operations
Automatic backup of the control file and the server parameter file, ensuring that backup metadata is available in times of database structural changes and media failure and disasters
Online backup that does not require you to place the database into hot backup mode
Easily reinstantiate a new database from an existing backup or directly from the production database (thus eliminating staging areas) using the
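Two of the items in the list above, fast incremental backups using block change tracking and the merger of incremental backups into image copies, can be illustrated with a minimal sketch. It is a conceptual model only and does not reproduce RMAN internals: the `TrackedDatafile` class and the `merge_into_image_copy` function are hypothetical names, and a data file is modeled as a simple list of blocks.

```python
class TrackedDatafile:
    """Toy data file that remembers which blocks changed since the last backup."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.changed = set()  # block numbers modified since the last backup

    def write(self, block_no, data):
        """Modify a block and record its number in the change-tracking set."""
        self.blocks[block_no] = data
        self.changed.add(block_no)

    def incremental_backup(self):
        """Copy only the tracked blocks, then reset the tracking set.

        No full scan of the file is needed, which is the point of
        tracking changes as they happen.
        """
        backup = {n: self.blocks[n] for n in sorted(self.changed)}
        self.changed.clear()
        return backup


def merge_into_image_copy(image_copy, incremental):
    """Roll an image copy forward by overlaying the changed blocks."""
    merged = list(image_copy)
    for block_no, data in incremental.items():
        merged[block_no] = data
    return merged
```

In this model, an image copy taken once and repeatedly merged with incremental backups stays identical to the current data file, which is why restoring from it requires little additional recovery.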
Oracle Secure Backup is a centralized tape backup management solution, providing heterogeneous data protection in distributed UNIX, Linux, Windows, and Network Attached Storage (NAS) environments. By protecting file system and Oracle Database data, Oracle Secure Backup provides a complete tape backup solution for your IT environment.
Oracle Secure Backup is tightly integrated with RMAN to provide the media management layer for RMAN. With optimized integration points, Oracle Secure Backup and RMAN provide the fastest and most efficient tape backup capability for Oracle Database.
You can back up distributed servers to local and remote tape devices from a central Oracle Secure Backup administrative server using backup policies, calendar-based scheduling for lights out operations, or on-demand backup for immediate requirements. With its highly scalable client/server architecture, Oracle Secure Backup provides local and remote data protection, using Secure Sockets Layer (SSL) for secure intradomain communication and two-way server authentication.
Optimized performance achieving 25-40% faster Oracle Database backups than comparable media management products with up to 10% less CPU utilization
Unused block and undo block compression
Shared tape buffers with RMAN
Policy-based management that allows backup administrators to exercise precise control over the backup domain
Dynamic drive sharing for increased tape resource use
Heterogeneous Storage Area Network (SAN) support, enabling NAS, UNIX, Windows, and Linux to share tape drives and media
File system backup at the file, directory, file system, or raw partition level with full, incremental, and offsite backup scheduling
Integration with Oracle Enterprise Manager, providing an intuitive, familiar interface
Backup encryption to tape with policy-based encryption key management leveraging either Oracle Secure Backup host-based encryption or hardware encryption (tape drive)
Broad tape-device support for new and legacy tape devices in SAN and SCSI environments
Network Data Management Protocol (NDMP) support for highly efficient backup of NAS files
Scalable, low-cost licensing model that reduces IT costs and operational considerations
Enhanced data throughput using Reliable Datagram Socket over Remote Direct Memory Access (RDS/RDMA) over InfiniBand networks for maximum backup and restore performance in Exadata Database Machine environments
Oracle-aware backup and restoration on Non-Uniform Memory Access (NUMA) machines, ensuring OSB and Oracle Database background processes communicate in the same NUMA region for optimal performance
See Also: Oracle Secure Backup Administrator's Guide
Oracle RAC and Oracle Clusterware enable Oracle Database to run any packaged or custom application across a set of clustered servers. This capability provides the highest levels of availability and the most flexible scalability. If a clustered server fails, then Oracle Database continues running on the surviving servers. When more processing power is needed, you can add another server without interrupting access to data.
Oracle RAC enables multiple instances that are linked by an interconnect to share access to an Oracle database. In an Oracle RAC environment, Oracle Database runs on two or more systems in a cluster while concurrently accessing a single shared database. The result is a single database system that spans multiple hardware systems, enabling Oracle RAC to provide high availability and redundancy during failures in the cluster. Oracle RAC accommodates all system types, from read-only data warehouse systems to update-intensive online transaction processing (OLTP) systems.
Oracle Clusterware is software that, when installed on servers running the same operating system, enables the servers to be bound together to operate as if they are one server, and manages the availability of user applications and Oracle databases. Oracle Clusterware also provides all of the features required for cluster management, including node membership, group services, global resource management, and high availability functions:
For high availability, you can place Oracle databases (single-instance or Oracle RAC databases), and user applications (Oracle and non-Oracle) under the management and protection of Oracle Clusterware so that the databases and applications restart when a process fails or so that a failover to another node occurs after a node failure.
For cluster management, Oracle Clusterware presents multiple independent servers as if they are a single-system image or one virtual server. This single virtual server is preserved across the cluster for all management operations, enabling administrators to perform installations, configurations, backups, upgrades, and monitoring functions. Then, Oracle Clusterware automatically distributes the execution of these management functions to the appropriate nodes in the cluster.
Oracle Clusterware is a requirement for using Oracle RAC. Oracle Clusterware is the only clusterware that you need for most platforms on which Oracle RAC operates. Although Oracle Database continues to support third-party clusterware products on specified platforms, using Oracle Clusterware provides these main benefits:
Dispenses with proprietary vendor clusterware
Uses an integrated software stack from Oracle that provides everything from disk management with local or remote Oracle Automatic Storage Management (Oracle Flex ASM) to data management with Oracle Database and Oracle RAC
Can be configured in large clusters, called Oracle Flex Clusters.
In addition, Oracle Database features, such as Oracle Service, use the underlying Oracle Clusterware mechanisms to provide their capabilities.
Oracle Clusterware requires two clusterware components: a voting disk to record node membership information and the Oracle Cluster Registry (OCR) to record cluster configuration information. The voting disk and the OCR must reside on shared storage. Oracle Clusterware requires that each node be connected to a private network over a private interconnect.
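The role of node membership in avoiding split-brain can be illustrated with a minimal sketch. It assumes a generic rule in which, after the private interconnect splits the cluster into groups of nodes that can still see each other, the largest group survives and ties go to the group containing the lowest node number. This rule is an assumption made for illustration and is not taken from the Oracle Clusterware documentation; `choose_survivors` is a hypothetical name and node identifiers are assumed to be integers.

```python
def choose_survivors(partitions):
    """Decide which sub-cluster keeps running after a network split.

    partitions: list of disjoint, non-empty sets of integer node ids,
    one set per group of nodes that can still communicate.
    Returns (survivors, evicted) as two sets of node ids.
    """
    if not partitions:
        return set(), set()
    # Largest group wins; on equal size, the group with the lowest node id wins.
    winner = max(partitions, key=lambda p: (len(p), -min(p)))
    evicted = set().union(*(p for p in partitions if p is not winner))
    return set(winner), evicted
```

Because exactly one group is ever selected, two groups can never both continue to write to the shared database.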
Oracle Clusterware provides the following benefits:
Tolerates and quickly recovers from computer and instance failures.
Simplifies management and support by means of using Oracle Clusterware together with Oracle Database. By using fewer vendors and an all Oracle stack you gain better integration compared to using third-party clusterware.
Performs rolling upgrades for system and hardware changes. For example, you can apply Oracle Clusterware upgrades, patch sets, and interim patches in a rolling fashion.
When you upgrade to Oracle Database 12c, Oracle Clusterware and Oracle ASM binaries are installed as a single binary called the Oracle Grid Infrastructure. You can upgrade Oracle Clusterware in a rolling manner from Oracle Clusterware 10g and Oracle Clusterware 11g; however, you can only upgrade Oracle ASM in a rolling manner from Oracle Database 11g release 1 (11.1).
Automatically restarts failed Oracle processes.
Automatically restarts resources from failed nodes on surviving nodes.
Controls Oracle processes as follows:
For Oracle RAC databases, Oracle Clusterware controls all Oracle processes by default.
For Oracle single-instance databases, Oracle Clusterware enables you to configure the Oracle processes into a resource group that is under the control of Oracle Clusterware.
Provides an application programming interface (API) for Oracle and non-Oracle applications that enables you to control other Oracle processes with Oracle Clusterware, for example to restart them or to react to failures according to certain rules.
Manages node membership and prevents split-brain syndrome in which two or more instances attempt to control the database.
Provides the ability to perform rolling release upgrades of Oracle Clusterware, with no downtime for applications.
Together, Oracle RAC and Oracle Clusterware provide all of the Oracle Clusterware benefits listed in Section 3.6.1 plus the following benefits:
Provides better integration and support of Oracle Database by using an all Oracle software stack compared to using third-party clusterware.
Relocate Oracle Service automatically. Plus, when you perform additional fast application notification (FAN) and client configuration, distribute FAN events so that applications can react immediately to achieve fast, automatic, and intelligent connection and failover.
Detect connection failures fast and automatically, and remove terminated connections for any Java application using Oracle Universal Connection Pool (Oracle UCP) Fast Connection Failover and FAN events.
Allow the flexibility to increase processing capacity using commodity hardware without downtime or changes to the application.
Provide comprehensive manageability integrating database and cluster features.
Provide scalability across database instances.
Oracle Real Application Clusters One Node (Oracle RAC One Node) is a single instance of an Oracle RAC database that runs on one node in a cluster. This feature enables you to consolidate many databases into one cluster with minimal overhead, protecting them from both planned and unplanned downtime. The consolidated databases reap the high availability benefits of failover protection, online rolling patch application, and rolling upgrades for the operating system and Oracle Clusterware.
Oracle RAC One Node enables better availability than cold failover for single-instance databases because of the Oracle technology called online database relocation, which intelligently migrates database instances and connections to other cluster nodes for high availability and load balancing. Online database relocation is performed using the Server Control Utility (SRVCTL).
Oracle RAC One Node provides the following:
Always available single-instance database services
Built-in cluster failover for high availability
Live migration of instances across servers
Online rolling patches and rolling upgrades for single-instance databases
Online upgrade from single-instance to multiple-instance Oracle RAC
Better consolidation for database servers
Enhanced server virtualization
Lower cost development and test platform for full Oracle RAC
Relocation of Oracle RAC primary and standby databases configured with Data Guard. This functionality is available starting with Oracle Database 11g Release 2.
Oracle RAC One Node also facilitates the consolidation of database storage, standardizes your database environment, and, when necessary, enables you to transition to a full, multiple-instance Oracle RAC database without downtime or disruption.
Significantly less work to provision database storage
Higher level of availability
Elimination of the expense, installation, and maintenance of specialized storage products
Unique capabilities for database applications
For optimal performance, Oracle ASM spreads files across all available storage. To protect against data loss, Oracle ASM extends the concept of SAME (stripe and mirror everything) and adds more flexibility because it can mirror at the database file level rather than at the entire disk level.
More important, Oracle ASM simplifies the processes of setting up mirroring, adding disks, and removing disks. Instead of managing hundreds or possibly thousands of files (as in a large data warehouse), DBAs using Oracle ASM create and administer a larger-grained object called a disk group. The disk group identifies the set of disks that are managed as a logical unit. Automation of file naming and placement of the underlying database files save administrators time and ensure adherence to standard best practices.
The Oracle ASM native mirroring mechanism (two-way or three-way) protects against storage failures. With Oracle ASM mirroring, you can provide an additional level of data protection with the use of failure groups. A failure group is a set of disks sharing a common resource (disk controller or an entire disk array) whose failure can be tolerated. After it is defined, an Oracle ASM failure group intelligently places redundant copies of the data in separate failure groups. This ensures that the data is available and transparently protected against the failure of any component in the storage subsystem.
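The idea of placing redundant copies in separate failure groups can be illustrated with a minimal sketch. The round-robin placement below is an assumption chosen for simplicity and does not describe the actual Oracle ASM allocation algorithm; `place_mirrors` and the group and disk names are hypothetical.

```python
def place_mirrors(extent_id, failure_groups, copies=2):
    """Choose one disk per copy of an extent, each in a different failure group.

    failure_groups: dict mapping a failure group name to a non-empty
    list of disk names. copies=2 models two-way mirroring, copies=3
    models three-way mirroring.
    Returns a list of (failure_group, disk) tuples, one per copy.
    """
    names = sorted(failure_groups)
    if copies > len(names):
        raise ValueError("need at least as many failure groups as copies")
    placement = []
    for i in range(copies):
        # Consecutive offsets modulo the group count never repeat a group.
        group = names[(extent_id + i) % len(names)]
        disks = failure_groups[group]
        # Spread extents across the disks inside the group (striping).
        disk = disks[extent_id % len(disks)]
        placement.append((group, disk))
    return placement
```

With this placement, losing every disk in any single failure group (for example, one controller) still leaves at least one copy of each extent in another group.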
Mirror and stripe across drives and storage arrays.
Automatically remirror from a failed drive to remaining drives.
Automatically rebalance stored data when disks are added or removed while the database remains online.
Allow for operational simplicity in managing database storage.
Manage the Oracle Cluster Registry (OCR) and voting disks.
Provide preferred read capability on disks that are local to the instance, which gives better performance for an extended cluster.
Support very large databases.
Support Oracle ASM rolling upgrades.
Improve availability and reliability using the Oracle ASM disk scrubbing process to find and repair logical data corruptions using mirror disks.
Provide disaster recovery capability for the file system by enabling replication of Oracle ACFS across the network to a remote site.
Patch the Oracle ASM instance without impacting the clients that are being serviced using Oracle Flex ASM. A database instance can be directed to access Oracle ASM metadata from another location while the current Oracle ASM instance it is connected to is taken offline for planned maintenance.
Monitor and manage the speed and status of Oracle ASM Disk Resync and Rebalance operations.
Bring online multiple disks simultaneously and manage performance better by controlling resync parallelism using the Oracle ASM Resync Power Limit. Recover faster when a cell or disk fails and the instance performing the resync also fails; a Disk Resync Checkpoint makes this possible by enabling a resync to resume from where it was interrupted or stopped instead of starting from the beginning.
Automatically connect database instances to another Oracle ASM instance using Oracle Flex ASM. The local database instance can still access the required metadata and data if an Oracle ASM instance fails due to an unplanned outage.
See Also: Oracle Automatic Storage Management Administrator's Guide for more information about ACFS
The fast recovery area is a unified storage location for all recovery-related files and activities in Oracle Database. After this feature is enabled, all RMAN backups, archived redo log files, control file autobackups, flashback logs, and data file copies are automatically written to a specified file system or Oracle ASM disk group, and the management of this disk space is handled by RMAN and the database server.
Performing a backup to disk is faster because using the fast recovery area eliminates the bottleneck of writing to tape. More important, if database media recovery is required, then data file backups are readily available. Restoration and recovery time is reduced because you do not need to find a tape and a free tape device to restore the needed data files and archived redo log files.
Unified storage location of related recovery files
Management of the disk space allocated for recovery files, which simplifies database administration tasks
Fast, reliable, disk-based backup and restoration
Data block corruptions can be very disruptive and challenging to repair (see Section 1.4, "Causes of Downtime"). Corruptions can cause serious application and database downtime when encountered and, worse yet, can go undetected for hours, days, or even weeks, leading to even longer application downtime once detected. Unfortunately, there is no single way to comprehensively prevent, detect, and repair data corruptions within the database because the source and cause of corruptions can be anywhere in memory, hardware, firmware, storage, operating system, software, or user error. Worse yet, third-party solutions that do not understand Oracle data block semantics and how Oracle changes data blocks do not prevent and detect data block corruptions well. Third-party remote mirroring technologies can propagate data corruptions to the database replica (standby), leading to a double failure, data loss, and much longer downtime.
Oracle MAA has a comprehensive plan to prevent, detect, and repair all forms of data block corruptions including physical block corruptions, logical block corruptions, stray writes, and lost writes. These additional safeguards provide the most comprehensive Oracle data block corruption prevention, detection, and repair solution. Details of this plan are described in the MAA white paper "Preventing, Detecting, and Repairing Block Corruption: Oracle Database." The solution centers around Oracle Active Data Guard and three database parameters; however, there is much more that Oracle recommends in dealing with data corruptions.
The following summary highlights the key points:
Set the database initialization parameters that protect against corruption, including DB_LOST_WRITE_PROTECT, on both the primary and the standby to detect and prevent the majority of physical block corruptions and logical block corruptions.
Use Oracle Active Data Guard to detect and fail over to a standby database when physical block corruptions, logical block corruptions, or lost writes appear on the primary database. Oracle Active Data Guard with real-time apply provides an automatic block repair function for data block corruptions that can eliminate any downtime for physical block corruptions.
Use Oracle ASM to detect physical corruptions in storage and to repair them automatically if there is a good mirror copy.
Use Exadata storage and its comprehensive HARD checks to prevent and detect data block corruptions originating in the storage I/O subsystem.
Use RMAN backup and restore operations to detect any physical block corruptions especially with infrequently queried data. Use the RMAN "check logical" option to detect logical block corruptions.
Use Data Recovery Advisor to detect and repair data corruption.
Query V$DATABASE_BLOCK_CORRUPTION periodically for any detected data block corruptions.
Use the ANALYZE statement with the VALIDATE STRUCTURE option to evaluate interobject or interblock corruptions.
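The detect-and-repair pattern behind several of the points above (a checksum reveals a physically corrupt block, and a good copy held elsewhere is used to repair it) can be illustrated with a minimal sketch. It assumes CRC32 as a stand-in for a block checksum and a second list of blocks as a stand-in for a mirror copy or standby database; it does not model lost writes or logical corruptions, and all function names are hypothetical.

```python
import zlib


def make_block(data: bytes) -> dict:
    """Store a block together with a checksum computed at write time."""
    return {"data": data, "checksum": zlib.crc32(data)}


def is_valid(block: dict) -> bool:
    """A block is physically intact if its content still matches its checksum."""
    return zlib.crc32(block["data"]) == block["checksum"]


def read_with_repair(primary, mirror, block_no):
    """Read a block, repairing it from the mirror if the primary copy is corrupt.

    primary and mirror are lists of blocks created with make_block().
    Raises OSError if no good copy exists.
    """
    block = primary[block_no]
    if is_valid(block):
        return block["data"]
    good = mirror[block_no]
    if not is_valid(good):
        raise OSError(f"block {block_no} is corrupt on both copies")
    primary[block_no] = dict(good)  # overwrite the bad copy with the good one
    return good["data"]
```

The caller receives correct data without noticing the corruption, which mirrors the goal of repairing a corrupt block without application downtime.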
See Also: Oracle Database Reference for more information about the views and initialization parameters
MAA white paper "Preventing, Detecting, and Repairing Block Corruption: Oracle Database"
Data Recovery Advisor automatically diagnoses persistent (on-disk) data failures, presents appropriate repair options, and runs repair operations at your request.
You can use Data Recovery Advisor to troubleshoot primary databases, logical standby databases, physical standby databases, and snapshot standby databases.
Data Recovery Advisor includes the following functionality:
The first symptoms of database failure are usually error messages, alarms, trace files and dumps, and failed health checks. Assessing these symptoms can be complicated, error-prone, and time-consuming. Data Recovery Advisor automatically diagnoses data failures and informs you about them.
Failure impact assessment
After a failure is diagnosed, you must understand its extent and assess its impact on applications before devising a repair strategy. Data Recovery Advisor automatically assesses the impact of a failure and displays it in an easily understood format.
Even if a failure was diagnosed correctly, selecting the correct repair strategy can be error-prone and stressful. Moreover, there is often a high penalty for making poor decisions in terms of increased downtime and loss of data. Data Recovery Advisor automatically determines the best repair for a set of failures and presents it to you.
Repair feasibility checks
Before presenting repair options, Data Recovery Advisor validates them with respect to the specific environment and availability of media components required to complete the proposed repair, including restoring files directly from the primary or standby database to complete the proposed repair.
If you accept the suggested repair option, Data Recovery Advisor automatically performs the repair, verifies that the repair was successful, and closes the appropriate failures.
Validation of data consistency and database recoverability
Data Recovery Advisor can validate the consistency of your data, backups, and redo stream whenever you choose.
Early detection of corruption
Through Health Monitor, you can schedule periodic runs of Data Recovery Advisor diagnostic checks to detect data failures before a database process executing a transaction discovers the corruption and signals an error. Early warnings can limit the damage caused by corruption.
Integration of data validation and repair
Data Recovery Advisor is a single tool for data validation and repair.
Note: Data Recovery Advisor only supports single-instance databases. Oracle RAC databases are not supported. See Oracle Database Backup and Recovery User's Guide for more information about Data Recovery Advisor supported database configurations.
The best protection against human errors is to prevent their occurrence. The best way to prevent human errors is to restrict user access to only those data and services required to perform business functions. Oracle Database provides a wide range of security tools to control access to application data by authenticating database users and then enabling administrators to grant them only those privileges required to perform their duties.
In addition, the Oracle Database security model provides the ability to restrict data access at a row level using Oracle Virtual Private Database, thereby further isolating database users from data that they do not need to access.
Authorization control to provide limits to access and actions linked by database user identities and roles.
Auditing control to monitor and gather data about specific database activities, investigate suspicious activity, deter users (or others) from inappropriate activities, and detect problems with authorization or access control implementation.
Administration of Data Guard configurations can be delegated to a class of users who would not be granted SYSDBA privileges.
Oracle Flashback technology is a group of Oracle Database features that let you view past states of the database, database objects, transactions, or rows, or rewind the database, database objects, transactions, or rows to a previous state without using point-in-time media recovery.
With flashback features, you can:
Perform queries to show data as it looked at a previous point in time
Perform queries that return metadata that shows a detailed history of changes to the database
Recover tables or rows to a previous point in time
Automatically track and archive transactional data changes
Roll back a transaction and its dependent transactions while the database remains online
Undrop a table
Recover a database to a point in time without a restore operation
Other than the flashback database feature, most Oracle Flashback features use the Automatic Undo Management (AUM) system to obtain metadata and historical data for transactions. They rely on undo data, which are records of the effects of individual transactions. For example, if a user runs an UPDATE statement to change a salary from 1000 to 1100, then Oracle Database stores the value 1000 in the undo data.
Undo data is persistent and survives a database shutdown. By using flashback features, you can use undo data to query past data or recover from logical damage. Besides using it in flashback features, Oracle Database uses undo data to perform these actions:
Roll back active transactions
Recover terminated transactions by using database or process recovery
Provide read consistency for SQL queries
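How undo data supports both queries of past data and rollback can be illustrated with a minimal sketch built around the salary example above. It is a conceptual model, not the Automatic Undo Management implementation: the `UndoTable` class is a hypothetical name, a simple counter stands in for the database change number, and `None` is used to mean "the row did not exist before this change".

```python
class UndoTable:
    """Toy key-value table that records the old value of every change."""

    def __init__(self):
        self.rows = {}
        self.undo = []  # list of (change_number, key, old_value)
        self.change_number = 0

    def update(self, key, value):
        """Change a row, saving its previous value as undo data first."""
        self.change_number += 1
        self.undo.append((self.change_number, key, self.rows.get(key)))
        self.rows[key] = value
        return self.change_number

    def as_of(self, change_number):
        """Return the table contents as they were at an earlier change number.

        Works backward from the current rows, applying undo records
        newest first until the requested point is reached.
        """
        snapshot = dict(self.rows)
        for number, key, old_value in reversed(self.undo):
            if number <= change_number:
                break
            if old_value is None:
                snapshot.pop(key, None)  # row did not exist yet
            else:
                snapshot[key] = old_value
        return snapshot
```

After the salary is changed from 1000 to 1100, the undo record holds 1000, so `as_of` can still show the earlier value while the current table shows 1100.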
Oracle Flashback can address and rewind data that is compromised due to various human or operator errors that inadvertently or maliciously change data, cause bad installations and upgrades, and result in logical errors in applications. These problems are addressed in the following phases, and use features such as flashback transaction, flashback drop, flashback table, and flashback database.
Phase 1: Detection of logical failure, which is usually done by the application.
Phase 2: Error investigation using features such as flashback query, flashback version query, and flashback transaction query and the
Phase 3: Error recovery.
Oracle Flashback Query (Flashback Query) provides the ability to view data as it existed in the past by using the Automatic Undo Management system to obtain metadata and historical data for transactions. Undo data is persistent and survives a database malfunction or shutdown. The unique features of Flashback Query not only provide the ability to query previous versions of tables, they also provide a powerful mechanism to recover from erroneous operations.
Uses of Flashback Query include:
Recovering lost data or undoing incorrect, committed changes. For example, rows that were deleted or updated can be immediately repaired even after they were committed.
Comparing current data with the corresponding data at some time in the past. For example, by using a daily report that shows the changes in data from yesterday, it is possible to compare individual rows of table data, or find intersections or unions of sets of rows.
Checking the state of transactional data at a particular time, such as verifying the account balance on a certain day.
Simplifying application design by removing the need to store certain types of temporal data. By using Flashback Query, it is possible to retrieve past data directly from the database.
Applying packaged applications, such as report generation tools, to past data.
Providing self-service error correction for an application, enabling users to undo and correct their errors.
Oracle Flashback Version Query is an extension to SQL that you can use to retrieve the versions of rows in a given table that existed at a specific time interval. Oracle Flashback Version Query returns a row for each version of the row that existed in the specified time interval. For any given table, a new row version is created each time the COMMIT statement is executed.
Oracle Flashback Version Query is a powerful tool that database administrators (DBAs) can use to run analysis to determine the source of problems. Additionally, application developers can use Oracle Flashback Version Query to build customized applications for auditing purposes.
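The idea of returning one row per version that existed in a time interval can be illustrated with a minimal sketch. It models the concept only and is not the SQL syntax or the implementation of Oracle Flashback Version Query; `versions_between` is a hypothetical name, and the history of a row is assumed to be a list of (commit time, value) pairs in commit order.

```python
def versions_between(history, start, end):
    """Return every version of a row that existed at some point in [start, end].

    history: list of (commit_time, value) tuples in commit order; each
    commit creates a new version that stays valid until the next commit.
    Returns a list of (valid_from, valid_to, value); valid_to is None
    for the version that is still current.
    """
    result = []
    for i, (commit_time, value) in enumerate(history):
        valid_to = history[i + 1][0] if i + 1 < len(history) else None
        started_in_time = commit_time <= end
        still_alive = valid_to is None or valid_to > start
        if started_in_time and still_alive:
            result.append((commit_time, valid_to, value))
    return result
```

Each returned tuple shows when a version became visible and when it was superseded, which is the kind of information an auditing application would inspect.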
Oracle Flashback Transaction backs out a transaction and its dependent transactions. The DBMS_FLASHBACK.TRANSACTION_BACKOUT() procedure rolls back a transaction and its dependent transactions while the database remains online. This recovery operation uses undo data to create and execute the compensating transactions that return the affected data to its original state. You can query the DBA_FLASHBACK_TRANSACTION_STATE view to see whether the transaction was backed out using dependency rules or forced out by either:
Backing out nonconflicting rows
Applying undo SQL
Oracle Flashback Transaction increases availability during logical recovery by quickly backing out a specific transaction or set of transactions and their dependent transactions. You use one command to back out transactions while the database remains online.
Oracle Flashback Transaction Query provides a mechanism to view all of the changes made to the database at the transaction level. When used in conjunction with Oracle Flashback Version Query, it offers a fast and efficient means to recover from a human or application error. Oracle Flashback Transaction Query improves online diagnosis of problems in the database by returning the database user that changed the row, and it supports analysis and auditing of transactions.
Oracle Flashback Table recovers a table to a previous point in time. It provides a fast, online solution for recovering a table or set of tables that were changed by a human or application error. In most cases, Oracle Flashback Table alleviates the need for administrators to perform more complicated point-in-time recovery operations. The data in the original table is not lost when you use Oracle Flashback Table because you can return the table to its original state.
Dropping objects by accident is a problem for database users and database administrators. Although there is no easy way to recover dropped tables, indexes, constraints, or triggers, Oracle Flashback Drop provides a safety net when you are dropping objects. When you drop a table, it is automatically placed into the Recycle Bin. The Recycle Bin is a virtual container where all dropped objects reside. You can continue to query data in a dropped table.
When an Oracle Flashback recovery operation is performed on the database, the DBA must determine the point in time (identified by the system change number (SCN) or time stamp) to which you can later flash back the data. Oracle Flashback restore points are labels that you can define to substitute for the SCN or transaction time used in Flashback Database, Flashback Table, and Oracle Recovery Manager (RMAN) operations. Furthermore, a database can be flashed back through a previous database recovery and opened with an OPEN RESETLOGS command by using guaranteed restore points. Guaranteed restore points allow major database changes (such as database batch jobs, upgrades, or patches) to be quickly undone by ensuring that the undo required to rewind the database is retained.
Using the Oracle Flashback restore points feature provides the following benefits:
The ability to quickly restore to a consistent state, to a time before a planned operation that has gone awry (for example, a failed batch job, an Oracle software upgrade, or an application upgrade)
The ability to resynchronize a snapshot standby database with the primary database
A quick mechanism to restore a test or cloned database to its original state
Oracle Flashback Database is the equivalent of a fast rewind button, quickly returning a database to a previous point in time without requiring a time-consuming restore and roll forward using a backup and archived logs. The larger the database, the greater the advantage of using Oracle Flashback Database for fast point-in-time recovery.
Enabling Oracle Flashback Database provides the following benefits:
Fast point-in-time recovery to repair logical corruptions, such as those caused by administrative error.
Useful for iterative testing when used with Oracle restore points. A restore point can be set, database changes implemented, and a test workload run to assess the impact. Oracle Flashback Database can then be used to discard the changes and return the database to the original starting point. Different modifications can then be made and the same test workload run a second time, giving a true basis for comparing the impact of the different configuration changes.
Data Guard uses Oracle Flashback Database to quickly reinstantiate a failed primary database as a new standby (after a failover has occurred), without requiring the failed primary to be restored from a backup.
After attempting to automatically repair corrupted blocks, block media recovery can optionally retrieve a more recent copy of a data block from the flashback logs to reduce recovery time. Automatic block repair allows corrupt blocks on the primary database to be automatically repaired as soon as they are detected, by using good blocks from a physical standby database.
Furthermore, a corrupted block encountered during instance recovery does not result in instance recovery failure. The block is automatically marked as corrupt and added to the RMAN corruption list in the V$DATABASE_BLOCK_CORRUPTION view. You can subsequently issue the RMAN RECOVER BLOCK command to fix the associated block. In addition, the RMAN RECOVER BLOCK command restores blocks from a physical standby database, if it is available.
The Flashback Data Archive is stored in a tablespace and contains transactional changes to every record in a table for the duration of the record's lifetime. The archived data can be retained for a much longer duration than the retention period offered by an undo tablespace, and used to retrieve very old data for analysis and repair.
Oracle Data Pump technology enables very high-speed movement of data and metadata from one database to another. Data Pump is used to perform the following planned maintenance activities:
Database migration to a different platform
Database migration to pluggable databases
See Section 5.4, "Oracle High Availability Solutions for System and Software Maintenance" for more information about using this technology for planned maintenance.
The Data Pump features that enable the planned maintenance activities listed above are the following:
Full transportable export/import to move an entire database to a different database instance
Transportable tablespaces to move a set of tablespaces between databases
Table 3-1 describes the Oracle replication technologies for non-database files.
Oracle Database File System (DBFS): Recommended for Exadata Database Machine systems or when you need full synchronization between database and non-database systems
Oracle ASM Cluster File System (ACFS): Recommended to provide a single-node and cluster-wide file system solution integrated with Oracle ASM, Oracle Clusterware, and Oracle Enterprise Manager technologies
Oracle Solaris ZFS Storage Appliance replication: Recommended for disaster recovery protection for non-database files, and specifically for Oracle Fusion Middleware critical files stored outside of the database
Oracle Database File System (DBFS) takes advantage of the features of the database to store files, and the strengths of the database in efficiently managing relational data, to implement a standard file system interface for files stored in the database. With this interface, storing files in the database is no longer limited to programs specifically written to use BLOB and CLOB programmatic interfaces. Files in the database can now be transparently accessed using any operating system (OS) program that acts on files. For example, extract, transform, and load (ETL) tools can transparently store staging files in the database.
Oracle DBFS provides the following benefits:
Full stack integration recovery and failover: By storing file system files in a database structure, it is possible to easily perform point-in-time recovery of both database objects and file system data.
Disaster Recovery System Return on Investment (ROI): All changes to files contained in DBFS are also logged through the Oracle database redo log stream and thus can be passed to a Data Guard physical standby database. Using Oracle Active Data Guard technology, the DBFS file system can be mounted read-only using the physical standby database as the source. Changes made on the primary are propagated to the standby database and are visible once applied to the standby.
File system backups: Because DBFS is stored in the database as database objects, standard RMAN backup and recovery functionality can be applied to file system data. Any backup, restore, or recovery operation that can be performed on a database or object within a database can also be performed against the DBFS file system.
Oracle ASM Cluster File System (ACFS) is a multiplatform, scalable file system, and storage management technology that extends Oracle Automatic Storage Management (Oracle ASM) functionality to support customer files maintained outside of Oracle Database. Oracle ACFS supports many database and application files, including executables, database trace files, database alert logs, application reports, BFILEs, and configuration files. Other supported files are video, audio, text, images, engineering drawings, and other general-purpose application file data.
Oracle ACFS takes advantage of the following Oracle ASM functionality:
Oracle ACFS dynamic file system resizing
Maximized performance through direct access to Oracle ASM disk group storage
Balanced distribution of Oracle ACFS across Oracle ASM disk group storage for increased I/O parallelism
Data reliability through Oracle ASM mirroring protection mechanisms
An additional feature of Oracle ACFS is Oracle ACFS Replication which, similar to Data Guard for the database, enables replication of Oracle ACFS file systems across the network to a remote site, providing disaster recovery capability for the file system. Oracle ACFS replication captures file system changes written to disk for a primary file system and records the changes in files called replication logs. These logs are transported to the site hosting the associated standby file system where background processes read the logs and apply the changes recorded in the logs to the standby file system. After the changes recorded in a replication log are successfully applied to the standby file system, the replication log is deleted from the sites hosting the primary and standby file systems.
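The replication log lifecycle described above (capture, transport, apply, delete) can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the Site class and the capture, transport, and apply_and_purge functions are hypothetical names, not Oracle ACFS interfaces, and real replication works on file system changes rather than Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for a site hosting one file system."""
    files: dict = field(default_factory=dict)  # path -> content
    logs: dict = field(default_factory=dict)   # log id -> list of (path, content)


def capture(primary, log_id, changes):
    """Write changes on the primary and record them in a replication log."""
    for path, content in changes:
        primary.files[path] = content
    primary.logs[log_id] = list(changes)


def transport(primary, standby):
    """Copy every replication log that the standby site does not yet hold."""
    for log_id, changes in primary.logs.items():
        standby.logs.setdefault(log_id, list(changes))


def apply_and_purge(primary, standby):
    """Apply logs in order, then delete each applied log from both sites."""
    for log_id in sorted(standby.logs):  # sorted() copies the keys, so deleting is safe
        for path, content in standby.logs[log_id]:
            standby.files[path] = content
        del standby.logs[log_id]
        primary.logs.pop(log_id, None)
```

In this model a log is removed from both sites only after its changes have been applied on the standby, which mirrors the ordering described in the text.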
The Oracle Solaris ZFS Storage Appliance series supports snapshot-based replication of projects and shares from a source appliance to any number of target appliances manually, on a schedule, or continuously for the following use cases:
Disaster recovery: Replication can be used to mirror an appliance for disaster recovery. In the event of a disaster that impacts the service of the primary appliance (or even an entire data center), administrators activate the service at the disaster recovery site, which takes over using the most recently replicated data. When the primary site is restored, data changed while the disaster recovery site was in service can be migrated back to the primary site, and normal service is restored. Such scenarios are fully testable before a disaster occurs.
Data distribution: Replication can be used to distribute data (such as virtual machine images or media) to remote systems across the world in situations where clients of the target appliance would not ordinarily be able to reach the source appliance directly, or such a setup would have prohibitively high latency. One example uses this scheme for local caching to improve latency of read-only data (such as documents).
Disk-to-disk backup: Replication can be used as a backup solution for environments in which tape backups are not feasible. Tape backup might not be feasible, for example, because the available bandwidth is insufficient or because the latency for recovery is too high.
Data migration: Replication can be used to migrate data and configuration between Oracle Solaris ZFS Storage appliances when upgrading hardware or rebalancing storage. Shadow migration can also be used for this purpose.
The architecture of Oracle Solaris ZFS Storage Appliance also makes it an ideal platform to complement Data Guard for disaster recovery of Oracle Fusion Middleware. Oracle Fusion Middleware has a number of critical files that are stored outside of the database. These binaries, configuration data, metadata, logs and so on also require data protection to ensure availability of the Oracle Fusion Middleware. For these, the built-in replication feature of the ZFS Storage Appliance is used to move this data to a remote disaster recovery site.
Benefits of the Oracle Solaris ZFS Storage Appliance when used with Oracle Fusion Middleware include:
Leverages remote replication for Oracle Fusion Middleware
Provides ability to quickly create clones and snapshots of databases to increase ROI of DR sites
A highly available architecture requires the application tier to transparently fail over to a surviving instance or database advertising the required service. This ensures that applications are generally available or minimally impacted in the event of node failure, instance failure, data corruption, or database failures. Transparent client failover enables applications to fail over to another available Oracle RAC instance or to another database (such as in the case of a Data Guard role transition or Oracle GoldenGate).
Client failover encompasses failure notification, connection cleanup, automatic reconnection to the database service on another Oracle RAC instance or database (with retries), and possibly query retry.
At a high level, the following components are used to provide for seamless client failover:
Oracle Database provides a powerful automatic workload management facility, called services, to enable the enterprise grid vision. Services are entities that you can define in Oracle databases that enable you to group database workloads, route work to the optimal instances that are assigned to offer the service, and achieve high availability for planned and unplanned actions.
High Availability Framework
An Oracle RAC component that enables Oracle Database to maintain components in a running state.
Fast Application Notification (FAN)
FAN is a high availability notification mechanism that Oracle RAC uses to notify other processes about configuration-level and service-level information that includes service status changes, such as UP or DOWN events. The Oracle client drivers and Oracle connection pools respond to FAN events and take immediate action. FAN UP and DOWN events can apply to instances, services, and nodes.
Transaction Guard is a tool that provides a protocol and an API for at-most-once execution of transactions in case of unplanned outages and duplicate submissions.
Application Continuity provides a general purpose infrastructure that replays the in-flight request when a recoverable error is received, masking many system, communication, and storage outages and hardware failures. Unlike existing recovery technologies, this feature attempts to recover the transactional and non-transactional session states beneath the application, so that the outage appears to the application as a delayed execution.
Connection Load Balancing
Connection Load Balancing is a feature of Oracle Net Services that balances incoming connections across all of the instances that provide the requested database service.
Fast Connection Failover
Fast Connection Failover is the ability of Oracle Clients to provide rapid failover of connections by subscribing to FAN events.
Transparent Application Failover (TAF)
Transparent Application Failover is a run-time failover for high availability environments that refers to the failover and re-establishment of application-to-service connections. It enables client applications to automatically reconnect to the database if the connection fails, and, optionally, resume a SELECT statement that was in progress. This reconnection happens automatically from within the Oracle Call Interface (OCI) library.
Single Client Access Name (SCAN)
SCAN provides a single name to the clients connecting to Oracle RAC that does not change throughout the life of the cluster, even if you add or remove nodes from the cluster. Clients connecting with SCAN can use a simple connection string, such as a thin JDBC URL or EZConnect, and still achieve the load balancing and client connection failover.
Global Data Services
Global Data Services (GDS) is a new capability of Oracle Database that extends the concept of services to a globally replicated configuration involving a combination of single-instance, Oracle RAC, Oracle Active Data Guard, and Oracle GoldenGate. This enables services to be deployed anywhere within this globally replicated configuration, supporting load balancing, high availability, database affinity, and so on.
Connection Time Failover
Oracle Net supports connect descriptors with multiple lists of addresses, each with its own characteristics. Connection time failover allows for a new connection attempt to fail over to a different address if the connection to the first address fails.
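The idea behind connection time failover can be sketched on the client side as a loop over an address list. This is a minimal illustration, not Oracle Net itself: the function name connect_with_failover and its parameters are assumptions, and a plain TCP connection stands in for a database connection.

```python
import socket


def connect_with_failover(addresses, timeout=2.0, connect=socket.create_connection):
    """Try each (host, port) in order and return the first connection that succeeds.

    `connect` is injectable so the logic can be exercised without a network.
    """
    last_error = None
    for host, port in addresses:
        try:
            return connect((host, port), timeout)
        except OSError as exc:  # refused, unreachable, timed out, and so on
            last_error = exc    # remember the failure and move to the next address
    raise ConnectionError(f"all {len(addresses)} addresses failed") from last_error
```

A caller would pass the addresses in the order they should be tried, for example connect_with_failover([("host-a", 1521), ("host-b", 1521)]), where both host names are placeholders.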
See Also:
Oracle Database Concepts for information about how the database processes transactions
Oracle Real Application Clusters Administration and Deployment Guide for information about Dynamic Database Services
Oracle Database 2 Day + Real Application Clusters Guide for information about Dynamic Database Services
At a high level, automating client failover in an MAA environment includes relocating database services to available resources, notifying clients that a failure has occurred, potentially breaking them out of TCP timeout, and redirecting application connections to available resources where the database service is active. The components described in the introduction to this chapter that are used to process the failover of application connections depend on the configuration of your MAA environment.
|MAA Configuration||Service Relocation||Application Notification||Session Failover and Recovery|
Single Instance with Data Guard
Configure your operating system for efficient TCP timeouts on the hosts that run the application layer
Configure Transparent Application Failover (TAF) for OCI clients. If not using TAF, you can include Transaction Guard in your application for OCI, JDBC Thin, or ODP.NET.
Oracle RAC Database or Oracle RAC One Node
Configure for Fast Application Notification
Configure Transparent Application Failover (TAF) for OCI clients. Configure Application Continuity for JDBC Thin clients.
If not using these, you can include Transaction Guard in your application for OCI, JDBC Thin, or ODP.NET. (TAF and Application Continuity include Transaction Guard.)
Oracle RAC Database with Data Guard
Configure for Fast Application Notification
Configure Transparent Application Failover (TAF) for OCI clients. Configure Application Continuity for JDBC Thin clients.
If not using these, you can include Transaction Guard in your application for OCI, JDBC Thin, or ODP.NET. (TAF and Application Continuity include Transaction Guard.)
Configure your operating system for efficient TCP timeouts on the hosts that run the application layer
Configure Transparent Application Failover (TAF) for OCI clients using BASIC only.
The following sections provide more information about service relocation and application notification.
A service name is a logical representation of a service used for client connections. When a client connects to a listener, it requests a connection to a service. When a database instance starts, it registers itself with a listener as providing one or more services by name. A single service, as known by a listener, can identify one or more database instances in an Oracle RAC or Data Guard environment. A single database instance can register one or more services with a listener.
The application should connect to the database using a primary-specific service name, that is, a user-created service that is active only on the primary database. In the event of a Data Guard failover, this service migrates to whichever database currently holds the primary role. In single-instance environments that do not have Oracle Clusterware installed, this can be accomplished by creating a trigger that executes based on the ON_STARTUP system event. This trigger should check the DATABASE_ROLE value of the V$DATABASE view and, if the value is PRIMARY, start the user-created service.
Resource profiles are automatically created when you define a service. A resource profile describes how Oracle Clusterware should manage the service and which instance the service should fail over to if the preferred instance stops. Resource profiles also define service dependencies for the instance and the database. Due to these dependencies, if you stop a database, then the instances and services are automatically stopped in the correct order.
When you define a service for an administrator-managed database, you define which instances usually support that service using SRVCTL with the -preferred parameter. These are known as the preferred instances. You can also define other instances to support a service if the service's preferred instance fails using SRVCTL with the -available parameter. These are known as available instances.
When you specify preferred instances, you are specifying the number of instances on which a service usually runs. This is the maximum cardinality of the service. Oracle Clusterware attempts to ensure that the service runs on the number of instances for which you have configured the service. Afterward, due to either instance failure or planned service relocations, a service may be running on an available instance.
If an instance fails, then you cannot control to which available instance Oracle Clusterware relocates the services if there are multiple instances in the list. During a planned operation, however, you can manually direct the service to any instance in either the preferred or the available list not currently offering the service.
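The relocation behavior described above can be modeled for a single failure event. This is a sketch under stated assumptions, not Oracle Clusterware logic: the function relocate_on_failure is a hypothetical name, and choosing the first candidate in list order is an assumption made only to keep the example deterministic, because the text states that you cannot control which available instance is chosen.

```python
def relocate_on_failure(running, available, up_instances, failed):
    """Return the instances offering the service after `failed` stops.

    running      -- instances currently offering the service
    available    -- instances configured as available for the service
    up_instances -- instances that are currently up
    """
    if failed not in running:
        return list(running)  # the service was not on the failed instance
    survivors = [inst for inst in running if inst != failed]
    candidates = [
        inst for inst in available
        if inst in up_instances and inst != failed and inst not in survivors
    ]
    if candidates:
        # Assumption: take the first candidate; the real choice is not controllable.
        survivors.append(candidates[0])
    return survivors
```

With two preferred instances, the result never contains more than two instances, which corresponds to the maximum cardinality described in the text.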
If you configured Data Guard in your Oracle RAC environment, then you can define a role for each service using SRVCTL with the -l parameter. When you specify a role for a service, Oracle Clusterware automatically starts the service only when the database role matches the role you specified for the service. Valid roles are PRIMARY, PHYSICAL_STANDBY, LOGICAL_STANDBY, and SNAPSHOT_STANDBY, and you can specify more than one role for a service.
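The role matching rule can be illustrated with a short model. The function services_to_start and the example service names are hypothetical; only the four role names come from the text above, and the sketch does not call SRVCTL or Oracle Clusterware.

```python
VALID_ROLES = {"PRIMARY", "PHYSICAL_STANDBY", "LOGICAL_STANDBY", "SNAPSHOT_STANDBY"}


def services_to_start(service_roles, database_role):
    """Return the names of services whose configured roles include the database role.

    service_roles -- mapping of service name to the set of roles defined for it
    """
    if database_role not in VALID_ROLES:
        raise ValueError(f"unknown database role: {database_role}")
    for name, roles in service_roles.items():
        unknown = set(roles) - VALID_ROLES
        if unknown:
            raise ValueError(f"service {name} has unknown roles: {sorted(unknown)}")
    return sorted(name for name, roles in service_roles.items() if database_role in roles)
```

For example, a service defined with both PRIMARY and SNAPSHOT_STANDBY is selected in either role, while a service defined only with PHYSICAL_STANDBY is not selected on the primary.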
If multiple databases in the cluster offer the same service name, then Oracle RAC balances connections to that service across all such databases. This is useful for standby and Oracle Active Data Guard databases, but if you want client connections to a service to be directed to a particular database, then the service name must be unique within the cluster (not offered by any other database).
See Also: Oracle Data Guard Concepts and Administration for more information about database roles
The Global Data Services framework is the software infrastructure for global services. This framework automates and centralizes configuration, maintenance, and monitoring of a database cloud, and enables load balancing and failover for global services. The framework manages these virtualized resources with minimal administrative overhead, enabling the cloud to handle additional client requests.
The Global Data Services framework is built around the following preexisting Oracle Database technologies:
Oracle Active Data Guard
Enables high-performance farms of read-only databases.
Data Guard Broker
Enables creation, management, and monitoring of Data Guard configurations that include a primary database and up to 30 standby databases.
Oracle GoldenGate
Enables replication updates among multiple databases.
With FAN, the continuous, dynamic database services built into Oracle RAC, Data Guard, and Global Data Services are extended to applications and mid-tier servers. When the state of a database service changes (for example, up, down, or not restarting), the new status is posted to interested subscribers through FAN events. Oracle drivers and applications use these events to achieve very fast detection of failures, balancing of connection pools following failures, and balancing of connection pools again when the failed components are repaired. For example, when the service at an instance starts, the FAN event is used immediately to route work to that resource. When the service at an instance or node fails, the FAN event is used immediately to interrupt applications to recover.
To solve the high availability problems with database connections, Oracle Clusterware and Data Guard Broker post a FAN event, and also execute server-side callouts, immediately when a service changes state. The event payload contains the relevant information that describes the status of the service on Oracle RAC. On receipt of the FAN event, applications can terminate sessions in communication with the failed instance or node, notify sessions waiting to resume operation, and reorganize incoming work when additional resources are available. To know which sessions to process, every session using Oracle Database has a unique connection signature. The session signatures match the FAN payload.
For planned outages, use any connection pool with FAN configured: OCI, UCP, ICC, WebLogic Server Active Grid Link, or ODP.NET. The FAN planned event drains the work at request boundaries. As soon as the FAN event for a planned down is received, the idle connections are removed from the pool for that service or instance, and the active (borrowed) connections are marked for release when they are returned to the pool. This effectively drains the work for planned outages with no interruption to the users.
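The draining behavior for a planned down event can be sketched with a toy pool. This is an illustration only: the DrainingPool class and its method names are hypothetical and do not represent UCP, OCI, or any other Oracle pool API.

```python
class DrainingPool:
    """Toy connection pool that reacts to a planned DOWN event for an instance."""

    def __init__(self):
        self.idle = []         # (conn_id, instance) pairs ready to be borrowed
        self.borrowed = {}     # conn_id -> instance for connections in use
        self.draining = set()  # instances for which a planned DOWN was received

    def add(self, conn_id, instance):
        self.idle.append((conn_id, instance))

    def borrow(self):
        conn_id, instance = self.idle.pop(0)
        self.borrowed[conn_id] = instance
        return conn_id

    def on_planned_down(self, instance):
        """Drop idle connections to the instance; borrowed ones finish their work."""
        self.draining.add(instance)
        self.idle = [(c, i) for c, i in self.idle if i != instance]

    def give_back(self, conn_id):
        """Return True if the connection went back to the pool, False if released."""
        instance = self.borrowed.pop(conn_id)
        if instance in self.draining:
            return False  # marked for release: do not reuse it
        self.idle.append((conn_id, instance))
        return True
```

The borrowed connection is never interrupted in this model; it is simply not reused after the application returns it, which is why users see no interruption.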
FAN is also used for posting advisories for runtime connection load balancing, Web Affinity, and Data Dependent Routing.
See Also: Oracle Real Application Clusters Administration and Deployment Guide for information about Dynamic Database Services
Oracle integrated FAN with many of the common client application environments that are used to connect to Oracle RAC databases. Therefore, the easiest way to use FAN is to use an integrated Oracle client.
Due to the integration with FAN, Oracle integrated clients are more aware of the current status of an Oracle RAC cluster. This prevents client connections from waiting or trying to connect to instances or services that are no longer available. When instances start, Oracle RAC uses FAN to notify the connection pool so that the connection pool can create connections to the recently started instance and take advantage of the additional resources that this instance provides.
Oracle client drivers that are integrated with FAN can:
Remove terminated connections immediately when a service is declared as DOWN at an instance, and immediately when nodes are declared as DOWN
Report errors to clients immediately when Oracle Database detects the NOT RESTARTING state, instead of making the client wait while the service repeatedly attempts to restart
Oracle connection pools that are integrated with FAN can:
Balance connections across all of the Oracle RAC instances when a service starts; this is preferable to directing the sessions that are defined for the connection pool to the first Oracle RAC instance that supports the service
Balance work requests at run time using load balancing advisory events
See Also: Oracle Real Application Clusters Administration and Deployment Guide for complete information about how to enable FAN for all Oracle clients
Configure your operating system for efficient TCP timeouts on the hosts that run the application layer. The OS TCP timeouts should be set to the amount of time it takes for the database layer to fail over and the database services to be started. Consult your operating system manuals for how to properly configure TCP timeout.
Configure reconnection logic within the application to respond appropriately in the event of an exception. For example, when a session from the connection pool receives an exception that results in a disconnection (such as an ORA-03113 error), the application should automatically attempt to reconnect that session. The reconnection attempts should be configured such that they continue for the length of time that it takes to fail over the database layer and bring the application services online.
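The reconnection logic described above can be sketched as a retry loop bounded by the expected failover time. This is a minimal example under stated assumptions: the function reconnect, its parameters, and the use of ConnectionError to represent a disconnection are all illustrative, and a real application would catch the exception type raised by its own driver.

```python
import time


def reconnect(connect, failover_window_s, retry_delay_s=1.0,
              clock=time.monotonic, sleep=time.sleep):
    """Call `connect` until it succeeds or the failover window has elapsed.

    connect           -- zero-argument callable that returns a session or raises
    failover_window_s -- how long the database layer is expected to take to fail over
    clock, sleep      -- injectable so the loop can be tested without waiting
    """
    deadline = clock() + failover_window_s
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except ConnectionError as exc:
            # Stop when another wait would run past the failover window.
            if clock() + retry_delay_s > deadline:
                raise TimeoutError(f"gave up after {attempt} attempts") from exc
            sleep(retry_delay_s)
```

Setting failover_window_s slightly above the measured failover time keeps the application retrying for as long as recovery is plausible, without retrying indefinitely.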
Transaction failover and protection technologies include Transaction Guard and Application Continuity.
Transaction Guard is a generic tool for applications to provide a reliable, known outcome for transactions following planned and unplanned outages. Applications use a new concept called the logical transaction ID to determine the outcome of the last transaction open in a database session following an outage. Without using Transaction Guard, applications that attempt to retry operations following outages can cause logical corruption by committing duplicate transactions.
If an application fails to recognize that the last submission has committed, will commit soon, or has not run to completion, then an attempted replay can cause duplicate transaction submissions and other forms of logical corruption, because the software might reissue changes that are already persisted.
Without Transaction Guard, if a transaction was started and a commit was issued, the commit message that is sent back to the client is not durable. The client is left not knowing whether the transaction committed or not. The transaction cannot be validly resubmitted if the non-transactional state is incorrect or if it is already committed. In the absence of guaranteed commit and completion information, resubmission can lead to transactions applied more than once and in the incorrect state.
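The decision that an application faces after an outage can be sketched as follows. This is an illustration of the reasoning only: get_outcome is a hypothetical callable standing in for whatever lookup the application uses to resolve a logical transaction ID, and the two flags it returns (whether the transaction committed and whether the user call completed) are assumptions of this sketch.

```python
from enum import Enum


class Action(Enum):
    RETURN_SUCCESS = "committed and complete: report success, do not resubmit"
    REPORT_COMMITTED_INCOMPLETE = "committed, but call state may be missing: do not resubmit"
    RESUBMIT = "not committed: safe to submit again"


def decide_after_outage(ltxid, get_outcome):
    """Choose what to do with the last transaction of a failed session.

    ltxid       -- logical transaction ID saved by the application before the outage
    get_outcome -- hypothetical lookup returning (committed, user_call_completed)
    """
    committed, user_call_completed = get_outcome(ltxid)
    if not committed:
        return Action.RESUBMIT
    if user_call_completed:
        return Action.RETURN_SUCCESS
    return Action.REPORT_COMMITTED_INCOMPLETE
```

The key point is that resubmission happens only when the outcome is known to be uncommitted, which avoids the duplicate submissions described above.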
See Also:
Oracle Database 2 Day + Real Application Clusters Guide for information about Transaction Guard with Oracle RAC and Dynamic Database Services
Oracle Database PL/SQL Packages and Types Reference for information about the DBMS_APP_CONT package
A highly available architecture requires the ability of the application tier to transparently fail over to a surviving instance or database advertising the required service. This ensures that applications are generally available or minimally impacted in the event of node failure, instance failure, data corruption, or database failures. Application Continuity for Java attempts to mask recoverable outages by replaying the request at another available Oracle RAC instance or to another database (such as in the case of a Data Guard role transition).
Application Continuity encompasses:
FAN: failure notification
Automatic reconnection and retries of database service residing in another Oracle RAC instance or database
Replay of the in-flight request
Masking outages of the database session is a complex task for application development and, as a result, errors and timeouts are often exposed to the users. Application Continuity attempts to mask outages from users and applications by recovering the database session following recoverable outages, unplanned and planned. Application Continuity performs this recovery beneath the application so that the outage appears to the application as a delayed execution. For the recovery to succeed, the data and messages restored to the client by Application Continuity must be the same as those that the application has seen and potentially made decisions on.
Application Continuity is started for outages that are recoverable, typically related to underlying software, foreground, hardware, communications, network, or storage layers. Application Continuity is used to improve the user experience when handling both unplanned outages and planned outages.
With Oracle Database 12c release 1, Application Continuity for Java is available for general use with:
JDBC-Thin Oracle driver
JDBC Universal Connection Pool
WebLogic Server Active Grid Link
See Also:
Oracle Database Concepts for information about transactions
Oracle Database Development Guide for information about transactions
Oracle Real Application Clusters Administration and Deployment Guide for information about Dynamic Database Services
Oracle Database 2 Day + Real Application Clusters Guide for information about Dynamic Database Services
Global Data Services enables administrators to automatically and transparently manage client workloads across replicated databases that offer common services. A database service is a named representation of one or more database instances. Services enable you to group database workloads and route a particular work request to an appropriate instance. A global service is a service provided by multiple databases synchronized through data replication.
Global Data Services provides dynamic load balancing, failover, and centralized service management for a set of replicated databases that offer common services. The set of databases can include Oracle RAC and noncluster Oracle databases interrelated through Oracle Data Guard, Oracle GoldenGate, or any other replication technology.
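To make the idea of a global service concrete, the following Python sketch models routing in a simplified way. It is a toy model under stated assumptions, not the Global Data Services implementation. The `Database` and `ServiceManager` classes are hypothetical, and the session count used as a load metric is an assumption chosen for brevity. A connection request names a service, and the manager picks the least-loaded available database that offers it.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Hypothetical stand-in for one replicated database."""
    name: str
    services: set      # names of the global services this database offers
    load: int = 0      # active sessions, used here as a simple load metric
    up: bool = True


@dataclass
class ServiceManager:
    """Toy model of centralized routing for global services."""
    databases: list = field(default_factory=list)

    def add_database(self, db):
        # Databases can join the configuration at run time.
        self.databases.append(db)

    def connect(self, service):
        # Route to the least-loaded available database offering the service.
        candidates = [d for d in self.databases
                      if d.up and service in d.services]
        if not candidates:
            raise LookupError(f"no database currently offers {service!r}")
        target = min(candidates, key=lambda d: d.load)
        target.load += 1
        return target.name


# Usage: two replicated databases offer overlapping services.
gds = ServiceManager()
gds.add_database(Database("primary", {"orders"}))
gds.add_database(Database("standby", {"orders", "reporting"}))
first = gds.connect("reporting")   # only the standby offers this service
second = gds.connect("orders")     # the primary is less loaded
gds.databases[0].up = False        # simulate loss of the primary
third = gds.connect("orders")      # the request fails over to the standby
```

Because clients ask for a service name rather than a specific database, the same request keeps working when a database is added, removed, or lost, as long as at least one database still offers the service.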
The benefits of Global Data Services include the following:
Enables you to centrally manage global resources, including globally distributed multi-database configurations
Provides global scalability, availability, and runtime load balancing
Supports seamless failover
Enables you to dynamically add databases to the GDS configuration and dynamically migrate global services
Enables optimal resource utilization
The global services management framework is the software infrastructure for global services. This framework automates and centralizes configuration, maintenance, and monitoring of a GDS configuration, and enables load balancing and failover for services. The framework manages these virtualized resources with minimal administrative overhead, enabling the configuration to handle additional client requests.
The global services management framework is built around the following preexisting Oracle Database technologies:
Oracle Real Application Clusters (Oracle RAC)
Enables dynamic load balancing and workload management in a cluster
Oracle Active Data Guard
Enables high-performance farms of read-only databases
Data Guard Broker
Enables creation, management, and monitoring of Data Guard configurations that include a primary database and up to 30 standby databases
Oracle GoldenGate
Enables replication updates among multiple databases