Managing Fast-Start Failover

You can enable fast-start failover to allow the broker to determine if a failover is necessary and to initiate a failover to a standby database from a list of one or more pre-specified target standby databases.

The failover can be set up for either no data loss or a configurable amount of data loss. In addition, you can specify under which conditions or errors you want a failover to be initiated. Oracle also provides the DBMS_DG PL/SQL package to allow an application to request a fast-start failover.

You use broker configuration properties to control the behavior of fast-start failover. You can also use Cloud Control or the DGMGRL ENABLE FAST_START FAILOVER CONDITION and DISABLE FAST_START FAILOVER CONDITION commands to specify conditions for which a fast-start failover should occur.

Configure Properties to Tune Fast-Start Failover

You can set various properties to tune how fast-start failover behaves.

The configurable properties for fast-start failover include:

FastStartFailoverThreshold

Set the FastStartFailoverThreshold configuration property to specify the number of seconds you want the observer and target standby database to wait (after detecting the primary database is unavailable) before initiating a failover. See Enabling Fast-Start Failover for more information and an example.

FastStartFailoverPmyShutdown

The FastStartFailoverPmyShutdown configuration property controls whether the primary database will shut down if redo generation has been stalled (FS_FAILOVER_STATUS column of V$DATABASE contains a value of STALLED) and the primary database has lost connectivity with the observer and target standby database for longer than the number of seconds specified by the FastStartFailoverThreshold configuration property. The default value for FastStartFailoverPmyShutdown is TRUE.

Note:

The primary database is always shut down if a user configurable fast-start failover condition is detected or if an application initiated a fast-start failover by calling the DBMS_DG.INITIATE_FS_FAILOVER function.
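
The shutdown rule described above can be summarized as a small decision function. This is only an illustrative Python model of the documented behavior, not broker code; the function name and its parameters are hypothetical.

```python
def primary_shuts_down(pmy_shutdown: bool,
                       stalled: bool,
                       isolated_secs: float,
                       threshold_secs: float,
                       condition_or_app_request: bool = False) -> bool:
    """Illustrative model only (hypothetical names, not an Oracle API)."""
    # A user-configurable condition or a DBMS_DG.INITIATE_FS_FAILOVER call
    # always shuts the primary down, regardless of the property value.
    if condition_or_app_request:
        return True
    # Otherwise FastStartFailoverPmyShutdown decides: redo generation must be
    # stalled and the primary must have been cut off from the observer and the
    # target standby for longer than FastStartFailoverThreshold seconds.
    return pmy_shutdown and stalled and isolated_secs > threshold_secs
```

For example, with the property set to FALSE, a stalled primary that has been isolated for 45 seconds against a 30-second threshold stays up in this model, while a configured failover condition still shuts it down.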

FastStartFailoverLagLimit

The fast-start failover feature can be configured on databases operating in maximum performance mode. Destinations that receive redo in ASYNC mode will be acceptable fast-start failover target standby databases, and these destinations can lag the primary in terms of redo received and applied. A configurable time-based limit can be specified through the FastStartFailoverLagLimit configuration property. If the standby database's applied redo point is within this many seconds of the primary's redo generation point, a fast-start failover will be allowed. If its applied point lags beyond this limit, a fast-start failover is not allowed.

The FastStartFailoverLagLimit configuration property can also be used if fast-start failover is enabled when the configuration is operating in maximum availability mode. It cannot be used when the configuration is operating in maximum protection mode. When FastStartFailoverLagLimit is set to a non-zero value and the configuration is operating in maximum availability mode, either a zero data loss failover or a data loss failover is possible. If a data loss failover is performed, the amount of data loss will not exceed the number of seconds specified by the FastStartFailoverLagLimit configuration property.

When a non-zero value is specified for the FastStartFailoverLagLimit property and the protection mode is maximum availability or maximum performance, the redo transport mode of the target standby must be set to SYNC, FASTSYNC, or ASYNC. To change the protection mode or the redo transport mode to SYNC or FASTSYNC, you must first disable fast-start failover. Likewise, changing the protection mode from maximum availability mode to maximum performance mode requires first disabling fast-start failover.

The old primary can be reinstated after a fast-start failover to a target standby. If the observer rediscovers the old primary, it automatically reinstates it, and any redo generated within the specified lag is lost.

Note:
It is recommended that FastStartFailoverLagLimit be set lower than or equal to FastStartFailoverThreshold. If FastStartFailoverLagLimit is greater than FastStartFailoverThreshold, the primary keeps committing after the fast-start failover is initiated, which exposes the configuration to the risk of having two primary databases for a short period of time.
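
The lag-limit check and the recommendation in the note can be sketched as two small predicates. This is a hedged Python illustration of the rules as stated here, with hypothetical function names; it assumes that a lag exactly equal to the limit still counts as "within" the limit, which this section does not spell out.

```python
def failover_allowed(lag_secs: float, lag_limit_secs: float) -> bool:
    """Illustrative model: is the standby close enough to the primary?"""
    # Assumption: a lag equal to FastStartFailoverLagLimit is still acceptable.
    return lag_secs <= lag_limit_secs


def lag_limit_is_recommended(lag_limit_secs: float, threshold_secs: float) -> bool:
    """Illustrative check of the recommendation in the note above."""
    # Recommended: FastStartFailoverLagLimit <= FastStartFailoverThreshold.
    return lag_limit_secs <= threshold_secs
```

With a lag limit of 30 seconds, a standby that is 12 seconds behind is an acceptable target in this model, while one that is 31 seconds behind is not.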

See Also:

Oracle Data Guard Broker Properties for more information

FastStartFailoverLagType

The FastStartFailoverLagType value (APPLY or TRANSPORT) in the fast-start failover configuration specifies the type of lag (apply lag or transport lag) that is used to specify the data loss threshold.

FastStartFailoverLagGraceTime

The FastStartFailoverLagGraceTime value for the fast-start failover configuration specifies the maximum amount of time (in seconds) that can pass before the lag limit (FastStartFailoverLagLimit) is reached when the primary database requests permission to move to the lagging state.

FastStartFailoverAutoReinstate

The FastStartFailoverAutoReinstate configuration property controls whether the former primary database is automatically reinstated if a fast-start failover occurred because the primary database crashed or was stalled for longer than FastStartFailoverThreshold seconds. The default value for FastStartFailoverAutoReinstate is TRUE.

If you want to perform diagnostic or repair work after failover has completed, you can avoid an automatic reinstatement by setting the FastStartFailoverAutoReinstate configuration property to FALSE.

Note:

The former primary database is never automatically reinstated if a fast-start failover occurred because a user configurable fast-start failover condition was detected or because an application initiated a fast-start failover by calling the DBMS_DG.INITIATE_FS_FAILOVER function.
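
The reinstatement rule, including the exception in the note, can be sketched as follows. This is an illustrative Python model only; the cause labels are hypothetical strings chosen for this example and are not values reported by the broker.

```python
# Hypothetical labels for the failover causes that permit automatic reinstatement.
AUTO_REINSTATE_CAUSES = {"primary_crashed", "primary_stalled"}


def auto_reinstates(auto_reinstate: bool, cause: str) -> bool:
    """Illustrative model: will the former primary be reinstated automatically?"""
    # Failovers triggered by a user-configurable condition or by an application
    # call to DBMS_DG.INITIATE_FS_FAILOVER are never reinstated automatically,
    # so any cause outside the set above yields False.
    return auto_reinstate and cause in AUTO_REINSTATE_CAUSES
```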

FastStartFailoverTarget

The FastStartFailoverTarget configuration property specifies the DB_UNIQUE_NAME values of the databases that are eligible to be targets of a fast-start failover when this database is the primary database.

ObserverPingInterval

The ObserverPingInterval configuration property specifies how frequently the observer must ping the primary database. This property is measured in milliseconds. The minimum value is 100 milliseconds. To achieve lower detection times for primary database failures, you must set the ObserverPingInterval and ObserverPingRetry properties before enabling fast-start failover.

In low network latency environments where extremely short primary failure detection times are necessary, the ObserverPingInterval and ObserverPingRetry properties can be combined to reduce detection time to as low as one second.

ObserverPingRetry

The ObserverPingRetry configuration property specifies the number of times that the observer retries a failed ping before it initiates a failover to the target standby database. A failed ping is a ping to the primary database that failed or took longer than the time specified by the ObserverPingInterval property. You must set both the ObserverPingRetry and ObserverPingInterval properties to achieve lower detection times for primary database failures. The minimum value is 10.
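
A rough way to reason about the two ping properties is to multiply them. The sketch below is an assumption, not a documented formula: it estimates detection time as ping interval times retry count, which is consistent with the one-second figure and the stated minimums (100 milliseconds and 10 retries). The function name is hypothetical.

```python
MIN_PING_INTERVAL_MS = 100  # minimum ObserverPingInterval stated above
MIN_PING_RETRY = 10         # minimum ObserverPingRetry stated above


def estimated_detection_secs(ping_interval_ms: int, ping_retry: int) -> float:
    """Rough estimate (assumed formula): interval x retries, in seconds."""
    # Only explicitly configured values are modeled here, so anything below
    # the documented minimums is rejected.
    if ping_interval_ms < MIN_PING_INTERVAL_MS:
        raise ValueError("ObserverPingInterval must be at least 100 ms")
    if ping_retry < MIN_PING_RETRY:
        raise ValueError("ObserverPingRetry must be at least 10")
    return ping_interval_ms * ping_retry / 1000
```

Under this estimate, the minimum settings give `estimated_detection_secs(100, 10)`, which evaluates to 1.0 second.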

ObserverReconnect

The ObserverReconnect configuration property specifies how often the observer establishes a new connection to the primary database. When this property is set to the default value of 0, it prevents the observer from periodically establishing a new connection with the primary database. While this eliminates the processing overhead associated with periodically establishing a new observer connection to the primary database, it also prevents the observer from detecting that it is not possible to create new connections to the primary database. Note that logging in and out of the database is a resource-intensive operation. Given that, Oracle recommends that this property be set so the value specified is small enough to allow timely detection of faults at the primary database, but large enough to limit the impact of logging in to and out of the primary database.

ObserverOverride

The ObserverOverride configuration property, when set to TRUE, allows an automatic failover to occur when the observer has lost connectivity to the primary, even if the standby has a healthy connection to the primary.
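
The effect of ObserverOverride on the connectivity-based failover decision can be sketched as below. This is an illustrative Python model with hypothetical names. It assumes that FastStartFailoverThreshold has already elapsed and that the property is off unless set to TRUE, and it takes as the baseline that a failover normally requires both the observer and the standby to have lost contact with the primary.

```python
def connectivity_failover(observer_lost_primary: bool,
                          standby_lost_primary: bool,
                          observer_override: bool = False) -> bool:
    """Illustrative model, evaluated after the threshold has elapsed."""
    # The observer must have lost the primary in every case.
    if not observer_lost_primary:
        return False
    # Baseline: the standby must have lost the primary too.
    # With ObserverOverride set to TRUE, the observer's view alone is enough.
    return standby_lost_primary or observer_override
```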

Configure Conditions for Fast-Start Failover

By default, a fast-start failover is performed when neither the observer nor the standby can reach the primary after the configured time threshold (FastStartFailoverThreshold) has passed.

There are also other conditions under which you might want a fast-start failover to occur.

The configurable conditions fall into two classes: those detected through the database health-check mechanism and those detected through errors raised by the Oracle server (such as ORA errors). When a specified condition occurs, the observer will initiate a fast-start failover without waiting for FastStartFailoverThreshold to expire, assuming the standby is in a valid state to accept a failover.

Each condition may be enabled or disabled individually. The Oracle Data Guard configuration persists all user specified configurable fast-start failover conditions in the broker configuration file.

The observer will detect when the primary database has signaled any of the enabled health-check conditions and will immediately initiate a fast-start failover, assuming the standby is in a valid fast-start failover state (observed and either synchronized or within lag limits) to accept a failover.

For specified Oracle ORA-Error conditions, the primary database notifies the observer if the error is signaled, and the observer immediately initiates a fast-start failover, assuming the standby is in a valid fast-start failover state (observed and either synchronized or within lag limits) to accept a failover. Note that the only Oracle ORA-Error for which fast-start failover can be triggered is ORA-240.
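
The condition-based trigger can be sketched as a predicate over the enabled conditions and the state of the target standby. This is an illustrative Python model, not broker logic; the function name, parameter names, and the way conditions are represented as strings are all assumptions made for this example.

```python
from typing import Set


def condition_triggers_failover(condition: str,
                                enabled_conditions: Set[str],
                                standby_observed: bool,
                                standby_synchronized: bool,
                                standby_within_lag_limit: bool) -> bool:
    """Illustrative model: does a signaled condition start a failover?"""
    # Valid fast-start failover state: observed, and either synchronized
    # or within the configured lag limit.
    target_ready = standby_observed and (
        standby_synchronized or standby_within_lag_limit
    )
    # Only individually enabled conditions count. There is no wait for
    # FastStartFailoverThreshold in this path.
    return condition in enabled_conditions and target_ready
```

For instance, with only "ORA-240" enabled, the predicate is true when the standby is observed and within the lag limit, and false when the standby is not observed.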

Note:

When a fast-start failover is triggered by one of these configurable conditions, the primary database will shut down and the observer will not attempt to automatically reinstate the former primary database.

Application Initiated Fast-Start Failover

You can use the DBMS_DG PL/SQL package to allow an application to direct a fast-start failover when it encounters specific conditions.

See "Directing a Fast-Start Failover From an Application".