Managing Fast-Start Failover
You can enable fast-start failover to allow the broker to determine if a failover is necessary and to initiate a failover to a standby database from a list of one or more pre-specified target standby databases.
The failover can be set up for either no data loss or a configurable amount of data loss. In addition, you can specify under which conditions or errors you want a failover to be initiated. Oracle also provides the DBMS_DG PL/SQL package to allow an application to request a fast-start failover.
You use broker configuration properties to control the behavior of fast-start failover. You can also use Cloud Control or the DGMGRL ENABLE FAST_START FAILOVER CONDITION and DISABLE FAST_START FAILOVER CONDITION commands to specify conditions for which a fast-start failover should occur.
Configure Properties to Tune Fast-Start Failover
You can set various properties to tune how fast-start failover behaves.
The configurable properties for fast-start failover include:
-
FastStartFailoverThreshold
Set the FastStartFailoverThreshold configuration property to specify the number of seconds you want the observer and target standby database to wait (after detecting the primary database is unavailable) before initiating a failover. See Enabling Fast-Start Failover for more information and an example.
-
FastStartFailoverPmyShutdown
The FastStartFailoverPmyShutdown configuration property controls whether the primary database will shut down if redo generation has been stalled (the FS_FAILOVER_STATUS column of V$DATABASE contains a value of STALLED) and the primary database has lost connectivity with the observer and target standby database for longer than the number of seconds specified by the FastStartFailoverThreshold configuration property. The default value for FastStartFailoverPmyShutdown is TRUE.
Note:
The primary database is always shut down if a user-configurable fast-start failover condition is detected or if an application initiated a fast-start failover by calling the DBMS_DG.INITIATE_FS_FAILOVER function.
-
FastStartFailoverLagLimit
The fast-start failover feature can be configured on databases operating in maximum performance mode. Destinations that receive redo in ASYNC mode will be acceptable fast-start failover target standby databases, and these destinations can lag the primary in terms of redo received and applied. A configurable time-based limit can be specified through the FastStartFailoverLagLimit configuration property. If the standby database's applied redo point is within this many seconds of the primary's redo generation point, a fast-start failover will be allowed. If its applied point lags beyond this limit, a fast-start failover is not allowed.
The FastStartFailoverLagLimit configuration property can also be used if fast-start failover is enabled when the configuration is operating in maximum availability mode. It cannot be used when the configuration is operating in maximum protection mode. When FastStartFailoverLagLimit is set to a non-zero value and the configuration is operating in maximum availability mode, a zero data loss failover or a data loss failover is possible. If a data loss failover is performed, the amount of data loss will not exceed the number of seconds specified by the FastStartFailoverLagLimit configuration property.
Note that the redo transport mode of the target standby must be set to SYNC, FASTSYNC, or ASYNC when a non-zero value is specified for the FastStartFailoverLagLimit property and the protection mode is maximum availability mode or maximum performance mode. If you want to change the protection mode, or change the redo transport mode to SYNC or FASTSYNC, you must first disable fast-start failover. Likewise, changing the protection mode from maximum availability mode to maximum performance mode requires first disabling fast-start failover.
Reinstatement of an old primary will be possible after a fast-start failover to a target standby. If the observer rediscovers the old primary, it will automatically reinstate the old primary, and any redo generated within the specified lag will be lost.
Note:
It is recommended to set FastStartFailoverLagLimit lower than or equal to FastStartFailoverThreshold. When FastStartFailoverLagLimit is set greater than FastStartFailoverThreshold, the primary will keep committing after the fast-start failover is initiated, exposing the configuration to the risk of having two primary databases for a short period of time.
See Also:
Oracle Data Guard Broker Properties for more information
-
FastStartFailoverLagType
The FastStartFailoverLagType value (APPLY or TRANSPORT) in the fast-start failover configuration specifies the type of lag (apply lag or transport lag) that is used to specify the data loss threshold.
-
FastStartFailoverLagGraceTime
The FastStartFailoverLagGraceTime value for the fast-start failover configuration specifies the maximum amount of time (in seconds) that can pass before the lag limit (FastStartFailoverLagLimit) is reached when the primary database requests permission to move to the lagging state.
-
FastStartFailoverAutoReinstate
The FastStartFailoverAutoReinstate configuration property controls whether the former primary database is automatically reinstated if a fast-start failover occurred because the primary database crashed or was stalled for longer than FastStartFailoverThreshold seconds. The default value for FastStartFailoverAutoReinstate is TRUE.
If you want to perform diagnostic or repair work after failover has completed, you can avoid an automatic reinstatement by setting the FastStartFailoverAutoReinstate configuration property to FALSE.
Note:
The former primary database is never automatically reinstated if a fast-start failover occurred because a user-configurable fast-start failover condition was detected or because an application initiated a fast-start failover by calling the DBMS_DG.INITIATE_FS_FAILOVER function.
-
FastStartFailoverTarget
The FastStartFailoverTarget configuration property specifies the DB_UNIQUE_NAME values of the databases that are eligible to be targets of a fast-start failover when this database is the primary database.
-
ObserverPingInterval
The ObserverPingInterval configuration property specifies how frequently the observer must ping the primary database. This property is measured in milliseconds. The minimum value is 100 milliseconds. To achieve lower detection times for primary database failures, you must set the ObserverPingInterval and ObserverPingRetry properties before enabling fast-start failover.
In low network latency environments where extremely short primary failure detection times are necessary, a combination of ObserverPingInterval and ObserverPingRetry can be used to reduce detection time to as low as one second.
-
ObserverPingRetry
The ObserverPingRetry configuration property specifies the number of times that the observer retries a failed ping before it initiates a failover to the target standby database. A failed ping is a ping to the primary database that failed or took longer than the time specified by the ObserverPingInterval property. You must set both the ObserverPingRetry and ObserverPingInterval properties to achieve lower detection times for primary database failures. The minimum value is 10.
-
ObserverReconnect
The ObserverReconnect configuration property specifies how often the observer establishes a new connection to the primary database. When this property is set to the default value of 0, it prevents the observer from periodically establishing a new connection with the primary database. While this eliminates the processing overhead associated with periodically establishing a new observer connection to the primary database, it also prevents the observer from detecting that it is not possible to create new connections to the primary database. Note that logging in and out of the database is a resource-intensive operation. Given that, Oracle recommends that this property be set so the value specified is small enough to allow timely detection of faults at the primary database, but large enough to limit the impact of logging in to and out of the primary database.
-
ObserverOverride
The ObserverOverride configuration property, when set to TRUE, allows an automatic failover to occur when the observer has lost connectivity to the primary, even if the standby has a healthy connection to the primary.
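How some of these properties relate to one another can be sketched in Python. This is an illustrative model only, not broker code: the function names are hypothetical, and the idea that detection time is roughly ObserverPingInterval multiplied by ObserverPingRetry is an assumption inferred from the stated minimums (100 milliseconds and 10 retries giving about one second).

```python
# Illustrative model only; none of these functions exist in Oracle software.

MIN_PING_INTERVAL_MS = 100  # minimum ObserverPingInterval stated in the text
MIN_PING_RETRY = 10         # minimum ObserverPingRetry stated in the text


def approx_detection_time_s(ping_interval_ms: int, ping_retry: int) -> float:
    """Rough failure detection time in seconds.

    ASSUMPTION: detection time ~= ping interval x number of retries.
    """
    if ping_interval_ms < MIN_PING_INTERVAL_MS:
        raise ValueError("ObserverPingInterval must be at least 100 ms")
    if ping_retry < MIN_PING_RETRY:
        raise ValueError("ObserverPingRetry must be at least 10")
    return ping_interval_ms * ping_retry / 1000


def within_lag_limit(standby_lag_s: float, lag_limit_s: float) -> bool:
    """True if the standby lag is within FastStartFailoverLagLimit seconds,
    that is, a fast-start failover would be allowed on lag grounds."""
    return standby_lag_s <= lag_limit_s


def follows_lag_recommendation(lag_limit_s: float, threshold_s: float) -> bool:
    """True if FastStartFailoverLagLimit <= FastStartFailoverThreshold,
    as the note on FastStartFailoverLagLimit recommends."""
    return lag_limit_s <= threshold_s


if __name__ == "__main__":
    print(approx_detection_time_s(100, 10))    # 1.0
    print(within_lag_limit(25, 30))            # True
    print(follows_lag_recommendation(45, 30))  # False
```

The last call shows a setting the note on FastStartFailoverLagLimit advises against: a lag limit of 45 seconds with a threshold of 30 seconds.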
Configure Conditions for Fast-Start Failover
By default, a fast-start failover is done when neither the observer nor the standby can reach the primary after the configured time threshold (FastStartFailoverThreshold) has passed.
There are also other conditions under which you might want a fast-start failover to occur.
The configurable conditions fall into two classes: those detected through the database health-check mechanism and those detected through errors raised by the Oracle server (such as ORA errors). When a specified condition occurs, the observer will initiate a fast-start failover without waiting for FastStartFailoverThreshold to expire, assuming the standby is in a valid state to accept a failover.
Each condition may be enabled or disabled individually. The Oracle Data Guard configuration persists all user specified configurable fast-start failover conditions in the broker configuration file.
The observer will detect when the primary database has signaled any of the enabled health-check conditions and will immediately initiate a fast-start failover, assuming the standby is in a valid fast-start failover state (observed and either synchronized or within lag limits) to accept a failover.
For specified ORA error conditions, the primary database will notify the observer if the error is signaled, and the observer will immediately initiate a fast-start failover, assuming the standby is in a valid fast-start failover state (observed and either synchronized or within lag limits) to accept a failover. Note that the only ORA error for which a fast-start failover can be triggered is ORA-240.
Note:
When a fast-start failover is initiated because a configurable condition was detected, the primary database will shut down and the observer will not attempt to automatically reinstate the former primary database.
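The trigger rules described in this section can be summarized in a small Python sketch. It is a simplified model under stated assumptions, not the observer's actual algorithm: the `Decision` type, the `decide` function, and all parameter names are hypothetical, and the model assumes the standby must be in a valid fast-start failover state in the default case as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    initiate: bool               # should a fast-start failover be initiated?
    reason: str
    may_auto_reinstate: bool     # could the old primary be auto-reinstated?


def decide(*, observer_reaches_primary: bool, standby_reaches_primary: bool,
           seconds_unreachable: float, threshold_s: float,
           condition_raised: bool, standby_ready: bool,
           observer_override: bool = False,
           auto_reinstate: bool = True) -> Decision:
    """Hypothetical model of the failover triggers described in the text."""
    # ASSUMPTION: every trigger requires a standby in a valid state
    # (observed and either synchronized or within lag limits).
    if not standby_ready:
        return Decision(False, "standby not in a valid state", False)

    # An enabled configurable condition triggers failover immediately,
    # without waiting for FastStartFailoverThreshold; the former primary
    # is never automatically reinstated in this case.
    if condition_raised:
        return Decision(True, "configurable condition detected", False)

    # Default trigger: neither observer nor standby can reach the primary.
    # With ObserverOverride set, the observer's loss alone is enough.
    observer_lost = not observer_reaches_primary
    standby_lost = not standby_reaches_primary
    primary_lost = observer_lost and (standby_lost or observer_override)
    if primary_lost and seconds_unreachable >= threshold_s:
        return Decision(True, "primary unreachable past threshold",
                        auto_reinstate)

    return Decision(False, "no failover trigger", False)


if __name__ == "__main__":
    print(decide(observer_reaches_primary=True, standby_reaches_primary=True,
                 seconds_unreachable=0, threshold_s=30,
                 condition_raised=True, standby_ready=True))
```

In this model, the `auto_reinstate` parameter stands in for FastStartFailoverAutoReinstate and only matters for the default, threshold-based trigger.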
See Also:
-
Cloud Control online help