The two-phase commit mechanism ensures that all nodes either commit or perform a rollback together. What happens if any of the three phases fails because of a system or network error? The transaction becomes in-doubt.
Distributed transactions can become in-doubt in the following ways:
A server machine running Oracle Database software crashes
A network connection between two or more Oracle Databases involved in distributed processing is disconnected
An unhandled software error occurs
The RECO process automatically resolves in-doubt transactions when the machine, network, or software problem is resolved. Until RECO can resolve the transaction, the data is locked for both reads and writes. The database blocks reads because it cannot determine which version of the data to display for a query.
This section contains the following topics:
In the majority of cases, the database resolves the in-doubt transaction automatically. Assume that there are two nodes,
remote, in the following scenarios. The local node is the commit point site. User
scott connects to
local and executes and commits a distributed transaction that updates
Figure 32-5 illustrates the sequence of events when there is a failure during the prepare phase of a distributed transaction:
The following steps occur:
SCOTT connects to
Local and executes a distributed transaction.
The global coordinator, which in this example is also the commit point site, requests all databases other than the commit point site to promise to commit or roll back when told to do so.
remote database crashes before issuing the prepare response back to
The transaction is ultimately rolled back on each database by the RECO process when the remote site is restored.
Figure 32-6 illustrates the sequence of events when there is a failure during the commit phase of a distributed transaction:
The following steps occur:
Scott connects to
local and executes a distributed transaction.
The global coordinator, which in this case is also the commit point site, requests all databases other than the commit point site to promise to commit or roll back when told to do so.
The commit point site receives a prepared message from
remote saying that it will commit.
The commit point site commits the transaction locally, then sends a commit message to
remote asking it to commit.
remote database receives the commit message, but cannot respond because of a network failure.
The transaction is ultimately committed on the remote database by the RECO process after the network is restored.
See Also:"Deciding How to Handle In-Doubt Transactions" for a description of failure situations and how the database resolves intervening failures during two-phase commit
The in-doubt transaction has locks on critical data or undo segments.
The cause of the machine, network, or software failure cannot be repaired quickly.
Resolution of in-doubt transactions can be complicated. The procedure requires that you do the following:
Identify the transaction identification number for the in-doubt transaction.
DBA_2PC_NEIGHBORS views to determine whether the databases involved in the transaction have committed.
If necessary, force a commit using the
COMMIT FORCE statement or a rollback using the
ROLLBACK FORCE statement.
A system change number (SCN) is an internal timestamp for a committed version of the database. The Oracle Database server uses the SCN clock value to guarantee transaction consistency. For example, when a user commits a transaction, the database records an SCN for this commit in the redo log.
The database uses SCNs to coordinate distributed transactions among different databases. For example, the database uses SCNs in the following way:
An application establishes a connection using a database link.
The distributed transaction commits with the highest global SCN among all the databases involved.
The commit global SCN is sent to all databases involved in the transaction.
SCNs are important for distributed transactions because they function as a synchronized commit timestamp of a transaction, even if the transaction fails. If a transaction becomes in-doubt, an administrator can use this SCN to coordinate changes made to the global database. The global SCN for the transaction commit can also be used to identify the transaction later, for example, in distributed recovery.