Managing Transactions

The following sections describe administration tasks used to manage transactions.

You can monitor transactions on a server using statistics and monitoring facilities. Use the Administration Console to configure these features and to display the resulting output.

Monitoring Transactions

In the Administration Console, you can monitor transactions for each server in the domain. Transaction statistics are displayed for a specific server, not the entire domain.

For instructions, see the following pages in the Administration Console Online Help:

“View transaction statistics” (and Servers: Monitoring: JTA: Summary)
“View statistics for named transactions” (and Servers: Monitoring: JTA: Transactions By Name)
“View transaction statistics for XA resources” (and Servers: Monitoring: JTA: XA Resources)
“View transaction statistics for non-XA resources” (and Servers: Monitoring: JTA: Non-XA Resources)
“View current transactions” (and Servers: Monitoring: JTA: Transactions)
“View transaction recovery statistics” (and Servers: Monitoring: JTA: Recovery Services)

Handling Heuristic Completions

A heuristic completion (or heuristic decision) occurs when a resource makes a unilateral decision during the completion stage of a distributed transaction to commit or roll back updates. This can leave distributed data in an indeterminate state. Network failures or resource timeouts are possible causes of heuristic completion. In the event of a heuristic completion, one of the following heuristic outcome exceptions may be thrown:

HeuristicRollback: one resource participating in a transaction decided to autonomously roll back its work, even though it agreed to prepare itself and wait for a commit decision. If the Transaction Manager decided to commit the transaction, the resource's heuristic rollback decision was incorrect, and might lead to an inconsistent outcome since other branches of the transaction were committed.
HeuristicCommit: one resource participating in a transaction decided to autonomously commit its work, even though it agreed to prepare itself and wait for a commit decision. If the Transaction Manager decided to roll back the transaction, the resource's heuristic commit decision was incorrect, and might lead to an inconsistent outcome since other branches of the transaction were rolled back.
HeuristicMixed: the Transaction Manager is aware that a transaction resulted in a mixed outcome, where some participating resources committed and some rolled back. The underlying cause was most likely heuristic rollback or heuristic commit decisions made by one or more of the participating resources.
HeuristicHazard: the Transaction Manager is aware that a transaction might have resulted in a mixed outcome, where some participating resources committed and some rolled back, but system or resource failures make it impossible to know for sure whether a HeuristicMixed outcome occurred. The underlying cause was most likely heuristic rollback or heuristic commit decisions made by one or more of the participating resources.
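At the XA level, a resource manager reports these outcomes through standard error codes on javax.transaction.xa.XAException. The following is a minimal sketch, not WebLogic code, that maps those codes to the outcome names listed above. The class name HeuristicOutcomes and the method describe are hypothetical and exist only for this illustration; how WebLogic Server itself surfaces these outcomes is not shown here.

```java
import javax.transaction.xa.XAException;

// Illustrative helper (hypothetical, not a WebLogic class).
class HeuristicOutcomes {

    // Translates a standard XA error code into the outcome name used in the list above.
    static String describe(XAException e) {
        switch (e.errorCode) {
            case XAException.XA_HEURRB:  return "HeuristicRollback";
            case XAException.XA_HEURCOM: return "HeuristicCommit";
            case XAException.XA_HEURMIX: return "HeuristicMixed";
            case XAException.XA_HEURHAZ: return "HeuristicHazard";
            default:                     return "None"; // not a heuristic code
        }
    }

    public static void main(String[] args) {
        XAException mixed = new XAException(XAException.XA_HEURMIX);
        System.out.println(describe(mixed)); // prints "HeuristicMixed"
    }
}
```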

When a heuristic completion occurs, a message is written to the server log. Refer to your database vendor documentation for instructions on resolving heuristic completions.

Some resource managers save context information for heuristic completions. This information can be helpful in resolving resource manager data inconsistencies. If the ForgetHeuristics attribute is selected (set to true) on the JTA panel of the WebLogic Console, this information is removed after a heuristic completion. When using a resource manager that saves context information, you may want to set the ForgetHeuristics attribute to false.

Moving a Server

A server instance is identified by its URL (IP address or DNS name plus the listening port number). Moving the server to a new machine, or changing the listening port of a server on the same machine, changes the URL and effectively moves the server, so the server identity may no longer match the information stored in the transaction logs.

If the new server has the same URL as the old server, the Transaction Recovery Service searches all transaction log files for incomplete transactions and completes them as described in Transaction Recovery Service Actions After a Crash.
If the new server does not have the same URL, any pending transactions stored in the transaction log files are unrecoverable. If you wish, you can delete the transaction log files. This step prevents the Transaction Recovery Service from attempting to resolve these transactions until the value of the AbandonTimeoutSeconds parameter is exceeded. See Abandoning Transactions for more information.
If a server acting as a remote transaction sub-coordinator fails and its URL changes, any ongoing transactions will not complete (commit or roll back) because the coordinator is unable to communicate with the remote sub-coordinator. The coordinator will attempt the commit or rollback request until AbandonTimeoutSeconds is exceeded. See Abandoning Transactions for more information.

BEA recommends configuring server instances using DNS names rather than IP addresses to promote portability.

Abandoning Transactions

You can choose to abandon incomplete transactions after a specified amount of time. In the two-phase commit process for distributed transactions, the transaction manager coordinates all resource managers involved in a transaction. After all resource managers vote to commit or roll back, the transaction manager notifies the resource managers to act, that is, to either commit or roll back changes. During this second phase of the two-phase commit process, the transaction manager continues to try to complete the transaction until all resource managers indicate that the transaction is completed.

Using the AbandonTimeoutSeconds attribute, you can set the maximum time, in seconds, that a transaction manager persists in attempting to complete a transaction during the second phase of the commit protocol. The default value is 86400 seconds, or 24 hours. After the abandon transaction timer expires, no further attempt is made to resolve the transaction with any resources that are unavailable or unable to acknowledge the transaction outcome. If the transaction is in a prepared state before being abandoned, the transaction manager rolls back the transaction to release any locks held on behalf of the abandoned transaction and writes a heuristic error to the server log.
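The retry-until-abandoned behavior can be pictured as a loop bounded by a deadline. The sketch below is a toy model under stated assumptions, not the WebLogic implementation: the names AbandonTimeoutSketch and completeOrAbandon are hypothetical, the second-phase call is represented by a callback that returns true once the resource acknowledges, and the clock is injected so the example runs without waiting.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Toy model of the abandon timer (hypothetical names, not WebLogic code).
class AbandonTimeoutSketch {

    enum Result { COMPLETED, ABANDONED }

    /**
     * Repeats the second-phase call until it is acknowledged or the
     * abandon timeout has elapsed since the first attempt.
     */
    static Result completeOrAbandon(BooleanSupplier attempt,
                                    LongSupplier clockSeconds,
                                    long abandonTimeoutSeconds) {
        long deadline = clockSeconds.getAsLong() + abandonTimeoutSeconds;
        while (true) {
            if (attempt.getAsBoolean()) {
                return Result.COMPLETED;
            }
            if (clockSeconds.getAsLong() >= deadline) {
                return Result.ABANDONED;
            }
            // A real implementation would wait here before retrying.
        }
    }

    public static void main(String[] args) {
        long[] now = {0};   // simulated clock, advanced one hour per reading
        int[] calls = {0};  // number of completion attempts made
        Result r = completeOrAbandon(
                () -> ++calls[0] >= 3,   // resource acknowledges on the third attempt
                () -> now[0] += 3600,
                86_400);                 // default from the text: 24 hours
        System.out.println(r + " after " + calls[0] + " attempts"); // prints "COMPLETED after 3 attempts"
    }
}
```

If the callback never succeeds, the loop returns ABANDONED once the simulated clock passes the deadline, which corresponds to the point where no further attempts are made.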

For instructions on how to set the AbandonTimeoutSeconds attribute, see “Configure JTA” in the Administration Console Online Help.
For more information about the two-phase commit process, see Distributed Transactions and the Two-Phase Commit Protocol.

Transaction Recovery After a Server Fails

The WebLogic Server transaction manager is designed to recover from system crashes with minimal user intervention. The transaction manager makes every effort to resolve transaction branches that are prepared by resource managers with a commit or rollback, even after multiple crashes or crashes during recovery.

To facilitate recovery after a crash, WebLogic Server provides the Transaction Recovery Service, which automatically attempts to recover transactions on system startup. On startup, the Transaction Recovery Service parses all transaction log records for incomplete transactions and completes them as described in Transaction Recovery Service Actions After a Crash.

Because the Transaction Recovery Service is designed to gracefully handle transaction recovery after a crash, BEA recommends that you attempt to restart a crashed server and allow the Transaction Recovery Service to handle incomplete transactions.

If a server crashes and you do not expect to be able to restart it within a reasonable period of time, you may need to take action. Procedures for recovering transactions after a server failure differ based on your WebLogic Server environment. For a non-clustered server, you can manually move the server (with the default persistent store DAT file) to another system (machine) to recover transactions. See Recovering Transactions for a Failed Non-Clustered Server for more information. For a server in a cluster, you can manually migrate the whole server or the Transaction Recovery Service to another server in the same cluster. Migrating the Transaction Recovery Service involves selecting a server with access to the transaction logs to recover transactions, and then migrating the service using the Administration Console or the WebLogic command line interface.

The following sections describe how to recover transactions after a failure.

Transaction Recovery Service Actions After a Crash

When you restart a server after a crash or when you migrate the Transaction Recovery Service to another (backup) server, the Transaction Recovery Service does the following:

Complete transactions ready for second phase of two-phase commit

For transactions for which a commit decision has been made but the second phase of the two-phase commit process has not completed (transactions recorded in the transaction log), the Transaction Recovery Service completes the commit process.

Resolve prepared transactions

For transactions that the transaction manager has prepared with a resource manager (transactions in phase one of the two-phase commit process), the Transaction Recovery Service must call XAResource.recover() during crash recovery for each resource manager and eventually resolve (by calling the commit(), rollback(), or forget() method) all transaction IDs returned by recover().

Report heuristic completions

If a resource manager reports a heuristic exception, the Transaction Recovery Service records the heuristic exception in the server log and calls forget() if the Forget Heuristics configuration attribute is enabled. If the Forget Heuristics configuration attribute is not enabled, refer to your database vendor’s documentation for information about resolving heuristic completions. See Handling Heuristic Completions for more information.
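The report-then-forget behavior can be sketched with the standard javax.transaction.xa interfaces. This is an illustration under assumptions rather than the Transaction Recovery Service itself: the class HeuristicReportSketch, the method commitAndReport, and the forgetHeuristics parameter (standing in for the Forget Heuristics attribute) are hypothetical, and java.util.logging stands in for the server log.

```java
import java.util.logging.Logger;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Illustrative only (hypothetical names, not WebLogic code).
class HeuristicReportSketch {
    private static final Logger LOG = Logger.getLogger("recovery-sketch");

    // True for the four standard XA heuristic error codes.
    static boolean isHeuristic(int xaErrorCode) {
        return xaErrorCode == XAException.XA_HEURCOM
            || xaErrorCode == XAException.XA_HEURRB
            || xaErrorCode == XAException.XA_HEURMIX
            || xaErrorCode == XAException.XA_HEURHAZ;
    }

    /**
     * Commits one branch. If the resource manager reports a heuristic outcome,
     * logs it and calls forget() only when forgetHeuristics is enabled.
     * Returns true if a heuristic outcome was reported.
     */
    static boolean commitAndReport(XAResource rm, Xid xid, boolean forgetHeuristics)
            throws XAException {
        try {
            rm.commit(xid, false);
            return false;
        } catch (XAException e) {
            if (!isHeuristic(e.errorCode)) {
                throw e; // other failures are not handled in this sketch
            }
            LOG.warning("Heuristic completion reported, XA error code " + e.errorCode);
            if (forgetHeuristics) {
                rm.forget(xid); // lets the resource manager discard its saved context
            }
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isHeuristic(XAException.XA_HEURHAZ)); // prints "true"
    }
}
```

When forgetHeuristics is false, the sketch leaves the resource manager's saved context in place, which matches the case where that information is kept for manual resolution.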

Maintain consistency across resources

The Transaction Recovery Service handles transaction recovery in a consistent, predictable manner. If a commit decision was made for a transaction but the transaction was not yet committed before the crash, and XAResource.recover() returns its transaction ID, the Transaction Recovery Service always calls XAResource.commit(). If no commit decision was made for a transaction before the crash, and XAResource.recover() returns its transaction ID, the Transaction Recovery Service always calls XAResource.rollback(). With consistent, predictable transaction recovery, a transaction manager crash by itself cannot cause a mixed heuristic completion where some branches are committed and some are rolled back.
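The decision rule described above can be sketched against the standard XAResource interface. This is a simplified illustration, not the WebLogic implementation: RecoveryDecisionSketch, SimpleXid, key, and resolveInDoubt are hypothetical names, and the set of logged commit decisions is a stand-in for whatever the transaction log actually records.

```java
import java.util.Arrays;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Illustrative only (hypothetical names, not WebLogic code).
class RecoveryDecisionSketch {

    // Minimal Xid used only to make the example self-contained.
    static final class SimpleXid implements Xid {
        private final byte id;
        SimpleXid(int id) { this.id = (byte) id; }
        public int getFormatId() { return 1; }
        public byte[] getGlobalTransactionId() { return new byte[] { id }; }
        public byte[] getBranchQualifier() { return new byte[] { 0 }; }
    }

    // Identifies a global transaction by format ID and global transaction ID.
    static String key(Xid xid) {
        return xid.getFormatId() + ":" + Arrays.toString(xid.getGlobalTransactionId());
    }

    /**
     * Resolves every in-doubt branch reported by one resource manager.
     * Branches with a logged commit decision are committed; all others
     * are rolled back. Returns the number of branches committed.
     */
    static int resolveInDoubt(XAResource rm, Set<String> loggedCommitDecisions)
            throws XAException {
        Xid[] inDoubt = rm.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        if (inDoubt == null) {
            return 0;
        }
        int committed = 0;
        for (Xid xid : inDoubt) {
            if (loggedCommitDecisions.contains(key(xid))) {
                rm.commit(xid, false); // commit decision was logged: always commit
                committed++;
            } else {
                rm.rollback(xid);      // no commit decision logged: always roll back
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        System.out.println(key(new SimpleXid(7))); // prints "1:[7]"
    }
}
```

Because the outcome depends only on the logged decision, running the same routine against every resource manager yields the same result for every branch of a given transaction.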

Persist in achieving transaction resolution

If a resource manager crashes, the Transaction Recovery Service keeps calling commit() or rollback() for each prepared transaction until the call returns successfully. The attempts to resolve the transaction can be limited by setting the AbandonTimeoutSeconds configuration attribute. See Abandoning Transactions for more information.

Recovering Transactions for a Failed Non-Clustered Server

Move (or make available) the persistent store DAT file (which contains all transaction log records) from the failed server to a new server.
Set the path for the default persistent store with the path to the data file. See Setting the Path for the Default Persistent Store.
Start the new server. The Transaction Recovery Service searches all transaction log files for incomplete transactions and completes them as described in Transaction Recovery Service Actions After a Crash.

When moving transaction log records after a server failure, make all transaction log records available on the new machine before starting the server there. Otherwise, transactions in the process of being committed at the time of a crash might not be resolved correctly, resulting in application data inconsistencies. You can accomplish this by storing persistent store data files on a dual-ported disk available to both machines. As in the case of a planned migration, update the default file store directory attribute with the new path before starting the server if the pathname is different on the new machine.
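Before starting the server on the new machine, it can help to confirm that the store directory actually contains data files. The sketch below is a hypothetical pre-start check, not a WebLogic utility: the class StorePreflight and the method findStoreFiles are invented for illustration, and matching on a .DAT extension is an assumption based only on the "DAT file" wording above. It does not verify that the files are complete or current.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical pre-start check (not a WebLogic utility).
class StorePreflight {

    // Returns the regular files ending in .DAT (case-insensitive) in storeDir,
    // or an empty list if storeDir is not a directory.
    static List<Path> findStoreFiles(Path storeDir) throws IOException {
        if (!Files.isDirectory(storeDir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> entries = Files.list(storeDir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toUpperCase(Locale.ROOT).endsWith(".DAT"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // The store directory is passed as the first argument; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files = findStoreFiles(dir);
        System.out.println(files.isEmpty()
                ? "No DAT files found in " + dir + "; do not start the server yet."
                : "Found store files: " + files);
    }
}
```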

Recovering Transactions for a Failed Clustered Server

When a clustered server fails, you have the following options for recovering transactions:

Server Migration

For clustered servers, WebLogic Server enables you to migrate a failing server to a new machine, including the Transaction Recovery Service. When the server migrates to another machine, it must be able to locate the transaction log records to complete or recover transactions. Transaction log records are stored in the default persistent store for the server. If you plan to migrate clustered servers in the event of a failure, you must set up the default persistent store so that it stores records in a shared storage system that is accessible to any potential machine to which a failed migratable server might be migrated. For highest reliability, use a shared storage solution that is itself highly available—for example, a storage area network (SAN).

For information about server migration, see “Server Migration” in Using WebLogic Server Clusters.

Transaction Recovery Service Migration

When a clustered server crashes, you can manually migrate the Transaction Recovery Service from the crashed server to another server in the same cluster using the Administration Console or the command line interface. The following events occur:

The Transaction Recovery Service on the backup server takes ownership of the transaction log from the crashed server.
The Transaction Recovery Service searches all transaction log records from the failed server for incomplete transactions and completes them as described in Transaction Recovery Service Actions After a Crash.
If the Transaction Recovery Service on the backup server successfully completes all incomplete transactions from the failed server, the server releases ownership of the Transaction Recovery Service for the failed server so the failed server can reclaim it upon restart.

For instructions on migrating the Transaction Recovery Service using the Administration Console, see “Migrate the Transaction Recovery Service” in the Administration Console Online Help.

A server can perform transaction recovery for more than one failed server. While recovering transactions for other servers, the backup server continues to process and recover its own transactions. If the backup server fails during recovery, you can migrate the Transaction Recovery Service to yet another server, which will continue the transaction recovery. You can also manually migrate the Transaction Recovery Service back to the original failed server using the Administration Console or the command line interface. See Manually Migrating the Transaction Recovery Service Back to the Original Server for more information.

When a backup server completes transaction recovery for a server, it releases ownership of the Transaction Recovery Service for the failed server. When you restart a failed server, it attempts to reclaim ownership of its Transaction Recovery Service. If a backup server is in the process of recovering transactions when you restart the failed server, the backup server stops recovering transactions, performs some internal cleanup, and releases ownership of the Transaction Recovery Service so the failed server can reclaim it and start properly. The failed server will then complete its own transaction recovery.

If a backup server still owns the Transaction Recovery Service for a failed server and the backup server is inactive when you attempt to restart the failed server, the failed server will not start because the backup server cannot release ownership of the Transaction Recovery Service. This is also true if the fail back mechanism fails or if the backup server cannot communicate with the Administration Server. In these cases, you can manually migrate the Transaction Recovery Service using the Administration Console or the command line interface.
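The ownership rules in the preceding paragraphs can be summarized as a small state model. The sketch below is a deliberately simplified illustration, not WebLogic code: the class RecoveryServiceOwnership and its methods are hypothetical, server names are plain strings, and a single backupReachable flag stands in for all the conditions under which the backup server can or cannot release ownership.

```java
// Toy model of Transaction Recovery Service ownership (hypothetical, not WebLogic code).
class RecoveryServiceOwnership {
    private final String originalServer;
    private String owner; // null once ownership has been released

    RecoveryServiceOwnership(String originalServer) {
        this.originalServer = originalServer;
        this.owner = originalServer;
    }

    String owner() {
        return owner;
    }

    // Manual migration: only allowed while the current owner is stopped.
    void migrate(String destination, boolean currentOwnerRunning) {
        if (currentOwnerRunning) {
            throw new IllegalStateException("Stop " + owner + " before migrating");
        }
        owner = destination;
    }

    // The backup server has finished recovery and gives up ownership.
    void release() {
        owner = null;
    }

    // Restart of the original server: it starts only if it can reclaim ownership.
    boolean tryStartOriginal(boolean backupReachable) {
        if (owner == null || owner.equals(originalServer)) {
            owner = originalServer;
            return true;
        }
        if (backupReachable) {
            owner = originalServer; // backup stops recovery, cleans up, and releases
            return true;
        }
        return false; // backup still owns the service but cannot release it
    }

    public static void main(String[] args) {
        RecoveryServiceOwnership trs = new RecoveryServiceOwnership("server1");
        trs.migrate("server2", false);                   // server1 has crashed, so it is not running
        System.out.println(trs.tryStartOriginal(false)); // prints "false": backup is down and still owner
        trs.migrate("server1", false);                   // manual migration back; backup is not running
        System.out.println(trs.tryStartOriginal(false)); // prints "true"
    }
}
```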

Limitations of Migrating the Transaction Recovery Service

When migrating the Transaction Recovery Service, the following limitations apply:

You cannot migrate the Transaction Recovery Service to a backup server from a server that is running. You must stop the server before migrating the Transaction Recovery Service.
The backup server does not accept new transaction work for the failed server. It only processes incomplete transactions.
The backup server does not process heuristic log files.
The backup server only processes log records written by WebLogic Server. It does not process log records written by gateway implementations, including WebLogic Tuxedo Connector.

Preparing to Migrate the Transaction Recovery Service

To migrate the Transaction Recovery Service from a failed server in a cluster to another server (backup server) in the same cluster, the backup server must have access to the transaction log records from the failed server. Therefore, you must store default persistent store data files on persistent storage available to all potential backup servers in the cluster. BEA recommends that you store transaction log records on a Storage Area Network (SAN) device or a dual-ported disk. Do not use an NFS file system to store transaction log records. Because of the caching scheme in NFS, files on disk may not always be current. Using transaction log records stored on an NFS device for recovery may cause data corruption.

When migrating the Transaction Recovery Service from a server, you must stop the failing or failed server before actually migrating the Transaction Recovery Service. If the original server is still running, you cannot migrate the Transaction Recovery Service from it.

All servers that participate in the migration must have a listen address specified in their configuration. See “Configure listen addresses” in the Administration Console Online Help.

Constraining the Servers to Which the Transaction Recovery Service can Migrate

You may want to limit which servers can act as a Transaction Recovery Service backup for a server in a cluster. For example, not all servers in your cluster may have access to the transaction log records for a server. You can limit the list of destination servers available on the Servers: Configuration: Migration page in the Administration Console. See “Configure candidate servers for Transaction Recovery Service migration” for instructions.

Viewing Current Owner of the Transaction Recovery Service

When you migrate the Transaction Recovery Service to another server in the cluster, the backup server takes ownership of the Transaction Recovery Service until it completes all incomplete transactions. It then releases ownership of the Transaction Recovery Service so that the original server can reclaim it. You can see the current owner on the Servers: Control: Migration page in the Administration Console. Follow these instructions:

Manually Migrating the Transaction Recovery Service Back to the Original Server

After completing transaction recovery for a failed server, a backup server releases ownership of the Transaction Recovery Service so that the original server can reclaim it when the server is restarted. If the backup server stops (crashes) for any reason before it completes transaction recovery, the original server cannot reclaim ownership of the Transaction Recovery Service and will not start. You can manually migrate the Transaction Recovery Service back to the original server by selecting the original server as the Destination Server. The backup server must not be running when you migrate the service back to the original server. Follow the instructions below.

A backup server will continue to recover incomplete transactions after you restart it. You will not need to manually migrate the Transaction Recovery Service back to the original server if the backup server completes the transaction recovery.
If you restart the original server while the backup server is recovering transactions, the backup server will gracefully release ownership of the Transaction Recovery Service. You do not need to stop the backup server. See Recovering Transactions for a Failed Clustered Server.

Make sure the backup server is not running.
In the Domain Structure tree in the Administration Console, expand Environment and click Servers.
Select the original server from which the Transaction Recovery Service was migrated, then select the Control > Migration tab.
Click Advanced.
Under JTA Migration Options, in Migrate to Server, select the server from which the Transaction Recovery Service was migrated (should be the same as the Preferred Server).
Click Save.
