7. Error Management

Introduction

The purpose of this chapter is to review the transaction and communication concepts discussed in the preceding chapters with the focus on how to manage and interpret error conditions correctly.

What are the means used by the BEA TUXEDO system to communicate to the application that a function call has failed, allowing the programmer to implement the appropriate logic? What are the various scenarios for determining whether to commit or abort a transaction? What errors are fatal to transactions? How does transaction mode affect the concept of time-out and what are the implications? How does transaction mode affect the roles of the function primitives and how they may be used? What operations are part of one transaction and what are the determining factors? Does the fate of one transaction ever determine the fate of another? What communication rules must be followed between processes within and not within the same transaction? How do global transaction primitives affect the use of local transaction-defining functions (that is, functions used to explicitly mark the beginning and end of a local transaction) that may be part of the Data Manipulation Language (DML) that is native to the resource manager?

Many of these subjects have been touched upon already in earlier chapters. Now let's attempt to bring them together to explain the functionality of the ATMI, showing how the various pieces fit together following consistent rules that create an environment that combines message communication with transaction integrity.

Communicating Errors

The following discussion concerns how the BEA TUXEDO system communicates errors to the application developer. It is couched in terms of categories of errors and whether they are application or system-based. Hopefully, this discussion will give you more insight as to what errors to expect, what effect they have on transactions, and what kind of control you as a programmer have over them.

Throughout the guide, there has been a continual reference to the global variable tperrno. In an environment of concurrent processes, this is a key way to inform processes whether their function calls have succeeded. All the ATMI functions that normally return an integer or pointer return -1 or NULL on error and set tperrno to a value that reveals the nature of the error. When a function does not return to its caller, as with tpreturn() or tpforward(), which are called to terminate a service routine, the only way to communicate success or failure is through the global variable tperrno in the requester.

The global variable tpurcode can also be used to communicate user-defined conditions. The value in tpurcode is set from the value placed in the rcode argument of tpreturn(). This code is sent regardless of the value of the rval argument of tpreturn() unless an error is encountered by tpreturn() or a transaction time-out occurs.

Function	Explanation of TPENOENT Error
`tpalloc()`	The type of buffer asked for is not known to the system. For a buffer type and/or subtype to be known, there must be an entry for it in a type switch data structure that is defined in the BEA TUXEDO system libraries. Refer to the `tuxtypes`(5) and `typesw`(5) reference pages. On an application level, make sure you have referenced a known type correctly, otherwise see your system administrator.
`tpinit()`	The calling process cannot join the application because there is no space left in the bulletin board to make an entry for it. See your system administrator.
`tpcall()` `tpacall()`	The calling process is referencing a service that is not known to the system since there is no entry for it in the bulletin board. On an application level, make sure you have referenced the service correctly, otherwise see your system administrator.
`tpconnect()`	Cannot connect to name because it does not exist or is not a conversational service
`tpgprio()`	The calling process is asking for a request priority when no request has been made. The system has no current entry for a request. This is an application error.
`tpunadvertise()`	Cannot unadvertise the service name because it is not currently advertised by the calling process
`tpenqueue()` `tpdequeue()`	Cannot access the qspace because it is not available (the associated `TMQUEUE`(5) server is not available)
`tppost()` `tpsubscribe()` `tpunsubscribe()`	Cannot access the BEA TUXEDO system event broker

Permission Errors

The only ATMI function that returns this type of error is tpinit(). If the calling process does not have the correct permissions to enter the application, this call fails returning TPEPERM. Permissions are set in the configuration file and as such the correction of this error is outside of your application. See the BEA TUXEDO system administrator if you encounter this error.

Resource Manager Errors

These errors can occur with calls to tpopen() and tpclose(), and they return the value of TPERMERR in tperrno. The meaning of the BEA TUXEDO system error code is intentionally vague in this case so as not to hinder portability. The exact nature of the error must be determined by interrogating the resource manager in its own specific manner. When this error code is returned for tpopen(), the resource manager failed to open correctly; when it is returned for tpclose(), the resource manager failed to close correctly.

Transaction-Related Errors

When this type of error occurs, TPETRAN is returned in tperrno. tpbegin(), tpcancel(), tpresume(), tpconnect(), tppost(), and the tpcall()/tpacall() functions can return this error code. For tpbegin(), it usually means some transient system error occurred when attempting to start the transaction that may clear up with a repeated call.

tpcancel() returns this error code when called for a transaction reply (the request was done without the TPNOTRAN flag).

For tpresume(), it means that the BEA TUXEDO system is unable to resume the global transaction because the caller is currently participating in work outside any global transaction with one or more resource managers. All such work must be completed before a global transaction can be resumed. The caller's state with respect to the local transaction is unchanged.

For the other functions, it means a call was made in transaction mode to a service that does not support transactions. What does this mean? Some services belong to server groups that access a DBMS that can support transactions, whereas other services may be responsible for printing out a form and accessing a printer that knows nothing about transactions. The configuration of services into servers and server groups is an administrative task. In order to determine which services support transactions, ask your system administrator. This is an application error. For the communication call to such a service to succeed, the TPNOTRAN flag must be set. In other words, you may not ask a service that does not support transactions to be a participant in the transaction. If you desire the service, it can be asked for only if the TPNOTRAN flag is explicitly set or if you access the service outside of your transaction.
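As a rough illustration of the last rule, the sketch below chooses the flags for a request made in transaction mode. It does not include atmi.h: SK_TPNOTRAN is a stand-in bit whose value is made up for the example, and the supports_tran input is a hypothetical fact that in practice you would learn from your system administrator.

```c
/* Stand-in flag bit for illustration only; a real program uses
 * TPNOTRAN from atmi.h, whose value may differ. */
#define SK_TPNOTRAN 0x1L

/* Return the flags to use for a request made in transaction mode.
 * supports_tran is hypothetical: nonzero if the target service
 * belongs to a server group that can take part in transactions. */
long request_flags(int supports_tran, long flags)
{
    if (!supports_tran)
        flags |= SK_TPNOTRAN;   /* keep the service out of the transaction */
    return flags;
}
```

A call to a transaction-unaware service made with the resulting flags would avoid TPETRAN, at the cost that the service's work is not covered by the caller's transaction.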

Typed Buffer Errors

Typed buffer errors are returned as a result of sending requests or replies to processes in typed buffers that are unfamiliar to them. TPEITYPE is returned by tpcall(), tpacall(), and tpconnect() when the request data buffer is sent to a service that does not know about this type. What does this mean? The buffer types that processes know about are determined both by the configuration file and by the BEA TUXEDO system libraries that have been linked into the process. These libraries define and initialize a data structure that identifies the typed buffers that the process is to know about. The library can be tailored to each process. Also, an application can supply its own copy of a file that defines buffer types. An application can set up the buffer type data structure (referred to as a buffer type switch) on a per process basis. Refer to the tuxtypes(5) and typesw(5) reference pages. This is an administrative decision and is mentioned here to clarify what is meant by a process knowing about a typed buffer. The rule for sending requests is that you must always send a request in a typed buffer that a service knows about; this information can be obtained from your system administrator.

TPEOTYPE is returned by tpcall(), tpgetrply(), tpdequeue(), and tprecv() when the reply message is sent in a buffer that is not known or not allowed by the caller. What does this mean? Not known has the same semantics as previously explained for the request buffer. Not allowed means that although the process knows of the existence of this buffer type, the type returned to it does not match the type of the buffer it allocated to receive the reply and the caller is not allowing for a change in buffer type. The caller indicates this preference by setting flags to TPNOCHANGE. In this case, strong type checking is enforced, returning TPEOTYPE when violated. The default is to have weak type checking, allowing a different buffer type to be returned as long as it is known to the caller. Again, the rule for sending replies is that the reply buffer must be known to the caller and you must observe strong type checking if it has been indicated.
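The weak and strong checking rules for reply buffers can be summarized in a small decision function. This is only a model of the rules as stated above, not system code: the result codes are stand-ins rather than the atmi.h values, and the three inputs are hypothetical facts about the reply and the caller.

```c
/* Stand-in result codes; not the values defined in atmi.h. */
enum sk_result { SK_OK = 0, SK_TPEOTYPE = 1 };

/* known:     the reply buffer type is in the caller's buffer type switch
 * same_type: the reply type matches the buffer the caller allocated
 * nochange:  the caller set TPNOCHANGE (strong type checking) */
enum sk_result check_reply_type(int known, int same_type, int nochange)
{
    if (!known)
        return SK_TPEOTYPE;         /* not known to the caller */
    if (nochange && !same_type)
        return SK_TPEOTYPE;         /* known, but not allowed */
    return SK_OK;                   /* weak checking accepts a change */
}
```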

Call Descriptor Errors

The errors discussed in this section can occur only when making asynchronous calls or conversational calls because they involve the misuse of call descriptors. Asynchronous calls depend on call descriptors to identify replies with their corresponding requests. Conversational sends and receives depend on call descriptors to identify the connection; the call that initiates the connection depends on the availability of a call descriptor. There are two things that the BEA TUXEDO system doesn't like you to do with call descriptors:

exceed your limit (TPELIMIT)
reference one that has become invalid (TPEBADDESC)

The limit for outstanding call descriptors (replies) has been defined for the system as fifty and is a non-tunable parameter. The only way to change it is to recompile the system. The maximum number of descriptors allowed should be ample for your application, but this limit is system-defined and cannot be redefined by your application.

The limit for call descriptors for simultaneous conversational connections is defined in the configuration file and is more flexible than the limit for replies. The MAXCONV parameter in the RESOURCES section of the configuration file can be changed when the application is not running; it can be dynamically changed in the MACHINES section when the application is running. (See tmconfig(1).)

There are two general ways that a call descriptor can become invalid. If a call descriptor has been used to retrieve a message (including a failed message) and an attempt is made to reuse it, the system complains that you cannot reuse the descriptor and returns TPEBADDESC in tperrno.

Sometimes a condition occurs where you can no longer reference a call descriptor although it has never been used to retrieve a message. In this case we refer to the descriptor as having become stale and any attempt to reference it causes TPEBADDESC to be returned. One of the conditions that causes this to happen is calling tpabort() or tpcommit() when there are still transaction replies (replies for requests sent without the TPNOTRAN flag) to be retrieved. The outstanding descriptors for these transaction replies are considered stale. Another condition that causes this to happen is transaction time-out. When it is reported on the call to tpgetrply(), no message is retrieved with that descriptor, and any further reference to it is invalid because it is considered stale. This error can be corrected at the application level.
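The two ways a descriptor becomes invalid can be pictured as a small state machine. The sketch below is a simplified model under stated assumptions, not how the system is implemented: the struct, the state names, and both functions are hypothetical, and SK_TPEBADDESC is a stand-in for the real tperrno value.

```c
enum sk_cd_state { SK_CD_OUTSTANDING, SK_CD_USED, SK_CD_STALE };
enum sk_result { SK_OK = 0, SK_TPEBADDESC = 1 };

struct sk_cd {
    enum sk_cd_state state;
    int transactional;      /* nonzero if sent without TPNOTRAN */
};

/* Model of retrieving a reply: a descriptor works exactly once. */
enum sk_result sk_retrieve(struct sk_cd *cd)
{
    if (cd->state != SK_CD_OUTSTANDING)
        return SK_TPEBADDESC;       /* already used, or stale */
    cd->state = SK_CD_USED;
    return SK_OK;
}

/* Model of tpcommit(), tpabort(), or a transaction time-out:
 * outstanding transaction replies become stale. */
void sk_end_transaction(struct sk_cd *cds, int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (cds[i].state == SK_CD_OUTSTANDING && cds[i].transactional)
            cds[i].state = SK_CD_STALE;
}
```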

General Communication Call Errors

These errors can occur when making communication calls but have nothing to do with the nature of the call being synchronous or asynchronous.

The communication errors, TPESVCERR and TPESVCFAIL, are the result of the reply part of communication. They can be returned as a result of a call to tpcall() or tpgetrply() and they are determined by the arguments passed to and the processing done by tpreturn(). If tpreturn() encounters an error in processing or handling arguments, it will cause a failed message to be sent to the caller. This failed message is detected by the receiver with tperrno being set to TPESVCERR. The caller's data is not sent, and if the failure was on tpgetrply(), the call descriptor becomes invalid. If an error of this nature is not encountered by tpreturn(), then the value placed in rval determines the success or failure of the call. If the application logic placed the value TPFAIL in this parameter, TPESVCFAIL is returned in tperrno and the data message is sent to the caller.

The error codes TPEBLOCK and TPGOTSIG can happen on the request or the reply end of message communication. As a result, they can be returned for all three of the request/response communication calls. TPEBLOCK is returned when a blocking condition exists and the process sending a request either synchronously or asynchronously has indicated that it does not want to wait on a blocking condition by setting its flags parameter to TPNOBLOCK. A blocking condition can exist when sending a request if, for example, all the queues of the desired service are full. When tpcall() indicates a no blocking condition, it affects only the sending part of the communication. If the call successfully sends the request, TPEBLOCK will not be returned regardless of any blocking situation that may exist while the call waits for the reply. TPEBLOCK is returned for tpgetrply() when the call is made with flags set to TPNOBLOCK and a blocking condition is encountered while awaiting the reply; for example, if a message is not currently available.
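The TPEBLOCK rules for the three request/response calls can be modelled as follows. The sketch assumes made-up stand-ins for TPEBLOCK and TPNOBLOCK and reduces each blocking condition to a single hypothetical input (queue_full or reply_ready); it illustrates the rules described here and is not the system's own logic.

```c
/* Stand-in code and flag bit for illustration; not the atmi.h values. */
enum sk_result { SK_OK = 0, SK_TPEBLOCK = 1 };
#define SK_TPNOBLOCK 0x1L

/* Sending a request, as tpacall() or the first half of tpcall() does. */
enum sk_result sk_send(long flags, int queue_full)
{
    if (queue_full && (flags & SK_TPNOBLOCK))
        return SK_TPEBLOCK;     /* caller refused to wait */
    return SK_OK;               /* sent, possibly after waiting */
}

/* Waiting for a reply, as tpgetrply() does. */
enum sk_result sk_getrply(long flags, int reply_ready)
{
    if (!reply_ready && (flags & SK_TPNOBLOCK))
        return SK_TPEBLOCK;
    return SK_OK;
}

/* tpcall(): TPNOBLOCK applies to the sending part only. */
enum sk_result sk_call(long flags, int queue_full, int reply_ready)
{
    (void) reply_ready;         /* the reply wait never yields TPEBLOCK */
    return sk_send(flags, queue_full);
}
```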

TPGOTSIG really does not flag an error condition but indicates when a signal interrupts a BEA TUXEDO system call. If the communication functions set their flags parameter to TPSIGRSTRT, the calls will not fail and this code will not be returned in tperrno.

Conversational Errors

Once a conversational connection has been established, tpsend() and tprecv() can fail with a TPEEVENT error, which means that an event has occurred. No data is sent by tpsend(). The event type is returned through the revent argument of tpsend() or tprecv(), and the particular event dictates the course of action.

In conversational services tpsend(), tprecv(), and tpdiscon() return TPEBADDESC when an unknown descriptor is specified.

Time-Out Errors

Time-out errors can occur for one of two reasons:

a blocking call remained blocked longer than the time allotted before the caller regains control, that is, a blocking time-out occurred
the duration of a transaction from start to finish has exceeded the amount of time it was allotted, that is, a transaction time-out occurred

As a result, this error can be returned on communication calls for either blocking or transaction time-out and on tpcommit() for transaction time-out only. In every case, if a process is in transaction mode and TPETIME is returned on a failed call, it means a transaction time-out has occurred.

TPETIME indicates a blocking time-out on a communication call if

the call was not made in transaction mode and
the call was not made with flags set to TPNOBLOCK

You may recall that if this flag is set, a blocking time-out cannot occur because the call returns immediately if a blocking condition exists.
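The rules above for interpreting TPETIME on a communication call can be sketched as a small classifier. The enum and function names are hypothetical. In a real program the in_transaction input might come from tpgetlev(), and noblock would reflect whether TPNOBLOCK was set on the failed call.

```c
enum sk_timeout {
    SK_TRANSACTION_TIMEOUT,
    SK_BLOCKING_TIMEOUT,
    SK_UNEXPECTED           /* the rules above do not predict TPETIME */
};

/* Decide which kind of time-out a TPETIME failure reports. */
enum sk_timeout classify_tpetime(int in_transaction, int noblock)
{
    if (in_transaction)
        return SK_TRANSACTION_TIMEOUT;  /* always, in transaction mode */
    if (!noblock)
        return SK_BLOCKING_TIMEOUT;
    return SK_UNEXPECTED;   /* TPNOBLOCK calls return at once instead */
}
```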

Blocking time-out is a value set by the administrator of the system and is defined in the configuration file. Transaction time-out is defined by the application by the first argument passed to tpbegin().

Further implications concerning the concept of time-out will be discussed in the section "Time-Out" later in this chapter.

Errors Leading to Abort

Errors by a participant in a transaction can cause tpcommit() to fail returning the error code of TPEABORT in tperrno. The transaction is implicitly aborted because of the failure and should be explicitly aborted. There are two ways that this error code can be returned:

if a transaction has been marked abort-only by the initiator or one of the participants, or
the transaction timed out and its status is known to be aborted

Errors Signaling Heuristic Decisions

Based on how TP_COMMIT_CONTROL is set, tpcommit() may return TPEHAZARD or TPEHEURISTIC. If TP_COMMIT_CONTROL is set to TP_CMT_LOGGED, the application gets control before the second phase of the two-phase commit is done, so it may not hear about a heuristic that occurs during the second phase. (Note that TPEHAZARD or TPEHEURISTIC can be returned if only a single resource manager is involved in the transaction and it returns a heuristic decision or a hazard indication during a one-phase commit.) If TP_COMMIT_CONTROL is set to TP_CMT_COMPLETE, then TPEHEURISTIC is returned if any of the resource managers reports a heuristic decision, and TPEHAZARD is returned if any of the involved resource managers reports a hazard. TPEHAZARD simply means that a participant failed during the second phase of commit (or during a one-phase commit) and we can't know if it completed the transaction successfully or unsuccessfully.

Application-Specific Errors

The previous sections dealt with the various categories into which system errors may fall. Your application can set up a method whereby you can pass information about user-defined errors to calling programs.

The mechanism involves use of the rcode argument of tpreturn(3) and the global variable tpurcode(5).

How to Deal with Errors

Your application logic should test for error conditions after the calls that have return values, and take suitable steps in the face of them. You may want to test if -1 or NULL (depending on which the call returns) has been returned after a function call. In the event that it has been, you may invoke a function that contains a switch statement to test for specific values of tperrno and perform the appropriate application logic in each case.

Two routines, tpstrerror(3c) and Fstrerror(3fml), are provided to retrieve the text of an error message from the message catalogs for the BEA TUXEDO system and FML, respectively. The routines return a pointer to the error message. Your program can use the pointer to direct the text to userlog(3c) or to another destination. An example is shown in Listing 7-1.

Listing 7-1 illustrates a general way of dealing with errors. The term atmicall() is used in this example generically to represent an ATMI function call.

The code following the switch statement in Listing 7-1 illustrates how tpurcode can be used to disclose an application-defined code.

Listing 7-1 How to Deal with Errors

#include <stdio.h>
#include <stdlib.h>
#include "atmi.h"       /* declares the ATMI functions, tperrno, and tpurcode */
#include "userlog.h"    /* declares userlog() */

int main(void)
{
    int rtnval;
    int ret;
    char *sendbuf = NULL, *rcvbuf = NULL;
    long rcvlen = 0;

    if (tpinit((TPINIT *) NULL) == -1) {
        fprintf(stderr, "tpinit failed: %s\n", tpstrerror(tperrno));
        exit(1);
    }
    if (tpbegin(30, 0) == -1) {
        fprintf(stderr, "tpbegin failed: %s\n", tpstrerror(tperrno));
        tpterm();
        exit(1);
    }

    /* Allocate any buffers, make ATMI calls, and check each return value. */

    rtnval = atmicall();    /* generic stand-in for any ATMI function call */

    if (rtnval == -1) {
        switch (tperrno) {
        case TPEINVAL:
            fprintf(stderr, "Invalid arguments were given to atmicall\n");
            fprintf(stderr, "e.g., service name was null or flags wrong\n");
            break;
        /* case ...:
         * Include all error cases described in the atmicall(3) reference
         * page. Other return codes are not possible, so there should be
         * no default within the switch statement. */
        }

        if (tpabort(0) == -1) {
            char *p;

            fprintf(stderr, "abort was attempted but failed\n");
            p = tpstrerror(tperrno);
            userlog("%s", p);
        }
    }
    else if (tpcommit(0) == -1)
        fprintf(stderr, "REPORT program failed at commit time\n");

    /* The following code fragment shows how an application-specific
     * return code can be examined. */
    /* ... */
    ret = tpcall("servicename", (char *)sendbuf, 0, (char **)&rcvbuf, &rcvlen,
                 (long)0);
    /* ... */
    (void) fprintf(stdout, "Returned tpurcode is: %ld\n", (long) tpurcode);

    /* Free all buffers. */
    tpterm();
    exit(0);
}

The specific values of tperrno give you more insight into the nature of the problem and on what level it can be corrected.

If your application has defined a list of error conditions specific to your processing, the same can be said for tpurcode.

Fatal Transaction Errors

In managing transactions, it is important to understand which errors prove fatal to transactions. When these errors are encountered, transactions should be explicitly aborted on the application level by having the initiator of the transaction call tpabort(). Basically, there are three conditions that cause a transaction to fail. They are:

the initiator or a participant of the transaction caused it to be marked abort-only; as described below, this happens when a communication call returns TPESVCERR, TPESVCFAIL, or TPEOTYPE
the transaction timed out (TPETIME)
tpcommit() was called by a participant rather than by the originator of a transaction (TPEPROTO)

If TPESVCERR, TPESVCFAIL, TPEOTYPE, or TPETIME is returned for any of the communication calls, the transaction should be explicitly aborted with a call to tpabort(). If there are still outstanding descriptors, there is no need to wait for them before explicitly aborting the transaction. However, any attempt to access these descriptors after the transaction has been terminated will return TPEBADDESC since they are considered stale after the call.

Note that in the case of TPESVCERR, TPESVCFAIL, and TPEOTYPE, communication calls are still allowed as long as the transaction has not timed out. With the return of these errors, the transaction has been marked abort-only. For any further work to have a lasting effect, the communication calls should be made with the flags parameter set to TPNOTRAN. Work requested in this way is performed outside the transaction that has been marked abort-only, so it is not rolled back when that transaction is aborted.

When a transaction time-out occurs, communication can continue, but it must be conducted with the following conditions enforced. The communication requests

cannot require replies
cannot block
and cannot be performed on behalf of the caller's transaction

This means asynchronous calls can be made with the flags parameter set to TPNOREPLY|TPNOBLOCK|TPNOTRAN.
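A minimal sketch of these restrictions follows. It returns the flags that further requests would need in each transaction state, using made-up stand-in bits rather than the real atmi.h values; the state enum and function names are hypothetical. Note that for an abort-only transaction TPNOTRAN is advised rather than enforced, while after a time-out all three flags are mandatory.

```c
/* Stand-in flag bits for illustration; not the atmi.h values. */
#define SK_TPNOTRAN  0x1L
#define SK_TPNOREPLY 0x2L
#define SK_TPNOBLOCK 0x4L

enum sk_tran_state { SK_ACTIVE, SK_ABORT_ONLY, SK_TIMED_OUT };

/* Flags to add to further requests in the given transaction state. */
long flags_to_add(enum sk_tran_state state)
{
    switch (state) {
    case SK_TIMED_OUT:
        return SK_TPNOREPLY | SK_TPNOBLOCK | SK_TPNOTRAN;
    case SK_ABORT_ONLY:
        return SK_TPNOTRAN;     /* so the work survives the rollback */
    default:
        return 0L;
    }
}

/* Nonzero if flags already carries everything the state calls for. */
int flags_sufficient(enum sk_tran_state state, long flags)
{
    long need = flags_to_add(state);

    return (flags & need) == need;
}
```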

Calling tpcommit() from the wrong participant in a transaction represents the only protocol error that is fatal to transactions. This error can be corrected on the application level during the development phase.

Calling tpcommit() when there is initiator/participant failure or transaction time-out represents the implicit abort error discussed earlier in the section "Errors Leading to Abort." Because the commit failed, the transaction should be aborted.

Time-Out

As already indicated, two types of time-out can occur in the BEA TUXEDO system, and their effect on communication calls differs depending on the type. The following sections also address these questions:

What happens if a transaction times out while committing?
Do calls to services that are not part of your transaction use time on your transaction clock?

Blocking vs. Transaction Time-Out

We have defined blocking time-out as exceeding the amount of time a call can wait for a blocking condition to clear up. Transaction time-out occurs when a transaction takes longer than the amount of time defined for it in the timeout argument to tpbegin(). By default, if a process is not in transaction mode, blocking time-outs are performed. When the flags parameter of a communication call is set to TPNOTIME, it applies to blocking time-outs only. If a process is in transaction mode, blocking time-out and the TPNOTIME flag are not relevant. The process is sensitive to transaction time-out only as it has been defined for it when the transaction was started. What are the implications of the two different types of time-out with concern to communication calls?

If a process is not in transaction mode and a blocking time-out occurs on an asynchronous call, the communication call that blocked will fail, but the call descriptor is still valid and may be used on a re-issued call. Further communication in general is unaffected.

In the case of transaction time-out, the call descriptor to an asynchronous transaction reply (done without the TPNOTRAN flag) becomes stale and may no longer be referenced. The only further communication allowed is the one case described earlier of no reply, no blocking, and no transaction.

Effect on tpcommit()

What is the state of a transaction if time-out occurs after the call to tpcommit()? It is unknown; the transaction can have either succeeded or failed. If the transaction timed out and the system knows that it was aborted, this is communicated to you by the error code TPEABORT returned in tperrno. If the status of the transaction is unknown, TPETIME is the error code. When the state of the transaction is in doubt, you must query the resource to see if any of the changes that were part of that transaction have been applied to it in order to discover whether the transaction committed or aborted.
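One way to act on this distinction is to map the result of tpcommit() to what is known about the transaction. The sketch below uses stand-in codes rather than the real tperrno values, and both the outcome enum and the function are hypothetical. Error codes not discussed in this section are treated conservatively as unknown.

```c
/* Stand-in error codes; not the values defined in atmi.h. */
enum sk_err { SK_NONE = 0, SK_TPEABORT, SK_TPETIME, SK_OTHER };
enum sk_outcome { SK_COMMITTED, SK_ABORTED, SK_UNKNOWN };

/* rtn is the value returned by the commit call; err is the error
 * code reported when rtn is -1. */
enum sk_outcome commit_outcome(int rtn, enum sk_err err)
{
    if (rtn != -1)
        return SK_COMMITTED;
    if (err == SK_TPEABORT)
        return SK_ABORTED;      /* the system knows it was aborted */
    return SK_UNKNOWN;          /* e.g. TPETIME: query the resource */
}
```

When the result is SK_UNKNOWN, the application would need to query the resource itself to learn whether the changes were applied.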

Effect of the TPNOTRAN Flag

When a process is in transaction mode and makes a communication call with flags set to TPNOTRAN, it prohibits the called service from becoming a participant of that transaction and as such the service's success or failure cannot influence the outcome of that transaction. This will be discussed in greater detail in the next section, "Roles of tpreturn() and tpforward()." However, if the caller is expecting a reply, its transaction clock is still ticking away while the services that generate the reply are being performed. As a result, the transaction can time out while waiting for a reply that is due from a service that is not part of that transaction.

Roles of tpreturn() and tpforward()

If a process is called in transaction mode, tpreturn() and tpforward() place the service's portion of the transaction in a state where it can be either committed or aborted when the transaction is completed by its initiator. A service may be called several times on behalf of the same transaction. It is not fully committed or aborted until the initiator of the transaction calls tpcommit() or tpabort().

Neither tpreturn() nor tpforward() should be called until all outstanding descriptors for the communication calls made within the service have been retrieved. If tpreturn() is called with outstanding descriptors with rval set to TPSUCCESS, this constitutes a protocol error and is returned as TPESVCERR to the process waiting on tpgetrply(). If the process is in transaction mode, it will cause the caller's current transaction to be marked internally as abort-only. Even if the initiator of the transaction should call tpcommit(), the transaction is aborted implicitly. If tpreturn() is called with outstanding descriptors with rval set to TPFAIL, TPESVCFAIL is returned to the process waiting on tpgetrply(). The effect on the transaction is the same.
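The effect of calling tpreturn() with or without outstanding descriptors can be tabulated in code. This is a model of the rules stated above, combined with the earlier statement that TPESVCERR and TPESVCFAIL mark a transaction abort-only. All names are hypothetical stand-ins and none of the values come from atmi.h.

```c
enum sk_rval { SK_TPSUCCESS, SK_TPFAIL };
enum sk_err { SK_NONE = 0, SK_TPESVCERR, SK_TPESVCFAIL };

struct sk_reply {
    enum sk_err caller_err;     /* what the waiting caller sees */
    int abort_only;             /* caller's transaction marked abort-only */
};

/* outstanding: number of descriptors the service has not retrieved
 * in_tran:     nonzero if the service was called in transaction mode */
struct sk_reply sk_tpreturn(enum sk_rval rval, int outstanding, int in_tran)
{
    struct sk_reply r;

    if (rval == SK_TPFAIL)
        r.caller_err = SK_TPESVCFAIL;
    else if (outstanding > 0)
        r.caller_err = SK_TPESVCERR;    /* protocol error */
    else
        r.caller_err = SK_NONE;
    r.abort_only = in_tran && r.caller_err != SK_NONE;
    return r;
}
```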

It is always the case that when tpreturn() is called in transaction mode, it can determine the fate of that transaction either from the processing errors it encounters or from the value the application places in rval. Calling tpforward() can be used to indicate success up to that point in processing the request. If no application errors have been detected, tpforward() is invoked, otherwise tpreturn() with TPFAIL. If tpforward() is called improperly, it is considered a processing error and a failed message is returned to the requester.

Many of the ideas presented here have already been discussed in earlier sections, but they bear repeating. The following sections highlight various possible scenarios involving the transaction role of tpreturn() as well as the communication rules.

Service in Same Transaction as Caller

This is the straightforward case of the caller in transaction mode that calls another service to participate in the current transaction. What are the implications?

tpreturn() and tpforward(), when called by the participating service, place that service's portion of the transaction in a state where it can be either aborted or committed by the initiator.
The success or failure of the called process affects the current transaction. If any of the errors that prove fatal to transactions are encountered by the participant, the current transaction is marked abort-only.
The lasting effect of the work done by a successful participant is dependent on the fate of the transaction; that is, if the transaction is aborted, the work of all participants is undone.
The TPNOREPLY flag cannot be used when calling another service to participate in the current transaction.

Service in Different Transaction with AUTOTRAN Set

If a communication call is made with the TPNOTRAN flag set and the called service is configured so that a transaction will automatically get started when it is called, these processes will both be in transaction mode but they will be in different transactions. What are the implications?

tpreturn() plays the initiator's transaction role to terminate the transaction in the service where the transaction was automatically started. Alternatively, if the transaction is automatically started in a service that terminates with tpforward(), the tpreturn() in the last service in the forward chain plays the initiator's transaction role to terminate the transaction. Refer to Figure 7-1.
Because it is in transaction mode, tpreturn() is subject to the failure of any participant in the transaction as well as to transaction time-out. As a result, it is more likely to send a failed message to the caller.
Any failed messages or application failures returned to the caller do not affect the state of the caller's transaction.
The caller is vulnerable to its own transaction timing out as it waits for its reply.
If no reply is expected, the caller's transaction cannot be affected in any way by the communication call.

Figure 7-1 Transaction Roles of tpforward() and tpreturn() with AUTOTRAN

Service Starts New Explicit Transaction

If a communication call is made with TPNOTRAN, and the called service is not automatically placed in transaction mode by a configuration option, the service can define as many transactions as it wants with explicit calls to tpbegin(), tpcommit(), and tpabort(). As a result, the transaction is already completed before the call to tpreturn(). What are the implications?

tpreturn() plays no transaction role; that is, the role of tpreturn() would be exactly the same whether transactions were explicitly defined within the service routine or not.
tpreturn() can send any value back in rval regardless of the outcome of the transaction.
Typically, the errors returned will be processing errors, buffer type errors, or application failure, and the normal rules for TPESVCFAIL, TPEITYPE/TPEOTYPE, and TPESVCERR are followed.
Any failed messages or application failures returned to the caller do not affect the state of the caller's transaction.
The caller is vulnerable to its own transaction timing out as it waits for its reply.
If no reply is expected, the caller's transaction cannot be affected in any way by the communication call.

Transaction Rules

Certain rules are in effect when processes perform in transaction mode. Many of them have been touched upon already; but now, by way of summary, let's bring them together and discuss them in one place.

Communication Etiquette

The basic communication etiquette that must be observed while in transaction mode is as follows:

Processes that are participants in the same transaction must require replies for their requests.
Requests requiring no reply can be made only if the flags parameter of tpacall() is set to TPNOTRAN|TPNOREPLY.
A service must retrieve all asynchronous transaction replies before calling tpreturn() or tpforward() (this applies regardless of transaction mode).
The initiator must retrieve all asynchronous transaction replies (made without the TPNOTRAN flag) before calling tpcommit().
The asynchronous replies that must be retrieved include those that are expected from non-participants of the transaction, that is, replies expected for requests made with tpacall suppressing the transaction but not the reply.
If a transaction has not timed out but is marked abort-only, further communication should be performed with the TPNOTRAN flag set so that the work done as a result of the communication has lasting effect after the transaction is rolled back.
If a transaction has timed out,
Once a transaction has been marked abort-only for reasons other than time-out, a call to tpgetrply() will return whatever represents the local state of the call, that is, it can either return success or an error code that represents the local condition.
Once a descriptor is used with tpgetrply() to retrieve a reply or with tpsend() or tprecv() to report an error condition, it becomes invalid and any further reference to it will return TPEBADDESC (this applies regardless of transaction mode).
Once a transaction is aborted, all outstanding transaction call descriptors (made without the TPNOTRAN flag) become stale, and any further reference to them will return TPEBADDESC.

BEA TUXEDO System-Supplied Subroutines

In both the standard subroutines, namely tpsvrinit() and tpsvrdone(), transactions may be defined and communication may be performed. What rules must they follow?

tpsvrinit()

The BEA TUXEDO system server abstraction calls tpsvrinit() during initialization. This routine is called after the process has become a server but before it handles service requests. If tpsvrinit() performs any asynchronous communication, all replies must be retrieved before returning, or BEA TUXEDO will ignore all pending replies and the server exits. If tpsvrinit() defines any transactions, they must be completed with all asynchronous replies retrieved before returning, or BEA TUXEDO will abort the transaction and ignore the outstanding replies. The server exits gracefully.

tpsvrdone()

The BEA TUXEDO system server abstraction calls tpsvrdone() after it has finished processing service requests but before it exits. Its services are no longer advertised, but it has not yet left the application. If tpsvrdone() initiates communication, it must retrieve all outstanding replies before it returns, or the pending replies will be ignored by the BEA TUXEDO system and the server exits. If a transaction has been started within this subroutine, it must be completed with all replies retrieved, or BEA TUXEDO will abort the transaction and ignore the replies. The server exits.

Leaving the Application

tpterm() is used to remove a client from an application. What transaction rules must it obey? If the client is in transaction mode, the call fails with TPEPROTO returned in tperrno, and the client is still part of the application and in transaction mode. When the call is successful, no further communication or participation in transactions is allowed because the process is no longer part of the application.

Global Transactions and Resource Managers

An interesting point arises when using the ATMI transaction primitives to define transactions. BEA TUXEDO makes an internal call to pass the global transaction information to each resource manager participating in the transaction. When tpcommit() or tpabort() is called, BEA TUXEDO makes internal calls to direct each resource manager to commit or abort the work it did on behalf of the caller's global transaction. When you write service routines in a DTP environment, you therefore need not and should not make resource manager-specific calls to start, commit, or abort transactions: once a global transaction has been initiated, either explicitly or implicitly, application code should not call the resource manager's transaction primitives. Failure to follow this rule gives indeterminate results.

Code that can run either inside or outside a global transaction has a good occasion to use the transaction primitive tpgetlev(): it can determine whether the process is already in a global transaction before calling the resource manager's transaction primitive.
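The check can be sketched as follows. This is a minimal illustration, not code from the banking application: begin_local_if_needed(), the enum values, and the stand-in functions are hypothetical names introduced here, and the resource manager's own "begin" primitive is represented by a function pointer because its real form depends on the resource manager. The sketch relies only on the return values of tpgetlev(): 1 when the caller is in transaction mode, 0 when it is not, and -1 on error.

```c
/* Same shape as tpgetlev(): 1 = in a transaction, 0 = not, -1 = error. */
typedef int (*txn_level_fn)(void);

/* Hypothetical wrapper around the resource manager's own "begin"
 * primitive; returns 0 on success. */
typedef int (*rm_begin_fn)(void);

enum local_begin { LOCAL_ERROR = -1, LOCAL_SKIPPED = 0, LOCAL_STARTED = 1 };

/*
 * Start a local (resource manager) transaction only when the process
 * is not already part of a global transaction.
 */
enum local_begin begin_local_if_needed(txn_level_fn level, rm_begin_fn rm_begin)
{
    switch (level()) {
    case 0:     /* no global transaction: the local primitive may be used */
        return rm_begin() == 0 ? LOCAL_STARTED : LOCAL_ERROR;
    case 1:     /* global transaction active: leave control to tpcommit()/tpabort() */
        return LOCAL_SKIPPED;
    default:    /* the level could not be determined */
        return LOCAL_ERROR;
    }
}

/* Stand-ins (assumptions of this sketch) so it compiles without atmi.h. */
static int rm_begin_calls;
static int fake_rm_begin(void) { rm_begin_calls++; return 0; }
static int level_in_txn(void)  { return 1; }
static int level_no_txn(void)  { return 0; }
static int level_error(void)   { return -1; }
```

In a server, the first argument would be tpgetlev itself, as declared in atmi.h; the stand-ins exist only so that the sketch can be compiled and exercised without the BEA TUXEDO headers.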

Some resource managers offer specific options in their interface. (For example, a resource manager might offer various transaction consistency levels or flags.) Some resource manager providers offer programmers of distributed applications the opportunity to negotiate these options using resource manager-specific calls; in other resource managers these options are hard-coded in the version of the transaction interface supplied by the resource manager provider. Documentation for the resource managers you are using should be consulted for further information on this subject.

In the BEA TUXEDO system SQL resource manager, the set transaction statement is used to negotiate specific options (consistency level and access mode) for a transaction that has already been started by the BEA TUXEDO system. The method of setting such options will vary for other resource managers.

Comprehensive Example

Transaction integrity, message communication, and resource access represent the major needs of an On-Line Transaction Processing (OLTP) application.

Listing 7-2 shows the ATMI transaction, buffer management, and communication routines working together with the SQL statements that access a resource manager. The example is taken from the ACCT server that is part of the banking application and illustrates the CLOSE_ACCT service.

The example illustrates the use of the set transaction statement (line 49) to set the consistency level and access mode of the transaction (when read/write access is specified the consistency level defaults to high consistency) before the first SQL statement that accesses the database. The SQL query determines the amount to be withdrawn in order to close the account based on the value of the ACCOUNT_ID (lines 50-58).

tpalloc() is invoked to allocate a buffer for the request message to the WITHDRAWAL service, and the ACCOUNT_ID and the amount to be withdrawn are placed in the buffer (lines 62-74). This is followed by a tpcall() to the WITHDRAWAL service (line 79). An SQL delete statement updates the database by removing the account in question (line 86).

If all is successful, the buffer allocated within the service is freed (line 98), the TPSVCINFO data buffer that was sent to the service is updated to indicate the successful completion of the transaction (line 99); the transaction is automatically committed if the service was the initiator. tpreturn() returns TPSUCCESS and the updated buffer to the client process making the request to close the account. The successful completion is reported to the status line of the form.

After each function call, success or failure is determined. In the case of failure, the buffer allocated within the service is freed, the transaction is aborted if started in the service, and the TPSVCINFO buffer is updated to show the cause of failure (lines 80-83). tpreturn() returns TPFAIL and the message in the updated buffer is reported to the status line of the form.

Note: When specifying the consistency level of a global transaction within a service routine, take care to define the level in the same way for all those service routines that may participate in the same transaction.

Listing 7-2 ACCT Server

001   #include <stdio.h>              /* UNIX */
002   #include <string.h>             /* UNIX */
003   #include <fml.h>                /* TUXEDO */
004   #include <atmi.h>               /* TUXEDO */
005   #include <Usysflds.h>           /* TUXEDO */
006   #include <sqlcode.h>            /* TUXEDO */
007   #include <userlog.h>            /* TUXEDO */
008   #include "bank.h"              /* BANKING #defines */
009   #include "bank.flds.h"         /* bankdb fields */
010   #include "event.flds.h"        /* event fields */
011
012
013   EXEC SQL begin declare section;
014   static long account_id;                    /* account id */
015   static long branch_id;                     /* branch id  */
016   static float bal, tlr_bal;                 /* BALANCE    */
017   static char acct_type;                     /* account type*/
018   static char last_name[20], first_name[20]; /* last name, first name */
019   static char mid_init;                      /* middle initial */
020   static char address[60];                   /* address    */
021   static char phone[14];                     /* telephone */
022   static long last_acct;                     /* last account branch gave */
023   EXEC SQL end declare section;
 
024   static FBFR *reqfb;           /* fielded buffer for request message */
025   static long reqlen;           /* length of request buffer */
026   static char amts[BALSTR];     /* string representation of float */
 
027   /* code for OPEN_ACCT service (omitted) */
 
028   /*
029    * Service to close an account
030    */
 
031   void
032   #ifdef __STDC__
033   CLOSE_ACCT(TPSVCINFO *transb)
 
034   #else
 
035   CLOSE_ACCT(transb)
036   TPSVCINFO *transb;
037   #endif
 
038   {
039      FBFR *transf;               /* fielded buffer of decoded message */
 
040      /* set pointer to TPSVCINFO data buffer */
041      transf = (FBFR *)transb->data;
 
042      /* must have valid account number */
043      if (((account_id = Fvall(transf, ACCOUNT_ID, 0)) < MINACCT) ||
044       (account_id > MAXACCT)) {
045         (void)Fchg(transf, STATLIN, 0, "Invalid account number", (FLDLEN)0);
046         tpreturn(TPFAIL, 0, transb->data, 0L, 0);
047      }
 
048      /* Set transaction level */
049      EXEC SQL set transaction read write;
 
050      /* Retrieve AMOUNT to be deleted */
051      EXEC SQL declare ccur cursor for
052       select BALANCE from ACCOUNT where ACCOUNT_ID = :account_id;
053      EXEC SQL open ccur;
054      EXEC SQL fetch ccur into :bal;
055      if (SQLCODE != SQL_OK) {                /* nothing found */
056       (void)Fchg(transf, STATLIN, 0, getstr("account",SQLCODE), (FLDLEN)0);
057       EXEC SQL close ccur;
058       tpreturn(TPFAIL, 0, transb->data, 0L, 0);
059      }
 
060      /* Do final withdrawal */
 
061      /* make withdraw request buffer */
062      if ((reqfb = (FBFR *)tpalloc("FML",NULL,transb->len)) == (FBFR *)NULL) {
063       (void)userlog("tpalloc failed in close_acct\n");
064       (void)Fchg(transf, STATLIN, 0,
065           "Unable to allocate request buffer", (FLDLEN)0);
066       tpreturn(TPFAIL, 0, transb->data, 0L, 0);
067      }
068      reqlen = Fsizeof(reqfb);
069      (void)Finit(reqfb,reqlen);
 
070      /* put ID in request buffer */
071      (void)Fchg(reqfb,ACCOUNT_ID,0,(char *)&account_id, (FLDLEN)0);
 
072      /* put amount into request buffer */
073      (void)sprintf(amts,"%.2f",bal);
074      (void)Fchg(reqfb,SAMOUNT,0,amts, (



The Central Event Log

The central event log is a UNIX system file to which you can send messages from BEA TUXEDO system clients and services. Messages are written to the central event log with the userlog(3c) function. The log simply provides a record of events considered worth recording; any organized analysis of it must be provided by the application. Application developers are encouraged to establish fairly strict guidelines for the events to be recorded with userlog(3c), because application debugging is easier if the log is not flooded with trivial messages.

How the Log Is Named


One of the system parameters set up by the administrator determines the absolute pathname prefix of the userlog error message file on each machine. The userlog() function concatenates the month, day, and year in the form mmddyy to the prefix to form the full file name of the central event log. As a result, messages that a process sends to the central event log on succeeding days are written into different files.
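As a rough illustration of the naming rule, the following C sketch builds a log file name from a prefix and a date. The helper ulog_name() is hypothetical (it is not part of the BEA TUXEDO libraries), and the "." placed between the prefix and the date is an assumption of the sketch; the naming rule itself says only that the date is concatenated to the prefix.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: writes "<prefix>.<mmddyy>" into out.
 * The "." separator is an assumption of this sketch.
 * Returns 0 on success, -1 if the date cannot be formatted or
 * out is too small to hold the whole name.
 */
int ulog_name(char *out, size_t outlen, const char *prefix,
              const struct tm *when)
{
    char date[7];                   /* "mmddyy" plus the terminating NUL */
    int n;

    /* %m = month (01-12), %d = day of month, %y = two-digit year */
    if (strftime(date, sizeof date, "%m%d%y", when) != 6)
        return -1;

    n = snprintf(out, outlen, "%s.%s", prefix, date);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Calling the helper with the result of localtime() on two succeeding days yields two different names, which is why a long-running process ends up writing to more than one file.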

What Log Entries Look Like


Entries on the log consist of:

a tag made up of the time of day (hhmmss), the name of the machine (the name that is returned by uname), and the name and process-ID of the process calling userlog()

the message text (for BEA TUXEDO system messages, the text is preceded by the message catalog name and message number)

optional arguments in printf(3S) format

For example, if the call: 

userlog("Unknown User '%s' \n", usrnm);


is made at 4:22:14pm by the security program, on a machine where uname returns the value mach1, the resulting log entry will look like this: 

162214.mach1!security.23451: Unknown User 'abc'


assuming 23451 is the process ID for security, and that the variable usrnm contains the value abc.

If the above message was generated by BEA TUXEDO (as opposed to the application), it might look like this: 

162214.mach1!security.23451: LIBSEC_CAT: 999: Unknown User 'abc'


where LIBSEC_CAT: 999: represents a message catalog name and message number.

If the message was sent to the central event log while the process is in transaction mode, the user log entry will have additional components in the tag. These components consist of the literal gtrid followed by three long hexadecimal integers. The integers uniquely identify the global transaction and make up what is referred to as the global transaction identifier. This identifier is used mainly for administrative purposes, but it does make an appearance in the tag that prefixes the messages in the central event log. If the foregoing message is written to the central event log in transaction mode, the resulting log entry will look like this: 

162214.mach1!security.23451: gtrid x2 x24e1b803 x239:
Unknown User 'abc'
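Because any organized analysis of the log is left to the application, a small parser for the tag is often the first tool an application needs. The sketch below splits an entry into the components described above. The structure and function names are hypothetical, and the parser assumes the entry layout shown in the examples in this section (in particular, that the process name contains no "." character).

```c
#include <stdio.h>
#include <string.h>

struct ulog_tag {
    int  hh, mm, ss;        /* time of day */
    char machine[64];       /* machine name, as returned by uname */
    char process[64];       /* name of the process that called userlog() */
    long pid;               /* process ID */
    int  in_txn;            /* 1 if a gtrid component follows the tag */
};

/*
 * Parses "hhmmss.machine!process.pid:" at the start of line.
 * On success returns 0, fills *tag, and sets *text to whatever
 * follows the tag (the gtrid component or the message text).
 * Returns -1 if the line does not start with a well-formed tag.
 */
int parse_ulog_tag(const char *line, struct ulog_tag *tag, const char **text)
{
    int used = 0;

    /* %n records how much was consumed; it stays 0 if the final ':' is missing */
    if (sscanf(line, "%2d%2d%2d.%63[^!]!%63[^.].%ld:%n",
               &tag->hh, &tag->mm, &tag->ss,
               tag->machine, tag->process, &tag->pid, &used) != 6
        || used == 0)
        return -1;

    line += used;
    while (*line == ' ')            /* skip the blank after the colon */
        line++;

    tag->in_txn = (strncmp(line, "gtrid ", 6) == 0);
    *text = line;
    return 0;
}
```

An analysis tool could read the log line by line with fgets(), call parse_ulog_tag() on each line, and filter on the process name or on in_txn.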


How to Write to the Event Log

You can either have the error message you wish to write to the log in a variable of type char * and use the variable name as the argument to the call, or include the message as a literal within quotation marks as the argument to the call, as shown in the example below. 

.
.
.
/* Open the database to be accessed by the transactions.*/
if(tpopen() == -1) {
      userlog("tpsvrinit: Cannot open database");
      return(-1);
}
.
.
.


In this example, the message is sent to the central event log if tpopen() fails, that is, if it returns -1.

userlog() is similar to the UNIX System function printf(3S). That is, the format portion can contain literals and/or conversion specifications for a variable number of arguments.

Debugging Application Processes


While it is possible to use userlog() statements to help debug application software, it is sometimes necessary to use a debugger command for more complex debugging.

The standard UNIX system debugging command is sdb(1). Refer to a UNIX System programmer's reference manual. Client processes compiled with the -g option are debugged in the conventional manner explained on the sdb reference page. The syntax of the sdb command can take the following form: 

sdb -W client - directory_list


For complete syntactical information, refer to the reference page. To run the client process: 


Set any desired breakpoints in the code.


Enter the sdb command.


At the sdb prompt (*), type the run subcommand (r) and the options you want to pass to the client program's main().



The debugging of server programs is more complicated. Normally, servers are started using the tmboot command, which starts the server on the correct machine with the correct options. When using sdb, it is necessary to run the server directly rather than through the tmboot command. The BEA TUXEDO system tmboot(1) command passes undocumented command line options to the server's predefined main(). When you want to run your server, you will need to pass it these options as well. To obtain these options, run tmboot with the -n and -d 1 options. Refer to Section 1 of the BEA TUXEDO Reference Manual. The -n option tells tmboot not to perform the actual execution; -d 1 tells it to print out debugging level one statements. You can pass other options as well to tmboot in order to get information on a particular process rather than all of them. The output from tmboot will look something like the following, revealing the command line options it passes to the server's main(): 

exec server -g 1 -i 1 -u sfmax -U /tuxdir/appdir/ULOG -m 0 -A


When you want to run your server program using sdb, you must pass the options following the word server to its run (r) subcommand. As a result, the run command will look like the following: 

*r -g 1 -i 1 -u sfmax -U /tuxdir/appdir/ULOG -m 0 -A 


Also note that the server you are attempting to run from sdb must not already be running as part of the configuration; otherwise the server will exit gracefully and report a duplicate server in the central event log.
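Turning the tmboot output into a run subcommand is mechanical, so it can be scripted. The following sketch shows one way to do it in C. The function name is hypothetical, and the sketch assumes that the line printed by tmboot has the form shown above ("exec", the server name, then the options); check the actual output on your system before relying on it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given a line such as
 *     exec server -g 1 -i 1 -u sfmax ...
 * writes the matching sdb run subcommand ("r -g 1 -i 1 -u sfmax ...")
 * into out. Returns 0 on success, -1 if the line does not start with
 * "exec " or the result does not fit in out.
 */
int sdb_run_command(char *out, size_t outlen, const char *tmboot_line)
{
    const char *p = tmboot_line;
    size_t len;
    int n;

    if (strncmp(p, "exec ", 5) != 0)
        return -1;
    p += 5;
    p += strspn(p, " \t");          /* extra blanks before the server name */
    p += strcspn(p, " \t\r\n");     /* the server name itself */
    p += strspn(p, " \t");          /* blanks before the first option */
    len = strcspn(p, "\r\n");       /* options run to the end of the line */

    n = snprintf(out, outlen, "r %.*s", (int)len, p);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The resulting string is what you would type at the sdb prompt (*) after starting sdb on the server executable.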



 