ODBC Support for Automatic Client Failover

About Automatic Client Failover

Automatic client failover is for use in high availability scenarios, for either TimesTen Scaleout or TimesTen Classic. There are two scenarios for TimesTen Classic, one with active standby pair replication and one referred to as generic automatic client failover.

If there is a failure of the database or database element to which the client is connected, then failover (connection transfer) to an alternate database or database element occurs:

For TimesTen Scaleout, failover is to an element from a list returned by TimesTen of available elements in the grid.
For TimesTen Classic with active standby replication, failover is to the new active (original standby) database.
For TimesTen Classic using generic automatic client failover, where you can ensure that the schema and data are consistent on both databases, failover is to a database from a list that is configured in the client odbc.ini file.

A typical use case for generic automatic failover is a set of databases using read-only caching, where each database has the same set of cached data. For example, if you have several read-only cache groups, then you would create the same read-only cache groups on all TimesTen Classic databases included in the list of failover servers. When the client connection fails over to an alternate TimesTen database, the cached data is consistent because cache operations automatically refresh the data (as needed) from the Oracle database.

Applications are automatically reconnected to the new database or database element. TimesTen provides features that enable applications to be alerted when this happens, so they can take any appropriate action.

Note:

Automatic client failover applies only to client/server connections. The functionality described here does not apply to direct connections.
Automatic client failover is complementary to Oracle Clusterware in situations where Oracle Clusterware is used, though the two features are not dependent on each other. Refer to Using Oracle Clusterware to Manage Active Standby Pairs in Oracle TimesTen In-Memory Database Replication Guide.

Features and Functionality of ODBC Support for Automatic Client Failover

If a database or database element to which a client is connected fails, failover to an alternate database or database element occurs.

When failover occurs, be aware of the following:

The client has a new connection but using the same ODBC connection handle. No state from the original connection is preserved, however, other than the handle itself. The application must open new ODBC statement handles and descriptor handles.
If you register a failover callback function (see Implementing and Registering an ODBC Failover Callback Function.), a failover listener thread will be created within the client process to listen for failover event and invoke the callback function.

All client statement handles from the original connection are marked as invalid. API calls on these statement handles generally return SQL_ERROR with distinctive failover error codes defined in tt_errCode.h:

Native error 30105 with SQL state 08006
Native error 47137

The exception to this is for SQLError, SQLFreeStmt, SQLGetDiagRec, and SQLGetDiagField calls (depending on your version of ODBC), which behave typically.

In addition, note the following:

The socket to the original database or database element is closed. There is no need to call SQLDisconnect. TimesTen performs the equivalent, cleaning up the connection handle and confirming resources are freed.
In connecting to the new TimesTen database or database element, the same connection string and DSN definition from the original connection request are used, with the appropriate server name.
It is up to the application to open new statement handles and reexecute necessary SQLPrepare calls.
If a failover has already occurred and the client is already connected to the new database or database element:
- For TimesTen Scaleout, the next failover request results in an attempt to connect to the next element in the list that was returned by TimesTen at the time of the original connection.
- For TimesTen Classic with active standby replication, the next failover request results in an attempt to reconnect to the original active database. If that fails, alternating attempts are made to connect to the two servers until there is a timeout, and the connection is blocked during this period.
- For TimesTen Classic using generic automatic client failover, the next failover request results in an attempt to connect to the next database in the list that is configured in the client odbc.ini file. This could be the next database sequentially or one chosen at random from the list, according to the setting of the TTC_Random_Selection connection attribute, which is described in Configuration of Automatic Client Failover.
The timeout value is according to the TimesTen client connection attribute TTC_Timeout (default 60 seconds). (Refer to TTC_Timeout in Oracle TimesTen In-Memory Database Reference for information about that attribute.)
Failover connections are created only as needed, not in advance.

During failover, TimesTen can optionally make callbacks to a user-defined function that you register. This function takes care of any custom actions you want to occur in a failover situation. (See Implementing and Registering an ODBC Failover Callback Function.)

The following public connection options are propagated to the new connection. The corresponding general connection attribute is shown in parentheses where applicable. The TT_REGISTER_FAILOVER_CALLBACK option is used to register your callback function.

SQL_ACCESS_MODE
SQL_AUTOCOMMIT
SQL_TXN_ISOLATION (Isolation)
SQL_OPT_TRACE
SQL_QUIET_MODE
TT_PREFETCH_CLOSE
TT_CLIENT_TIMEOUT (TTC_TIMEOUT)
TT_REGISTER_FAILOVER_CALLBACK

The following options are propagated to the new connection if they were set through connection attributes or SQLSetConnectOption calls, but not if set through TimesTen built-in procedures or ALTER SESSION.

TT_NLS_SORT (NLS_SORT)
TT_NLS_LENGTH_SEMANTICS (NLS_LENGTH_SEMANTICS)
TT_NLS_NCHAR_CONV_EXCP (NLS_NCHAR_CONV_EXCP)
TT_DYNAMIC_LOAD_ENABLE (DynamicLoadEnable)
TT_DYNAMIC_LOAD_ERROR_MODE (DynamicLoadErrorMode)
TT_NO_RECONNECT_ON_FAILOVER (TTC_NoReconnectOnFailover)

The following options are propagated to the new connection if they were set on the connection handle.

SQL_QUERY_TIMEOUT
TT_PREFETCH_COUNT

See Connection Attributes in Oracle TimesTen In-Memory Database Reference.

Note:

If you issue an ALTER SESSION statement anytime after the initial database connection, you must re-issue the statement after a failover.

Configuration of Automatic Client Failover

In TimesTen Classic, failover DSNs must be specifically configured through TTC_Server2 and TTC_Servern connection attributes.

Setting any of TTC_Server2, TTC_Server_DSN2, TTC_Servern, or TCP_Port2 implies that you intend to use automatic client failover. For the active standby pair scenario, it also means a new thread is created for your application to support the failover mechanism.

Refer to Configuring Automatic Client Failover for TimesTen Classic in Oracle TimesTen In-Memory Database Operations Guide or Client Connection Failover in the Oracle TimesTen In-Memory Database Scaleout User's Guide.

Be aware of these TimesTen connection attributes:

TTC_NoReconnectOnFailover: If this is set to 1 (enabled), TimesTen is instructed to do all the usual client failover processing except for the automatic reconnect. (For example, statement and connection handles are marked as invalid.) This is useful if the application does its own connection pooling or manages its own reconnection to the database after failover. The default value is 0 (reconnect). Also see TTC_NoReconnectOnFailover in Oracle TimesTen In-Memory Database Reference.
TTC_REDIRECT: If this is set to 0 and the initial connection attempt to the desired database or database element fails, then an error is returned and there are no further connection attempts. This does not affect subsequent failovers on that connection. Also see "TTC_REDIRECT" in Oracle TimesTen In-Memory Database Reference.
TTC_Random_Selection: For TimesTen Classic using generic automatic client failover, the default setting of 1 (enabled) specifies that when failover occurs, the client randomly selects an alternative server from the list provided in TTC_Servern attribute settings. If the client cannot connect to the selected server, it keeps redirecting until it successfully connects to one of the listed servers. With a setting of 0, TimesTen goes through the list of TTC_Servern servers sequentially. Also see TTC_Random_Selection in Oracle TimesTen In-Memory Database Reference.

Note:

If you set any of these in odbc.ini or the connection string, the settings are applied to the failover connection. They cannot be set as ODBC connection options or ALTER SESSION attributes.

Implementing and Registering an ODBC Failover Callback Function

If there are custom actions you would like to have occur when there is a failover, you can have TimesTen make a callback to a user-defined function for such actions.

This function is called when the attempt to connect to the new database or database element begins, and again after the attempt to connect is complete. This function could be used, for example, to cleanly restore statement handles.

The function API is defined as follows.

typedef SQLRETURN (*ttFailoverCallbackFcn_t)
  (SQLHDBC,      /* hdbc    */
   SQLPOINTER,   /* foCtx   */
   SQLUINTEGER,  /* foType  */
   SQLUINTEGER); /* foEvent */

Where:

hdbc is the ODBC connection handle for the connection that failed.
foCtx is a pointer to an application-defined data structure, for use as needed.
foType is the type of failover. In TimesTen, the only supported value for this is TT_FO_SESSION, which results in the session being reestablished. This does not result in statements being re-prepared.
foEvent indicates the event that has occurred, with the following supported values:
- TT_FO_BEGIN: Beginning failover.
- TT_FO_ABORT: Failover failed. Retries were attempted for the interval specified by TTC_Timeout (minimum value 60 seconds for active standby failover) without success.
- TT_FO_END: Successful end of failover.
- TT_FO_ERROR: A failover connection failed but will be retried.
Note that TT_FO_REAUTH is not supported by TimesTen client failover.

Use a SQLSetConnectOption call to set the TimesTen TT_REGISTER_FAILOVER_CALLBACK option to register the callback function, specifying an option value that is a pointer to a structure of C type ttFailoverCallback_t that is defined as follows in the timesten.h file and refers to the callback function.

typedef struct{
  SQLHDBC                 appHdbc;
  ttFailoverCallbackFcn_t callbackFcn;
  SQLPOINTER              foCtx;
} ttFailoverCallback_t;

Where:

appHdbc is the ODBC connection handle, and should have the same value as hdbc in the SQLSetConnectOption calling sequence. (It is required in the data structure due to driver manager implementation details, in case you are using a driver manager.)
callbackFcn specifies the callback function. (You can set this to NULL to cancel callbacks for the given connection. The failover would still happen, but the application would not be notified.)
foCtx is a pointer to an application-defined data structure, as in the function description earlier.

Set TT_REGISTER_FAILOVER_CALLBACK for each connection for which a callback is desired. The values in the ttFailoverCallback_t structure are copied when the SQLSetConnectOption call is made. The structure need not be kept by the application. If TT_REGISTER_FAILOVER_CALLBACK is set multiple times for a connection, the last setting takes precedence.

Note:

Because the callback function executes asynchronously to the main thread of your application, it should generally perform only simple tasks, such as setting flags that are polled by the application. However, there is no such restriction if the application is designed for multithreading. In that case, the function could even make ODBC calls, for example, but it is only safe to do so if the foEvent value TT_FO_END has been received.
It is up to the application to manage the data pointed to by the foCtx setting.

This example shows the following features.

A globally defined user structure type, FOINFO, and the structure variable foStatus of type FOINFO
A callback function, FailoverCallback(), that updates the foStatus structure whenever there is a failover
A registration function, RegisterCallback(), that does the following:
- Declares a structure, failoverCallback, of type ttFailoverCallback_t.
- Initializes foStatus values.
- Sets the failoverCallback data values, consisting of the connection handle, a pointer to foStatus, and the callback function (FailoverCallback).
- Registers the callback function with a SQLSetConnectOption call that sets TT_REGISTER_FAILOVER_CALLBACK as a pointer to failoverCallback.

/* user defined structure */
struct FOINFO
{
   int callCount;
   SQLUINTEGER lastFoEvent;
};
/* global variable passed into the callback function */
struct FOINFO foStatus;
 
/* the callback function */
SQLRETURN FailoverCallback (SQLHDBC hdbc,
                           SQLPOINTER pCtx,
                           SQLUINTEGER FOType,
                           SQLUINTEGER FOEvent)
{
   struct FOINFO* pFoInfo = (struct FOINFO*) pCtx;
 
   /* update the user defined data */
   if (pFoInfo != NULL)
   {
      pFoInfo->callCount ++;
      pFoInfo->lastFoEvent = FOEvent;
 
      printf ("Failover Call #%d\n", pFoInfo->callCount);
   }
 
   /* the ODBC connection handle */
   printf ("Failover HDBC : %p\n", hdbc);
 
   /* pointer to user data */
   printf ("Failover Data : %p\n", pCtx);
 
   /* the type */
   switch (FOType)
   {
       case TT_FO_SESSION:
       printf ("Failover Type : TT_FO_SESSION\n");
       break;
 
     default:
       printf ("Failover Type : (unknown)\n");
   }
 
   /* the event */
   switch (FOEvent)
   {
     case TT_FO_BEGIN:
       printf ("Failover Event: TT_FO_BEGIN\n");
       break;
 
     case TT_FO_END:
       printf ("Failover Event: TT_FO_END\n");
       break;
 
     case TT_FO_ABORT:
       printf ("Failover Event: TT_FO_ABORT\n");
       break;
 
     case TT_FO_REAUTH:
       printf ("Failover Event: TT_FO_REAUTH\n");
       break;
 
     case TT_FO_ERROR:
       printf ("Failover Event: TT_FO_ERROR\n");
       break;
 
     default:
       printf ("Failover Event: (unknown)\n");
   }
 
 return SQL_SUCCESS;
}
 
/* function to register the callback with the failover connection */
SQLRETURN RegisterCallback (SQLHDBC hdbc)
{
   SQLRETURN rc;
   ttFailoverCallback_t failoverCallback;
 
   /* initialize the global user defined structure */
   foStatus.callCount = 0;
   foStatus.lastFoEvent = -1;
 
   /* register the connection handle, callback and the user defined structure */
   failoverCallback.appHdbc = hdbc;
   failoverCallback.foCtx = &foStatus;
   failoverCallback.callbackFcn = FailoverCallback;
 
   rc = SQLSetConnectOption (hdbc, TT_REGISTER_FAILOVER_CALLBACK,
     (SQLULEN)&failoverCallback);
 
   return rc;
}

When a failover occurs, the callback function would produce output such as the following:

Failover Call #1
Failover HDBC : 0x8198f50
Failover Data : 0x818f8ac
Failover Type : TT_FO_SESSION
Failover Event: TT_FO_BEGIN

ODBC Application Action in the Event of Failover

There are actions to perform in the event of a failover.

This section discusses these topics:

Application Steps for Failover

If you receive any error conditions in response to an operation in your application, then application failover is in progress.

See Features and Functionality of ODBC Support for Automatic Client Failover for a list of error conditions.

Perform these recovery actions:

Issue a rollback on the connection. Until you do this, no further processing is possible on the connection.
Clean up all objects from the previous connection. None of the state or objects associated with the previous connection are preserved, but proper cleanup through the relevant API calls is still strongly recommended.
Assuming TTC_NoReconnectOnFailover=0 (the default), sleep briefly, as discussed in the next section, Implementing Failover Delay and Retry Settings. If TTC_NoReconnectOnFailover=1, then you must instead manually reconnect the application to an alternate database or database element.
Recreate and reprepare all objects related to your connection.
Restart any in-progress transactions from the beginning.

Implementing Failover Delay and Retry Settings

The reconnection to another database or database element during automatic client failover may take some time. Therefore, your application should place all recovery actions within a loop with a short delay before each subsequent attempt, where the total number of attempts is limited.

Note the following:

If your application attempts recovery actions before TimesTen has completed its client failover process, you may receive another failover error condition as listed in Features and Functionality of ODBC Support for Automatic Client Failover.
If you do not limit the number of attempts, the application may appear to freeze if the client failover process does not complete successfully. For example, your recovery loop could use a retry delay of 100 milliseconds with a maximum number of retries limited to 100 attempts. The ideal values depend on your particular application and configuration.

This example illustrates some of these points (as well as retrying transient errors, discussed in Transient Errors (ODBC)).

/*
 * The following code snippet is a simple illustration of how you might handle 
 * the retrying of transient and connection failover errors in a C/ODBC 
 * application. In the interests of simplicity code that is not directly 
 * relevant to the example has been omitted (...). A real application
 * would of course be more complex.
 *
 * This example uses the ODBC 3.5 API.
 */
 
// define maximum retry counts and failover retry delay
#define  MAX_TE_RETRIES    30
#define  MAX_FO_RETRIES    100
#define  FO_RETRY_DELAY    100  // milliseconds   
 
// function return values
#define  SUCCESS           0
#define  FAILURE         (-1)
 
// constants for categorising errors
#define  ERR_OTHER         1
#define  ERR_TRANSIENT     2
#define  ERR_FAILOVER      3
 
// SQLSTATES and native errors
#define    SQLSTATE_TRANSIENT   "TT005"
#define    SQLSTATE_FAILOVER    "08006"
#define    NATIVE_FAILOVER1     47137
#define    NATIVE_FAILOVER2     30105
 
// SQL statements
SQLCHAR * sqlQuery = (SQLCHAR *)"SELECT ...";
SQLCHAR * sqlUpdate = (SQLCHAR *)"UPDATE ...";
 
// Database connection handle
SQLHDBC        dbConn = SQL_NULL_HDBC;
 
// Statement handles
SQLHSTMT      stmtQuery = SQL_NULL_HSTMT;
SQLHSTMT      stmtUpdate = SQL_NULL_HSTMT;
 
// ODBC return code
SQLRETURN rc;
 
// Retry counters
int teRetries; // transient errors
int foRetries; // failover errors
int foDelay = FO_RETRY_DELAY; // failover retry delay in ms
 
// Function to sleep for a specified number of milliseconds
void 
sleepMs( unsigned int ms)
{
    struct timespec rqtm, rmtm;
 
    rqtm.tv_sec = (time_t)(ms / 1000);
    rqtm.tv_nsec = (long)(ms % 1000000);
 
    while (  nanosleep( &rqtm, &rmtm )  )
        rqtm = rmtm;
} // sleepMs
 
// Function to check error stack for transient or failover errors.
// In a real application lots of other kinds of checking would also
// go in here to identify other errors of interest. We'd probably also
// log the errors to an error log.
int 
errorCategory( SQLHANDLE handle, SQLSMALLINT handleType )
{
    SQLRETURN rc;
    SQLSMALLINT i = 1;
    SQLINTEGER native_error;
    SQLCHAR sqlstate[LEN_SQLSTATE+1];
    SQLCHAR msgbuff[1024];
    SQLSMALLINT msglen;
 
    native_error = 0;
    sqlstate[0] = '\0';
    rc = SQLGetDiagRec( handleType, handle, i, sqlstate, &native_error,
                        msgbuff, sizeof(msgbuff), &msglen );
    while (   rc == SQL_SUCCESS  )
    {
        if (  strcmp( sqlstate, SQLSTATE_TRANSIENT ) == 0  )
            return ERR_TRANSIENT;
        else
        if (  native_error == NATIVE_FAILOVER1  )
            return ERR_FAILOVER;
        else
        if (  ( strcmp( sqlstate, SQLSTATE_FAILOVER ) == 0 ) &&
              (native_error == NATIVE_FAILOVER2)  )
            return ERR_FAILOVER;
        rc = SQLGetDiagRec( handleType, handle, ++i, sqlstate,
                            &native_error, msgbuff, sizeof(msgbuff),
                            &msglen );
    }
 
    return ERR_OTHER;
} // errorCategory
 
// Function to perform a rollback
void 
rollBack( SQLHDBC hDbc )
{
    SQLRETURN rc;
 
    rc = SQLEndTran( SQL_HANDLE_DBC, hDbc, SQL_ROLLBACK );
    // Report/log errors (a rollback failure is very, very bad).
    ...
} // rollBack
 
// Function to prepare all statements, bind parameters and bind
// columns.
int 
prepareAll( void )
{
    SQLRETURN rc;
 
    // Prepare the SQL statements and check for errors.
    rc = SQLPrepare( stmtQuery, sqlQuery, SQL_NTS );
    if (  rc != SQL_SUCCESS  )
    {
        rollBack( dbConn );
        return errorCategory( stmtQuery, SQL_HANDLE_STMT );
    }
    rc = SQLPrepare( stmtUpdate, sqlUpdate, SQL_NTS );
...
    // Bind parameters and colums
...
 
    return SUCCESS; // indicate success
} // prepareAll
 
// Function to execute a specific application transaction handling
// retries.
int 
txnSomeTransaction( ... )
{
    SQLRETURN rc;
    SQLLEN    rowcount = 0;
    int needReprepare = 0;
    int result;
 
    // Initialize retry counters
    teRetries = MAX_TE_RETRIES;
    foRetries = MAX_FO_RETRIES;
 
    // main retry loop
    while (  ( teRetries > 0 ) && ( foRetries > 0 )  )
    {
 
        // Do we need to re-prepare?
        while ( needReprepare && ( foRetries > 0 ) )
        {
            msSleep( retryDelay ); // delay before proceeding
            result = prepareAll();
            if (  result == SUCCESS  )
                needReprepare = 0;
            else
            if (  result != ERR_FAILOVER  )
                goto err;
            else
                foRetries--;
        }
 
        // First execute the query
 
        // Set input values for query
        ...
 
        // Execute query
        rc = SQLExecute( stmtQuery );
        if (  rc != SQL_SUCCESS  )
        {
            result = errorCategory( stmtQuery, SQL_HANDLE_STMT );
            rollBack( dbConn );
            switch (  result  )
            {
                case ERR_OTHER:
                    goto err;
                    break;
                case ERR_TRANSIENT:
                    teRetries--;
                    continue; // retry loop
                    break;
                case ERR_FAILOVER:
                    foRetries--;
                    needReprepare = 1;
                    continue; // retry loop
                    break;
            }
        }
 
        // Process results
        while (  (rc = SQLFetch( stmtQuery )) == SQL_SUCCESS  )
        {
            // process next row
            ...
        }
        if (  (rc != SQL_SUCCESS) && (rc != SQL_NO_DATA)  )
        {
            result = errorCategory( stmtQuery, SQL_HANDLE_STMT );
            rollBack( dbConn );
            switch (  result  )
            {
                case ERR_OTHER:
                    goto err;
                    break;
                case ERR_TRANSIENT:
                    teRetries--;
                    continue; // retry loop
                    break;
                case ERR_FAILOVER:
                    foRetries--;
                    needReprepare = 1;
                    continue; // retry loop
                    break;
            }
        }
 
        // Now execute the update
 
        // Set input values for update
        ...
 
        // Execute update
        rc = SQLExecute( stmtUpdate );
        if (  rc != SQL_SUCCESS  )
        {
            ...
        }
 
        // Check number of rows affected
        rc = SQLRowCount( stmtUpdate, &rowcount );
        if (  rc != SQL_SUCCESS  )
        {
            ...
        }
        // Check rowcount and handle unexpected cases
        if (  rowcount != 1  )
        {
            ...
        }
 
        // Finally, commit
        rc = SQLEndTran( SQL_HANDLE_DBC, dbConn, SQL_COMMIT );
        if (  rc != SQL_SUCCESS  )
        {
            ...
        }
 
        return SUCCESS; // all good
    } // retry loop
 
err:
    // if we get here, we ran out of retries or had some other non-retryable
    // error. Report/log it etc. then return failure
    ...
 
    return FAILURE;
} // txnSomeTransaction
 
// Main code
int
main ( int argc, char * argv[] )
{
    int status = 0; // final exit code
 
    ....
 
    // Open the connection  to the database and allocate statement handles
    ...
 
    // Disable auto-commit (this is essential)
    rc = SQLSetConnectAttr( dbConn,
                            SQL_ATTR_AUTOCOMMIT,
                            SQL_AUTOCOMMIT_OFF,
                            0 ); 
    ...
 
    // Prepare all statements, bind etc.
    if (  prepareAll() != SUCCESS  )
    {
        ...
    }
 
    // Do stuff until we are finished
    while (  ...  )
    {
        ...
        if (  txnSomeTransaction( ... ) != SUCCESS  )
        {
            ...
            goto fini;
        }
        ...
    }
 
fini:  // cleanup etc.
    // Release all resources (ODBC and non-ODBC)
    ...
    // Disconnect from database
    ...
 
    // Return final exit code
    return status;
} //main