Configuring Java CAPS Business Processes

Implementing Transparent Application Failover

Oracle Real Application Clusters (RAC) is an Oracle database configuration in which two or more instances access a shared database using cluster technology. A cluster is a group of machines (or nodes) that work together to perform the same task. To support this architecture, two or more machines that host the database instances are linked by a high-speed interconnect to form the cluster. The interconnect is a physical network that the nodes of the cluster use to communicate with one another.

After Oracle RAC is installed, the Transparent Application Failover (TAF) feature can be configured to ensure the highest levels of availability. TAF complements all levels of the availability hierarchy. Applications and users are automatically and transparently reconnected to another system, applications and queries continue uninterrupted, and the login context is maintained. Oracle Net Services is configured to allow the listener on each RAC database instance to fail over in case of failure.


Note –

Setting up the Oracle RAC/OPS system to test the TAF feature is beyond the scope of this document. Please contact your DBA to set up the Oracle RAC/OPS server with the configuration in the tnsnames.ora and listener.ora files.


The OCI driver works in conjunction with the BPM Engine only. The Worklist Manager uses the DataDirect driver only.

Procedure: To implement transparent application failover

  1. Set up the Oracle RAC server with the multiple hosts or instances sharing the same data storage.

    For information about configuring tnsnames.ora to enable the TAF feature when using the OCI driver, see Before You Begin.

  2. If you have not already done so, install Oracle client locally where the Logical Host is running.

    The Oracle client must be installed for OCI to work. Because the OCI driver connects to the database through native C calls, the installed Oracle client must be the same version as the one packaged with BPM. A version mismatch causes problems when configuring the OCI driver with BPM.

  3. Configure the tnsnames.ora file for TAF.

    Below are two example tnsnames.ora entries configured for transparent application failover (TAF).

    Option 1: Connect-time failover and TAF


    MY_CLUSTER =
      (DESCRIPTION =
        (FAILOVER = ON)
        (LOAD_BALANCE = OFF)
        (ADDRESS_LIST =    
          (ADDRESS = (PROTOCOL = TCP)(HOST = Node1)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = Node2)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SERVICE_NAME = my_cluster.my_company.com)
          (FAILOVER_MODE =
            (TYPE = SELECT)
            (METHOD = PRECONNECT)
            (BACKUP = Node2)
          )
        )
      )

    Option 2: TAF configuration


    MY_CLUSTER =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = Node1)(PORT = 1521))
        (CONNECT_DATA =
          (SERVICE_NAME = my_cluster.my_company.com)
          (FAILOVER_MODE =
            (TYPE = SELECT)
            (METHOD = PRECONNECT)
            (BACKUP = Node2)
          )
        )
      )

    In the above configurations, the MY_CLUSTER net service name refers to the two nodes that are configured to act as a cluster. RAC/OPS should be configured so that the nodes of the cluster share the same disk storage.

    In option 1, FAILOVER is set to ON and LOAD_BALANCE is set to OFF. This is the configuration for connect-time failover. In connect-time failover, when the OCI driver tries to connect to Node1 and determines that the node is down, it connects to the other host in the address list (Node2). Option 2 is shown only to illustrate configuring just the TAF feature in the OCI client. For BPM, both the connect-time failover and the TAF configuration are required.
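
    Because BACKUP takes a net service name rather than a host name, the value Node2 used in the examples above must itself be resolvable through tnsnames.ora. The entry below is a minimal sketch of such a definition, not a verified configuration. It assumes that Node2 listens on port 1521 and offers the same service as MY_CLUSTER; the host, port, and service name are placeholders to replace with the values your DBA provides.

    # Hypothetical backup alias referenced by (BACKUP = Node2).
    # Host, port, and service name are assumptions for illustration.
    Node2 =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = Node2)(PORT = 1521))
        (CONNECT_DATA =
          (SERVICE_NAME = my_cluster.my_company.com)
        )
      )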

    Configuring TAF involves adding Oracle Net parameters to the tnsnames.ora file. The parameter values determine the next step in the failover process when one of the participating nodes fails. TAF is driven by the FAILOVER_MODE parameter in the CONNECT_DATA section of a connect descriptor. The full functionality of TAF is available through one or more of the following FAILOVER_MODE subparameters.

    Parameter / Description

    BACKUP 

    Specifies a different net service name to be used to establish the backup connection. A backup should be specified when using PRECONNECT to pre-establish connections. Specifying a BACKUP is also strongly recommended for the BASIC method; otherwise, reconnection might first attempt the instance that has just failed, which delays the client's reconnection.

    TYPE 

    Specifies the type of failover. Three types of Oracle Net failover functionality are available to the Oracle Call Interface by default. 

    • SESSION– Fails over the session. With this option only a connection is established and no work in progress is transferred from the failed instance to the available instance.

    • SELECT– Enables a user with open cursors to continue fetching on them after failure. Oracle Net keeps track of any SELECT statements issued in the current transaction, as well as how many rows have been fetched back to the client for each cursor associated with a SELECT statement. If the connection to the instance is lost, Oracle Net establishes a connection to a backup instance, re-executes the SELECT statements, and positions the cursors so the client can continue fetching rows as if nothing had happened. However, no DML operations are transferred.

    • NONE– No failover functionality is implemented (this is the default).

    METHOD 

    Determines the speed of the failover from the primary to the secondary or backup node. 

    • BASIC– Establishes connections at failover time.

    • PRECONNECT– Pre-establishes connections. If this parameter is used, connection to the backup instance is made at the same time as the connection to the primary instance.

    RETRIES 

    Specifies the number of times to attempt to connect to the BACKUP node after a failure before giving up. 

    DELAY 

    Specifies the number of seconds to wait between attempts to connect to the BACKUP node after a failure.
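
    To illustrate how these parameters combine, the entry below is a sketch that pairs connect-time failover with the BASIC method, RETRIES, and DELAY. The values 20 and 15 are arbitrary examples rather than recommendations, and Node2 is assumed to be defined as a net service name in the same tnsnames.ora file.

    # Illustrative values only; tune RETRIES and DELAY for your environment.
    MY_CLUSTER =
      (DESCRIPTION =
        (FAILOVER = ON)
        (LOAD_BALANCE = OFF)
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = Node1)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = Node2)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SERVICE_NAME = my_cluster.my_company.com)
          (FAILOVER_MODE =
            (TYPE = SELECT)
            (METHOD = BASIC)
            (BACKUP = Node2)
            (RETRIES = 20)
            (DELAY = 15)
          )
        )
      )

    With these settings, a client that loses its connection would try to reconnect up to 20 times, waiting 15 seconds between attempts, before the failure is reported to the application.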