Oracle® Communication and Mobility Server Administrator's Guide
Release 10.1.3
Part Number E10292-02

8 Configuring High Availability

This chapter discusses configuring high availability in the following sections:

  • About Configuring High Availability

  • Setting Up a Highly Available Cluster of OCMS Nodes

  • Configuring the OCMS SIP Containers for High Availability

  • Configuring the Edge Proxy Nodes for High Availability

  • Configuring Highly Available SIP Servlet Applications

  • Upgrading SIP Servlet Applications in OCMS

  • Configuring the Proxy Registrar for High Availability

Note:

Currently, OCMS supports high availability using Oracle Application Server only.

About Configuring High Availability

OCMS provides high availability through redundancy, application state replication, and clustering. Highly available OCMS topologies are active-active, meaning that all redundant nodes actively process traffic rather than standing by. This also makes OCMS scalable.

Figure 8-1 Highly Available OCMS Topology

A highly available OCMS topology (Figure 8-1) provides the following:

Note:

For more information about the architecture illustrated in Figure 8-1, refer to "Deploying OCMS as a Highly Available SIP Network".

Configuring a highly available OCMS environment involves the following main steps, depending on the OCMS topology you have chosen to deploy:

  • Setting Up a Highly Available Cluster of OCMS Nodes

  • Configuring the OCMS SIP Containers for High Availability

  • Configuring the Edge Proxy Nodes for High Availability

  • Configuring Highly Available SIP Servlet Applications

  • Configuring the Proxy Registrar for High Availability

Table 8-1 Additional Information

  • For more information on OCMS deployment topologies, see Chapter 3, "Deployment Topologies" in this guide.

  • For more information on OCMS installation, see Oracle Communication and Mobility Server Installation Guide.

  • For more information on the operating systems supported by highly available OCMS clusters, see Oracle Communication and Mobility Server Certification Guide.

  • For more information on configuring a highly available clustered Oracle Application Server environment, see:

    • The "Application Clustering" chapter in Containers for J2EE Configuration and Administration Guide.

    • The "Active-Active Topologies" chapter in Oracle Application Server High Availability Guide.


Setting Up a Highly Available Cluster of OCMS Nodes

Note:

If using UDP, place all servers on the same subnet or switch to avoid fragmentation of large UDP packets.

Each OCMS node—including the Edge Proxy nodes—must be configured to support high availability.

Following are the main steps in setting up a cluster of OCMS servers:

  1. Associating Nodes with OPMN—Oracle Process Manager and Notification Server provides a command-line interface for process control and monitoring for single or multiple Oracle Application Server components and instances. Using OPMN, you can start and stop each OCMS node and its sub-components.

  2. Starting the Cluster—Starting the cluster with OPMN indicates that all OCMS nodes have been correctly associated with OPMN and are recognized as a cluster.

  3. Verifying the Status of the Cluster—Using OPMN or Enterprise Manager, you can verify that each node in the cluster is up and running.

  4. Stopping the Cluster—Having set up and verified the cluster of OCMS nodes, be sure to stop the cluster before configuring each SIP container for high availability (see "Configuring the OCMS SIP Containers for High Availability").

Associating Nodes with OPMN

Setting up a cluster of OCMS nodes requires associating the nodes with OPMN. You can do this using either the dynamic discovery method (recommended) or the discovery server method, both described below.

See also:

For more information regarding configuring and managing clusters using OPMN, see Oracle Process Manager and Notification Server Administrator's Guide.

Associating Nodes with OPMN Using the Dynamic Discovery Method

In this method, which Oracle recommends, you define the same multicast address and port for each Oracle Application Server instance in the cluster. An advantage of this method is that you do not have to specify the name of each Oracle Application Server instance in the cluster. Instead, you can dynamically add instances to or remove them from the cluster by editing the multicast address and port.

  1. For each Oracle Application Server instance that you want to group in the same cluster, run the following command:

    opmnctl config topology update discover="*<multicastAddress>:<multicastPort>"
    

    For example:

    > ORACLE_HOME/opmn/bin/opmnctl config topology update 
                           discover="*225.0.0.20:6200"
    

    where:

    • multicastAddress specifies the multicast address that you want to use for the cluster. The multicast address must be within the valid address range, which is 224.0.1.0 to 239.255.255.255. Note that the multicast address is preceded by an asterisk (*).

    • multicastPort can be any unused port number.

    Use the same multicast IP and port for all the instances.

  2. On each Oracle Application Server instance where you ran the command in Step 1, run opmnctl reload so that OPMN reads the updated opmn.xml file.

    > ORACLE_HOME/opmn/bin/opmnctl reload
    

Associating Nodes with OPMN Using the Discovery Server Method

Although Oracle recommends associating nodes with OPMN using the dynamic discovery method, you can also define a cluster by specifying the names of the nodes running the Oracle Application Server instances in the opmn.xml file for each instance. For example, to cluster four instances (node1.example.com, node2.example.com, node3.example.com, node4.example.com), associate these nodes with OPMN using the discovery server method as follows:

  1. Run Oracle Application Server on all nodes.

  2. Designate one instance as the discovery server, which maintains the topology for the cluster. (In this example, node1.example.com acts as the discovery server for the cluster.)

  3. In the opmn.xml file for all instances in the cluster, specify the node that is running the discovery server (node1.example.com in Example 8-1). As illustrated in Example 8-1, the opmn.xml file includes the <discover> element. The 6200 value specifies the port number on which the notification server listens. Use the remote port number designated in the <port> sub-element of the <notification-server> element.

    Example 8-1 Designating an Instance as the Discovery Server

    <?xml version="1.0" encoding="UTF-8"?>
    <opmn xmlns="http://www.oracle.com/ias-instance">
       ...
       <notification-server interface="ipv4">
          <port local="6100" remote="6200" request="6003"/>
           ....
          <topology>
             <discover list="node1.example.com:6200"/>
          </topology>
          ...
       </notification-server>
       <process-manager>
       ...
       </process-manager>
    </opmn>
    
  4. On all server instances, run opmnctl reload so that OPMN loads the updated opmn.xml file:

    ORACLE_HOME/opmn/bin/opmnctl reload
    

Starting the Cluster

To start the cluster using OPMN, run the following command on each instance in the cluster:

cd ORACLE_HOME/opmn/bin/
opmnctl startall

Verifying the Status of the Cluster

To verify the status of the OCMS nodes in the cluster:

  1. In a Web browser, enter the URI of Enterprise Manager running on any SIP container in the cluster:

    http://<SIP container URI>:<port number>/em
    
  2. Enter the administrator user name and password at the prompt.

    Enterprise Manager displays the status of the cluster topology.
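
You can also check the cluster from the command line with OPMN. A minimal check, assuming the @cluster scope is available in your 10.1.3 OPMN installation, is:

ORACLE_HOME/opmn/bin/opmnctl status
ORACLE_HOME/opmn/bin/opmnctl @cluster status

The first command reports the processes managed on the local instance; the second reports every instance that has joined the cluster.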

Stopping the Cluster

After verifying the status of the cluster, stop the nodes in the cluster using OPMN so that you can continue configuring the SIP containers (see "Configuring the OCMS SIP Containers for High Availability").

To stop OCMS, execute the following command on each node in the cluster:

cd ORACLE_HOME/opmn/bin/
opmnctl stopall

Configuring the OCMS SIP Containers for High Availability

In the Application Server Control Console MBean browser, configure the following parameters under the SIP Servlet Container MBean for each SIP Application Server node (a combined example follows these steps):

  1. Configure the EdgeProxy parameter to point to the SIP URI of the Edge Proxy or to a third-party load balancer if more than one Edge Proxy is used.

    Use the following format:

    SIP:<Edge Proxy or Load Balancer IP address>:<port>;lr
    
  2. Configure the DistributableRecordRoute parameter in the following format:

    SIP:<SIP Container IP address>:<port>
    

    Remove any appended transport methods (such as transport=tcp) to enable any type of transport to be used between the Edge Proxy and OCMS.

  3. Configure the RecordRoute parameter using the following format:

    SIP:<SIP Container IP address>:<port>
    

    Remove any appended transport methods (such as transport=tcp) to enable any type of transport to be used between the Edge Proxy and OCMS.
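
For illustration only, on a node whose SIP container listens at 192.0.2.21:5060 behind an Edge Proxy or load balancer at 192.0.2.10:5060 (both addresses are assumed sample values), the three parameters might be set as follows:

EdgeProxy                 sip:192.0.2.10:5060;lr
DistributableRecordRoute  sip:192.0.2.21:5060
RecordRoute               sip:192.0.2.21:5060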

Configuring the Edge Proxy Nodes for High Availability

Note:

In the load balancer, you must disable stickiness for UDP datagrams sent to the Edge Proxy servers. Refer to the load balancer documentation for more information on disabling stickiness when sending datagrams over UDP.

To configure the Edge Proxy nodes for high availability:

For each Edge Proxy node in the topology, configure the following:

  1. In the Application Server Control Console MBean browser, click the edgeproxy MBean.

  2. Configure the RecordRoute parameter to point to one of the following:

    • For a single Edge Proxy without a load balancer—Set the parameter to the IP address of the Edge Proxy node

    • For more than one Edge Proxy with a load balancer or DNS server—Set the parameter to the virtual IP address or host name of the third-party load balancer or DNS server (if clients connect using a DNS lookup)

  3. Modify the edgeproxy.xml file (sdp/edgeproxy/conf/edgeproxy.xml, illustrated in Example 8-2) to include the oc4j-ping element:

    <oc4j-ping interval="1" allowed-miss-count="2"/>
    

    The oc4j-ping element configures the interval, in seconds, at which the Oracle Application Servers in the cluster ping the Edge Proxy. The allowed-miss-count attribute specifies the number of missed ping intervals allowed before the Edge Proxy removes an unresponsive Oracle Application Server from the routing table.

    Example 8-2 edgeproxy.xml

    <?xml version="1.0" encoding="UTF-8" ?>
    <edge-proxy xmlns:xsi="http://www.oracle.com/sdp">
         <record-route sip-uri="sip:%IPADDRESS%:%SIPPORT%"/>
         <jmx-rmi-connector port="%EPRMIPORT%"/>
         <oc4j-ping interval="1" allowed-miss-count="2"/>
         <nat-traverse enabled="true"/>
         <sip-stack ip="%IPADDRESS%">
              <listening-point transport="tcp" port="%SIPPORT%" />
              <listening-point transport="udp" port="%SIPPORT%" />
         </sip-stack>
    </edge-proxy>
    
  4. In the edgeproxy.xml file, modify the nat-traverse element if necessary.

    • If the Edge Proxy enables SIP clients to traverse NATs (Network Address Translators), then set the value to true (the default). The corresponding default value must be set in Oracle Communicator.

    • If not using NAT traversal, this attribute must be set to false. For more information on disabling NAT traversal, see "Disabling NAT Traversal Enabled by the Edge Proxy".

  5. Verify the status of the Edge Proxy node or nodes in the cluster.

For more information, refer to "Configuring OCMS in a Clustered Environment with Edge Proxy" in Oracle Communication and Mobility Server Installation Guide.

The NAT Traversal Option Enabled for the Edge Proxy

NAT traversal enables access to SIP User Agents even when they are located behind firewalls or NATs. To support SIP clients residing behind firewalls or NATs, proxy servers use the Path extension header mechanism (described in RFC 3327), which ensures that SIP clients follow specific paths that enable the traversal of NATs and firewalls throughout the network. When you enable the NAT traversal function in the Edge Proxy, an OCMS cluster supports the Path extension header mechanism by inserting a Path header field into REGISTER requests.
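
For illustration, a REGISTER request forwarded by an Edge Proxy with NAT traversal enabled might carry a Path header similar to the following (the host names and addresses are sample values only):

REGISTER sip:example.com SIP/2.0
...
Path: <sip:edgeproxy1.example.com:5060;lr>
Contact: <sip:alice@10.0.0.5:5060>
...

Because the registrar stores the Path value with the registration, subsequent requests destined for the registered client are routed back through that Edge Proxy, which is what allows a client behind a NAT or firewall to be reached.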

Disabling NAT Traversal Enabled by the Edge Proxy

By default, NAT traversal is enabled in edgeproxy.xml (nat-traverse enabled="true", as noted in Example 8-2). To disable this function:

  1. If the Edge Proxy is running, stop it by entering the following command:

    ORACLE_HOME/opmn/bin/opmnctl stopproc process-type=EdgeProxy

  2. Edit the nat-traverse element of edgeproxy.xml (located at ORACLE_HOME/sdp/edgeproxy/conf/edgeproxy.xml) as follows:

    <nat-traverse enabled="false"/>

  3. Start the Edge Proxy using the following command:

    ORACLE_HOME/opmn/bin/opmnctl startproc process-type=EdgeProxy

  4. Repeat these steps for each Edge Proxy node in the OCMS cluster.

Caution:

When NAT traversal is enabled, the Edge Proxy nodes insert their local IP addresses into the RecordRoute headers of SIP requests. Therefore, the Edge Proxy nodes must be globally routable. This may not be the case if the cluster has been configured according to the white paper, Oracle Communication and Mobility Server in a High-Availability Environment Running with F5 BigIP (available at the Oracle Technology Network).

Configuring Highly Available SIP Servlet Applications

This section describes how to configure high availability for SIP Servlet applications deployed to a cluster of OCMS nodes.

Notes:

  • When configuring high availability for SIP Servlet applications that depend upon the Proxy Registrar, you must also configure the Proxy Registrar for high availability. See "Configuring the Proxy Registrar for High Availability" for more information.

  • High availability is not currently supported for converged applications (that is, applications composed of both SIP and HTTP servlets).

Enabling High Availability in SIP Servlet Applications

To configure a highly available SIP Servlet application:

  1. Modify the sip.xml file (located at ORACLE_HOME/j2ee/ocms/applications/<application name>/<web module name>/WEB-INF/sip.xml) to include the <distributable> element.

    For example:

    <?xml version="1.0" encoding="UTF-8"?>
    <sip-app>
      <display-name>proxyregistrarssr</display-name>
      <distributable/>
      <!--Servlets-->
      <servlet>
        <servlet-name>Registrar</servlet-name>
        <servlet-class>oracle.sdp.registrar.VoipRegistrarServlet</servlet-class>
        <init-param>
          <param-name>LocationService</param-name>
          <param-value>oracle.sdp.locationdbservice.LocationDbServiceBD</param-value>
        </init-param>
      </servlet>
    </sip-app>
    
  2. Modify the web.xml file (located at ORACLE_HOME/j2ee/ocms/applications/<application name>/<web module name>/WEB-INF/web.xml) to include the <distributable> element.

    For example:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application
    2.2//EN" "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">
    <web-app>
         <display-name>proxyregistrarssr</display-name>
         <distributable/>
    </web-app>
    
  3. Modify the orion-application.xml file (located at ORACLE_HOME/j2ee/ocms/application-deployments/<application name>/orion-application.xml) to include the <cluster> element, which is used to configure clustering both for Oracle Application Server nodes and for specific SIP Servlet applications.

    For example:

    <orion-application ... >
       <cluster allow-colocation="false">
       ...
       </cluster>
    </orion-application>
    

    The <cluster> element, which is used in both the orion-application.xml and application.xml files, includes the following attributes that control application replication:

    • enabled—Specifies whether clustering is enabled. The default value is true.

    • group-name—The name to use when establishing replication group channels. If not supplied, the application name as defined in server.xml (the Oracle Application Server configuration file) is used by default. New group channels are created for each enterprise application. If a value is specified, the application and all child applications use the channels associated with this group name.

    • allow-colocation—Specifies whether to allow application state to be replicated to a node residing on the same host machine. The default value is true.

      Note:

      Although the default value is true, set allow-colocation to false if multiple hosts are available. If multiple Oracle Application Server instances are instantiated on the same machine, specify different listener ports for each instance in the default-web-site.xml, jms.xml, and rmi.xml configuration files.
    • write-quota—The number of other group members to which the application state should be replicated. This attribute reduces overhead by limiting the number of nodes to which state is written. The default value is 1. (A combined example showing these attributes appears after this procedure.)

      For additional information regarding the <cluster> element and its sub-elements, refer to the chapter "Configuring Application Clustering" in Containers for J2EE Configuration and Administration Guide.

      Note:

      Ensure that you use the correct spelling for attributes in the orion-application.xml file, as misspellings will not result in error messages. For example, if you misspell start-port in the cluster configuration section as start-prt, replication will appear to have started even though session replication does not work.
  4. Repeat these steps for each application deployed on each OCMS instance.
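
The following sketch combines the attributes described in step 3 into a single <cluster> element; the group name shown is an arbitrary example, and the values should be adjusted to your own deployment:

<orion-application ... >
   <cluster enabled="true" group-name="ocms-app-group" allow-colocation="false" write-quota="1">
      ...
   </cluster>
</orion-application>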

Important:

Deploy the application symmetrically to all SIP application server nodes.

For information about developing highly available SIP Servlet applications, refer to Oracle Communication and Mobility Server Developer's Guide.

Configuring Application Session Data Replication

OCMS supports multicast replication by defining the orion-application.xml file's <cluster> and <property-config> elements. The <property-config> element contains data required to use the JGroups group communication protocol to replicate session state across nodes in the cluster.

To set the replication policy, edit the ORACLE_HOME/j2ee/ocms/application-deployments/<application name>/orion-application.xml file as follows:

  • Set the <cluster> element's allow-colocation attribute to false.

  • Set the <property-config> element's <url> element to the path of the JGroups XML file describing the application-related high availability configuration (illustrated in Example 8-4).

    Example 8-3 Editing orion-application.xml File for Replication

    <orion-application ... >
       ...
       <cluster allow-colocation="false">
       ...
          <property-config>
             <url>file:///ORACLE_HOME/j2ee/ocms/application-deployments/
             <application name>/jgroups-tcp.xml</url> 
          </property-config> 
       </cluster>
    </orion-application>
    

Example 8-4 illustrates the JGroups XML file (referred to as jgroups-tcp.xml in Example 8-3).

Example 8-4 A Sample JGroups Application High Availability Configuration File

<config> 
  <TCP/> 
  <MPING mcast_addr="230.0.0.130" mcast_port="8500" ip_ttl="1"/> 
  <MERGE2 min_interval="5000" max_interval="10000"/> 
  <FD timeout="1000" max_tries="3" shun="false"/> 
  <FD_SOCK/> 
  <VERIFY_SUSPECT timeout="1000"/> 
  <pbcast.NAKACK gc_lag="100" retransmit_timeout="3000"/> 
  <pbcast.STABLE desired_avg_gossip="20000"/> 
  <pbcast.GMS join_timeout="3000" join_retry_timeout="2000" shun="false" 
  print_local_addr="true"/>
</config> 

In Example 8-4, the failure detection timeout (the timeout attribute of the <FD> element) is specified in milliseconds. In the sample JGroups file, it is set to 1000 milliseconds with three retries (max_tries="3").

Configuring High Availability for a Deployed SIP Servlet Application

Perform the following if the SIP Servlet application has already been developed and deployed, but not configured for high availability.

  1. Undeploy the SIP Servlet application.

  2. Unpack the application EAR.

  3. Modify the sip.xml and web.xml files to include the <distributable> element, as described in "Enabling High Availability in SIP Servlet Applications". To edit the sip.xml and web.xml files, do the following:

    1. Create a new folder and name it <Your SIP application>.

    2. Unpack the EAR file to the new folder.

    3. Unpack the WAR file inside the EAR file to the folder <Your SIP application>/<WAR module name>.

    4. Edit the following files:

      <Your SIP application>/<WAR module name>/WEB-INF/sip.xml
      <Your SIP application>/<WAR module name>/WEB-INF/web.xml
      
    5. Under <Your SIP application>/<WAR module name>, re-package the contents of the WAR file using the original WAR file name. Replace the original WAR file in the <Your SIP application> folder with the new one that you just created.

    6. Delete the <Your SIP application>/<WAR module name> folder.

    7. Package the contents of the <Your SIP application> folder into an EAR file, using the file name of the original EAR file. (A command-line sketch of this unpack-and-repackage sequence follows this procedure.)

  4. Re-deploy the application EAR to the SIP Servlet Container.
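
As a minimal sketch of the unpack-and-repackage sequence in step 3, assuming a hypothetical application packaged as myapp.ear that contains the web module myapp-web.war (substitute your own file and folder names):

# create a working folder and unpack the EAR
mkdir myapp
cd myapp
jar -xf ../myapp.ear

# unpack the WAR module inside the EAR
mkdir myapp-web
cd myapp-web
jar -xf ../myapp-web.war

# edit WEB-INF/sip.xml and WEB-INF/web.xml to add the <distributable/> element,
# then re-package the WAR under its original name, preserving its manifest
jar -cfm ../myapp-web.war META-INF/MANIFEST.MF .

# remove the unpacked WAR folder and re-package the EAR under its original name
cd ..
rm -rf myapp-web
jar -cfm ../myapp.ear META-INF/MANIFEST.MF .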

Disabling High Availability at the Application Level

To remove an application from the cluster:

  1. In a text editor, open the application-specific orion-application.xml file.

  2. Set the enabled attribute of the <cluster> element to false. For more information, see "Configuring Application Session Data Replication".

    For example:

    <orion-application ... >
       ...
       <cluster enabled="false">
       ...
       </cluster>
    </orion-application>
    
  3. Save the orion-application.xml file.

Upgrading SIP Servlet Applications in OCMS

SIP Servlet applications can be upgraded using a rolling upgrade procedure that minimizes downtime.

To perform a rolling upgrade of a SIP servlet application in OCMS:

  1. On the OCMS node where you want to upgrade the SIP Servlet application, execute the following command to stop all OCMS processes on the node:

    ORACLE_HOME/opmn/bin/opmnctl shutdown
    
  2. Comment out the <topology> element in the ORACLE_HOME/opmn/conf/opmn.xml file to remove the SIP Application Server from the cluster (see the snippet following this procedure).

  3. Restart the SIP container by running the following command:

    ORACLE_HOME/opmn/bin/opmnctl startall
    
  4. Upgrade the SIP Servlet application on the SIP container you took out of the cluster.

  5. Shut down the SIP container again so that you can put it back in the cluster:

    ORACLE_HOME/opmn/bin/opmnctl shutdown
    
  6. Uncomment the <topology> element in the opmn.xml file to place the SIP container back into the cluster.

  7. Place the SIP container back into the cluster by executing the following command:

    ORACLE_HOME/opmn/bin/opmnctl startall
    

    The SIP Servlet application upgrades following the completion of all in-progress calls.

  8. Repeat the process with the remaining SIP containers.
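
For reference, a commented-out <topology> element might look like the following. This sketch is based on the discovery server configuration in Example 8-1; the exact contents of your <topology> element depend on how the node was associated with OPMN.

<notification-server interface="ipv4">
   <port local="6100" remote="6200" request="6003"/>
   <!--
   <topology>
      <discover list="node1.example.com:6200"/>
   </topology>
   -->
</notification-server>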

Configuring the Proxy Registrar for High Availability

Configuring the Proxy Registrar for high availability involves the following main steps:

  • Configuring Oracle TimesTen Replication

  • Configuring Failover and Recovery for Oracle TimesTen In-Memory Database

Configuring Oracle TimesTen Replication

For each OCMS node in the cluster, Oracle TimesTen and OCMS must be installed and run under the same non-root user name at the operating system level. For example, if OCMS and TimesTen are installed by a user called ocmsuser on OCMS Node 1 of a cluster, then ocmsuser must also install both OCMS and TimesTen on OCMS Node 2.

Replicating an Oracle TimesTen In-Memory database in OCMS involves the following:

  • Creating and Seeding Replication Tables

  • Creating an Internal TimesTen User

  • Configuring Replication in Oracle TimesTen In-Memory Database

Creating and Seeding Replication Tables

For each OCMS node in the cluster, perform the following:

  1. Run the following Perl scripts:

    $TT_HOME/bin/ttIsql -connStr "DSN=ocmsdb" -f 
    $ORACLE_HOME/sdp/replication/sql/replication.drop.timesten.sql
    $TT_HOME/bin/ttIsql -connStr "DSN=ocmsdb" -f 
    $ORACLE_HOME/sdp/replication/sql/replication.timesten.sql
    

    The following tables are created in the TimesTen database:

    clusters
       cluster_id INT NOT NULL,
       cluster_name VARCHAR(255) NOT NULL UNIQUE
       cluster_replication VARCHAR(64) NOT NULL,
    instances
       instance_id INT NOT NULL,
       instance_name VARCHAR(255) NOT NULL UNIQUE,
       hostname VARCHAR(255) INLINE NOT NULL,
       port INT NOT NULL
    cluster_instance_map
       cluster_map_id INT NOT NULL,
       cluster_id INT NOT NULL,
       instance_id INT NOT NULL
    replication_params
       replication_param_id INT NOT NULL,
       replication_param_name VARCHAR(255) NOT NULL UNIQUE,
       replication_param_value VARCHAR(255) NOT NULL,
    replication_tables
       table_id INT NOT NULL,
       table_name VARCHAR(255) NOT NULL UNIQUE
    replication_sequences
       account_seq
       role_seq
       credentials_seq
       realm_seq
       private_identity_seq
       property_seq
       public_identity_seq
       lsid
    
  2. Modify ORACLE_HOME/sdp/replication/sql/seed_clusters_test_sync.sql to suit your deployment. By default, this script configures one cluster of two OCMS nodes. Replace host1.example.com and host2.example.com with the hostnames of the two OCMS nodes in the cluster.

  3. Execute the following command:

    $TT_HOME/bin/ttIsql -connStr "DSN=ocmsdb" -f 
    $ORACLE_HOME/sdp/replication/sql/seed_clusters_test_sync.sql
    

    This command seeds the TimesTen database with all the cluster and instance information required for configuring replication.

    Following are the tables that require replication:

    • Schema properties:

      properties
      
    • Cluster and instance configurations:

      clusters
      instances
      cluster_instance_map
      replication_params
      replication_tables
      
    • Location Service

      locationservice
      
    • User Service

      public_identity
      private_identity
      
    • Security Service

      account
      role
      user_role
      credentials
      realm
      

Note:

Sequences are excluded from replication.

Creating an Internal TimesTen User

For each OCMS node in the cluster, perform the following:

  1. Create an internal TimesTen user called ocmsuser and assign a password to the user name by executing the following command in TimesTen:

    Command> create user ocmsuser IDENTIFIED BY 'password';
    
  2. Grant ocmsuser administrative privileges by executing the following command in TimesTen:

    Command> grant admin to ocmsuser;
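
Both commands above are entered at the TimesTen ttIsql Command> prompt. A minimal way to open that prompt, assuming the default ocmsdb DSN used throughout this chapter, is:

$TT_HOME/bin/ttIsql -connStr "DSN=ocmsdb"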
    

Configuring Replication in Oracle TimesTen In-Memory Database

Note:

It is only possible to configure replication between like operating systems, for example two Linux computers.

This section describes how to configure replication in the Oracle TimesTen in-memory database instances:

Configuring Replication in the First Oracle TimesTen Database Instance in the Cluster

To configure replication for the first Oracle TimesTen In-Memory database in a cluster, perform the following:

  1. On each computer, add $TT_HOME/lib to the LD_LIBRARY_PATH environment variable.

  2. On the first computer in the cluster, execute the following command:

    $ORACLE_HOME/sdp/replication/bin/create_replication_sync.pl
    
  3. When prompted for the hostname, enter the fully qualified domain name.

    The following error message may display. You can safely ignore it.

    [TimesTen][TimesTen 6.0.7 ODBC Driver][TimesTen]TT12027: The agent is already stopped for the data store.
    Dropping Replication Scheme: ocms_rep.repscheme
    [TimesTen][TimesTen 6.0.7 ODBC Driver][TimesTen]TT8160: REPLICATION OCMS_REP.REPSCHEME not found
            -- file "repDDL.c", lineno 1436, procedure "sbRepCheckOrDrop()"
    

    The replication scheme is now configured on the database.

  4. Verify that the replication scheme has been configured in the data store by executing the following command at the TimesTen prompt:

    $TT_HOME/bin/ttIsql -connstr "dsn=ocmsdb"
    Command> repschemes;
    
  5. Verify that replication scheme OCMS_REP.REPSCHEME has been successfully created.

  6. If you need to make any changes to the names of the TimesTen instances, for example, if you opt to replicate to a different instance altogether, perform the following:

    1. Drop and re-create the replication tables by executing the following commands:

      $TT_HOME/bin/ttIsql -connstr "dsn=ocmsdb"
      Command> drop replication ocms_rep.repscheme;
      Command> quit
      
    2. Go back to "Creating and Seeding Replication Tables".

  7. Configure TimesTen to automatically start replication (as well as the replication agent, if it is not running) upon restarting TimesTen by executing the following command:

    $TT_HOME/bin/ttAdmin -repPolicy always ocmsdb
    
Configuring Replication in the Second Oracle TimesTen Database Instance in the Cluster

To configure replication for the second Oracle TimesTen Database in the cluster:

  1. If OCMS is running, stop it by executing the following command:

    opmnctl stopproc process-type=ocms
    
  2. Stop Oracle TimesTen by executing the following command:

    opmnctl stopproc ias-component=TIMESTEN
    
  3. Start the TimesTen daemon by executing the following command:

    $TT_HOME/bin/ttDaemonAdmin -start
    
  4. Configure the ocmsdb data store to have the same PermSize as that of the first computer. For example, if the first computer has 512 MB PermSize allocated to ocmsdb, allocate 512 MB to the second ocmsdb data store.

    On Linux, edit $TT_HOME/info/sys.odbc.ini and change the PermSize attribute of the ocmsdb entry (see the fragment following this procedure).

  5. Delete the ocmsdb datastore on this computer by executing the following command:

    $TT_HOME/bin/ttDestroy -connstr "dsn=ocmsdb"
    

    The datastore will be replicated from the first OCMS computer.

  6. Duplicate the ocmsdb datastore from the first computer as follows:

    1. Execute the following command:

      perl $ORACLE_HOME/sdp/replication/bin/duplicate_db_replication.pl
      
    2. At the prompt, enter the fully qualified domain name of the first computer.

    3. At the prompt, enter the fully qualified domain name of the second computer (that is, the computer on which you are currently working).

    4. At the prompt, enter the user name and password of the TimesTen user that you created on the first computer, for example, ocmsuser.

    Note:

    Depending on its size, copying and replication of the datastore may take a few moments.
  7. Start the TimesTen daemon as follows:

    $TT_HOME/bin/ttDaemonAdmin -start
    
  8. Run the following command to configure replication to start automatically when TimesTen restarts:

    $TT_HOME/bin/ttAdmin -repPolicy always ocmsdb
    
  9. Start TimesTen using opmnctl:

    opmnctl startproc ias-component=TIMESTEN
    
  10. Start OCMS:

    opmnctl startproc process-type=ocms
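
As a minimal illustration of the PermSize edit in step 4, the ocmsdb entry in $TT_HOME/info/sys.odbc.ini might contain a line such as the following (512 is only an example value, and the other attributes of the entry are omitted):

[ocmsdb]
...
PermSize=512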
    

Configuring Failover and Recovery for Oracle TimesTen In-Memory Database

This section describes the following topics:

  • Configuring Failover

  • Configuring Recovery

Configuring Failover

TimesTen is managed by OPMN, which immediately restarts a failed instance. The replication policy is configured to start automatically when TimesTen starts, so no further action is required. If the replication agent fails, TimesTen immediately restarts it.

As any given application instance is tied to an in-memory instance of the TimesTen database, the application instance cannot fail over to another instance of the TimesTen database.

Typically, when an application instance fails, requests are no longer sent to the failed instance. No action is required on the other instances in the cluster. When the application instance comes back online, however, a recovery procedure must be followed.

Configuring Recovery

If you have manually brought down an instance for a few minutes, you can simply restart it. The replication policy is configured to auto-start when TimesTen starts, such that no further action is required.

If an instance has failed for an extended period of time (over an hour, for example), it is recommended to delete the database on the failed instance and duplicate it from any another instance in the cluster.

To recover a failed instance, perform the following:

  1. Reset the replication policy to manual, so that the replication agent can terminate properly.

    $TT_HOME/bin/ttAdmin -repPolicy manual ocmsdb
    
  2. Stop replication on the failed instance.

    $TT_HOME/bin/ttAdmin -connstr "dsn=ocmsdb" -repStop
    
  3. Run the following Perl script to recover replication:

    perl $ORACLE_HOME/sdp/replication/bin/recover_replication_sync.pl 
    
  4. Change replication policy back to always by executing the following command:

    $TT_HOME/bin/ttAdmin -repPolicy always ocmsdb
    

On the other OCMS node (that is, the surviving instance), perform the following:

  1. Using the following command, change the replication policy to manual to enable the replication agent to terminate properly:

    $TT_HOME/bin/ttAdmin -repPolicy manual ocmsdb
    
  2. Stop replication on the surviving instance:

    $TT_HOME/bin/ttAdmin -repStop ocmsdb
    
  3. Drop the replication scheme by executing the drop replication command:

    $TT_HOME/bin/ttIsql -connstr "dsn=ocmsdb"
    Command> drop replication ocms_rep.repscheme;
    Command> quit
    
  4. Recreate replication:

    1. Execute the following command:

      perl $ORACLE_HOME/sdp/replication/bin/create_replication_sync.pl
      

      Note:

      You must add $TT_HOME/lib to the LD_LIBRARY_PATH environment variable before you can execute the create_replication_sync.pl script.
    2. When prompted for the hostname, enter the fully qualified domain name of the host.

  5. Set the replication back to always by executing the following command:

    $TT_HOME/bin/ttAdmin -repPolicy always ocmsdb
    

Troubleshooting Replication

This section discusses the following replication troubleshooting issues:

  • Unable to Connect

  • Master Catchup Required

Unable to Connect

When attempting to duplicate the database from the first instance to the second instance, the following error message might display:

Unable to connect to the replication agent on the first instance

This message indicates that the replication agent may not be running.

To resolve this error, perform the following:

  • Ensure that the replication agent is running on the first database instance by executing the following command:

    $TT_HOME/bin/ttStatus
    

    The output of this command should include entries for replication with corresponding process IDs.

  • The replication policy might not be configured as always. To configure the replication policy as always, execute the following command:

    $TT_HOME/bin/ttAdmin -repPolicy always ocmsdb
    
Master Catchup Required

When attempting to connect to the database on the first instance, the following error message may display:

Master Catchup Required. New Connections are not allowed.

To resolve this error, drop and re-create the replication scheme as follows:

  1. On the first instance, edit $TT_HOME/info/sys.odbc.ini and set ForceConnect=1 for the ocmsdb data store. This enables connections and modifications to the database.

  2. Next, you must drop the existing replication scheme, create a new replication scheme, and re-create the replication sequences. To do this, execute the following commands:

    $TT_HOME/bin/ttAdmin -repPolicy manual ocmsdb
    $TT_HOME/bin/ttAdmin -repStop -connstr "dsn=ocmsdb"
    perl $ORACLE_HOME/sdp/replication/bin/create_replication_sync.pl
    
  3. Execute the following command to configure the replication policy as always:

    $TT_HOME/bin/ttAdmin -repPolicy always ocmsdb
    
  4. Repeat for the other database instance, then duplicate the database from the first instance to the second.