7 Understanding Oracle Virtual Directory Fault Tolerance

This chapter describes Oracle Virtual Directory fault tolerance and contains the following topics:

7.1 Overview

Oracle Virtual Directory is extremely flexible when implementing fault-tolerant designs. Oracle Virtual Directory does not store data locally allowing duplicate copies of the data to be deployed and managed across multiple Oracle Virtual Directory instances. Additionally, Oracle Virtual Directory configuration files can be easily duplicated or shared on an appropriate Storage Area Network (SAN) configuration.

Oracle Virtual Directory's LDAP Adapter provides excellent support for managing connections to multiple source directory replicas and masters. Oracle Virtual Directory provides the ability to spread query loads across multiple directory replicas while directing add, modify, delete, and rename operations to designated directory master servers.

In a situation where one source directory does not have fault tolerance and the LDAP client application issues a query that spans all directories, LDAP RFCs require that all parts of the directory respond correctly or the entire result is invalid. This generally works well until a proxied directory becomes unavailable. If the source without a redundant directory link fails, global queries may begin to failover all directories even though only part of the user base is impacted. Oracle Virtual Directory enables you to control how it responds when individual proxies fail and how it should impact the overall service.

In many scenarios the proxied directory is present to allow partner company users to access a host company's application. If the partner directory is offline or is unreachable, it is also likely that the company's users cannot get to the application anyway, so a failure could be deemed non-critical to the application. In this case, Oracle Virtual Directory can be configured to ignore the downed server connection, allowing the other partners to continue working.

The following is a list of the primary areas of Oracle Virtual Directory fail over, which are described in the subsequent topics in this chapter:

7.2 DNS and Network Fail Over

Depending on how you plan to implement fault tolerance for the Oracle Virtual Directory, you can consider several options for routing clients to available Oracle Virtual Directory systems.

The simplest method is to define DNS round robin where a particular DNS name has two IN A records in DNS management terms which causes a DNS server to give out a rotating address each time a request for a particular address is made (that is, ldap.corp.com alternates between 192.168.0.1 and 192.168.0.2). This approach is useful if you want to spread load between two available servers, but is less useful when one of those servers becomes unavailable because DNS is unaware of the failure and continues to send clients to the server every time it rotates through the failed server's address.

You can also use a hardware load balancer such as Cisco's LocalDirector or F5's Big-IP. These types of products provide true load balancing while monitoring performance of each of the servers. There are many products that vary in cost and capability in this category.

Another method is to use a cluster configuration (for example, Veritas) capable of switching IP addresses between failed nodes in a cluster.

7.3 Oracle Virtual Directory Fail Over

Fail over Oracle Virtual Directory system fail over is relatively straightforward unless you are using a Local Store Adapter. Oracle Virtual Directory uses configuration files that are only read on start-up. In theory, two servers reading the same configuration data automatically perform the same function.

7.3.1 Local Store Adapter Fail Over

When using the Local Store Adapter, you must consider a few additional issues, specifically replication. Replication is the process where one node updates the other node with changes to its local data store. If you are using the Local Store Adapter, you must set up a replication agreement between cluster nodes (and possibly other non-cluster nodes). When replication is configured, one node becomes the primary node and the other the node becomes a subordinate node to the primary node. For example, node 1 is the primary and node 2 is the subordinate. In this configuration, both nodes are equally functional, however, only node 1 may process writes. Once processed, node 1 automatically updates node 2.

If there is a failure, you must configure your cluster failure handling scripts to take appropriate action. If node 2 fails, nothing is impacted if replication is restarted on the return of node 2. In contrast, if node 1 fails, node 2 must be promoted to primary, allowing node 2 to continue handling writes during node 1's absence. Before node 1 returns to operational status, the replication agreement must be reversed.

7.4 Proxied Sources Fail Over

Oracle Virtual Directory's LDAP Adapter provides sophisticated fail over and load balancing management for all LDAP-compliant data repositories. For any proxied source or adapter you can define multiple remote host replicas and specify the following characteristics:

  • Read/Write or Read Only (Master versus Replica node)

  • Percentage load distribution or switch only on failure

Figure 7-1 shows an example of how Oracle Virtual Directory's LDAP Adapter performs transaction load-balancing:

Figure 7-1 Oracle Virtual Directory LDAP Adapter and Transaction Load-Balancing

Figure shows OVD Ldap Adapter transaction load-balancing.

Oracle Virtual Directory includes configurable connection handling settings that allow you to specify the following:

  • Heartbeat Interval: How often Oracle Virtual Directory verifies online status of a proxy. The heartbeat interval continually verifies availability of a server. If a proxy goes offline, Oracle Virtual Directory automatically removes it from its list of active servers and distributes load to other defined replicas. When the heartbeat interval verifies a server is available again, the server is put back on the available list.

  • Time-out Interval: How long (in milliseconds) Oracle Virtual Directory waits before determining a connection has failed. When a connection fails, Oracle Virtual Directory automatically tries the next server on its replica list. If no proxied servers are responding, the LDAP client receives the DSA unavailable error.

  • Criticality: How Oracle Virtual Directory determines when the proxy's results are critical to an overall query. If a query requires responses from multiple adapters, Oracle Virtual Directory responds with an error if any sources are unavailable (because all adapters could not be queried) and have been designated critical. In some situations you may want to have Oracle Virtual Directory return results even if only some servers could be queried. To allow Oracle Virtual Directory to return partial results, set adapters to non-critical if you are allowing missing results from those adapters.