11.6 Monitoring the Health of an Access Manager Server

Access Manager services are business critical and must always be available to control user access to an organization's protected web services and applications. Because hardware faults, network connectivity problems, and other failures can occur, load balancers can use HeartBeat monitoring to ensure that user traffic is routed only to healthy OAM Servers.

For example, when a firewall is installed between a User Agent or WebGate (10g or 11g) and the 10g or 11g Access Manager server, perimeter devices can check the availability (health) of the Access Manager server by requesting its HeartBeat URL. The following sections contain details.

11.6.1 Understanding WebGate and Access Manager Communications

When a network firewall is deployed between a WebGate and the Access Manager server, the WebGate communicates using the OAP protocol by creating a TCP socket connection with Access Manager to establish a message channel. The WebGate uses the message channel to send the OAP messages needed to serve resource requests (isprotected, isauthorized, and the like). Now, consider a situation in which the WebGate/Oracle HTTP Server is idle. In this case, the WebGate has received no resource requests and sends no authentication or authorization messages to Access Manager, so there is no read or write activity on the socket connection.

The firewall determines that this connection is idle after 30-40 minutes of inactivity (depending on its configuration) and terminates the socket connection without notifying the WebGate or the Access Manager server. When a request for a resource later arrives at the WebGate, the WebGate sends an OAP message to the Access Manager server over the existing connection and waits for a reply. Because the firewall dropped the connection, the WebGate receives no reply and waits until the TCP timeout expires. Only after the TCP timeout does the WebGate recognize that the message channel is unusable and start the process of obtaining a new one. The TCP timeout is OS specific and can range from several minutes to hours, during which time the WebGate cannot process user requests.

Note:

The setKeepAlive WebGate parameter ensures that load balancers do not drop the OAP connection. See Table 15-2 for details.
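
The general mechanism for keeping an otherwise idle TCP connection active is TCP keepalive. The following Python sketch is not WebGate code and does not show how the setKeepAlive parameter is implemented; it only illustrates, under assumed timing values, how a client can ask the operating system to send periodic probes on an idle socket so that an intermediate device continues to see traffic. The function name and the default values (a 10-minute idle time, chosen to be well under the 30-40 minute firewall window described above) are assumptions made for illustration.

```python
import socket


def enable_keepalive(sock, idle_seconds=600, interval_seconds=60, probe_count=5):
    """Enable TCP keepalive probes on an existing socket (illustrative only)."""
    # Ask the OS to send keepalive probes when the connection is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning options are platform specific, so apply only the
    # ones this platform exposes and ignore any the OS rejects.
    tuning = (
        ("TCP_KEEPIDLE", idle_seconds),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_seconds),  # time between probes
        ("TCP_KEEPCNT", probe_count),         # failed probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # fall back to the operating system defaults
    return sock
```

With probes enabled, a silently dropped connection is detected after roughly the idle time plus the probe interval multiplied by the probe count, rather than after the full OS-specific TCP timeout.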

11.6.2 Monitoring Access Manager Server Health

The OAM monitoring model allows Web Tier components (load balancers) to ping an OAM Managed Server's HeartBeat endpoint at a scheduled interval over HTTP(S). This allows Web Tier components to route incoming HTTP traffic away from unhealthy OAM Managed Server(s).

Every OAM Managed Server exposes this HeartBeat URL:

Scheme://ManagedServerHost:ManagedServerPort/oam/server/HeartBeat

In this URL, the following is true:

  • Scheme = https | http

  • ManagedServerHost = Host name of the Access Manager WLS Managed Server

  • ManagedServerPort = Port used by the Access Manager WLS Managed Server
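
As a minimal sketch, the URL pattern above can be assembled from its three parts as follows. The function name, the host name oam1.example.com, and the port 14100 are hypothetical values used only for illustration; substitute the host and port of your own Managed Server.

```python
def heartbeat_url(host, port, scheme="http"):
    """Build the HeartBeat URL for one Access Manager Managed Server."""
    # The documented pattern allows only these two schemes.
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    return f"{scheme}://{host}:{port}/oam/server/HeartBeat"


# Hypothetical host and port, for illustration only.
example = heartbeat_url("oam1.example.com", 14100, scheme="https")
```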

The HeartBeat URL works as follows:

  1. The Web Tier components will send an HTTP request to the HeartBeat endpoint of the Access Manager Managed Server.
  2. The Access Manager Managed Server will then do the following:
    • Verify Id Store Connectivity

    • Verify Policy Store Connectivity

    • Verify the Credential Collector URLs are reachable

    • Perform a sanity check of the Coherence layer

    • Check for NAP connectivity

    If the above tests succeed, the Access Manager server is considered healthy and an HTTP 200 response is sent to the load balancer. Any other HTTP status code signifies that the Access Manager Managed Server is not healthy.

  3. When multiple Access Manager Managed Servers are present in the deployment, the Web Tier component will repeat this for each OAM Managed Server.

Note:

Neither the health status test results nor the check results can be communicated in the body of the HTTP response. A successful heartbeat check returns the HTTP status code 200.
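
The polling behavior described in this section can be approximated with a short Python sketch. It is not how any particular load balancer implements its health checks; the function names, the 5-second timeout, and the injectable opener argument (included so that the logic can be exercised without a live server) are assumptions made for illustration. Consistent with the note above, the sketch looks only at the status code and ignores the response body. It also treats a connection failure or timeout as unhealthy, which is a design choice of this sketch.

```python
import urllib.request


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True only if the HeartBeat URL answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            # Only the status code matters; the body carries no results.
            return response.status == 200
    except OSError:
        # urllib raises HTTPError for 4xx/5xx responses and URLError for
        # connection failures; both are OSError subclasses, as are timeouts.
        return False


def healthy_servers(urls, timeout=5.0, opener=urllib.request.urlopen):
    """Repeat the check for each Managed Server and keep the healthy ones."""
    return [url for url in urls if is_healthy(url, timeout, opener)]
```

A scheduler would call healthy_servers at a fixed interval and route traffic only to the servers it returns. Note that urllib.request.urlopen follows redirects by default, so a stricter check might disable that behavior.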