4.17.1 Using HAProxy for PGX Load Balancing and High Availability

HAProxy is a high-performance TCP/HTTP load balancer and proxy server that distributes incoming requests across multiple backend servers.

You can use HAProxy with multiple instances of the graph server (PGX) for high availability. The following example uses the OPG4J shell to connect to PGX.

The following instructions assume you have already installed and configured the graph server (PGX), as explained in Starting the Graph Server (PGX).

  1. If HAProxy is not already installed on Big Data Appliance or your Oracle Linux distribution, run this command:
    yum install haproxy
  2. Start the graph server instances.

    For example, to load balance PGX across four nodes of the Big Data Appliance (such as bda02, bda03, bda04, and bda05), start PGX on each of these nodes and configure each instance to listen for connections on port 7007.
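
    Before placing the instances behind the load balancer, it can help to confirm that each one responds. The following is a minimal sketch, not an official tool: it assumes the instances serve plain HTTP on port 7007 and that the /isReady endpoint used later for the HAProxy health check returns a body containing true. The function names are hypothetical.

```shell
# Sketch: confirm each PGX node answers /isReady before adding it to HAProxy.
# Assumptions: plain HTTP, port 7007, and a response body containing "true".

# Build the readiness URL for one node (port defaults to 7007).
pgx_ready_url() {
  local host="$1" port="${2:-7007}"
  printf 'http://%s:%s/isReady\n' "$host" "$port"
}

# Succeed when a response body contains the substring "true".
body_is_ready() {
  case "$1" in
    *true*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Probe every host given as an argument; return non-zero if any is not ready.
check_pgx_nodes() {
  local host body failed=0
  for host in "$@"; do
    body="$(curl -fsS --max-time 5 "$(pgx_ready_url "$host")" 2>/dev/null)" || body=""
    if body_is_ready "$body"; then
      echo "$host: ready"
    else
      echo "$host: NOT ready"
      failed=1
    fi
  done
  return "$failed"
}

# Example call, using the node names from this step:
# check_pgx_nodes bda02 bda03 bda04 bda05
```

    The loop only reports status; it does not start or restart any instance.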

  3. Configure HAProxy:
    1. Locate the haproxy.cfg file in the /etc/haproxy directory on the host where you installed HAProxy.
    2. Add a frontend section with the following parameters:
      • bind: to set the listening IP address and port
      • mode: to set the proxy mode (HAProxy accepts http or tcp; this example uses http)
      • default_backend: to set the name of the backend to be used

      For example, the following frontend configuration listens for HTTP traffic on port 7008 on all IP addresses assigned to the server:

      frontend graph_server_front
        bind *:7008
        mode http
        default_backend graph_server
    3. Add a backend section with the following parameters:
      • mode: to set the proxy mode (use http, which cookie-based session persistence requires)
      • cookie: name of the cookie to be used for session persistence
      • server: list of servers running behind the load balancer

      For example, the following backend configuration uses a cookie named PGX_INSTANCE_STICKY_COOKIE for session persistence:

      backend graph_server
        mode http
        cookie PGX_INSTANCE_STICKY_COOKIE insert indirect nocache
        # The cookie value at the end of each server line must be the same as the server name.
        server graph_server_1 host_name_graph_server_1:port check cookie graph_server_1
        server graph_server_2 host_name_graph_server_2:port check cookie graph_server_2
        option httpchk GET /isReady
        http-check expect string true

      In the preceding configuration, the option httpchk clause instructs the load balancer to check the readiness of each server by sending a GET /isReady request. The http-check expect clause specifies that the response body must contain the string true for the load balancer to consider the server healthy and capable of handling more requests. See Health Check in the Load Balancer for supported health check endpoints.
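
      With more than a couple of instances, writing the server lines by hand makes it easy to mistype a cookie value. The following sketch prints frontend and backend sections for a list of hosts, reusing the section names and directives from the example above. It assumes every instance listens on the same port; the function name is hypothetical, and the output is meant to be reviewed before it is added to haproxy.cfg.

```shell
# Sketch: print frontend and backend sections for a list of PGX hosts.
# Usage: gen_pgx_haproxy_sections LISTEN_PORT PGX_PORT HOST...
# Assumption: all PGX instances listen on the same PGX_PORT.
gen_pgx_haproxy_sections() {
  local listen_port="$1" pgx_port="$2" i=0 host
  shift 2
  cat <<EOF
frontend graph_server_front
  bind *:${listen_port}
  mode http
  default_backend graph_server

backend graph_server
  mode http
  cookie PGX_INSTANCE_STICKY_COOKIE insert indirect nocache
  option httpchk GET /isReady
  http-check expect string true
EOF
  for host in "$@"; do
    i=$((i + 1))
    # The cookie value repeats the server name, as the example requires.
    printf '  server graph_server_%d %s:%s check cookie graph_server_%d\n' \
      "$i" "$host" "$pgx_port" "$i"
  done
}

# Example call, using the node names and ports from this section:
# gen_pgx_haproxy_sections 7008 7007 bda02 bda03 bda04 bda05
```

      Because the server name and the cookie value come from the same counter, they cannot drift apart.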

  4. Start the load balancer.

    Start HAProxy using systemctl:

    sudo systemctl start haproxy
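
    A syntax error in haproxy.cfg prevents the service from starting, so it can be worth validating the file first. The following sketch wraps the start command with HAProxy's configuration check (haproxy -c -f) and a status query. The run helper and the DRY_RUN variable are hypothetical conveniences that print the commands instead of executing them.

```shell
# Sketch: validate the configuration, start HAProxy, then report its state.
# run executes a command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

start_haproxy_checked() {
  local cfg="${1:-/etc/haproxy/haproxy.cfg}"
  # haproxy -c checks the configuration file and exits without starting the proxy.
  run sudo haproxy -c -f "$cfg" || { echo "configuration check failed" >&2; return 1; }
  run sudo systemctl start haproxy
  # Prints "active" when the service is running.
  run systemctl is-active haproxy
}

# Example call:
# start_haproxy_checked
```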
  5. Test the load balancer.

    From any host, you can test connectivity by passing the host and port of the server running HAProxy as the base_url parameter to the graph client shell CLI. The following example is run on the HAProxy host itself, so it uses localhost:

    cd /opt/oracle/graph
    ./bin/opg4j --base_url http://localhost:7008 -u <username>
    

    Note:

    The PGX in-memory state is lost if the server goes down. HAProxy will route commands to another server, but the client must reload all graph data.

    Run a series of PGX commands to test routing. Then stop one of the graph server instances and restart the graph shell CLI to confirm that HAProxy routes the requests to another server.
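
    Routing can also be observed without the graph shell. The following sketch requests /isReady through the load balancer and prints the value of the sticky cookie, which names the backend server that handled the request. It assumes HAProxy listens on localhost:7008, that /isReady is reachable through the frontend, and that HAProxy adds a Set-Cookie header for PGX_INSTANCE_STICKY_COOKIE to responses for clients that send no cookie. The function names are hypothetical.

```shell
# Sketch: show which backend server HAProxy selected for a fresh client.

# Read HTTP response headers on stdin and print the sticky cookie value, if any.
sticky_cookie_from_headers() {
  tr -d '\r' |
    sed -n 's/^[Ss]et-[Cc]ookie: *PGX_INSTANCE_STICKY_COOKIE=\([^;]*\).*/\1/p' |
    head -n 1
}

# Send one request without a cookie and report the server named in the reply.
probe_lb() {
  local base_url="${1:-http://localhost:7008}"
  curl -s -D - -o /dev/null --max-time 5 "$base_url/isReady" |
    sticky_cookie_from_headers
}

# Example call; repeat it after stopping one instance to see the value change:
# probe_lb http://localhost:7008
```

    Each call sends no cookie, so HAProxy treats it as a new client and picks a server according to its balancing algorithm. A stopped instance should no longer appear once its health check fails.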