This topic discusses the load balancing and routing of requests
from Studio nodes to the Dgraph nodes in Oracle Big Data Discovery.
Load balancing of requests
Depending on your deployment strategy, external clients reach an on-premise
Big Data Discovery cluster through one of two entry points: any Studio-hosting
node in the cluster, or an external load balancer configured in front of the
Studio instances.
The Big Data Discovery cluster relies on the following two levels of
load balancing of requests:
- Load balancing of requests across the nodes hosting multiple instances of
Studio. This task should be performed by an external load balancer, if you
choose to use one in your deployment (an external load balancer is not
included in the Big Data Discovery package).
If an external load balancer is used, it receives all requests and
distributes them across all of the nodes in the Big Data Discovery cluster
deployment that host the Studio application. Once a request is received from a
Studio node, it is routed by BDD to the appropriate Dgraph node.
If an external load balancer is not used, external requests can be
sent to any Studio node. They are then load-balanced between the nodes hosting
the Dgraph.
- Load balancing of requests across the Dgraph nodes. The BDD cluster handles
this task automatically: the Big Data Discovery software accepts requests from
its Studio and Data Processing components on any node hosting the Dgraph, and
load-balances these requests internally across the other Dgraph-hosting nodes
in the cluster.
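As a rough illustration of the first level, the following Python sketch lists the addresses an external client could use as its entry point. The function name and the host names are hypothetical and are not part of Big Data Discovery; the sketch only restates the two deployment options described above.

```python
from typing import List, Optional


def entry_points(studio_nodes: List[str],
                 load_balancer: Optional[str] = None) -> List[str]:
    """Return the addresses external clients may send requests to.

    Hypothetical helper for illustration only; not a BDD API.
    """
    if not studio_nodes:
        raise ValueError("the cluster needs at least one Studio node")
    if load_balancer is not None:
        # With an external load balancer, clients address only the balancer,
        # which distributes requests across the Studio nodes.
        return [load_balancer]
    # Without one, clients may address any Studio node directly.
    return list(studio_nodes)
```

For example, `entry_points(["studio1", "studio2"], "lb.example.com")` yields only the balancer address, while omitting the second argument yields both Studio nodes.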
Routing of requests
The Big Data Discovery cluster automatically directs requests to the
subset of the cluster nodes hosting the Dgraph instances.
The following statements describe the behavior of the BDD cluster for
routing of requests to Dgraph nodes:
- Requests can be submitted from Studio or Data Processing components to any
Dgraph Gateway in the BDD cluster, which in turn routes each request to an
appropriate Dgraph node.
For example, if the request is an updating request, such as a data loading
request or a configuration update, it is routed to the leader Dgraph node in
the cluster. If the request is a non-updating (query processing) request, it
is routed to the leader Dgraph node or to any of the follower Dgraph nodes. If
a BDD cluster has only one node hosting the Dgraph, this node serves as the
leader (with no followers).
- Non-updating requests are load-balanced across the Dgraph nodes for
processing, using a round-robin algorithm.
- The Big Data Discovery cluster uses session affinity for all requests
arriving from Studio to the Dgraph, relying on the session ID in the header of
each Studio request. Requests with the same session ID are always routed to
the same Dgraph node in the cluster. This improves query processing
performance by making efficient use of the Dgraph cache, and improves the
performance of caching entities (known in Studio as views).
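The routing rules above can be sketched in a few lines of Python. This is a simplified model, not the BDD implementation: the class name, method name, and node names are hypothetical. It also assumes that updating requests go to the leader regardless of session, and that a session is pinned to whichever node the round-robin rotation selects for its first query; the text above does not spell out how the two rules interact.

```python
from itertools import cycle
from typing import Dict, List, Optional


class DgraphRouter:
    """Hypothetical model of request routing to Dgraph nodes."""

    def __init__(self, leader: str, followers: Optional[List[str]] = None):
        self.leader = leader
        self.followers = list(followers or [])
        # Queries may be served by the leader or by any follower.
        self._rotation = cycle([self.leader, *self.followers])
        # session ID -> Dgraph node chosen for that session (assumed pinning).
        self._affinity: Dict[str, str] = {}

    def route(self, updating: bool, session_id: Optional[str] = None) -> str:
        """Return the Dgraph node that should process the request."""
        if updating:
            # Data loading and configuration updates go to the leader only.
            return self.leader
        if session_id is None:
            # No session to honor: plain round-robin.
            return next(self._rotation)
        if session_id not in self._affinity:
            # First query of a session: pick a node by round-robin, then pin.
            self._affinity[session_id] = next(self._rotation)
        return self._affinity[session_id]
```

With `router = DgraphRouter("dgraph1", ["dgraph2", "dgraph3"])`, repeated calls to `router.route(updating=False, session_id="s1")` return the same node each time, while calls without a session ID rotate through all three nodes. A cluster created with no followers routes every request to its single leader.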