Distributed Execution
PGX provides a distributed (i.e., multi-machine) execution mode, which
makes it possible to analyze large graph instances that would not fit in
the memory of a single machine. When running in distributed execution mode,
PGX partitions the graph so that each machine holds only a subset of the
graph data in its own memory, while the whole graph still resides in the
aggregate memory of all the machines. This
partitioning happens automatically when PGX loads the graph.
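To make the idea of partitioning concrete, here is a minimal sketch of how vertices could be assigned to machines. It is an illustration only: this page does not describe the strategy PGX actually uses, and the class `PartitionSketch`, the methods `ownerOf` and `partition`, and the modulo rule are all hypothetical names and assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: NOT the PGX partitioner. Assumes a simple modulo rule.
public class PartitionSketch {

    // Hypothetical rule: the owning machine is vertexId mod numMachines.
    // floorMod keeps the result non-negative even for negative ids.
    public static int ownerOf(long vertexId, int numMachines) {
        return (int) Math.floorMod(vertexId, (long) numMachines);
    }

    // Splits the vertex ids into one list per machine, so that each
    // machine holds a subset and all lists together cover every vertex.
    public static List<List<Long>> partition(long[] vertexIds, int numMachines) {
        List<List<Long>> parts = new ArrayList<>();
        for (int m = 0; m < numMachines; m++) {
            parts.add(new ArrayList<>());
        }
        for (long v : vertexIds) {
            parts.get(ownerOf(v, numMachines)).add(v);
        }
        return parts;
    }

    public static void main(String[] args) {
        long[] vertices = {0, 1, 2, 3, 4, 5, 6, 7};
        List<List<Long>> parts = partition(vertices, 3);
        for (int m = 0; m < parts.size(); m++) {
            System.out.println("machine " + m + " holds " + parts.get(m));
        }
    }
}
```

With such a rule, no machine stores the full vertex set, yet every vertex has exactly one owner, which mirrors the property described above.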
Naturally, such a distributed execution requires a cluster environment where
several machines are connected via a local communication network. For efficient
large-graph processing, PGX favors engineered cluster environments that are
equipped with server-class machines and high-bandwidth interconnects, such as
Oracle Big Data Appliance.
The PGX distributed execution mode was presented at the SC2015 conference
(download paper) and the GRADES2017 workshop (download paper).
Many features have been added to the distributed mode since then, but the
overall architecture is still close to what these papers describe.
The PGX distributed mode executes as follows:
- Two components are used: a front-end and a distributed execution engine.
The front-end is a web server, very similar to the one used in the
remote shared-memory mode, while the distributed execution engine performs the actual operations,
such as loading graphs, running algorithms and executing PGQL queries.
- The front-end of the PGX distributed execution mode implements a subset of the
same remote PGX API as the PGX shared-memory server. Therefore, a PGX client,
e.g. PGX shell, can connect to it
and submit analysis requests and PGQL queries, just as to a PGX shared-memory server.
The front-end of the PGX distributed execution mode processes those requests by issuing
corresponding commands to the distributed execution engine.
Please refer to the following pages to learn more about the supported features and limitations of the distributed mode:
- The PGX distributed engine executes the actual computations for graph algorithms and queries in a distributed
manner. Within each machine, the engine employs multi-threading and leverages the multiple cores of modern
hardware. The engine uses zero-copy messaging to efficiently communicate data located on remote machines.
- The PGX distributed execution mode supports failure detection and graceful reporting of failures to users connected to
the PGX server, for example via the shell. Failure detection in PGX is based on the Raft consensus protocol and is
a first step towards providing fault tolerance for the distributed backend.
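The failure detection mentioned in the last item can be pictured with a small timeout-based sketch. This is neither an implementation of Raft nor the PGX code: it only shows the heartbeat-timeout idea that Raft also relies on, where a machine is suspected of having failed once it has been silent for longer than a timeout. The class `HeartbeatMonitor`, its methods, and the machine names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a heartbeat-timeout detector, NOT Raft and NOT PGX code.
public class HeartbeatMonitor {
    private final long timeoutMillis;
    // Last time (in milliseconds) a heartbeat was seen from each machine.
    // TreeMap keeps machine names sorted, so the output order is stable.
    private final Map<String, Long> lastSeen = new TreeMap<>();

    public HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat message from a machine arrives.
    public void recordHeartbeat(String machine, long nowMillis) {
        lastSeen.put(machine, nowMillis);
    }

    // Returns the machines whose last heartbeat is older than the timeout.
    public List<String> suspected(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(100);
        monitor.recordHeartbeat("machine-a", 0);
        monitor.recordHeartbeat("machine-b", 0);
        monitor.recordHeartbeat("machine-a", 90); // machine-b stays silent
        System.out.println("suspected at t=150: " + monitor.suspected(150));
    }
}
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic. A front-end could use the list of suspected machines to report the failure to connected clients instead of leaving their requests hanging.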
The following figure illustrates the PGX distributed execution mode:
Figure: PGX Distributed Execution
See the distributed execution mode installation page
for details about how to start the distributed server in your environment.