PGX 20.1.1
Documentation

Distributed Server Installation

This page illustrates how to start PGX with Hadoop on an arbitrary number of hosts. You are also free to use your favorite cluster management framework (e.g., YARN, Mesos, or Kubernetes).

Prerequisite

  1. Install PGX on every machine of the target cluster. Perform all of the following steps identically on every machine.

  2. The PGX distributed mode server uses TCP for the initial handshake, on port 7777 by default. With Linux iptables, you can open TCP port 7777 with the following command:

    sudo iptables -I INPUT -p tcp --dport 7777 -j ACCEPT
    

    You can change the port with the handshake_port option in the configuration file. See the Configuration section below for details.

  3. [If your cluster does not have InfiniBand] When running on Ethernet, the PGX distributed mode server uses UDP as the transport protocol, so every machine in the cluster must be able to receive UDP packets from the other machines. Use the following command to accept UDP packets from a range of IP addresses (replace $IP-$IP with the first and last address of the range):

    sudo iptables -I INPUT -p udp -m iprange --src-range $IP-$IP -j ACCEPT
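
The two firewall rules above can be generated for a given port and address range before applying them. The following is a minimal sketch: the function name pgx_firewall_rules and the example addresses are made up for illustration, and the function only prints the commands so that they can be reviewed first.

```shell
#!/usr/bin/env bash
# Print (do not execute) the iptables rules described in the prerequisites.
# pgx_firewall_rules is a hypothetical helper name, not part of PGX.
pgx_firewall_rules() {
  local port="${1:-7777}"   # TCP handshake port; 7777 is the documented default
  local first_ip="$2"       # first address of the cluster's IP range
  local last_ip="$3"        # last address of the cluster's IP range
  # Rule for the initial TCP handshake.
  echo "sudo iptables -I INPUT -p tcp --dport ${port} -j ACCEPT"
  # Rule for UDP transport; only needed when the cluster has no InfiniBand.
  echo "sudo iptables -I INPUT -p udp -m iprange --src-range ${first_ip}-${last_ip} -j ACCEPT"
}

# Example with placeholder addresses; review the output before running it.
pgx_firewall_rules 7777 192.0.2.10 192.0.2.20
```

Printing instead of executing keeps the sketch safe to run on any machine. Pipe the output to a shell only after checking the port and range.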
    

Configure SSL/TLS Security Certificates

Two-way SSL/TLS is enabled by default. See the tutorial on configuring TLS/SSL certificates for details.

Disabling SSL/TLS

You need to properly configure SSL/TLS to start the server (see the web server configuration page for SSL/TLS options). Although it is possible to turn SSL/TLS off, we strongly recommend leaving it turned on for any production deployment.

Configuration

Configure $PGX_HOME/conf/server.conf and $PGX_HOME/conf/pgxd.conf.

  • hostnames: pgxd.conf provides no default value for the hostnames parameter. You must specify the list of host names of your cluster, either in the configuration file or as a command-line argument (see the next section).

  • secure_handshake_secret_file: By default, the PGX distributed mode server uses a TLS-PSK based secure handshake among its distributed processes. Create a keystore that stores a secret, then set the secure_handshake_secret_file parameter to the file path of the keystore. See the tutorial on creating a keystore for how to create it. You can disable the secure handshake by setting the enable_secure_handshake parameter to false in pgxd.conf. When the option is disabled, the PGX distributed server uses normal unsecured TCP channels for communication. We highly recommend keeping the option enabled for any production deployment.
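
As a rough illustration of how the parameters above fit together, the sketch below writes a sample pgxd.conf to a scratch directory rather than to $PGX_HOME/conf. It rests on assumptions: the JSON layout is assumed and not confirmed by this page, the host names and keystore path are placeholders, and write_sample_pgxd_conf is a made-up helper name. Only the three parameter names come from the text above.

```shell
#!/usr/bin/env bash
# Write an illustrative pgxd.conf into a scratch directory.
# ASSUMPTION: the file is JSON; check the configuration reference for the real layout.
# The host names and the keystore path below are placeholders.
write_sample_pgxd_conf() {
  local out_dir="$1"
  cat > "${out_dir}/pgxd.conf" <<'EOF'
{
  "hostnames": ["hostname0", "hostname1"],
  "enable_secure_handshake": true,
  "secure_handshake_secret_file": "/path/to/handshake-keystore"
}
EOF
}

scratch_dir="$(mktemp -d)"
write_sample_pgxd_conf "$scratch_dir"
cat "${scratch_dir}/pgxd.conf"
```

Writing to a scratch directory avoids overwriting a working configuration. Copy values into the real file by hand once they are verified.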

Start the Distributed Server

Multiple machines

If you are running the distributed server on multiple machines, execute the first three steps below on every machine.

  1. Make sure the JAVA_HOME environment variable points to the JDK8 home directory, e.g. export JAVA_HOME=/usr/lib/jvm/java-8-oracle.

  2. The PGX distributed mode server locates libjvm.so through the LD_LIBRARY_PATH environment variable. Add the directory that contains libjvm.so, for example:

    export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/amd64/server:$LD_LIBRARY_PATH
    

    The distributed runtime requires Java 8u102 or later

    Some functionality requires at least this version.

  3. Start the server using the built-in bash script start-server with the --dist argument. The script automatically detects the configuration files and launches the distributed PGX server.

    cd $PGX_HOME
    ./bin/start-server --dist
    

    If you did not set hostnames in pgxd.conf, you can pass them on the command line instead:

    cd $PGX_HOME
    ./bin/start-server --dist -Dpgx.hostnames=[hostname0,hostname1]
    

    Adding -Dpgx.hostnames=[hostname0,hostname1] overrides the hostnames option in pgxd.conf with the given value.

    Do not run with admin permissions

    We strongly recommend not running the PGX server with admin permissions (e.g., as the root user), because doing so can increase security risks on your system.

  4. Check the access point of the PGX distributed mode server. After you finish the steps above on all machines in your cluster, a log message like the following appears:

    INFO: Starting ProtocolHandler ["http-nio-7007"]
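
The checks implied by the steps above (JAVA_HOME and libjvm.so, the Java 8u102 minimum, not running as root, and the hostnames argument) can be collected in a small preflight script. This is a sketch under assumptions: all function names are invented for illustration, the libjvm.so path follows the JDK 8 layout used in the export above, and the version check expects a string in the 1.8.0_NNN form printed by JDK 8.

```shell
#!/usr/bin/env bash
# Preflight helpers for starting the distributed server.
# All function names are hypothetical; none of them is part of PGX.

# Succeeds if libjvm.so exists under the given JAVA_HOME (JDK 8 layout).
check_libjvm() {
  local java_home="$1"
  [ -n "$java_home" ] && [ -f "${java_home}/jre/lib/amd64/server/libjvm.so" ]
}

# Succeeds if the version string is Java 8 with update 102 or later.
java8_update_ok() {
  local version="$1"                 # e.g. 1.8.0_241
  case "$version" in
    1.8.0_*) ;;
    *) return 1 ;;
  esac
  local update="${version#1.8.0_}"   # keep the part after the underscore
  update="${update%%[!0-9]*}"        # drop any suffix such as -b07
  [ -n "$update" ] && [ "$update" -ge 102 ]
}

# Succeeds if the given uid (default: current user) is not root.
check_not_root() {
  local uid="${1:-$(id -u)}"
  [ "$uid" -ne 0 ]
}

# Joins host names with commas into the -Dpgx.hostnames argument.
hostnames_arg() {
  local IFS=,
  echo "-Dpgx.hostnames=[$*]"
}

# Example: build the argument for two placeholder hosts.
hostnames_arg hostname0 hostname1

# Possible use on a real machine (left commented out on purpose):
#   version="$(java -version 2>&1 | awk -F'"' '/version/ {print $2}')"
#   check_not_root && check_libjvm "$JAVA_HOME" && java8_update_ok "$version" \
#     && ./bin/start-server --dist "$(hostnames_arg hostname0 hostname1)"
```

Passing the hostnames argument in double quotes, as in the commented usage, keeps the shell from treating the square brackets as a glob pattern.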
    

Test the Connection

Connect to the PGX cluster using the PGX Shell. For example,

    cd $PGX_HOME
    ./bin/pgx-jshell --base_url=http://hostname0:7007

In this example, hostname0 is the first host name listed in the hostnames parameter of pgxd.conf. The first machine in the list will act as the leader of the distributed PGX cluster. Note that clients should only connect to the leader machine.
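
Because clients connect only to the leader, the base URL can be derived from the first entry of the host list once the server has logged the ProtocolHandler message. The sketch below assumes the server output was captured in a file. The helper names server_ready and leader_base_url and the log location are hypothetical, and the http scheme and port 7007 are copied from the example above.

```shell
#!/usr/bin/env bash
# Helpers for connecting to the leader. Function names are hypothetical.

# Succeeds if the captured server output contains the ProtocolHandler line.
server_ready() {
  local log_file="$1"
  local port="${2:-7007}"   # port taken from the log message shown on this page
  grep -qF "Starting ProtocolHandler [\"http-nio-${port}\"]" "$log_file"
}

# Prints the base URL of the leader, i.e. the first host name in the list.
leader_base_url() {
  local port="$1"
  shift
  echo "http://$1:${port}"
}

# Example with the placeholder host names used on this page.
leader_base_url 7007 hostname0 hostname1

# Possible use, assuming the server output was redirected to server.log:
#   server_ready server.log && \
#     ./bin/pgx-jshell --base_url="$(leader_base_url 7007 hostname0 hostname1)"
```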