PGX 20.0.2
Documentation

Graph Algorithms (PGX Algorithm)

PGX Algorithm allows you to write your graph algorithm in Java and have it automatically compiled to an efficient parallel implementation targeting either the single-machine (shared-memory) or distributed runtime.

Writing a PGX Algorithm

A PGX Algorithm program is a regular .java file with a single class definition that is annotated with @GraphAlgorithm:

import oracle.pgx.algorithm.annotations.GraphAlgorithm;

@GraphAlgorithm
public class MyAlgorithm {
    ...
}

A PGX Algorithm class must contain exactly one public method which will be used as entry point:

import oracle.pgx.algorithm.PgxGraph;
import oracle.pgx.algorithm.VertexProperty;
import oracle.pgx.algorithm.annotations.GraphAlgorithm;
import oracle.pgx.algorithm.annotations.Out;

@GraphAlgorithm
public class MyAlgorithm {
    public int myAlgorithm(PgxGraph g, @Out VertexProperty<Integer> distance) {
        System.out.println("My first PGX Algorithm program!");

        return 42;
    }
}

Like normal Java methods a PGX Algorithm method can return a value (e.g. an integer in this example). More interesting is the @Out annotation which marks the vertex property distance as output parameter. The caller passes output parameters by reference. This way, the caller has a reference to the modified property after the algorithm terminates.

Creating Collections

To create a collection you call the .create() function. For example, a VertexProperty<Integer> is created as follows:

VertexProperty<Integer> distance = VertexProperty.create();

To get the value of a property at a certain vertex v:

distance.get(v);

Similarly, to set the property of a certain vertex v to a value e:

distance.set(v, e);

You can even create properties of collections:

VertexProperty<VertexSequence> path = VertexProperty.create();

However, PGX Algorithm assignments are always by value (as opposed to by reference). To make this explicit, you must call .clone() when assigning a collection:

VertexSequence sequence = path.get(v).clone();

Another consequence of values being passed by value is that you can check for equality using the == operator instead of Java's .equals():

PgxVertex v1 = G.getRandomVertex();
PgxVertex v2 = G.getRandomVertex();
System.out.println(v1 == v2);

Iteration

The most common operations in PGX Algorithm are iterations (e.g. looping over all vertices, looping over a vertex's neighbors) and traversing the graph (e.g. breath-first/depth-first). All collections expose a forEach and forSequential method by which you can iterate over the collection in parallel and in sequence, respectively. For example, to iterate over a graph's vertices in parallel:

G.getVertices().forEach(v -> {
    ...
});

To iterate over a graph's vertices in sequence:

G.getVertices().forSequential(v -> {
    ...
});

To traverse the graph's vertices from r in breath-first order:

import oracle.pgx.algorithm.Traversal;

Traversal.inBFS(G, r).forward(n -> {
    ...
});

Inside the forward (or backward) lambda you can access the current level of the BFS (or DFS) traversal by calling currentLevel().

Reductions

Within these parallel blocks it is common to atomically update, or reduce to, a variable defined outside the lambda. These atomic reductions are available as methods on Scalar<T>: reduceAdd, reduceMul, reduceAnd, etc. For example, to count the number of vertices in a graph:

public int countVertices() {
    Scalar<Integer> count = Scalar.create(0);

    G.getVertices().forEach(n -> {
        count.reduceAdd(1);
    });

    return count.get();
}

Sometimes you want to update multiple values atomically. For example, you might want to find the smallest property value as well as the vertex whose property value attains this smallest value. Due to the parallel execution, two separate reduction statements might get you in an inconsistent state. To solve this problem the Reductions class provides an argMin and argMax function. The first argument to argMin is the current value and the second argument is the potential new minimum. Additionally, you can chain andUpdate calls on the ArgMinMax object to indicate other variables and the values that they should be updated to (atomically). For example:

VertexProperty<Integer> rank = VertexProperty.create();
int minRank = Integer.MAX_VALUE;
PgxVertex minVertex = PgxVertex.NONE;

G.getVertices().forEach(n ->
    argMin(minRank, rank.get(n)).andUpdate(minVertex, n)
);

Calling a UDF in a PGX Algorithm

Assume the Java class com.acme.Rate contains a static method getRating:

package com.acme;

public class Rate {
    public static int getRating(int x) {
        return 42;
    }
}

We put the Rate class on the classpath and configure the UDF with PGX:

{
  "user_defined_functions": [{
    "namespace": "acme",
    "function_name": "getRating",
    "language": "java",
    "implementation_reference": "com.acme.Rate",
    "return_type": "int",
    "arguments": [{
      "name": "x",
      "type": "int"
    }]
  }]
}

We can now call this UDF in a PGX Algorithm by creating an abstract method and annotating it with @Udf:

package org.company;

import oracle.pgx.algorithm.annotations.GraphAlgorithm;
import oracle.pgx.algorithm.annotations.Udf;

@GraphAlgorithm
public abstract class Example {
  public void example() {
    System.out.println(getRating(2));
  }

  @Udf(namespace = "acme")
  abstract int getRating(int number);
}

The @Udf annotation takes as sole parameter the namespace that the UDF was registered with.

Compiling and Running a PGX Algorithm

To enable the experimental PGX Algorithm compiler you need to set two configuration parameters in conf/pgx.conf:

  • Set the graph_algorithm_language option to JAVA.
  • Set the java_home_dir option to the path to your Java home (use <system-java-home-dir> to have PGX infer Java home from the system properties).
{
  "graph_algorithm_language": "JAVA",
  "java_home_dir": "<system-java-home-dir>"
}

Create a session (either implicitly in the shell or explicitly in Java):

cd $PGX_HOME
./bin/pgx-jshell
import oracle.pgx.algorithm.*;

PgxSession session = Pgx.createSession("my-session");

Compile a PGX Algorithm:

var myAlgorithm = session.compileProgram("/path/to/MyAlgorithm.java")
import oracle.pgx.algorithm.CompiledProgram;

CompiledProgram myAlgorithm = session.compileProgram("/path/to/MyAlgorithm.java");

Finally, run the algorithm:

var graph = session.readGraphWithProperties("/path/to/config.edge.json")
var property = graph.createVertexProperty(PropertyType.INTEGER)
myAlgorithm.run(graph, property)
import oracle.pgx.algorithm.VertexProperty;

PgxGraph graph = session.readGraphWithProperties("/path/to/config.edge.json");
VertexProperty property = graph.createVertexProperty(PropertyType.INTEGER);
myAlgorithm.run(graph, property);

Example: Pagerank

An implementation of pagerank as PGX Algorithm:

import oracle.pgx.algorithm.PgxGraph;
import oracle.pgx.algorithm.Scalar;
import oracle.pgx.algorithm.VertexProperty;
import oracle.pgx.algorithm.annotations.GraphAlgorithm;
import oracle.pgx.algorithm.annotations.Out;

@GraphAlgorithm
public class Pagerank {
  public void pagerank(PgxGraph G, double tol, double damp, int max_iter, boolean norm, @Out VertexProperty<Double> rank) {
    Scalar<Double> diff = Scalar.create();
    int cnt = 0;
    double N = G.getNumVertices();

    rank.setAll(1 / N);
    do {
      diff.set(0.0);
      Scalar<Double> dangling_factor = Scalar.create(0d);

      if (norm) {
        dangling_factor.set(damp / N * G.getVertices().filter(v -> v.getOutDegree() == 0).sum(rank::get));
      }

      G.getVertices().forEach(t -> {
        double in_sum = t.getInNeighbors().sum(w -> rank.get(w) / w.getOutDegree());
        double val = (1 - damp) / N + damp * in_sum + dangling_factor.get();
        diff.reduceAdd(Math.abs(val - rank.get(t)));
        rank.setDeferred(t, val);
      });
      cnt++;
    } while (diff.get() > tol && cnt < max_iter);
  }
}

Resources

For more information, please refer to the PGX Algorithm API JavaDoc.