PGX 21.1.1

Running Built-in Algorithms Using the Analyst API

The Analyst API provides convenient ways to run built-in graph analyses. In particular, for each analysis that is composed of multiple sub-steps, the API provides a single wrapper method.

Obtaining an Analyst Instance

To obtain an Analyst instance, invoke createAnalyst() (create_analyst() in Python) on your session object:

Java:
Analyst analyst = session.createAnalyst();

Python:
analyst = session.create_analyst()

Asynchronous Execution

All methods in the Analyst API support asynchronous execution.

This is a deliberate design choice, because running a graph algorithm on a large graph can take a long time. The application thread invokes the asynchronous variant of an Analyst API method (e.g. pagerankAsync()), which returns a PgxFuture, and may continue to perform other operations while the algorithm is running. If the application thread would rather wait for the result, it can invoke the blocking version directly (e.g. pagerank() instead of pagerankAsync()), which internally calls the get() method on the PgxFuture object. Note: All PGX APIs follow the same design approach of supporting both asynchronous and synchronous invocation.
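The relationship between the asynchronous and the blocking variant can be illustrated without PGX. The following is a minimal sketch using Python's standard concurrent.futures module. The names slow_analysis, run_async and run_blocking are hypothetical stand-ins, not PGX API calls, and the standard Future only plays the role that PgxFuture plays in the text above.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical stand-in for a long-running graph algorithm (not a PGX call).
def slow_analysis(num_vertices: int) -> int:
    time.sleep(0.1)  # simulate work
    return num_vertices * 2

_executor = ThreadPoolExecutor(max_workers=1)

def run_async(num_vertices: int) -> Future:
    """Asynchronous variant: returns immediately with a future."""
    return _executor.submit(slow_analysis, num_vertices)

def run_blocking(num_vertices: int) -> int:
    """Blocking variant: the asynchronous variant plus a wait on the future."""
    return run_async(num_vertices).result()

future = run_async(21)
other_work = sum(range(10))       # the caller is free to do other things
async_result = future.result()    # block only when the result is needed
blocking_result = run_blocking(21)
_executor.shutdown()
```

Defining the blocking variant in terms of the asynchronous one keeps a single code path for the actual work, which is the design the text above describes.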

Analyst Thread Safety

The Analyst object is not guaranteed to be thread-safe, because it may maintain internal state. It is recommended not to share a single Analyst instance across multiple threads.

Running Algorithms and Browsing Results

The Analyst API provides methods for each built-in graph analysis. If the graph analysis requires certain preprocessing steps, all of those preprocessing steps are encapsulated in the Analyst API method.

For instance, the Analyst API provides the following method to count the number of triangles in a graph:

Java:
long countTriangles(PgxGraph graph, boolean sortVerticesByDegree)

Python:
count_triangles(self, graph, sort_vertices_by_degree)

This method is composed of the following sub-steps:

  1. Create an undirected copy of the given graph (the count triangles algorithm is only defined on undirected graphs)

  2. Optionally sort the vertices of the undirected copy by degree. This is a speed versus memory consumption trade-off which can be controlled by the caller via the sortVerticesByDegree flag

  3. Invoke the built-in count triangles algorithm on the graph copy

  4. Return the result

  5. Drop the undirected copy of the graph
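The sub-steps above can be sketched in plain Python, independent of PGX. This is an illustrative, textbook-style triangle count over an edge list, not the PGX implementation. The function name and the edge-list input format are assumptions chosen for the example.

```python
from itertools import combinations

def count_triangles(edges, sort_vertices_by_degree=False):
    """Count triangles in a directed edge list, mirroring the sub-steps above."""
    # Step 1: build an undirected copy (adjacency sets, self-loops dropped).
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Step 2: optionally order vertices by degree; otherwise keep insertion order.
    vertices = list(adj)
    if sort_vertices_by_degree:
        vertices.sort(key=lambda x: len(adj[x]))
    rank = {v: i for i, v in enumerate(vertices)}

    # Step 3: count each triangle once, from its lowest-ranked vertex.
    count = 0
    for v in vertices:
        higher = [w for w in adj[v] if rank[w] > rank[v]]
        for a, b in combinations(higher, 2):
            if b in adj[a]:
                count += 1

    # Steps 4 and 5: return the result; the temporary copy goes out of scope.
    return count

edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 1)]
triangles = count_triangles(edges, sort_vertices_by_degree=True)
```

The flag changes only the order in which vertices are visited, so the count is the same with or without sorting.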

For detailed information, refer to the Analyst class in the javadoc.

Optional Arguments

The Analyst class has overloaded convenience methods which use default values for optional parameters. For example,

Java:
public <ID> VertexProperty<ID, Double> pagerank(PgxGraph graph)

Python:
pagerank(self, graph, tol=0.001, damping=0.85, max_iter=100, norm=False, rank="pagerank")

will use a maximum error (tolerance) of 0.001, a damping factor of 0.85, and a maximum of 100 iterations.
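To show how such defaults come into play, here is a minimal, self-contained PageRank sketch in plain Python that uses the same default values for tol, damping and max_iter. It is a generic power iteration, not the PGX implementation. The edge-list input and the convergence test (sum of absolute rank changes) are assumptions made for the example.

```python
def pagerank(edges, tol=0.001, damping=0.85, max_iter=100):
    """Power-iteration PageRank over a directed edge list."""
    vertices = sorted({v for e in edges for v in e})
    n = len(vertices)
    out = {v: [] for v in vertices}
    for u, v in edges:
        out[u].append(v)

    rank = {v: 1.0 / n for v in vertices}
    for _ in range(max_iter):
        # Rank held by vertices without out-edges is spread evenly.
        dangling = sum(rank[v] for v in vertices if not out[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for u in vertices:
            for v in out[u]:
                new[v] += damping * rank[u] / len(out[u])
        # Stop once the total change drops below the tolerance.
        diff = sum(abs(new[v] - rank[v]) for v in vertices)
        rank = new
        if diff < tol:
            break
    return rank

edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
default_ranks = pagerank(edges)                          # all defaults
tight_ranks = pagerank(edges, tol=1e-9, max_iter=1000)   # two overrides
```

Callers that are satisfied with the defaults pass only the graph, while callers that need a tighter result override individual parameters by name.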

For detailed information, refer to the Analyst class in the javadoc.

Reuse Existing Data Structures

By default, the Analyst creates a new in-memory data structure to hold the result of an algorithm invocation. That data structure is disposed of when the Analyst object is disposed of. However, you can also pass in an existing mutable data structure for the algorithm to write its result into. For example:

Java:
VertexProperty<Long, Double> rank = analyst.pagerank(G); // creates a new 'rank' property
analyst.pagerank(G, rank); // writes result into existing 'rank' property

Python:
rank = analyst.pagerank(G)
analyst.pagerank(G, rank=rank)
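The same "create or reuse" pattern can be sketched in plain Python with an optional output argument. The function degree_centrality and its dict-based graph representation are hypothetical and serve only to show the pattern. They are not part of the PGX API.

```python
def degree_centrality(adj, out=None):
    """Write each vertex's degree into `out`, creating it if not supplied."""
    if out is None:
        out = {}          # default: allocate a fresh result structure
    for vertex, neighbors in adj.items():
        out[vertex] = len(neighbors)
    return out

adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
fresh = degree_centrality(adj)                 # creates a new result
existing = {"a": -1}
reused = degree_centrality(adj, out=existing)  # overwrites the existing one
```

Reusing a structure avoids a second allocation when an analysis is re-run, at the cost of overwriting the previous values.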

Disposing an Analyst Instance

Once the graph analysis is finished, the user can invoke the destroy() method to release all of the resources used by the Analyst. If there are multiple Analyst objects, invoking destroy() on one Analyst instance does not affect the other instances.
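One way to make sure destroy() is always called, even when an analysis raises an error, is to wrap the Analyst in a context manager. The sketch below uses a hypothetical FakeAnalyst stand-in so that it runs without PGX. The managed() helper is also an assumption for illustration and is not provided by the PGX API.

```python
from contextlib import contextmanager

class FakeAnalyst:
    """Hypothetical stand-in that tracks the results it has created."""
    def __init__(self):
        self.results = []
        self.destroyed = False

    def run(self, name):
        self.results.append(name)
        return name

    def destroy(self):
        # Release everything this instance owns; other instances are untouched.
        self.results.clear()
        self.destroyed = True

@contextmanager
def managed(analyst):
    """Yield the analyst and destroy it on exit, even on error."""
    try:
        yield analyst
    finally:
        analyst.destroy()

first, second = FakeAnalyst(), FakeAnalyst()
second.run("pagerank")
with managed(first) as a:
    a.run("triangles")
```

After the with block, first has been destroyed while second still holds its result, matching the independence between instances described above.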