PGX 1.2.0
Documentation

Matching Patterns in Graphs

This tutorial explains how to issue a pattern-matching query against a graph and how to work with the results of that query.

The Dataset

In this tutorial, you will use a dataset which models relationships between politicians, athletes, celebrities, and companies. The dataset is located in <pgx root>/examples/graphs/.

Read the Graph

First, create a session by launching PGX in local mode:

cd $PGX_HOME
./bin/pgx
// starting the shell will create an implicit session
import oracle.pgx.api.*;

...

PgxSession session = Pgx.createSession("my-session");

Next, load a graph into memory:

pgx> G = session.readGraphWithProperties("examples/graphs/connections.adj.json")
import oracle.pgx.api.*;

...

PgxGraph G = session.readGraphWithProperties("examples/graphs/connections.adj.json");

Submit Queries

You can issue a graph pattern matching query in PGQL, an SQL-like declarative language. A PGQL query expresses a pattern of node and edge relationships, together with constraints on the properties of the nodes and edges that should be matched.

To issue a query, use the queryPgql() method of PgxGraph (the type of object you get when you load a graph via the session).

PgqlResultSet queryPgql(String queryString)

Enemy of My Enemy is My Friend

Here, you will find a graph pattern inspired by the ancient proverb "The enemy of my enemy is my friend." Specifically, you will find two entities that are connected by two hops of a feuds relationship. Nodes represent people, clans, or countries, and any two that are feuding with each other are connected by an edge labelled feuds.

Such a query is written in PGQL as follows:

SELECT x.name, z.name
WHERE
    x -[e1 WITH label = 'feuds']-> y,
    y -[e2 WITH label = 'feuds']-> z
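
To make the shape of this pattern concrete, here is a minimal plain-Java sketch that finds the same two-hop matches over a small in-memory edge list. It does not use PGX: the class TwoHopFeudsSketch, its Edge type, and the sample vertices A, B, and C are hypothetical and exist only for illustration. It also makes no attempt to reproduce PGQL's exact matching semantics (for example, whether x and z may be the same vertex).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the PGX API.
final class TwoHopFeudsSketch {

    // A directed, labelled edge between two named vertices.
    static final class Edge {
        final String source;
        final String label;
        final String target;

        Edge(String source, String label, String target) {
            this.source = source;
            this.label = label;
            this.target = target;
        }
    }

    // Returns one [x, z] pair for every match of x -[feuds]-> y -[feuds]-> z.
    static List<List<String>> twoHopFeuds(List<Edge> edges) {
        List<List<String>> matches = new ArrayList<>();
        for (Edge e1 : edges) {
            if (!e1.label.equals("feuds")) {
                continue;
            }
            for (Edge e2 : edges) {
                // The shared variable y requires e1's target to be e2's source.
                if (e2.label.equals("feuds") && e1.target.equals(e2.source)) {
                    matches.add(Arrays.asList(e1.source, e2.target));
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
            new Edge("A", "feuds", "B"),
            new Edge("B", "feuds", "C"),
            new Edge("A", "collaborates", "C")); // ignored: wrong label
        for (List<String> row : twoHopFeuds(edges)) {
            System.out.println(row.get(0) + " " + row.get(1)); // prints "A C"
        }
    }
}
```

The nested loops join e1 and e2 on the shared variable y. A graph engine would normally evaluate the pattern far more efficiently, so the sketch only shows what counts as a match.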

Submit the query to PGX:

pgx> resultSet = G.queryPgql("SELECT x.name, z.name WHERE x -[e1 WITH label = 'feuds']-> y, y -[e2 WITH label = 'feuds']-> z")
import oracle.pgx.api.*;

...

PgqlResultSet resultSet = G.queryPgql("SELECT x.name, z.name WHERE x -[e1 WITH label = 'feuds']-> y, y -[e2 WITH label = 'feuds']-> z");

PgqlResultSet manages the result set of the query. The result set contains multiple results (such a query may match many sub-graphs). Each result consists of a list of result elements. The order of result elements follows the order of variables in the SELECT clause of a query, similar to SQL.

Iterating over the query results means iterating over a set of PgqlResultElement instances. Each PgqlResultElement can provide the type and variable name of the result element.

You can get the list of PgqlResultElement instances as follows:

pgx> resultElements = resultSet.getPgqlResultElements()
import oracle.pgx.api.*;
import java.util.List;

...

List<PgqlResultElement> resultElements = resultSet.getPgqlResultElements();

Get the type and variable name of the first result element:

pgx> resultElement = resultElements.get(0)
pgx> type = resultElement.getElementType() // STRING
pgx> varName = resultElement.getVarName() // x.name
import oracle.pgx.api.*;

...

PgqlResultElement resultElement = resultElements.get(0);
PgqlResultElement.Type type = resultElement.getElementType(); // STRING
String varName = resultElement.getVarName(); // x.name

Iterate over a result set using the for-each style for loop. In the loop, you get a PgqlResult instance which contains a query result.

pgx> resultSet.getResults().each { \
       // the variable 'it' is implicitly declared to reference each PgqlResult instance
     }
import oracle.pgx.api.*;

...

for (PgqlResult result : resultSet.getResults()) {
  ...
}

You can print the result set in textual format using the print method of PgqlResultSet.

pgx> resultSet.print(10) // print the first 10 results
import oracle.pgx.api.*;

...

resultSet.print(10); // print the first 10 results

You can also access the elements of an individual PgqlResult instance (the variable it in the shell loop, result in the Java loop).

By the index of the result element:

pgx> nameX = it.getString(0)
pgx> nameZ = it.getString(1)
import oracle.pgx.api.*;

...

String nameX = result.getString(0);
String nameZ = result.getString(1);

By the variable name of the result element:

pgx> nameX = it.getString("x.name")
pgx> nameZ = it.getString("z.name")
import oracle.pgx.api.*;

...

String nameX = result.getString("x.name");
String nameZ = result.getString("z.name");

You can also get a result element without knowing its type:

pgx> nameX = it.get(0)
// or
pgx> nameX = it.get("x.name")
import oracle.pgx.api.*;

...

Object nameX = result.get(0);
// or
Object nameX = result.get("x.name");

Top 10 Most Collaborative People

Another interesting query finds the top 10 most collaborative people in the graph, in decreasing order of the number of collaborators. This query uses several PGQL features: grouping, aggregating, ordering, and limiting the graph patterns found in the WHERE clause. The following query string expresses it in PGQL.

pgx> resultSet = G.queryPgql("SELECT x.name, COUNT(*) AS num_collaborators WHERE x -[WITH label = 'collaborates']-> () GROUP BY x ORDER BY DESC(num_collaborators) LIMIT 10")
import oracle.pgx.api.*;

...

PgqlResultSet resultSet = G.queryPgql("SELECT x.name, COUNT(*) AS num_collaborators WHERE x -[WITH label = 'collaborates']-> () GROUP BY x ORDER BY DESC(num_collaborators) LIMIT 10");

The above query does the following:

  1. Find all collaboration relationship patterns in the graph.
  2. Group the found patterns by their source vertex.
  3. Apply the count aggregation to each group to find the number of collaborators.
  4. Order the groups by the number of collaborators in decreasing order.
  5. Take only the first 10 results.
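
The five steps above can be sketched in plain Java with the streams API. This is only an illustration of what the query computes, not how PGX evaluates it: the class TopCollaboratorsSketch and the sample vertices a, b, c, and d are hypothetical, and edges are modelled as {source, label, target} string triples. As with COUNT(*), the sketch counts matched edges per source vertex.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of the PGX API.
final class TopCollaboratorsSketch {

    // Each edge is a {source, label, target} triple.
    static Map<String, Long> topCollaborators(List<String[]> edges, int limit) {
        return edges.stream()
            // Step 1: keep only the 'collaborates' edges.
            .filter(e -> e[1].equals("collaborates"))
            // Steps 2 and 3: group by source vertex and count each group.
            .collect(Collectors.groupingBy(e -> e[0], Collectors.counting()))
            .entrySet().stream()
            // Step 4: order by count, largest first (order among ties is unspecified).
            .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
            // Step 5: keep only the first 'limit' groups.
            .limit(limit)
            // A LinkedHashMap preserves the sorted order in the returned map.
            .collect(Collectors.toMap(
                Map.Entry::getKey, Map.Entry::getValue, (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<String[]> edges = Arrays.asList(
            new String[] {"a", "collaborates", "b"},
            new String[] {"a", "collaborates", "c"},
            new String[] {"a", "collaborates", "d"},
            new String[] {"b", "collaborates", "a"},
            new String[] {"b", "collaborates", "c"},
            new String[] {"c", "collaborates", "a"},
            new String[] {"a", "feuds", "b"});
        System.out.println(topCollaborators(edges, 2)); // prints {a=3, b=2}
    }
}
```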

The print() method shows the name and the number of collaborators of the top 10 collaborative people in the graph.

pgx> resultSet.print()
import oracle.pgx.api.*;

...

resultSet.print();

The following appears in the console:

              x.name     num_collaborators
============================================
      Barack%20Obama                    20
       The%20Academy                     6
                 NBC                     5
        John%20Kerry                     5
    Vladimir%20Putin                     5
             Beyonce                     4
      Charlie%20Rose                     4
Kirsten%20Gillibrand                     4
   Hillary%20Clinton                     4
      Pope%20Francis                     4
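
The names in this output are percent-encoded (%20 stands for a space). If you want to display them in readable form, a small helper along the following lines may be enough. The class NameDecodingSketch and its decodeName method are hypothetical, and the sketch assumes the names use standard URL percent-encoding in UTF-8; it relies only on java.net.URLDecoder from the JDK (the Charset overload requires Java 10 or later).

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of the PGX API.
final class NameDecodingSketch {

    // Decodes a percent-encoded name such as "Barack%20Obama".
    // Note: URLDecoder also converts '+' to a space (form-encoding rules).
    static String decodeName(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodeName("Barack%20Obama")); // prints "Barack Obama"
    }
}
```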

For the complete PGQL specification, refer to the PGQL specification. For the complete API, refer to the API reference for graph pattern matching.