Running Graph Pattern Matching Queries (PGQL)

Submitting a Query using PgxGraph

PgxGraph provides methods query_pgql() and execute_pgql() for submitting graph pattern matching queries written in PGQL . Note that INSERT/UPDATE/DELETE queries are only accepted by execute_pgql(), whereas SELECT queries can be run using both methods.

PgxGraph additionally provides clone_and_execute_pgql() method, which first clones the graph, and after executes the query on the cloned graph and returns it, while the original graph remains intact. If the graphName argument is specified (and not None), it will be the name of the returned graph (if the name is taken, the execution will fail). If the name is not specified, it will be automatically generated. The function accepts both SELECT and INSERT/UPDATE/DELETE queries, however in case of a SELECT, the returned graph will be the exact copy of the original one.

Note that clone_and_execute_pgql() is only supported on graphs where the clone() method is supported. For this reason, partitioned graphs cannot be queried using this method.

These methods return a PgqlResultSet that contains the result of the SELECT query, or None in case of INSERT/UPDATE/DELETE queries.

Examples

In the following block, there are some examples for submitting SELECT pattern matching queries to a graph:

1result_set = graph.query_pgql(
2    "SELECT v.age FROM MATCH (v) WHERE v.salary > 1000")
3rs = graph.execute_pgql("SELECT v.age FROM MATCH (v) WHERE v.salary > 1000")

The two above examples are completely equivalent, they give the same result.

INSERT/UPDATE/DELETE queries can be applied in-place for the graph using the execute_pgql() method, for example each persons age in the graph can be increased by one using the following query:

1graph.execute_pgql("UPDATE v SET ( v.age = v.age + 1 ) FROM MATCH (v)")

The same modifications can be run on a new graph instance using the clone_and_execute_pgql() method:

1graph2 = graph.clone_and_execute_pgql(
2    "UPDATE v SET ( v.age = v.age + 1 ) FROM MATCH (v)",
3    "new_graph"
4)

In this case the graph graph will remain untouched by the query, and graph graph2 will be named new_graph and will be a copy of graph but with the update applied.

In case of INSERT, a graph name can be specified as part of the query:

1graph = session.read_graph_with_properties(self.pgql_graph)
2graph = graph.clone()
3tmp = graph.clone_and_execute_pgql(
4    "INSERT INTO \"tmp_graph\" VERTEX v2 LABELS ( Person ) PROPERTIES ( v2.age = v.age + 1 ) \
5        FROM MATCH (v)",
6    "tmp_graph"
7)

The above statement only succeeds because the graph name in the INTO clause matches the graph name provided as argument to clone_and_execute_pgql(). The following query fails because of the mismatch between my_graph and new_graph.

1graph2 = graph.clone_and_execute_pgql(
2    "INSERT INTO \"my_graph\" VERTEX v2 LABELS ( Person ) PROPERTIES ( v2.age = v.age + 1 ) FROM MATCH (v)",
3    "new_graph"
4)

Submitting a Query using PgxSession

PgxSession provides the following methods for submitting pattern matching queries written in PGQL :

1query = "SELECT x.name FROM MATCH (x) ON graph"
2session.query_pgql(query)
1pgqlstring = "SELECT x.name FROM MATCH (x) ON graph"
2session.execute_pgql(pgqlstring)
1session.query_pgql("SELECT x.name FROM MATCH (x) ON tmp_graph")
2session.prepare_pgql(
3    "SELECT x.name FROM MATCH (x) ON tmp_graph WHERE x.age = ?")

The ON clause specifies the graph on which the query will be executed. Only queries with an ON clause can be executed on a PgxSession instance.

Getting Result Element Information from a Result Set

A graph pattern matching result consists of a list of result elements. We allow the following types for an result element.

  • Integer

  • Long

  • Float

  • Double

  • Boolean

  • Date

  • Time

  • Timestamp

  • Time with timezone

  • Timestamp with timezone

  • Vertex

  • Edge

  • Vertex labels

  • Point2D

Since a result element can have any of the above types, we provide a way to get information on each result element. PgqlResultSet has many methods to get a list of result element information.

Creating vertex/edge sets out of a PGQL result set

Users can create a vertex/edge set out of a given PGQL result set. For example, given the result set generated by the following line:

1result_set = graph.query_pgql(
2    "SELECT x FROM MATCH (x) WHERE x.age > 24")

Prepared Statements

Prepared statements provide a robust way of fighting injection and provides additional opportunity for query optimization. The prepared statment class of PGX follows JDBC’s PreparedStatement class very closely (same methods and method names). Furthermore, the PGQL syntax for bind variables uses the question mark symbol (?) in places where literals are allowed, just like in SQL.

An example of a query with three bind variables is as follows:

1query = "SELECT * FROM MATCH (n:Person) WHERE n.name = ? LIMIT ? OFFSET ?"

Bind variables are allowed at any location where a literal is allowed, in the PATH, SELECT, MATCH, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT and OFFSET clauses (i.e. in all the clauses) as well as in PATH pattern definitions.

Note, that currently prepare_pgql() only accepts SELECT and UPDATE queries, INSERT//DELETE queries cannot be prepared.

PgxGraph provides the following method to prepare a PGQL query:

1prepared_stmt = graph.prepare_pgql(
2    "SELECT m.name " + "FROM MATCH (n:Person) -[:?]-> (m:Person) \
3        WHERE n.name = ? AND n.age = ?"
4)
5prepared_stmt.set_string(1, "likes")
6prepared_stmt.set_string(2, "John")
7prepared_stmt.set_int(3, 40)

Similarly, PgxPreparedStatement provides method for executing a prepared PGQL query:

1result_set1 = prepared_stmt.execute()

If a SELECT query was executed, the result set can be retrieved by the following methods:

1result_set2 = prepared_stmt.get_result_set()

In addition to execute(), which works for all types of statements, the following methods can be used to execute a SELECT query and return the ResultSet.

1prepared_stmt.execute_query()

Note that these methods only support SELECT queries.

The following example creates a prepared statement for a query with three bind variables. Bindings for the variables are first set through the PreparedStatement API and then the query gets executed. In the example, the statement is used a second time, to execute another query. The second and third bind values (previously set) are reused, but the first bind variable is changed to a new string value. Then, the prepared statement gets executed again , but this time, using execute() and get_result_set(). Finally, an UPDATE query is executed.

 1prepared_stmt = graph.prepare_pgql(
 2    "SELECT m.name "
 3    + "FROM MATCH (n:Person) -[:?]-> (m:Person) WHERE n.name = ? AND n.age = ?"
 4)
 5prepared_stmt.set_string(1, "likes")
 6prepared_stmt.set_string(2, "John")
 7prepared_stmt.set_int(3, 40)
 8result_set1 = prepared_stmt.execute_query()
 9
10prepared_stmt.set_string(1, "dislikes")
11prepared_stmt.execute()
12result_set2 = prepared_stmt.get_result_set()
13
14prepared_stmt = graph.prepare_pgql(
15    "UPDATE n SET ( n.age = ? ) FROM MATCH (n) WHERE n.name = ?"
16)
17prepared_stmt.set_int(1, 45)
18prepared_stmt.set_string(2, "John")
19prepared_stmt.execute()

When closing a prepared statement, all the result sets associated with it are automatically closed in case they were not closed yet.

The following example shows a PGQL query with an IN predicate, which tests an expression for membership in an array of values. In this case, a bind variable can be placed in the position of the array literal. The method set_array() of PreparedStatement API is used in this example to set the bind variable to the given List of str values.

1prepared_stmt = graph.prepare_pgql(
2    "SELECT n.name FROM MATCH (n:Person) WHERE n.name IN ?"
3)
4arr = ["Emily", "Carl"]
5prepared_stmt.set_array(1, arr)
6prepared_stmt.execute()

Notice that only some of the Java types are allowed for the list of values. Also, the type of the expression that is being tested for membership (the type of n.name in the example) should be compatible with the type of the values in the list (type str in the example). The following table specifies which Java types are allowed and its compatible expression types:

Java type

Expression type

Integer

Integer, Long, Float, Double

Long

Integer, Long, Float, Double

Float

Integer, Long, Float, Double

Double

Integer, Long, Float, Double

Boolean

Boolean

String

String

LocalDate

Date

LocalTime

Time, Time with time zone

OffsetTime

Time, Time with time zone

LocalDateTime

Timestamp, Timestamp with time zone

OffsetDateTime

Timestamp, Timestamp with time zone

Explain the query execution plan

Every query uses an execution plan to guide the way a query is processed. The execution plan of a query can be obtained using the function explain_pgql() (simply replace the function query_pgql() with explain_pgql()). The result of explain_pgql() is the root of the PGQL execution plan tree. The whole tree can be printed by calling print() on this root node. Every node contains information about a single step in the execution plan of the query, e.g., the operator, cardinality estimate and cost estimate.