Running Graph Pattern Matching Queries (PGQL)
Submitting a Query using PgxGraph
PgxGraph
provides methods query_pgql()
and execute_pgql()
for submitting graph pattern matching queries written in PGQL .
Note that INSERT/UPDATE/DELETE queries are only accepted by execute_pgql()
, whereas SELECT queries can be run using both methods.
PgxGraph
additionally provides clone_and_execute_pgql()
method, which first clones the graph, and after executes the query on the cloned graph and returns it, while the original graph remains intact.
If the graphName
argument is specified (and not None
), it will be the name of the returned graph (if the name is taken, the execution will fail).
If the name is not specified, it will be automatically generated.
The function accepts both SELECT and INSERT/UPDATE/DELETE queries, however in case of a SELECT, the returned graph will be the exact copy of the original one.
Note that clone_and_execute_pgql()
is only supported on graphs where the clone()
method is supported.
For this reason, partitioned graphs cannot be queried using this method.
These methods return a PgqlResultSet
that contains the result of the SELECT query, or None
in case of INSERT/UPDATE/DELETE queries.
Examples
In the following block, there are some examples for submitting SELECT pattern matching queries to a graph:
1result_set = graph.query_pgql(
2 "SELECT v.age FROM MATCH (v) WHERE v.salary > 1000")
3rs = graph.execute_pgql("SELECT v.age FROM MATCH (v) WHERE v.salary > 1000")
The two above examples are completely equivalent, they give the same result.
INSERT/UPDATE/DELETE queries can be applied in-place for the graph using the execute_pgql()
method, for example each persons age in the graph can be increased by one using the following query:
1graph.execute_pgql("UPDATE v SET ( v.age = v.age + 1 ) FROM MATCH (v)")
The same modifications can be run on a new graph instance using the clone_and_execute_pgql()
method:
1graph2 = graph.clone_and_execute_pgql(
2 "UPDATE v SET ( v.age = v.age + 1 ) FROM MATCH (v)",
3 "new_graph"
4)
In this case the graph graph
will remain untouched by the query, and graph graph2
will be named new_graph
and will be a copy of graph
but with the update applied.
In case of INSERT
, a graph name can be specified as part of the query:
1graph = session.read_graph_with_properties(self.pgql_graph)
2graph = graph.clone()
3tmp = graph.clone_and_execute_pgql(
4 "INSERT INTO \"tmp_graph\" VERTEX v2 LABELS ( Person ) PROPERTIES ( v2.age = v.age + 1 ) \
5 FROM MATCH (v)",
6 "tmp_graph"
7)
The above statement only succeeds because the graph name in the INTO
clause matches the graph name provided as argument to clone_and_execute_pgql()
.
The following query fails because of the mismatch between my_graph
and new_graph
.
1graph2 = graph.clone_and_execute_pgql(
2 "INSERT INTO \"my_graph\" VERTEX v2 LABELS ( Person ) PROPERTIES ( v2.age = v.age + 1 ) FROM MATCH (v)",
3 "new_graph"
4)
Submitting a Query using PgxSession
PgxSession
provides the following methods for submitting pattern matching queries written in PGQL :
1query = "SELECT x.name FROM MATCH (x) ON graph"
2session.query_pgql(query)
1pgqlstring = "SELECT x.name FROM MATCH (x) ON graph"
2session.execute_pgql(pgqlstring)
1session.query_pgql("SELECT x.name FROM MATCH (x) ON tmp_graph")
2session.prepare_pgql(
3 "SELECT x.name FROM MATCH (x) ON tmp_graph WHERE x.age = ?")
The ON clause specifies the graph on which the query will be executed. Only queries with an ON clause can be executed on a PgxSession instance.
Getting Result Element Information from a Result Set
A graph pattern matching result consists of a list of result elements. We allow the following types for an result element.
Integer
Long
Float
Double
Boolean
Date
Time
Timestamp
Time with timezone
Timestamp with timezone
Vertex
Edge
Vertex labels
Point2D
Since a result element can have any of the above types, we provide a
way to get information on each result element. PgqlResultSet
has
many methods to get a list of result element information.
Print a Result Set
We also provide a print
method for PgqlResultSet
to print out the result set without iterating over each result.
1result_set.print()
Creating vertex/edge sets out of a PGQL result set
Users can create a vertex/edge set out of a given PGQL result set. For example, given the result set generated by the following line:
1result_set = graph.query_pgql(
2 "SELECT x FROM MATCH (x) WHERE x.age > 24")
Prepared Statements
Prepared statements provide a robust way of fighting injection and provides additional opportunity for query optimization.
The prepared statment class of PGX follows JDBC’s PreparedStatement
class very closely (same methods and method names).
Furthermore, the PGQL syntax for bind variables uses the question mark symbol (?
) in places where literals are allowed, just like in SQL.
An example of a query with three bind variables is as follows:
1query = "SELECT * FROM MATCH (n:Person) WHERE n.name = ? LIMIT ? OFFSET ?"
Bind variables are allowed at any location where a literal is allowed, in the PATH, SELECT, MATCH, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT and OFFSET clauses (i.e. in all the clauses) as well as in PATH pattern definitions.
Note, that currently prepare_pgql()
only accepts SELECT and UPDATE queries, INSERT//DELETE queries cannot be prepared.
PgxGraph
provides the following method to prepare a PGQL query:
1prepared_stmt = graph.prepare_pgql(
2 "SELECT m.name " + "FROM MATCH (n:Person) -[:?]-> (m:Person) \
3 WHERE n.name = ? AND n.age = ?"
4)
5prepared_stmt.set_string(1, "likes")
6prepared_stmt.set_string(2, "John")
7prepared_stmt.set_int(3, 40)
Similarly, PgxPreparedStatement
provides method for executing a prepared PGQL query:
1result_set1 = prepared_stmt.execute()
If a SELECT query was executed, the result set can be retrieved by the following methods:
1result_set2 = prepared_stmt.get_result_set()
In addition to execute()
, which works for all types of statements, the following methods can be used to execute a SELECT query and return the ResultSet.
1prepared_stmt.execute_query()
Note that these methods only support SELECT queries.
The following example creates a prepared statement for a query with three bind variables.
Bindings for the variables are first set through the PreparedStatement
API and then the query gets executed.
In the example, the statement is used a second time, to execute another query. The second and third bind values (previously set) are reused, but the first bind variable is changed to a new string value.
Then, the prepared statement gets executed again , but this time, using execute()
and get_result_set()
.
Finally, an UPDATE query is executed.
1prepared_stmt = graph.prepare_pgql(
2 "SELECT m.name "
3 + "FROM MATCH (n:Person) -[:?]-> (m:Person) WHERE n.name = ? AND n.age = ?"
4)
5prepared_stmt.set_string(1, "likes")
6prepared_stmt.set_string(2, "John")
7prepared_stmt.set_int(3, 40)
8result_set1 = prepared_stmt.execute_query()
9
10prepared_stmt.set_string(1, "dislikes")
11prepared_stmt.execute()
12result_set2 = prepared_stmt.get_result_set()
13
14prepared_stmt = graph.prepare_pgql(
15 "UPDATE n SET ( n.age = ? ) FROM MATCH (n) WHERE n.name = ?"
16)
17prepared_stmt.set_int(1, 45)
18prepared_stmt.set_string(2, "John")
19prepared_stmt.execute()
When closing a prepared statement, all the result sets associated with it are automatically closed in case they were not closed yet.
The following example shows a PGQL
query with an IN
predicate, which tests an expression for membership in an array of values. In this case, a bind variable can be placed in the position of the array literal.
The method set_array()
of PreparedStatement
API is used in this example to set the bind variable to the given List
of str
values.
1prepared_stmt = graph.prepare_pgql(
2 "SELECT n.name FROM MATCH (n:Person) WHERE n.name IN ?"
3)
4arr = ["Emily", "Carl"]
5prepared_stmt.set_array(1, arr)
6prepared_stmt.execute()
Notice that only some of the Java types are allowed for the list of values. Also, the type of the expression that is being tested for membership (the type of n.name
in the example) should be compatible with the type of the values in the list (type str
in the example). The following table specifies which Java types are allowed and its compatible expression types:
Java type |
Expression type |
---|---|
Integer |
Integer, Long, Float, Double |
Long |
Integer, Long, Float, Double |
Float |
Integer, Long, Float, Double |
Double |
Integer, Long, Float, Double |
Boolean |
Boolean |
String |
String |
LocalDate |
Date |
LocalTime |
Time, Time with time zone |
OffsetTime |
Time, Time with time zone |
LocalDateTime |
Timestamp, Timestamp with time zone |
OffsetDateTime |
Timestamp, Timestamp with time zone |
Explain the query execution plan
Every query uses an execution plan to guide the way a query is processed.
The execution plan of a query can be obtained using the function explain_pgql()
(simply replace the function query_pgql()
with explain_pgql()
).
The result of explain_pgql()
is the root of the PGQL execution plan tree. The whole tree can be printed by calling print()
on this root node.
Every node contains information about a single step in the execution plan of the query, e.g., the operator, cardinality estimate and cost estimate.