PGX 20.1.1

Documentation

Documentation

Along with graph analytics, *graph pattern matching* is an
important task on graph datasets, which aims to find
patterns of users' interest from a large data graph as subgraphs. It has a wide
range of applicability, such as semantic web search, link prediction, and fraud
detection. For example, given a data graph below where a vertex has
a string property `name`

and an edge has a string property
`relationship`

, a user wants to find `friends of John's friends`

. The
data graph and such an inquiry can be expressed as the following
graphs.

**Figure: Friendship Graph and a query friends of John's friends**

Similar to SQL for relational data, to specify an arbitrary query graph, PGX supports a query language, PGQL, which is designed for graph pattern matching on property graphs. The above example graph query can be written in PGQL as follows:

SELECT friendOfFriend.name FROM MATCH (me:Person) -[:friendOf]-> (friend:Person) , MATCH (friend) -[:friendOf]-> (friendOfFriend) WHERE me.name = 'John'

PGX provides APIs to submit a query in PGQL and fetching query results, which is detailed in the tutorial and API guide. Along with basic pattern matching (or topology matching), PGQL provides additional features including constraints on node, edge, and their properties, ordering results, grouping results, and aggregations. For the full PGQL specification, please refer to PGQL 1.3 Specification.

Even with the same query, the query result varies depending on the desired semantics of structural graph equivalence - the definition of what it means for two graphs to match. PGX supports two representative graph pattern matching semantics -- graph homomorphism and graph isomorphism.

Under *graph homomorphism*, two graphs are considered structurally equivalent
(their patterns match) if, when there is an edge between vertices in the
query, there is an equivalent edge between two matched vertices in the graph.
In this example, there are two subgraphs - the read and blue ones in the
figure - which are solutions to the query. Thus, the answer for
`friends of John's friends`

are `David`

and `John`

himself.

Under *graph isomorphism*, two graphs must not only satisfy the conditions
of graph homomorphism, but also a vertex in the data graph cannot match
multiple query vertices - a vertex in the data being queried may not satisfy
the role of more than one vertex in the query. This eliminates the possibility
of `John`

being one of the "friends of John's friends" - the red sub-graph
in the figure (solution 1) is the only matched sub-graph, and the blue
sub-graph (`David`

) is the only match.

**Figure: Solutions of friends of John's friends**