6.2 Pattern Matching with PGQL
Pattern matching is done by specifying one or more path patterns in the MATCH clause. A single path pattern matches a linear path of vertices and edges, while more complex patterns can be matched by combining multiple path patterns, separated by comma. Value expressions (similar to their SQL equivalents) are specified in the WHERE clause and let you filter out matches, typically by specifying constraints on the properties of the vertices and edges
For example, assume a graph of TCP/IP connections on a computer network, and you want to detect cases where someone logged into one machine, from there into another, and from there into yet another. You would query for that pattern like this:
SELECT id(host1) AS id1, id(host2) AS id2, id(host3) AS id3 /* choose what to return */
FROM MATCH
(host1) -[connection1]-> (host2) -[connection2]-> (host3) /* single linear path pattern to match */
WHERE
connection1.toPort = 22 AND connection1.opened = true AND
connection2.toPort = 22 AND connection2.opened = true AND
connection1.bytes > 300 AND /* meaningful amount of data was exchanged */
connection2.bytes > 300 AND
connection1.start < connection2.start AND /* second connection within time-frame of first */
connection2.start + connection2.duration < connection1.start + connection1.duration
GROUP BY id1, id2, id3 /* aggregate multiple matching connections */
For more examples of pattern matching, see the Writing simple queries section in the PGQL specification.
Parent topic: Property Graph Query Language (PGQL)