6.9.2.2 Performance Considerations for PGQL Queries
The following are some recommended practices for query performance.
- Recursive Queries
- Using Query Optimizer Hints
- Speed Up Query Translation Using Graph Metadata Cache and Translation Cache
Recursive Queries
The following indexes are recommended in order to speed up execution of recursive queries:
- For underlying VERTEX tables of the recursive pattern, an index on the key column
- For underlying EDGE tables of the recursive pattern, an index on the
source key column
Note:
You can also create index on (source key, destination key).
For example, consider the following CREATE PROPERTY GRAPH statement:
CREATE PROPERTY GRAPH people
VERTEX TABLES(
person
KEY ( id )
LABEL person
PROPERTIES( name, age )
)
EDGE TABLES(
knows
key (person1, person2)
SOURCE KEY ( person1 ) REFERENCES person (id)
DESTINATION KEY ( person2 ) REFERENCES person (id)
NO PROPERTIES
)
OPTIONS ( PG_VIEW )
And also consider the following query:
SELECT COUNT(*)
FROM MATCH ANY SHORTEST ( (n:Person) -[e:knows]->* (m:Person) )
WHERE n.id = 1234
In order to improve performance of the recursive part of the preceding query, the following indexes must exist:
CREATE INDEX <INDEX_NAME> ON PERSON(ID)
CREATE INDEX <INDEX_NAME> ON KNOWS(PERSON1)
orCREATE INDEX <INDEX_NAME> ON KNOWS(PERSON1, PERSON2)
Composite Vertex Keys
For composite vertex keys, query execution can be optimized with the creation of function-base indexes on the key columns:
- For underlying VERTEX tables of the recursive pattern, a function-based index on the comma-separated concatenation of key columns
- For underlying EDGE tables of the recursive pattern, a
function-based index on the comma-separated concatenation of source key
columns
Note:
You can also create index on (source key columns, destination key columns).
For example, consider the following CREATE PROPERTY GRAPH statement:
CREATE PROPERTY GRAPH people
VERTEX TABLES(
person
KEY ( id1, id2 )
LABEL person
PROPERTIES( name, age )
)
EDGE TABLES(
knows
key (id)
SOURCE KEY ( id1person1, id2person1 ) REFERENCES person (id1,id2)
DESTINATION KEY ( id1person2, id2person2 ) REFERENCES person (id1,id2)
NO PROPERTIES
)
OPTIONS ( PG_VIEW )
And also consider the following query:
SELECT COUNT(*)
FROM MATCH ANY SHORTEST ( (n:Person) -[e:knows]->* (m:Person) )
WHERE n.id = 1234
In order to improve performance of the recursive part of the preceding query, the following indexes must exist:
CREATE INDEX <INDEX_NAME> ON PERSON (ID1 || ',' || ID2)
CREATE INDEX <INDEX_NAME> ON KNOWS (ID1PERSON1 || ',' || ID2PERSON1)
orCREATE INDEX <INDEX_NAME> ON KNOWS (ID1PERSON1 || ',' || ID2PERSON1, ID1PERSON2 || ',' || ID2PERSON2)
If some of the columns in a composite vertex key is a string column, the column needs to be comma-escaped in the function-base index creation.
For example, if column ID1
in table
PERSON
of the preceding example is of type
VARCHAR2(10)
, you need to escape the comma for the column as
follows:
replace(ID1, ',', '\,')
So, the indexes to improve performance will result as shown:
CREATE INDEX <INDEX_NAME> ON PERSON (replace(ID1, ',', '\,') || ',' || ID2)
CREATE INDEX <INDEX_NAME> ON KNOWS (replace(ID1PERSON1, ',', '\,') || ',' || ID2PERSON1)
Using Query Optimizer Hints
The following hints can be used to influence translation of PGQL variable-length path patterns to SQL:
REVERSE_PATH
: Switches on or off the reverse path optimization (ON
by default). IfON
, it automatically determines if the pattern can best be evaluated from source to destination or from destination to source, based on specified filter predicates.PUSH_SRC_HOPS
: Switches on or off pushing source filter optimization (ON
by default). IfON
, then filter predicates are used to limit the number of source vertices (or destination vertices if path evaluation is reversed) and thereby the search space of variable-length path pattern evaluations.PUSH_DST_HOPS
: Switches on or off pushing destination filter optimization (OFF
by default). IfON
, then filter predicates are used to limit the number of destination vertices (or source vertices if path evaluation is reversed) and thereby the search space of variable-length path pattern evaluations.
The preceding hints can be configured as options
parameter in the
following Java API methods:
executeQuery(String pgql, String options)
translateQuery(String pgql, String options)
execute(String pgql, String matchOptions, String options)
For example, consider the following PGQL query:
SELECT v1.name AS v1, v2.name AS v2, v3.name As v3
FROM MATCH (v1:Person)-[e1:friendOf]->(v2:Person),
MATCH ANY (v2:Person)-[e2:friendOf]->*(v3:Person)
WHERE v1.name= 'Bob'
When the preceding query is executed using the default option for
PUSH_SRC_HOPS
, the output for
start_nodes_translation
displays the filter expression as
shown:
System.out.println(pgqlStatement.translateQuery(pgql).getSqlTranslation())
...
...
start_nodes_translation => (to_clob('SELECT ''PERSONS'' AS "src_table", e1.person_b AS "src_key"
FROM "GRAPHUSER"."PERSONS" "V1", "GRAPHUSER"."FRIENDSHIPS" "E1"
WHERE (((e1.person_a = v1.person_id) AND NOT(e1.person_b IS NULL)) AND (v1.name = ''Bob''))')),
end_nodes_translation => (to_clob('SELECT ''PERSONS'' AS "dst_table", v3.person_id AS "dst_key"
FROM "GRAPHUSER"."PERSONS" "V3"')),
...
...
If the preceding query is executed with the hint
PUSH_SRC_HOPS=F
, then the query is translated into SQL as
shown:
System.out.println(pgqlStatement.translateQuery(pgql,"PUSH_SRC_HOPS=F").getSqlTranslation())
...
...start_nodes_translation => (to_clob('SELECT ''PERSONS'' AS "src_table", v2.person_id AS "src_key"
FROM "GRAPHUSER"."PERSONS" "V2"')),
end_nodes_translation => (to_clob('SELECT ''PERSONS'' AS "dst_table", v3.person_id AS "dst_key"
FROM "GRAPHUSER"."PERSONS" "V3"')),
...
...
Speed Up Query Translation Using Graph Metadata Cache and Translation Cache
The following global caches help to speed up PGQL query translation:
- Graph Metadata Cache: Stores graph metadata such as tables, labels, properties, and so on.
- Translation Cache: Stores PGQL to SQL translation.
You can configure the caches using the following Java APIs:
clearTranslationCache()
disableTranslationCache()
enableTranslationCache()
setTranslationCacheMaxCapacity(int maxCapacity)
clearGraphMetadataCache()
disableGraphMetadataCache()
enableGraphMetadataCache()
setGraphMetadataCacheMaxCapacity(int maxCapacity)
These preceding methods are part of the PgqlConnection
class.
Separate caches are maintained for each database user such that cached objects are
shared between different PgqlConnection
objects if they have the
same connection URL and user underneath.
Parent topic: Executing PGQL Queries Against Property Graph Views