PGX 2.7.0
Documentation

Subgraph loading

PGX supports loading subgraphs by specifying a filter expression in the graph config. This technique combines loading a graph and creating a subgraph from it in a single step.

Configuration

The filter expression is configured in the loading section of the graph config, using the following fields:

Field       Type                Description             Required  Default
expression  string              the filter expression   No        null
type        enum[vertex, edge]  the type of the filter  No        null

See the filter expressions documentation for the syntax of filter expressions.
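
As a rough illustration, a graph config with a vertex filter might be assembled as shown here. The expression and type fields come from the table above. The enclosing filter key, the format and uri entries, and the file name are assumptions made for this sketch and should be checked against the graph config reference.

```python
import json

# Hypothetical graph config. Only "expression" and "type" are taken from the
# table above; the "filter" key, "format", and "uri" are assumed for illustration.
graph_config = {
    "format": "adj_list",            # assumed format identifier
    "uri": "cities.adj",             # hypothetical input file
    "loading": {
        "filter": {                  # assumed name of the enclosing key
            "type": "vertex",
            "expression": "vertex.prop == 10",
        }
    },
}

config_json = json.dumps(graph_config, indent=2)
print(config_json)
```

Building the config as a dictionary and serializing it keeps the filter expression a plain string, so quoting inside the expression is handled by the JSON encoder.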

Supported Operations

Subgraph loading is currently supported for the following data sources only:

  • Adjacency List
  • Oracle Property Graph (RDBMS)
  • Oracle Property Graph (Oracle NoSQL)
  • Oracle Property Graph (Apache HBase)

When loading from an Oracle Property Graph source, all filter expressions are supported. The backends differ only in how the expression is evaluated, which affects performance and memory use. See the Evaluation Strategies section below for details.

When loading from Adjacency List, only the following expressions are allowed:

  • Vertex filter expressions

    • Simple vertex property access, e.g. vertex.prop == 10
    • Vertex property cross constraints, e.g. vertex.prop1 > vertex.prop2
    • The outDegree() function, e.g. vertex.outDegree() > 5
    • Vertex-ID access, e.g. vertex == "New York"
  • Edge filter expressions

    • Simple edge property access, e.g. edge.cost == 10
    • Simple source-vertex property access, e.g. src.prop == 10
    • Property cross constraints, e.g. edge.cost > src.prop
    • The outDegree() function on the source-vertex, e.g. src.outDegree() > 5
    • Source-vertex-ID access, e.g. src == "New York"
    • Destination-vertex-ID access, e.g. dst == "Boston"
  • General

    • Boolean expressions, e.g. edge.cost > 10 AND src.prop == 5

Evaluation Strategies

Adjacency List

For subgraphs loaded from Adjacency List, the filter expression is evaluated line by line. Each line is parsed and the parsed data is held in memory temporarily. The filter expression is then evaluated on the parsed data. If it evaluates to true, the data is added to the final graph. Otherwise, the data is discarded.
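
The line-by-line evaluation described above can be sketched as follows. This is a conceptual illustration only, not PGX's implementation: the line format ('src dst:cost ...'), the Edge type, and the function names are hypothetical, and a Python callable stands in for the edge filter expression edge.cost > 10.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Edge(NamedTuple):
    src: str
    dst: str
    cost: float


def parse_line(line: str) -> list[Edge]:
    """Parse one line of a simplified, hypothetical format: 'src dst:cost dst:cost ...'."""
    src, *targets = line.split()
    edges = []
    for target in targets:
        dst, cost = target.split(":")
        edges.append(Edge(src, dst, float(cost)))
    return edges


def load_filtered(lines: Iterable[str], keep: Callable[[Edge], bool]) -> Iterator[Edge]:
    """Evaluate the filter line by line; only matching edges are retained."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        for edge in parse_line(line):  # parsed data is held only temporarily
            if keep(edge):
                yield edge


lines = [
    "NewYork Boston:12 Chicago:7",
    "Boston NewYork:15",
]
# Stand-in for the edge filter expression `edge.cost > 10`.
subgraph = list(load_filtered(lines, lambda e: e.cost > 10))
```

Because each line is dropped as soon as it has been evaluated, memory use in this sketch grows with the size of the resulting subgraph rather than with the size of the input.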

Oracle Property Graph

Depending on the OPG backend, filter expressions are evaluated with different strategies.

Parts of the filter expression may be handled by the database engine itself, rather than by PGX. This evaluation method is called "push-down evaluation".

Other parts of the filter expression may be evaluated while the data is received, similarly to how it is handled in the case of loading subgraphs from an Adjacency List. This evaluation method is referred to as "streaming evaluation".

Finally, some parts of the filter expression may be evaluated on the fully loaded graph. This method is equivalent to loading the whole graph into memory and then creating a subgraph from it, except that it happens automatically. This evaluation method is called "post-loading evaluation".
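
The difference between streaming and post-loading evaluation can be illustrated with a small sketch. It is a simplified model under stated assumptions, not PGX code: edges are plain tuples, the function names are hypothetical, and push-down evaluation is omitted because it runs inside the database engine.

```python
from typing import Callable, Iterable

Edge = tuple[str, str, float]  # (source, destination, cost)


def streaming_eval(incoming: Iterable[Edge], keep: Callable[[Edge], bool]) -> list[Edge]:
    """Filter while data arrives; rejected edges are never stored."""
    graph = []
    for edge in incoming:
        if keep(edge):
            graph.append(edge)
    return graph


def post_loading_eval(incoming: Iterable[Edge], keep: Callable[[Edge], bool]) -> list[Edge]:
    """Load everything first, then derive the subgraph from the full graph."""
    full_graph = list(incoming)  # the whole graph is held in memory
    return [edge for edge in full_graph if keep(edge)]


edges = [("a", "b", 12.0), ("b", "c", 3.0), ("c", "a", 20.0)]
keep = lambda e: e[2] > 10  # stand-in for `edge.cost > 10`

streamed = streaming_eval(edges, keep)
post_loaded = post_loading_eval(edges, keep)
```

Both functions return the same subgraph for a filter like this one. They differ in peak memory, since the post-loading variant materializes the full graph before filtering.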

Oracle NoSQL and Apache HBase

Currently, when using OPG with Oracle NoSQL or Apache HBase as the backend, the only available evaluation methods are "streaming" and "post-loading". PGX decides automatically which method to use by analyzing the filter expression.

Oracle RDBMS

When using Oracle RDBMS, PGX can use all three evaluation methods.

By default, PGX analyzes the filter expression and uses heuristics to decide automatically which strategy to use for evaluating it.

This behavior can be changed with a configuration parameter in the loading section of the graph config, called filter_strategy. If a filter expression can only be evaluated after loading the full graph, this setting has no effect.

Value   Description
auto    The default. Let PGX decide how to execute the filter expression.
stream  Evaluate the filter expression while receiving the data.
post    Evaluate the filter expression after loading the full graph.
db      Evaluate the filter expression completely in the database.
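
A loading section that sets filter_strategy explicitly might be built as in the sketch below. The four strategy values and the filter_strategy parameter name come from the text above. The helper function, the enclosing filter key, and the exact placement of the fields are assumptions for illustration.

```python
import json

# Values listed in the table above.
FILTER_STRATEGIES = {"auto", "stream", "post", "db"}


def loading_section(expression: str, filter_type: str, strategy: str = "auto") -> dict:
    """Build a hypothetical loading section with an explicit filter strategy."""
    if strategy not in FILTER_STRATEGIES:
        raise ValueError(f"unknown filter_strategy: {strategy!r}")
    return {
        "filter": {"type": filter_type, "expression": expression},  # "filter" key is assumed
        "filter_strategy": strategy,
    }


loading = loading_section("edge.cost > 10", "edge", strategy="stream")
print(json.dumps({"loading": loading}, indent=2))
```

Validating the strategy name before writing the config catches typos early, instead of leaving them to surface when the graph is loaded.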