Executing PGQL Queries Against the Graph Server (PGX)

16 Executing PGQL Queries Against the Graph Server (PGX)

This section describes the Java APIs that are used to execute PGQL queries in the graph server (PGX).

Getting Started with PGQL
Creating Property Graphs Using Options
Supported PGQL Features and Limitations on the Graph Server (PGX)
Java APIs for Executing CREATE PROPERTY GRAPH Statements
Python APIs for Executing CREATE PROPERTY GRAPH Statements
Java APIs for Executing SELECT Queries
This section describes the APIs to execute SELECT queries in the graph server (PGX).
Java APIs for Executing UPDATE Queries
The UPDATE queries make changes to existing graphs using the INSERT, UPDATE, and DELETE operations as detailed in the section Graph Modification of the PGQL 1.3 specification.
PGQL Queries with Partitioned IDs
Security Tools for Executing PGQL Queries
To safeguard against query injection, bind variables can be used in place of literals while printIdentifier(String identifier) can be used in place of identifiers like graph names, labels, and property names.
Best Practices for Tuning PGQL Queries
This section describes best practices regarding memory allocation, parallelism, and query planning.

Parent topic: Using the Graph Server (PGX)

16.1 Getting Started with PGQL

This section provides an example on how to get started with PGQL. It assumes a database realm that has been previously set up (follow the steps in Prepare the Graph Server for Database Authentication). It also assumes that the user has read access to the HR schema.

First, create a graph with employees, departments, and employee works at department, by executing a CREATE PROPERTY GRAPH statement.

Example 16-1 Creating a graph in the graph server (PGX)

The following statement creates a graph in the graph server (PGX)

String statement =
      "CREATE PROPERTY GRAPH hr_simplified "
    + "  VERTEX TABLES ( "
    + "    hr.employees LABEL employee "
    + "      PROPERTIES ARE ALL COLUMNS EXCEPT ( job_id, manager_id, department_id ), "
    + "    hr.departments LABEL department "
    + "      PROPERTIES ( department_id, department_name ) "
    + "  ) "
    + "  EDGE TABLES ( "
    + "    hr.employees AS works_at "
    + "      SOURCE KEY ( employee_id ) REFERENCES employees (employee_id) "
    + "      DESTINATION departments "
    + "      PROPERTIES ( employee_id ) "
    + "  )";
session.executePgql(statement);

/**
 * To get a handle to the graph, execute:
 */
PgxGraph g = session.getGraph("HR_SIMPLIFIED");

/**
 * You can use this handle to run PGQL queries on this graph.
 * For example, to find the department that “Nandita Sarchand” works for, execute:
 */
String query =
    "SELECT dep.department_name "
  + "FROM MATCH (emp:Employee) -[:works_at]-> (dep:Department) "
  + "WHERE emp.first_name = 'Nandita' AND emp.last_name = 'Sarchand' "
  + "ORDER BY 1";
PgqlResultSet resultSet = g.queryPgql(query);
resultSet.print();
+-----------------+
| department_name |
+-----------------+
| Shipping        |
+-----------------+

/**
 * To get an overview of the types of vertices and their frequencies, execute:
 */
String query =
      "SELECT label(n), COUNT(*) "
    + "FROM MATCH (n) "
    + "GROUP BY label(n) "
    + "ORDER BY COUNT(*) DESC";
PgqlResultSet resultSet = g.queryPgql(query);
resultSet.print();

+-----------------------+
| label(n)   | COUNT(*) |
+-----------------------+
| EMPLOYEE   | 107      |
| DEPARTMENT | 27       |
+-----------------------+

/**
  *To get an overview of the types of edges and their frequencies, execute:
  */
 String query =
    "SELECT label(n) AS srcLbl, label(e) AS edgeLbl, label(m) AS dstLbl, COUNT(*) "
  + "FROM MATCH (n) -[e]-> (m) "
  + "GROUP BY srcLbl, edgeLbl, dstLbl "
  + "ORDER BY COUNT(*) DESC";
PgqlResultSet resultSet = g.queryPgql(query);
resultSet.print();

+---------------------------------------------+
| srcLbl   | edgeLbl  | dstLbl     | COUNT(*) |
+---------------------------------------------+
| EMPLOYEE | WORKS_AT | DEPARTMENT | 106      |
+---------------------------------------------+

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.2 Creating Property Graphs Using Options

Using the OPTIONS clause in the CREATE PROPERTY GRAPH statement, you can specify any of the options explained in the following sections:

Using Graph Optimization Options

You can load a graph for querying and analytics or for performing update operations. Depending on your requirement, you can optimize the read or update performance using the OPTIONS clause in the CREATE PROPERTY GRAPH statement.

The following table describes the valid options that are supported in the OPTIONS clause:

Table 16-1 Graph Optimization Options

OPTIONS	Description
`OPTIMIZED_FOR_READ`	This can be used for read-intensive scenarios.
`OPTIMIZED_FOR_UPDATES`	This is the default option and can be used for fast updates.
`SYNCHRONIZABLE`	This assures that the graph can be synchronized via Flashback Technology. However, exceptions are thrown if one of the edge keys is either composite or non-numeric. In these cases, the graph can normally still be loaded, but PGX generates a new (numeric and non-composite) edge key. Such edges can therefore not be synchronized with the database.

For example, the following graph is set using OPTIMIZED_FOR_UPDATES and SYNCHRONIZABLE options:

CREATE PROPERTY GRAPH hr 
VERTEX TABLES ( 
employees LABEL employee, departments LABEL department 
) 
EDGE TABLES ( 
departments AS managed_by 
SOURCE KEY ( department_id ) REFERENCES departments (department_id)
DESTINATION employees 
NO PROPERTIES 
) OPTIONS (OPTIMIZED_FOR_UPDATES, SYNCHRONIZABLE)

Note:

SYNCHRONIZABLE option can be used in combination with OPTIMIZED_FOR_UPDATES and OPTIMIZED_FOR_READ. But, OPTIMIZED_FOR_UPDATES and OPTIMIZED_FOR_READ cannot be used together and in such a case an exception will be thrown.
If you are creating a synchronizable graph, then ensure that the vertex and edge keys are numeric and non-composite.

Using Options to Handle Edges with Missing Vertices

If either the source or destination vertex or both are missing for an edge, then you can configure one of the following values in the OPTIONS clause in the CREATE PROPERTY GRAPH statement:

IGNORE EDGE ON MISSING VERTEX: Specifies that the edge for a missing vertex must be ignored.
IGNORE EDGE AND LOG ON MISSING VERTEX: Specifies that the edge for a missing vertex must be ignored and all ignored edges must be logged.
IGNORE EDGE AND LOG ONCE ON MISSING VERTEX: Specifies that the edge for a missing vertex must be ignored and only the first ignored edge must be logged.
ERROR ON MISSING VERTEX (default): Specifies that an error must be thrown for edges with missing vertices.

For example, the following graph is set using ERROR ON MISSING VERTEX option:

CREATE PROPERTY GRAPH region_graph 
VERTEX TABLES ( 
regions KEY (region_id), 
countries KEY (country_id)
) 
EDGE TABLES ( 
countries AS countries_regions 
SOURCE KEY ( country_id ) REFERENCES countries(country_id) 
DESTINATION KEY (region_id) REFERENCES regions(region_id) 
NO PROPERTIES 
) OPTIONS ( ERROR ON MISSING VERTEX)

On execution, the following error response is shown:

unknown vertex ID received in destination 4 of edge 5

When using IGNORE EDGE AND LOG ON MISSING VERTEX or IGNORE EDGE AND LOG ONCE ON MISSING VERTEX option, you must update the default Logback configuration file in /etc/oracle/graph/logback.xml and the graph server (PGX) logger configuration file in /etc/oracle/graph/logback-server.xml to log the DEBUG logs. Only then you can view the ignored edges in /var/opt/log/pgx-server.log file.

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.3 Supported PGQL Features and Limitations on the Graph Server (PGX)

The following table provides the complete list of supported and unsupported PGQL functionalities for the graph server (PGX):

Table 16-2 Supported PGQL Functionalities and Limitations on the Graph Server (PGX)

Features	PGQL on the Graph Server (PGX)
`CREATE PROPERTY GRAPH`	Supported Limitations: No composite keys for vertices
`DROP PROPERTY GRAPH`	Not Supported
Fixed-length pattern matching	Supported
Variable-length pattern matching goals	Supported: Reachability `ANY` `ANY SHORTEST` `TOP k SHORTEST` `ALL SHORTEST` `ANY CHEAPEST` `TOP k CHEAPEST` `ALL`
Variable-length pattern matching quantifiers	Supported: * + ? { n } { n, } { n, m } { , m } Limitations: ? is only supported for reachability In case of `ANY CHEAPEST` and `TOP k CHEAPEST`, only * is supported
Variable-length path unnesting	Supported: `ONE ROW PER VERTEX` `ONE ROW PER STEP` Limitation: * quantifier is not supported
`GROUP BY`	Supported
`HAVING`	Supported
Aggregations	Supported: `COUNT` `MIN`, `MAX`, `AVG`, `SUM` `LISTAGG` `ARRAY_AGG` Limitations: `ARRAY_AGG` is only supported as horizontal aggregation (in combination with variable-length path) but not in combination with vertical aggregation
`DISTINCT` `SELECT DISTINCT` Aggregation with `DISTINCT` (such as, `COUNT(DISTINCT e.prop))`	Supported
`SELECT v.*`	Supported
`ORDER BY (+ASC/DESC), LIMIT, OFFSET`	Supported
Data Types	Supported: `INTEGER (32-bit)` `LONG (64-bit)` `FLOAT (32-bit)` `DOUBLE (64-bit)` `STRING` (no maximum length) `BOOLEAN` `DATE` `TIME` `TIME WITH TIME ZONE` `TIMESTAMP` `TIMESTAMP WITH TIME ZONE`
JSON	No built-in JSON support. However, JSON values can be stored as `STRING` and manipulated or queried through user-defined functions (UDFs) written in Java or JavaScript.
Operators	Supported: Relational: `+`, `-`, `*`, `/`, `%`, `-` (unary minus) Arithmetic: `=`, `<>`, `<`, `>`, `<=`, `>=` Logical: `AND`, `OR`, `NOT` String: `\|\|` (concat)
Functions and predicates	Supported: `IS NULL`, `IS NOT NULL` `JAVA_REGEXP_LIKE` (based on `CONTAINS`) `LOWER`, `UPPER` `SUBSTRING` `ABS`, `CEIL/CEILING`, `FLOOR`, `ROUND` `EXTRACT` `ID` `LABEL`, `LABELS`, `HAS_LABEL` `ALL_DIFFERENT` `IN_DEGREE`, `OUT_DEGREE` `CAST` `CASE` `IN` and `NOT IN` `MATCH_NUMBER` `ELEMENT_NUMBER`
User-defined functions	Supported: Java UDFs JavaScript UDFs
Subqueries: Scalar subqueries `EXISTS` and `NOT EXISTS` subqueries	Supported
`INSERT/UPDATE/DELETE`	Supported
`INTERVAL` literals and operations	Supported literals: `SECOND` `MINUTE` `HOUR` `DAY` `MONTH` `YEAR` Supported operations: Add `INTERVAL` to datetime (+) Subtract `INTERVAL` from datetime (-)

Also, the following explains certain supported and unsupported PGQL features:

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.3.1 Support for Selecting All Properties

You can use SELECT v.* to select all properties of the vertices or edges that bind to the variable v. For example:

SELECT label(n), n.* FROM MATCH (n) ORDER BY "number", "name"

On execution, the query output is as shown:

+-----------------------------+
| label(n) | number | name    |
+-----------------------------+
| Account  | 1001   | <null>  |
| Account  | 2090   | <null>  |
| Account  | 8021   | <null>  |
| Account  | 10039  | <null>  |
| Person   | <null> | Camille |
| Person   | <null> | Liam    |
| Person   | <null> | Nikita  |
| Company  | <null> | Oracle  |
+-----------------------------+

You can use label expressions to select properties that belong to the specified vertex or edge labels. For example:

SELECT label(n), n.* FROM MATCH (n:Person) ORDER BY "name"

The preceding query retrieves all the properties for the specified Person label:

+--------------------+
| label(n) | name    |
+--------------------+
| Person   | Camille |
| Person   | Liam    |
| Person   | Nikita  |
+--------------------+

You can also specify a PREFIX to avoid duplicate column names in cases where you select all properties using multiple variables. For example:

SELECT n.* PREFIX 'n_', e.* PREFIX 'e_', m.* PREFIX 'm_'
FROM MATCH (n:Account) -[e:transaction]-> (m:Account)
ORDER BY "e_amount"

The query output is as shown:

+--------------------------------+
| n_number | e_amount | m_number |
+--------------------------------+
| 10039    | 1000.0   | 8021     |
| 8021     | 1500.3   | 1001     |
| 8021     | 3000.7   | 1001     |
| 2090     | 9900.0   | 10039    |
| 1001     | 9999.5   | 2090     |
+--------------------------------+

Parent topic: Supported PGQL Features and Limitations on the Graph Server (PGX)

16.3.2 Unnesting of Variable-Length Path Queries

Unnesting of variable-length path queries (such as, SHORTEST or CHEAPEST paths) to obtain a separate row for each vertex or edge along a path is supported.

You can unnest a path aggregation using one of the following options:

ONE ROW PER MATCH (default option)
ONE ROW PER VERTEX(vertex_variable)
ONE ROW PER STEP(edge_source_variable,edge_variable,edge_destination_variable)

For example, the following PGQL query uses the ONE ROW PER STEP option:

SELECT  v1.ACCT_ID AS src_no, k.TXN_AMOUNT, v2.ACCT_ID AS dest_no 
FROM MATCH ALL SHORTEST (a:Accounts) -[e:transfers]->+ (b:Accounts)
ONE ROW PER STEP( v1,k,v2 )
WHERE a.ACCT_ID = 284 AND b.ACCT_ID = 616

It is important to note that the ONE ROW PER STEP option only supports paths with a minimal hop greater than 0 and hence * quantifier is not supported with this option.

On execution, the preceding query retrieves one row for every edge on the path that is bound by the corresponding source and destination vertices:

+-------------------------------+
| src_no | TXN_AMOUNT | dest_no |
+-------------------------------+
| 744    | 1000.0     | 616     |
| 772    | 1000.0     | 744     |
| 284    | 1000.0     | 772     |
| 744    | 1000.0     | 616     |
| 772    | 1500.0     | 744     |
| 284    | 1000.0     | 772     |
+-------------------------------+

You can also use the Graph Visualization tool to visualize edges using ONE ROW PER STEP along a path:

Figure 16-1 Visualizing Unnesting of Variable-Length Path Queries

Description of "Figure 16-1 Visualizing Unnesting of Variable-Length Path Queries"

An example for a query with the ONE ROW PER VERTEX option is as follows:

SELECT k.acct_id AS id, k.acct_name AS name
FROM MATCH ANY SHORTEST (a:Accounts) ((src:Accounts)-[e:transfers]->){1,3}(b:Accounts)
ONE ROW PER VERTEX(k)
WHERE a.acct_id=284 AND b.acct_id=616

On execution, the preceding query retrieves one row per vertex along a path:

+----------------+
| id  | name     |
+----------------+
| 616 | Account4 |
| 744 | Account3 |
| 772 | Account2 |
| 284 | Account1 |
+---------------+

Built-in Function Support for Recursive Path Unnesting Queries

PGQL supports the following two built-in functions, which can be used in combination with any of the path unnesting option (ONE ROW PER VERTEX, ONE ROW PER STEP or ONE ROW PER MATCH):

MATCH_NUMBER(k): Returns a unique per-path identifier for each unnested path (that is, if two rows come from the same path, they have the same MATCH_NUMBER(k)).
ELEMENT_NUMBER(k): Returns the element number of a vertex or an edge along a path. Vertices are numbered with odd numbers, the leftmost vertex is numbered 1, the second 3, then 5 and so on. Edges are assigned with even numbers, starting with 2 for the leftmost edge, 4 for the next one, and so on.

For example, the following PGQL query uses the MATCH_NUMBER(k) and ELEMENT_NUMBER(k) functions with ONE ROW PER VERTEX option:

SELECT k.*, match_number(k), element_number(k)
FROM MATCH ANY SHORTEST (a:Accounts) -[e:transfers]->* (b:Accounts) ONE ROW PER VERTEX ( k )
WHERE a.acct_id = 284 AND b.acct_id = 616

The preceding query produces the following output on execution. Note that the element_number(k) returned for the vertices are odd numbered values. Since the preceding query uses ANY path pattern, there is only one arbitrary path displayed in the output. Therefore match_number(k) is the same for all the rows in the path.

+-----------------------------------------------------------+
| ACCT_ID | ACCT_NAME | match_number(k) | element_number(k) |
+-----------------------------------------------------------+
| 616     | Account   | 0               | 7                 |
| 744     | Account   | 0               | 5                 |
| 772     | Account   | 0               | 3                 |
| 284     | Account   | 0               | 1                 |
+-----------------------------------------------------------+

The following example shows a PGQL query using MATCH_NUMBER(k) and ELEMENT_NUMBER(k) functions with ONE ROW PER STEP option:

SELECT v1.acct_id AS src_no,k.txn_amount,v2.acct_id AS dest_no, match_number(k), element_number(k)
FROM MATCH ALL SHORTEST (a:Accounts) -[e:transfers]->+ (b:Accounts)
ONE ROW PER STEP( v1,k,v2 )
WHERE a.acct_id = 284 AND b.acct_id = 616

The preceding query output is as shown. Note that there are two paths identified by match_number(k)and the edges are displayed with even numbered element_number(k) values.

+---------------------------------------------------------------------+
| src_no | txn_amount | dest_no | match_number(k) | element_number(k) |
+---------------------------------------------------------------------+
| 744    | 1000.0     | 616     | 0               | 6                 |
| 772    | 1000.0     | 744     | 0               | 4                 |
| 284    | 1000.0     | 772     | 0               | 2                 |
| 744    | 1000.0     | 616     | 1               | 6                 |
| 772    | 1500.0     | 744     | 1               | 4                 |
| 284    | 1000.0     | 772     | 1               | 2                 |
+---------------------------------------------------------------------+

Parent topic: Supported PGQL Features and Limitations on the Graph Server (PGX)

16.3.3 Using INTERVAL Literals in PGQL Queries

You can use INTERVAL literals in PGQL queries to add or subtract intervals to or from PGQL temporal data types respectively.

See the PGQL 1.5 Specification for the supported temporal data types.

An INTERVAL type is a period of time, which consists of the keyword "INTERVAL" followed by a numeral and a temporal unit. For example, INTERVAL '1' DAY.

The following table shows the valid temporal units that are supported in INTERVAL values:

Table 16-3 Valid values for fields in INTERVAL values

Keyword	Supported Valid Values
`YEAR`	Unconstrained except by <interval leading field precision>
`MONTH`	Months (within years) (0-11)
`DAY`	Unconstrained except by <interval leading field precision>
`HOUR`	Hours (within days) (0-23)
`MINUTE`	Minutes (within hours) (0-59)
`SECOND`	Seconds (within minutes) (0-59.999...)

The following INTERVAL operations are supported on a temporal data type:

TEMPORAL TYPE + INTERVAL
INTERVAL + TEMPORAL TYPE
TEMPORAL TYPE - INTERVAL

For example, the following PGQL query retrieves persons where n.birthdate + INTERVAL '20' YEAR > TIMESTAMP '2000-01-01 00:00:00':

opg4j> graph.queryPgql("SELECT n.name, n.birthdate FROM MATCH (n:Person) WHERE n.birthdate + INTERVAL '20' YEAR > TIMESTAMP '2000-01-01 00:00:00'").print()

graph.queryPgql("SELECT n.name, n.birthdate FROM MATCH (n:Person) WHERE n.birthdate + INTERVAL '20' YEAR > TIMESTAMP '2000-01-01 00:00:00'").print();

graph.query_pgql("SELECT n.name, n.birthdate FROM MATCH (n:Person) WHERE n.birthdate + INTERVAL '20' YEAR > TIMESTAMP '2000-01-01 00:00:00'").print()

On execution, the query output is as shown:

+--------------------------+
| name  | birthdate        |
+--------------------------+
| Mary  | 1982-09-25T00:00 |
| Alice | 1987-02-01T00:00 |
+--------------------------+

Parent topic: Supported PGQL Features and Limitations on the Graph Server (PGX)

16.3.4 Limitations on Quantifiers

Although all quantifiers such as *, +, and {1,4} are supported for reachability and shortest path patterns, the only quantifier that is supported for cheapest path patterns is * (zero or more).

Parent topic: Supported PGQL Features and Limitations on the Graph Server (PGX)

16.3.5 Limitations on WHERE and COST Clauses in Quantified Patterns

The WHERE and COST clauses in quantified patterns, such as reachability patterns or shortest and cheapest path patterns, are limited to referencing a single variable only.

The following are examples of queries that are not supported because the WHERE or COST clauses reference two variables e and x instead of zero or one:

... PATH p AS (n) –[e]-> (m) WHERE e.prop > m.prop ...
... SHORTEST ( (n) (-[e]-> (x) WHERE e.prop + x.prop > 10)* (m) ) ...
... CHEAPEST ( (n) (-[e]-> (x) COST e.prop + x.prop )* (m) ) ...

The following query is supported because the subquery only references a single variable a from the outer scope, while the variable c does not count since it is newly introduced in the subquery:

... PATH p AS (a) -> (b)
      WHERE EXISTS ( SELECT * FROM MATCH (a) -> (c) ) ...

Parent topic: Supported PGQL Features and Limitations on the Graph Server (PGX)

16.4 Java APIs for Executing CREATE PROPERTY GRAPH Statements

The easiest way to execute a CREATE PROPERTY GRAPH statement is through the PgxSession.executePgql(String statement) method.

Example 16-2 Executing a CREATE PROPERTY GRAPH statement

String statement =
      "CREATE PROPERTY GRAPH hr_simplified "
    + "  VERTEX TABLES ( "
    + "    hr.employees LABEL employee "
    + "      PROPERTIES ARE ALL COLUMNS EXCEPT ( job_id, manager_id, department_id ), "
    + "    hr.departments LABEL department "
    + "      PROPERTIES ( department_id, department_name ) "
    + "  ) "
    + "  EDGE TABLES ( "
    + "    hr.employees AS works_at "
    + "      SOURCE KEY ( employee_id ) REFERENCES employees (employee_id) "
    + "      DESTINATION departments "
    + "      PROPERTIES ( employee_id ) "
    + "  )";
session.executePgql(statement);
PgxGraph g = session.getGraph("HR_SIMPLIFIED");

/**
 * Alternatively, one can use the prepared statement API, for example:
 */

PgxPreparedStatement stmnt = session.preparePgql(statement);
stmnt.execute();
stmnt.close();
PgxGraph g = session.getGraph("HR_SIMPLIFIED");

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.5 Python APIs for Executing CREATE PROPERTY GRAPH Statements

You can create a property graph by executing the CREATE PROPERTY GRAPH statement through the Python API.

Creating a Property Graph Using the Python Client

Launch the Python client:

./bin/opg4py --base_url https://localhost:7007 --user customer_360

Define and execute the CREATE PROPERTY GRAPH statement as shown:

statement = (
       "CREATE PROPERTY GRAPH "+ "<graph_name>" + " " +
       "VERTEX TABLES ( " +
       "bank_accounts " +
       "KEY(acct_id) " +
       "LABEL Account PROPERTIES (acct_id) " +
       ")" +
       "EDGE TABLES ( " +
       "bank_txns " +
       "KEY (txn_id) " +
       "SOURCE KEY (from_acct_id) REFERENCES bank_accounts (acct_id) " +
       "DESTINATION KEY (to_acct_id) REFERENCES bank_accounts (acct_id) " +
       "LABEL Transfer PROPERTIES(amount) " +
       ")")
>>> session.prepare_pgql(statement).execute()

where <graph_name> is the name of the graph.

The graph gets created and you can verify through the get_graph method:

>>> graph = session.get_graph("<graph_name>")
>>> graph
PgxGraph(name:<graph_variable>, v: 1000, e: 5001, directed: True, memory(Mb): 0)

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.6 Java APIs for Executing SELECT Queries

This section describes the APIs to execute SELECT queries in the graph server (PGX).

Executing SELECT Queries Against a Graph in the Graph Server (PGX)
The PgxGraph.queryPgql(String query) method executes the query in the current session. The method returns a PgqlResultSet.
Executing SELECT Queries Against a PGX Session
The PgxSession.queryPgql(String query) method executes the given query in the session and returns a PgqlResultSet.
Iterating Through a Result Set
There are two ways to iterate through a result set: in a JDBC-like manner or using the Java Iterator interface.
Printing a Result Set
The following methods of PgqlResultSet (package oracle.pgx.api) are used to print a result set:

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.6.1 Executing SELECT Queries Against a Graph in the Graph Server (PGX)

The PgxGraph.queryPgql(String query) method executes the query in the current session. The method returns a PgqlResultSet.

The ON clauses inside the MATCH clauses can be omitted since the query is executed directly against a PGX graph. For the same reason, the INTO clauses inside the INSERT clauses can be omitted. However, if you want to explicitly specify graph names in the ON and INTO clauses, then those graph names have to match the actual name of the graph (PgxGraph.getName()).

Parent topic: Java APIs for Executing SELECT Queries

16.6.2 Executing SELECT Queries Against a PGX Session

The PgxSession.queryPgql(String query) method executes the given query in the session and returns a PgqlResultSet.

The ON clauses inside the MATCH clauses, and the INTO clauses inside the INSERT clauses, must be specified and cannot be omitted. At this moment, all the ON and INTO clauses of a query need to reference the same graph since joining data from multiple graphs in a single query is not yet supported.

Parent topic: Java APIs for Executing SELECT Queries

16.6.3 Iterating Through a Result Set

There are two ways to iterate through a result set: in a JDBC-like manner or using the Java Iterator interface.

For JDBC-like iterations, the methods in PgqlResultSet (package oracle.pgx.api) are similar to the ones in java.sql.ResultSet. A noteworthy difference is that PGQL's result set interface is based on the new date and time library that was introduced in Java 8, while java.sql.ResultSet is based on the legacy java.util.Date. To bridge the gap, PGQL's result set provides getLegacyDate(..) for applications that still use java.util.Date.

A PgqlResultSet has a cursor that is initially set before the first row. Then, the following methods are available to reposition the cursor:

next() : boolean
previous() : boolean
beforeFirst()
afterLast()
first() : boolean
last() : boolean
absolute(long row) : boolean
relative(long rows) : boolean

After the cursor is positioned at the desired row, the following getters are used to obtain values:

getObject(int columnIdx) : Object
getObject(String columnName) : Object
getString(int columnIdx) : String
getString(String columnName) : String
getInteger(int columnIdx) : Integer
getInteger(String columnName) : Integer
getLong(int columnIdx) : Long
getLong(String columnName) : Long
getFloat(int columnIdx) : Float
getFloat(String columnName) : Float
getDouble(int columnIdx) : Double
getDouble(String columnName) : Double
getBoolean(int columnIdx) : Boolean
getBoolean(String columnName) : Boolean
getVertexLabels(int columnIdx) : Set<String>
getVertexLabels(String columnName) : Set<String>
getDate(int columnIdx) : LocalDate
getDate(String columnName) : LocalDate
getTime(int columnIdx) : LocalTime
getTime(String columnName) : LocalTime
getTimestamp(int columnIdx) : LocalDateTime
getTimestamp(String columnName) : LocalDateTime
getTimeWithTimezone(int columnIdx) : OffsetTime
getTimeWithTimezone(String columnName) : OffsetTime
getTimestampWithTimezone(int columnIdx) : OffsetDateTime
getTimestampWithTimezone(String columnName) : OffsetDateTime
getLegacyDate(int columnIdx) : java.util.Date
getLegacyDate(String columnName) : java.util.Date
getVertex(int columnIdx) : PgxVertex<ID>
getVertex(String columnName) : PgxVertex<ID>
getEdge(int columnIdx) : PgxEdge
getEdge(String columnName) : PgxEdge

See the Java Documentation for more details.

Finally, there is a PgqlResultSet.close() which releases the result set’s resources, and there is a PgqlResultSet.getMetaData() through which the column names and column count can be retrieved.

An example for result set iteration is as follows:

PgqlResultSet resultSet = g.queryPgql(
    "   SELECT owner.name AS account_holder, SUM(t.amount) AS total_transacted_with_Nikita "
  + "     FROM MATCH (p:Person) -[:ownerOf]-> (account1:Account) " 
  + "        , MATCH (account1) -[t:transaction]- (account2) "
  + "        , MATCH (account2:Account) <-[:ownerOf]- (owner:Person|Company) "
  + "    WHERE p.name = 'Nikita' "
  + " GROUP BY owner");

while (resultSet.next()) {
  String accountHolder = resultSet.getString(1);
  long totalTransacted = resultSet.getLong(2);
  System.out.println(accountHolder + ": " + totalTransacted);
}

resultSet.close();

The output of the above example will look like:

Oracle: 4501
Camille: 1000

In addition, the PgqlResultSet is also iterable via the Java Iterator interface. An example of a “for each loop” over the result set is as follows:

for (PgxResult result : resultSet) {
  String accountHolder = result.getString(1);
  long totalTransacted = result.getLong(2);
  System.out.println(accountHolder + ": " + totalTransacted);
}

The output of the above example will look like:

Oracle: 4501
Camille: 1000

Note that the same getters that are available for PgqlResultSet are also available for PgxResult.

Parent topic: Java APIs for Executing SELECT Queries

16.6.4 Printing a Result Set

The following methods of PgqlResultSet (package oracle.pgx.api) are used to print a result set:

print() : PgqlResultSet
print(long numResults) : PgqlResultSet
print(long numResults, int from) : PgqlResultSet
print(PrintStream printStream, long numResults, int from) : PgqlResultSet

For example:

g.queryPgql("SELECT COUNT(*) AS numPersons FROM MATCH (n:Person)").print().close()
+------------+
| numPersons |
+------------+
| 3          |
+------------+

Another example:

PgqlResultSet resultSet = g.queryPgql(
    "   SELECT owner.name AS account_holder, SUM(t.amount) AS total_transacted_with_Nikita "
  + "     FROM MATCH (p:Person) -[:ownerOf]-> (account1:Account) " 
  + "        , MATCH (account1) -[t:transaction]- (account2) "
  + "        , MATCH (account2:Account) <-[:ownerOf]- (owner:Person|Company) "
  + "    WHERE p.name = 'Nikita' "
  + " GROUP BY owner")

resultSet.print().close()
+-----------------------------------------------+
| account_holder | total_transacted_with_Nikita |
+-----------------------------------------------+
| Camille        | 1000.0                       |
| Oracle         | 4501.0                       |
+-----------------------------------------------+

Parent topic: Java APIs for Executing SELECT Queries

16.7 Java APIs for Executing UPDATE Queries

The UPDATE queries make changes to existing graphs using the INSERT, UPDATE, and DELETE operations as detailed in the section Graph Modification of the PGQL 1.3 specification.

Note that INSERT allows you to insert new vertices and edges into a graph, UPDATE allows you to update existing vertices and edges by setting their properties to new values, and DELETE allows you to delete vertices and edges from a graph.

Updatability of Graphs Through PGQL
Graph data that is loaded from the Oracle RDBMS or from CSV files into the PGX is not updatable through PGQL right away.
Executing UPDATE Queries Against a Graph in the Graph Server (PGX)
To execute UPDATE queries against a graph, use the PgxGraph.executePgql(String query) method.
Executing UPDATE Queries Against a PGX Session
For now, there is no support for executing UPDATE queries against a PgxSession and therefore, updates always have to be executed against a PgxGraph. To obtain a graph from a session, use the PgxSession.getGraph(String graphName) method.
Altering the Underlying Schema of a Graph
The INSERT operations can only insert vertices and edges with known labels and properties. Similarly, UPDATE operations can only set values of known properties. Thus, new data must always conform to the existing schema of the graph.

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.7.1 Updatability of Graphs Through PGQL

Graph data that is loaded from the Oracle RDBMS or from CSV files into the PGX is not updatable through PGQL right away.

First, you need to create a copy of the data through the PgxGraph.clone() method. The resulting graph is fully updatable.

Consider the following example:

// load a graph from the RDBMS or from CSV
PgxGraph g1 = session.readGraphWithProperties("path/to/graph_config.json");

// create an updatable copy of the graph
PgxGraph g2 = g1.clone("new_graph_name");

// insert an additional vertex into the graph
g2.executePgql("INSERT VERTEX v " +
               "         LABELS ( Person ) " +
               "         PROPERTIES ( v.firstName = 'Camille', " +
               "                      v.lastName = ' Mullins')");

Additionally, there is also a PgxGraph.cloneAndExecutePgql(String query, String graphName) method that combines the last two steps from above example into a single step:

// create an updatable copy of the graph while inserting a new vertex
PgxGraph g2_copy = g1.cloneAndExecutePgql(
                     "INSERT VERTEX v " +
                     "         LABELS ( Person ) " +
                     "         PROPERTIES ( v.firstName = 'Camille', " +
                     "                      v.lastName = ' Mullins') "
                   , "new_graph_name");

Note that graphs that are created through PgxGraph.clone() are local to the session. However, they can be shared with other sessions through the PgxGraph.publish(..) methods but then they are no longer updatable through PGQL. Only session-local graphs are updatable but persistent graphs are not.

Parent topic: Java APIs for Executing UPDATE Queries

16.7.2 Executing UPDATE Queries Against a Graph in the Graph Server (PGX)

To execute UPDATE queries against a graph, use the PgxGraph.executePgql(String query) method.

The following is an example of INSERT query:

g.executePgql("INSERT VERTEX v " +
              "         LABELS ( Person ) " +
              "         PROPERTIES ( v.firstName = 'Camille', " +
              "                      v.lastName = ' Mullins' ) ");

Note that the INTO clause of the INSERT can be omitted. If you use an INTO clause, the graph name in the INTO clause must correspond to the name of the PGX graph (PgxGraph.getName()) that the query is executed against.

The following is an example of UPDATE query:

// set the date of birth of Camille to 2014-11-15
g.executePgql("UPDATE v SET ( v.dob = DATE '2014-11-14' ) " +
              "FROM MATCH (v:Person) " +
              "WHERE v.firstName = 'Camille' AND v.lastName = ' Mullins' ");

The following is an example of DELETE query:

// delete Camille from the graph
g.executePgql("DELETE v " +
              "FROM MATCH (v:Person) " +
              "WHERE v.firstName = 'Camille' AND v.lastName = 'Mullins' ");

Parent topic: Java APIs for Executing UPDATE Queries

16.7.3 Executing UPDATE Queries Against a PGX Session

For now, there is no support for executing UPDATE queries against a PgxSession and therefore, updates always have to be executed against a PgxGraph. To obtain a graph from a session, use the PgxSession.getGraph(String graphName) method.

Parent topic: Java APIs for Executing UPDATE Queries

16.7.4 Altering the Underlying Schema of a Graph

The INSERT operations can only insert vertices and edges with known labels and properties. Similarly, UPDATE operations can only set values of known properties. Thus, new data must always conform to the existing schema of the graph.

However, some PGX APIs exist for updating the schema of a graph: while no APIs exist for adding new labels, new properties can be added through the PgxGraph.createVertexProperty(PropertyType type, String name) and PgxGraph.createEdgeProperty(PropertyType type, String name) methods. The new properties are attached to each vertex/edge in the graph, irrespective of their labels. Initially the properties are assigned a default value but then the values can be updated through the UPDATE statements.

Consider the following example:

// load a graph from the RDBMS or from CSV
PgxGraph g = session.readGraphWithProperties("path/to/graph_config.json");

// add a new property to the graph
g.createVertexProperty(PropertyType.LOCAL_DATE, "dob");

// set the date of birth of Camille to 2014-11-15
g.executePgql("UPDATE v SET ( v.dob = DATE '2014-11-14' ) " +
              "FROM MATCH (v:Person) " +
              "WHERE v.firstName = 'Camille' AND v.lastName = ' Mullins' ");

Parent topic: Java APIs for Executing UPDATE Queries

16.8 PGQL Queries with Partitioned IDs

You can retrieve partitioned IDs using the id() function in PGQL.

PGQL SELECT Queries

The following are a few examples to retrieve partitioned IDs using PGQL SELECT queries:

g.queryPgql("SELECT id(n) FROM MATCH(n)").print().close()

This prints an output similar to:

+-------------+
| id(n)       |
+-------------+
| Accounts(2) |
| Accounts(4) |
| Accounts(6) |
+-------------+

g.queryPgql("SELECT n.name FROM MATCH(n) WHERE id(n) = 'Accounts(1)'").print().close()

The output is printed as shown:

+-------+
| name  |
+-------+
| User1 |
+-------+

g.queryPgql("SELECT LABEL(n), n.name from MATCH(n) WHERE n.id = 1").print().close()

The output is printed as shown:

+------------------+
| label(n) | name  |
+------------------+
| Accounts | User1 |
+------------------+

PGX automatically creates a unique index for keys so that queries with predicates such as WHERE id(n) = 'Accounts(1)' and WHERE n.id = 1 can be efficiently processed by retrieving the vertex in constant time.

Using Bind Variables

Partitioned IDs can also be passed as bind values into a PgxPreparedStatement.

For example:

PgxPreparedStatement statement = g.preparePgql("SELECT n.name FROM MATCH (n) WHERE id(n)= ?")
statement.setString(1, "Accounts(1)")
statement.executeQuery().print().close()

This prints the output as shown:

+-------+
| name  |
+-------+
| User1 |
+-------+

PGQL INSERT Queries

In INSERT queries, you must provide a value for the key property if a key property exists. The value is then used for the vertex or edge key.

For example you can execute an INSERT as shown:

g.executePgql("INSERT VERTEX v LABELS (Accounts) PROPERTIES (v.id = 1001, v.name = 'User1001')")

The inserted values can be verified as shown:

g.queryPgql("SELECT id(n), n.name FROM MATCH(n) WHERE n.id = 1001").print().close()

This prints the output:

+---------------------------+
| id(n)          | name     |
+---------------------------+
| Accounts(1001) | User1001 |
+---------------------------+

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.9 Security Tools for Executing PGQL Queries

To safeguard against query injection, bind variables can be used in place of literals while printIdentifier(String identifier) can be used in place of identifiers like graph names, labels, and property names.

Using Bind Variables
There are two reasons for using bind variables:
Using Identifiers in a Safe Manner
When you create a query through string concatenation, not only literals in queries pose a security risk, but also identifiers like graph names, labels, and property names do. The only problem is that bind variables are not supported for such identifier. Therefore, if these identifiers are variable from the application's perspective, then it is recommended to protect against query injection by passing the identifier through the oracle.pgql.lang.ir.PgqlUtils.printIdentifier(String identifier) method.

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.9.1 Using Bind Variables

There are two reasons for using bind variables:

It protects against query injection.
It speeds up query execution because the same bind variables can be set multiple times without requiring recompilation of the query.

To create a prepared statement, use one of the following two methods:

PgxGraph.preparePgql(String query) : PgxPreparedStatement
PgxSession.preparePgql(String query) : PgxPreparedStatement

The PgxPreparedStatement (package oracle.pgx.api) returned from these methods have setter methods for binding the bind variables to values of the designated data type.

Consider the following example:

PreparedStatement stmnt = g.preparePgql(
  "SELECT v.id, v.dob " +
  "FROM MATCH (v) " +
  "WHERE v.firstName = ? AND v.lastName = ?");
stmnt.setString(1, "Camille");
stmnt.setString(2, "Mullins");
ResultSet rs = stmnt.executeQuery();

Each bind variable in the query needs to be set to a value using one of the following setters of PgxPreparedStatement:

setBoolean(int parameterIndex, boolean x)
setDouble(int parameterIndex, double x)
setFloat(int parameterIndex, float x)
setInt(int parameterIndex, int x)
setLong(int parameterIndex, long x)
setDate(int parameterIndex, LocalDate x)
setTime(int parameterIndex, LocalTime x)
setTimestamp(int parameterIndex, LocalDateTime x)
setTimeWithTimezone(int parameterIndex, OffsetTime x)
setTimestampWithTimezone(int parameterIndex, OffsetDateTime x)
setArray(int parameterIndex, List<?> x)

Once all the bind variables are set, the statement can be executed through:

PgxPreparedStatement.executeQuery()
- For SELECT queries only
- Returns a ResultSet
PgxPreparedStatement.execute()
- For any type of statement
- Returns a Boolean to indicate the form of the result: true in case of a SELECT query, false otherwise
- In case of SELECT, the ResultSet can afterwards be accessed through PgxPreparedStatement.getResultSet()

In PGQL, bind variables can be used in place of literals of any data type, including array literals. An example query with a bind variable to is set to an instance of a String array is:

List<String> countryNames = new ArrayList<String>();
countryNames.add("Scotland");
countryNames.add("Tanzania");
countryNames.add("Serbia");

PreparedStatement stmnt = g.preparePgql(
  "SELECT n.name, n.population " +
  "FROM MATCH (c:Country) " +
  "WHERE c.name IN ?");

ResultSet rs = stmnt.executeQuery();

Finally, if a prepared statement is no longer needed, it is closed through PgxPreparedStatement.close() to free up resources.

Parent topic: Security Tools for Executing PGQL Queries

16.9.2 Using Identifiers in a Safe Manner

When you create a query through string concatenation, not only literals in queries pose a security risk, but also identifiers like graph names, labels, and property names do. The only problem is that bind variables are not supported for such identifier. Therefore, if these identifiers are variable from the application's perspective, then it is recommended to protect against query injection by passing the identifier through the oracle.pgql.lang.ir.PgqlUtils.printIdentifier(String identifier) method.

Given an identifier string, the method automatically adds double quotes to the start and end of the identifier and escapes the characters in the identifier appropriately.

Consider the following example:

String graphNamePrinted = printIdentifier("my graph name with \" special % characters ");
PreparedStatement stmnt = g.preparePgql(
  "SELECT COUNT(*) AS numVertices FROM MATCH (v) ON " + graphNamePrinted);

Parent topic: Security Tools for Executing PGQL Queries

16.10 Best Practices for Tuning PGQL Queries

This section describes best practices regarding memory allocation, parallelism, and query planning.

Memory Allocation
The graph server (PGX) has on-heap and off-heap memory, the earlier being the standard JVM heap while the latter being a separate heap that is managed by PGX. Just like graph data, intermediate and final results of PGQL queries are partially stored on-heap and partially off-heap. Therefore, both heaps are needed.
Parallelism
By default, all available processor threads are used to process PGQL queries. However, if needed, the number of threads can be limited by setting the parallelism option of the graph server (PGX).
Query Plan Explaining
The PgxGraph.explainPgql(String query) method is used to get insight into the query plan of the query. The method returns an instance of Operation (package oracle.pgx.api) which has the following methods:

Parent topic: Executing PGQL Queries Against the Graph Server (PGX)

16.10.1 Memory Allocation

The graph server (PGX) has on-heap and off-heap memory, the earlier being the standard JVM heap while the latter being a separate heap that is managed by PGX. Just like graph data, intermediate and final results of PGQL queries are partially stored on-heap and partially off-heap. Therefore, both heaps are needed.

In case of the on-heap memory, the default maximum is chosen upon startup of the JVM, but it can be overwritten through the -Xmx option.

In case of the off-heap, there is no maximum set by default and the off-heap memory usage, therefore, keeps increasing automatically until it depletes the system resources, in which case the operation is canceled, it's memory is released, and an appropriate exception is passed to the user. If needed, a maximum off-heap size can be configured through the max_off_heap_size option in the graph server (PGX).

A ratio of 1:1 for on-heap versus off-heap is recommended as a good starting point to allow for the largest possible graphs to be loaded and queried. See Configuring On-Heap Limits for the steps to configure the on-heap memory size.

Parent topic: Best Practices for Tuning PGQL Queries

16.10.2 Parallelism

By default, all available processor threads are used to process PGQL queries. However, if needed, the number of threads can be limited by setting the parallelism option of the graph server (PGX).

See Configuration Parameters for the Graph Server (PGX) Engine for more information on the graph server configuration parameters.

Parent topic: Best Practices for Tuning PGQL Queries

16.10.3 Query Plan Explaining

The PgxGraph.explainPgql(String query) method is used to get insight into the query plan of the query. The method returns an instance of Operation (package oracle.pgx.api) which has the following methods:

print(): for printing the operation and its child operations
getOperationType(): for getting the type of the operation
getPatternInfo(): for getting a string representation of the operation
getCostEstimate(): for getting the cost of the operation
getTotalCostEstimate(): for getting the cost of the operations and its child operations
getCardinatlityEstimate(): for getting the expected number of result rows
getChildren(): for accessing the child operations

Consider the following example:

g.explainPgql("SELECT COUNT(*) FROM MATCH (n) -[e1]-> (m) -[e2]-> (o)").print()
\--- GROUP BY  GroupBy {"cardinality":"42", "cost":"42", "accumulatedCost":"58.1"}
     \--- (m) -[e2]-> (o) NeighborMatch {"cardinality":"3.12", "cost":"3.12", "accumulatedCost":"16.1"}
          \--- (n) -[e1]-> (m) NeighborMatch {"cardinality":"5", "cost":"5", "accumulatedCost":"13"}
               \--- (n) RootVertexMatch {"cardinality":"8", "cost":"8", "accumulatedCost":"8"}

In the above example, the print() method is used to print the query plan.

If a query plan is not optimal, it is often possible to rewrite the query to improve its performance. For example, a SELECT query may be split into an UPDATE and a SELECT query as a way to improve the total runtime.

Note that the graph server (PGX) does not provide a hint mechanism.

Also, printing the query plan shows the filters used in the query. For example:

g.explainPgql("SELECT id(n) FROM MATCH (n)-[e]->(m) WHERE " +
...> "id(n) > 500 " +
...> "AND id(n) < 510 " +
...> "AND id(n) <> 509 " +
...> "AND id(n) <> 507 ").print()
\--- Projection {"cardinality":"146", "cost":"0", "accumulatedCost":"175"}
     \--- (n) -[e]-> (m) NeighborMatch {"cardinality":"146", "cost":"146", "accumulatedCost":"175"}
          \--- (n) RootVertexMatch {"cardinality":"29.2", "cost":"29.2", "accumulatedCost":"29.2"}
                WHERE $filter1
filter1: (id(n) <> 509) AND
         (id(n) <> 507) AND
         (id(n) > 500) AND
         (id(n) < 510)

In the preceding example, since the query has filters that spans more than three lines, the filters are shown displayed below the query plan. If the filters are less than three lines, then the filters are shown directly within the query plan tree as shown:

g.explainPgql("SELECT id(n) FROM MATCH (n)-[e]->(m) WHERE " +
...> "id(n) > 500 " +
...> "AND id(n) < 510 ").print()
\--- Projection {"cardinality":"162", "cost":"0", "accumulatedCost":"194"}
     \--- (n) -[e]-> (m) NeighborMatch {"cardinality":"162", "cost":"162", "accumulatedCost":"194"}
          \--- (n) RootVertexMatch {"cardinality":"32.4", "cost":"32.4", "accumulatedCost":"32.4"}
                WHERE (id(n) > 500) AND
                      (id(n) < 510)

Parent topic: Best Practices for Tuning PGQL Queries