148 DBMS_OGA

The DBMS_OGA package provides PL/SQL functions to run graph algorithms on SQL property graphs. These functions are invoked from within the GRAPH_TABLE operator of a SQL graph query.

This chapter contains the following topics:

148.1 DBMS_OGA Overview

The DBMS_OGA package allows you to run graph algorithms on SQL property graphs and combine the results with graph pattern matching in SQL graph queries.

Graph algorithms analyze a SQL property graph and can use existing vertex or edge properties. They compute new output properties without changing the graph topology. See Summary of DBMS_OGA Subprograms for the list of supported graph algorithms. User-defined graph algorithms are not supported.

The graph algorithm functions (GAFs) are exposed as PL/SQL functions that take a graph as input and return a new (temporary) graph as output. The returned graph is then used as the first argument of the GRAPH_TABLE operator. Properties computed by GAFs can be referenced like any other properties and combined with pattern matching in SQL graph queries.

However, the computed properties exist only within the scope of that GRAPH_TABLE operator in the SQL graph query. They are not persisted on the original graph, not visible to other sessions or users, and they are finally discarded when the operator (and cursor) completes.

See Also:

Running Graph Algorithm Functions in SQL Graph Queries in Oracle AI Database Graph Developer's Guide for Property Graph

148.2 DBMS_OGA Operational Notes

Review the operational notes when working with the DBMS_OGA package.

  • A graph algorithm function (GAF) takes a property graph as input and returns a new property graph. The returned graph has the same topology (that is, it is based on the same set of vertex and edge tables) as the input graph, but with additional properties. The algorithm creates these new properties to store its results.
  • The property created by a GAF and the resulting property graph are accessible only within the SQL cursor that contains the GAF. They are both destroyed at the end of the cursor execution. The input graph definition is not changed as the property is not added to the input graph. It can only be accessed within the GRAPH_TABLE operator that encloses the GAF invocation.
  • The first argument and the return type of a GAF are always of type ORA_PROPERTY_GRAPH. The first argument is the input property graph, and the function returns a new ORA_PROPERTY_GRAPH.
  • Algorithms use both input and output properties. These properties can be defined on the vertices or edges of the graph. See PL/SQL Types Introduced by Graph Algorithm Functions for more information on the property types.
  • You can execute only a single GAF within a GRAPH_TABLE operator in a SQL query. Ensure that you have sufficient temporary tablespace before invoking a GAF. The results computed by the algorithm exist only within the query in which they were computed.
  • When calling a GAF, the names of the input and output properties are specified using a new PROPERTY pseudo-operator. See Using the PROPERTY Pseudo-Operator for more information.
  • The GRAPH_TABLE operator takes the property graph returned by a GAF as its first argument. This graph includes all the labels and properties defined in the input graph and also the properties created by the GAF.

148.2.1 PL/SQL Types Introduced by Graph Algorithm Functions

Learn about the PL/SQL types introduced by graph algorithm functions (GAFs).

Table 148-1 Specific PL/SQL Types in Graph Algorithm Functions

Type Description
ORA_PROPERTY_GRAPH Represents the property graph that is passed as an argument to a GAF.
ORA_VERTEX_INPUT_PROPERTY Represents the vertex property that is passed as input to a GAF.

The expected data type for this input property is BINARY_DOUBLE.

ORA_VERTEX_OUTPUT_PROPERTY Represents the output vertex property that stores the results computed by a GAF.

Only properties of data types BINARY_DOUBLE and NUMBER are supported.

ORA_EDGE_INPUT_PROPERTY Represents the edge property that is passed as input to a GAF.

The expected data type for this input property is BINARY_DOUBLE.

ORA_EDGE_OUTPUT_PROPERTY Represents the output edge property.

148.2.2 Input Properties

If a graph algorithm function (GAF) reads a graph property during computation, you must pass the property as an argument to the algorithm using an ORA_VERTEX_INPUT_PROPERTY or ORA_EDGE_INPUT_PROPERTY input-property type.

Note the following about input properties:

  • The input property passed to a GAF must reference a vertex or an edge property for a label in the input graph definition.
  • If a GAF expects a vertex input property, the property passed as argument must be defined on at least one vertex table in the graph.
  • If a GAF expects an edge input property, the property passed as argument must be defined on at least one edge table in the graph.
  • If the property is defined on more than one vertex (or edge) tables in the graph, it can be exposed through different labels.
  • If the property is not exposed by some vertex or edge tables in the graph, the algorithm treats this property as NULL for these tables.

Input Property Types

Input properties are specified using arguments of type ORA_VERTEX_INPUT_PROPERTY or ORA_EDGE_INPUT_PROPERTY.

When an input property is passed to a GAF, its data type needs to be either BINARY_DOUBLE or a numeric data type that is implicitly convertible to BINARY_DOUBLE. Input property data types are compared to and possibly converted to BINARY_DOUBLE as follows:

  • If the input property data type is BINARY_DOUBLE, the property is read without conversions.
  • If the input property data type is not BINARY_DOUBLE, the SQL implicit conversion rules are applied. The following list of SQL types can be implicitly converted to BINARY_DOUBLE: CHAR, VARCHAR2, NCHAR, NVARCHAR2, NUMBER, and BINARY_FLOAT.
  • If the input property data type cannot be implicitly converted to BINARY_DOUBLE, the compiler raises a compile-time error.

Also, note that the data type of an input property depends on the MIXED PROPERTY TYPES mode of the graph. See Using Graph Options to Create SQL Property Graphs in Oracle AI Database Graph Developer's Guide for Property Graph for more information on creating a SQL property graph using MIXED PROPERTY TYPES options.

  • DISALLOW MIXED TYPES: The data type of the property is guaranteed to be the same for every label that defines it.
  • ALLOW MIXED TYPES: The data type of the property may be different across labels that define the property. In this case, each property type defined in the relevant labels are considered.
    • For a vertex input property, the data types defined on labels assigned to at least one vertex table are checked.
    • For an edge input property, the data types defined on labels assigned to at least one edge table are checked.

As a result, in ALLOW MIXED TYPES mode, multiple input property types may be identified. Each identified type is compared to and, if needed, implicitly converted to the expected BINARY_DOUBLE type.

If you expect NULL values in input properties, then note that you can pass a default value to the PROPERTY pseudo-operator. See Using the PROPERTY Pseudo-Operator for more information on default values.

148.2.3 Output Properties

A graph algorithm function (GAF) creates an output property with the identifier (property name) you pass as an argument. This new property contains the results of the algorithm and is accessible in the property graph returned by the graph function.

Note the following about output properties:

  • The output property must not have the same name as any existing property defined on any vertex or edge table of the input graph.
  • The output property name given must be a valid identifier.
  • The output property will always be created on every vertex or edge table of the graph. Whether the property is created on the vertices or edges of the graph depends on the algorithm.

Output Property Types

Output properties are represented by arguments of type ORA_VERTEX_OUTPUT_PROPERTY or ORA_EDGE_OUTPUT_PROPERTY.

The algorithm specifies the data types of the output properties. Currently, only properties of types BINARY_DOUBLE and NUMBER can be created through graph algorithms. The output property created for every vertex or edge table has the same data type. This behavior is consistent with graphs that are created in DISALLOW MIXED PROPERTY TYPES mode.

148.2.4 Using the PROPERTY Pseudo-Operator

You can pass the names of the input and output properties to a graph algorithm function (GAF) using the PROPERTY pseudo-operator. These are represented by ORA_VERTEX_INPUT_PROPERTY, ORA_EDGE_INPUT_PROPERTY, ORA_VERTEX_OUTPUT_PROPERTY, or ORA_EDGE_OUTPUT_PROPERTY in the GAF signature.

The syntax for the PROPERTY pseudo-operator is as shown:

PROPERTY(<element_type> <input_or_output> <property_name> <optional_default_value>)

In the preceding pattern:

  • <element_type>: Type of the graph element - VERTEX or EDGE.
  • <input_or_output>: This is to specify if the argument represents an INPUT or OUTPUT property.
  • <property_name>: Name of an input or output property.
  • <optional_default_value>: An optional default value for an INPUT property when NULL values are expected. You can specify this value using the DEFAULT ON NULL clause. For example:
    PROPERTY(EDGE INPUT cost DEFAULT ON NULL 0)

    This means that if the input edge cost property is NULL, it is treated as zero.

    If no default value is passed, then ensure that the property is not NULL (that is, all columns exposed by that property in all the tables have a NOT NULL constraint).

    Also, note the following:

    • If a vertex or an edge input property is defined on only a subset of the vertex or edge tables respectively, then a default value must be provided.
    • If the property is exposed as an expression (rather than a column) on any vertex or edge table, then a default value must be provided.
    • If the property exposes a column (not an expression), and that column is missing a NOT NULL constraint in any vertex or edge table, then a default value must be provided.
    • Default values cannot be provided for OUTPUT properties.

The following shows an example where the output rank vertex property is specified using the PROPERTY pseudo-operator:

PROPERTY(VERTEX OUTPUT rank)

Also, the property names can be quoted and follows all the rules of SQL quoted identifiers:

  • PROPERTY(VERTEX OUTPUT "RANK") and PROPERTY(VERTEX OUTPUT RANK) are equivalent.
  • PROPERTY(VERTEX OUTPUT "Rank") and PROPERTY(VERTEX OUTPUT Rank) are not equivalent.
  • PROPERTY(VERTEX OUTPUT Rank) and PROPERTY(VERTEX OUTPUT "RANK") are equivalent.

148.3 DBMS_OGA Security Model

Review the privileges required to run graph algorithm functions.

To run the graph algorithm functions (GAFs) in DBMS_OGA:
  • You must have the EXECUTE privilege on the DBMS_OGA and DBMS_GAF packages. These packages are owned by SYS and, by default, EXECUTE is granted to PUBLIC.
  • You must have the required privileges on the underlying graph object that you provide as input to a GAF, as described in Privileges to Query a SQL Property Graph .

When a graph algorithm function is invoked, it may issue recursive SQL statements. These recursive statements execute under the graph owner’s identity, and the graph owner’s privileges are used to parse and run them.

In addition, the graph algorithm function invoker must have sufficient temporary tablespace to execute the function. The amount of temporary tablespace required grows linearly with the total number of rows across all vertex tables in the graph. For BELLMAN_FORD specifically, it grows linearly with the total number of vertices and edges in the graph.

148.4 DBMS_OGA Examples

This section provides examples of running graph algorithms on a SQL property graph.

As a prerequisite to run the examples, connect to the database as a schema user and create the following tables with sample data.
CREATE TABLE Cities (
  cid         NUMBER        NOT NULL,
  name        VARCHAR2(100)  NOT NULL,
  population  NUMBER,
  CONSTRAINT pk_cities PRIMARY KEY (cid)
);

CREATE TABLE Residents (
  rid          NUMBER        NOT NULL,
  name         VARCHAR2(50)  NOT NULL,
  age          NUMBER,
  lives_in_cid NUMBER        NOT NULL,
  CONSTRAINT pk_persons PRIMARY KEY (rid),
  CONSTRAINT fk_persons_city FOREIGN KEY (lives_in_cid)
    REFERENCES Cities (cid)
);

CREATE TABLE Knows (
  res1_rid  NUMBER NOT NULL,
  res2_rid  NUMBER NOT NULL,
  CONSTRAINT pk_knows PRIMARY KEY (res1_rid, res2_rid),
  CONSTRAINT fk_knows_p1 FOREIGN KEY (res1_rid) REFERENCES Residents (rid),
  CONSTRAINT fk_knows_p2 FOREIGN KEY (res2_rid) REFERENCES Residents (rid)
);

CREATE TABLE Roads (
  city1_cid  NUMBER NOT NULL,
  city2_cid  NUMBER NOT NULL,
  distance   NUMBER NOT NULL,
  CONSTRAINT pk_roads PRIMARY KEY (city1_cid, city2_cid),
  CONSTRAINT fk_roads_c1 FOREIGN KEY (city1_cid) REFERENCES Cities (cid),
  CONSTRAINT fk_roads_c2 FOREIGN KEY (city2_cid) REFERENCES Cities (cid)
 );

INSERT INTO Cities (cid, name, population) VALUES (101, 'Austin',     980000);
INSERT INTO Cities (cid, name, population) VALUES (102, 'Dallas',    1300000);
INSERT INTO Cities (cid, name, population) VALUES (103, 'Houston',   2300000);
INSERT INTO Cities (cid, name, population) VALUES (104, 'San Antonio',1550000);
INSERT INTO Cities (cid, name, population) VALUES (105, 'El Paso',    680000);

INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (1, 'Mary',  29, 101);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (2, 'Bob',   35, 102);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (3, 'Tom',   27, 103);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (4, 'Alice', 41, 104);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (5, 'Ben',   33, 105);

INSERT INTO Knows (res1_rid, res2_rid) VALUES (1, 2);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (2, 3);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (3, 4);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (4, 5);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (5, 1);

INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (101, 102, 195);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (102, 103, 240);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (103, 104, 200);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (104, 105, 550);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (101, 103, 165);

Then, create the addresses property graph using the following CREATE PROPERTY GRAPH statement.

CREATE PROPERTY GRAPH addresses
VERTEX TABLES (
  Residents AS Resident
    PROPERTIES (name, age),
  Cities AS City
    PROPERTIES (name, population)
)
EDGE TABLES (
  Knows
    SOURCE KEY (res1_rid) REFERENCES Resident (rid)
    DESTINATION KEY (res2_rid) REFERENCES Resident (rid)
    NO PROPERTIES,
  Residents AS LivesIn
    SOURCE KEY (rid) REFERENCES Resident (rid)
    DESTINATION KEY (lives_in_cid) REFERENCES City (cid)
    PROPERTIES (0 AS distance),
  Roads AS ConnectedTo
    SOURCE KEY (city1_cid) REFERENCES City (cid)
    DESTINATION KEY (city2_cid) REFERENCES City (cid)
    PROPERTIES (distance)
);

You can run the following examples on the addresses SQL property graph.

148.5 Summary of DBMS_OGA Subprograms

This section provides a list of the DBMS_OGA subprograms and briefly describes each one.

Table 148-2 DBMS_OGA Subprograms

Subprogram Description
BELLMAN_FORD The Bellman-Ford algorithm computes the shortest-path distances from a start vertex to all other vertices in a weighted directed graph.
PAGERANK The PageRank algorithm computes an importance score for each vertex in the graph.
PERSONALIZED_PAGERANK The Personalized PageRank algorithm is a variant of the classic PageRank. It computes personalized PageRank scores for vertices in a graph, based on a single vertex.
PERSONALIZED_PAGERANK_SET The Personalized PageRank algorithm is a variant of the classic PageRank. It computes personalized PageRank scores for vertices in a graph, based on a vertex set.
WCC The Weakly Connected Components (WCC) algorithm finds connected components in a graph.

148.5.1 BELLMAN_FORD

You can use the DBMS_OGA.BELLMAN_FORD algorithm to find the shortest path distances from a specified source vertex to all other vertices in the graph.

Syntax

DBMS_OGA.BELLMAN_FORD (
  g               IN ORA_PROPERTY_GRAPH,
  start_vertex    IN JSON,
  length_property IN ORA_EDGE_INPUT_PROPERTY,
  dist            IN ORA_VERTEX_OUTPUT_PROPERTY
) RETURN ORA_PROPERTY_GRAPH;

Parameters

Table 148-3 BELLMAN_FORD Function Parameters

Parameter Description
g Name of the property graph.
start_vertex Specifies the graph vertex that should be used as the starting point for the Bellman-Ford algorithm.
length_property References the edge property which represents the length of the edges in the graph.
dist Represents the output vertex property name which stores the results of the graph algorithm.

Usage Notes

  • The JSON representation of the start_vertex parameter can be constructed using the VERTEX_ID function. The following shows a sample JSON output of a vertex identifier:
    JSON('{"GRAPH_OWNER":"GRAPHUSER",
           "GRAPH_NAME":"ADDRESSES",
           "ELEM_TABLE":"CITY",
           "KEY_VALUE":{"CID":101}}')
  • The length of each edge in the graph is represented by length_property. The expected data type for this property is BINARY_DOUBLE. SQL data types such as CHAR, VARCHAR2, NCHAR, NVARCHAR2, NUMBER, BINARY_FLOAT, and BINARY_DOUBLE can be implicitly converted to BINARY_DOUBLE. Exceptions will be raised if the input property type cannot be converted to BINARY_DOUBLE.

    Also, note that NULL and negative values are not supported for length_property.

  • The DBMS_OGA.BELLMAN_FORD function will create a new vertex property with a name as given to the dist argument to store the results of the algorithm. The data type for this new vertex property will be BINARY_DOUBLE and it is created on every vertex table of the output graph. Ensure that the given property name is unique and does not reference any existing vertex or edge property names of the input graph.

    Note that vertices that are not connected to the start_vertex are assigned positive infinity.

Example 148-1 DBMS_OGA.BELLMAN_FORD Example

The following example query runs the DBMS_OGA.BELLMAN_FORD algorithm on the addresses graph and returns the computed shortest distance from the City vertex with cid = 101 (Austin) to all other City vertices.

SELECT *
FROM GRAPH_TABLE (
  DBMS_OGA.BELLMAN_FORD(
    addresses,
    JSON('{"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"CITY","KEY_VALUE":{"CID":101}}'),
    PROPERTY(EDGE INPUT distance DEFAULT ON NULL 0), PROPERTY(VERTEX OUTPUT dist))
  MATCH (c IS city)
  COLUMNS (c.name AS to_city, c.dist)
) ORDER BY dist ASC;

As seen in the preceding code, the query uses each edge's distance property as the edge weight for the Bellman–Ford algorithm. If an edge’s distance is null, it is treated as zero, as specified by DEFAULT ON NULL 0. The function writes the computed distance from the source to each reachable vertex into a property named dist.

"TO_CITY"	"DIST"
Austin  	0.0
Houston 	165.0
Dallas  	195.0
San Antonio 	365.0
El Paso 	915.0

148.5.2 PAGERANK

You can use the DBMS_OGA.PAGERANK algorithm to measure the relative importance of a vertex in a graph.

Syntax

DBMS_OGA.PAGERANK (
  g               IN ORA_PROPERTY_GRAPH,
  rank            IN ORA_VERTEX_OUTPUT_PROPERTY,
  max_iter        IN INTEGER,
  tolerance       IN BINARY_DOUBLE,
  damping_factor  IN BINARY_DOUBLE,
  normalize       IN BOOLEAN
) RETURN ORA_PROPERTY_GRAPH;

Parameters

Table 148-4 PAGERANK Function Parameters

Parameter Description
g Name of the property graph.
rank Represents the output vertex property name which stores the rank value computed by the graph algorithm.
max_iter Specifies the maximum number of iterations to be performed by the algorithm.
tolerance Specifies the maximum error tolerated between two consecutive iterations.
damping_factor Represents the damping factor for the computation of the rank.

A common value is 0.85.

normalize Controls the behaviour of the total value of the rank with respect to dangling vertices (that is, vertices that have no outgoing neighbors).
  • TRUE: Indicates that rank values are normalized taking into account the dangling vertices. This implies that the sum of all the rank values in the graph will always be equal to 1.
  • FALSE: Indicates that the dangling vertices are not included for normalization. This implies that the rank values may end up not adding to 1 over the whole graph, although the relative importance of each vertex will still be given by the rank value.

Usage Notes

  • The PageRank algorithm is iterative and the termination criteria depends on the values of tolerance and max_iter parameters.

    The tolerance argument specifies the maximum error tolerated between two consecutive iterations. At each iteration, the algorithm calculates the current error by summing, over all vertices, the absolute difference between each vertex’s rank in the previous iteration and its rank in the current iteration. Once this total value becomes lower than the tolerance value, the algorithm will stop. At this point, the algorithm is seen as having converged.

    The max_iter parameter gives the maximum number of iterations the algorithm can perform before stopping. If the maximum number of iterations is reached before the algorithm converged, the algorithm will stop regardless. The maximum value for this argument is (231−1).

  • The DBMS_OGA.PAGERANK function will create a new vertex property with a name as given to the rank argument to store the results (rank values) of the algorithm. The data type for this new vertex property will be BINARY_DOUBLE and it is created on every vertex table of the output graph. Ensure that the given property name is unique and does not have the same name as any existing vertex or edge property name of the input graph.

Example 148-2 DBMS_OGA.PAGERANK Example

The following example runs the PageRank algorithm on the addresses graph and then projects the name and rank for every CITY in the graph.

SELECT name, rank
FROM GRAPH_TABLE (
  DBMS_OGA.pagerank (
    addresses,
    PROPERTY(VERTEX OUTPUT rank), 10, 1.0, 0.85d, FALSE)
  MATCH (c IS city)
  COLUMNS (c.name, c.rank)
) ORDER BY rank DESC;

The example produces the following output:

"NAME"	        "RANK"
 Houston	0.18500000000000003
 El Paso	0.14250000000000002
 San Antonio	0.14250000000000002
 Dallas 	0.1
 Austin 	0.05750000000000001

148.5.3 PERSONALIZED_PAGERANK

You can use the DBMS_OGA.PERSONALIZED_PAGERANK algorithm to compute personalized PageRank scores for vertices in a graph, based on a single vertex.

Syntax

DBMS_OGA.PERSONALIZED_PAGERANK (
  g               IN ORA_PROPERTY_GRAPH,
  rank            IN ORA_VERTEX_OUTPUT_PROPERTY,
  input_vertex    IN JSON,
  max_iter        IN INTEGER,
  tolerance       IN BINARY_DOUBLE,
  damping_factor  IN BINARY_DOUBLE,
  normalize       IN BOOLEAN
) RETURN ORA_PROPERTY_GRAPH;

Parameters

Table 148-5 PERSONALIZED_PAGERANK Function Parameters

Parameter Description
g Name of the property graph.
rank Represents the output vertex property name which stores the rank value computed by the graph algorithm.
input_vertex Represents the given input vertex used to personalize the PageRank computation.
max_iter Specifies the maximum number of iterations to be performed by the algorithm.
tolerance Specifies the maximum error tolerated between two consecutive iterations.
damping_factor Represents the damping factor for the computation of the rank.

A common value is 0.85.

normalize Controls the behaviour of the total value of the rank with respect to dangling vertices (that is, vertices that have no outgoing neighbors).
  • TRUE: Indicates that rank values are normalized taking into account the dangling vertices. This implies that the sum of all the rank values in the graph will always be equal to 1.
  • FALSE: Indicates that the dangling vertices are not included for normalization. This implies that the rank values may end up not adding to 1 over the whole graph, although the relative importance of each vertex will still be given by the rank value.

Usage Notes

  • The Personalized PageRank algorithm is a variant of the PageRank algorithm. In this variant, the PageRank computation is personalized for a single vertex.

    The specified input_vertex is used to personalize the PageRank computation. It is given more weight when computing the PageRank by initializing its rank value to 1 and initializing all other vertices to 0.

    The input_vertex is represented as a JSON. For example:

    {"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"CITY","KEY_VALUE":{"CID":101}}
  • See also usage notes in PAGERANK for more information.

Example 148-3 DBMS_OGA.PERSONALIZED_PAGERANK Example

The following example computes the personalized PageRank score on the addresses graph, giving importance to the input RESIDENT vertex with rid = 3. It then projects the name and rank for every RESIDENT vertex in the graph.

SELECT *
FROM GRAPH_TABLE (
  DBMS_OGA.personalized_pagerank(
    addresses,
    PROPERTY(VERTEX OUTPUT rank),
    JSON('{
      "GRAPH_OWNER": "GRAPHUSER",
      "GRAPH_NAME": "ADDRESSES",
      "ELEM_TABLE": "RESIDENT",
      "KEY_VALUE": { "RID": 3 }
    }'),
    10,
    1.0,
    0.85d,
    FALSE
  )
  MATCH (r IS resident)
  COLUMNS (
    r.name,
    r.rank
  )
) ORDER BY rank DESC;

The example produces the following output:

"NAME"	 "RANK"
 Tom	  0.15000000000000002
 Alice	  0.06375
 Bob	  0.03262539062499999
 Ben	  0.02709375
 Mary	  0.01151484375

148.5.4 PERSONALIZED_PAGERANK_SET

You can use the DBMS_OGA.PERSONALIZED_PAGERANK_SET algorithm to compute personalized PageRank scores for vertices in a graph, based on a set of vertices within a graph.

Syntax

DBMS_OGA.PERSONALIZED_PAGERANK_SET (
  g               IN ORA_PROPERTY_GRAPH,
  rank            IN ORA_VERTEX_OUTPUT_PROPERTY,
  input_vertices  IN JSON,
  max_iter        IN INTEGER,
  tolerance       IN BINARY_DOUBLE,
  damping_factor  IN BINARY_DOUBLE,
  normalize       IN BOOLEAN
) RETURN ORA_PROPERTY_GRAPH;

Parameters

Table 148-6 PERSONALIZED_PAGERANK_SET Function Parameters

Parameter Description
g Name of the property graph.
rank Represents the output vertex property name which stores the rank value computed by the graph algorithm.
input_vertices Represents the given input vertex set used to personalize the PageRank computation.
max_iter Specifies the maximum number of iterations to be performed by the algorithm.
tolerance Specifies the maximum error tolerated between two consecutive iterations.
damping_factor Represents the damping factor for the computation of the rank.

A common value is 0.85.

normalize Controls the behaviour of the total value of the rank with respect to dangling vertices (that is, vertices that have no outgoing neighbors).
  • TRUE: Indicates that rank values are normalized taking into account the dangling vertices. This implies that the sum of all the rank values in the graph will always be equal to 1.
  • FALSE: Indicates that the dangling vertices are not included for normalization. This implies that the rank values may end up not adding to 1 over the whole graph, although the relative importance of each vertex will still be given by the rank value.

Usage Notes

  • The Personalized PageRank algorithm executed by the personalized_pagerank_set function is a variant of the PageRank algorithm. In this variant, the PageRank computation is personalized for a set of vertices.

    The algorithm accepts a set of input_vertices to personalize the PageRank computation. The vertices in this set will be given more weight when computing the PageRank by initializing their rank value to 1/count(input_vertices), all other vertices being initialized to 0.

    The content of input_vertices must be a JSON array. For example:

    [
      {
        "GRAPH_NAME": "ADDRESSES",
        "GRAPH_OWNER": "GRAPHUSER",
        "ELEM_TABLE": "RESIDENT",
        "KEY_VALUE": {
          "RID": 1
        }
      },
      {
        "GRAPH_NAME": "ADDRESSES",
        "GRAPH_OWNER": "GRAPHUSER",
        "ELEM_TABLE": "RESIDENT",
        "KEY_VALUE": {
          "RID": 2
        }
      }
    ]
  • See also usage notes in PAGERANK for more information.

Example 148-4 DBMS_OGA.PERSONALIZED_PAGERANK_SET Example

The following example computes Personalized PageRank on addresses graph, giving importance to all residents with age > 30. It then projects the name and rank for every RESIDENT vertex in the graph.

The example is executed in two steps as shown:

  1. Construct the set of vertices targeting all residents with age > 30.
    SELECT JSON_ARRAYAGG (p_json)
    FROM GRAPH_TABLE(
    addresses
    MATCH (r IS Resident)
    WHERE r.age > 30
    COLUMNS(VERTEX_ID(r) AS p_json)
    );

    The query produces the following output:

    [{"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"RESIDENT","KEY_VALUE":{"RID":2}},
     {"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"RESIDENT","KEY_VALUE":{"RID":4}},
     {"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"RESIDENT","KEY_VALUE":{"RID":5}}
    ]
  2. Run the DBMS_OGA.PERSONALIZED_PAGERANK_SET function to compute the personalized PageRank scores biased towards the set of vertices obtained in the previous step.
    SELECT *
    FROM GRAPH_TABLE (
      DBMS_OGA.personalized_pagerank_set(
        addresses,
        PROPERTY(VERTEX OUTPUT rank),
        JSON('[
          {
            "GRAPH_OWNER": "GRAPHUSER",
            "GRAPH_NAME": "ADDRESSES",
            "ELEM_TABLE": "RESIDENT",
            "KEY_VALUE": { "RID": 2 }
          },
          {
            "GRAPH_OWNER": "GRAPHUSER",
            "GRAPH_NAME": "ADDRESSES",
            "ELEM_TABLE": "RESIDENT",
            "KEY_VALUE": { "RID": 4 }
          },
          {
            "GRAPH_OWNER": "GRAPHUSER",
            "GRAPH_NAME": "ADDRESSES",
            "ELEM_TABLE": "RESIDENT",
            "KEY_VALUE": { "RID": 5 }
          }
        ]'),
        50,
        0.1,
        0.85d,
        FALSE
      )
      MATCH (r IS Resident)
      COLUMNS (
        r.name,
        r.rank
      )
    ) ORDER BY rank DESC;

The query ouput is as shown:

"NAME"	       "RANK"
 Ben	 0.07607621885986329
 Bob	 0.06439765696818034
 Alice	 0.06219064524943034
 Mary	 0.03304200375773112
 Tom	 0.027014198840332033

148.5.5 WCC

You can use the DBMS_OGA.WCC algorithm to find the connected components in a graph.

Syntax

DBMS_OGA.WCC (
  g               IN ORA_PROPERTY_GRAPH,
  comp_id         IN ORA_VERTEX_OUTPUT_PROPERTY
) RETURN ORA_PROPERTY_GRAPH;

Parameters

Table 148-7 WCC Function Parameters

Parameter Description
g Name of the property graph.
comp_id Represents the output vertex property name which stores the component ID for a vertex as computed by the algorithm.

Usage Notes

The DBMS_OGA.WCC function creates a new vertex property with the name as given to the comp_id argument to store the component ID for a vertex. The data type for this new vertex property will be INTEGER.

If two vertices are assigned the same component ID, then they belong to the same weakly connected component. This means that there exists a path between them in the underlying undirected graph (that is, when edge directions are ignored). If the graph contains more than one component ID value, it means there are non-connected components in the graph (sets of vertices that are not connected to each other).

The component ID values produced by the algorithm are not guaranteed to be the same between two different runs. Also, the component IDs might not start at 0 or 1. If the algorithm discovers multiple connected components, the IDs might not be contiguous. For example, the component IDs can be 2, 5, and 17 instead of 2, 3, and 4.

Example 148-5 DBMS_OGA.WCC Example

The following example uses the DBMS_OGA.WCC function to find the connected components in addresses graph. It then projects the name and component ID for every CITY vertex in the graph.

SELECT name,comp_id
FROM GRAPH_TABLE (
  DBMS_OGA.wcc(
    addresses,
    PROPERTY(VERTEX OUTPUT comp_id)
  )
  MATCH (c IS city)
  COLUMNS (
    c.name    AS name,
    c.comp_id AS comp_id
  )
);

The example produces the following output. All cities have COMP_ID = 2, which indicates they are all connected.

NAME	COMP_ID
Austin	      2
Dallas	      2
Houston	      2
San Antonio   2
El Paso	      2