148 DBMS_OGA
The DBMS_OGA package provides PL/SQL functions to run graph
algorithms on SQL property graphs. These functions are invoked from within the
GRAPH_TABLE operator of a SQL graph query.
This chapter contains the following topics:
148.1 DBMS_OGA Overview
The DBMS_OGA package allows you to run graph algorithms on SQL property graphs and combine the results with graph pattern matching in SQL graph queries.
Graph algorithms analyze a SQL property graph and can use existing vertex or edge properties. They compute new output properties without changing the graph topology. See Summary of DBMS_OGA Subprograms for the list of supported graph algorithms. User-defined graph algorithms are not supported.
The graph algorithm functions (GAFs) are exposed as PL/SQL functions that
take a graph as input and return a new (temporary) graph as output. The returned
graph is then used as the first argument of the GRAPH_TABLE
operator. Properties computed by GAFs can be referenced like any other properties
and combined with pattern matching in SQL graph queries.
However, the computed properties exist only within the scope of that
GRAPH_TABLE operator in the SQL graph query. They are not
persisted on the original graph, not visible to other sessions or users, and they
are finally discarded when the operator (and cursor) completes.
See Also:
Running Graph Algorithm Functions in SQL Graph Queries in Oracle AI Database Graph Developer's Guide for Property Graph148.2 DBMS_OGA Operational Notes
Review the operational notes when working with the DBMS_OGA
package.
- A graph algorithm function (GAF) takes a property graph as input and returns a new property graph. The returned graph has the same topology (that is, it is based on the same set of vertex and edge tables) as the input graph, but with additional properties. The algorithm creates these new properties to store its results.
- The property created by a GAF and the resulting property
graph are accessible only within the SQL cursor that contains the
GAF. They are both destroyed at the end of the cursor execution. The
input graph definition is not changed as the property is not added
to the input graph. It can only be accessed within the
GRAPH_TABLEoperator that encloses the GAF invocation. - The first argument and the return type of a GAF are always
of type
ORA_PROPERTY_GRAPH. The first argument is the input property graph, and the function returns a newORA_PROPERTY_GRAPH. - Algorithms use both input and output properties. These properties can be defined on the vertices or edges of the graph. See PL/SQL Types Introduced by Graph Algorithm Functions for more information on the property types.
- You can execute only a single GAF within a
GRAPH_TABLEoperator in a SQL query. Ensure that you have sufficient temporary tablespace before invoking a GAF. The results computed by the algorithm exist only within the query in which they were computed. - When calling a GAF, the names of the input and output
properties are specified using a new
PROPERTYpseudo-operator. See Using the PROPERTY Pseudo-Operator for more information. - The
GRAPH_TABLEoperator takes the property graph returned by a GAF as its first argument. This graph includes all the labels and properties defined in the input graph and also the properties created by the GAF.
148.2.1 PL/SQL Types Introduced by Graph Algorithm Functions
Learn about the PL/SQL types introduced by graph algorithm functions (GAFs).
Table 148-1 Specific PL/SQL Types in Graph Algorithm Functions
| Type | Description |
|---|---|
ORA_PROPERTY_GRAPH |
Represents the property graph that is passed as an argument to a GAF. |
ORA_VERTEX_INPUT_PROPERTY |
Represents the vertex property that is passed as
input to a GAF.
The expected data type for this input
property is |
ORA_VERTEX_OUTPUT_PROPERTY |
Represents the output vertex property that stores
the results computed by a GAF.
Only properties of data
types |
ORA_EDGE_INPUT_PROPERTY |
Represents the edge property that is passed as input
to a GAF.
The expected data type for this input
property is |
ORA_EDGE_OUTPUT_PROPERTY |
Represents the output edge property. |
148.2.2 Input Properties
If a graph algorithm function (GAF) reads a graph property during
computation, you must pass the property as an argument to the algorithm using an
ORA_VERTEX_INPUT_PROPERTY or ORA_EDGE_INPUT_PROPERTY
input-property type.
Note the following about input properties:
- The input property passed to a GAF must reference a vertex or an edge property for a label in the input graph definition.
- If a GAF expects a vertex input property, the property passed as argument must be defined on at least one vertex table in the graph.
- If a GAF expects an edge input property, the property passed as argument must be defined on at least one edge table in the graph.
- If the property is defined on more than one vertex (or edge) tables in the graph, it can be exposed through different labels.
- If the property is not exposed by some vertex or edge tables in the
graph, the algorithm treats this property as
NULLfor these tables.
Input Property Types
Input properties are specified using arguments of type
ORA_VERTEX_INPUT_PROPERTY or
ORA_EDGE_INPUT_PROPERTY.
When an input property is passed to a GAF, its data type needs to be
either BINARY_DOUBLE or a numeric data type that is implicitly
convertible to BINARY_DOUBLE. Input property data types are
compared to and possibly converted to BINARY_DOUBLE as follows:
- If the input property data type is
BINARY_DOUBLE, the property is read without conversions. - If the input property data type is not
BINARY_DOUBLE, the SQL implicit conversion rules are applied. The following list of SQL types can be implicitly converted toBINARY_DOUBLE:CHAR,VARCHAR2,NCHAR,NVARCHAR2,NUMBER, andBINARY_FLOAT. - If the input property data type cannot be implicitly converted to
BINARY_DOUBLE, the compiler raises a compile-time error.
Also, note that the data type of an input property depends on the
MIXED PROPERTY TYPES mode of the graph. See Using Graph Options to Create SQL
Property Graphs in Oracle AI Database Graph
Developer's Guide for Property Graph for more
information on creating a SQL property graph using MIXED PROPERTY
TYPES options.
DISALLOW MIXED TYPES: The data type of the property is guaranteed to be the same for every label that defines it.ALLOW MIXED TYPES: The data type of the property may be different across labels that define the property. In this case, each property type defined in the relevant labels are considered.- For a vertex input property, the data types defined on labels assigned to at least one vertex table are checked.
- For an edge input property, the data types defined on labels assigned to at least one edge table are checked.
As a result, in ALLOW MIXED TYPES mode, multiple input property
types may be identified. Each identified type is compared to and, if needed,
implicitly converted to the expected BINARY_DOUBLE type.
If you expect NULL values in input properties, then note
that you can pass a default value to the PROPERTY pseudo-operator. See Using the PROPERTY Pseudo-Operator for more information on default values.
148.2.3 Output Properties
A graph algorithm function (GAF) creates an output property with the identifier (property name) you pass as an argument. This new property contains the results of the algorithm and is accessible in the property graph returned by the graph function.
Note the following about output properties:
- The output property must not have the same name as any existing property defined on any vertex or edge table of the input graph.
- The output property name given must be a valid identifier.
- The output property will always be created on every vertex or edge table of the graph. Whether the property is created on the vertices or edges of the graph depends on the algorithm.
Output Property Types
Output properties are represented by arguments of type
ORA_VERTEX_OUTPUT_PROPERTY or
ORA_EDGE_OUTPUT_PROPERTY.
The algorithm specifies the data types of the output properties.
Currently, only properties of types BINARY_DOUBLE and
NUMBER can be created through graph algorithms. The output
property created for every vertex or edge table has the same data type. This
behavior is consistent with graphs that are created in DISALLOW MIXED
PROPERTY TYPES mode.
148.2.4 Using the PROPERTY Pseudo-Operator
You can pass the names of the input and output properties to a graph
algorithm function (GAF) using the PROPERTY pseudo-operator. These
are represented by ORA_VERTEX_INPUT_PROPERTY,
ORA_EDGE_INPUT_PROPERTY,
ORA_VERTEX_OUTPUT_PROPERTY, or
ORA_EDGE_OUTPUT_PROPERTY in the GAF signature.
The syntax for the PROPERTY pseudo-operator is as
shown:
PROPERTY(<element_type> <input_or_output>
<property_name> <optional_default_value>)
In the preceding pattern:
- <element_type>: Type of the graph element -
VERTEXorEDGE. - <input_or_output>: This is to specify if the
argument represents an
INPUTorOUTPUTproperty. - <property_name>: Name of an input or output property.
- <optional_default_value>: An optional default
value for an
INPUTproperty when NULL values are expected. You can specify this value using theDEFAULT ON NULLclause. For example:PROPERTY(EDGE INPUT cost DEFAULT ON NULL 0)This means that if the input edge
costproperty isNULL, it is treated as zero.If no default value is passed, then ensure that the property is not
NULL(that is, all columns exposed by that property in all the tables have aNOT NULLconstraint).Also, note the following:
- If a vertex or an edge input property is defined on only a subset of the vertex or edge tables respectively, then a default value must be provided.
- If the property is exposed as an expression (rather than a column) on any vertex or edge table, then a default value must be provided.
- If the property exposes a column (not an
expression), and that column is missing a
NOT NULLconstraint in any vertex or edge table, then a default value must be provided. - Default values cannot be provided for
OUTPUTproperties.
The following shows an example where the output rank vertex
property is specified using the PROPERTY
pseudo-operator:
PROPERTY(VERTEX OUTPUT rank)
Also, the property names can be quoted and follows all the rules of SQL quoted identifiers:
PROPERTY(VERTEX OUTPUT "RANK")andPROPERTY(VERTEX OUTPUT RANK)are equivalent.PROPERTY(VERTEX OUTPUT "Rank")andPROPERTY(VERTEX OUTPUT Rank)are not equivalent.PROPERTY(VERTEX OUTPUT Rank)andPROPERTY(VERTEX OUTPUT "RANK")are equivalent.
148.3 DBMS_OGA Security Model
Review the privileges required to run graph algorithm functions.
DBMS_OGA:
- You must have the
EXECUTEprivilege on theDBMS_OGAandDBMS_GAFpackages. These packages are owned bySYSand, by default,EXECUTEis granted toPUBLIC. - You must have the required privileges on the underlying graph object that you provide as input to a GAF, as described in Privileges to Query a SQL Property Graph .
When a graph algorithm function is invoked, it may issue recursive SQL statements. These recursive statements execute under the graph owner’s identity, and the graph owner’s privileges are used to parse and run them.
In addition, the graph algorithm function invoker must have sufficient temporary tablespace to execute the function. The amount of temporary tablespace required grows linearly with the total number of rows across all vertex tables in the graph. For BELLMAN_FORD specifically, it grows linearly with the total number of vertices and edges in the graph.
148.4 DBMS_OGA Examples
This section provides examples of running graph algorithms on a SQL property graph.
CREATE TABLE Cities (
cid NUMBER NOT NULL,
name VARCHAR2(100) NOT NULL,
population NUMBER,
CONSTRAINT pk_cities PRIMARY KEY (cid)
);
CREATE TABLE Residents (
rid NUMBER NOT NULL,
name VARCHAR2(50) NOT NULL,
age NUMBER,
lives_in_cid NUMBER NOT NULL,
CONSTRAINT pk_persons PRIMARY KEY (rid),
CONSTRAINT fk_persons_city FOREIGN KEY (lives_in_cid)
REFERENCES Cities (cid)
);
CREATE TABLE Knows (
res1_rid NUMBER NOT NULL,
res2_rid NUMBER NOT NULL,
CONSTRAINT pk_knows PRIMARY KEY (res1_rid, res2_rid),
CONSTRAINT fk_knows_p1 FOREIGN KEY (res1_rid) REFERENCES Residents (rid),
CONSTRAINT fk_knows_p2 FOREIGN KEY (res2_rid) REFERENCES Residents (rid)
);
CREATE TABLE Roads (
city1_cid NUMBER NOT NULL,
city2_cid NUMBER NOT NULL,
distance NUMBER NOT NULL,
CONSTRAINT pk_roads PRIMARY KEY (city1_cid, city2_cid),
CONSTRAINT fk_roads_c1 FOREIGN KEY (city1_cid) REFERENCES Cities (cid),
CONSTRAINT fk_roads_c2 FOREIGN KEY (city2_cid) REFERENCES Cities (cid)
);
INSERT INTO Cities (cid, name, population) VALUES (101, 'Austin', 980000);
INSERT INTO Cities (cid, name, population) VALUES (102, 'Dallas', 1300000);
INSERT INTO Cities (cid, name, population) VALUES (103, 'Houston', 2300000);
INSERT INTO Cities (cid, name, population) VALUES (104, 'San Antonio',1550000);
INSERT INTO Cities (cid, name, population) VALUES (105, 'El Paso', 680000);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (1, 'Mary', 29, 101);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (2, 'Bob', 35, 102);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (3, 'Tom', 27, 103);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (4, 'Alice', 41, 104);
INSERT INTO Residents (rid, name, age, lives_in_cid) VALUES (5, 'Ben', 33, 105);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (1, 2);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (2, 3);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (3, 4);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (4, 5);
INSERT INTO Knows (res1_rid, res2_rid) VALUES (5, 1);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (101, 102, 195);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (102, 103, 240);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (103, 104, 200);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (104, 105, 550);
INSERT INTO Roads (city1_cid, city2_cid, distance) VALUES (101, 103, 165);Then, create the addresses property graph using the following
CREATE PROPERTY GRAPH
statement.
CREATE PROPERTY GRAPH addresses
VERTEX TABLES (
Residents AS Resident
PROPERTIES (name, age),
Cities AS City
PROPERTIES (name, population)
)
EDGE TABLES (
Knows
SOURCE KEY (res1_rid) REFERENCES Resident (rid)
DESTINATION KEY (res2_rid) REFERENCES Resident (rid)
NO PROPERTIES,
Residents AS LivesIn
SOURCE KEY (rid) REFERENCES Resident (rid)
DESTINATION KEY (lives_in_cid) REFERENCES City (cid)
PROPERTIES (0 AS distance),
Roads AS ConnectedTo
SOURCE KEY (city1_cid) REFERENCES City (cid)
DESTINATION KEY (city2_cid) REFERENCES City (cid)
PROPERTIES (distance)
);You can run the following examples on the addresses SQL property
graph.
- Example 148-1: Bellman-Ford algorithm
- Example 148-2: PageRank algorithm
- Example 148-3: Personalized PageRank algorithm (for a single vertex)
- Example 148-4: Personalized PageRank algorithm (for a vertex set)
- Example 148-5: Weakly Connected Components algorithm
148.5 Summary of DBMS_OGA Subprograms
This section provides a list of the DBMS_OGA subprograms and briefly
describes each one.
Table 148-2 DBMS_OGA Subprograms
| Subprogram | Description |
|---|---|
BELLMAN_FORD |
The Bellman-Ford algorithm computes the shortest-path distances from a start vertex to all other vertices in a weighted directed graph. |
PAGERANK |
The PageRank algorithm computes an importance score for each vertex in the graph. |
PERSONALIZED_PAGERANK |
The Personalized PageRank algorithm is a variant of the classic PageRank. It computes personalized PageRank scores for vertices in a graph, based on a single vertex. |
PERSONALIZED_PAGERANK_SET |
The Personalized PageRank algorithm is a variant of the classic PageRank. It computes personalized PageRank scores for vertices in a graph, based on a vertex set. |
WCC |
The Weakly Connected Components (WCC) algorithm finds connected components in a graph. |
148.5.1 BELLMAN_FORD
You can use the DBMS_OGA.BELLMAN_FORD algorithm to find the
shortest path distances from a specified source vertex to all other vertices in the
graph.
Syntax
DBMS_OGA.BELLMAN_FORD (
g IN ORA_PROPERTY_GRAPH,
start_vertex IN JSON,
length_property IN ORA_EDGE_INPUT_PROPERTY,
dist IN ORA_VERTEX_OUTPUT_PROPERTY
) RETURN ORA_PROPERTY_GRAPH;Parameters
Table 148-3 BELLMAN_FORD Function Parameters
| Parameter | Description |
|---|---|
g |
Name of the property graph. |
start_vertex |
Specifies the graph vertex that should be used as the starting point for the Bellman-Ford algorithm. |
length_property |
References the edge property which represents the length of the edges in the graph. |
dist |
Represents the output vertex property name which stores the results of the graph algorithm. |
Usage Notes
- The JSON representation of the
start_vertexparameter can be constructed using theVERTEX_IDfunction. The following shows a sample JSON output of a vertex identifier:JSON('{"GRAPH_OWNER":"GRAPHUSER", "GRAPH_NAME":"ADDRESSES", "ELEM_TABLE":"CITY", "KEY_VALUE":{"CID":101}}') - The length of each edge in the graph is represented by
length_property. The expected data type for this property isBINARY_DOUBLE. SQL data types such asCHAR,VARCHAR2,NCHAR,NVARCHAR2,NUMBER,BINARY_FLOAT, andBINARY_DOUBLEcan be implicitly converted toBINARY_DOUBLE. Exceptions will be raised if the input property type cannot be converted toBINARY_DOUBLE.Also, note that
NULLand negative values are not supported forlength_property. - The
DBMS_OGA.BELLMAN_FORDfunction will create a new vertex property with a name as given to thedistargument to store the results of the algorithm. The data type for this new vertex property will beBINARY_DOUBLEand it is created on every vertex table of the output graph. Ensure that the given property name is unique and does not reference any existing vertex or edge property names of the input graph.Note that vertices that are not connected to the
start_vertexare assigned positive infinity.
Example 148-1 DBMS_OGA.BELLMAN_FORD Example
The following example query runs the
DBMS_OGA.BELLMAN_FORD algorithm on the
addresses graph and returns the computed shortest distance from
the City vertex with cid = 101
(Austin) to all other City
vertices.
SELECT *
FROM GRAPH_TABLE (
DBMS_OGA.BELLMAN_FORD(
addresses,
JSON('{"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"CITY","KEY_VALUE":{"CID":101}}'),
PROPERTY(EDGE INPUT distance DEFAULT ON NULL 0), PROPERTY(VERTEX OUTPUT dist))
MATCH (c IS city)
COLUMNS (c.name AS to_city, c.dist)
) ORDER BY dist ASC;
As seen in the preceding code, the query uses each edge's
distance property as the edge weight for the Bellman–Ford
algorithm. If an edge’s distance is null, it is treated as zero, as specified by
DEFAULT ON NULL 0. The function writes the computed distance
from the source to each reachable vertex into a property named
dist.
"TO_CITY" "DIST"
Austin 0.0
Houston 165.0
Dallas 195.0
San Antonio 365.0
El Paso 915.0
148.5.2 PAGERANK
You can use the DBMS_OGA.PAGERANK algorithm to measure the
relative importance of a vertex in a graph.
Syntax
DBMS_OGA.PAGERANK (
g IN ORA_PROPERTY_GRAPH,
rank IN ORA_VERTEX_OUTPUT_PROPERTY,
max_iter IN INTEGER,
tolerance IN BINARY_DOUBLE,
damping_factor IN BINARY_DOUBLE,
normalize IN BOOLEAN
) RETURN ORA_PROPERTY_GRAPH;Parameters
Table 148-4 PAGERANK Function Parameters
| Parameter | Description |
|---|---|
g |
Name of the property graph. |
rank |
Represents the output vertex property name which stores the rank value computed by the graph algorithm. |
max_iter |
Specifies the maximum number of iterations to be performed by the algorithm. |
tolerance |
Specifies the maximum error tolerated between two consecutive iterations. |
damping_factor |
Represents the damping factor for the computation of
the rank.
A common value is |
normalize |
Controls the behaviour of the total value of the rank
with respect to dangling vertices (that is, vertices that have no
outgoing neighbors).
|
Usage Notes
- The PageRank algorithm is iterative and the termination criteria
depends on the values of
toleranceandmax_iterparameters.The
toleranceargument specifies the maximum error tolerated between two consecutive iterations. At each iteration, the algorithm calculates the current error by summing, over all vertices, the absolute difference between each vertex’s rank in the previous iteration and its rank in the current iteration. Once this total value becomes lower than the tolerance value, the algorithm will stop. At this point, the algorithm is seen as having converged.The
max_iterparameter gives the maximum number of iterations the algorithm can perform before stopping. If the maximum number of iterations is reached before the algorithm converged, the algorithm will stop regardless. The maximum value for this argument is(231−1). - The
DBMS_OGA.PAGERANKfunction will create a new vertex property with a name as given to therankargument to store the results (rank values) of the algorithm. The data type for this new vertex property will beBINARY_DOUBLEand it is created on every vertex table of the output graph. Ensure that the given property name is unique and does not have the same name as any existing vertex or edge property name of the input graph.
Example 148-2 DBMS_OGA.PAGERANK Example
The following example runs the PageRank algorithm on the
addresses graph and then projects the name and rank for every
CITY in the graph.
SELECT name, rank
FROM GRAPH_TABLE (
DBMS_OGA.pagerank (
addresses,
PROPERTY(VERTEX OUTPUT rank), 10, 1.0, 0.85d, FALSE)
MATCH (c IS city)
COLUMNS (c.name, c.rank)
) ORDER BY rank DESC;
The example produces the following output:
"NAME" "RANK"
Houston 0.18500000000000003
El Paso 0.14250000000000002
San Antonio 0.14250000000000002
Dallas 0.1
Austin 0.05750000000000001
148.5.3 PERSONALIZED_PAGERANK
You can use the DBMS_OGA.PERSONALIZED_PAGERANK algorithm to
compute personalized PageRank scores for vertices in a graph, based on a single
vertex.
Syntax
DBMS_OGA.PERSONALIZED_PAGERANK (
g IN ORA_PROPERTY_GRAPH,
rank IN ORA_VERTEX_OUTPUT_PROPERTY,
input_vertex IN JSON,
max_iter IN INTEGER,
tolerance IN BINARY_DOUBLE,
damping_factor IN BINARY_DOUBLE,
normalize IN BOOLEAN
) RETURN ORA_PROPERTY_GRAPH;Parameters
Table 148-5 PERSONALIZED_PAGERANK Function Parameters
| Parameter | Description |
|---|---|
g |
Name of the property graph. |
rank |
Represents the output vertex property name which stores the rank value computed by the graph algorithm. |
input_vertex |
Represents the given input vertex used to personalize the PageRank computation. |
max_iter |
Specifies the maximum number of iterations to be performed by the algorithm. |
tolerance |
Specifies the maximum error tolerated between two consecutive iterations. |
damping_factor |
Represents the damping factor for the computation of
the rank.
A common value is
|
normalize |
Controls the behaviour of the total value of the rank
with respect to dangling vertices (that is, vertices that have no
outgoing neighbors).
|
Usage Notes
- The Personalized PageRank algorithm is a variant of the PageRank
algorithm. In this variant, the PageRank computation is personalized for a
single vertex.
The specified
input_vertexis used to personalize the PageRank computation. It is given more weight when computing the PageRank by initializing its rank value to 1 and initializing all other vertices to 0.The
input_vertexis represented as a JSON. For example:{"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"CITY","KEY_VALUE":{"CID":101}} - See also usage notes in PAGERANK for more information.
Example 148-3 DBMS_OGA.PERSONALIZED_PAGERANK Example
The following example computes the personalized PageRank score on the
addresses graph, giving importance to the input
RESIDENT vertex with rid = 3. It then projects
the name and rank for every RESIDENT vertex in the graph.
SELECT *
FROM GRAPH_TABLE (
DBMS_OGA.personalized_pagerank(
addresses,
PROPERTY(VERTEX OUTPUT rank),
JSON('{
"GRAPH_OWNER": "GRAPHUSER",
"GRAPH_NAME": "ADDRESSES",
"ELEM_TABLE": "RESIDENT",
"KEY_VALUE": { "RID": 3 }
}'),
10,
1.0,
0.85d,
FALSE
)
MATCH (r IS resident)
COLUMNS (
r.name,
r.rank
)
) ORDER BY rank DESC;
The example produces the following output:
"NAME" "RANK"
Tom 0.15000000000000002
Alice 0.06375
Bob 0.03262539062499999
Ben 0.02709375
Mary 0.01151484375
148.5.4 PERSONALIZED_PAGERANK_SET
You can use the DBMS_OGA.PERSONALIZED_PAGERANK_SET
algorithm to compute personalized PageRank scores for vertices in a graph, based on a set of
vertices within a graph.
Syntax
DBMS_OGA.PERSONALIZED_PAGERANK_SET (
g IN ORA_PROPERTY_GRAPH,
rank IN ORA_VERTEX_OUTPUT_PROPERTY,
input_vertices IN JSON,
max_iter IN INTEGER,
tolerance IN BINARY_DOUBLE,
damping_factor IN BINARY_DOUBLE,
normalize IN BOOLEAN
) RETURN ORA_PROPERTY_GRAPH;Parameters
Table 148-6 PERSONALIZED_PAGERANK_SET Function Parameters
| Parameter | Description |
|---|---|
g |
Name of the property graph. |
rank |
Represents the output vertex property name which stores the rank value computed by the graph algorithm. |
input_vertices |
Represents the given input vertex set used to personalize the PageRank computation. |
max_iter |
Specifies the maximum number of iterations to be performed by the algorithm. |
tolerance |
Specifies the maximum error tolerated between two consecutive iterations. |
damping_factor |
Represents the damping factor for the computation of
the rank.
A common value is
|
normalize |
Controls the behaviour of the total value of the rank
with respect to dangling vertices (that is, vertices that have no
outgoing neighbors).
|
Usage Notes
- The Personalized PageRank algorithm executed by the
personalized_pagerank_setfunction is a variant of the PageRank algorithm. In this variant, the PageRank computation is personalized for a set of vertices.The algorithm accepts a set of
input_verticesto personalize the PageRank computation. The vertices in this set will be given more weight when computing the PageRank by initializing their rank value to1/count(input_vertices), all other vertices being initialized to 0.The content of
input_verticesmust be a JSON array. For example:[ { "GRAPH_NAME": "ADDRESSES", "GRAPH_OWNER": "GRAPHUSER", "ELEM_TABLE": "RESIDENT", "KEY_VALUE": { "RID": 1 } }, { "GRAPH_NAME": "ADDRESSES", "GRAPH_OWNER": "GRAPHUSER", "ELEM_TABLE": "RESIDENT", "KEY_VALUE": { "RID": 2 } } ] - See also usage notes in PAGERANK for more information.
Example 148-4 DBMS_OGA.PERSONALIZED_PAGERANK_SET Example
The following example computes Personalized PageRank on
addresses graph, giving importance to all residents with
age > 30. It then projects the name and rank for every
RESIDENT vertex in the graph.
The example is executed in two steps as shown:
- Construct the set of vertices targeting all residents with
age > 30.SELECT JSON_ARRAYAGG (p_json) FROM GRAPH_TABLE( addresses MATCH (r IS Resident) WHERE r.age > 30 COLUMNS(VERTEX_ID(r) AS p_json) );The query produces the following output:
[{"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"RESIDENT","KEY_VALUE":{"RID":2}}, {"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"RESIDENT","KEY_VALUE":{"RID":4}}, {"GRAPH_OWNER":"GRAPHUSER","GRAPH_NAME":"ADDRESSES","ELEM_TABLE":"RESIDENT","KEY_VALUE":{"RID":5}} ] - Run the
DBMS_OGA.PERSONALIZED_PAGERANK_SETfunction to compute the personalized PageRank scores biased towards the set of vertices obtained in the previous step.SELECT * FROM GRAPH_TABLE ( DBMS_OGA.personalized_pagerank_set( addresses, PROPERTY(VERTEX OUTPUT rank), JSON('[ { "GRAPH_OWNER": "GRAPHUSER", "GRAPH_NAME": "ADDRESSES", "ELEM_TABLE": "RESIDENT", "KEY_VALUE": { "RID": 2 } }, { "GRAPH_OWNER": "GRAPHUSER", "GRAPH_NAME": "ADDRESSES", "ELEM_TABLE": "RESIDENT", "KEY_VALUE": { "RID": 4 } }, { "GRAPH_OWNER": "GRAPHUSER", "GRAPH_NAME": "ADDRESSES", "ELEM_TABLE": "RESIDENT", "KEY_VALUE": { "RID": 5 } } ]'), 50, 0.1, 0.85d, FALSE ) MATCH (r IS Resident) COLUMNS ( r.name, r.rank ) ) ORDER BY rank DESC;
The query ouput is as shown:
"NAME" "RANK"
Ben 0.07607621885986329
Bob 0.06439765696818034
Alice 0.06219064524943034
Mary 0.03304200375773112
Tom 0.027014198840332033
148.5.5 WCC
You can use the DBMS_OGA.WCC algorithm to find the
connected components in a graph.
Syntax
DBMS_OGA.WCC (
g IN ORA_PROPERTY_GRAPH,
comp_id IN ORA_VERTEX_OUTPUT_PROPERTY
) RETURN ORA_PROPERTY_GRAPH;Parameters
Table 148-7 WCC Function Parameters
| Parameter | Description |
|---|---|
g |
Name of the property graph. |
comp_id |
Represents the output vertex property name which stores the component ID for a vertex as computed by the algorithm. |
Usage Notes
The DBMS_OGA.WCC function creates a new vertex property
with the name as given to the comp_id argument to store the
component ID for a vertex. The data type for this new vertex
property will be INTEGER.
If two vertices are assigned the same component ID,
then they belong to the same weakly connected component. This means that there
exists a path between them in the underlying undirected graph (that is, when edge
directions are ignored). If the graph contains more than one component
ID value, it means there are non-connected components in the
graph (sets of vertices that are not connected to each other).
The component ID values produced by the algorithm are not guaranteed
to be the same between two different runs. Also, the component IDs
might not start at 0 or 1. If the algorithm discovers multiple connected components,
the IDs might not be contiguous. For example, the component
IDs can be 2, 5, and 17 instead of 2, 3, and 4.
Example 148-5 DBMS_OGA.WCC Example
The following example uses the DBMS_OGA.WCC function to
find the connected components in addresses graph. It then projects
the name and component ID for every CITY vertex in
the graph.
SELECT name,comp_id
FROM GRAPH_TABLE (
DBMS_OGA.wcc(
addresses,
PROPERTY(VERTEX OUTPUT comp_id)
)
MATCH (c IS city)
COLUMNS (
c.name AS name,
c.comp_id AS comp_id
)
);
The example produces the following output. All cities have
COMP_ID = 2, which indicates they are all connected.
NAME COMP_ID
Austin 2
Dallas 2
Houston 2
San Antonio 2
El Paso 2