7 Using the Machine Learning Library (PgxML) for Graphs

The in-memory graph server (PGX) provides a machine learning library oracle.pgx.api.mllib, which supports graph-empowered machine learning algorithms.

The following machine learning algorithms are currently supported:

  • DeepWalk
  • Supervised GraphWise
  • Pg2vec

7.1 Using the DeepWalk Algorithm

DeepWalk is a widely employed vertex representation learning algorithm used in industry.

It consists of two main steps:

  1. First, the random walk generation step computes random walks for each vertex (with a pre-defined walk length and a pre-defined number of walks per vertex).
  2. Second, these generated walks are fed to a Word2vec algorithm to generate the vector representation for each vertex (each vertex is a word in the input provided to the Word2vec algorithm). See the KDD paper for more details on the DeepWalk algorithm.
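
The random walk generation step can be illustrated with a small, self-contained Python sketch. It does not use the PGX API. The function name, the toy graph, and the choice to count walk length in vertices are assumptions made for illustration only.

```python
import random

def generate_walks(adjacency, walks_per_vertex, walk_length, seed=0):
    """Return a list of random walks; each walk is a list of vertex ids."""
    rng = random.Random(seed)
    walks = []
    for start in adjacency:
        for _ in range(walks_per_vertex):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Toy undirected graph stored as adjacency lists (hypothetical data).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = generate_walks(toy_graph, walks_per_vertex=6, walk_length=4)
```

Each walk then plays the role of a sentence, and each vertex the role of a word, in the input to Word2vec.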

DeepWalk creates vertex embeddings for a specific graph and cannot be updated to incorporate modifications on the graph. Instead, a new DeepWalk model should be trained on this modified graph. Lastly, it is important to note that the memory consumption of the DeepWalk model is O(2n*d) where n is the number of vertices in the graph and d is the embedding length.
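
As a rough worked example of the O(2n*d) estimate, the following Python sketch computes the number of stored values for the DBpedia graph. The embedding length of 100 and the 4 bytes per value are assumptions (the source states neither), and constant factors hidden by the O-notation are ignored.

```python
def deepwalk_value_count(num_vertices, embedding_length):
    """Number of stored values implied by the O(2n*d) estimate."""
    return 2 * num_vertices * embedding_length

n = 8_637_721          # vertices in the DBpedia example graph
d = 100                # assumed embedding length
BYTES_PER_VALUE = 4    # assumption: 32-bit floating point values

values = deepwalk_value_count(n, d)
gib = values * BYTES_PER_VALUE / 2**30
print(f"{values:,} values, about {gib:.1f} GiB")
```

Under these assumptions the estimate is about 6.4 GiB, which gives an idea of the memory to plan for on a graph of this size.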

The following sections describe the main DeepWalk functionality in the in-memory graph server (PGX), using the DBpedia graph, which has 8,637,721 vertices and 165,049,964 edges, as an example:

7.1.1 Loading a Graph

The following describes the steps for loading a graph:

  1. Create a Session and an Analyst.
    Creating a Session and an Analyst Using JShell
    cd /opt/oracle/graph/
    ./bin/opg-jshell
    // starting the shell will create an implicit session and analyst
    Creating a Session and an Analyst Using Java
    import oracle.pgx.api.*;
    import oracle.pgx.api.mllib.DeepWalkModel;
    import oracle.pgx.api.frames.*;
    ...
    PgxSession session = Pgx.createSession("my-session");
    Analyst analyst = session.createAnalyst();
    Creating a Session and an Analyst Using Python
    import pypgx
    session = pypgx.get_session(session_name="my-session")
    analyst = session.create_analyst()
  2. Load the graph.

    Note:

    Though the DeepWalk algorithm implementation can be applied to directed or undirected graphs, currently only undirected random walks are considered.
    Loading a graph using JShell
    opg-jshell> var graph = session.readGraphWithProperties("<path>/<graph.json>")
    Loading a graph using Java
    PgxGraph graph = session.readGraphWithProperties("<path>/<graph.json>");
    Loading a graph using Python
    graph = session.read_graph_with_properties("<path>/<graph.json>")

7.1.2 Building a Minimal DeepWalk Model

You can build a DeepWalk model using the minimal configuration and default hyper-parameters as described in the following code:

Building a Minimal DeepWalk Model Using JShell
opg-jshell> var model = analyst.deepWalkModelBuilder()
                .setWindowSize(3)
                .setWalksPerVertex(6)
                .setWalkLength(4)
                .build()
Building a Minimal DeepWalk Model Using Java
DeepWalkModel model = analyst.deepWalkModelBuilder()
    .setWindowSize(3)
    .setWalksPerVertex(6)
    .setWalkLength(4)
    .build();
Building a Minimal DeepWalk Model Using Python
model = analyst.deepwalk_builder(window_size=3,walks_per_vertex=6,walk_length=4)
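
To make the window size setting more concrete, the following Python sketch shows how a Word2vec-style algorithm could turn one walk into (center, context) training pairs. It is not the PGX implementation. It assumes that the window counts positions on each side of the center vertex, which is the common Word2vec convention, and the function name is hypothetical.

```python
def context_pairs(walk, window_size):
    """Return (center, context) pairs whose positions differ by at most window_size."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window_size)
        hi = min(len(walk), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs

walk = ["a", "b", "c", "d"]          # one walk of length 4
pairs = context_pairs(walk, window_size=3)
```

With a walk length of 4 and a window size of 3, every vertex in a walk is paired with every other vertex in the same walk.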

7.1.3 Building a Customized DeepWalk Model

You can build a DeepWalk model using customized hyper-parameters as described in the following code:

Building a Customized DeepWalk model Using JShell
opg-jshell> var model = analyst.deepWalkModelBuilder()
                .setMinWordFrequency(1)
                .setBatchSize(512)
                .setNumEpochs(1)
                .setLayerSize(100)
                .setLearningRate(0.05)
                .setMinLearningRate(0.0001)
                .setWindowSize(3)
                .setWalksPerVertex(6)
                .setWalkLength(4)
                .setSampleRate(0.00001)
                .setNegativeSample(2)
                .setValidationFraction(0.01)
                .build()
Building a Customized DeepWalk model Using Java
DeepWalkModel model = analyst.deepWalkModelBuilder()
    .setMinWordFrequency(1)
    .setBatchSize(512)
    .setNumEpochs(1)
    .setLayerSize(100)
    .setLearningRate(0.05)
    .setMinLearningRate(0.0001)
    .setWindowSize(3)
    .setWalksPerVertex(6)
    .setWalkLength(4)
    .setSampleRate(0.00001)
    .setNegativeSample(2)
    .setValidationFraction(0.01)
    .build();
Building a Customized DeepWalk model Using Python
model = analyst.deepwalk_builder(min_word_frequency=1,
                                batch_size=512,num_epochs=1,
                                layer_size=100,
                                learning_rate=0.05,
                                min_learning_rate=0.0001,
                                window_size=3,
                                walks_per_vertex=6,
                                walk_length=4,
                                sample_rate=0.00001,
                                negative_sample=2,
                                validation_fraction=0.01)

See DeepWalkModelBuilder in Javadoc for an explanation of each builder operation and its default value.

7.1.4 Training a DeepWalk Model

You can train a DeepWalk model with the specified default or customized settings as described in the following code:

Training a DeepWalk model Using JShell
opg-jshell> model.fit(graph)
Training a DeepWalk model Using Java
model.fit(graph);
Training a DeepWalk model Using Python
model.fit(graph)

7.1.5 Getting the Loss Value For a DeepWalk Model

You can fetch the loss value computed on the fraction of the training data that was set in the builder using setValidationFraction, as described in the following code:

Getting the Loss Value Using JShell
opg-jshell> var loss = model.getLoss()
Getting the Loss Value Using Java
double loss = model.getLoss();
Getting the Loss Value Using Python
loss = model.loss

7.1.6 Computing Similar Vertices for a Given Vertex

You can fetch the k most similar vertices for a given vertex as described in the following code:

Computing Similar Vertices for Given Vertex Using JShell
opg-jshell> var similars = model.computeSimilars("Albert_Einstein", 10)
opg-jshell> similars.print()
Computing Similar Vertices for Given Vertex Using Java
PgxFrame similars = model.computeSimilars("Albert_Einstein", 10);
similars.print();
Computing Similar Vertices for Given Vertex Using Python
similars = model.compute_similars("Albert_Einstein",10)
similars.print()
Searching for vertices similar to Albert_Einstein using the trained model results in the following output:
+-----------------------------------------+
| dstVertex          | similarity         |
+-----------------------------------------+
| Albert_Einstein    | 1.0000001192092896 |
| Physics            | 0.8664291501045227 |
| Werner_Heisenberg  | 0.8625140190124512 |
| Richard_Feynman    | 0.8496938943862915 |
| List_of_physicists | 0.8415523767471313 |
| Physicist          | 0.8384397625923157 |
| Max_Planck         | 0.8370327353477478 |
| Niels_Bohr         | 0.8340970873832703 |
| Quantum_mechanics  | 0.8331197500228882 |
| Special_relativity | 0.8280861973762512 |
+-----------------------------------------+
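
The source does not state which similarity measure is used. The following Python sketch assumes cosine similarity between embedding vectors, which is a common choice for Word2vec-style embeddings, and shows how a top-k lookup could work on hypothetical three-dimensional vectors. It does not use the PGX API.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(embeddings, query, k):
    """Return the k (vertex, similarity) pairs most similar to the query vertex."""
    q = embeddings[query]
    scored = [(name, cosine(q, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings; real ones come from the trained model.
embeddings = {
    "Albert_Einstein": [0.9, 0.1, 0.3],
    "Physics": [0.8, 0.2, 0.4],
    "Machine_learning": [0.1, 0.9, 0.2],
}
top = most_similar(embeddings, "Albert_Einstein", k=2)
```

As in the output above, the query vertex itself ranks first with a similarity of approximately 1.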

7.1.7 Computing Similar Vertices for a Vertex Batch

You can fetch the k most similar vertices for a list of input vertices as described in the following code:

Computing Similar Vertices for a Vertex Batch Using JShell
opg-jshell> var vertices = new ArrayList<String>()
opg-jshell> vertices.add("Machine_learning")
opg-jshell> vertices.add("Albert_Einstein")
opg-jshell> var batchedSimilars = model.computeSimilars(vertices, 10)
opg-jshell> batchedSimilars.print()
Computing Similar Vertices for a Vertex Batch Using Java
List<String> vertices = Arrays.asList("Machine_learning","Albert_Einstein");
PgxFrame batchedSimilars = model.computeSimilars(vertices,10);
batchedSimilars.print();
Computing Similar Vertices for a Vertex Batch Using Python
vertices = ["Machine_learning","Albert_Einstein"]
batched_similars = model.compute_similars(vertices,10)
batched_similars.print()
The output will be similar to the following example output:
+-------------------------------------------------------------------+
| srcVertex        | dstVertex                 | similarity         |
+-------------------------------------------------------------------+
| Machine_learning | Machine_learning          | 1.0000001192092896 |
| Machine_learning | Data_mining               | 0.9070799350738525 |
| Machine_learning | Computer_science          | 0.8963605165481567 |
| Machine_learning | Unsupervised_learning     | 0.8828719854354858 |
| Machine_learning | R_(programming_language)  | 0.8821185827255249 |
| Machine_learning | Algorithm                 | 0.8819515705108643 |
| Machine_learning | Artificial_neural_network | 0.8773092031478882 |
| Machine_learning | Data_analysis             | 0.8758628368377686 |
| Machine_learning | List_of_algorithms        | 0.8737979531288147 |
| Machine_learning | K-means_clustering        | 0.8715602159500122 |
| Albert_Einstein  | Albert_Einstein           | 1.0000001192092896 |
| Albert_Einstein  | Physics                   | 0.8664291501045227 |
| Albert_Einstein  | Werner_Heisenberg         | 0.8625140190124512 |
| Albert_Einstein  | Richard_Feynman           | 0.8496938943862915 |
| Albert_Einstein  | List_of_physicists        | 0.8415523767471313 |
| Albert_Einstein  | Physicist                 | 0.8384397625923157 |
| Albert_Einstein  | Max_Planck                | 0.8370327353477478 |
| Albert_Einstein  | Niels_Bohr                | 0.8340970873832703 |
| Albert_Einstein  | Quantum_mechanics         | 0.8331197500228882 |
| Albert_Einstein  | Special_relativity        | 0.8280861973762512 |
+-------------------------------------------------------------------+

7.1.8 Storing a Trained DeepWalk Model

You can store models in a database. Each model is stored as a row inside a model store table.

The following code shows how to store a trained DeepWalk model in a specific model store table in the database:

Storing a Trained DeepWalk Model Using JShell
opg-jshell> model.export().db() 
              .modelstore("modelstoretablename")  // name of the model store table
              .modelname("model")                 // model name (primary key of model store table)
              .description("a model description") // description to store alongside the model
              .store();
Storing a Trained DeepWalk Model Using Java
model.export().db()
    .modelstore("modelstoretablename")  // name of the model store table
    .modelname("model")                 // model name (primary key of model store table)
    .description("a model description") // description to store alongside the model
    .store();
Storing a Trained DeepWalk Model Using Python
model.export().db(model_store="modelstoretablename",
                  model_name="model")

Note:

All the above examples assume that you are storing the model in the database to which you are currently logged in. If you must store the model in a different database, then refer to the examples in Storing a Trained Model in Another Database.

7.1.8.1 Storing a Trained Model in Another Database

You can store models in a database other than the one used for login.

The following code shows how to store a trained model in a different database:

Storing a Trained Model Using JShell
opg-jshell> model.export().db() 
              .username("user")                   // DB user to use for storing the model
              .password("password")               // password of the DB user
              .jdbcUrl("jdbcUrl")                 // jdbc url to the DB
              .modelstore("modelstoretablename")  // name of the model store table
              .modelname("model")                 // model name (primary key of model store table)
              .description("a model description") // description to store alongside the model
              .store();
Storing a Trained Model Using Java
model.export().db()
    .username("user")                   // DB user to use for storing the model
    .password("password")               // password of the DB user
    .jdbcUrl("jdbcUrl")                 // jdbc url to the DB
    .modelstore("modelstoretablename")  // name of the model store table
    .modelname("model")                 // model name (primary key of model store table)
    .description("a model description") // description to store alongside the model
    .store();
Storing a Trained Model Using Python
model.export().db(username="user",
                  password="password",
                  model_store="modelstoretablename",
                  model_name="model",
                  jdbc_url="jdbc_url")

7.1.9 Loading a Pre-Trained DeepWalk Model

You can load models from a database.

The following code shows how to load a pre-trained DeepWalk model from a model store table in the database:

Loading a Pre-Trained DeepWalk Model Using JShell
opg-jshell> var model = analyst.loadDeepWalkModel().db()
                .modelstore("modeltablename") // name of the model store table
                .modelname("model")           // model name (primary key of model store table)
                .load();
Loading a Pre-Trained DeepWalk Model Using Java
DeepWalkModel model = analyst.loadDeepWalkModel().db()
     .modelstore("modeltablename") // name of the model store table
     .modelname("model")           // model name (primary key of model store table)
     .load();
Loading a Pre-Trained DeepWalk Model Using Python
model = analyst.get_deepwalk_model_loader().db(model_store="modelstoretablename",
                                               model_name="model")

Note:

All the above examples assume that you are loading the model from the database to which you are currently logged in. If you must load the model from a different database, then refer to the examples in Loading a Pre-Trained Model From Another Database.

7.1.9.1 Loading a Pre-Trained Model From Another Database

You can load models from a database other than the one used for login.

The following code shows how to load a pre-trained model from a model store table in a different database:

Loading a Pre-Trained Model Using JShell
opg-jshell> var model = analyst.<modelLoader>.db()
                .username("user")             // DB user to use for loading the model
                .password("password")         // password of the DB user
                .jdbcUrl("jdbcUrl")           // jdbc url to the DB
                .modelstore("modeltablename") // name of the model store table
                .modelname("model")           // model name (primary key of model store table)
                .load();
where <modelLoader> applies as follows:
  • loadDeepWalkModel(): Loads a DeepWalk model
  • loadSupervisedGraphWiseModel(): Loads a GraphWise model
  • loadPg2vecModel(): Loads a Pg2vec model
Loading a Pre-Trained Model Using Java
DeepWalkModel model = analyst.<modelLoader>.db()
     .username("user")             // DB user to use for loading the model
     .password("password")         // password of the DB user
     .jdbcUrl("jdbcUrl")           // jdbc url to the DB
     .modelstore("modeltablename") // name of the model store table
     .modelname("model")           // model name (primary key of model store table)
     .load();
where <modelLoader> applies as follows:
  • loadDeepWalkModel(): Loads a DeepWalk model
  • loadSupervisedGraphWiseModel(): Loads a GraphWise model
  • loadPg2vecModel(): Loads a Pg2vec model
Loading a Pre-Trained Model Using Python
model = analyst.<modelLoader>.db(username="user",
                                 password="password",
                                 model_store="modelstoretablename",
                                 model_name="model",
                                 jdbc_url="jdbc_url")
where <modelLoader> applies as follows:
  • get_deepwalk_model_loader(): Loads a DeepWalk model
  • get_pg2vec_model_loader(): Loads a Pg2vec model

7.1.10 Destroying a DeepWalk Model

You can destroy a DeepWalk model as described in the following code:

Destroying a DeepWalk Model Using JShell
opg-jshell> model.destroy()
Destroying a DeepWalk Model Using Java
model.destroy();
Destroying a DeepWalk Model Using Python
model.destroy()

7.2 Using the Supervised GraphWise Algorithm

Supervised GraphWise is an inductive vertex representation learning algorithm which is able to leverage vertex feature information. It can be applied to a wide variety of tasks, including vertex classification and link prediction.

Supervised GraphWise is based on GraphSAGE by Hamilton et al.

Model Structure

A Supervised GraphWise model consists of two graph convolutional layers followed by several prediction layers.

The forward pass through a convolutional layer for a vertex proceeds as follows:

  1. A set of neighbors of the vertex is sampled.
  2. The previous layer representations of the neighbors are mean-aggregated, and the aggregated features are concatenated with the previous layer representation of the vertex.
  3. This concatenated vector is multiplied with weights, and a bias vector is added.
  4. The result is normalized such that the layer output has unit norm.

The prediction layers are standard neural network layers.
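
The four forward-pass steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the PGX implementation: neighbors are sampled with replacement, the aggregated vector is placed before the vertex's own vector in the concatenation, no activation function is applied, and all names and values are hypothetical.

```python
import math
import random

def conv_layer_forward(vertex, neighbors, prev, weights, bias, num_samples, rng):
    """One convolutional-layer forward pass for a single vertex."""
    # 1. Sample neighbors (with replacement, an assumption made for simplicity).
    sampled = [rng.choice(neighbors[vertex]) for _ in range(num_samples)]
    # 2. Mean-aggregate the neighbors' previous-layer vectors, then concatenate
    #    the result with the vertex's own previous-layer vector.
    dim = len(prev[vertex])
    mean = [sum(prev[u][i] for u in sampled) / num_samples for i in range(dim)]
    concat = mean + prev[vertex]
    # 3. Multiply by the weight matrix and add the bias vector.
    out = [sum(w * x for w, x in zip(row, concat)) + b
           for row, b in zip(weights, bias)]
    # 4. Normalize so that the output has unit norm.
    norm = math.sqrt(sum(x * x for x in out))
    return [x / norm for x in out] if norm > 0 else out

# Hypothetical toy inputs: 3 vertices, 2-d input vectors, 3-d layer output.
neighbors = {0: [1, 2], 1: [0], 2: [0]}
prev = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
weights = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.1, 0.2], [0.3, 0.3, 0.3, 0.3]]
bias = [0.1, 0.0, -0.1]

output = conv_layer_forward(0, neighbors, prev, weights, bias,
                            num_samples=25, rng=random.Random(0))
```

Because the weight matrix has one row per output dimension and one column per concatenated input value, its width is twice the previous layer's dimension.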

The following describes the usage of the main functionalities of the implementation of GraphSAGE in PGX using the Cora graph as an example:

7.2.1 Loading a Graph

The following describes the steps for loading a graph:

  1. Create a Session and an Analyst.
    Creating a Session and an Analyst Using JShell
    cd /opt/oracle/graph/
    ./bin/opg-jshell
    // starting the shell will create an implicit session and analyst
    Creating a Session and an Analyst Using Java
    import oracle.pgx.api.*;
    import oracle.pgx.api.mllib.SupervisedGraphWiseModel;
    import oracle.pgx.api.frames.*;
    import oracle.pgx.config.mllib.ActivationFunction;
    import oracle.pgx.config.mllib.GraphWiseConvLayerConfig;
    import oracle.pgx.config.mllib.GraphWisePredictionLayerConfig;
    import oracle.pgx.config.mllib.SupervisedGraphWiseModelConfig;
    import oracle.pgx.config.mllib.WeightInitScheme;
    PgxSession session = Pgx.createSession("my-session");
    Analyst analyst = session.createAnalyst();
  2. Load the graph.
    Loading a graph using JShell
    opg-jshell> var fullGraph = session.readGraphWithProperties("<path>/<full_graph.json>")
    opg-jshell> var trainGraph = session.readGraphWithProperties("<path>/<train_graph.json>")
    opg-jshell> var testVertices = fullGraph.getVertices()
                    .stream()
                    .filter(v -> !trainGraph.hasVertex(v.getId()))
                    .collect(Collectors.toList());
    Loading a graph using Java
    PgxGraph fullGraph = session.readGraphWithProperties("<path>/<full_graph.json>");
    PgxGraph trainGraph = session.readGraphWithProperties("<path>/<train_graph.json>");
    List<PgxVertex> testVertices = fullGraph.getVertices()
        .stream()
        .filter(v->!trainGraph.hasVertex(v.getId()))
        .collect(Collectors.toList());

7.2.2 Building a Minimal GraphWise Model

You can build a GraphWise model using the minimal configuration and default hyper-parameters as described in the following code:

Building a Minimal GraphWise Model Using JShell
opg-jshell> var model = analyst.supervisedGraphWiseModelBuilder()
                .setVertexInputPropertyNames("features")
                .setVertexTargetPropertyName("labels")
                .build()
Building a Minimal GraphWise Model Using Java
SupervisedGraphWiseModel model = analyst.supervisedGraphWiseModelBuilder()
    .setVertexInputPropertyNames("features")
    .setVertexTargetPropertyName("labels")
    .build();

Note:

Even though only one feature property is specified in the above example, you can specify arbitrarily many.

7.2.3 Advanced Hyperparameter Customization

You can build a GraphWise model using rich hyperparameter customization.

This is done through the following two sub-config classes:

  1. GraphWiseConvLayerConfig
  2. GraphWisePredictionLayerConfig

The following code shows how to configure a GraphWise model using the above classes:

Building a Customized GraphWise Model Using JShell
opg-jshell> var weightProperty = analyst.pagerank(trainGraph).getName()
opg-jshell> var convLayerConfig = analyst.graphWiseConvLayerConfigBuilder()
                .setNumSampledNeighbors(25)
                .setActivationFunction(ActivationFunction.TANH)
                .setWeightInitScheme(WeightInitScheme.XAVIER)
                .setWeightedAggregationProperty(weightProperty)
                .build()
opg-jshell> var predictionLayerConfig = analyst.graphWisePredictionLayerConfigBuilder()
                .setHiddenDimension(32)
                .setActivationFunction(ActivationFunction.RELU)
                .setWeightInitScheme(WeightInitScheme.HE)
                .build()
opg-jshell> var model = analyst.supervisedGraphWiseModelBuilder()
                .setVertexInputPropertyNames("features")
                .setVertexTargetPropertyName("labels")
                .setConvLayerConfigs(convLayerConfig)
                .setPredictionLayerConfigs(predictionLayerConfig)
                .build()
Building a Customized GraphWise Model Using Java
String weightProperty = analyst.pagerank(trainGraph).getName();
GraphWiseConvLayerConfig convLayerConfig = analyst.graphWiseConvLayerConfigBuilder()
    .setNumSampledNeighbors(25)
    .setActivationFunction(ActivationFunction.TANH)
    .setWeightInitScheme(WeightInitScheme.XAVIER)
    .setWeightedAggregationProperty(weightProperty)
    .build();

GraphWisePredictionLayerConfig predictionLayerConfig = analyst.graphWisePredictionLayerConfigBuilder()
    .setHiddenDimension(32)
    .setActivationFunction(ActivationFunction.RELU)
    .setWeightInitScheme(WeightInitScheme.HE)
    .build();

SupervisedGraphWiseModel model = analyst.supervisedGraphWiseModelBuilder()
    .setVertexInputPropertyNames("features")
    .setVertexTargetPropertyName("labels")
    .setConvLayerConfigs(convLayerConfig)
    .setPredictionLayerConfigs(predictionLayerConfig)
    .build();

See SupervisedGraphWiseModelBuilder, GraphWiseConvLayerConfigBuilder and GraphWisePredictionLayerConfigBuilder in Javadoc for a full description of all available hyperparameters and their default values.

7.2.4 Training a Supervised GraphWise Model

You can train a Supervised GraphWise model on a graph as described in the following code:

Training a GraphWise Model Using JShell
opg-jshell> model.fit(trainGraph)
Training a GraphWise Model Using Java
model.fit(trainGraph);

7.2.5 Getting the Loss Value For a Supervised GraphWise Model

You can fetch the training loss value as described in the following code:

Getting the Loss Value Using JShell
opg-jshell> var loss = model.getTrainingLoss()
Getting the Loss Value Using Java
double loss = model.getTrainingLoss();

7.2.6 Inferring the Vertex Labels for a Supervised GraphWise Model

You can infer the labels for vertices on any graph (including vertices or graphs that were not seen during training) as described in the following code:

Inferring the Vertex Labels Using JShell
opg-jshell> var labels = model.inferLabels(fullGraph, testVertices)
opg-jshell> labels.head().print()
Inferring the Vertex Labels Using Java
PgxFrame labels = model.inferLabels(fullGraph,testVertices);
labels.head().print();
The output will be similar to the following example output:
+----------------------------------+
| vertexId | label                 |
+----------------------------------+
| 2        | Neural Networks       |
| 6        | Theory                |
| 7        | Case Based            |
| 22       | Rule Learning         |
| 30       | Theory                |
| 34       | Neural Networks       |
| 47       | Case Based            |
| 48       | Probabalistic Methods |
| 50       | Theory                |
| 52       | Theory                |
+----------------------------------+

Similarly, you can also get the model confidence for each class by inferring the prediction logits as described in the following code:

Getting the Model Confidence Using JShell
opg-jshell> var logits = model.inferLogits(fullGraph, testVertices)
opg-jshell> logits.head().print()
Getting the Model Confidence Using Java
PgxFrame logits = model.inferLogits(fullGraph,testVertices);
logits.head().print();
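
Logits are unnormalized per-class scores. The source does not describe how to turn them into probabilities. One common approach, assumed here, is to apply a softmax, as in the following Python sketch with hypothetical logit values.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.5, -1.0]   # hypothetical logits for three classes
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the most likely class
```

The class with the largest logit is also the class with the largest probability, so the predicted label does not change.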

7.2.7 Evaluating the Supervised GraphWise Model Performance

You can evaluate various classification metrics for the model using the evaluateLabels method as described in the following code:

Evaluating the Supervised GraphWise Model Performance Using JShell
opg-jshell> model.evaluateLabels(fullGraph, testVertices).print()
Evaluating the Supervised GraphWise Model Performance Using Java
model.evaluateLabels(fullGraph,testVertices).print();
The output will be similar to the following example output:
+------------------------------------------+
| Accuracy | Precision | Recall | F1-Score |
+------------------------------------------+
| 0.8488   | 0.8523    | 0.831  | 0.8367   |
+------------------------------------------+
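
For reference, the four reported metrics can be computed from true and predicted labels as in the following Python sketch. The source does not say how precision, recall, and F1-score are averaged over classes. Macro averaging is assumed here, and the label lists are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

y_true = ["Theory", "Theory", "Case Based", "Neural Networks"]
y_pred = ["Theory", "Case Based", "Case Based", "Neural Networks"]
metrics = classification_metrics(y_true, y_pred)
```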

7.2.8 Inferring Embeddings for a Supervised GraphWise Model

You can use a trained model to infer embeddings for unseen nodes and store them in the database as described in the following code:

Inferring Embeddings Using JShell
opg-jshell> var vertexVectors = model.inferEmbeddings(fullGraph, fullGraph.getVertices()).flattenAll()
opg-jshell> vertexVectors.write()
    .db()
    .username("user")            // DB user
    .password("password")        // password of the DB user
    .jdbcUrl("jdbcUrl")          // jdbc url to the DB
    .name("vertex vectors")
    .tablename("vertexVectors")  // indicate the name of the table in which the data should be stored
    .overwrite(true)             // indicate that if there is a table with the same name, it will be overwritten (truncated)
    .store()
Inferring Embeddings Using Java
PgxFrame vertexVectors = model.inferEmbeddings(fullGraph,fullGraph.getVertices()).flattenAll();
vertexVectors.write()
    .db()
    .username("user")           // DB user
    .password("password")       // password of the DB user
    .jdbcUrl("jdbcUrl")         // jdbc url to the DB
    .name("vertex vectors")
    .tablename("vertexVectors") // indicate the name of the table in which the data should be stored
    .overwrite(true)            // indicate that if there is a table with the same name, it will be overwritten (truncated)
    .store();
Without flattening, vertexVectors has the following schema (flattenAll splits the vector column into separate double-valued columns):
+---------------------------------------------------------------+
| vertexId                                | embedding           |
+---------------------------------------------------------------+
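
The effect of flattening can be pictured with the following Python sketch, which splits a vector-valued column into one column per component. It operates on plain dictionaries rather than a PgxFrame, and the naming of the flattened columns (embedding_0, embedding_1, and so on) is a hypothetical choice, not necessarily what PGX produces.

```python
def flatten_rows(rows, vector_column):
    """Replace a list-valued column with one scalar column per component."""
    flat = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != vector_column}
        for i, value in enumerate(row[vector_column]):
            new_row[f"{vector_column}_{i}"] = value  # hypothetical column name
        flat.append(new_row)
    return flat

rows = [{"vertexId": 2, "embedding": [0.1, -0.4, 0.7]}]   # hypothetical data
flat = flatten_rows(rows, "embedding")
```

A flat layout of this kind maps directly onto a relational table with one numeric column per embedding dimension.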

7.2.9 Storing a Trained Supervised GraphWise Model

You can store models in a database. Each model is stored as a row inside a model store table.

The following code shows how to store a trained Supervised GraphWise model in a specific model store table in the database:

Storing a Trained Supervised GraphWise Model Using JShell
opg-jshell> model.export().db() 
              .modelstore("modelstoretablename")  // name of the model store table
              .modelname("model")                 // model name (primary key of model store table)
              .description("a model description") // description to store alongside the model
              .store();
Storing a Trained Supervised GraphWise Model Using Java
model.export().db()
    .modelstore("modelstoretablename")  // name of the model store table
    .modelname("model")                 // model name (primary key of model store table)
    .description("a model description") // description to store alongside the model
    .store();

Note:

All the above examples assume that you are storing the model in the database to which you are currently logged in. If you must store the model in a different database, then refer to the examples in Storing a Trained Model in Another Database.

7.2.10 Loading a Pre-Trained Supervised GraphWise Model

You can load models from a database.

The following code shows how to load a pre-trained Supervised GraphWise model from a model store table in the database:

Loading a Pre-Trained Supervised GraphWise Model Using JShell
opg-jshell> var model = analyst.loadSupervisedGraphWiseModel().db()
                .modelstore("modeltablename") // name of the model store table
                .modelname("model")           // model name (primary key of model store table)
                .load();
Loading a Pre-Trained Supervised GraphWise Model Using Java
SupervisedGraphWiseModel model = analyst.loadSupervisedGraphWiseModel().db()
     .modelstore("modeltablename") // name of the model store table
     .modelname("model")           // model name (primary key of model store table)
     .load();

Note:

All the above examples assume that you are loading the model from the database to which you are currently logged in. If you must load the model from a different database, then refer to the examples in Loading a Pre-Trained Model From Another Database.

7.2.11 Destroying a Supervised GraphWise Model

You can destroy a GraphWise model as described in the following code:

Destroying a GraphWise Model Using JShell
opg-jshell> model.destroy()
Destroying a GraphWise Model Using Java
model.destroy();

7.3 Using the Pg2vec Algorithm

Pg2vec learns representations of graphlets (partitions inside a graph) by employing edges as the principal learning units and thereby packing more information in each learning unit (as compared to employing vertices as learning units) for the representation learning task.

It consists of three main steps:

  1. Random walks for each vertex (with a pre-defined length per walk and a pre-defined number of walks per vertex) are generated.
  2. Each edge in this random walk is mapped as a property edge-word in the created document (with the document label as the graph-id) where the property edge-word is defined as the concatenation of the properties of the source and destination vertices.
  3. The generated documents (with their attached document labels) are fed to a doc2vec algorithm which generates the vector representation for each document, which is a graph in this case.
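
The first two steps above can be sketched in plain Python as follows. The sketch builds one document per graphlet, where each word is formed from the properties of the source and destination vertices of a walked edge. It does not use the PGX API. The underscore separator, the toy data, and the choice to count walk length in vertices are assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_documents(adjacency, vertex_props, graph_ids,
                    walks_per_vertex, walk_length, seed=0):
    """Map each graphlet id to the list of edge-words seen on its random walks."""
    rng = random.Random(seed)
    documents = defaultdict(list)
    for start in adjacency:
        for _ in range(walks_per_vertex):
            current = start
            # A walk over walk_length vertices traverses walk_length - 1 edges.
            for _ in range(walk_length - 1):
                if not adjacency[current]:
                    break
                nxt = rng.choice(adjacency[current])
                # Edge-word: source property concatenated with destination property.
                word = vertex_props[current] + "_" + vertex_props[nxt]
                documents[graph_ids[start]].append(word)
                current = nxt
    return dict(documents)

# Two hypothetical graphlets: a path 0-1-2 (graph 1) and an edge 3-4 (graph 2).
adjacency = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
vertex_props = {0: "C", 1: "N", 2: "C", 3: "O", 4: "C"}
graph_ids = {0: 1, 1: 1, 2: 1, 3: 2, 4: 2}

docs = build_documents(adjacency, vertex_props, graph_ids,
                       walks_per_vertex=5, walk_length=8)
```

Each resulting document, labeled with its graph id, is the kind of input that a doc2vec algorithm consumes in the third step.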

Pg2vec creates graphlet embeddings for a specific set of graphlets and cannot be updated to incorporate modifications on these graphlets. Instead, a new Pg2vec model should be trained on these modified graphlets.

The memory consumption of a Pg2vec model is:
O(2(n+m)*d)
where:
  • n: the number of vertices in the graph
  • m: the number of graphlets in the graph
  • d: the embedding length
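
As a rough worked example of this estimate, the following Python sketch evaluates the formula. Only the graphlet count of 4127 comes from the NCI109 example. The vertex count, the embedding length of 200, and the 4 bytes per value are hypothetical values chosen for illustration, and constant factors hidden by the O-notation are ignored.

```python
def pg2vec_value_count(num_vertices, num_graphlets, embedding_length):
    """Number of stored values implied by the O(2(n+m)*d) estimate."""
    return 2 * (num_vertices + num_graphlets) * embedding_length

m = 4127               # graphlets in the NCI109 example
n = 120_000            # hypothetical total vertex count
d = 200                # assumed embedding length
BYTES_PER_VALUE = 4    # assumption: 32-bit floating point values

values = pg2vec_value_count(n, m, d)
mib = values * BYTES_PER_VALUE / 2**20
print(f"{values:,} values, about {mib:.0f} MiB")
```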

The following describes the usage of the main functionalities of the implementation of Pg2vec in PGX, using the NCI109 dataset, which contains 4127 graphs, as an example:

7.3.1 Loading a Graph

The following describes the steps for loading a graph:

  1. Create a Session and an Analyst.
    Creating a Session and an Analyst Using JShell
    cd /opt/oracle/graph/
    ./bin/opg-jshell
    // starting the shell will create an implicit session and analyst
    Creating a Session and an Analyst Using Java
    import oracle.pgx.api.*;
    import oracle.pgx.api.mllib.Pg2vecModel;
    import oracle.pgx.api.frames.*;
    ...
    PgxSession session = Pgx.createSession("my-session");
    Analyst analyst = session.createAnalyst();
    Creating a Session and an Analyst Using Python
    import pypgx
    session = pypgx.get_session(session_name="my-session")
    analyst = session.create_analyst()
  2. Load the graph.
    Loading a graph using JShell
    opg-jshell> var graph = session.readGraphWithProperties("<path>/<graph.json>")
    Loading a graph using Java
    PgxGraph graph = session.readGraphWithProperties("<path>/<graph.json>");
    Loading a graph using Python
    graph = session.read_graph_with_properties("<path>/<graph.json>")

7.3.2 Building a Minimal Pg2vec Model

You can build a Pg2vec model using the minimal configuration and default hyper-parameters as described in the following code:

Building a Minimal Pg2vec Model Using JShell
opg-jshell> var model = analyst.pg2vecModelBuilder()
                .setGraphLetIdPropertyName("graph_id")
                .setVertexPropertyNames(Arrays.asList("category"))
                .setWindowSize(4)
                .setWalksPerVertex(5)
                .setWalkLength(8)
                .build()
Building a Minimal Pg2vec Model Using Java
Pg2vecModel model = analyst.pg2vecModelBuilder()
    .setGraphLetIdPropertyName("graph_id")
    .setVertexPropertyNames(Arrays.asList("category"))
    .setWindowSize(4)
    .setWalksPerVertex(5)
    .setWalkLength(8)
    .build();
    
Building a Minimal Pg2vec Model Using Python
model = analyst.pg2vec_model_builder(
    graph_let_id_property_name="graph_id",
    vertex_property_names=["category"],
    window_size=4,
    walks_per_vertex=5,
    walk_length=8)

Use the Pg2vecModelBuilder#setGraphLetIdPropertyName operation to specify the vertex property that identifies the graphlet each vertex belongs to, and the Pg2vecModelBuilder#setVertexPropertyNames operation to specify the vertex properties that Pg2vec employs.

You can also use the weakly connected component (WCC) functionality in PGX to determine the graphlets in a given graph.
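
The idea behind using WCC to derive graphlets can be sketched in plain Python: every weakly connected component becomes one graphlet, and the component identifier serves as the graphlet ID property. This is a minimal union-find illustration on a hypothetical edge list, not the PGX WCC API.

```python
from typing import Dict, Iterable, List, Tuple


def weakly_connected_components(vertices: Iterable[int],
                                edges: List[Tuple[int, int]]) -> Dict[int, int]:
    """Map each vertex to a component id, ignoring edge direction."""
    parent = {v: v for v in vertices}

    def find(v: int) -> int:
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            # Use the smaller vertex id as the component representative.
            parent[max(root_src, root_dst)] = min(root_src, root_dst)

    return {v: find(v) for v in parent}


# Hypothetical graph with two components: {1, 2, 3} and {4, 5}.
graph_id = weakly_connected_components(
    vertices=[1, 2, 3, 4, 5],
    edges=[(1, 2), (3, 2), (4, 5)],
)
print(graph_id)
```

The resulting mapping plays the role of the "graph_id" property used in the builder examples above.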

7.3.3 Building a Customized Pg2vec Model

You can build a Pg2vec model using customized hyper-parameters as described in the following code:

Building a Customized Pg2vec model Using JShell
opg-jshell> var model = analyst.pg2vecModelBuilder()
                .setGraphLetIdPropertyName("graph_id")
                .setVertexPropertyNames(Arrays.asList("category"))
                .setMinWordFrequency(1)
                .setBatchSize(128)
                .setNumEpochs(5)
                .setLayerSize(200)
                .setLearningRate(0.04)
                .setMinLearningRate(0.0001)
                .setWindowSize(4)
                .setWalksPerVertex(5)
                .setWalkLength(8)
                .setUseGraphletSize(true)
                .setValidationFraction(0.05)
                .setGraphletSizePropertyName("<propertyName>")
                .build()
Building a Customized Pg2vec model Using Java
Pg2vecModel model = analyst.pg2vecModelBuilder()
    .setGraphLetIdPropertyName("graph_id")
    .setVertexPropertyNames(Arrays.asList("category"))
    .setMinWordFrequency(1)
    .setBatchSize(128)
    .setNumEpochs(5)
    .setLayerSize(200)
    .setLearningRate(0.04)
    .setMinLearningRate(0.0001)
    .setWindowSize(4)
    .setWalksPerVertex(5)
    .setWalkLength(8)
    .setUseGraphletSize(true)
    .setValidationFraction(0.05)
    .setGraphletSizePropertyName("<propertyName>")
    .build();
Building a Customized Pg2vec model Using Python
model = analyst.pg2vec_model_builder(
    graph_let_id_property_name = "graph_id",
    vertex_property_names = ["category"],
    min_word_frequency = 1,
    batch_size = 128,
    num_epochs = 5,
    layer_size = 200,
    learning_rate = 0.04,
    min_learning_rate = 0.0001,
    window_size = 4,
    walks_per_vertex = 5,
    walk_length = 8,
    use_graphlet_size = True,
    graphlet_size_property_name = "<property_name>",
    validation_fraction = 0.05)
    

See Pg2vecModelBuilder in the Javadoc for an explanation of each builder operation and its default value.

7.3.4 Training a Pg2vec Model

You can train a Pg2vec model with the specified default or customized settings as described in the following code:

Training a Pg2vec Model Using JShell
opg-jshell> model.fit(graph)
Training a Pg2vec Model Using Java
model.fit(graph);
Training a Pg2vec Model Using Python
model.fit(graph)

7.3.5 Getting the Loss Value For a Pg2vec Model

You can fetch the training loss value computed on a specified fraction of the training data (set in the builder using setValidationFraction) as described in the following code:

Getting the Loss Value Using JShell
opg-jshell> var loss = model.getLoss()
Getting the Loss Value Using Java
double loss = model.getLoss();
Getting the Loss Value Using Python
loss = model.loss

7.3.6 Computing Similar Graphlets for a Given Graphlet

You can fetch the k most similar graphlets for a given graphlet as described in the following code:

Computing Similar Graphlets for Given Graphlet Using JShell
opg-jshell> var similars = model.computeSimilars(52, 10)
Computing Similar Graphlets for Given Graphlet Using Java
PgxFrame similars = model.computeSimilars(52, 10);
Computing Similar Graphlets for Given Graphlet Using Python
similars = model.compute_similars(52, 10) 
Searching for the graphlets most similar to the graphlet with ID = 52 using the trained model, and printing the result with similars.print(), produces the following output:
+----------------------------------+
| dstGraphlet | similarity         |
+----------------------------------+
| 52          | 1.0                |
| 10          | 0.8748674392700195 |
| 23          | 0.8551455140113831 |
| 26          | 0.8493421673774719 |
| 47          | 0.8411962985992432 |
| 25          | 0.8281504511833191 |
| 43          | 0.8202780485153198 |
| 24          | 0.8179885745048523 |
| 8           | 0.796689510345459  |
| 9           | 0.7947834134101868 |
+----------------------------------+
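
A top-k similarity query of this kind can be sketched in plain Python. The sketch assumes cosine similarity, which is consistent with a graphlet having similarity 1.0 to itself, although the measure used by PGX is not stated here. The embeddings and the helper names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k_similar(embeddings: Dict[int, List[float]],
                  query_id: int,
                  k: int) -> List[Tuple[int, float]]:
    """Return the k graphlets most similar to query_id, including itself."""
    query = embeddings[query_id]
    scored = [(gid, cosine(query, vec)) for gid, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional embeddings for three graphlets.
embeddings = {52: [1.0, 0.0], 10: [0.9, 0.1], 23: [0.0, 1.0]}
result = top_k_similar(embeddings, query_id=52, k=2)
print(result)
```

As in the output above, the query graphlet itself ranks first, so k = 10 yields nine other graphlets.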

The following depicts the visualization of two similar graphlets (top: ID = 52 and bottom: ID = 10):

7.3.7 Computing Similars for a Graphlet Batch

You can fetch the k most similar graphlets for a batch of input graphlets as described in the following code:

Computing Similar Graphlets for a Graphlet Batch Using JShell
opg-jshell> var graphlets = new ArrayList()
opg-jshell> graphlets.add(52)
opg-jshell> graphlets.add(41)
opg-jshell> var batchedSimilars = model.computeSimilars(graphlets, 10)
Computing Similar Graphlets for a Graphlet Batch Using Java
List graphlets = Arrays.asList(52,41);
PgxFrame batchedSimilars = model.computeSimilars(graphlets,10);
Computing Similar Graphlets for a Graphlet Batch Using Python
batched_similars = model.compute_similars([52,41],10)
Searching for the graphlets most similar to the graphlets with ID = 52 and ID = 41 using the trained model, and printing the result with batched_similars.print(), produces the following output:
+------------------------------------------------+
| srcGraphlet | dstGraphlet | similarity         |
+------------------------------------------------+
| 52          | 52          | 1.0                |
| 52          | 10          | 0.8748674392700195 |
| 52          | 23          | 0.8551455140113831 |
| 52          | 26          | 0.8493421673774719 |
| 52          | 47          | 0.8411962985992432 |
| 52          | 25          | 0.8281504511833191 |
| 52          | 43          | 0.8202780485153198 |
| 52          | 24          | 0.8179885745048523 |
| 52          | 8           | 0.796689510345459  |
| 52          | 9           | 0.7947834134101868 |
| 41          | 41          | 1.0                |
| 41          | 197         | 0.9653506875038147 |
| 41          | 84          | 0.9552277326583862 |
| 41          | 157         | 0.9465565085411072 |
| 41          | 65          | 0.9287481307983398 |
| 41          | 248         | 0.9177336096763611 |
| 41          | 315         | 0.9043129086494446 |
| 41          | 92          | 0.8998928070068359 |
| 41          | 297         | 0.8897411227226257 |
| 41          | 50          | 0.8810243010520935 |
+------------------------------------------------+
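
Because each input graphlet appears as its own best match, a common post-processing step is to group the rows by srcGraphlet and drop the self-matches. The following Python sketch does this on a few rows copied from the output above. The rows are hard-coded as tuples; how to extract rows from the result frame is not shown here.

```python
from collections import defaultdict

# (srcGraphlet, dstGraphlet, similarity) rows taken from the output above.
rows = [
    (52, 52, 1.0),
    (52, 10, 0.8748674392700195),
    (52, 23, 0.8551455140113831),
    (41, 41, 1.0),
    (41, 197, 0.9653506875038147),
    (41, 84, 0.9552277326583862),
]

# Group neighbors by source graphlet, skipping each graphlet's match with itself.
neighbors = defaultdict(list)
for src, dst, similarity in rows:
    if src != dst:
        neighbors[src].append((dst, similarity))

print(dict(neighbors))
```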

7.3.8 Inferring a Graphlet Vector

You can infer the vector representation for a given new graphlet as described in the following code:

Inferring a Graphlet Vector Using JShell
opg-jshell> var graphlet = session.readGraphWithProperties("<path>/<graphletConfig.json>")
opg-jshell> var inferredVector = model.inferGraphletVector(graphlet)
opg-jshell> inferredVector.print()
Inferring a Graphlet Vector Using Java
PgxGraph graphlet = session.readGraphWithProperties("<path>/<graphletConfig.json>");
PgxFrame inferredVector = model.inferGraphletVector(graphlet);
inferredVector.print();
Inferring a Graphlet Vector Using Python
graphlet = session.read_graph_with_properties("<path>/<graphletConfig.json>")
inferred_vector = model.infer_graphlet_vector(graphlet)
inferred_vector.print()
The schema for the inferredVector will be similar to the following output:
+---------------------------------------------------------------+
| graphlet                                | embedding           |
+---------------------------------------------------------------+

7.3.9 Inferring Vectors for a Graphlet Batch

You can infer the vector representations for multiple graphlets (specified with different graph-ids in a graph) as described in the following code:

Inferring Vectors for a Graphlet Batch Using JShell
opg-jshell> var graphlets = session.readGraphWithProperties("<path>/<graphletConfig.json>")
opg-jshell> var inferredVectorBatched = model.inferGraphletVectorBatched(graphlets)
opg-jshell> inferredVectorBatched.print()
Inferring Vectors for a Graphlet Batch Using Java
PgxGraph graphlets = session.readGraphWithProperties("<path>/<graphletConfig.json>");
PgxFrame inferredVectorBatched = model.inferGraphletVectorBatched(graphlets);
inferredVectorBatched.print();
Inferring Vectors for a Graphlet Batch Using Python
graphlets = session.read_graph_with_properties("<path>/<graphletConfig.json>")
inferred_vector_batched = model.infer_graphlet_vector_batched(graphlets)
inferred_vector_batched.print()

The schema is the same as for inferGraphletVector, but with one row for each input graphlet.
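
To see how many rows to expect, it helps to view the input graph as a set of graphlets keyed by graph-id. The following Python sketch groups hypothetical vertices by a hypothetical graph-id property; it only illustrates the grouping and does not call PGX.

```python
from collections import defaultdict

# Hypothetical input graph: vertex id -> value of the graph-id property.
vertex_graph_id = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

# Collect the vertices of each graphlet under its graph-id.
graphlets = defaultdict(list)
for vertex, gid in vertex_graph_id.items():
    graphlets[gid].append(vertex)

# One embedding row is expected per distinct graph-id.
expected_rows = len(graphlets)
print(dict(graphlets), expected_rows)
```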

7.3.10 Storing a Trained Pg2vec Model

You can store models in a database. Each model is stored as a row in a model store table.

The following code shows how to store a trained Pg2vec model in a specific model store table in the database:

Storing a Trained Pg2vec Model Using JShell
opg-jshell> model.export().db() 
              .modelstore("modelstoretablename")  // name of the model store table
              .modelname("model")                 // model name (primary key of model store table)
              .description("a model description") // description to store alongside the model
              .store();
Storing a Trained Pg2vec Model Using Java
model.export().db()
    .modelstore("modelstoretablename")  // name of the model store table
    .modelname("model")                 // model name (primary key of model store table)
    .description("a model description") // description to store alongside the model
    .store();
Storing a Trained Pg2vec Model Using Python
model.export().db(model_store="modelstoretablename",
                  model_name="model")

Note:

All the above examples assume that you are storing the model in the database you are currently logged in to. To store the model in a different database, refer to the examples in Storing a Trained Model in Another Database.
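
The role of the model name as primary key can be illustrated with a generic SQL table. The following Python sketch uses an in-memory SQLite database and a hypothetical three-column layout; it is not the actual PGX model store schema, and how PGX itself reacts to a duplicate model name is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per model, keyed by the model name.
conn.execute(
    "CREATE TABLE modelstoretablename ("
    " model_name TEXT PRIMARY KEY,"
    " description TEXT,"
    " payload BLOB)"
)
conn.execute(
    "INSERT INTO modelstoretablename VALUES (?, ?, ?)",
    ("model", "a model description", b"serialized-model-bytes"),
)

# A second row with the same model name violates the primary key.
try:
    conn.execute(
        "INSERT INTO modelstoretablename VALUES (?, ?, ?)",
        ("model", "another description", b"other-bytes"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM modelstoretablename"
).fetchone()[0]
print(duplicate_rejected, row_count)
```

In a table of this shape, choosing a distinct model name per trained model avoids key collisions.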

7.3.11 Loading a Pre-Trained Pg2vec Model

You can load models from a database.

You can load a pre-trained Pg2vec model from a model store table in the database as described in the following code:

Loading a Pre-Trained Pg2vec Model Using JShell
opg-jshell> var model = analyst.loadPg2vecModel().db()
                .modelstore("modelstoretablename") // name of the model store table
                .modelname("model")           // model name (primary key of model store table)
                .load();
Loading a Pre-Trained Pg2vec Model Using Java
Pg2vecModel model = analyst.loadPg2vecModel().db()
     .modelstore("modelstoretablename") // name of the model store table
     .modelname("model")           // model name (primary key of model store table)
     .load();
Loading a Pre-Trained Pg2vec Model Using Python
model = analyst.get_pg2vec_model_loader().db(model_store="modelstoretablename",
                                             model_name="model")

Note:

All the above examples assume that you are loading the model from the database you are currently logged in to. To load the model from a different database, refer to the examples in Loading a Pre-Trained Model From Another Database.

7.3.12 Destroying a Pg2vec Model

You can destroy a Pg2vec model as described in the following code:

Destroying a Pg2vec Model Using JShell
opg-jshell> model.destroy()
Destroying a Pg2vec Model Using Java
model.destroy();
Destroying a Pg2vec Model Using Python
model.destroy()