MLlib

Graph machine learning tools for use with PGX.

class pypgx.api.mllib.CategoricalPropertyConfig(java_config, params)

Bases: InputPropertyConfig

Configuration class for handling categorical input properties.

property categorical: bool

Get whether the feature is categorical.

Returns

whether the feature is categorical

property categorical_embedding_type: bool

Get the type of categorical embedding.

Returns

type of categorical embedding

property max_vocabulary_size: int

Get the maximum number of tokens allowed in the vocabulary of a categorical feature. Only the max_vocabulary_size most frequent category values are kept; all other values are treated as out-of-vocabulary (OOV) tokens.

Returns

maximum number of tokens allowed
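The truncation rule described for max_vocabulary_size can be illustrated without PGX. The following is a minimal pure-Python sketch of the idea, not the PGX implementation; the OOV placeholder string and both helper names are hypothetical.

```python
from collections import Counter

OOV = "<OOV>"  # hypothetical placeholder, not a pypgx constant


def build_vocabulary(values, max_vocabulary_size):
    """Keep the max_vocabulary_size most frequent category values."""
    counts = Counter(values)
    return {value for value, _ in counts.most_common(max_vocabulary_size)}


def encode(values, vocabulary):
    """Replace every value outside the vocabulary with the OOV token."""
    return [value if value in vocabulary else OOV for value in values]


colors = ["red", "blue", "red", "green", "blue", "red"]
vocabulary = build_vocabulary(colors, max_vocabulary_size=2)
encoded = encode(colors, vocabulary)
```

Here "green" is the least frequent value, so it falls outside a vocabulary of size 2 and is mapped to the placeholder.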

property property_name: str

Get the name of the feature that the configuration is used for.

Returns

name of the feature that the configuration is used for

set_max_vocabulary_size(max_vocabulary_size)

Set max vocabulary size to a given value.

Parameters

max_vocabulary_size (int) – set the maximum vocabulary size to the given value

Return type

None

set_shared(shared)

Set whether the feature is shared among vertex/edge types.

Parameters

shared (bool) – whether the feature is shared among vertex/edge types

Return type

None

property shared: bool

Get whether the feature is shared among vertex/edge types.

Returns

whether the feature is shared among vertex/edge types

class pypgx.api.mllib.ConcatEdgeCombinationMethod(use_source_vertex, use_destination_vertex, use_edge)

Bases: EdgeCombinationMethod

Concatenation method for edge embedding generation

Parameters
  • use_source_vertex (bool) –

  • use_destination_vertex (bool) –

  • use_edge (bool) –

get_aggregation_type()

Get the aggregation type

Returns

the aggregation type

Return type

str

use_dst_vertex()

Get whether the destination vertex embedding is used for the edge embedding.

Returns

whether the destination vertex embedding is used

Return type

bool

use_edge()

Get whether the edge features are used for the edge embedding.

Returns

whether the edge features are used

Return type

bool

use_src_vertex()

Get whether the source vertex embedding is used for the edge embedding.

Returns

whether the source vertex embedding is used

Return type

bool
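To make the three flags concrete, the sketch below builds an edge embedding by concatenating only the selected parts. It is an illustration of the concatenation idea in plain Python, not PGX code; the helper name and the source, destination, edge ordering are assumptions.

```python
def concat_edge_embedding(src, dst, edge,
                          use_source_vertex, use_destination_vertex, use_edge):
    """Concatenate the selected vectors (ordering is an assumption)."""
    parts = []
    if use_source_vertex:
        parts.extend(src)
    if use_destination_vertex:
        parts.extend(dst)
    if use_edge:
        parts.extend(edge)
    return parts


# Use both endpoint embeddings but ignore the edge features.
embedding = concat_edge_embedding(
    [1.0, 2.0], [3.0, 4.0], [5.0],
    use_source_vertex=True, use_destination_vertex=True, use_edge=False,
)
```

The length of the result is the sum of the lengths of the enabled parts, which is why the flags affect the input dimension of any layer that consumes the edge embedding.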

class pypgx.api.mllib.CorruptionFunction(java_corruption_function)

Bases: object

Abstract corruption function that generates the corrupted subgraph for DGI.

class pypgx.api.mllib.DeepWalkModel(java_deepwalk_model)

Bases: Model

DeepWalk model object.

property batch_size: int

Get the batch size.

close()

Call destroy

Return type

None

compute_similars(v, k)

Compute the top-k similar vertices for a given vertex.

Parameters
  • v (Union[int, str, List[int], List[str]]) – id of the vertex or list of vertex ids for which to compute the similar vertices

  • k (int) – number of similar vertices to return

Return type

PgxFrame

destroy()

Destroy this model object.

Return type

None

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Return type

None

is_fitted()

Whether or not the model has been fitted.

Return type

bool

property layer_size: int

Get the layer size.

property learning_rate: float

Get the learning rate.

property loss: Optional[float]

Get the loss of the model.

property min_learning_rate: float

Get the minimum learning rate.

property min_word_frequency: int

Get the minimum word frequency.

property negative_sample: int

Get the negative sample.

property num_epochs: int

Get the number of epochs.

property sample_rate: float

Get the sample rate.

property seed: Optional[int]

Get the seed.

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (Optional[str]) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Return type

None

property trained_vectors: PgxFrame

Get the trained vertex vectors for the current DeepWalk model.

Returns

PgxFrame object with the trained vertex vectors

Return type

PgxFrame

property validation_fraction: float

Get the validation fraction.

validation_fraction is deprecated since 23.4; the loss is now computed on all samples.

property walk_length: int

Get the walk length.

property walks_per_vertex: int

Get the number of walks per vertex.

property window_size: int

Get the window size.
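A typical use of the methods above is to fit the model once and then query similar vertices. The helper below is a small sketch that relies only on is_fitted, fit and compute_similars as documented for DeepWalkModel; the helper name is hypothetical, and how the model and graph objects are obtained is outside the scope of this section.

```python
def fit_and_find_similar(model, graph, vertex_id, k=10):
    """Fit the model if needed, then return its top-k similar vertices.

    `model` is expected to behave like DeepWalkModel and `graph` like
    PgxGraph; the result is whatever compute_similars returns (a PgxFrame
    according to the reference above).
    """
    if not model.is_fitted():
        model.fit(graph)
    return model.compute_similars(vertex_id, k)
```

Checking is_fitted first avoids retraining when the same model object is queried repeatedly.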

class pypgx.api.mllib.DevNetLoss(confidence_margin, anomaly_property_value)

Bases: LossFunction

Deviation loss for anomaly detection

Parameters
  • confidence_margin (float) –

  • anomaly_property_value (bool) –

get_anomaly_property_value()

Get Anomaly Property Value.

Returns

the anomaly property value

Return type

Any

get_confidence_margin()

Get confidence margin of the loss function.

Returns

the confidence margin

Return type

float
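For orientation, deviation loss is commonly written as |dev| for normal samples and max(0, margin - dev) for anomalies, where dev is the anomaly score standardized against reference scores. The sketch below follows that published formulation; whether PGX computes the loss in exactly this form is an assumption, and the ref_mean and ref_std parameters are hypothetical names.

```python
def deviation_loss(score, is_anomaly, confidence_margin,
                   ref_mean=0.0, ref_std=1.0):
    """Per-sample deviation loss (sketch of the general formulation)."""
    # Standardize the score against the reference distribution.
    deviation = (score - ref_mean) / ref_std
    if is_anomaly:
        # Anomalies are pushed at least `confidence_margin` above the mean.
        return max(0.0, confidence_margin - deviation)
    # Normal samples are pulled towards the reference mean.
    return abs(deviation)
```

With a confidence margin of 5.0, an anomaly scored at 2.0 incurs a loss of 3.0, while one scored at 6.0 incurs no loss.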

class pypgx.api.mllib.EdgeWiseModel(java_edgewise_model, params=None)

Bases: Model

EdgeWise model object.

This is a base class for SupervisedEdgeWiseModel.

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_edge_combination_method()

Get the edge combination method used to compute the edge embedding

Returns

edge combination method

Return type

EdgeCombinationMethod

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

List[str]

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_seed()

Get the random seed

Returns

random seed

Return type

int

get_training_loss()

Get the final training loss

Returns

training loss

Return type

float

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

List[str]

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool
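The getters above can be combined into a quick summary of a model's training setup. This is a minimal sketch that calls only getter methods listed for EdgeWiseModel; the helper name and the dictionary layout are illustrative choices.

```python
def summarize_model(model):
    """Collect the main hyperparameters of an EdgeWiseModel-like object."""
    return {
        "fitted": model.is_fitted(),
        "batch_size": model.get_batch_size(),
        "layer_size": model.get_layer_size(),
        "learning_rate": model.get_learning_rate(),
        "num_epochs": model.get_num_epochs(),
        "seed": model.get_seed(),
    }
```

Such a summary is convenient for logging alongside evaluation results.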

class pypgx.api.mllib.EdgeWiseModelConfig(java_edgewise_model_config)

Bases: object

Edgewise Model Configuration class

property backend: str

Get the backend.

property batch_size: int

Get the batch size.

property conv_layer_configs: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

Get the conv layer configs.

property edge_combination_method: EdgeCombinationMethod

Get the edge combination method.

property edge_input_feature_dim: int

Get the edge input feature dimension.

property edge_input_property_configs: Dict[str, InputPropertyConfig]

Get the edge input property configs.

property edge_input_property_names: Optional[List[str]]

Get the edge input property names.

property embedding_dim: int

Get the embedding dimension.

get_conv_layer_configs()

Return a list of conv layer configs

Return type

List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

property input_feature_dim: int

Get the input feature dimension.

property is_fitted: bool

Whether or not the model is fitted.

property learning_rate: float

Get the learning rate.

property normalize: bool

Whether or not normalization is enabled.

property num_epochs: int

Get the number of epochs.

property seed: int

Get the seed.

set_batch_size(batch_size)

Set the batch size

Parameters

batch_size (int) – batch size

Return type

None

set_edge_combination_method(edge_combination_method)

Set the edge combination method

Parameters

edge_combination_method (EdgeCombinationMethod) – edge combination method

Return type

None

set_edge_input_feature_dim(edge_input_feature_dim)

Set the edge input feature dimension

Parameters

edge_input_feature_dim (int) – edge input feature dimension

Return type

None

set_embedding_dim(embedding_dim)

Set the embedding dimension

Parameters

embedding_dim (int) – embedding dimension

Return type

None

set_fitted(fitted)

Set the fitted flag

Parameters

fitted (bool) – fitted flag

Return type

None

set_input_feature_dim(input_feature_dim)

Set the input feature dimension

Parameters

input_feature_dim (int) – input feature dimension

Return type

None

set_learning_rate(learning_rate)

Set the learning rate

Parameters

learning_rate (float) – initial learning rate

Return type

None

set_normalize(normalize)

Set whether or not normalization is enabled.

Parameters

normalize (bool) – whether or not normalization is enabled

Return type

None

set_num_epochs(num_epochs)

Set the number of epochs

Parameters

num_epochs (int) – number of epochs

Return type

None

set_seed(seed)

Set the seed

Parameters

seed (int) – seed

Return type

None

set_shuffle(shuffle)

Set the shuffling flag

Parameters

shuffle (bool) – shuffling flag

Return type

None

set_standardize(standardize)

Set the standardize flag

Parameters

standardize (bool) – standardize flag

Return type

None

set_training_loss(training_loss)

Set the training loss

Parameters

training_loss (float) – training loss

Return type

None

set_weight_decay(weight_decay)

Set the weight decay

Parameters

weight_decay (float) – weight decay

Return type

None

property shuffle: bool

Whether or not shuffling is enabled.

property standardize: bool

Whether or not standardization is enabled.

property training_loss: float

Get the training loss.

property vertex_input_property_configs: Dict[str, InputPropertyConfig]

Get the vertex input property configs.

property vertex_input_property_names: Optional[List[str]]

Get the vertex input property names.

property weight_decay: float

Get the weight decay.
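When comparing two training runs, it can help to see which hyperparameters differ between their configurations. The sketch below reads only properties documented for EdgeWiseModelConfig; the helper name and the selection of fields are illustrative.

```python
CONFIG_FIELDS = (
    "batch_size", "embedding_dim", "learning_rate",
    "num_epochs", "seed", "weight_decay",
)


def diff_configs(first, second, fields=CONFIG_FIELDS):
    """Return {field: (first_value, second_value)} for fields that differ."""
    changes = {}
    for name in fields:
        a, b = getattr(first, name), getattr(second, name)
        if a != b:
            changes[name] = (a, b)
    return changes
```

An empty result means the two configurations agree on every listed field.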

class pypgx.api.mllib.EmbeddingTableConfig(java_config, params)

Bases: CategoricalPropertyConfig

Configuration class for handling categorical input properties using embedding table method.

property categorical: bool

Get whether the feature is categorical.

Returns

whether the feature is categorical

property categorical_embedding_type: bool

Get the type of categorical embedding.

Returns

type of categorical embedding

property embedding_dim: int

Get the embedding dimension.

Returns

embedding dimension

property max_vocabulary_size: int

Get the maximum number of tokens allowed in the vocabulary of a categorical feature. Only the max_vocabulary_size most frequent category values are kept; all other values are treated as out-of-vocabulary (OOV) tokens.

Returns

maximum number of tokens allowed

property oov_probability: float

Get the probability of randomly setting the category value to OOV token during training.

Returns

probability of using OOV token
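The effect of oov_probability can be pictured as a random replacement step applied to category values during training. The following is a plain-Python sketch of that idea under stated assumptions, not the PGX implementation; the placeholder token and the helper name are hypothetical.

```python
import random

OOV = "<OOV>"  # hypothetical placeholder, not a pypgx constant


def apply_oov_dropout(values, oov_probability, rng):
    """Replace each value with the OOV token with the given probability."""
    return [OOV if rng.random() < oov_probability else value
            for value in values]


rng = random.Random(0)
unchanged = apply_oov_dropout(["a", "b", "c"], 0.0, rng)
all_oov = apply_oov_dropout(["a", "b", "c"], 1.0, rng)
```

A probability of 0.0 leaves the values untouched and 1.0 replaces all of them; values in between expose the model to the OOV token during training so it can handle unseen categories at inference time.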

property property_name: str

Get the name of the feature that the configuration is used for.

Returns

name of the feature that the configuration is used for

set_embedding_dimension(embedding_dim)

Set the embedding dimension.

Parameters

embedding_dim (int) – embedding dimension

Return type

None

set_max_vocabulary_size(max_vocabulary_size)

Set max vocabulary size to a given value.

Parameters

max_vocabulary_size (int) – set the maximum vocabulary size to the given value

Return type

None

set_oov_probability(oov_proba)

Set the out of vocabulary probability.

Parameters

oov_proba (float) – out of vocabulary probability

Return type

None

set_shared(shared)

Set whether the feature is shared among vertex/edge types.

Parameters

shared (bool) – whether the feature is shared among vertex/edge types

Return type

None

property shared: bool

Get whether the feature is shared among vertex/edge types.

Returns

whether the feature is shared among vertex/edge types

class pypgx.api.mllib.GnnExplainer(java_explainer)

Bases: object

GnnExplainer object used to request explanations from model predictions.

property learning_rate: float

Get learning rate.

Returns

learning rate

Return type

float

property marginalize: bool

Get value of marginalize.

Returns

value of marginalize

Return type

bool

property num_optimization_steps: int

Get number of optimization steps.

Returns

number of optimization steps

Return type

int

class pypgx.api.mllib.GnnExplanation(java_gnn_explanation)

Bases: object

GnnExplanation object

get_embedding()

Get the inferred embedding of the specified vertex.

Returns

the embedding

Return type

List[float]

get_importance_graph()

Get the importance graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.

Returns

the importance graph

Return type

PgxGraph

get_vertex_feature_importance()

Get the feature importances as a map from property to importance value.

Returns

the feature importances.

Return type

Dict[VertexProperty, float]
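Because get_vertex_feature_importance returns a mapping from property to importance, ranking the most influential features is a matter of sorting that mapping. The helper below is a small sketch; its name and the default of five entries are illustrative.

```python
def top_features(explanation, n=5):
    """Return the n (property, importance) pairs with highest importance."""
    importances = explanation.get_vertex_feature_importance()
    ranked = sorted(importances.items(),
                    key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

The keys are returned as they appear in the mapping, so with a real GnnExplanation they would be VertexProperty objects rather than names.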

get_vertex_importance_property()

Get the vertex property that contains the computed vertex importance.

Returns

the vertex importance property

Return type

VertexProperty

class pypgx.api.mllib.GraphWiseAttentionLayerConfig(java_config, params)

Bases: GraphWiseBaseConvLayerConfig

GraphWise attention layer configuration.

property activation_fn: Any

Get the activation function.

property edge_to_edge_connection: Optional[bool]

Get the edge to edge connection.

property edge_to_vertex_connection: Optional[bool]

Get the edge to vertex connection.

property head_aggregation: Any

Get the aggregation operation for heads.

property num_heads: int

Get the number of heads.

property num_sampled_neighbors: int

Get the number of sampled neighbors.

property vertex_to_edge_connection: Optional[bool]

Get the vertex to edge connection.

property vertex_to_vertex_connection: Optional[bool]

Get the vertex to vertex connection.

property weight_init_scheme: Any

Get the weight initialization scheme.

class pypgx.api.mllib.GraphWiseConvLayerConfig(java_config, params)

Bases: GraphWiseBaseConvLayerConfig

GraphWise conv layer configuration.

property activation_fn: Any

Get the activation function.

property edge_to_edge_connection: Optional[bool]

Get the edge to edge connection.

property edge_to_vertex_connection: Optional[bool]

Get the edge to vertex connection.

property neighbor_weight_property_name: str

Get the name of the property that stores the weight of the edge.

property num_sampled_neighbors: int

Get the number of sampled neighbors.

property vertex_to_edge_connection: Optional[bool]

Get the vertex to edge connection.

property vertex_to_vertex_connection: Optional[bool]

Get the vertex to vertex connection.

property weight_init_scheme: Any

Get the weight initialization scheme.

class pypgx.api.mllib.GraphWiseDgiLayerConfig(java_config, params)

Bases: GraphWiseEmbeddingConfig

GraphWise DGI layer configuration.

get_corruption_function()

Return the corruption function

Return type

PermutationCorruption

get_discriminator()

Return the discriminator

Return type

str

get_embedding_type()

Return the embedding type used by this config

Return type

str

get_readout_function()

Return the readout function

Return type

str

set_corruption_function(corruption_function)

Set the corruption function

Parameters

corruption_function (CorruptionFunction) – the corruption function. Supported currently: PermutationCorruption

set_discriminator(discriminator)

Set the discriminator

Parameters

discriminator (str) – The discriminator function. Supported currently: ‘bilinear’

Return type

None

set_readout_function(readout_function)

Set the readout function

Parameters

readout_function (str) – The readout function. Supported currently: ‘mean’

Return type

None

class pypgx.api.mllib.GraphWiseDominantLayerConfig(java_config, params)

Bases: GraphWiseEmbeddingConfig

GraphWise dominant layer configuration.

get_alpha()

Return alpha.

Returns

alpha of the decoder layer

Return type

float

get_decoder_layer_configs()

Get the configuration objects for the decoder layers.

Returns

configuration of the decoder layer

Return type

GraphWisePredictionLayerConfig

get_embedding_type()

Return the embedding type used by this config

Returns

embedding type

Return type

str

set_alpha(alpha)

Set the alpha parameter

Parameters

alpha (float) – The alpha parameter to set.

set_decoder_layer_configs(decoder_layer_configs)

Set the configuration objects for the decoder layers.

Parameters

decoder_layer_configs (GraphWisePredictionLayerConfig) – configuration of the decoder layer

Return type

GraphWisePredictionLayerConfig

class pypgx.api.mllib.GraphWiseEmbeddingConfig

Bases: object

GraphWise embedding configuration.

get_embedding_type()

Return the embedding type used by this config

Return type

str

class pypgx.api.mllib.GraphWiseModel(java_graphwise_model, params=None)

Bases: Model

GraphWise model object.

This is a base class for UnsupervisedGraphWiseModel and SupervisedGraphWiseModel.

check_is_fitted()

Make sure the model is fitted.

Returns

None

Raises

RuntimeError – If the model is not fitted.

Return type

None

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None

property edge_input_feature_dim: int

Get the dimension of the edge input features.

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

List[str]

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_seed()

Get the random seed

Returns

random seed

Return type

int

get_training_loss()

Get the final training loss

Returns

training loss

Return type

Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

List[str]

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool

property loss: Optional[float]

Get the training loss.

property vertex_input_feature_dim: int

Get the dimension of the vertex input features.
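Since check_is_fitted raises RuntimeError for an unfitted model, it can guard code that needs training results. The sketch below uses only check_is_fitted and get_training_loss as documented for GraphWiseModel; the helper name is hypothetical.

```python
def training_loss_if_fitted(model):
    """Return the final training loss, or None if the model is not fitted."""
    try:
        model.check_is_fitted()
    except RuntimeError:
        return None
    return model.get_training_loss()
```

Returning None for the unfitted case mirrors the Optional[float] return type of get_training_loss.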

class pypgx.api.mllib.GraphWiseModelConfig(java_graphwise_model_config)

Bases: object

Graphwise Model Configuration class

property backend: str

Get the backend.

property batch_size: int

Get the batch size.

property conv_layer_configs: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

Get the conv layer configs.

property edge_input_feature_dim: int

Get the edge input feature dimension.

property edge_input_property_configs: Dict[str, InputPropertyConfig]

Get the edge input property configs.

property edge_input_property_names: Optional[List[str]]

Get the edge input property names.

property embedding_dim: int

Get the embedding dimension.

property fitted: bool

Whether or not the model is fitted.

get_conv_layer_configs()

Return a list of conv layer configs

Return type

List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

property input_feature_dim: int

Get the input feature dimension.

property is_fitted: bool

Whether or not the model is fitted.

property learning_rate: float

Get the learning rate.

property normalize: bool

Whether or not normalization is enabled.

property num_epochs: int

Get the number of epochs.

property seed: int

Get the seed.

set_batch_size(batch_size)

Set the batch size

Parameters

batch_size (int) – batch size

Return type

None

set_edge_input_feature_dim(edge_input_feature_dim)

Set the edge input feature dimension

Parameters

edge_input_feature_dim (int) – edge input feature dimension

Return type

None

set_embedding_dim(embedding_dim)

Set the embedding dimension

Parameters

embedding_dim (int) – embedding dimension

Return type

None

set_fitted(fitted)

Set the fitted flag

Parameters

fitted (bool) – fitted flag

Return type

None

set_input_feature_dim(input_feature_dim)

Set the input feature dimension

Parameters

input_feature_dim (int) – input feature dimension

Return type

None

set_learning_rate(learning_rate)

Set the learning rate

Parameters

learning_rate (float) – initial learning rate

Return type

None

set_normalize(normalize)

Set whether or not normalization is enabled.

Parameters

normalize (bool) – whether or not normalization is enabled

Return type

None

set_num_epochs(num_epochs)

Set the number of epochs

Parameters

num_epochs (int) – number of epochs

Return type

None

set_seed(seed)

Set the seed

Parameters

seed (int) – seed

Return type

None

set_shuffle(shuffle)

Set the shuffling flag

Parameters

shuffle (bool) – shuffling flag

Return type

None

set_standardize(standardize)

Set the standardize flag

Parameters

standardize (bool) – standardize flag

Return type

None

set_training_loss(training_loss)

Set the training loss

Parameters

training_loss (float) – training loss

Return type

None

set_weight_decay(weight_decay)

Set the weight decay

Parameters

weight_decay (float) – weight decay

Return type

None

property shuffle: bool

Whether or not shuffling is enabled.

property standardize: bool

Whether or not standardization is enabled.

property training_loss: float

Get the training loss.

property vertex_input_property_configs: Dict[str, InputPropertyConfig]

Get the vertex input property configs.

property vertex_input_property_names: Optional[List[str]]

Get the vertex input property names.

property weight_decay: float

Get the weight decay.

class pypgx.api.mllib.GraphWisePredictionLayerConfig(java_config, params)

Bases: object

GraphWise prediction layer configuration.

class pypgx.api.mllib.InputPropertyConfig(java_config, params)

Bases: object

Configuration class for handling input properties using one hot encoding method.

property categorical: bool

Get whether the feature is categorical.

Returns

whether the feature is categorical

property property_name: str

Get the name of the feature that the configuration is used for.

Returns

name of the feature that the configuration is used for

class pypgx.api.mllib.MSELoss

Bases: LossFunction

MSE loss for regression

class pypgx.api.mllib.Model(java_generic_model)

Bases: PgxContextManager

Model object

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

is_fitted()

Whether or not the model has been fitted.

Returns

Always returns False, since this base class cannot be fitted.

Return type

bool

class pypgx.api.mllib.ModelLoader(analyst, java_model_loader, wrapper, java_class)

Bases: object

ModelLoader object.

Parameters
  • analyst (Analyst) –

  • java_model_loader (Any) –

  • wrapper (Callable) –

  • java_class (str) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)

Return a model stored in a database.

Parameters
  • model_store (str) – Model store in database.

  • model_name (str) – Name of the stored model to load.

  • username (Optional[str]) – Username in database.

  • password (Optional[str]) – Password of username in database.

  • jdbc_url (Optional[str]) – JDBC url of database.

  • keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.

  • schema (Optional[str]) – The schema of the model store in database.

Returns

The model stored in database.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel]

file(path, key)

Return an encrypted model stored in a file.

Parameters
  • path (str) – Path of stored model.

  • key (str) – Used for encryption.

Returns

The model stored in file.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel]

class pypgx.api.mllib.ModelRepository(java_generic_model_repository)

Bases: object

ModelRepository object that exposes CRUD operations on model stores and on the models within these model stores.

create(model_store_name)

Create a new model store.

Parameters

model_store_name (str) – The name of the model store.

Return type

None

delete_model(model_store_name, model_name)

Delete the model in the specified model store with the given model name.

Parameters
  • model_store_name (str) – The name of the model store.

  • model_name (str) – The name under which the model was stored.

Return type

None

delete_model_store(model_store_name)

Delete a model store.

Parameters

model_store_name (str) – The name of the model store.

Return type

None

get_model_description(model_store_name, model_name)

Retrieve the description of the model in the specified model store, with the given model name.

Parameters
  • model_store_name (str) – The name of the model store.

  • model_name (str) – The name under which the model was stored.

Returns

A string containing the description that was stored with the model.

Return type

str

list_model_stores_names()

List the names of all model stores in the model repository.

Returns

List of names.

Return type

List[str]

list_model_stores_names_matching(regex)

List the names of all model stores in the model repository that match the regex.

Parameters

regex (str) – A regex in form of a string.

Returns

List of matching names.

Return type

List[str]

list_models(model_store_name)

List the models present in the model store with the given name.

Parameters

model_store_name (str) – The name of the model store (non-prefixed).

Returns

List of model names.

Return type

List[str]
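The listing methods can be combined to build an inventory of every model in a repository. This sketch uses only list_model_stores_names and list_models; the helper name is hypothetical, and it assumes that the names returned by list_model_stores_names can be passed to list_models unchanged.

```python
def list_all_models(repository):
    """Map each model store name to the list of model names it contains."""
    return {
        store: repository.list_models(store)
        for store in repository.list_model_stores_names()
    }
```

If the repository reports prefixed store names, they may need to be converted to the non-prefixed form expected by list_models before calling this helper.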

class pypgx.api.mllib.ModelRepositoryBuilder(java_generic_model_repository_builder)

Bases: object

ModelRepositoryBuilder object that can be used to configure the connection to a model repository.

db(username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)

Connect to a model repository backed by a database.

Parameters
  • username (Optional[str]) – username in database

  • password (Optional[str]) – password of username in database

  • jdbc_url (Optional[str]) – jdbc url of database

  • keystore_alias (Optional[str]) – the keystore alias to get the password in the keystore

  • schema (Optional[str]) – the schema of the model store in database

Returns

A model repository configured to connect to a database.

Return type

ModelRepository

class pypgx.api.mllib.ModelStorer(model)

Bases: object

ModelStorer object.

Parameters

model (Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel, Model]) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, model_description=None, overwrite=False, keystore_alias=None, schema=None)

Store a model to a database.

Parameters
  • model_store (str) – Model store in database.

  • model_name (str) – Name of the model to store.

  • username (Optional[str]) – Username in database.

  • password (Optional[str]) – Password of username in database.

  • jdbc_url (Optional[str]) – JDBC url of database.

  • model_description (Optional[str]) – Description of model.

  • overwrite (bool) – Boolean value for overwriting or not.

  • keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.

  • schema (Optional[str]) – The schema of the model store in database.

Raises

RuntimeError – If the model is not fitted.

Return type

None

file(path, key, overwrite=False)

Store an encrypted model to a file.

Parameters
  • path (str) – Path to store model.

  • key (str) – Key used for encryption.

  • overwrite (bool) – Boolean value for overwriting or not.

Raises

RuntimeError – If the model is not fitted.

Return type

None
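Storing a model to a file and loading it back pairs ModelStorer.file with ModelLoader.file. The sketch below is a hedged illustration: the helper name is hypothetical, it takes an already constructed loader as an argument because obtaining one is not covered in this section, and it passes overwrite=True as an illustrative choice.

```python
def store_and_reload(model, loader, path, key):
    """Write an encrypted copy of a fitted model, then load it back.

    `model` is expected to provide export() returning a ModelStorer-like
    object, and `loader` to behave like ModelLoader.
    """
    # export() returns the storer; file() raises RuntimeError if unfitted.
    model.export().file(path, key, overwrite=True)
    # The same key is needed to read the encrypted file again.
    return loader.file(path, key)
```

Keeping the key outside of source code (for example in a secrets manager) is advisable, since anyone with the key can load the stored model.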

class pypgx.api.mllib.OneHotEncodingConfig(java_config, params)

Bases: CategoricalPropertyConfig

Configuration class for handling categorical input properties.

property categorical: bool

Get whether the feature is categorical.

Returns

whether the feature is categorical

property categorical_embedding_type: bool

Get the type of categorical embedding.

Returns

type of categorical embedding

property max_vocabulary_size: int

Get the maximum number of tokens allowed in the vocabulary of a categorical feature. Only the max_vocabulary_size most frequent category values are kept; all other values are treated as out-of-vocabulary (OOV) tokens.

Returns

maximum number of tokens allowed

property property_name: str

Get the name of the feature that the configuration is used for.

Returns

name of the feature that the configuration is used for

set_max_vocabulary_size(max_vocabulary_size)

Set max vocabulary size to a given value.

Parameters

max_vocabulary_size (int) – set the maximum vocabulary size to the given value

Return type

None

set_shared(shared)

Set whether the feature is shared among vertex/edge types.

Parameters

shared (bool) – whether the feature is shared among vertex/edge types

Return type

None

property shared: bool

Get whether the feature is shared among vertex/edge types.

Returns

whether the feature is shared among vertex/edge types

class pypgx.api.mllib.PermutationCorruption(java_permutation_corruption)

Bases: CorruptionFunction

Permutation function which shuffles the nodes to generate the corrupted subgraph for DGI.

class pypgx.api.mllib.Pg2vecModel(java_pg2vec_model)

Bases: Model

Pg2Vec model object.

property batch_size: int

Get the batch size.

close()

Call destroy

Return type

None

compute_similars(graphlet_id, k)

Compute the top-k similar graphlets for a list of input graphlets.

Parameters
  • graphlet_id (Union[Iterable[Union[int, str]], int, str]) – a graphlet ID or an iterable of graphlet IDs

  • k (int) – number of similars to return

Return type

PgxFrame

destroy()

Destroy this model object.

Return type

None

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Return type

None

property graphlet_id_property_name: str

Get the graphlet id property name.

property graphlet_size_property_name: str

Get the graphlet size property name.

infer_graphlet_vector(graph)

Return the inferred vector of the input graphlet as a PgxFrame.

Parameters

graph (PgxGraph) – graphlet for which to infer a vector

Return type

PgxFrame

infer_graphlet_vector_batched(graph)

Return the inferred vectors of the input graphlets as a PgxFrame.

Parameters

graph (PgxGraph) – graphlets for which to infer vectors, given as a single graph in which each graphlet has a different graphlet ID

Return type

PgxFrame

is_fitted()

Whether or not the model has been fitted.

Return type

bool
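
A typical sequence with the methods above is to fit the model once and then query similar graphlets. The sketch below assumes a Pg2vecModel-like object and uses only the documented is_fitted(), fit(graph) and compute_similars(graphlet_id, k) calls. The helper name is hypothetical.

```python
def fit_and_find_similar_graphlets(model, graph, graphlet_id, k):
    """Fit a Pg2vecModel-like object if needed, then query similar graphlets.

    Only the documented methods is_fitted(), fit(graph) and
    compute_similars(graphlet_id, k) are called. The helper itself is
    illustrative and not part of pypgx.
    """
    if not model.is_fitted():
        model.fit(graph)
    # Returns whatever compute_similars returns (documented as a PgxFrame).
    return model.compute_similars(graphlet_id, k)
```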

property layer_size: int

Get the layer size.

property learning_rate: float

Get the learning rate.

property loss: Optional[float]

Get the loss of the model.

property min_learning_rate: float

Get the minimum learning rate.

property min_word_frequency: int

Get the minimum word frequency.

property num_epochs: int

Get the number of epochs.

property seed: Optional[int]

Get the seed.

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (Optional[str]) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Return type

None

property trained_graphlet_vectors: PgxFrame

Get the trained graphlet vectors for the current pg2vec model.

Returns

PgxFrame containing the trained graphlet vectors

property use_graphlet_size: bool

Get whether the graphlet size is used.

property validation_fraction: float

Get the validation fraction.

validation_fraction is deprecated since 23.4; the loss is now computed on all samples.

property vertex_property_names: List[str]

Get the vertex property names.

property walk_length: int

Get the walk length.

property walks_per_vertex: int

Get the walks per vertex.

property window_size: int

Get the window size.

class pypgx.api.mllib.ProductEdgeCombinationMethod(use_source_vertex, use_destination_vertex, use_edge)

Bases: EdgeCombinationMethod

Product method for edge embedding generation

Parameters
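  • use_source_vertex (bool) – whether the source vertex embedding is used for the edge embedding

  • use_destination_vertex (bool) – whether the destination vertex embedding is used for the edge embedding

  • use_edge (bool) – whether the edge features are used for the edge embedding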

get_aggregation_type()

Get the aggregation type

Returns

the aggregation type

Return type

str

use_dst_vertex()

Get whether the destination vertex embedding is used for the edge embedding.

Returns

whether the destination vertex embedding is used

Return type

bool

use_edge()

Get whether the edge features are used for the edge embedding.

Returns

whether the edge features are used

Return type

bool

use_src_vertex()

Get whether the source vertex embedding is used for the edge embedding.

Returns

whether the source vertex embedding is used

Return type

bool
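
To illustrate how the three flags select the inputs of an edge embedding, here is a plain-Python sketch. It assumes that "concat" means concatenating the selected vectors and that "product" means their element-wise product. This section does not spell out either definition, so treat both as assumptions. The function name and the method strings are hypothetical.

```python
def combine_edge_embedding(src, dst, edge, use_src=True, use_dst=True,
                           use_edge=True, method="concat"):
    """Combine the selected vectors into one edge embedding (illustrative)."""
    candidates = ((src, use_src), (dst, use_dst), (edge, use_edge))
    selected = [vec for vec, used in candidates if used]
    if not selected:
        raise ValueError("at least one component must be selected")
    if method == "concat":
        # Assumed meaning: place the selected vectors one after another.
        return [x for vec in selected for x in vec]
    if method == "product":
        # Assumed meaning: element-wise product, which needs equal lengths.
        if len({len(vec) for vec in selected}) != 1:
            raise ValueError("product requires vectors of equal length")
        result = list(selected[0])
        for vec in selected[1:]:
            result = [a * b for a, b in zip(result, vec)]
        return result
    raise ValueError("unknown method: " + repr(method))
```

Under these assumptions, concatenation grows the output dimension with every selected component, while the product keeps the dimension of a single component.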

class pypgx.api.mllib.SigmoidCrossEntropyLoss

Bases: LossFunction

Sigmoid Cross Entropy loss for binary classification

class pypgx.api.mllib.SoftmaxCrossEntropyLoss

Bases: LossFunction

Softmax Cross Entropy loss for multi-class classification
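
For reference, the two loss functions named above can be written out per sample as follows. These are the standard textbook definitions computed from raw logits. How PGX reduces or weights them over a batch is not described here, so this is a sketch and not the library's implementation.

```python
import math


def sigmoid_cross_entropy(logit, label):
    """Binary cross entropy for one logit and a 0/1 label.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def softmax_cross_entropy(logits, target_index):
    """Multi-class cross entropy for one logit vector and the true class index."""
    shift = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target_index]
```

Both functions return log(2) for uninformative inputs, that is, a logit of 0 in the binary case and two equal logits in the two-class case.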

class pypgx.api.mllib.SupervisedEdgeWiseModel(java_edgewise_model, params=None)

Bases: EdgeWiseModel

SupervisedEdgeWise model object.

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None
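
Because close() is documented as calling destroy, a model object can be combined with the standard library's contextlib.closing to make sure it is released even if an error occurs. The helper below is a hypothetical sketch. It only assumes that the model has a close() method.

```python
from contextlib import closing


def run_and_release(model, action):
    """Run action(model), then call model.close() even if action raises.

    `action` is any callable taking the model, for example a function that
    runs inference and converts the result to plain Python values.
    """
    with closing(model):
        return action(model)
```

Results that depend on server-side state of the model should be extracted inside action, since the model is destroyed when the helper returns.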

evaluate(graph, edges, threshold=0.0)

Evaluate performance statistics for the specified edges.

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to evaluate on. Can be a list of edges or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame

evaluate_labels(graph, edges, threshold=0.0)

Evaluate (macro averaged) classification performance statistics for the specified edges.

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to evaluate on. Can be a list of edges or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame
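
The term "macro averaged" means that a metric is computed per class and the per-class values are then averaged with equal weight. The sketch below shows this for precision and recall in plain Python. Which metrics the returned PgxFrame actually contains is not stated here, and the convention of scoring 0.0 when a class has no predictions or no true samples is an assumption of this example.

```python
def macro_precision_recall(true_labels, predicted_labels):
    """Return (macro precision, macro recall) over all observed classes."""
    classes = set(true_labels) | set(predicted_labels)
    if not classes:
        raise ValueError("no labels given")
    pairs = list(zip(true_labels, predicted_labels))
    precisions, recalls = [], []
    for c in classes:
        true_positives = sum(1 for t, p in pairs if t == c and p == c)
        predicted = sum(1 for _, p in pairs if p == c)
        actual = sum(1 for t, _ in pairs if t == c)
        # Assumed convention: a class with an empty denominator scores 0.0.
        precisions.append(true_positives / predicted if predicted else 0.0)
        recalls.append(true_positives / actual if actual else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)
```

Each class contributes equally to the result, so a rare class influences the macro average as much as a frequent one.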

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_class_weights()

Get the class weights.

Returns

a dictionary mapping classes to their weights.

Return type

dict

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_edge_combination_method()

Get the edge combination method used to compute the edge embedding

Returns

edge combination method

Return type

EdgeCombinationMethod

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

List[str]

get_edge_target_property_name()

Get the target property name

Returns

target property name

Return type

str

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_loss_function()

Get the loss function name.

Returns

loss function name. Can be one of softmax_cross_entropy, sigmoid_cross_entropy, devnet

Return type

str

get_loss_function_class()

Get the loss function.

Returns

loss function

Return type

LossFunction

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_prediction_layer_configs()

Get the configuration objects for the prediction layers.

Returns

configuration of the prediction layer

Return type

GraphWisePredictionLayerConfig

get_seed()

Get the random seed

Returns

random seed

Return type

int
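
The getters above can be collected into a single dictionary, for example to log the hyperparameters of a model. This sketch uses only the documented get_batch_size(), get_layer_size(), get_learning_rate(), get_num_epochs() and get_seed() methods. The helper name and the dictionary keys are hypothetical.

```python
def summarize_hyperparameters(model):
    """Collect the main training hyperparameters of a model into a dict."""
    return {
        "batch_size": model.get_batch_size(),
        "layer_size": model.get_layer_size(),
        "learning_rate": model.get_learning_rate(),
        "num_epochs": model.get_num_epochs(),
        "seed": model.get_seed(),
    }
```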

get_target_edge_labels()

Get the target edge labels

Returns

target edge labels

Return type

List[str]

get_training_loss()

Get the final training loss

Returns

training loss

Return type

float

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

List[str]

infer(graph, edges, threshold=0.0)

Infer the predictions for the specified edges

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer predictions for. Can be a list of edges or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the inference results for each edge

Return type

PgxFrame

infer_embeddings(graph, edges)

Infer the embeddings for the specified edges

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer embeddings for. Can be a list of edges or their IDs.

Returns

PgxFrame containing the embeddings for each edge

Return type

PgxFrame

infer_labels(graph, edges, threshold=0.0)

Infer the labels for the specified edges

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer labels for. Can be a list of edges or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the labels for each edge

Return type

PgxFrame
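
The fit, infer_labels and evaluate_labels methods can be chained as in the following sketch. It assumes a SupervisedEdgeWiseModel-like object and calls only documented methods with their documented argument order. The helper name is hypothetical. In practice the edges used for evaluation would usually be held out from training, which this minimal example does not enforce.

```python
def label_and_evaluate_edges(model, graph, edges, threshold=0.0):
    """Fit the model if needed, then infer and evaluate labels for edges."""
    # Materialize the iterable once so it can be passed to both calls.
    edges = list(edges)
    if not model.is_fitted():
        model.fit(graph)
    labels = model.infer_labels(graph, edges, threshold)
    metrics = model.evaluate_labels(graph, edges, threshold)
    return labels, metrics
```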

infer_logits(graph, edges)

Infer the prediction logits for the specified edges

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer logits for. Can be a list of edges or their IDs.

Returns

PgxFrame containing the logits for each edge

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (str) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

class pypgx.api.mllib.SupervisedGnnExplainer(java_explainer, bool_label)

Bases: GnnExplainer

SupervisedGnnExplainer used to request explanations from supervised model predictions.

Parameters

bool_label (bool) –

infer_and_explain(graph, vertex, threshold=0.0)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the prediction.

Parameters
  • graph (PgxGraph) – the graph

  • vertex (PgxVertex or int) – the vertex or its ID

  • threshold (float) – decision threshold

Returns

explanation containing feature importance and vertex importance.

Return type

SupervisedGnnExplanation

property learning_rate: float

Get learning rate.

Returns

learning rate

Return type

float

property marginalize: bool

Get value of marginalize.

Returns

value of marginalize

Return type

bool

property num_optimization_steps: int

Get number of optimization steps.

Returns

number of optimization steps

Return type

int

class pypgx.api.mllib.SupervisedGnnExplanation(java_supervised_gnn_explanation, bool_label)

Bases: GnnExplanation

SupervisedGnnExplanation object

Parameters

bool_label (bool) –

get_embedding()

Get the inferred embedding of the specified vertex.

Returns

the embedding

Return type

List[float]

get_importance_graph()

Get the importance graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.

Returns

the importance graph

Return type

PgxGraph

get_label()

Get the inferred label of the specified vertex.

Returns

the label

Return type

Any

get_logits()

Get the inferred logits of the specified vertex.

Returns

the logits

Return type

List[float]

get_vertex_feature_importance()

Get the feature importances as a map from property to importance value.

Returns

the feature importances.

Return type

Dict[VertexProperty, float]

get_vertex_importance_property()

Get the vertex property that contains the computed vertex importance.

Returns

the vertex importance property

Return type

VertexProperty
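
Since get_vertex_feature_importance() returns a mapping from property to importance value, the most influential properties can be listed by sorting that mapping. The helper below is a hypothetical sketch that relies only on this documented return shape.

```python
def top_features(explanation, n=5):
    """Return the n (property, importance) pairs with the highest importance."""
    importances = explanation.get_vertex_feature_importance()
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```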

class pypgx.api.mllib.SupervisedGraphWiseModel(java_graphwise_model, params=None)

Bases: GraphWiseModel

SupervisedGraphWise model object.

check_is_fitted()

Make sure the model is fitted.

Returns

None

Raises

RuntimeError – if the model is not fitted

Return type

None

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None

property edge_input_feature_dim: int

Get the dimension of the edge input features.

evaluate(graph, vertices, threshold=0.0)

Evaluate performance statistics for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on. Can be a list of vertices or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame

evaluate_labels(graph, vertices, threshold=0.0)

Evaluate (macro averaged) classification performance statistics for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on. Can be a list of vertices or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_class_weights()

Get the class weights.

Returns

a dictionary mapping classes to their weights.

Return type

dict

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

List[str]

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_loss_function()

Get the loss function name.

Returns

loss function name. Can be one of softmax_cross_entropy, sigmoid_cross_entropy, devnet

Return type

str

get_loss_function_class()

Get the loss function.

Returns

loss function

Return type

LossFunction

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_prediction_layer_configs()

Get the configuration objects for the prediction layers.

Returns

configuration of the prediction layer

Return type

GraphWisePredictionLayerConfig

get_seed()

Get the random seed

Returns

random seed

Return type

int

get_target_vertex_labels()

Get the target vertex labels

Returns

target vertex labels

Return type

List[str]

get_training_loss()

Get the final training loss

Returns

training loss

Return type

Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

List[str]

get_vertex_target_property_name()

Get the target property name

Returns

target property name

Return type

str

gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False)

Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.

Parameters
  • num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200

  • learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05

  • marginalize (bool, optional) – marginalize the loss over features, defaults to False

Returns

SupervisedGnnExplainer object of this model

Return type

SupervisedGnnExplainer
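
The explainer returned by gnn_explainer() can then be asked for an explanation through its documented infer_and_explain(graph, vertex, threshold) method. The following sketch wires the two calls together. The helper name is hypothetical, and the defaults simply mirror the documented ones.

```python
def explain_vertex(model, graph, vertex, threshold=0.0,
                   num_optimization_steps=200, learning_rate=0.05,
                   marginalize=False):
    """Configure an explainer for the model and explain a single vertex."""
    explainer = model.gnn_explainer(
        num_optimization_steps=num_optimization_steps,
        learning_rate=learning_rate,
        marginalize=marginalize,
    )
    # Documented to return a SupervisedGnnExplanation.
    return explainer.infer_and_explain(graph, vertex, threshold)
```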

infer(graph, vertices, threshold=0.0)

Infer the predictions for the specified vertices

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer predictions for. Can be a list of vertices or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the inference results for each vertex

Return type

PgxFrame

infer_and_get_explanation(graph, vertex, num_optimization_steps=200, learning_rate=0.05, marginalize=False, threshold=0.0)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the prediction.

Parameters
  • graph (PgxGraph) – the graph

  • vertex (PgxVertex or int) – the vertex or its ID

  • threshold (float) – decision threshold for classification (unused for regression)

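  • num_optimization_steps (int) – optimization steps for the explainer, defaults to 200

  • learning_rate (float) – learning rate for the explainer, defaults to 0.05

  • marginalize (bool) – marginalize the loss over features, defaults to False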

Returns

explanation containing feature importance and vertex importance.

Return type

SupervisedGnnExplanation

infer_embeddings(graph, vertices)

Infer the embeddings for the specified vertices

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer embeddings for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the embeddings for each vertex

Return type

PgxFrame

infer_labels(graph, vertices, threshold=0.0)

Infer the labels for the specified vertices

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer labels for. Can be a list of vertices or their IDs.

  • threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the labels for each vertex

Return type

PgxFrame

infer_logits(graph, vertices)

Infer the prediction logits for the specified vertices

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer logits for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the logits for each vertex

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool

property loss: Optional[float]

Get the training loss.

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (str) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

property vertex_input_feature_dim: int

Get the dimension of the vertex input features.

class pypgx.api.mllib.UnsupervisedAnomalyDetectionGraphWiseModel(java_graphwise_model, params=None)

Bases: UnsupervisedGraphWiseModel

UnsupervisedAnomalyDetectionGraphWise model object.

check_is_fitted()

Make sure the model is fitted.

Returns

None

Raises

RuntimeError – if the model is not fitted

Return type

None

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None

property edge_input_feature_dim: int

Get the dimension of the edge input features.

evaluate_anomaly_labels(graph, vertices, vertex_anomaly_property_name, anomaly_property_value, threshold)

Evaluate anomaly detection performance statistics for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on

  • vertex_anomaly_property_name (str) – the name of the property containing the anomaly

  • anomaly_property_value (object) – the value indicating an anomaly in vertex_anomaly_property_name property

  • threshold (float) – the anomaly threshold

Raises

LookupError – if the property is not found

Returns

PgxFrame containing the evaluation results.

Return type

PgxFrame

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

find_anomaly_threshold(graph, vertices, contamination_factor)

Find an appropriate anomaly threshold for labeling the input vertices as anomalies, respecting the proportion given by the contamination factor

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer on

  • contamination_factor (float) – the contamination factor

Returns

the threshold

Return type

float

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_dgi_layer_config()

Get the configuration object for the DGI layer.

Returns

configuration

Return type

GraphWiseDgiLayerConfig

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

List[str]

get_embedding_config()

Get the configuration object for the embedding method

Returns

configuration

Return type

GraphWiseEmbeddingConfig

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_loss_function()

Get the loss function name.

Returns

loss function name. Can only be sigmoid_cross_entropy

Return type

str

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_seed()

Get the random seed

Returns

random seed

Return type

int

get_training_loss()

Get the final training loss

Returns

training loss

Return type

Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

List[str]

gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False, num_clusters=50, num_samples=10000)

Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.

Parameters
  • num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200

  • learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05

  • marginalize (bool, optional) – marginalize the loss over features, defaults to False

  • num_clusters (int, optional) – number of clusters to use, defaults to 50

  • num_samples (int, optional) – number of samples to use, defaults to 10000

Returns

UnsupervisedGnnExplainer object of this model

Return type

UnsupervisedGnnExplainer

infer_and_get_explanation(graph, vertex, num_clusters=50, num_samples=10000, num_optimization_steps=200, learning_rate=0.05, marginalize=False)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the embedding's position relative to the embeddings of other vertices in the graph.

Parameters
  • graph (PgxGraph) – the graph

  • vertex (Union[PgxVertex, int]) – the vertex

  • num_clusters (int) – the number of semantic vertex clusters expected in the graph, must be greater than 1

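  • num_samples (int) – number of samples to use, defaults to 10000

  • num_optimization_steps (int) – optimization steps for the explainer, defaults to 200

  • learning_rate (float) – learning rate for the explainer, defaults to 0.05

  • marginalize (bool) – marginalize the loss over features, defaults to False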

Returns

explanation containing feature importance and vertex importance.

Return type

GnnExplanation

infer_anomaly_labels(graph, vertices, threshold)

Infer the anomaly labels for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer on

  • threshold (float) – the anomaly threshold

Returns

PgxFrame containing the anomaly labels for each vertex.

Return type

PgxFrame
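
The threshold passed to infer_anomaly_labels can be obtained from find_anomaly_threshold. The sketch below chains the two documented calls on an UnsupervisedAnomalyDetectionGraphWiseModel-like object. The helper name is hypothetical, and the model is assumed to be fitted already.

```python
def label_anomalies(model, graph, vertices, contamination_factor):
    """Derive a threshold from the contamination factor, then label vertices."""
    # Materialize the iterable once so it can be passed to both calls.
    vertices = list(vertices)
    threshold = model.find_anomaly_threshold(graph, vertices, contamination_factor)
    labels = model.infer_anomaly_labels(graph, vertices, threshold)
    return threshold, labels
```

Returning the threshold alongside the labels makes it possible to reuse the same value later, for example with evaluate_anomaly_labels.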

infer_anomaly_scores(graph, vertices)

Infer the anomaly scores for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer on

Returns

PgxFrame containing the anomaly scores for each vertex.

Return type

PgxFrame

infer_embeddings(graph, vertices)

Infer the embeddings for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer embeddings for

Returns

PgxFrame containing the embeddings for each vertex.

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool

property loss: Optional[float]

Get the training loss.

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (str) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

property vertex_input_feature_dim: int

Get the dimension of the vertex input features.

class pypgx.api.mllib.UnsupervisedEdgeWiseModel(java_edgewise_model, params=None)

Bases: EdgeWiseModel

UnsupervisedEdgeWise model object.

close()

Call destroy

Return type

None

destroy()

Destroy this model object.

Return type

None

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_dgi_layer_config()

Get the configuration object for the DGI layer.

Returns

configuration

Return type

GraphWiseDgiLayerConfig

get_edge_combination_method()

Get the edge combination method used to compute the edge embedding

Returns

edge combination method

Return type

EdgeCombinationMethod

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

List[str]

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_loss_function()

Get the loss function name.

Returns

loss function name: sigmoid_cross_entropy

Return type

str

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_seed()

Get the random seed

Returns

random seed

Return type

int

get_target_edge_labels()

Get the target edge labels

Returns

target edge labels

Return type

List[str]

get_training_loss()

Get the final training loss

Returns

training loss

Return type

float

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

List[str]

infer_embeddings(graph, edges)

Infer the embeddings for the specified edges

Parameters
  • graph (PgxGraph) – the graph

  • edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer embeddings for. Can be a list of edges or their IDs.

Returns

PgxFrame containing the embeddings for each edge

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (str) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

class pypgx.api.mllib.UnsupervisedGnnExplainer(java_explainer)

Bases: GnnExplainer

UnsupervisedGnnExplainer used to request explanations from unsupervised model predictions.

infer_and_explain(graph, vertex)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the embedding's position relative to the embeddings of other vertices in the graph.

Parameters
  • graph (PgxGraph) – The graph.

  • vertex (Union[PgxVertex, int, str]) – The vertex.

Returns

Explanation containing feature importance and vertex importance.

Return type

GnnExplanation

property learning_rate: float

Get learning rate.

Returns

learning rate

Return type

float

property marginalize: bool

Get value of marginalize.

Returns

value of marginalize

Return type

bool

property num_clusters: int

Get number of clusters.

Returns

Number of clusters.

Return type

int

property num_optimization_steps: int

Get number of optimization steps.

Returns

number of optimization steps

Return type

int

property num_samples: int

Get number of samples.

Returns

Number of samples.

Return type

int
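
The read-only properties above can be collected into a dictionary, for example to log the explainer configuration next to an explanation. This is a minimal sketch: the helper name explainer_settings is hypothetical, and explainer is assumed to be an UnsupervisedGnnExplainer obtained from a fitted model.

```python
from typing import Any, Dict


def explainer_settings(explainer: Any) -> Dict[str, Any]:
    """Read the documented configuration properties of an explainer.

    `explainer` is assumed to expose the five properties listed above;
    this helper is illustrative and not part of pypgx.
    """
    return {
        "learning_rate": explainer.learning_rate,
        "marginalize": explainer.marginalize,
        "num_clusters": explainer.num_clusters,
        "num_optimization_steps": explainer.num_optimization_steps,
        "num_samples": explainer.num_samples,
    }
```

Only property reads are used, so the helper does not modify the explainer.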

class pypgx.api.mllib.UnsupervisedGnnExplanation(java_unsupervised_gnn_explanation)

Bases: GnnExplanation

UnsupervisedGnnExplanation object

get_embedding()

Get the inferred embedding of the specified vertex.

Returns

the embedding

Return type

List[float]

get_importance_graph()

Get the importance graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property().

Returns

the importance graph

Return type

PgxGraph

get_vertex_feature_importance()

Get the feature importances as a map from property to importance value.

Returns

the feature importances.

Return type

Dict[VertexProperty, float]

get_vertex_importance_property()

Get the vertex property that contains the computed vertex importance.

Returns

the vertex importance property

Return type

VertexProperty
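
Because get_vertex_feature_importance() returns a mapping from property to importance value, the most influential properties can be found by sorting that mapping. The sketch below assumes explanation is an UnsupervisedGnnExplanation; the helper name rank_feature_importance and the top_k argument are hypothetical.

```python
from typing import Any, List, Tuple


def rank_feature_importance(explanation: Any, top_k: int = 5) -> List[Tuple[Any, float]]:
    """Return the top_k (property, importance) pairs, most important first.

    Illustrative helper, not part of pypgx. It relies only on the
    documented get_vertex_feature_importance() call.
    """
    importances = explanation.get_vertex_feature_importance()
    # Sort by importance value in descending order.
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

The keys are returned unchanged, so with a real explanation they are the VertexProperty objects of the mapping.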

class pypgx.api.mllib.UnsupervisedGraphWiseModel(java_graphwise_model, params=None)

Bases: GraphWiseModel

UnsupervisedGraphWise model object.

check_is_fitted()

Make sure the model is fitted.

Returns

None

Raises

RuntimeError – If the model is not fitted.

Return type

None

close()

Destroy this model object. Equivalent to calling destroy().

Return type

None

destroy()

Destroy this model object.

Return type

None

property edge_input_feature_dim: int

Get the dimension of the edge input features.

export()

Return a ModelStorer object which can be used to save the model.

Returns

ModelStorer object

Return type

ModelStorer

fit(graph)

Fit the model on a graph.

Parameters

graph (PgxGraph) – Graph to fit on

Returns

None

Return type

None
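
A typical pattern is to fit only when the model has not been fitted yet and then read the training loss. The following is a sketch under that assumption; fit_if_needed is a hypothetical helper that uses only is_fitted(), fit(), check_is_fitted() and get_training_loss() from this class.

```python
from typing import Any, Optional


def fit_if_needed(model: Any, graph: Any) -> Optional[float]:
    """Fit the model unless it is already fitted, then return the training loss.

    Illustrative helper, not part of pypgx.
    """
    if not model.is_fitted():
        model.fit(graph)
    # check_is_fitted() raises RuntimeError if the model is still not fitted.
    model.check_is_fitted()
    return model.get_training_loss()
```

Calling the helper a second time on the same model skips the training step and only returns the stored loss.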

get_batch_size()

Get the batch size

Returns

batch size

Return type

int

get_config()

Return the GraphWiseModelConfig object

Returns

the config

Return type

GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns

configurations

Return type

List[GraphWiseConvLayerConfig]

get_dgi_layer_config()

Get the configuration object for the DGI layer.

Returns

configuration

Return type

GraphWiseDgiLayerConfig

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns

edges input feature dimension

Return type

int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns

edges input feature names

Return type

list(str)

get_embedding_config()

Get the configuration object for the embedding method

Returns

configuration

Return type

GraphWiseEmbeddingConfig

get_layer_size()

Get the dimension of the embeddings

Returns

embedding dimension

Return type

int

get_learning_rate()

Get the initial learning rate

Returns

initial learning rate

Return type

float

get_loss_function()

Get the loss function name.

Returns

loss function name. Can only be sigmoid_cross_entropy

Return type

str

get_num_epochs()

Get the number of epochs to train the model

Returns

number of epochs to train the model

Return type

int

get_seed()

Get the random seed

Returns

random seed

Return type

int

get_training_loss()

Get the final training loss

Returns

training loss

Return type

Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns

input feature dimension

Return type

int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns

configurations

Return type

List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns

vertices input feature names

Return type

list(str)
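
The property-name getters can be used to check, before inference, that a graph offers every vertex property the model was configured with. This is a sketch only: missing_vertex_properties is a hypothetical helper, and the caller is assumed to supply the names of the properties available on the target graph.

```python
from typing import Any, Iterable, List


def missing_vertex_properties(model: Any, available_names: Iterable[str]) -> List[str]:
    """List the model's vertex input properties that are absent from available_names.

    Illustrative helper, not part of pypgx. `available_names` is supplied
    by the caller; how it is obtained from a graph is not shown here.
    """
    available = set(available_names)
    # Treat a missing name list as "no input properties".
    required = model.get_vertex_input_property_names() or []
    return [name for name in required if name not in available]
```

An empty result means every required vertex property name was found in available_names.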

gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False, num_clusters=50, num_samples=10000)

Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.

Parameters
  • num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200

  • learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05

  • marginalize (bool, optional) – marginalize the loss over features, defaults to False

  • num_clusters (int, optional) – number of clusters to use, defaults to 50

  • num_samples (int, optional) – number of samples to use, defaults to 10000

Returns

UnsupervisedGnnExplainer object of this model

Return type

UnsupervisedGnnExplainer

infer_and_get_explanation(graph, vertex, num_clusters=50, num_samples=10000, num_optimization_steps=200, learning_rate=0.05, marginalize=False)

Perform inference on the specified vertex and generate an explanation. The explanation scores how important each property and each vertex in the computation graph is for the position of the embedding relative to the embeddings of other vertices in the graph.

Parameters
  • graph (PgxGraph) – the graph

  • vertex (Union[PgxVertex, int]) – the vertex

  • num_clusters (int) – the number of semantic vertex clusters expected in the graph, must be greater than 1

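  • num_samples (int) – the number of samples to use

  • num_optimization_steps (int) – the number of optimization steps for the explainer

  • learning_rate (float) – the learning rate for the explainer

  • marginalize (bool) – whether to marginalize the loss over features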

Returns

explanation containing feature importance and vertex importance.

Return type

GnnExplanation
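
Since num_clusters must be greater than 1, it can be worth validating the value before starting the computation. The sketch below wraps infer_and_get_explanation with that check; explain_vertex is a hypothetical helper, and the remaining arguments are left at their documented defaults.

```python
from typing import Any


def explain_vertex(model: Any, graph: Any, vertex: Any, num_clusters: int = 50) -> Any:
    """Request an explanation for one vertex, validating num_clusters first.

    Illustrative helper, not part of pypgx.
    """
    if num_clusters <= 1:
        # The parameter description requires num_clusters to be greater than 1.
        raise ValueError("num_clusters must be greater than 1")
    # Raises RuntimeError if the model has not been fitted.
    model.check_is_fitted()
    return model.infer_and_get_explanation(graph, vertex, num_clusters=num_clusters)
```

The returned object is whatever infer_and_get_explanation produces, documented above as a GnnExplanation.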

infer_embeddings(graph, vertices)

Infer the embeddings for the specified vertices.

Parameters
  • graph (PgxGraph) – the graph

  • vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer embeddings for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the embeddings for each vertex.

Return type

PgxFrame
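
For a long list of vertex IDs, one option is to call infer_embeddings on fixed-size chunks and keep one result per chunk. This is a sketch, not a recommendation from the library: infer_embeddings_in_chunks and chunk_size are hypothetical, and combining the resulting frames is left to the caller.

```python
from typing import Any, List, Sequence


def infer_embeddings_in_chunks(model: Any, graph: Any, vertex_ids: Sequence[Any],
                               chunk_size: int = 1000) -> List[Any]:
    """Call infer_embeddings once per chunk of vertex IDs and collect the frames.

    Illustrative helper, not part of pypgx.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    frames = []
    for start in range(0, len(vertex_ids), chunk_size):
        # Each call receives a plain list holding at most chunk_size IDs.
        chunk = list(vertex_ids[start:start + chunk_size])
        frames.append(model.infer_embeddings(graph, chunk))
    return frames
```

With a real model, each list element is the PgxFrame returned for the corresponding chunk.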

is_fitted()

Check if the model is fitted

Returns

True if the model is fitted, False otherwise

Return type

bool

property loss: Optional[float]

Get the training loss.

store(path, key, overwrite=False)

Store the model in a file.

Parameters
  • path (str) – Path where to store the model

  • key (str) – Encryption key

  • overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

property vertex_input_feature_dim: int

Get the dimension of the vertex input features.

class pypgx.api.mllib._model_utils.ModelStorer(model)

Bases: object

ModelStorer object.

Parameters

model (Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel, Model]) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, model_description=None, overwrite=False, keystore_alias=None, schema=None)

Store a model to a database.

Parameters
  • model_store (str) – Model store in database.

  • model_name (str) – name of the model to store.

  • username (Optional[str]) – Username in database.

  • password (Optional[str]) – Password of username in database.

  • jdbc_url (Optional[str]) – JDBC url of database.

  • model_description (Optional[str]) – Description of model.

  • overwrite (bool) – Boolean value for overwriting or not.

  • keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.

  • schema (Optional[str]) – The schema of the model store in database.

Raises

RuntimeError – If the model is not fitted.

Return type

None

file(path, key, overwrite=False)

Store an encrypted model to a file.

Parameters
  • path (str) – Path to store model.

  • key (str) – Key used for encryption.

  • overwrite (bool) – Boolean value for overwriting or not.

Raises

RuntimeError – If the model is not fitted.

Return type

None
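
A model's export() method returns a ModelStorer, so storing to an encrypted file can be written as a two-step call. The sketch below assumes model is one of the model types listed for ModelStorer; save_model_to_file is a hypothetical helper that checks is_fitted() first so the error message can name the target path.

```python
from typing import Any


def save_model_to_file(model: Any, path: str, key: str, overwrite: bool = False) -> None:
    """Store a fitted model as an encrypted file through its ModelStorer.

    Illustrative helper, not part of pypgx.
    """
    if not model.is_fitted():
        # ModelStorer.file raises RuntimeError for an unfitted model;
        # fail early with a message that names the target path.
        raise RuntimeError(f"model is not fitted; nothing stored at {path}")
    storer = model.export()  # documented to return a ModelStorer
    storer.file(path, key, overwrite=overwrite)
```

The key is passed through unchanged; where it comes from (for example a secret store) is up to the caller.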

class pypgx.api.mllib._model_utils.ModelLoader(analyst, java_model_loader, wrapper, java_class)

Bases: object

ModelLoader object.

Parameters
  • analyst (Analyst) –

  • java_model_loader (Any) –

  • wrapper (Callable) –

  • java_class (str) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)

Return a model stored in a database.

Parameters
  • model_store (str) – Model store in database.

  • model_name (str) – Name of the model to load.

  • username (Optional[str]) – Username in database.

  • password (Optional[str]) – Password of username in database.

  • jdbc_url (Optional[str]) – JDBC url of database.

  • keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.

  • schema (Optional[str]) – The schema of the model store in database.

Returns

The model stored in database.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel]

file(path, key)

Return an encrypted model stored in a file.

Parameters
  • path (str) – Path of stored model.

  • key (str) – Key used for encryption.

Returns

The model stored in file.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel]
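
The two loader methods can be wrapped in one function that picks the source from its arguments. This is a sketch under stated assumptions: load_model is a hypothetical helper, loader is assumed to be a ModelLoader obtained elsewhere (how it is created is not covered in this section), and any extra keyword arguments such as username, jdbc_url, keystore_alias or schema are forwarded to db() unchanged.

```python
from typing import Any, Optional


def load_model(loader: Any, *, path: Optional[str] = None, key: Optional[str] = None,
               model_store: Optional[str] = None, model_name: Optional[str] = None,
               **db_options: Any) -> Any:
    """Load a model from a file if path is given, otherwise from a database.

    Illustrative helper, not part of pypgx.
    """
    if path is not None:
        if key is None:
            raise ValueError("key is required when loading from a file")
        return loader.file(path, key)
    if model_store is None or model_name is None:
        raise ValueError("model_store and model_name are required when loading from a database")
    # db_options may carry username, password, jdbc_url, keystore_alias, schema.
    return loader.db(model_store, model_name, **db_options)
```

Passing keystore_alias instead of password lets the database password be looked up in the keystore, as described for the db() parameters.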