MLlib

Graph machine learning tools for use with PGX.

class pypgx.api.mllib.CategoricalPropertyConfig(java_config, params)

Bases: InputPropertyConfig

Configuration class for handling categorical input properties.

property categorical: bool

Get whether the feature is categorical.

Returns: whether the feature is categorical

property categorical_embedding_type: bool

Get the type of categorical embedding.

Returns: type of categorical embedding

property max_vocabulary_size: int

Get the maximum number of tokens allowed in the vocabulary of a categorical feature. The max_vocabulary_size most frequent category values are kept; the rest are treated as out-of-vocabulary (OOV) tokens.

Returns: maximum number of tokens allowed
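
The truncation rule described above can be illustrated in plain Python. This is only a sketch of the idea, not PGX's implementation: the helper names and the `<OOV>` placeholder string are hypothetical, and how PGX breaks ties between equally frequent values is not stated here.

```python
from collections import Counter

OOV = "<OOV>"  # hypothetical placeholder token, not a PGX constant


def build_vocabulary(values, max_vocabulary_size):
    """Keep the most frequent category values, at most max_vocabulary_size of them."""
    counts = Counter(values)
    return {value for value, _ in counts.most_common(max_vocabulary_size)}


def apply_vocabulary(values, vocabulary):
    """Replace every value outside the vocabulary with the OOV token."""
    return [value if value in vocabulary else OOV for value in values]


colors = ["red", "red", "red", "blue", "blue", "green"]
vocab = build_vocabulary(colors, max_vocabulary_size=2)
encoded = apply_vocabulary(colors, vocab)
```

With a limit of 2, "red" and "blue" stay in the vocabulary and the single "green" value is mapped to the OOV token.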

property property_name: str

Get the name of the feature that the configuration is used for.

Returns: name of the feature that the configuration is used for

set_max_vocabulary_size(max_vocabulary_size)

Set max vocabulary size to a given value.

Parameters: max_vocabulary_size (int) – set the maximum vocabulary size to the given value
Return type: None

set_shared(shared)

Set whether the feature is shared among vertex/edge types.

Parameters: shared (bool) – whether the feature is shared among vertex/edge types
Return type: None

property shared: bool

Get whether the feature is shared among vertex/edge types.

Returns: whether the feature is shared among vertex/edge types

class pypgx.api.mllib.ConcatEdgeCombinationMethod(use_source_vertex, use_destination_vertex, use_edge)

Bases: EdgeCombinationMethod

Concatenation method for edge embedding generation

Parameters

use_source_vertex (bool) –
use_destination_vertex (bool) –
use_edge (bool) –

get_aggregation_type()

Get the aggregation type

Returns: the aggregation type

Return type: str

use_dst_vertex()

Get whether the destination vertex embedding is used for the edge embedding.

Returns: whether the destination vertex embedding is used
Return type: bool

use_edge()

Get whether the edge features are used for the edge embedding.

Returns: whether the edge features are used
Return type: bool

use_src_vertex()

Get whether the source vertex embedding is used for the edge embedding.

Returns: whether the source vertex embedding is used
Return type: bool
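
As a rough illustration of what a concatenation-based edge combination does, the sketch below joins the selected vectors into one edge embedding. The function name and the source, destination, edge ordering are assumptions made for the example; the actual layout used by PGX is not documented in this section.

```python
def concat_edge_embedding(src_vec, dst_vec, edge_vec,
                          use_source_vertex=True,
                          use_destination_vertex=True,
                          use_edge=True):
    """Concatenate the enabled parts into a single edge embedding (illustrative only)."""
    parts = []
    if use_source_vertex:
        parts.extend(src_vec)   # embedding of the source vertex
    if use_destination_vertex:
        parts.extend(dst_vec)   # embedding of the destination vertex
    if use_edge:
        parts.extend(edge_vec)  # features of the edge itself
    return parts


full = concat_edge_embedding([1.0, 2.0], [3.0, 4.0], [5.0])
vertices_only = concat_edge_embedding([1.0, 2.0], [3.0, 4.0], [5.0], use_edge=False)
```

Disabling a flag simply shortens the resulting vector, which is why the three flags determine the input dimension of whatever layer consumes the edge embedding.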

class pypgx.api.mllib.CorruptionFunction(java_corruption_function)

Bases: object

Abstract corruption function that generates the corrupted subgraph for DGI.

class pypgx.api.mllib.DeepWalkModel(java_deepwalk_model)

Bases: Model

DeepWalk model object.

property batch_size: int: Get the batch size.

close()

Call destroy

Return type: None

compute_similars(v, k)

Compute the top-k similar vertices for a given vertex.

Parameters

v (Union[int, str, List[int], List[str]]) – id of the vertex or list of vertex ids for which to compute the similar vertices
k (int) – number of similar vertices to return

Return type

PgxFrame
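
Conceptually, a top-k similarity query ranks the other vertices by how close their trained vectors are to the query vertex. The sketch below assumes cosine similarity and leaves the query vertex out of the result; both are assumptions for illustration, since this section does not state the metric or whether PGX includes the query vertex in its output.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(vectors, query_id, k):
    """Return the k (vertex_id, similarity) pairs most similar to query_id."""
    query = vectors[query_id]
    scored = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != query_id]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
neighbors = top_k_similar(vectors, "a", k=2)
```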

destroy()

Destroy this model object.

Return type: None

property enable_accelerator: bool: Get whether the accelerator is used if available.

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

fit(graph)

Fit the model on a graph.

Parameters: graph (PgxGraph) – Graph to fit on
Return type: None

is_fitted()

Whether or not the model has been fitted.

Return type: bool

property layer_size: int: Get the layer size.

property learning_rate: float: Get the learning rate.

property loss: Optional[float]: Get the loss of the model.

property min_learning_rate: float: Get the minimum learning rate.

property min_word_frequency: int: Get the minimum word frequency.

property negative_sample: int: Get the negative sample.

property num_epochs: int: Get the number of epochs.

property sample_rate: float: Get the sample rate.

property seed: Optional[int]: Get the seed.

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (Optional[str]) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Return type

None

property trained_vectors: PgxFrame

Get the trained vertex vectors for the current DeepWalk model.

Returns: PgxFrame object with the trained vertex vectors
Return type: PgxFrame

property walk_length: int: Get the walk length.

property walks_per_vertex: int: Get the number of walks per vertex.

property window_size: int: Get the window size.

class pypgx.api.mllib.DevNetLoss(confidence_margin, anomaly_property_value)

Bases: LossFunction

Deviation loss for anomaly detection

Parameters

confidence_margin (float) –
anomaly_property_value (bool) –

get_anomaly_property_value()

Get Anomaly Property Value.

Returns: the anomaly property value
Return type: Any

get_confidence_margin()

Get confidence margin of the loss function.

Returns: the confidence margin
Return type: float
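
To show how the two parameters interact, here is a minimal sketch of a deviation loss as it is commonly formulated for deviation networks: normal samples are pulled toward the reference score, while anomalies are pushed at least confidence_margin standard deviations above it. The function name, the reference mean and standard deviation arguments, and the per-sample form are assumptions for the example; PGX's internal computation is not shown in this section.

```python
def deviation_loss(score, label, confidence_margin, anomaly_property_value=True,
                   ref_mean=0.0, ref_std=1.0):
    """Per-sample deviation loss (illustrative formulation, not PGX code)."""
    # Standardized distance of the anomaly score from the reference scores.
    deviation = (score - ref_mean) / ref_std
    if label == anomaly_property_value:
        # Anomalies are penalized until they exceed the margin.
        return max(0.0, confidence_margin - deviation)
    # Normal samples are penalized for any deviation from the reference.
    return abs(deviation)


normal_loss = deviation_loss(0.5, label=False, confidence_margin=5.0)
weak_anomaly_loss = deviation_loss(2.0, label=True, confidence_margin=5.0)
strong_anomaly_loss = deviation_loss(6.0, label=True, confidence_margin=5.0)
```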

class pypgx.api.mllib.EdgeWiseModel(java_edgewise_model, params=None)

Bases: Model

EdgeWise model object.

This is a base class for SupervisedEdgeWiseModel.

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_edge_combination_method()

Get the edge combination method used to compute the edge embedding

Returns: edge combination method
Return type: EdgeCombinationMethod

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: list(str)

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_training_log()

Get the log of validation during the training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: float

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: list(str)

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool
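
The getters above can be combined into a quick hyperparameter summary. The helper below calls only methods documented in this section; `FakeEdgeWiseModel` is a hypothetical stand-in with made-up values so that the sketch runs without a PGX session, and a real model object would be passed instead.

```python
def summarize_hyperparameters(model):
    """Collect a model's documented getters into a plain dict."""
    return {
        "batch_size": model.get_batch_size(),
        "layer_size": model.get_layer_size(),
        "learning_rate": model.get_learning_rate(),
        "num_epochs": model.get_num_epochs(),
        "seed": model.get_seed(),
        "fitted": model.is_fitted(),
    }


class FakeEdgeWiseModel:
    """Hypothetical stand-in; the values are arbitrary and not PGX defaults."""

    def get_batch_size(self):
        return 128

    def get_layer_size(self):
        return 64

    def get_learning_rate(self):
        return 0.01

    def get_num_epochs(self):
        return 5

    def get_seed(self):
        return 42

    def is_fitted(self):
        return False


summary = summarize_hyperparameters(FakeEdgeWiseModel())
```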

class pypgx.api.mllib.EdgeWiseModelConfig(java_edgewise_model_config)

Bases: object

Edgewise Model Configuration class

property backend: str: Get the backend.

property batch_size: int: Get the batch size.

property conv_layer_configs: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]: Get the conv layer configs.

property edge_combination_method: EdgeCombinationMethod: Get the edge combination method.

property edge_input_feature_dim: int: Get the edge input feature dimension.

property edge_input_property_configs: Dict[str, InputPropertyConfig]: Get the edge input property configs.

property edge_input_property_names: Optional[List[str]]: Get the edge input property names.

property embedding_dim: int: Get the embedding dimension.

property enable_accelerator: str: Get whether to use the accelerator if available.

get_conv_layer_configs()

Return a list of conv layer configs

Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_validation_config()

Return the validation config

Return type: GraphWiseValidationConfig

property input_feature_dim: int: Get the input feature dimension.

property is_fitted: bool: Whether or not the model is fitted.

property learning_rate: float: Get the learning rate.

property normalize: bool: Whether or not normalization is enabled.

property num_epochs: int: Get the number of epochs.

property seed: int: Get the seed.

set_batch_size(batch_size)

Set the batch size

Parameters: batch_size (int) – batch size
Return type: None

set_edge_combination_method(edge_combination_method)

Set the edge combination method

Parameters: edge_combination_method (EdgeCombinationMethod) – edge combination method
Return type: None

set_edge_input_feature_dim(edge_input_feature_dim)

Set the edge input feature dimension

Parameters: edge_input_feature_dim (int) – edge input feature dimension
Return type: None

set_embedding_dim(embedding_dim)

Set the embedding dimension

Parameters: embedding_dim (int) – embedding dimension
Return type: None

set_enable_accelerator(enable_accelerator)

Set whether to use the accelerator if available

Parameters: enable_accelerator (bool) – enable accelerator flag
Return type: None

set_fitted(fitted)

Set the fitted flag

Parameters: fitted (bool) – fitted flag
Return type: None

set_input_feature_dim(input_feature_dim)

Set the input feature dimension

Parameters: input_feature_dim (int) – input feature dimension
Return type: None

set_learning_rate(learning_rate)

Set the learning rate

Parameters: learning_rate (int) – initial learning rate
Return type: None

set_normalize(normalize)

Whether or not normalization is enabled.

Parameters: normalize (bool) –
Return type: None

set_num_epochs(num_epochs)

Set the number of epochs

Parameters: num_epochs (int) – number of epochs
Return type: None

set_seed(seed)

Set the seed

Parameters: seed (int) – seed
Return type: None

set_shuffle(shuffle)

Set the shuffling flag

Parameters: shuffle (bool) – shuffling flag
Return type: None

set_standardize(standardize)

Set the standardize flag

Parameters: standardize (bool) – standardize flag
Return type: None

set_training_loss(training_loss)

Set the training loss

Parameters: training_loss (float) – training loss
Return type: None

set_weight_decay(weight_decay)

Set the weight decay

Parameters: weight_decay (float) – weight decay
Return type: None

property shuffle: bool: Whether or not shuffling is enabled.

property standardize: bool: Whether or not standardization is enabled.

property training_loss: float: Get the training loss.

property validation_config: GraphWiseValidationConfig: Get the validation config.

property vertex_input_property_configs: Dict[str, InputPropertyConfig]: Get the vertex input property configs.

property vertex_input_property_names: Optional[List[str]]: Get the vertex input property names.

property weight_decay: float: Get the weight decay.

class pypgx.api.mllib.EmbeddingTableConfig(java_config, params)

Bases: CategoricalPropertyConfig

Configuration class for handling categorical input properties using embedding table method.

property categorical: bool

Get whether the feature is categorical.

Returns: whether the feature is categorical

property categorical_embedding_type: bool

Get the type of categorical embedding.

Returns: type of categorical embedding

property embedding_dim: int

Get the embedding dimension.

Returns: embedding dimension

property max_vocabulary_size: int

Get the maximum number of tokens allowed in the vocabulary of a categorical feature. The max_vocabulary_size most frequent category values are kept; the rest are treated as out-of-vocabulary (OOV) tokens.

Returns: maximum number of tokens allowed

property oov_probability: float

Get the probability of randomly setting the category value to OOV token during training.

Returns: probability of using OOV token
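
The effect of this probability can be sketched as random masking of category values during training, so that the OOV entry also receives training signal. The function name, the `<OOV>` string and the use of Python's `random` module are assumptions for illustration; PGX's sampling procedure is not described here.

```python
import random

OOV = "<OOV>"  # hypothetical placeholder token, not a PGX constant


def mask_with_oov(values, oov_probability, seed=None):
    """Replace each value with the OOV token independently with the given probability."""
    rng = random.Random(seed)
    return [OOV if rng.random() < oov_probability else value for value in values]


categories = ["red", "blue", "green", "red"]
never_masked = mask_with_oov(categories, oov_probability=0.0, seed=1)
always_masked = mask_with_oov(categories, oov_probability=1.0, seed=1)
```

A probability of 0.0 leaves the input untouched and 1.0 masks every value; values in between mask a random subset on each pass.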

property property_name: str

Get the name of the feature that the configuration is used for.

Returns: name of the feature that the configuration is used for

set_embedding_dimension(embedding_dim)

Set the embedding dimension.

Parameters: embedding_dim (int) – embedding dimension
Return type: None

set_max_vocabulary_size(max_vocabulary_size)

Set max vocabulary size to a given value.

Parameters: max_vocabulary_size (int) – set the maximum vocabulary size to the given value
Return type: None

set_oov_probability(oov_proba)

Set the out of vocabulary probability.

Parameters: oov_proba (float) – out of vocabulary probability
Return type: None

set_shared(shared)

Set whether the feature is shared among vertex/edge types.

Parameters: shared (bool) – whether the feature is shared among vertex/edge types
Return type: None

property shared: bool

Get whether the feature is shared among vertex/edge types.

Returns: whether the feature is shared among vertex/edge types

class pypgx.api.mllib.GnnExplainer(java_explainer)

Bases: object

GnnExplainer object used to request explanations from model predictions.

property learning_rate: float

Get learning rate.

Returns: learning rate
Return type: float

property marginalize: bool

Get value of marginalize.

Returns: value of marginalize
Return type: bool

property num_optimization_steps: int

Get number of optimization steps.

Returns: number of optimization steps
Return type: int

class pypgx.api.mllib.GnnExplanation(java_gnn_explanation)

Bases: object

GnnExplanation object

get_embedding()

Get the inferred embedding of the specified vertex.

Returns: the embedding
Return type: List[float]

get_importance_graph()

Get the importance graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.

Returns: the importance graph
Return type: PgxGraph

get_vertex_feature_importance()

Get the feature importances as a map from property to importance value.

Returns: the feature importances.
Return type: Dict[VertexProperty, float]

get_vertex_importance_property()

Get the vertex property that contains the computed vertex importance.

Returns: the vertex importance property
Return type: VertexProperty
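
A common follow-up is to rank the features by importance. The helper below uses only get_vertex_feature_importance() as documented above; `FakeExplanation` is a hypothetical stand-in that uses plain strings as keys, whereas a real explanation maps VertexProperty objects to floats.

```python
def top_features(explanation, n=3):
    """Return the n (property, importance) pairs with the highest importance."""
    importances = explanation.get_vertex_feature_importance()
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


class FakeExplanation:
    """Hypothetical stand-in so the sketch runs without a PGX session."""

    def get_vertex_feature_importance(self):
        return {"age": 0.1, "income": 0.7, "degree": 0.2}


ranked = top_features(FakeExplanation(), n=2)
```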

class pypgx.api.mllib.GraphWiseAttentionLayerConfig(java_config, params)

Bases: GraphWiseBaseConvLayerConfig

GraphWise attention layer configuration.

property activation_fn: Any: Get the activation function.

property edge_to_edge_connection: Optional[bool]: Get the edge to edge connection.

property edge_to_vertex_connection: Optional[bool]: Get the edge to vertex connection.

property head_aggregation: Any: Get the aggregation operation for heads.

property num_heads: int: Get the number of heads.

property num_sampled_neighbors: int: Get the number of sampled neighbors.

property vertex_to_edge_connection: Optional[bool]: Get the vertex to edge connection.

property vertex_to_vertex_connection: Optional[bool]: Get the vertex to vertex connection.

property weight_init_scheme: Any: Get the weight initialization scheme.

class pypgx.api.mllib.GraphWiseConvLayerConfig(java_config, params)

Bases: GraphWiseBaseConvLayerConfig

GraphWise conv layer configuration.

property activation_fn: Any: Get the activation function.

property edge_to_edge_connection: Optional[bool]: Get the edge to edge connection.

property edge_to_vertex_connection: Optional[bool]: Get the edge to vertex connection.

property neighbor_weight_property_name: str: Get the name of the property that stores the weight of the edge.

property num_sampled_neighbors: int: Get the number of sampled neighbors.

property vertex_to_edge_connection: Optional[bool]: Get the vertex to edge connection.

property vertex_to_vertex_connection: Optional[bool]: Get the vertex to vertex connection.

property weight_init_scheme: Any: Get the weight initialization scheme.

class pypgx.api.mllib.GraphWiseDgiLayerConfig(java_config, params)

Bases: GraphWiseEmbeddingConfig

GraphWise DGI layer configuration.

get_corruption_function()

Return the corruption function

Return type: PermutationCorruption

get_discriminator()

Return the discriminator

Return type: str

get_embedding_type()

Return the embedding type used by this config

Return type: str

get_readout_function()

Return the readout function

Return type: str

set_corruption_function(corruption_function)

Set the corruption function

Parameters: corruption_function (CorruptionFunction) – the corruption function. Supported currently: PermutationCorruption

set_discriminator(discriminator)

Set the discriminator

Parameters: discriminator (str) – The discriminator function. Supported currently: ‘bilinear’
Return type: None

set_readout_function(readout_function)

Set the readout function

Parameters: readout_function (str) – The readout function. Supported currently: ‘mean’
Return type: None

class pypgx.api.mllib.GraphWiseDominantLayerConfig(java_config, params)

Bases: GraphWiseEmbeddingConfig

GraphWise dominant layer configuration.

get_alpha()

Return alpha.

Returns: alpha of the decoder layer
Return type: float

get_decoder_layer_configs()

Get the configuration objects for the decoder layers.

Returns: configuration of the decoder layer
Return type: GraphWisePredictionLayerConfig

get_embedding_type()

Return the embedding type used by this config

Returns: embedding type
Return type: str

set_alpha(alpha)

Set the alpha parameter

Parameters: alpha (float) – The alpha parameter to set.

set_decoder_layer_configs(decoder_layer_configs)

Set the configuration objects for the decoder layers.

Parameters: decoder_layer_configs (GraphWisePredictionLayerConfig) – configuration of the decoder layer
Return type: GraphWisePredictionLayerConfig

class pypgx.api.mllib.GraphWiseEmbeddingConfig

Bases: object

GraphWise embedding configuration.

get_embedding_type()

Return the embedding type used by this config

Return type: str

class pypgx.api.mllib.GraphWiseModel(java_graphwise_model, params=None)

Bases: Model

GraphWise model object.

This is a base class for UnsupervisedGraphWiseModel and SupervisedGraphWiseModel.

check_is_fitted()

Make sure the model is fitted.

Raises: RuntimeError if the model is not fitted
Return type: None

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

property edge_input_feature_dim: int: Get the dimension of the edge input features.

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: list(str)

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_training_log()

Get the log of validation during the training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: list(str)

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool

property loss: Optional[float]: Get the training loss.

property vertex_input_feature_dim: int: Get the dimension of the vertex input features.

class pypgx.api.mllib.GraphWiseModelConfig(java_graphwise_model_config)

Bases: object

Graphwise Model Configuration class

property backend: str: Get the backend.

property batch_size: int: Get the batch size.

property conv_layer_configs: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]: Get the conv layer configs.

property edge_input_feature_dim: int: Get the edge input feature dimension.

property edge_input_property_configs: Dict[str, InputPropertyConfig]: Get the edge input property configs.

property edge_input_property_names: Optional[List[str]]: Get the edge input property names.

property embedding_dim: int: Get the embedding dimension.

property enable_accelerator: str: Get whether to use the accelerator if available.

property fitted: bool: Whether or not the model is fitted.

get_conv_layer_configs()

Return a list of conv layer configs

Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_validation_config()

Return the validation config

Return type: GraphWiseValidationConfig

property input_feature_dim: int: Get the input feature dimension.

property is_fitted: bool: Whether or not the model is fitted.

property learning_rate: float: Get the learning rate.

property normalize: bool: Whether or not normalization is enabled.

property num_epochs: int: Get the number of epochs.

property seed: int: Get the seed.

set_batch_size(batch_size)

Set the batch size

Parameters: batch_size (int) – batch size
Return type: None

set_edge_input_feature_dim(edge_input_feature_dim)

Set the edge input feature dimension

Parameters: edge_input_feature_dim (int) – edge input feature dimension
Return type: None

set_embedding_dim(embedding_dim)

Set the embedding dimension

Parameters: embedding_dim (int) – embedding dimension
Return type: None

set_enable_accelerator(enable_accelerator)

Set whether to use the accelerator if available

Parameters: enable_accelerator (bool) – enable accelerator flag
Return type: None

set_fitted(fitted)

Set the fitted flag

Parameters: fitted (bool) – fitted flag
Return type: None

set_input_feature_dim(input_feature_dim)

Set the input feature dimension

Parameters: input_feature_dim (int) – input feature dimension
Return type: None

set_learning_rate(learning_rate)

Set the learning rate

Parameters: learning_rate (int) – initial learning rate
Return type: None

set_normalize(normalize)

Whether or not normalization is enabled.

Parameters: normalize (bool) –
Return type: None

set_num_epochs(num_epochs)

Set the number of epochs

Parameters: num_epochs (int) – number of epochs
Return type: None

set_seed(seed)

Set the seed

Parameters: seed (int) – seed
Return type: None

set_shuffle(shuffle)

Set the shuffling flag

Parameters: shuffle (bool) – shuffling flag
Return type: None

set_standardize(standardize)

Set the standardize flag

Parameters: standardize (bool) – standardize flag
Return type: None

set_training_loss(training_loss)

Set the training loss

Parameters: training_loss (float) – training loss
Return type: None

set_weight_decay(weight_decay)

Set the weight decay

Parameters: weight_decay (float) – weight decay
Return type: None

property shuffle: bool: Whether or not shuffling is enabled.

property standardize: bool: Whether or not standardization is enabled.

property training_loss: float: Get the training loss.

property validation_config: GraphWiseValidationConfig: Get the validation config.

property vertex_input_property_configs: Dict[str, InputPropertyConfig]: Get the vertex input property configs.

property vertex_input_property_names: Optional[List[str]]: Get the vertex input property names.

property weight_decay: float: Get the weight decay.

class pypgx.api.mllib.GraphWisePredictionLayerConfig(java_config, params)

Bases: object

GraphWise prediction layer configuration.

class pypgx.api.mllib.GraphWiseValidationConfig(java_config, params)

Bases: object

GraphWise validation configuration.

property evaluation_frequency: int

Get the evaluation frequency.

Returns: evaluation frequency

property evaluation_frequency_scale: Any

Get the evaluation frequency scale.

Returns: evaluation frequency scale (options: “epoch” or “step”)
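
One plausible reading of these two settings is "run validation every evaluation_frequency epochs (or steps)". The sketch below encodes that reading; the function name, the 1-based counters and the exact trigger rule are assumptions, since the precise semantics are not spelled out in this section.

```python
def should_evaluate(epoch, step, evaluation_frequency, evaluation_frequency_scale):
    """Decide whether validation runs now, counting in epochs or in steps."""
    if evaluation_frequency_scale == "epoch":
        counter = epoch
    elif evaluation_frequency_scale == "step":
        counter = step
    else:
        raise ValueError("scale must be 'epoch' or 'step'")
    return counter % evaluation_frequency == 0


# With a frequency of 2 on the epoch scale, validation runs on epochs 2, 4 and 6.
evaluated_epochs = [e for e in range(1, 7) if should_evaluate(e, 0, 2, "epoch")]
```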

class pypgx.api.mllib.InputPropertyConfig(java_config, params)

Bases: object

Configuration class for handling input properties.

property categorical: bool

Get whether the feature is categorical.

Returns: whether the feature is categorical

property property_name: str

Get the name of the feature that the configuration is used for.

Returns: name of the feature that the configuration is used for

class pypgx.api.mllib.MSELoss

Bases: LossFunction

MSE loss for regression
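
For reference, mean squared error averages the squared differences between predictions and targets. The function below is a plain-Python sketch of that definition, not the PGX class; its name is chosen for the example.

```python
def mse_loss(predictions, targets):
    """Mean of squared differences between predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


loss = mse_loss([2.0, 4.0], [1.0, 2.0])  # errors 1 and 2, squares 1 and 4
```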

class pypgx.api.mllib.Model(java_generic_model)

Bases: PgxContextManager

Model object

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

is_fitted()

Whether or not the model has been fitted.

Returns: Always returns False since this base class cannot be fitted.
Return type: bool

class pypgx.api.mllib.ModelLoader(analyst, java_model_loader, wrapper, java_class)

Bases: object

ModelLoader object.

Parameters

analyst (Analyst) –
java_model_loader (Any) –
wrapper (Callable) –
java_class (str) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)

Return a model stored in a database.

Parameters

model_store (str) – Model store in database.
model_name (str) – Name of the model to load.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.

Returns

The model stored in database.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel]

file(path, key)

Return an encrypted model stored in a file.

Parameters

path (str) – Path of stored model.
key (str) – Used for encryption.

Returns

The model stored in file.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel]

class pypgx.api.mllib.ModelRepository(java_generic_model_repository)

Bases: object

ModelRepository object that exposes CRUD operations on model stores and on the models within these model stores.

create(model_store_name)

Create a new model store.

Parameters: model_store_name (str) – The name of the model store.
Return type: None

delete_model(model_store_name, model_name)

Delete the model in the specified model store with the given model name.

Parameters

model_store_name (str) – The name of the model store.
model_name (str) – The name under which the model was stored.

Return type

None

delete_model_store(model_store_name)

Delete a model store.

Parameters: model_store_name (str) – The name of the model store.
Return type: None

get_model_description(model_store_name, model_name)

Retrieve the description of the model in the specified model store, with the given model name.

Parameters

model_store_name (str) – The name of the model store.
model_name (str) – The name under which the model was stored.

Returns

A string containing the description that was stored with the model.

Return type

str

list_model_stores_names()

List the names of all model stores in the model repository.

Returns: List of names.
Return type: List[str]

list_model_stores_names_matching(regex)

List the names of all model stores in the model repository that match the regex.

Parameters: regex (str) – A regex in form of a string.
Returns: List of matching names.
Return type: List[str]

list_models(model_store_name)

List the models present in the model store with the given name.

Parameters: model_store_name (str) – The name of the model store (non-prefixed).
Returns: List of model names.
Return type: List[str]
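
The listing methods compose naturally into an inventory of the whole repository. The helper below relies only on list_model_stores_names() and list_models() as documented above; `FakeRepository` and its contents are hypothetical so that the sketch runs without a database.

```python
def repository_inventory(repository):
    """Map every model store name to the list of model names it contains."""
    return {
        store: repository.list_models(store)
        for store in repository.list_model_stores_names()
    }


class FakeRepository:
    """Hypothetical stand-in for a ModelRepository connected to a database."""

    def __init__(self, stores):
        self._stores = stores

    def list_model_stores_names(self):
        return list(self._stores)

    def list_models(self, model_store_name):
        return list(self._stores[model_store_name])


repo = FakeRepository({"fraud": ["deepwalk_v1", "graphwise_v2"], "empty": []})
inventory = repository_inventory(repo)
```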

class pypgx.api.mllib.ModelRepositoryBuilder(java_generic_model_repository_builder)

Bases: object

ModelRepositoryBuilder object that can be used to configure the connection to a model repository.

db(username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)

Connect to a model repository backed by a database.

Parameters

username (Optional[str]) – username in database
password (Optional[str]) – password of username in database
jdbc_url (Optional[str]) – jdbc url of database
keystore_alias (Optional[str]) – the keystore alias to get the password in the keystore
schema (Optional[str]) – the schema of the model store in database

Returns

A model repository configured to connect to a database.

Return type

ModelRepository

class pypgx.api.mllib.ModelStorer(model)

Bases: object

ModelStorer object.

Parameters: model (Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel, Model]) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, model_description=None, overwrite=False, keystore_alias=None, schema=None)

Store a model to a database.

Parameters

model_store (str) – Model store in database.
model_name (str) – Name of the model to store.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
model_description (Optional[str]) – Description of model.
overwrite (bool) – Boolean value for overwriting or not.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.

Raises

RuntimeError – If the model is not fitted.

Return type

None

file(path, key, overwrite=False)

Store an encrypted model to a file.

Parameters

path (str) – Path to store model.
key (str) – Key used for encryption.
overwrite (bool) – Boolean value for overwriting or not.

Raises

RuntimeError – If the model is not fitted.

Return type

None
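
The export-to-file flow above can be sketched as follows. This is a minimal sketch, assuming `model` is any model object from this module; the helper name `save_model_to_file` is hypothetical and only calls `is_fitted()`, `export()`, and `ModelStorer.file()` as documented on this page.

```python
def save_model_to_file(model, path, key, overwrite=False):
    """Store `model` encrypted at `path` (hypothetical helper, not part of pypgx)."""
    # ModelStorer.file() raises RuntimeError for an unfitted model,
    # so check first and fail with a more specific message.
    if not model.is_fitted():
        raise RuntimeError("fit the model before storing it")
    storer = model.export()  # documented to return a ModelStorer
    storer.file(path, key, overwrite)
```

Checking `is_fitted()` up front is a design choice for clearer error messages; calling `export().file(...)` directly should behave equivalently for fitted models.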

class pypgx.api.mllib.OneHotEncodingConfig(java_config, params)

Bases: CategoricalPropertyConfig

Configuration class for handling categorical input properties using one-hot encoding.

property categorical: bool

Get whether the feature is categorical.

Returns: whether the feature is categorical

property categorical_embedding_type: bool

Get the type of categorical embedding.

Returns: type of categorical embedding

property max_vocabulary_size: int

Get the maximum number of tokens allowed in the vocabulary of a categorical feature. The most frequent category values numbering max_vocabulary_size are kept, the rest are treated as OOV tokens.

Returns: maximum number of tokens allowed
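
The vocabulary truncation described above can be illustrated in plain Python. This is a sketch of the general idea only, not PGX's implementation: the `OOV` placeholder string, the function names, and the tie-breaking order (first seen wins) are assumptions made for illustration.

```python
from collections import Counter

OOV = "<OOV>"  # hypothetical placeholder token, not a PGX constant


def build_vocabulary(values, max_vocabulary_size):
    """Keep the `max_vocabulary_size` most frequent category values."""
    counts = Counter(values)
    return {value for value, _ in counts.most_common(max_vocabulary_size)}


def apply_vocabulary(values, vocabulary):
    """Replace every value outside the vocabulary with the OOV token."""
    return [v if v in vocabulary else OOV for v in values]


colors = ["red", "red", "red", "blue", "blue", "green"]
vocab = build_vocabulary(colors, max_vocabulary_size=2)
print(apply_vocabulary(colors, vocab))
# ['red', 'red', 'red', 'blue', 'blue', '<OOV>']
```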

property property_name: str

Get the name of the feature that the configuration is used for.

Returns: name of the feature that the configuration is used for

set_max_vocabulary_size(max_vocabulary_size)

Set max vocabulary size to a given value.

Parameters: max_vocabulary_size (int) – set the maximum vocabulary size to the given value
Return type: None

set_shared(shared)

Set whether the feature is shared among vertex/edge types.

Parameters: shared (bool) – whether the feature is shared among vertex/edge types
Return type: None

property shared: bool

Get whether the feature is shared among vertex/edge types.

Returns: whether the feature is shared among vertex/edge types

class pypgx.api.mllib.PermutationCorruption(java_permutation_corruption)

Bases: CorruptionFunction

Permutation function which shuffles the nodes to generate the corrupted subgraph for DGI.
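
As an illustration of the idea, the sketch below shuffles feature vectors across nodes while leaving the graph structure alone, which is how permutation corruption is commonly described for Deep Graph Infomax. It is a plain-Python sketch under that assumption, not PGX's implementation; the function name and the dictionary representation of node features are hypothetical.

```python
import random


def permutation_corruption(node_features, seed=None):
    """Return a copy of `node_features` with feature vectors shuffled across nodes."""
    rng = random.Random(seed)
    node_ids = list(node_features)
    shuffled = node_ids[:]
    rng.shuffle(shuffled)
    # Node i receives the feature vector that belonged to shuffled[i];
    # edges are untouched, so only the feature assignment is corrupted.
    return {nid: node_features[src] for nid, src in zip(node_ids, shuffled)}
```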

class pypgx.api.mllib.Pg2vecModel(java_pg2vec_model)

Bases: Model

Pg2Vec model object.

property batch_size: int: Get the batch size.

close()

Call destroy

Return type: None

compute_similars(graphlet_id, k)

Compute the top-k similar graphlets for a list of input graphlets.

Parameters

graphlet_id (Union[Iterable[Union[int, str]], int, str]) – graphletIds or iterable of graphletIds
k (int) – number of similars to return

Return type

PgxFrame

destroy()

Destroy this model object.

Return type: None

property enable_accelerator: bool: Get whether the accelerator is used if available.

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

fit(graph)

Fit the model on a graph.

Parameters: graph (PgxGraph) – Graph to fit on
Return type: None

property graphlet_id_property_name: str: Get the graphlet id property name.

property graphlet_size_property_name: str: Get the graphlet size property name.

infer_graphlet_vector(graph)

Return the inferred vector of the input graphlet as a PgxFrame.

Parameters: graph (PgxGraph) – graphlet for which to infer a vector
Return type: PgxFrame

infer_graphlet_vector_batched(graph)

Return the inferred vectors of the input graphlets as a PgxFrame.

Parameters: graph (PgxGraph) – graphlets for which to infer vectors, given as a single graph in which each graphlet has a different graphlet ID
Return type: PgxFrame
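
A minimal sketch of how fitting and batched inference might be combined, assuming `model` is a Pg2vecModel and both graph arguments are PgxGraph objects. The helper name `embed_graphlets` is hypothetical; it only uses `is_fitted()`, `fit()`, and `infer_graphlet_vector_batched()` as documented here.

```python
def embed_graphlets(model, training_graph, graphlets):
    """Fit `model` if needed, then infer vectors for `graphlets` (hypothetical helper)."""
    if not model.is_fitted():
        model.fit(training_graph)
    # `graphlets` is one graph whose graphlets are told apart by graphlet ID.
    return model.infer_graphlet_vector_batched(graphlets)
```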

is_fitted()

Whether or not the model has been fitted.

Return type: bool

property layer_size: int: Get the layer size.

property learning_rate: float: Get the learning rate.

property loss: Optional[float]: Get the loss of the model.

property min_learning_rate: float: Get the minimum learning rate.

property min_word_frequency: int: Get the minimum word frequency.

property num_epochs: int: Get the number of epochs.

property seed: Optional[int]: Get the seed.

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (Optional[str]) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Return type

None

property trained_graphlet_vectors: PgxFrame

Get the trained graphlet vectors for the current pg2vec model.

Returns: PgxFrame containing the trained graphlet vectors

property use_graphlet_size: bool: Get whether the graphlet size is used.

property vertex_property_names: List[str]: Get the vertex property names.

property walk_length: int: Get the walk length.

property walks_per_vertex: int: Get the walks per vertex.

property window_size: int: Get the window size.

class pypgx.api.mllib.ProductEdgeCombinationMethod(use_source_vertex, use_destination_vertex, use_edge)

Bases: EdgeCombinationMethod

Product method for edge embedding generation

Parameters

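use_source_vertex (bool) – whether the source vertex embedding is used for the edge embedding
use_destination_vertex (bool) – whether the destination vertex embedding is used for the edge embedding
use_edge (bool) – whether the edge features are used for the edge embedding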

get_aggregation_type()

Get the aggregation type

Returns: the aggregation type

Return type: str

use_dst_vertex()

Get whether the destination vertex embedding is used for the edge embedding.

Returns: whether the destination vertex embedding is used
Return type: bool

use_edge()

Get whether the edge features are used for the edge embedding.

Returns: whether the edge features are used
Return type: bool

use_src_vertex()

Get whether the source vertex embedding is used for the edge embedding.

Returns: whether the source vertex embedding is used
Return type: bool
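
To make the difference between the concatenation and product methods concrete, here is a plain-Python sketch. It assumes "product" means element-wise multiplication, which this page does not state; the function name and the list-based embeddings are hypothetical and this is not PGX's implementation.

```python
def combine_edge_embedding(src, dst, edge, method="concat",
                           use_source_vertex=True,
                           use_destination_vertex=True,
                           use_edge=True):
    """Combine the selected parts into one edge embedding (illustrative only)."""
    parts = []
    if use_source_vertex:
        parts.append(src)
    if use_destination_vertex:
        parts.append(dst)
    if use_edge:
        parts.append(edge)
    if not parts:
        raise ValueError("at least one part must be used")
    if method == "concat":
        # Concatenation: the output length is the sum of the part lengths.
        return [x for part in parts for x in part]
    if method == "product":
        # Assumed element-wise product: all parts must have equal length.
        if len({len(part) for part in parts}) != 1:
            raise ValueError("product requires parts of equal length")
        result = list(parts[0])
        for part in parts[1:]:
            result = [a * b for a, b in zip(result, part)]
        return result
    raise ValueError(f"unknown method: {method}")
```

Under this reading, concatenation grows the embedding dimension with each part used, while the product keeps it fixed.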

class pypgx.api.mllib.SigmoidCrossEntropyLoss

Bases: LossFunction

Sigmoid Cross Entropy loss for binary classification

class pypgx.api.mllib.SoftmaxCrossEntropyLoss

Bases: LossFunction

Softmax Cross Entropy loss for multi-class classification
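
For reference, the two losses named above can be written out in plain Python using their standard textbook definitions. This is a sketch of the formulas, not PGX's implementation; function names are hypothetical and inputs are assumed to be raw logits.

```python
import math


def sigmoid_cross_entropy(logit, target):
    """Binary cross entropy on a raw logit; `target` is 0 or 1."""
    # Numerically stable form of -[t*log(s(x)) + (1-t)*log(1-s(x))].
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def softmax_cross_entropy(logits, target_index):
    """Cross entropy between softmax(logits) and a one-hot target."""
    peak = max(logits)  # subtract the maximum to avoid overflow in exp
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]
```

With two classes and logits `[x, 0]`, the softmax loss for class 0 equals the sigmoid loss for logit `x` and target 1, which is why the sigmoid form is the usual choice for binary problems.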

class pypgx.api.mllib.SupervisedEdgeWiseModel(java_edgewise_model, params=None)

Bases: EdgeWiseModel

SupervisedEdgeWise model object.

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

evaluate(graph, edges, threshold=0.0)

Evaluate performance statistics for the specified edges.

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to evaluate on. Can be a list of edges or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame

evaluate_labels(graph, edges, threshold=0.0)

Evaluate (macro averaged) classification performance statistics for the specified edges.

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to evaluate on. Can be a list of edges or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame
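
As background on what "macro averaged" means, the sketch below computes per-class precision and recall and then averages them with equal weight per class. The choice of precision and recall is for illustration only; this page does not list which metrics the returned PgxFrame contains, and the function name is hypothetical.

```python
def macro_precision_recall(y_true, y_pred):
    """Return (macro precision, macro recall) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0
    precisions, recalls = [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # A class with no predictions (or no true members) contributes 0.0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    # Macro averaging: every class counts equally, regardless of its size.
    return sum(precisions) / len(labels), sum(recalls) / len(labels)
```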

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

fit(graph, validation_graph=None)

Fit the model on the graph while validating on the validation_graph.

Parameters

graph (PgxGraph) – Graph to fit on
validation_graph (PgxGraph) – Graph to validate on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_class_weights()

Get the class weights.

Returns: a dictionary mapping classes to their weights.
Return type: dict

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_edge_combination_method()

Get the edge combination method used to compute the edge embedding

Returns: edge combination method
Return type: EdgeCombinationMethod

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: list(str)

get_edge_target_property_name()

Get the target property name

Returns: target property name
Return type: str

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_loss_function()

Get the loss function name.

Returns: loss function name. Can be one of softmax_cross_entropy, sigmoid_cross_entropy, devnet
Return type: str

get_loss_function_class()

Get the loss function.

Returns: loss function
Return type: LossFunction

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_prediction_layer_configs()

Get the configuration objects for the prediction layers.

Returns: configuration of the prediction layer
Return type: GraphWisePredictionLayerConfig

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_target_edge_labels()

Get the target edge labels

Returns: target edge labels
Return type: List[str]

get_training_log()

Get the validation log recorded during training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: float

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: list(str)

infer(graph, edges, threshold=0.0)

Infer the predictions for the specified edges

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer predictions for. Can be a list of edges or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the inference results for each edge

Return type

PgxFrame

infer_embeddings(graph, edges)

Infer the embeddings for the specified edges

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer embeddings for. Can be a list of edges or their IDs.

Returns

PgxFrame containing the embeddings for each edge

Return type

PgxFrame

infer_labels(graph, edges, threshold=0.0)

Infer the labels for the specified edges

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer labels for. Can be a list of edges or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the labels for each edge

Return type

PgxFrame

infer_logits(graph, edges)

Infer the prediction logits for the specified edges

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer logits for. Can be a list of edges or their IDs.

Returns

PgxFrame containing the logits for each edge

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

class pypgx.api.mllib.SupervisedGnnExplainer(java_explainer, bool_label)

Bases: GnnExplainer

SupervisedGnnExplainer used to request explanations from supervised model predictions.

Parameters: bool_label (bool) –

infer_and_explain(graph, vertex, threshold=0.0)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the prediction.

Parameters

graph (PgxGraph) – the graph
vertex (PgxVertex or int) – the vertex or its ID
threshold (float) – decision threshold

Returns

explanation containing feature importance and vertex importance.

Return type

SupervisedGnnExplanation

property learning_rate: float

Get learning rate.

Returns: learning rate
Return type: float

property marginalize: bool

Get value of marginalize.

Returns: value of marginalize
Return type: bool

property num_optimization_steps: int

Get number of optimization steps.

Returns: number of optimization steps
Return type: int

class pypgx.api.mllib.SupervisedGnnExplanation(java_supervised_gnn_explanation, bool_label)

Bases: GnnExplanation

SupervisedGnnExplanation object

Parameters: bool_label (bool) –

get_embedding()

Get the inferred embedding of the specified vertex.

Returns: the embedding
Return type: List[float]

get_importance_graph()

Get the importance graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.

Returns: the importance graph
Return type: PgxGraph

get_label()

Get the inferred label of the specified vertex.

Returns: the label
Return type: Any

get_logits()

Get the inferred logits of the specified vertex.

Returns: the logits
Return type: List[float]

get_vertex_feature_importance()

Get the feature importances as a map from property to importance value.

Returns: the feature importances.
Return type: Dict[VertexProperty, float]
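
A small sketch of how the returned map might be ranked, assuming `explanation` is a SupervisedGnnExplanation. The helper name `rank_feature_importance` is hypothetical, and sorting by raw value (rather than absolute value) is an assumption about how importances should be compared.

```python
def rank_feature_importance(explanation, top_n=None):
    """Return (property, importance) pairs sorted from most to least important."""
    importances = explanation.get_vertex_feature_importance()
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]
```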

get_vertex_importance_property()

Get the vertex property that contains the computed vertex importance.

Returns: the vertex importance property
Return type: VertexProperty

class pypgx.api.mllib.SupervisedGraphWiseModel(java_graphwise_model, params=None)

Bases: GraphWiseModel

SupervisedGraphWise model object.

check_is_fitted()

Make sure the model is fitted.

Returns: None
Raises: RuntimeError – if the model is not fitted
Return type: None

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

property edge_input_feature_dim: int: Get the dimension of the edge input features.

evaluate(graph, vertices, threshold=0.0)

Evaluate performance statistics for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on. Can be a list of vertices or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame

evaluate_labels(graph, vertices, threshold=0.0)

Evaluate (macro averaged) classification performance statistics for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on. Can be a list of vertices or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the metrics

Return type

PgxFrame

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

fit(graph, validation_graph=None)

Fit the model on the graph while validating on the validation_graph.

Parameters

graph (PgxGraph) – Graph to fit on
validation_graph (PgxGraph) – Graph to validate on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_class_weights()

Get the class weights.

Returns: a dictionary mapping classes to their weights.
Return type: dict

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: list(str)

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_loss_function()

Get the loss function name.

Returns: loss function name. Can be one of softmax_cross_entropy, sigmoid_cross_entropy, devnet
Return type: str

get_loss_function_class()

Get the loss function.

Returns: loss function
Return type: LossFunction

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_prediction_layer_configs()

Get the configuration objects for the prediction layers.

Returns: configuration of the prediction layer
Return type: GraphWisePredictionLayerConfig

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_target_vertex_labels()

Get the target vertex labels

Returns: target vertex labels
Return type: List[str]

get_training_log()

Get the validation log recorded during training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: list(str)

get_vertex_target_property_name()

Get the target property name

Returns: target property name
Return type: str

gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False)

Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.

Parameters

num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200
learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05
marginalize (bool, optional) – marginalize the loss over features, defaults to False

Returns

SupervisedGnnExplainer object of this model

Return type

SupervisedGnnExplainer
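
The explainer workflow can be sketched as below, assuming `model` is a fitted SupervisedGraphWiseModel. The helper name `explain_vertex` is hypothetical; it only chains `gnn_explainer()` and `SupervisedGnnExplainer.infer_and_explain()` with the signatures documented on this page.

```python
def explain_vertex(model, graph, vertex, threshold=0.0,
                   num_optimization_steps=200, learning_rate=0.05,
                   marginalize=False):
    """Configure an explainer and explain one vertex (hypothetical helper)."""
    explainer = model.gnn_explainer(
        num_optimization_steps=num_optimization_steps,
        learning_rate=learning_rate,
        marginalize=marginalize,
    )
    # Returns a SupervisedGnnExplanation for the given vertex.
    return explainer.infer_and_explain(graph, vertex, threshold)
```

Keeping the explainer object around instead of rebuilding it per call may be preferable when many vertices are explained with the same settings.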

infer(graph, vertices, threshold=0.0)

Infer the predictions for the specified vertices

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer predictions for. Can be a list of vertices or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the inference results for each vertex

Return type

PgxFrame

infer_and_get_explanation(graph, vertex, num_optimization_steps=200, learning_rate=0.05, marginalize=False, threshold=0.0)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the prediction.

Parameters

graph (PgxGraph) – the graph
vertex (PgxVertex or int) – the vertex or its ID
threshold (float) – decision threshold for classification (unused for regression)
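num_optimization_steps (int) – optimization steps for the explainer, defaults to 200
learning_rate (float) – learning rate for the explainer, defaults to 0.05
marginalize (bool) – marginalize the loss over features, defaults to False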

Returns

explanation containing feature importance and vertex importance.

Return type

SupervisedGnnExplanation

infer_embeddings(graph, vertices)

Infer the embeddings for the specified vertices

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer embeddings for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the embeddings for each vertex

Return type

PgxFrame

infer_labels(graph, vertices, threshold=0.0)

Infer the labels for the specified vertices

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer labels for. Can be a list of vertices or their IDs.
threshold (float) – decision threshold for classification (unused for regression)

Returns

PgxFrame containing the labels for each vertex

Return type

PgxFrame

infer_logits(graph, vertices)

Infer the prediction logits for the specified vertices

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer logits for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the logits for each vertex

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool

property loss: Optional[float]: Get the training loss.

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

property vertex_input_feature_dim: int: Get the dimension of the vertex input features.

class pypgx.api.mllib.UnsupervisedAnomalyDetectionGraphWiseModel(java_graphwise_model, params=None)

Bases: UnsupervisedGraphWiseModel

UnsupervisedAnomalyDetectionGraphWise model object.

check_is_fitted()

Make sure the model is fitted.

Returns: None
Raises: RuntimeError – if the model is not fitted
Return type: None

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

property edge_input_feature_dim: int: Get the dimension of the edge input features.

evaluate_anomaly_labels(graph, vertices, vertex_anomaly_property_name, anomaly_property_value, threshold)

Evaluate anomaly detection performance statistics for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on
vertex_anomaly_property_name (str) – the name of the property containing the anomaly
anomaly_property_value (object) – the value indicating an anomaly in vertex_anomaly_property_name property
threshold (float) – the anomaly threshold

Raises

LookupError – if the property is not found

Returns

PgxFrame containing the evaluation results.

Return type

PgxFrame

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

find_anomaly_threshold(graph, vertices, contamination_factor)

Find an appropriate anomaly threshold for labeling the input vertices as anomalies, respecting the proportion given by the contamination factor

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer on
contamination_factor (float) – the contamination factor

Returns

the threshold

Return type

float
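
One common way to derive a threshold from a contamination factor is to take a quantile of the anomaly scores. The sketch below shows that idea in plain Python. It is an assumption about the general technique, not a description of PGX's algorithm: it assumes higher scores mean more anomalous and that a vertex is flagged when its score is greater than or equal to the threshold.

```python
def threshold_for_contamination(scores, contamination_factor):
    """Pick a threshold so that about `contamination_factor` of scores reach it."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < contamination_factor <= 1.0:
        raise ValueError("contamination_factor must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Number of items to flag, with at least one so the threshold is defined.
    n_anomalies = max(1, round(contamination_factor * len(ranked)))
    return ranked[n_anomalies - 1]
```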

fit(graph, validation_graph=None)

Fit the model on the graph while validating on the validation_graph.

Parameters

graph (PgxGraph) – Graph to fit on
validation_graph (PgxGraph) – Graph to validate on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_dgi_layer_config()

Get the configuration object for the dgi layer.

Returns: configuration
Return type: GraphWiseDgiLayerConfig

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: list(str)

get_embedding_config()

Get the configuration object for the embedding method

Returns: configuration
Return type: GraphWiseEmbeddingConfig

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_loss_function()

Get the loss function name.

Returns: loss function name. Can only be sigmoid_cross_entropy
Return type: str

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_training_log()

Get the validation log recorded during training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: list(str)

gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False, num_clusters=50, num_samples=10000)

Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.

Parameters

num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200
learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05
marginalize (bool, optional) – marginalize the loss over features, defaults to False
num_clusters (int, optional) – number of clusters to use, defaults to 50
num_samples (int, optional) – number of samples to use, defaults to 10000

Returns

UnsupervisedGnnExplainer object of this model

Return type

UnsupervisedGnnExplainer

infer_and_get_explanation(graph, vertex, num_clusters=50, num_samples=10000, num_optimization_steps=200, learning_rate=0.05, marginalize=False)

Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the embedding's position relative to embeddings of other vertices in the graph.

Parameters

graph (PgxGraph) – the graph
vertex (Union[PgxVertex, int]) – the vertex
num_clusters (int) – the number of semantic vertex clusters expected in the graph, must be greater than 1
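num_samples (int) – number of samples to use, defaults to 10000
num_optimization_steps (int) – optimization steps for the explainer, defaults to 200
learning_rate (float) – learning rate for the explainer, defaults to 0.05
marginalize (bool) – marginalize the loss over features, defaults to False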

Returns

explanation containing feature importance and vertex importance.

Return type

GnnExplanation

infer_anomaly_labels(graph, vertices, threshold)

Infer the anomaly labels for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer on
threshold (float) – the anomaly threshold

Returns

PgxFrame containing the anomaly labels for each vertex.

Return type

PgxFrame
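
The threshold and labeling methods are naturally used together. A minimal sketch, assuming `model` is a fitted UnsupervisedAnomalyDetectionGraphWiseModel; the helper name `label_anomalies` is hypothetical and only calls `find_anomaly_threshold()` and `infer_anomaly_labels()` as documented here.

```python
def label_anomalies(model, graph, vertices, contamination_factor):
    """Derive a threshold from the contamination factor, then label the vertices."""
    threshold = model.find_anomaly_threshold(graph, vertices, contamination_factor)
    labels = model.infer_anomaly_labels(graph, vertices, threshold)
    # Return the threshold too, so it can be reused on other vertex sets.
    return threshold, labels
```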

infer_anomaly_scores(graph, vertices)

Infer the anomaly scores for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer on

Returns

PgxFrame containing the anomaly scores for each vertex.

Return type

PgxFrame

infer_embeddings(graph, vertices)

Infer the embeddings for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer embeddings for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the embeddings for each vertex.

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool

property loss: Optional[float]: Get the training loss.

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

property vertex_input_feature_dim: int: Get the dimension of the vertex input features.

class pypgx.api.mllib.UnsupervisedEdgeWiseModel(java_edgewise_model, params=None)

Bases: EdgeWiseModel

UnsupervisedEdgeWise model object.

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

fit(graph, validation_graph=None)

Fit the model on the graph while validating on the validation_graph.

Parameters

graph (PgxGraph) – Graph to fit on
validation_graph (PgxGraph) – Graph to validate on

Returns

None

Return type

None

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_dgi_layer_config()

Get the configuration object for the dgi layer.

Returns: configuration
Return type: GraphWiseDgiLayerConfig

get_edge_combination_method()

Get the edge combination method used to compute the edge embedding

Returns: edge combination method
Return type: EdgeCombinationMethod

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: list(str)

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_loss_function()

Get the loss function name.

Returns: loss function name: sigmoid_cross_entropy
Return type: str

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_target_edge_labels()

Get the target edge labels

Returns: target edge labels
Return type: List[str]

get_training_log()

Get the validation log recorded during training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: float

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: List[str]

infer_embeddings(graph, edges)

Infer the embeddings for the specified edges

Parameters

graph (PgxGraph) – the graph
edges (Union[Iterable[PgxEdge], Iterable[int]]) – the edges to infer embeddings for. Can be a list of edges or their IDs.

Returns

PgxFrame containing the embeddings for each edge

Return type

PgxFrame
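
Example (sketch): the wrapper below combines is_fitted() and infer_embeddings(graph, edges) so that inference fails early on an unfitted model. The function name infer_edge_embeddings is hypothetical and not part of pypgx; only the two model methods come from the reference above.

```python
def infer_edge_embeddings(model, graph, edges):
    """Infer edge embeddings, failing early if the model is not fitted.

    `model` is assumed to expose is_fitted() and infer_embeddings(graph, edges)
    as documented above. This wrapper itself is a hypothetical helper.
    """
    if not model.is_fitted():
        raise RuntimeError("fit the model before inferring embeddings")
    # Materialize the iterable so a generator of edges or edge IDs can be passed.
    return model.infer_embeddings(graph, list(edges))
```

The result is whatever infer_embeddings returns, which the reference documents as a PgxFrame.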

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

class pypgx.api.mllib.UnsupervisedGnnExplainer(java_explainer)

Bases: GnnExplainer

UnsupervisedGnnExplainer used to request explanations from unsupervised model predictions.

infer_and_explain(graph, vertex)

Perform inference on the specified vertex and generate an explanation that scores how important each property and each vertex in the computation graph is for the position of the vertex's embedding relative to the embeddings of other vertices in the graph.

Parameters

graph (PgxGraph) – The graph.
vertex (Union[PgxVertex, int, str]) – The vertex.

Returns

Explanation containing feature importance and vertex importance.

Return type

GnnExplanation

property learning_rate: float

Get learning rate.

Returns: learning rate
Return type: float

property marginalize: bool

Get value of marginalize.

Returns: value of marginalize
Return type: bool

property num_clusters: int

Get number of clusters.

Returns: Number of clusters.
Return type: int

property num_optimization_steps: int

Get number of optimization steps.

Returns: number of optimization steps
Return type: int

property num_samples: int

Get number of samples.

Returns: Number of samples.
Return type: int

class pypgx.api.mllib.UnsupervisedGnnExplanation(java_unsupervised_gnn_explanation)

Bases: GnnExplanation

UnsupervisedGnnExplanation object

get_embedding()

Get the inferred embedding of the specified vertex.

Returns: the embedding
Return type: List[float]

get_importance_graph()

Get the importance graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property().

Returns: the importance graph
Return type: PgxGraph

get_vertex_feature_importance()

Get the feature importances as a map from property to importance value.

Returns: the feature importances.
Return type: Dict[VertexProperty, float]

get_vertex_importance_property()

Get the vertex property that contains the computed vertex importance.

Returns: the vertex importance property
Return type: VertexProperty
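
Example (sketch): get_vertex_feature_importance() returns a mapping from property to importance value, so ranking the properties is a plain dictionary sort. The helper name rank_feature_importance and its top_k parameter are hypothetical.

```python
def rank_feature_importance(explanation, top_k=None):
    """Return (property, importance) pairs sorted from most to least important.

    `explanation` is assumed to provide get_vertex_feature_importance()
    returning a dict, as documented above. This helper is hypothetical.
    """
    importances = explanation.get_vertex_feature_importance()
    # Sort by the importance value only, so the keys need not be comparable.
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

Sorting on the value alone means the keys can be VertexProperty objects, which are not assumed to support ordering.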

class pypgx.api.mllib.UnsupervisedGraphWiseModel(java_graphwise_model, params=None)

Bases: GraphWiseModel

UnsupervisedGraphWise model object.

check_is_fitted()

Make sure the model is fitted.

Returns: None
Raises: RuntimeError if the model is not fitted
Return type: None

close()

Call destroy

Return type: None

destroy()

Destroy this model object.

Return type: None

property edge_input_feature_dim: int

Get the dimension of the edge input features.

export()

Return a ModelStorer object which can be used to save the model.

Returns: ModelStorer object
Return type: ModelStorer

fit(graph, validation_graph=None)

Fit the model on the graph while validating on the validation_graph.

Parameters

graph (PgxGraph) – Graph to fit on
validation_graph (PgxGraph) – Graph to validate on

Returns

None

Return type

None
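
Example (sketch): a typical training step calls fit(), confirms the result with check_is_fitted(), and reads the final loss. The helper name fit_and_get_loss is hypothetical; the three model methods are the ones documented in this class.

```python
def fit_and_get_loss(model, graph, validation_graph=None):
    """Fit `model`, confirm it is fitted, and return the final training loss.

    Hypothetical helper; `model` is assumed to behave like the
    UnsupervisedGraphWiseModel documented here.
    """
    model.fit(graph, validation_graph)
    # check_is_fitted() is documented to raise RuntimeError on an unfitted model.
    model.check_is_fitted()
    # Documented return type is Optional[float], so callers should handle None.
    return model.get_training_loss()
```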

get_batch_size()

Get the batch size

Returns: batch size
Return type: int

get_config()

Return the GraphWiseModelConfig object

Returns: the config
Return type: GraphWiseModelConfig

get_conv_layer_config()

Get the configuration objects for the convolutional layers

Returns: configurations
Return type: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]

get_dgi_layer_config()

Get the configuration object for the DGI layer.

Returns: configuration
Return type: GraphWiseDgiLayerConfig

get_edge_input_feature_dim()

Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated

Returns: edges input feature dimension
Return type: int

get_edge_input_property_configs()

Get the configuration objects for edge input properties

Return type: List[InputPropertyConfig]

get_edge_input_property_names()

Get the edges input feature names

Returns: edges input feature names
Return type: List[str]

get_embedding_config()

Get the configuration object for the embedding method

Returns: configuration
Return type: GraphWiseEmbeddingConfig

get_layer_size()

Get the dimension of the embeddings

Returns: embedding dimension
Return type: int

get_learning_rate()

Get the initial learning rate

Returns: initial learning rate
Return type: float

get_loss_function()

Get the loss function name.

Returns: loss function name. Can only be sigmoid_cross_entropy
Return type: str

get_num_epochs()

Get the number of epochs to train the model

Returns: number of epochs to train the model
Return type: int

get_seed()

Get the random seed

Returns: random seed
Return type: int

get_training_log()

Get the log of validation during the training.

Returns: training log
Return type: PgxFrame

get_training_loss()

Get the final training loss

Returns: training loss
Return type: Optional[float]

get_vertex_input_feature_dim()

Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated

Returns: input feature dimension
Return type: int

get_vertex_input_property_configs()

Get the configuration objects for vertex input properties

Returns: configurations
Return type: List[InputPropertyConfig]

get_vertex_input_property_names()

Get the vertices input feature names

Returns: vertices input feature names
Return type: List[str]
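
Example (sketch): the getters above can be gathered into one dictionary, for instance to log a training configuration. The helper name summarize_hyperparameters and the dictionary keys are hypothetical; each value comes from a getter documented in this class.

```python
def summarize_hyperparameters(model):
    """Collect the main training hyperparameters of `model` into a dict.

    Hypothetical helper; it only calls getters documented for this class.
    """
    return {
        "batch_size": model.get_batch_size(),
        "layer_size": model.get_layer_size(),
        "learning_rate": model.get_learning_rate(),
        "num_epochs": model.get_num_epochs(),
        "seed": model.get_seed(),
        "loss_function": model.get_loss_function(),
    }
```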

gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False, num_clusters=50, num_samples=10000)

Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.

Parameters

num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200
learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05
marginalize (bool, optional) – marginalize the loss over features, defaults to False
num_clusters (int, optional) – number of clusters to use, defaults to 50
num_samples (int, optional) – number of samples to use, defaults to 10000

Returns

UnsupervisedGnnExplainer object of this model

Return type

UnsupervisedGnnExplainer
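
Example (sketch): gnn_explainer() returns a configured explainer, and its infer_and_explain(graph, vertex) method produces the explanation. The helper name explain_vertex is hypothetical; its defaults mirror the documented defaults of gnn_explainer().

```python
def explain_vertex(model, graph, vertex, num_optimization_steps=200, learning_rate=0.05):
    """Configure an explainer on `model` and explain a single vertex.

    Hypothetical helper; `model` is assumed to provide gnn_explainer() and the
    returned explainer infer_and_explain(graph, vertex), as documented here.
    """
    explainer = model.gnn_explainer(
        num_optimization_steps=num_optimization_steps,
        learning_rate=learning_rate,
    )
    # The remaining explainer settings keep their documented defaults.
    return explainer.infer_and_explain(graph, vertex)
```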

infer_and_get_explanation(graph, vertex, num_clusters=50, num_samples=10000, num_optimization_steps=200, learning_rate=0.05, marginalize=False)

Perform inference on the specified vertex and generate an explanation that scores how important each property and each vertex in the computation graph is for the position of the vertex's embedding relative to the embeddings of other vertices in the graph.

Parameters

graph (PgxGraph) – the graph
vertex (Union[PgxVertex, int]) – the vertex
num_clusters (int) – the number of semantic vertex clusters expected in the graph, must be greater than 1
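num_samples (int) – number of samples to use, defaults to 10000
num_optimization_steps (int) – optimization steps for the explainer, defaults to 200
learning_rate (float) – learning rate for the explainer, defaults to 0.05
marginalize (bool) – marginalize the loss over features, defaults to False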

Returns

explanation containing feature importance and vertex importance.

Return type

GnnExplanation

infer_embeddings(graph, vertices)

Infer the embeddings for the specified vertices.

Parameters

graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to infer embeddings for. Can be a list of vertices or their IDs.

Returns

PgxFrame containing the embeddings for each vertex.

Return type

PgxFrame

is_fitted()

Check if the model is fitted

Returns: True if the model is fitted, False otherwise
Return type: bool

property loss: Optional[float]

Get the training loss.

store(path, key, overwrite=False)

Store the model in a file.

Parameters

path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file

Returns

None

Return type

None

property vertex_input_feature_dim: int

Get the dimension of the vertex input features.

class pypgx.api.mllib._model_utils.ModelStorer(model)

Bases: object

ModelStorer object.

Parameters: model (Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel, Model]) – The model to store.

db(model_store, model_name, username=None, password=None, jdbc_url=None, model_description=None, overwrite=False, keystore_alias=None, schema=None)

Store a model to a database.

Parameters

model_store (str) – Model store in database.
model_name (str) – name of the model to store.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
model_description (Optional[str]) – Description of model.
overwrite (bool) – Boolean value for overwriting or not.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.

Raises

RuntimeError – If the model is not fitted.

Return type

None

file(path, key, overwrite=False)

Store an encrypted model to a file.

Parameters

path (str) – Path to store model.
key (str) – Key used for encryption.
overwrite (bool) – Boolean value for overwriting or not.

Raises

RuntimeError – If the model is not fitted.

Return type

None
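
Example (sketch): a model's export() method returns a ModelStorer, whose file() method writes the encrypted model. The sketch reads the encryption key from an environment variable instead of hardcoding it. The helper name save_model_to_file and the variable name PGX_MODEL_KEY are hypothetical.

```python
import os


def save_model_to_file(model, path, key_env_var="PGX_MODEL_KEY", overwrite=False):
    """Store `model` at `path`, encrypted with a key taken from the environment.

    Hypothetical helper; the environment variable name is an assumption made
    for this sketch and is not defined by pypgx.
    """
    key = os.environ.get(key_env_var)
    if not key:
        raise ValueError(f"environment variable {key_env_var} is not set")
    # export() returns a ModelStorer; file() is documented to raise
    # RuntimeError if the model is not fitted.
    model.export().file(path, key, overwrite)
```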

class pypgx.api.mllib._model_utils.ModelLoader(analyst, java_model_loader, wrapper, java_class)

Bases: object

ModelLoader object.

Parameters

analyst (Analyst) –
java_model_loader (Any) –
wrapper (Callable) –
java_class (str) –

db(model_store, model_name, username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)

Return a model stored in a database.

Parameters

model_store (str) – Model store in database.
model_name (str) – Name of the model to load.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.

Returns

The model stored in database.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel]

file(path, key)

Return an encrypted model stored in a file.

Parameters

path (str) – Path of stored model.
key (str) – Key that was used to encrypt the model.

Returns

The model stored in file.

Return type

Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel]
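
Example (sketch): once a ModelLoader is available, file(path, key) returns the stored model, which can then be used for inference. How the loader is obtained is not covered in this section, so the sketch takes it as an argument. The helper name load_and_embed is hypothetical, and it assumes the loaded model offers infer_embeddings(graph, vertices), as UnsupervisedGraphWiseModel does.

```python
def load_and_embed(loader, path, key, graph, vertices):
    """Load an encrypted model from `path` and infer embeddings with it.

    Hypothetical helper; `loader` is assumed to be a ModelLoader obtained
    elsewhere, and the loaded model is assumed to offer infer_embeddings().
    """
    model = loader.file(path, key)
    # The loader can return several model types, so verify the method exists.
    if not hasattr(model, "infer_embeddings"):
        raise TypeError(f"{type(model).__name__} does not support infer_embeddings")
    return model.infer_embeddings(graph, vertices)
```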