MLlib
Graph machine learning tools for use with PGX.
- class pypgx.api.mllib.CategoricalPropertyConfig(java_config, params)
Bases:
InputPropertyConfig
Configuration class for handling categorical input properties.
- property categorical: bool
Get whether the feature is categorical.
- Returns
whether the feature is categorical
- property categorical_embedding_type: bool
Get the type of categorical embedding.
- Returns
type of categorical embedding
- property max_vocabulary_size: int
Get the maximum number of tokens allowed in the vocabulary of a categorical feature. The max_vocabulary_size most frequent category values are kept; the rest are treated as out-of-vocabulary (OOV) tokens.
- Returns
maximum number of tokens allowed
- property property_name: str
Get the name of the feature that the configuration is used for.
- Returns
name of the feature that the configuration is used for
- set_max_vocabulary_size(max_vocabulary_size)
Set max vocabulary size to a given value.
- Parameters
max_vocabulary_size (int) – set the maximum vocabulary size to the given value
- Return type
None
Set whether the feature is shared among vertex/edge types.
- Parameters
shared – set shared to the value
- Return type
None
Get whether the feature is shared among vertex/edge types.
- Returns
whether the feature is shared among vertex/edge types
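The effect of max_vocabulary_size can be illustrated without PGX. The following is a minimal sketch in plain Python, not the library's implementation; the helper names build_vocabulary and encode and the "<OOV>" marker are assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Set

OOV = "<OOV>"  # hypothetical marker for out-of-vocabulary values


def build_vocabulary(values: Iterable[Hashable], max_vocabulary_size: int) -> Set[Hashable]:
    """Keep the max_vocabulary_size most frequent category values."""
    counts = Counter(values)
    return {value for value, _ in counts.most_common(max_vocabulary_size)}


def encode(values: Iterable[Hashable], vocabulary: Set[Hashable]) -> List[Hashable]:
    """Replace every value outside the vocabulary with the OOV marker."""
    return [value if value in vocabulary else OOV for value in values]


colors = ["red", "red", "red", "blue", "blue", "green"]
vocab = build_vocabulary(colors, max_vocabulary_size=2)
encoded = encode(colors, vocab)
```

Here "green" is the least frequent value, so with a vocabulary size of 2 it is the one mapped to the OOV marker.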
- class pypgx.api.mllib.ConcatEdgeCombinationMethod(use_source_vertex, use_destination_vertex, use_edge)
Bases:
EdgeCombinationMethod
Concatenation method for edge embedding generation
- Parameters
use_source_vertex (bool) –
use_destination_vertex (bool) –
use_edge (bool) –
- get_aggregation_type()
Get the aggregation type
- Returns
the aggregation type
- Return type
str
- use_dst_vertex()
Get if destination vertex embedding is used or not for the edge embedding.
- Returns
uses or not the destination vertex
- Return type
bool
- use_edge()
Get if edge features are used or not for the edge embedding
- Returns
uses or not the edge features
- Return type
bool
- use_src_vertex()
Get if source vertex embedding is used or not for the edge embedding.
- Returns
uses or not the source vertex
- Return type
bool
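As a rough illustration of what a concatenation-based edge combination does, the sketch below builds an edge embedding from whichever parts are enabled. It is plain Python and not the PGX implementation; the function name concat_edge_embedding and the ordering (source, destination, edge) are assumptions.

```python
from typing import List, Sequence


def concat_edge_embedding(
    src: Sequence[float],
    dst: Sequence[float],
    edge: Sequence[float],
    use_source_vertex: bool,
    use_destination_vertex: bool,
    use_edge: bool,
) -> List[float]:
    """Concatenate the enabled parts into a single edge embedding."""
    parts: List[float] = []
    if use_source_vertex:
        parts.extend(src)  # source vertex embedding
    if use_destination_vertex:
        parts.extend(dst)  # destination vertex embedding
    if use_edge:
        parts.extend(edge)  # edge features
    return parts


embedding = concat_edge_embedding([1.0, 2.0], [3.0, 4.0], [5.0], True, False, True)
```

The resulting dimension is the sum of the dimensions of the enabled parts, which is why the three flags change the input size of whatever layer consumes the edge embedding.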
- class pypgx.api.mllib.CorruptionFunction(java_corruption_function)
Bases:
object
Abstract corruption function which generates the corrupted subgraph for DGI
- class pypgx.api.mllib.DeepWalkModel(java_deepwalk_model)
Bases:
Model
DeepWalk model object.
- property batch_size: int
Get the batch size.
- close()
Call destroy
- Return type
None
- compute_similars(v, k)
Compute the top-k similar vertices for a given vertex.
- Parameters
v (Union[int, str, List[int], List[str]]) – id of the vertex or list of vertex ids for which to compute the similar vertices
k (int) – number of similar vertices to return
- Return type
- destroy()
Destroy this model object.
- Return type
None
- property enable_accelerator: bool
Get whether the accelerator is used if available.
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Return type
None
- is_fitted()
Whether or not the model has been fitted.
- Return type
bool
- property layer_size: int
Get the layer size.
- property learning_rate: float
Get the learning rate.
- property loss: Optional[float]
Get the loss of the model.
- property min_learning_rate: float
Get the minimum learning rate.
- property min_word_frequency: int
Get the minimum word frequency.
- property negative_sample: int
Get the negative sample.
- property num_epochs: int
Get the number of epochs.
- property sample_rate: float
Get the sample rate.
- property seed: Optional[int]
Get the seed.
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (Optional[str]) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Return type
None
- property trained_vectors: PgxFrame
Get the trained vertex vectors for the current DeepWalk model.
- Returns
PgxFrame object with the trained vertex vectors
- Return type
- property validation_fraction: float
Get the validation fraction.
validation_fraction is deprecated since 23.4; the loss is now computed on all samples.
- property walk_length: int
Get the walk length.
- property walks_per_vertex: int
Get the number of walks per vertex.
- property window_size: int
Get the window size.
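To make the idea behind compute_similars concrete, the following sketch ranks vertices by cosine similarity of their vectors. It is written in plain Python and makes two assumptions that the reference above does not state: that similarity is measured with cosine similarity, and that the query vertex itself is excluded from the result. The names cosine and top_k_similar are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(
    vectors: Dict[str, Sequence[float]], v: str, k: int
) -> List[Tuple[str, float]]:
    """Return the k vertices most similar to v, best first."""
    query = vectors[v]
    scored = [(other, cosine(query, vec)) for other, vec in vectors.items() if other != v]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


vectors = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
similar = top_k_similar(vectors, "a", k=2)
```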
- class pypgx.api.mllib.DevNetLoss(confidence_margin, anomaly_property_value)
Bases:
LossFunction
Deviation loss for anomaly detection
- Parameters
confidence_margin (float) –
anomaly_property_value (bool) –
- get_anomaly_property_value()
Get Anomaly Property Value.
- Returns
the anomaly property value
- Return type
Any
- get_confidence_margin()
Get confidence margin of the loss function.
- Returns
the confidence margin
- Return type
float
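The role of confidence_margin can be sketched with the deviation loss as it is commonly formulated in the anomaly detection literature: normal samples are pulled toward a reference score, and anomalies are pushed at least confidence_margin standard deviations above it. This is an illustration only; the PGX internals may differ, and the function names and the ref_mean and ref_std parameters are assumptions.

```python
from typing import Any


def is_anomaly(label: Any, anomaly_property_value: Any) -> bool:
    """A sample counts as an anomaly when its label equals the configured value."""
    return label == anomaly_property_value


def deviation_loss(
    score: float,
    anomaly: bool,
    confidence_margin: float,
    ref_mean: float = 0.0,
    ref_std: float = 1.0,
) -> float:
    """Deviation loss for a single predicted anomaly score."""
    deviation = (score - ref_mean) / ref_std
    if anomaly:
        # Anomalies are penalized until they exceed the margin.
        return max(0.0, confidence_margin - deviation)
    # Normal samples are penalized for any deviation from the reference.
    return abs(deviation)
```

With a margin of 5.0, an anomaly scored at 2.0 still incurs a loss of 3.0, while one scored at 6.0 incurs none.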
- class pypgx.api.mllib.EdgeWiseModel(java_edgewise_model, params=None)
Bases:
Model
EdgeWise model object.
This is a base class for SupervisedEdgeWiseModel.
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_edge_combination_method()
Get the edge combination method used to compute the edge embedding
- Returns
edge combination method
- Return type
EdgeCombinationMethod
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
float
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- class pypgx.api.mllib.EdgeWiseModelConfig(java_edgewise_model_config)
Bases:
object
Edgewise Model Configuration class
- property backend: str
Get the backend.
- property batch_size: int
Get the batch size.
- property conv_layer_configs: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
Get the conv layer configs.
- property edge_combination_method: EdgeCombinationMethod
Get the edge combination method.
- property edge_input_feature_dim: int
Get the edge input feature dimension.
- property edge_input_property_configs: Dict[str, InputPropertyConfig]
Get the edge input property configs.
- property edge_input_property_names: Optional[List[str]]
Get the edge input property names.
- property embedding_dim: int
Get the embedding dimension.
- property enable_accelerator: str
Get whether to use the accelerator if available.
- get_conv_layer_configs()
Return a list of conv layer configs
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- property input_feature_dim: int
Get the input feature dimension.
- property is_fitted: bool
Whether or not the model is fitted.
- property learning_rate: float
Get the learning rate.
- property normalize: bool
Whether or not normalization is enabled.
- property num_epochs: int
Get the number of epochs.
- property seed: int
Get the seed.
- set_batch_size(batch_size)
Set the batch size
- Parameters
batch_size (int) – batch size
- Return type
None
- set_edge_combination_method(edge_combination_method)
Set the edge combination method
- Parameters
edge_combination_method (EdgeCombinationMethod) – edge combination method
- Return type
None
- set_edge_input_feature_dim(edge_input_feature_dim)
Set the edge input feature dimension
- Parameters
edge_input_feature_dim (int) – edge input feature dimension
- Return type
None
- set_embedding_dim(embedding_dim)
Set the embedding dimension
- Parameters
embedding_dim (int) – embedding dimension
- Return type
None
- set_enable_accelerator(enable_accelerator)
Set whether to use the accelerator if available
- Parameters
enable_accelerator (bool) – enable accelerator flag
- Return type
None
- set_fitted(fitted)
Set the fitted flag
- Parameters
fitted (bool) – fitted flag
- Return type
None
- set_input_feature_dim(input_feature_dim)
Set the input feature dimension
- Parameters
input_feature_dim (int) – input feature dimension
- Return type
None
- set_learning_rate(learning_rate)
Set the learning rate
- Parameters
learning_rate (int) – initial learning rate
- Return type
None
- set_normalize(normalize)
Set whether or not normalization is enabled.
- Parameters
normalize (bool) –
- Return type
None
- set_num_epochs(num_epochs)
Set the number of epochs
- Parameters
num_epochs (int) – number of epochs
- Return type
None
- set_seed(seed)
Set the seed
- Parameters
seed (int) – seed
- Return type
None
- set_shuffle(shuffle)
Set the shuffling flag
- Parameters
shuffle (bool) – shuffling flag
- Return type
None
- set_standardize(standardize)
Set the standardize flag
- Parameters
standardize (bool) – standardize flag
- Return type
None
- set_training_loss(training_loss)
Set the training loss
- Parameters
training_loss (float) – training loss
- Return type
None
- set_weight_decay(weight_decay)
Set the weight decay
- Parameters
weight_decay (float) – weight decay
- Return type
None
- property shuffle: bool
Whether or not shuffling is enabled.
- property standardize: bool
Whether or not standardization is enabled.
- property training_loss: float
Get the training loss.
- property vertex_input_property_configs: Dict[str, InputPropertyConfig]
Get the vertex input property configs.
- property vertex_input_property_names: Optional[List[str]]
Get the vertex input property names.
- property weight_decay: float
Get the weight decay.
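The standardize and normalize flags are not described in detail above. The sketch below shows one common reading, stated here as an assumption rather than as the library's documented behaviour: standardization rescales an input feature to zero mean and unit variance, and normalization rescales an embedding to unit L2 norm. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def standardize(column: Sequence[float]) -> List[float]:
    """Rescale one feature column to zero mean and unit variance."""
    count = len(column)
    mean = sum(column) / count
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / count)
    if std == 0.0:
        return [0.0] * count  # constant column carries no information
    return [(x - mean) / std for x in column]


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Rescale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


z_scores = standardize([1.0, 2.0, 3.0])
unit = l2_normalize([3.0, 4.0])
```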
- class pypgx.api.mllib.EmbeddingTableConfig(java_config, params)
Bases:
CategoricalPropertyConfig
Configuration class for handling categorical input properties using embedding table method.
- property categorical: bool
Get whether the feature is categorical.
- Returns
whether the feature is categorical
- property categorical_embedding_type: bool
Get the type of categorical embedding.
- Returns
type of categorical embedding
- property embedding_dim: int
Get the embedding dimension.
- Returns
embedding dimension
- property max_vocabulary_size: int
Get the maximum number of tokens allowed in the vocabulary of a categorical feature. The max_vocabulary_size most frequent category values are kept; the rest are treated as out-of-vocabulary (OOV) tokens.
- Returns
maximum number of tokens allowed
- property oov_probability: float
Get the probability of randomly setting the category value to OOV token during training.
- Returns
probability of using OOV token
- property property_name: str
Get the name of the feature that the configuration is used for.
- Returns
name of the feature that the configuration is used for
- set_embedding_dimension(embedding_dim)
Set the embedding dimension.
- Parameters
embedding_dim (int) – embedding dimension
- Return type
None
- set_max_vocabulary_size(max_vocabulary_size)
Set max vocabulary size to a given value.
- Parameters
max_vocabulary_size (int) – set the maximum vocabulary size to the given value
- Return type
None
- set_oov_probability(oov_proba)
Set the out of vocabulary probability.
- Parameters
oov_proba (float) – out of vocabulary probability
- Return type
None
Set whether the feature is shared among vertex/edge types.
- Parameters
shared – set shared to the value
- Return type
None
Get whether the feature is shared among vertex/edge types.
- Returns
whether the feature is shared among vertex/edge types
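The oov_probability setting can be pictured as a form of dropout on category values during training. The following plain-Python sketch is an illustration under that reading, not the PGX implementation; apply_oov_dropout and the "<OOV>" marker are hypothetical names.

```python
import random
from typing import List, Sequence

OOV = "<OOV>"  # hypothetical marker for the out-of-vocabulary token


def apply_oov_dropout(
    values: Sequence[str], oov_probability: float, rng: random.Random
) -> List[str]:
    """Replace each value with the OOV token with the given probability."""
    return [OOV if rng.random() < oov_probability else value for value in values]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
batch = ["red", "blue", "green", "red"]
unchanged = apply_oov_dropout(batch, 0.0, rng)
all_oov = apply_oov_dropout(batch, 1.0, rng)
```

Occasionally training on the OOV token gives its embedding row a useful value, so that unseen categories at inference time do not hit an untrained entry.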
- class pypgx.api.mllib.GnnExplainer(java_explainer)
Bases:
object
GnnExplainer object used to request explanations from model predictions.
- property learning_rate: float
Get learning rate.
- Returns
learning rate
- Return type
float
- property marginalize: bool
Get value of marginalize.
- Returns
value of marginalize
- Return type
bool
- property num_optimization_steps: int
Get number of optimization steps.
- Returns
number of optimization steps
- Return type
int
- class pypgx.api.mllib.GnnExplanation(java_gnn_explanation)
Bases:
object
GnnExplanation object
- get_embedding()
Get the inferred embedding of the specified vertex.
- Returns
the embedding
- Return type
List[float]
- get_importance_graph()
Get the importance Graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.
- Returns
the importance graph
- Return type
- get_vertex_feature_importance()
Get the feature importances as a map from property to importance value.
- Returns
the feature importances.
- Return type
Dict[VertexProperty, float]
- get_vertex_importance_property()
Get the vertex property that contains the computed vertex importance.
- Returns
the vertex importance property
- Return type
- class pypgx.api.mllib.GraphWiseAttentionLayerConfig(java_config, params)
Bases:
GraphWiseBaseConvLayerConfig
GraphWise attention layer configuration.
- property activation_fn: Any
Get the activation function.
- property edge_to_edge_connection: Optional[bool]
Get the edge to edge connection.
- property edge_to_vertex_connection: Optional[bool]
Get the edge to vertex connection.
- property head_aggregation: Any
Get the aggregation operation for heads.
- property num_heads: int
Get the number of heads.
- property num_sampled_neighbors: int
Get the number of sampled neighbors.
- property vertex_to_edge_connection: Optional[bool]
Get the vertex to edge connection.
- property vertex_to_vertex_connection: Optional[bool]
Get the vertex to vertex connection.
- property weight_init_scheme: Any
Get the weight initialization scheme.
- class pypgx.api.mllib.GraphWiseConvLayerConfig(java_config, params)
Bases:
GraphWiseBaseConvLayerConfig
GraphWise conv layer configuration.
- property activation_fn: Any
Get the activation function.
- property edge_to_edge_connection: Optional[bool]
Get the edge to edge connection.
- property edge_to_vertex_connection: Optional[bool]
Get the edge to vertex connection.
- property neighbor_weight_property_name: str
Get the name of the property that stores the weight of the edge.
- property num_sampled_neighbors: int
Get the number of sampled neighbors.
- property vertex_to_edge_connection: Optional[bool]
Get the vertex to edge connection.
- property vertex_to_vertex_connection: Optional[bool]
Get the vertex to vertex connection.
- property weight_init_scheme: Any
Get the weight initialization scheme.
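The num_sampled_neighbors setting bounds how many neighbors a layer aggregates per vertex. A minimal sketch of such sampling follows; it assumes sampling without replacement and keeps all neighbors when a vertex has fewer than the requested number, neither of which is confirmed by the reference above. The function name and the adjacency dictionary are illustrative.

```python
import random
from typing import Dict, List


def sample_neighbors(
    adjacency: Dict[str, List[str]],
    vertex: str,
    num_sampled_neighbors: int,
    rng: random.Random,
) -> List[str]:
    """Pick at most num_sampled_neighbors neighbors of the given vertex."""
    neighbors = adjacency.get(vertex, [])
    if len(neighbors) <= num_sampled_neighbors:
        return list(neighbors)
    return rng.sample(neighbors, num_sampled_neighbors)


adjacency = {"a": ["b", "c", "d", "e"], "b": ["a"]}
rng = random.Random(0)
sampled = sample_neighbors(adjacency, "a", 2, rng)
```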
- class pypgx.api.mllib.GraphWiseDgiLayerConfig(java_config, params)
Bases:
GraphWiseEmbeddingConfig
GraphWise DGI layer configuration.
- get_corruption_function()
Return the corruption function
- Return type
- get_discriminator()
Return the discriminator
- Return type
str
- get_embedding_type()
Return the embedding type used by this config
- Return type
str
- get_readout_function()
Return the readout function
- Return type
str
- set_corruption_function(corruption_function)
Set the corruption function
- Parameters
corruption_function (CorruptionFunction) – the corruption function. Supported currently: PermutationCorruption
- set_discriminator(discriminator)
Set the discriminator
- Parameters
discriminator (str) – The discriminator function. Supported currently: ‘bilinear’
- Return type
None
- set_readout_function(readout_function)
Set the readout function
- Parameters
readout_function (str) – The readout function. Supported currently: ‘mean’
- Return type
None
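The two supported options, the 'mean' readout and the 'bilinear' discriminator, can be sketched in a few lines. In Deep Graph Infomax as usually described, the readout summarizes all vertex embeddings into one vector and the discriminator scores a vertex embedding against that summary. The code below is a plain-Python illustration with hypothetical names, not the PGX implementation.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_readout(embeddings: Sequence[Vector]) -> List[float]:
    """Summarize a set of vertex embeddings by their element-wise mean."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def bilinear_score(h: Vector, weights: Sequence[Vector], summary: Vector) -> float:
    """Compute h^T W s for a vertex embedding h and a graph summary s."""
    projected = [sum(w * s for w, s in zip(row, summary)) for row in weights]
    return sum(x * p for x, p in zip(h, projected))


def sigmoid(x: float) -> float:
    """Map a raw score to a probability-like value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


embeddings = [[1.0, 2.0], [3.0, 4.0]]
summary = mean_readout(embeddings)
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix
score = bilinear_score([1.0, 0.0], identity, summary)
```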
- class pypgx.api.mllib.GraphWiseDominantLayerConfig(java_config, params)
Bases:
GraphWiseEmbeddingConfig
GraphWise dominant layer configuration.
- get_alpha()
Return alpha.
- Returns
alpha of the decoder layer
- Return type
float
- get_decoder_layer_configs()
Get the configuration objects for the decoder layers.
- Returns
configuration of the decoder layer
- Return type
- get_embedding_type()
Return the embedding type used by this config
- Returns
embedding type
- Return type
str
- set_alpha(alpha)
Set the alpha parameter
- Parameters
alpha (float) – The alpha parameter to set.
- set_decoder_layer_configs(decoder_layer_configs)
Set the configuration objects for the decoder layers.
- Parameters
decoder_layer_configs (GraphWisePredictionLayerConfig) – configuration of the decoder layer
- Return type
- class pypgx.api.mllib.GraphWiseEmbeddingConfig
Bases:
object
GraphWise embedding configuration.
- get_embedding_type()
Return the embedding type used by this config
- Return type
str
- class pypgx.api.mllib.GraphWiseModel(java_graphwise_model, params=None)
Bases:
Model
GraphWise model object.
This is a base class for UnsupervisedGraphWiseModel and SupervisedGraphWiseModel.
- check_is_fitted()
Make sure the model is fitted.
- Returns
None
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- property edge_input_feature_dim: int
Get the dimension of the edge input features.
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
Optional[float]
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- property loss: Optional[float]
Get the training loss.
- property vertex_input_feature_dim: int
Get the dimension of the vertex input features.
- class pypgx.api.mllib.GraphWiseModelConfig(java_graphwise_model_config)
Bases:
object
Graphwise Model Configuration class
- property backend: str
Get the backend.
- property batch_size: int
Get the batch size.
- property conv_layer_configs: List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
Get the conv layer configs.
- property edge_input_feature_dim: int
Get the edge input feature dimension.
- property edge_input_property_configs: Dict[str, InputPropertyConfig]
Get the edge input property configs.
- property edge_input_property_names: Optional[List[str]]
Get the edge input property names.
- property embedding_dim: int
Get the embedding dimension.
- property enable_accelerator: str
Get whether to use the accelerator if available.
- property fitted: bool
Whether or not the model is fitted.
- get_conv_layer_configs()
Return a list of conv layer configs
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- property input_feature_dim: int
Get the input feature dimension.
- property is_fitted: bool
Whether or not the model is fitted.
- property learning_rate: float
Get the learning rate.
- property normalize: bool
Whether or not normalization is enabled.
- property num_epochs: int
Get the number of epochs.
- property seed: int
Get the seed.
- set_batch_size(batch_size)
Set the batch size
- Parameters
batch_size (int) – batch size
- Return type
None
- set_edge_input_feature_dim(edge_input_feature_dim)
Set the edge input feature dimension
- Parameters
edge_input_feature_dim (int) – edge input feature dimension
- Return type
None
- set_embedding_dim(embedding_dim)
Set the embedding dimension
- Parameters
embedding_dim (int) – embedding dimension
- Return type
None
- set_enable_accelerator(enable_accelerator)
Set whether to use the accelerator if available
- Parameters
enable_accelerator (bool) – enable accelerator flag
- Return type
None
- set_fitted(fitted)
Set the fitted flag
- Parameters
fitted (bool) – fitted flag
- Return type
None
- set_input_feature_dim(input_feature_dim)
Set the input feature dimension
- Parameters
input_feature_dim (int) – input feature dimension
- Return type
None
- set_learning_rate(learning_rate)
Set the learning rate
- Parameters
learning_rate (int) – initial learning rate
- Return type
None
- set_normalize(normalize)
Set whether or not normalization is enabled.
- Parameters
normalize (bool) –
- Return type
None
- set_num_epochs(num_epochs)
Set the number of epochs
- Parameters
num_epochs (int) – number of epochs
- Return type
None
- set_seed(seed)
Set the seed
- Parameters
seed (int) – seed
- Return type
None
- set_shuffle(shuffle)
Set the shuffling flag
- Parameters
shuffle (bool) – shuffling flag
- Return type
None
- set_standardize(standardize)
Set the standardize flag
- Parameters
standardize (bool) – standardize flag
- Return type
None
- set_training_loss(training_loss)
Set the training loss
- Parameters
training_loss (float) – training loss
- Return type
None
- set_weight_decay(weight_decay)
Set the weight decay
- Parameters
weight_decay (float) – weight decay
- Return type
None
- property shuffle: bool
Whether or not shuffling is enabled.
- property standardize: bool
Whether or not standardization is enabled.
- property training_loss: float
Get the training loss.
- property vertex_input_property_configs: Dict[str, InputPropertyConfig]
Get the vertex input property configs.
- property vertex_input_property_names: Optional[List[str]]
Get the vertex input property names.
- property weight_decay: float
Get the weight decay.
- class pypgx.api.mllib.GraphWisePredictionLayerConfig(java_config, params)
Bases:
object
GraphWise prediction layer configuration.
- class pypgx.api.mllib.InputPropertyConfig(java_config, params)
Bases:
object
Configuration class for handling input properties using one hot encoding method.
- property categorical: bool
Get whether the feature is categorical.
- Returns
whether the feature is categorical
- property property_name: str
Get the name of the feature that the configuration is used for.
- Returns
name of the feature that the configuration is used for
- class pypgx.api.mllib.MSELoss
Bases:
LossFunction
MSE loss for regression
- class pypgx.api.mllib.Model(java_generic_model)
Bases:
PgxContextManager
Model object
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
- is_fitted()
Whether or not the model has been fitted.
- Returns
Always returns False since this base class can't be fitted.
- Return type
bool
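Because Model derives from PgxContextManager and close() calls destroy(), a model can plausibly be managed with a with statement. The sketch below shows that pattern using a stand-in class; SketchModel is not part of pypgx, and the assumption is that the context manager calls close() on exit.

```python
class SketchModel:
    """Stand-in for a model object; not part of pypgx."""

    def __init__(self) -> None:
        self.destroyed = False

    def destroy(self) -> None:
        """Release the resources held by the model."""
        self.destroyed = True

    def close(self) -> None:
        """Call destroy, mirroring the documented behaviour of close()."""
        self.destroy()

    def __enter__(self) -> "SketchModel":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.close()
        return False  # do not swallow exceptions raised inside the block


with SketchModel() as model:
    destroyed_inside = model.destroyed

destroyed_after = model.destroyed
```

The model is usable inside the block and destroyed as soon as the block exits, including when an exception is raised.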
- class pypgx.api.mllib.ModelLoader(analyst, java_model_loader, wrapper, java_class)
Bases:
object
ModelLoader object.
- Parameters
analyst (Analyst) –
java_model_loader (Any) –
wrapper (Callable) –
java_class (str) –
- db(model_store, model_name, username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)
Return a model stored in a database.
- Parameters
model_store (str) – Model store in database.
model_name (str) – Name of the model to load.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.
- Returns
The model stored in database.
- Return type
Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel]
- file(path, key)
Return an encrypted model stored in a file.
- Parameters
path (str) – Path of stored model.
key (str) – Used for encryption.
- Returns
The model stored in file.
- Return type
Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel]
- class pypgx.api.mllib.ModelRepository(java_generic_model_repository)
Bases:
object
ModelRepository object that exposes CRUD operations on model stores and on the models within these model stores.
- create(model_store_name)
Create a new model store.
- Parameters
model_store_name (str) – The name of the model store.
- Return type
None
- delete_model(model_store_name, model_name)
Delete the model in the specified model store with the given model name.
- Parameters
model_store_name (str) – The name of the model store.
model_name (str) – The name under which the model was stored.
- Return type
None
- delete_model_store(model_store_name)
Delete a model store.
- Parameters
model_store_name (str) – The name of the model store.
- Return type
None
- get_model_description(model_store_name, model_name)
Retrieve the description of the model in the specified model store, with the given model name.
- Parameters
model_store_name (str) – The name of the model store.
model_name (str) – The name under which the model was stored.
- Returns
A string containing the description that was stored with the model.
- Return type
str
- list_model_stores_names()
List the names of all model stores in the model repository.
- Returns
List of names.
- Return type
List[str]
- list_model_stores_names_matching(regex)
List the names of all model stores in the model repository that match the regex.
- Parameters
regex (str) – A regex in form of a string.
- Returns
List of matching names.
- Return type
List[str]
- list_models(model_store_name)
List the models present in the model store with the given name.
- Parameters
model_store_name (str) – The name of the model store (non-prefixed).
- Returns
List of model names.
- Return type
List[str]
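The filtering performed by list_model_stores_names_matching can be approximated locally with Python's re module. This sketch assumes the regex must match the whole name, which the reference above does not specify; names_matching and the example store names are hypothetical.

```python
import re
from typing import Iterable, List


def names_matching(names: Iterable[str], regex: str) -> List[str]:
    """Return the names that match the regex in full."""
    pattern = re.compile(regex)
    return [name for name in names if pattern.fullmatch(name)]


stores = ["fraud_models", "fraud_test", "embeddings"]
fraud_stores = names_matching(stores, r"fraud_.*")
```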
- class pypgx.api.mllib.ModelRepositoryBuilder(java_generic_model_repository_builder)
Bases:
object
ModelRepositoryBuilder object that can be used to configure the connection to a model repository.
- db(username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)
Connect to a model repository backed by a database.
- Parameters
username (Optional[str]) – username in database
password (Optional[str]) – password of username in database
jdbc_url (Optional[str]) – jdbc url of database
keystore_alias (Optional[str]) – the keystore alias to get the password in the keystore
schema (Optional[str]) – the schema of the model store in database
- Returns
A model repository configured to connect to a database.
- Return type
- class pypgx.api.mllib.ModelStorer(model)
Bases:
object
ModelStorer object.
- Parameters
model (Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel, Model]) –
- db(model_store, model_name, username=None, password=None, jdbc_url=None, model_description=None, overwrite=False, keystore_alias=None, schema=None)
Store a model to a database.
- Parameters
model_store (str) – Model store in database.
model_name (str) – name of the model to store.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
model_description (Optional[str]) – Description of model.
overwrite (bool) – Boolean value for overwriting or not.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
- file(path, key, overwrite=False)
Store an encrypted model to a file.
- Parameters
path (str) – Path to store model.
key (str) – Key used for encryption.
overwrite (bool) – Boolean value for overwriting or not.
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
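The two safeguards described for storing a model, the fitted check and the overwrite flag, can be sketched with a stand-in class. SketchStorer and its write method are not part of pypgx, encryption is omitted, and raising FileExistsError when overwrite is False is an assumption about how an existing file is handled.

```python
import os


class SketchStorer:
    """Stand-in that writes a payload to disk; not part of pypgx."""

    def __init__(self, payload: bytes, fitted: bool) -> None:
        self.payload = payload
        self.fitted = fitted

    def write(self, path: str, overwrite: bool = False) -> None:
        """Write the payload, refusing unfitted models and existing files."""
        if not self.fitted:
            raise RuntimeError("the model is not fitted")
        if os.path.exists(path) and not overwrite:
            raise FileExistsError(path)
        with open(path, "wb") as handle:
            handle.write(self.payload)
```

Checking the fitted state before touching the file system means a failed call never leaves a partially written or clobbered file behind.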
- class pypgx.api.mllib.OneHotEncodingConfig(java_config, params)
Bases:
CategoricalPropertyConfig
Configuration class for handling categorical input properties.
- property categorical: bool
Get whether the feature is categorical.
- Returns
whether the feature is categorical
- property categorical_embedding_type: bool
Get the type of categorical embedding.
- Returns
type of categorical embedding
- property max_vocabulary_size: int
Get the maximum number of tokens allowed in the vocabulary of a categorical feature. The max_vocabulary_size most frequent category values are kept; the rest are treated as out-of-vocabulary (OOV) tokens.
- Returns
maximum number of tokens allowed
- property property_name: str
Get the name of the feature that the configuration is used for.
- Returns
name of the feature that the configuration is used for
- set_max_vocabulary_size(max_vocabulary_size)
Set max vocabulary size to a given value.
- Parameters
max_vocabulary_size (int) – set the maximum vocabulary size to the given value
- Return type
None
Set whether the feature is shared among vertex/edge types.
- Parameters
shared – set shared to the value
- Return type
None
Get whether the feature is shared among vertex/edge types.
- Returns
whether the feature is shared among vertex/edge types
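The effect of max_vocabulary_size on a one-hot encoding can be sketched in plain Python. This is an illustration only, not the PGX implementation: the helper names build_vocabulary and one_hot are hypothetical, and reserving one trailing slot for OOV tokens is an assumption made for the example.

```python
from collections import Counter


def build_vocabulary(values, max_vocabulary_size):
    """Keep the most frequent categories; everything else will map to OOV."""
    counts = Counter(values)
    # most_common orders equal counts by first occurrence, so this is deterministic.
    kept = [value for value, _ in counts.most_common(max_vocabulary_size)]
    return {value: index for index, value in enumerate(kept)}


def one_hot(value, vocabulary):
    """One-hot encode a value; the last slot is reserved for OOV (an assumption)."""
    vector = [0] * (len(vocabulary) + 1)
    vector[vocabulary.get(value, len(vocabulary))] = 1
    return vector


colors = ["red", "blue", "red", "green", "blue", "red", "teal"]
vocab = build_vocabulary(colors, max_vocabulary_size=2)
print(vocab)                    # {'red': 0, 'blue': 1}
print(one_hot("red", vocab))    # [1, 0, 0]
print(one_hot("teal", vocab))   # [0, 0, 1]
```

A smaller max_vocabulary_size shortens the encoded vector at the cost of merging rare categories into the single OOV slot.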
- class pypgx.api.mllib.PermutationCorruption(java_permutation_corruption)
Bases:
CorruptionFunction
Permutation function that shuffles the nodes to generate the corrupted subgraph for DGI.
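As a rough illustration of permutation-based corruption, the sketch below shuffles the rows of a per-vertex feature matrix so that each vertex ends up with another vertex's features while the graph structure stays untouched. The function permute_features is hypothetical and is not part of pypgx; how PGX performs the shuffle internally is not described here.

```python
import random


def permute_features(features, seed=None):
    """Return a row-shuffled copy of a per-vertex feature matrix."""
    rng = random.Random(seed)
    corrupted = list(features)  # shallow copy, the input list is left unchanged
    rng.shuffle(corrupted)
    return corrupted


features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
corrupted = permute_features(features, seed=7)
print(corrupted)
```

The corrupted matrix contains exactly the same rows as the original, only assigned to different vertices, which is what makes it usable as a negative sample.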
- class pypgx.api.mllib.Pg2vecModel(java_pg2vec_model)
Bases:
Model
Pg2Vec model object.
- property batch_size: int
Get the batch size.
- close()
Call destroy
- Return type
None
- compute_similars(graphlet_id, k)
Compute the top-k similar graphlets for a list of input graphlets.
- Parameters
graphlet_id (Union[Iterable[Union[int, str]], int, str]) – a graphlet ID or an iterable of graphlet IDs
k (int) – number of similar graphlets to return
- Return type
- destroy()
Destroy this model object.
- Return type
None
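Because close() simply calls destroy(), a model can be wrapped in contextlib.closing from the Python standard library so that it is released even when an exception occurs. The sketch below uses a hypothetical FakeModel stand-in, since a real Pg2vecModel requires a PGX session; whether pypgx models can be used directly in a with statement is not stated here, so closing() is used instead.

```python
from contextlib import closing


class FakeModel:
    """Hypothetical stand-in exposing the close()/destroy() pair documented above."""

    def __init__(self):
        self.destroyed = False

    def destroy(self):
        self.destroyed = True

    def close(self):
        # Mirrors the documented behaviour: close() calls destroy().
        self.destroy()


model = FakeModel()
with closing(model) as m:
    pass  # work with the model here

print(model.destroyed)  # True
```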
- property enable_accelerator: bool
Get whether the accelerator is used if available.
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
ModelStorer
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Return type
None
- property graphlet_id_property_name: str
Get the graphlet id property name.
- property graphlet_size_property_name: str
Get the graphlet size property name.
- infer_graphlet_vector(graph)
Return the inferred vector of the input graphlet as a PgxFrame.
- infer_graphlet_vector_batched(graph)
Return the inferred vectors of the input graphlets as a PgxFrame.
- is_fitted()
Whether or not the model has been fitted.
- Return type
bool
- property layer_size: int
Get the layer size.
- property learning_rate: float
Get the learning rate.
- property loss: Optional[float]
Get the loss of the model.
- property min_learning_rate: float
Get the minimum learning rate.
- property min_word_frequency: int
Get the minimum word frequency.
- property num_epochs: int
Get the number of epochs.
- property seed: Optional[int]
Get the seed.
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (Optional[str]) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Return type
None
- property trained_graphlet_vectors: PgxFrame
Get the trained graphlet vectors for the current pg2vec model.
- Returns
PgxFrame containing the trained graphlet vectors
- property use_graphlet_size: bool
Get whether the graphlet size is used.
- property validation_fraction: float
Get the validation fraction.
validation_fraction is deprecated since 23.4; the loss is now computed on all samples.
- property vertex_property_names: List[str]
Get the vertex property names.
- property walk_length: int
Get the walk length.
- property walks_per_vertex: int
Get the walks per vertex.
- property window_size: int
Get the window size.
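To make the idea behind compute_similars concrete, the following sketch ranks graphlet vectors by similarity to a query vector. It is not the pypgx implementation: cosine similarity is an assumption (the measure used by PGX is not stated here), the query graphlet is excluded from its own result by choice, and top_k_similar is a hypothetical helper operating on a plain dictionary rather than a PgxFrame.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_similar(vectors, query_id, k):
    """Return the k (id, similarity) pairs closest to the query, best first."""
    query = vectors[query_id]
    scored = [
        (other_id, cosine(query, vector))
        for other_id, vector in vectors.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


graphlet_vectors = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
similar = top_k_similar(graphlet_vectors, "a", k=2)
print([graphlet_id for graphlet_id, _ in similar])  # ['b', 'c']
```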
- class pypgx.api.mllib.ProductEdgeCombinationMethod(use_source_vertex, use_destination_vertex, use_edge)
Bases:
EdgeCombinationMethod
Product method for edge embedding generation
- Parameters
use_source_vertex (bool) –
use_destination_vertex (bool) –
use_edge (bool) –
- get_aggregation_type()
Get the aggregation type.
- Returns
the aggregation type
- Return type
str
- use_dst_vertex()
Get whether the destination vertex embedding is used for the edge embedding.
- Returns
whether the destination vertex embedding is used
- Return type
bool
- use_edge()
Get whether edge features are used for the edge embedding.
- Returns
whether the edge features are used
- Return type
bool
- use_src_vertex()
Get whether the source vertex embedding is used for the edge embedding.
- Returns
whether the source vertex embedding is used
- Return type
bool
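The difference between a concatenation and a product combination can be sketched with plain lists. This is an illustration under assumptions, not the PGX code: the product is taken to be element-wise, which requires all selected vectors to have the same length, and both function names are hypothetical. The three flags mirror use_source_vertex, use_destination_vertex and use_edge.

```python
def concat_combination(src, dst, edge, use_src=True, use_dst=True, use_edge=True):
    """Concatenate the selected vectors into one edge representation."""
    combined = []
    for vector, used in ((src, use_src), (dst, use_dst), (edge, use_edge)):
        if used:
            combined.extend(vector)
    return combined


def product_combination(src, dst, edge, use_src=True, use_dst=True, use_edge=True):
    """Multiply the selected vectors element-wise (assumed semantics)."""
    selected = [v for v, used in ((src, use_src), (dst, use_dst), (edge, use_edge)) if used]
    if not selected:
        raise ValueError("at least one input must be selected")
    if len({len(v) for v in selected}) != 1:
        raise ValueError("element-wise product needs equal-length vectors")
    result = list(selected[0])
    for vector in selected[1:]:
        result = [a * b for a, b in zip(result, vector)]
    return result


src, dst, edge = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
print(concat_combination(src, dst, edge))                   # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(product_combination(src, dst, edge))                  # [15.0, 48.0]
print(product_combination(src, dst, edge, use_edge=False))  # [3.0, 8.0]
```

Concatenation grows the output dimension with every selected input, while the element-wise product keeps it fixed.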
- class pypgx.api.mllib.SigmoidCrossEntropyLoss
Bases:
LossFunction
Sigmoid Cross Entropy loss for binary classification
- class pypgx.api.mllib.SoftmaxCrossEntropyLoss
Bases:
LossFunction
Softmax Cross Entropy loss for multi-class classification
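For reference, the two losses can be written out for a single sample using their standard textbook definitions. This sketch is independent of PGX: it does not cover batching, reduction or class weights, and the function names are chosen for the example only.

```python
import math


def sigmoid_cross_entropy(logit, label):
    """Binary cross entropy on a raw logit; label is 0 or 1."""
    # Numerically stable form of -y*log(s(x)) - (1-y)*log(1-s(x)).
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def softmax_cross_entropy(logits, target_index):
    """Multi-class cross entropy on raw logits for one target class."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    log_sum_exp = shift + math.log(sum(math.exp(x - shift) for x in logits))
    return log_sum_exp - logits[target_index]


print(sigmoid_cross_entropy(0.0, 1))            # log(2), about 0.693
print(softmax_cross_entropy([0.0, 0.0, 0.0], 1))  # log(3), about 1.099
```

With two classes the losses coincide: softmax cross entropy on logits [x, 0] with target 0 equals sigmoid cross entropy on logit x with label 1.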
- class pypgx.api.mllib.SupervisedEdgeWiseModel(java_edgewise_model, params=None)
Bases:
EdgeWiseModel
SupervisedEdgeWise model object.
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- evaluate(graph, edges, threshold=0.0)
Evaluate performance statistics for the specified edges.
- evaluate_labels(graph, edges, threshold=0.0)
Evaluate (macro averaged) classification performance statistics for the specified edges.
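Macro averaging computes a metric per class and then takes the unweighted mean, so rare classes count as much as frequent ones. The sketch below shows this for precision and recall on plain label lists. Which metrics PGX actually reports is not listed here, and treating an undefined per-class value as 0.0 is an assumption of this example.

```python
def macro_precision_recall(y_true, y_pred):
    """Return (macro precision, macro recall) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # Assumption: a class with no predictions (or no samples) contributes 0.0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)


y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
precision, recall = macro_precision_recall(y_true, y_pred)
print(precision, recall)
```

Here class a has precision 1 and recall 1/2, class b has precision 2/3 and recall 1, giving macro values of 5/6 and 3/4.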
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
ModelStorer
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Returns
None
- Return type
None
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_class_weights()
Get the class weights.
- Returns
a dictionary mapping classes to their weights.
- Return type
dict
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_edge_combination_method()
Get the edge combination method used to compute the edge embedding
- Returns
edge combination method
- Return type
EdgeCombinationMethod
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_edge_target_property_name()
Get the target property name
- Returns
target property name
- Return type
str
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_loss_function()
Get the loss function name.
- Returns
loss function name. Can be one of softmax_cross_entropy, sigmoid_cross_entropy, devnet
- Return type
str
- get_loss_function_class()
Get the loss function.
- Returns
loss function
- Return type
LossFunction
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_prediction_layer_configs()
Get the configuration objects for the prediction layers.
- Returns
configuration of the prediction layer
- Return type
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_target_edge_labels()
Get the target edge labels
- Returns
target edge labels
- Return type
List[str]
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
float
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- infer(graph, edges, threshold=0.0)
Infer the predictions for the specified edges
- Parameters
- Returns
PgxFrame containing the inference results for each edge
- Return type
PgxFrame
- infer_embeddings(graph, edges)
Infer the embeddings for the specified edges
- infer_labels(graph, edges, threshold=0.0)
Infer the labels for the specified edges
- Parameters
- Returns
PgxFrame containing the labels for each edge
- Return type
PgxFrame
- infer_logits(graph, edges)
Infer the prediction logits for the specified edges
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Returns
None
- Return type
None
- class pypgx.api.mllib.SupervisedGnnExplainer(java_explainer, bool_label)
Bases:
GnnExplainer
SupervisedGnnExplainer used to request explanations from supervised model predictions.
- Parameters
bool_label (bool) –
- infer_and_explain(graph, vertex, threshold=0.0)
Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the prediction.
- Parameters
- Returns
explanation containing feature importance and vertex importance.
- Return type
- property learning_rate: float
Get learning rate.
- Returns
learning rate
- Return type
float
- property marginalize: bool
Get value of marginalize.
- Returns
value of marginalize
- Return type
bool
- property num_optimization_steps: int
Get number of optimization steps.
- Returns
number of optimization steps
- Return type
int
- class pypgx.api.mllib.SupervisedGnnExplanation(java_supervised_gnn_explanation, bool_label)
Bases:
GnnExplanation
SupervisedGnnExplanation object
- Parameters
bool_label (bool) –
- get_embedding()
Get the inferred embedding of the specified vertex.
- Returns
the embedding
- Return type
List[float]
- get_importance_graph()
Get the importance Graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.
- Returns
the importance graph
- Return type
- get_label()
Get the inferred label of the specified vertex.
- Returns
the label
- Return type
Any
- get_logits()
Get the inferred logits of the specified vertex.
- Returns
the logits
- Return type
List[float]
- get_vertex_feature_importance()
Get the feature importances as a map from property to importance value.
- Returns
the feature importances.
- Return type
Dict[VertexProperty, float]
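Since get_vertex_feature_importance returns a dictionary, the most influential properties can be listed with an ordinary sort. In the sketch below the keys are plain strings standing in for VertexProperty objects, the values are made up for illustration, and top_features is a hypothetical helper.

```python
def top_features(importances, n=3):
    """Return the n (property, importance) pairs with the highest importance."""
    return sorted(importances.items(), key=lambda item: item[1], reverse=True)[:n]


# Stand-in for the result of explanation.get_vertex_feature_importance().
importances = {"age": 0.1, "balance": 0.7, "country": 0.2}
print(top_features(importances, n=2))  # [('balance', 0.7), ('country', 0.2)]
```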
- get_vertex_importance_property()
Get the vertex property that contains the computed vertex importance.
- Returns
the vertex importance property
- Return type
- class pypgx.api.mllib.SupervisedGraphWiseModel(java_graphwise_model, params=None)
Bases:
GraphWiseModel
SupervisedGraphWise model object.
- check_is_fitted()
Make sure the model is fitted.
- Returns
None
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- property edge_input_feature_dim: int
Get the dimension of the edge input features.
- evaluate(graph, vertices, threshold=0.0)
Evaluate performance statistics for the specified vertices.
- Parameters
- Returns
PgxFrame containing the metrics
- Return type
PgxFrame
- evaluate_labels(graph, vertices, threshold=0.0)
Evaluate (macro averaged) classification performance statistics for the specified vertices.
- Parameters
- Returns
PgxFrame containing the metrics
- Return type
PgxFrame
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
ModelStorer
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Returns
None
- Return type
None
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_class_weights()
Get the class weights.
- Returns
a dictionary mapping classes to their weights.
- Return type
dict
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_loss_function()
Get the loss function name.
- Returns
loss function name. Can be one of softmax_cross_entropy, sigmoid_cross_entropy, devnet
- Return type
str
- get_loss_function_class()
Get the loss function.
- Returns
loss function
- Return type
LossFunction
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_prediction_layer_configs()
Get the configuration objects for the prediction layers.
- Returns
configuration of the prediction layer
- Return type
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_target_vertex_labels()
Get the target vertex labels
- Returns
target vertex labels
- Return type
List[str]
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
Optional[float]
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- get_vertex_target_property_name()
Get the target property name
- Returns
target property name
- Return type
str
- gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False)
Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.
- Parameters
num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200
learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05
marginalize (bool, optional) – marginalize the loss over features, defaults to False
- Returns
SupervisedGnnExplainer object of this model
- Return type
SupervisedGnnExplainer
- infer(graph, vertices, threshold=0.0)
Infer the predictions for the specified vertices
- Parameters
- Returns
PgxFrame containing the inference results for each vertex
- Return type
PgxFrame
- infer_and_get_explanation(graph, vertex, num_optimization_steps=200, learning_rate=0.05, marginalize=False, threshold=0.0)
Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the prediction.
- Parameters
- Returns
explanation containing feature importance and vertex importance.
- Return type
- infer_embeddings(graph, vertices)
Infer the embeddings for the specified vertices
- infer_labels(graph, vertices, threshold=0.0)
Infer the labels for the specified vertices
- Parameters
- Returns
PgxFrame containing the labels for each vertex
- Return type
PgxFrame
- infer_logits(graph, vertices)
Infer the prediction logits for the specified vertices
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- property loss: Optional[float]
Get the training loss.
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Returns
None
- Return type
None
- property vertex_input_feature_dim: int
Get the dimension of the vertex input features.
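A typical sequence with the methods listed above is to fit the model if needed, persist it, and read back the training loss. The helper below is a sketch that relies only on is_fitted(), fit(), store() and get_training_loss() as documented; fit_and_store and StubModel are hypothetical names, and the stub only records calls because a real SupervisedGraphWiseModel requires a PGX session and a loaded graph.

```python
def fit_and_store(model, graph, path, key, overwrite=False):
    """Fit the model when necessary, store it encrypted, return the training loss."""
    if not model.is_fitted():
        model.fit(graph)
    model.store(path, key, overwrite=overwrite)
    return model.get_training_loss()


class StubModel:
    """Hypothetical stand-in that mimics the documented method names."""

    def __init__(self):
        self.calls = []
        self._fitted = False

    def is_fitted(self):
        return self._fitted

    def fit(self, graph):
        self.calls.append("fit")
        self._fitted = True

    def store(self, path, key, overwrite=False):
        self.calls.append(("store", path, overwrite))

    def get_training_loss(self):
        return 0.25 if self._fitted else None


stub = StubModel()
loss = fit_and_store(stub, graph=None, path="model.bin", key="secret")
print(stub.calls, loss)
```

Calling the helper a second time on the same model skips fit() and only stores again.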
- class pypgx.api.mllib.UnsupervisedAnomalyDetectionGraphWiseModel(java_graphwise_model, params=None)
Bases:
UnsupervisedGraphWiseModel
UnsupervisedAnomalyDetectionGraphWise model object.
- check_is_fitted()
Make sure the model is fitted.
- Returns
None
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- property edge_input_feature_dim: int
Get the dimension of the edge input features.
- evaluate_anomaly_labels(graph, vertices, vertex_anomaly_property_name, anomaly_property_value, threshold)
Evaluate anomaly detection performance statistics for the specified vertices.
- Parameters
graph (PgxGraph) – the graph
vertices (Union[Iterable[PgxVertex], Iterable[int], Iterable[str]]) – the vertices to evaluate on
vertex_anomaly_property_name (str) – the name of the property containing the anomaly
anomaly_property_value (object) – the value indicating an anomaly in vertex_anomaly_property_name property
threshold (float) – the anomaly threshold
- Raises
LookupError – if the property is not found
- Returns
PgxFrame containing the evaluation results.
- Return type
PgxFrame
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
ModelStorer
- find_anomaly_threshold(graph, vertices, contamination_factor)
Find an appropriate anomaly threshold for labeling the input vertices as anomalies, respecting the proportion given by the contamination factor
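One way to picture a contamination-based threshold is to pick the score at which the requested fraction of vertices falls on the anomalous side. The sketch below is not the PGX algorithm: it assumes that higher scores mean "more anomalous" and that a score equal to the threshold counts as an anomaly, and threshold_for_contamination is a hypothetical helper working on a plain list of scores.

```python
def threshold_for_contamination(scores, contamination_factor):
    """Return a threshold so that roughly the given fraction of scores reaches it."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < contamination_factor <= 1.0:
        raise ValueError("contamination_factor must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Flag at least one element; assumption: higher score = more anomalous.
    n_anomalies = max(1, int(contamination_factor * len(ranked)))
    return ranked[n_anomalies - 1]


scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.4, 0.5, 0.6, 0.7, 0.05]
threshold = threshold_for_contamination(scores, contamination_factor=0.2)
labels = [score >= threshold for score in scores]
print(threshold, sum(labels))  # 0.8 2
```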
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Returns
None
- Return type
None
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_dgi_layer_config()
Get the configuration object for the dgi layer.
- Returns
configuration
- Return type
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_embedding_config()
Get the configuration object for the embedding method
- Returns
configuration
- Return type
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_loss_function()
Get the loss function name.
- Returns
loss function name. Can only be sigmoid_cross_entropy
- Return type
str
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
Optional[float]
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False, num_clusters=50, num_samples=10000)
Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.
- Parameters
num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200
learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05
marginalize (bool, optional) – marginalize the loss over features, defaults to False
num_clusters (int, optional) – number of clusters to use, defaults to 50
num_samples (int, optional) – number of samples to use, defaults to 10000
- Returns
UnsupervisedGnnExplainer object of this model
- Return type
UnsupervisedGnnExplainer
- infer_and_get_explanation(graph, vertex, num_clusters=50, num_samples=10000, num_optimization_steps=200, learning_rate=0.05, marginalize=False)
Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the embedding's position relative to the embeddings of other vertices in the graph.
- Parameters
- Returns
explanation containing feature importance and vertex importance.
- Return type
- infer_anomaly_labels(graph, vertices, threshold)
Infer the anomaly labels for the specified vertices.
- infer_anomaly_scores(graph, vertices)
Infer the anomaly scores for the specified vertices.
- infer_embeddings(graph, vertices)
Infer the embeddings for the specified vertices.
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- property loss: Optional[float]
Get the training loss.
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Returns
None
- Return type
None
- property vertex_input_feature_dim: int
Get the dimension of the vertex input features.
- class pypgx.api.mllib.UnsupervisedEdgeWiseModel(java_edgewise_model, params=None)
Bases:
EdgeWiseModel
UnsupervisedEdgeWise model object.
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
ModelStorer
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Returns
None
- Return type
None
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_dgi_layer_config()
Get the configuration object for the dgi layer.
- Returns
configuration
- Return type
- get_edge_combination_method()
Get the edge combination method used to compute the edge embedding
- Returns
edge combination method
- Return type
EdgeCombinationMethod
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_loss_function()
Get the loss function name.
- Returns
loss function name: sigmoid_cross_entropy
- Return type
str
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_target_edge_labels()
Get the target edge labels
- Returns
target edge labels
- Return type
List[str]
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
float
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- infer_embeddings(graph, edges)
Infer the embeddings for the specified edges
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Returns
None
- Return type
None
- class pypgx.api.mllib.UnsupervisedGnnExplainer(java_explainer)
Bases:
GnnExplainer
UnsupervisedGnnExplainer used to request explanations from unsupervised model predictions.
- infer_and_explain(graph, vertex)
Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the embedding's position relative to the embeddings of other vertices in the graph.
- Parameters
- Returns
Explanation containing feature importance and vertex importance.
- Return type
- property learning_rate: float
Get learning rate.
- Returns
learning rate
- Return type
float
- property marginalize: bool
Get value of marginalize.
- Returns
value of marginalize
- Return type
bool
- property num_clusters: int
Get number of clusters.
- Returns
Number of clusters.
- Return type
int
- property num_optimization_steps: int
Get number of optimization steps.
- Returns
number of optimization steps
- Return type
int
- property num_samples: int
Get number of samples.
- Returns
Number of samples.
- Return type
int
- class pypgx.api.mllib.UnsupervisedGnnExplanation(java_unsupervised_gnn_explanation)
Bases:
GnnExplanation
UnsupervisedGnnExplanation object
- get_embedding()
Get the inferred embedding of the specified vertex.
- Returns
the embedding
- Return type
List[float]
- get_importance_graph()
Get the importance Graph, that is, the computation graph with an additional vertex property indicating vertex importance. The additional importance property can be retrieved via get_vertex_importance_property.
- Returns
the importance graph
- Return type
- get_vertex_feature_importance()
Get the feature importances as a map from property to importance value.
- Returns
the feature importances.
- Return type
Dict[VertexProperty, float]
- get_vertex_importance_property()
Get the vertex property that contains the computed vertex importance.
- Returns
the vertex importance property
- Return type
- class pypgx.api.mllib.UnsupervisedGraphWiseModel(java_graphwise_model, params=None)
Bases:
GraphWiseModel
UnsupervisedGraphWise model object.
- check_is_fitted()
Make sure the model is fitted.
- Returns
None
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
- close()
Call destroy
- Return type
None
- destroy()
Destroy this model object.
- Return type
None
- property edge_input_feature_dim: int
Get the dimension of the edge input features.
- export()
Return a ModelStorer object which can be used to save the model.
- Returns
ModelStorer object
- Return type
ModelStorer
- fit(graph)
Fit the model on a graph.
- Parameters
graph (PgxGraph) – Graph to fit on
- Returns
None
- Return type
None
- get_batch_size()
Get the batch size
- Returns
batch size
- Return type
int
- get_config()
Return the GraphWiseModelConfig object
- Returns
the config
- Return type
- get_conv_layer_config()
Get the configuration objects for the convolutional layers
- Returns
configurations
- Return type
List[Union[GraphWiseConvLayerConfig, GraphWiseAttentionLayerConfig]]
- get_dgi_layer_config()
Get the configuration object for the dgi layer.
- Returns
configuration
- Return type
- get_edge_input_feature_dim()
Get the edges input feature dimension, that is, the dimension of all the input edge properties when concatenated
- Returns
edges input feature dimension
- Return type
int
- get_edge_input_property_configs()
Get the configuration objects for edge input properties
- Return type
List[InputPropertyConfig]
- get_edge_input_property_names()
Get the edges input feature names
- Returns
edges input feature names
- Return type
list(str)
- get_embedding_config()
Get the configuration object for the embedding method
- Returns
configuration
- Return type
- get_layer_size()
Get the dimension of the embeddings
- Returns
embedding dimension
- Return type
int
- get_learning_rate()
Get the initial learning rate
- Returns
initial learning rate
- Return type
float
- get_loss_function()
Get the loss function name.
- Returns
loss function name. Can only be sigmoid_cross_entropy
- Return type
str
- get_num_epochs()
Get the number of epochs to train the model
- Returns
number of epochs to train the model
- Return type
int
- get_seed()
Get the random seed
- Returns
random seed
- Return type
int
- get_training_loss()
Get the final training loss
- Returns
training loss
- Return type
Optional[float]
- get_vertex_input_feature_dim()
Get the input feature dimension, that is, the dimension of all the input vertex properties when concatenated
- Returns
input feature dimension
- Return type
int
- get_vertex_input_property_configs()
Get the configuration objects for vertex input properties
- Returns
configurations
- Return type
List[InputPropertyConfig]
- get_vertex_input_property_names()
Get the vertices input feature names
- Returns
vertices input feature names
- Return type
list(str)
- gnn_explainer(num_optimization_steps=200, learning_rate=0.05, marginalize=False, num_clusters=50, num_samples=10000)
Configure and return the GnnExplainer object of this model that can be used to request explanations of predictions.
- Parameters
num_optimization_steps (int, optional) – optimization steps for the explainer, defaults to 200
learning_rate (float, optional) – learning rate for the explainer, defaults to 0.05
marginalize (bool, optional) – marginalize the loss over features, defaults to False
num_clusters (int, optional) – number of clusters to use, defaults to 50
num_samples (int, optional) – number of samples to use, defaults to 10000
- Returns
UnsupervisedGnnExplainer object of this model
- Return type
UnsupervisedGnnExplainer
- infer_and_get_explanation(graph, vertex, num_clusters=50, num_samples=10000, num_optimization_steps=200, learning_rate=0.05, marginalize=False)
Perform inference on the specified vertex and generate an explanation that contains scores of how important each property and each vertex in the computation graph is for the embedding's position relative to the embeddings of other vertices in the graph.
- Parameters
- Returns
explanation containing feature importance and vertex importance.
- Return type
- infer_embeddings(graph, vertices)
Infer the embeddings for the specified vertices.
- is_fitted()
Check if the model is fitted
- Returns
True if the model is fitted, False otherwise
- Return type
bool
- property loss: Optional[float]
Get the training loss.
- store(path, key, overwrite=False)
Store the model in a file.
- Parameters
path (str) – Path where to store the model
key (str) – Encryption key
overwrite (bool) – Whether or not to overwrite pre-existing file
- Returns
None
- Return type
None
- property vertex_input_feature_dim: int
Get the dimension of the vertex input features.
- class pypgx.api.mllib._model_utils.ModelStorer(model)
Bases:
object
ModelStorer object.
- Parameters
model (Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel, Model]) –
- db(model_store, model_name, username=None, password=None, jdbc_url=None, model_description=None, overwrite=False, keystore_alias=None, schema=None)
Store a model to a database.
- Parameters
model_store (str) – Model store in database.
model_name (str) – Name of the model to store.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
model_description (Optional[str]) – Description of model.
overwrite (bool) – Whether to overwrite an existing model.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
- file(path, key, overwrite=False)
Store an encrypted model to a file.
- Parameters
path (str) – Path to store model.
key (str) – Key used for encryption.
overwrite (bool) – Whether to overwrite an existing file.
- Raises
RuntimeError – If the model is not fitted.
- Return type
None
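Because file() takes the encryption key as a plain string, one option is to read the key from the environment rather than hard-coding it. The sketch below illustrates that idea under stated assumptions: the helper name store_with_env_key and the environment variable name PGX_MODEL_KEY are hypothetical, and storer is assumed to be a ModelStorer exposing the file(path, key, overwrite=False) method documented above.

```python
import os


def store_with_env_key(storer, path, key_env="PGX_MODEL_KEY", overwrite=False):
    """Hypothetical helper: store a model using a key from the environment.

    `storer` is assumed to expose the documented
    file(path, key, overwrite=False) method.
    `key_env` is an illustrative variable name, not one defined by the library.
    """
    key = os.environ.get(key_env)
    if not key:
        # Fail early instead of storing with a missing or empty key.
        raise KeyError(f"environment variable {key_env!r} is not set")
    storer.file(path, key, overwrite=overwrite)
```

As documented, file() itself raises RuntimeError if the model is not fitted; this helper lets that error propagate unchanged.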
- class pypgx.api.mllib._model_utils.ModelLoader(analyst, java_model_loader, wrapper, java_class)
Bases:
object
ModelLoader object.
- Parameters
analyst (Analyst) –
java_model_loader (Any) –
wrapper (Callable) –
java_class (str) –
- db(model_store, model_name, username=None, password=None, jdbc_url=None, keystore_alias=None, schema=None)
Return a model stored in a database.
- Parameters
model_store (str) – Model store in database.
model_name (str) – Name of the model to load.
username (Optional[str]) – Username in database.
password (Optional[str]) – Password of username in database.
jdbc_url (Optional[str]) – JDBC url of database.
keystore_alias (Optional[str]) – The keystore alias to get the password in the keystore.
schema (Optional[str]) – The schema of the model store in database.
- Returns
The model stored in database.
- Return type
Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel, UnsupervisedEdgeWiseModel]
- file(path, key)
Return an encrypted model stored in a file.
- Parameters
path (str) – Path of stored model.
key (str) – Key used for encryption.
- Returns
The model stored in file.
- Return type
Union[SupervisedGraphWiseModel, Pg2vecModel, UnsupervisedGraphWiseModel, DeepWalkModel, SupervisedEdgeWiseModel]
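ModelStorer.file() and ModelLoader.file() take the same path and key, so a store-then-load round trip can be sketched as follows. The helper name store_then_load is hypothetical; storer and loader are assumed to be a ModelStorer and a ModelLoader for the same model type, obtained elsewhere, and only the file() signatures documented above are used.

```python
def store_then_load(storer, loader, path: str, key: str):
    """Hypothetical helper: store a model to `path`, then load it back.

    `storer` is assumed to expose file(path, key, overwrite=False) and
    `loader` to expose file(path, key), as documented above.
    The same key must be used for both calls.
    """
    # overwrite=True replaces any file already stored at `path`.
    storer.file(path, key, overwrite=True)
    return loader.file(path, key)
```

Passing overwrite=True is a choice made for this sketch; use the default overwrite=False if an existing file should not be replaced.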