API package 

all_reachable_vertices_edges(graph, src, dst, k, filter=None)

Find all the vertices and edges on paths between src and dst whose length is smaller than or equal to k.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – The source vertex
dst (PgxVertex) – The destination vertex
k (int) – Maximum length of the paths to consider
filter (Optional[EdgeFilter]) – The filter to be used on edges when searching for a path

Returns

The vertices on the path, the edges on the path and a map containing the distances from the source vertex for each vertex on the path

Return type

Tuple[VertexSet, EdgeSet, PgxMap]
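The idea behind this search can be illustrated without PGX. The following is a minimal pure-Python sketch, not the PGX implementation: it runs one BFS from src and one reverse BFS from dst, and keeps a vertex when the two hop distances add up to at most k. The function names and the edge-list input are hypothetical, and the sketch counts walks that may revisit a vertex, which may differ from what the library does.

```python
from collections import deque


def bfs_hops(adj, start):
    """Hop distance from start to every vertex reachable through adj."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def reachable_within_k(edges, src, dst, k):
    """Vertices, edges and source distances on src->dst walks of length <= k."""
    fwd, rev = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        rev.setdefault(v, []).append(u)
    from_src = bfs_hops(fwd, src)  # distances from the source
    to_dst = bfs_hops(rev, dst)    # distances to the destination
    vertices = {v for v in from_src
                if v in to_dst and from_src[v] + to_dst[v] <= k}
    path_edges = {(u, v) for u, v in edges
                  if u in from_src and v in to_dst
                  and from_src[u] + 1 + to_dst[v] <= k}
    dist = {v: from_src[v] for v in vertices}
    return vertices, path_edges, dist
```

The three returned values mirror the shape of the documented result: a vertex set, an edge set, and a map of distances from the source.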

approximate_vertex_betweenness_centrality(graph, seeds, bc='approx_betweenness')

Parameters

graph (PgxGraph) – Input graph
seeds (Union[VertexSet, int]) – The (unique) chosen nodes to be used to compute the approximated betweenness centrality coefficients
bc (Union[VertexProperty, str]) – Vertex property holding the betweenness centrality value for each vertex

Returns

Vertex property holding the computed scores

Example

graph = ....
betweenness = analyst.approximate_vertex_betweenness_centrality(
    graph, 100, 'bc')
result_set = graph.query_pgql(
    "SELECT x, x.bc MATCH (x) ORDER BY x.bc DESC")
result_set.print()

Return type

VertexProperty

bipartite_check(graph, is_left='is_left')

Verify whether a graph is bipartite.

Parameters

graph (PgxGraph) – Input graph
is_left (Union[VertexProperty, str]) – (out-argument) vertex property holding the side of each vertex in a bipartite graph (true for left, false for right).

Returns

vertex property holding the side of each vertex in a bipartite graph (true for left, false for right).

Example

graph = ....
is_left = graph.get_or_create_vertex_property(
    "boolean", "is_left")
bipartite = analyst.bipartite_check(graph, is_left)
result_set = graph.query_pgql(
    "SELECT x, x.is_left MATCH (x) ORDER BY x.is_left DESC")
result_set.print()

Return type

VertexProperty

center(graph, center=None)

Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.

The center comprises the set of vertices whose eccentricity is equal to the radius of the graph.

Parameters

graph (PgxGraph) – Input graph
center (Optional[Union[VertexSet, str]]) – (Out argument) vertex set holding the vertices from the center of the graph

Example

graph = ....
vertices = analyst.center(graph)
for vertex in vertices:
    print(vertex)

Return type

VertexSet
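To make the definition concrete, here is a small pure-Python sketch, independent of PGX, that computes eccentricities by BFS and then selects the vertices whose eccentricity equals the radius. It assumes a connected, undirected graph given as an adjacency dictionary; the function names are hypothetical.

```python
from collections import deque


def eccentricities(adj):
    """Eccentricity of each vertex: its largest hop distance to any other vertex."""
    ecc = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        ecc[source] = max(dist.values())
    return ecc


def graph_center(adj):
    """Vertices whose eccentricity equals the radius (the minimum eccentricity)."""
    ecc = eccentricities(adj)
    radius = min(ecc.values())
    return {v for v, e in ecc.items() if e == radius}
```

On a path a-b-c-d-e the eccentricities are 4, 3, 2, 3, 4, so the radius is 2 and the center is the single middle vertex.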

close()

Destroy without waiting for completion.

Return type: None

closeness_centrality(graph, cc='closeness')

Parameters

graph (PgxGraph) – Input graph
cc (Union[VertexProperty, str]) – Vertex property holding the closeness centrality

Example

graph = ....
closeness = analyst.closeness_centrality(graph, "closeness")
result_set = graph.query_pgql(
    "SELECT x, x.closeness MATCH (x) ORDER BY x.closeness DESC")
result_set.print()

Return type

VertexProperty

communities_conductance_minimization(graph, max_iter=100, label='conductance_minimization')

The Soman and Narang algorithm finds communities in a graph, taking weighted edges into account.

Parameters

graph (PgxGraph) – Input graph
max_iter (int) – Maximum number of iterations that will be performed
label (Union[VertexProperty, str]) – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

Example

graph = ....
communities = analyst.communities_conductance_minimization(
    graph, 100, 'conductance_min')
result_set = graph.query_pgql(
    "SELECT x, x.conductance_min MATCH (x) "
    "ORDER BY x.conductance_min DESC")
result_set.print()

Return type

PgxPartition

communities_infomap(graph, rank, weight, tau=0.15, tol=0.0001, max_iter=100, label='infomap')

Infomap can find high quality communities in a graph.

Parameters

graph (PgxGraph) – Input graph
rank (VertexProperty) – Vertex property holding the normalized PageRank value for each vertex
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
tau (float) – Damping factor
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter (int) – Maximum iteration number
label (Union[VertexProperty, str]) – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

Example

graph = ....
cost = graph.get_or_create_edge_property("double", "cost")
pagerank = analyst.weighted_pagerank(
    graph, cost, norm=False, rank="weighted_pagerank")
partition = analyst.communities_infomap(graph, pagerank, cost)
first_component = partition.get_partition_by_index(0)
for vertex in first_component:
    print(vertex)

Return type

PgxPartition

communities_label_propagation(graph, max_iter=100, label='label_propagation')

Label propagation can find communities in a graph relatively fast.

Parameters

graph (PgxGraph) – Input graph
max_iter (int) – Maximum number of iterations that will be performed
label (Union[VertexProperty, str]) – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

Example

graph = ....
conductance = analyst.communities_label_propagation(
    graph, 100, 'label_propagation')
result_set = graph.query_pgql(
    "SELECT x, x.label_propagation MATCH (x) "
    "ORDER BY x.label_propagation DESC")
result_set.print()

Return type

PgxPartition

compute_high_degree_vertices(graph, k, high_degree_vertex_mapping=None, high_degree_vertices=None)

Compute the k vertices with the highest degrees in the graph.

Parameters

graph (PgxGraph) – Input graph
k (int) – Number of high-degree vertices to be computed
high_degree_vertex_mapping (Optional[Union[PgxMap, str]]) – (out argument) map with the top k high-degree vertices and their indices
high_degree_vertices (Optional[Union[VertexSet, str]]) – (out argument) the high-degree vertices

Returns

a map with the top k high-degree vertices and their indices and a vertex set containing the same vertices

Return type

Tuple[PgxMap, VertexSet]
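As an illustration of what the two results contain, the sketch below selects the k highest-degree vertices from a plain edge list. It is not the PGX implementation. The helper name is hypothetical, degree is counted as in-degree plus out-degree, ties are broken by vertex ID, and the map is assumed to go from index to vertex; the library may make different choices on each point.

```python
from collections import Counter


def top_k_by_degree(edges, k):
    """Return (index -> vertex map, vertex set) for the k highest-degree vertices."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Highest degree first; ties broken by the smaller vertex ID.
    top = sorted(degree, key=lambda v: (-degree[v], v))[:k]
    mapping = {index: vertex for index, vertex in enumerate(top)}
    return mapping, set(top)
```

The same pair of results is what create_distance_index and the limited shortest path functions expect as their high_degree_vertex_mapping and high_degree_vertices arguments.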

conductance(graph, partition, partition_idx)

Conductance assesses the quality of a partition in a graph.

Parameters

graph (PgxGraph) – Input graph
partition (PgxPartition) – Partition of the graph with the corresponding node collections
partition_idx (int) – Number of the component to be used for computing its conductance

Example

graph = ....
partition = analyst.communities_conductance_minimization(graph)
conductance = analyst.conductance(graph, partition, 0)
print(conductance)

Return type

float
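For intuition, one common definition of conductance divides the number of edges leaving a community by the smaller of the two degree volumes on either side of the cut. The sketch below implements that definition on an undirected edge list; it is an assumption that PGX uses the same formula, and the helper name is hypothetical.

```python
def set_conductance(edges, community):
    """Cut size divided by the smaller degree volume of the two sides."""
    community = set(community)
    cut = volume_inside = volume_outside = 0
    for u, v in edges:
        u_in, v_in = u in community, v in community
        if u_in != v_in:
            cut += 1  # edge crosses the community boundary
        # Each endpoint contributes one unit of degree to its own side.
        volume_inside += u_in + v_in
        volume_outside += (not u_in) + (not v_in)
    smaller = min(volume_inside, volume_outside)
    return cut / smaller if smaller else 0.0
```

Lower values indicate a better-separated community: two triangles joined by a single edge give a conductance of 1/7 for either triangle.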

count_triangles(graph, sort_vertices_by_degree)

Triangle counting gives an overview of the amount of connections between vertices in neighborhoods.

Parameters

graph (PgxGraph) – Input graph
sort_vertices_by_degree (bool) – Boolean flag for sorting the nodes by their degree as preprocessing step

Returns

The total number of triangles found

Example

graph = ....
result = analyst.count_triangles(graph, sort_vertices_by_degree=True)
print(result)

Return type

int
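The effect of the degree-sorting preprocessing step can be sketched in plain Python. The version below is an illustration, not the PGX code: it ranks vertices (by degree when the flag is set), orients every edge from lower to higher rank, and counts each triangle once by intersecting the oriented neighbor sets. It assumes an undirected simple graph with comparable vertex IDs.

```python
def triangle_count(edges, sort_vertices_by_degree=True):
    """Count triangles in an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    if sort_vertices_by_degree:
        order = sorted(adj, key=lambda v: (len(adj[v]), v))
    else:
        order = sorted(adj)
    rank = {v: i for i, v in enumerate(order)}
    # Keep only neighbors with a higher rank, so each edge is seen once.
    higher = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}
    return sum(len(higher[u] & higher[v]) for u in adj for v in higher[u])
```

Both settings of the flag return the same count; ordering by degree only keeps the oriented neighbor sets of high-degree vertices small, which is why it is offered as a preprocessing option.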

create_distance_index(graph, high_degree_vertex_mapping, high_degree_vertices, index=None)

Compute an index with distances to each high-degree vertex

Parameters

graph (PgxGraph) – Input graph
high_degree_vertex_mapping (PgxMap) – a map with the top k high-degree vertices and their indices
high_degree_vertices (VertexSet) – the high-degree vertices
index (Optional[Union[VertexProperty, str]]) – (out-argument) the index containing the distances to each high-degree vertex for all vertices

Returns

the index containing the distances to each high-degree vertex for all vertices

Return type

VertexProperty

deepwalk_builder(min_word_frequency=1, batch_size=128, num_epochs=2, layer_size=200, learning_rate=0.025, min_learning_rate=0.0001, window_size=5, walk_length=5, walks_per_vertex=4, sample_rate=0.0, negative_sample=10, *, seed=None, ignore_isolated=True)

Build a DeepWalk model and return it.

Parameters

min_word_frequency (int) – Minimum word frequency to consider before pruning
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
layer_size (int) – Number of dimensions for the output vectors
learning_rate (float) – Initial learning rate
min_learning_rate (float) – Minimum learning rate
window_size (int) – Window size to consider while training the model
walk_length (int) – Length of the walks
walks_per_vertex (int) – Number of walks to consider per vertex
sample_rate (float) – Sample rate
negative_sample (int) – Number of negative samples
seed (Optional[int]) – Random seed for training the model
ignore_isolated (bool) – Whether to ignore isolated vertices. If false, pseudo-walks consisting of only the node itself will be inserted into the dataset.

Returns

Built DeepWalk model

Return type

DeepWalkModel

Changed in version 23.4: The ignore_isolated parameter has been added. The validation_fraction parameter has been removed.
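The roles of walk_length, walks_per_vertex, seed and ignore_isolated can be illustrated with a small random-walk generator. This is a hedged sketch in plain Python, not the PGX training pipeline: the function name is hypothetical, walk_length is taken to be the number of vertices in a walk, and one pseudo-walk per isolated vertex is an assumption.

```python
import random


def generate_walks(adj, walk_length=5, walks_per_vertex=4,
                   seed=None, ignore_isolated=True):
    """Generate uniform random walks over an undirected adjacency dictionary."""
    rng = random.Random(seed)  # a fixed seed makes the walks reproducible
    walks = []
    for start in adj:
        if not adj[start]:
            # Isolated vertex: skip it, or emit a walk holding only the vertex.
            if not ignore_isolated:
                walks.append([start])
            continue
        for _ in range(walks_per_vertex):
            walk = [start]
            while len(walk) < walk_length and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks
```

In DeepWalk-style training, walks such as these are treated like sentences whose words are vertex IDs, which is where parameters such as window_size and min_word_frequency come from.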

degree_centrality(graph, dc='degree')

Measure the centrality of the vertices based on their degree.

This lets you see how a vertex influences its neighborhood.

Parameters

graph (PgxGraph) – Input graph
dc (Union[VertexProperty, str]) – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

Example

graph = ....
degree = analyst.degree_centrality(graph, 'degree')
result_set = graph.query_pgql(
    "SELECT x, x.degree MATCH (x) ORDER BY x.degree DESC")
result_set.print()

Return type

VertexProperty

destroy()

Destroy with waiting for completion.

Return type: None

diameter(graph, eccentricity='eccentricity')

Diameter/radius gives an overview of the distances in a graph.

Parameters

graph (PgxGraph) – Input graph
eccentricity (Union[VertexProperty, str]) – (Out argument) vertex property holding the eccentricity value for each vertex

Returns

Pair holding the diameter of the graph and a node property with eccentricity value for each node

Example

graph = ....
diameter = analyst.diameter(graph, 'eccentricity')
result_set = graph.query_pgql(
    "SELECT x, x.eccentricity MATCH (x) "
    "ORDER BY x.eccentricity DESC")
result_set.print()

Return type

Tuple[int, VertexProperty]

eigenvector_centrality(graph, tol=0.001, max_iter=100, l2_norm=False, in_edges=False, ec='eigenvector')

Eigenvector centrality computes the centrality of the vertices in an intricate way using their neighbors, making it possible to find well-connected vertices.

Parameters

graph (PgxGraph) – Input graph
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter (int) – Maximum iteration number
l2_norm (bool) – Boolean flag to determine whether the algorithm will use the l2 norm (Euclidean norm) or the l1 norm (absolute value) to normalize the centrality scores
in_edges (bool) – Boolean flag to determine whether the algorithm will use the incoming or the outgoing edges in the graph for the computations
ec (Union[VertexProperty, str]) – Vertex property holding the resulting score for each vertex

Returns

Vertex property holding the computed scores

Example

graph = ....
eigenvector = analyst.eigenvector_centrality(
    graph, ec='eigenvector')
result_set = graph.query_pgql(
    "SELECT x, x.eigenvector MATCH (x) "
    "ORDER BY x.eigenvector DESC")
result_set.print()

Return type

VertexProperty

enumerate_simple_paths(graph, src, dst, k, vertices_on_path, edges_on_path, dist)

Enumerate simple paths between the source and destination vertex.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – The source vertex
dst (PgxVertex) – The destination vertex
k (int) – maximum number of iterations
vertices_on_path (VertexSet) – VertexSet containing all vertices to be considered while enumerating paths
edges_on_path (EdgeSet) – EdgeSet containing all edges to be considered while enumerating paths
dist (PgxMap) – map containing the hop-distance from the source vertex to each vertex that is to be considered while enumerating the paths

Returns

Triple containing the path lengths, a vertex sequence containing the vertices on the paths and an edge sequence containing the edges on the paths

Return type

Tuple[List[int], VertexSet, EdgeSet]
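What enumerating simple paths means can be shown with a short depth-first search. The sketch below is plain Python and independent of PGX; it interprets k as the maximum number of hops, which is an assumption, and it does not use the vertices_on_path, edges_on_path or dist inputs that the library function takes to prune the search.

```python
def simple_paths(adj, src, dst, k):
    """List all simple paths from src to dst that use at most k edges."""
    paths = []

    def extend(path, visited):
        current = path[-1]
        if current == dst:
            paths.append(list(path))
            return
        if len(path) - 1 == k:  # hop budget exhausted
            return
        for nxt in adj.get(current, ()):
            if nxt not in visited:  # a simple path never revisits a vertex
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    extend([src], {src})
    return paths
```

In the library, the three pruning inputs would typically come from all_reachable_vertices_edges, which returns a vertex set, an edge set and a distance map of matching types.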

fattest_path(graph, root, capacity, distance='fattest_path_distance', parent='fattest_path_parent', parent_edge='fattest_path_parent_edge', ignore_edge_direction=False)

Fattest path is a fast algorithm for finding a shortest path that adds constraints for flow-related matters.

Parameters

graph (PgxGraph) – Input graph
root (PgxVertex) – The source vertex from which the fattest paths are computed
capacity (EdgeProperty) – Edge property holding the capacity of each edge in the graph
distance (Union[VertexProperty, str]) – Vertex property holding the capacity value of the fattest path up to the current vertex
parent (Union[VertexProperty, str]) – Vertex property holding the parent vertex of each vertex in the fattest path
parent_edge (Union[VertexProperty, str]) – Vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges during the search

Returns

AllPaths object holding the information of the possible fattest paths from the source node

Example

graph = ....
root = graph.get_vertex("1")
cost = graph.get_or_create_edge_property("double", "cost")
fattest_path = analyst.fattest_path(graph, root, cost)
print(fattest_path)

Return type

AllPaths

filtered_bfs(graph, root, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')

Breadth-first search with an option to filter edges during the traversal of the graph.

Parameters

graph (PgxGraph) – Input graph
root (PgxVertex) – The source vertex from the graph for the path.
navigator (VertexFilter) – Navigator expression to be evaluated on the vertices during the graph traversal
init_with_inf (bool) – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.
max_depth (int) – Maximum depth limit for the BFS traversal
distance (Union[VertexProperty, str]) – Vertex property holding the hop distance for each reachable vertex in the graph
parent (Union[VertexProperty, str]) – Vertex property holding the parent vertex of each reachable vertex in the path

Returns

Distance and parent vertex properties

Example

graph = ....
cost = graph.get_or_create_vertex_property("double", "cost")
vertex = graph.get_vertex("1")
navigator = VertexFilter("vertex.cost < 2")
bfs = analyst.filtered_bfs(
    graph, vertex, navigator, distance="distance")
result_set = graph.query_pgql(
    "SELECT x, x.distance MATCH (x) ORDER BY x.distance DESC")
result_set.print()

Return type

Tuple[VertexProperty, VertexProperty]
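The interaction of navigator, init_with_inf and max_depth can be sketched in plain Python. This is an illustration under stated assumptions, not the PGX implementation: the navigator is modeled as a predicate on vertices, a vertex that fails the predicate is never visited, and the root is always visited. The function name and the adjacency dictionary are hypothetical.

```python
import math
from collections import deque


def bfs_with_navigator(adj, root, navigator, init_with_inf=True,
                       max_depth=2**31 - 1):
    """BFS that only visits vertices accepted by the navigator predicate."""
    unreached = math.inf if init_with_inf else -1
    distance = {v: unreached for v in adj}
    parent = {v: None for v in adj}
    distance[root] = 0
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if distance[u] >= max_depth:
            continue  # do not expand beyond the depth limit
        for v in adj[u]:
            if distance[v] == unreached and navigator(v):
                distance[v] = distance[u] + 1
                parent[v] = u
                queue.append(v)
    return distance, parent
```

Vertices that are filtered out or lie beyond max_depth keep their initial distance, which is why the choice between INF and -1 matters when the result is queried afterwards.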

filtered_dfs(graph, root, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')

Depth-first search with an option to filter edges during the traversal of the graph.

Parameters

graph (PgxGraph) – Input graph
root (PgxVertex) – The source vertex from the graph for the path
navigator (VertexFilter) – Navigator expression to be evaluated on the vertices during the graph traversal
init_with_inf (bool) – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.
max_depth (int) – Maximum search depth
distance (Union[VertexProperty, str]) – Vertex property holding the hop distance for each reachable vertex in the graph
parent (Union[VertexProperty, str]) – Vertex property holding the parent vertex of each reachable vertex in the path

Returns

Distance and parent vertex properties

Example

graph = ....
cost = graph.get_or_create_vertex_property("double", "cost")
vertex = graph.get_vertex("1")
navigator = VertexFilter("vertex.cost < 2")
dfs = analyst.filtered_dfs(
    graph, vertex, navigator, distance="distance")
result_set = graph.query_pgql(
    "SELECT x, x.distance MATCH (x) ORDER BY x.distance DESC")
result_set.print()

Return type

Tuple[VertexProperty, VertexProperty]

find_cycle(graph, src=None, vertex_seq=None, edge_seq=None)

Find cycle looks for any loop in the graph.

Parameters

graph (PgxGraph) – Input graph
src (Optional[PgxVertex]) – Source vertex for the search
vertex_seq (Optional[Union[VertexSequence, str]]) – (Out argument) vertex sequence holding the vertices in the cycle
edge_seq (Optional[Union[EdgeSequence, str]]) – (Out argument) edge sequence holding the edges in the cycle

Returns

PgxPath representing the cycle as a path, if one exists.

Example

graph = ....
cycle = analyst.find_cycle(graph)
print(cycle)

Return type

PgxPath

get_deepwalk_model_loader()

Return a ModelLoader that can be used for loading a DeepWalkModel.

Returns: ModelLoader
Return type: ModelLoader

get_pg2vec_model_loader()

Return a ModelLoader that can be used for loading a Pg2vecModel.

Returns: ModelLoader
Return type: ModelLoader

get_supervised_edgewise_model_loader()

Return a ModelLoader that can be used for loading a SupervisedEdgeWiseModel.

Returns: ModelLoader
Return type: ModelLoader

get_supervised_graphwise_model_loader()

Return a ModelLoader that can be used for loading a SupervisedGraphWiseModel.

Returns: ModelLoader
Return type: ModelLoader

get_unsupervised_anomaly_detection_graphwise_loader()

Return a ModelLoader that can be used for loading an UnsupervisedAnomalyDetectionGraphWiseModel.

Returns: ModelLoader
Return type: ModelLoader

get_unsupervised_edgewise_model_loader()

Return a ModelLoader that can be used for loading an UnsupervisedEdgeWiseModel.

Returns: ModelLoader
Return type: ModelLoader

get_unsupervised_graphwise_model_loader()

Return a ModelLoader that can be used for loading an UnsupervisedGraphWiseModel.

Returns: ModelLoader
Return type: ModelLoader

graphwise_attention_layer_config(num_sampled_neighbors=10, num_heads=3, head_aggregation='mean', activation_fn='leaky_relu', weight_init_scheme='xavier_uniform', vertex_to_vertex_connection=None, edge_to_vertex_connection=None, vertex_to_edge_connection=None, edge_to_edge_connection=None, dropout_rate=0.0)

Build a GraphWise attention layer configuration and return it.

Parameters

num_sampled_neighbors (int) – Number of neighbors to sample
num_heads (int) – Number of heads
activation_fn (str) – Activation function. Supported functions: relu, leaky_relu, tanh, linear. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g softmax or sigmoid.
weight_init_scheme (str) – Initialization scheme for the weights in the layer. Supported schemes: xavier, xavier_uniform, ones, zeros. Note that biases are always initialized with zeros.
vertex_to_vertex_connection (Optional[bool]) – Use the connection between vertices to vertices. Should be used only on heterogeneous graphs
edge_to_vertex_connection (Optional[bool]) – Use the connection between edges to vertices. Should be used only on heterogeneous graphs
vertex_to_edge_connection (Optional[bool]) – Use the connection between vertices to edges. Should be used only on heterogeneous graphs
edge_to_edge_connection (Optional[bool]) – Use the connection between edges to edges. Should be used only on heterogeneous graphs
dropout_rate (float) – Probability to drop each neuron
head_aggregation (str) –

Returns

Built GraphWiseAttentionLayerConfig

Return type

GraphWiseAttentionLayerConfig

graphwise_conv_layer_config(num_sampled_neighbors=10, neighbor_weight_property_name=None, activation_fn='relu', weight_init_scheme='xavier_uniform', vertex_to_vertex_connection=None, edge_to_vertex_connection=None, vertex_to_edge_connection=None, edge_to_edge_connection=None, dropout_rate=0.0)

Build a GraphWise conv layer configuration and return it.

Parameters

num_sampled_neighbors (int) – Number of neighbors to sample
neighbor_weight_property_name (Optional[str]) – Neighbor weight property name.
activation_fn (str) – Activation function. Supported functions: relu, leaky_relu, tanh, linear. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g softmax or sigmoid.
weight_init_scheme (str) – Initialization scheme for the weights in the layer. Supported schemes: xavier, xavier_uniform, ones, zeros. Note that biases are always initialized with zeros.
vertex_to_vertex_connection (Optional[bool]) – Use the connection between vertices to vertices. Should be used only on heterogeneous graphs
edge_to_vertex_connection (Optional[bool]) – Use the connection between edges to vertices. Should be used only on heterogeneous graphs
vertex_to_edge_connection (Optional[bool]) – Use the connection between vertices to edges. Should be used only on heterogeneous graphs
edge_to_edge_connection (Optional[bool]) – Use the connection between edges to edges. Should be used only on heterogeneous graphs
dropout_rate (float) – Probability to drop each neuron

Returns

Built GraphWiseConvLayerConfig

Return type

GraphWiseConvLayerConfig

graphwise_dgi_layer_config(corruption_function=None, readout_function='mean', discriminator='bilinear')

Build a GraphWise DGI layer configuration and return it.

Parameters

corruption_function (Optional[CorruptionFunction]) – Corruption function to use
readout_function (str) – Readout function. Supported functions: mean
discriminator (str) – Discriminator function. Supported functions: bilinear

Returns

GraphWiseDgiLayerConfig object

Return type

GraphWiseDgiLayerConfig

graphwise_dominant_layer_config(alpha=0.5, decoder_layer_config=None)

Build a GraphWise Dominant layer configuration and return it.

Parameters

alpha (float) – Alpha parameter to balance feature reconstruction weight
decoder_layer_config (Optional[Iterable[GraphWisePredictionLayerConfig]]) – Decoder layer configuration as list of PredLayerConfig, or default if None

Returns

GraphWiseDominantLayerConfig object

graphwise_pred_layer_config(hidden_dim=None, activation_fn='relu', weight_init_scheme='xavier_uniform', dropout_rate=0.0)

Build a GraphWise prediction layer configuration and return it.

Parameters

hidden_dim (Optional[int]) – Hidden dimension. If this is the last layer, this setting will be ignored and replaced by the number of classes.
activation_fn (str) – Activation function. Supported functions: relu, leaky_relu, tanh, linear. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g softmax or sigmoid.
weight_init_scheme (str) – Initialization scheme for the weights in the layer. Supported schemes: xavier, xavier_uniform, ones, zeros. Note that biases are always initialized with zeros.
dropout_rate (float) – probability to drop each neuron.

Returns

Built GraphWisePredictionLayerConfig

Return type

GraphWisePredictionLayerConfig

hits(graph, max_iter=100, auth='authorities', hubs='hubs')

Hyperlink-Induced Topic Search (HITS) assigns ranking scores to the vertices, aimed to assess the quality of information and references in linked structures.

Parameters

graph (PgxGraph) – Input graph
max_iter (int) – Number of iterations that will be performed
auth (Union[VertexProperty, str]) – Vertex property holding the authority score for each vertex
hubs (Union[VertexProperty, str]) – Vertex property holding the hub score for each vertex

Returns

Two vertex properties holding the computed scores

Example

graph = ....
hits = analyst.hits(graph, auth='authorities', hubs='hubs')
result_set = graph.query_pgql(
    "SELECT x, x.authorities, x.hubs "
    "MATCH (x) ORDER BY x.authorities DESC")
result_set.print()

Return type

Tuple[VertexProperty, VertexProperty]
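The mutual reinforcement between hub and authority scores can be sketched as a simple power iteration. The code below is plain Python and is not the PGX implementation; the helper names are hypothetical, and normalizing both score vectors by their Euclidean norm after every iteration is an assumption about a detail this page does not specify.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean norm (all zeros if the norm is zero)."""
    norm = math.sqrt(sum(s * s for s in scores.values()))
    return {v: (s / norm if norm else 0.0) for v, s in scores.items()}


def hits_scores(edges, max_iter=100):
    """Return (authority, hub) score dictionaries for a directed edge list."""
    vertices = {v for edge in edges for v in edge}
    auth = {v: 1.0 for v in vertices}
    hubs = {v: 1.0 for v in vertices}
    for _ in range(max_iter):
        # Authority: sum of hub scores of the vertices pointing at you.
        new_auth = {v: 0.0 for v in vertices}
        for u, v in edges:
            new_auth[v] += hubs[u]
        # Hub: sum of authority scores of the vertices you point at.
        new_hubs = {v: 0.0 for v in vertices}
        for u, v in edges:
            new_hubs[u] += new_auth[v]
        auth, hubs = _normalize(new_auth), _normalize(new_hubs)
    return auth, hubs
```

With two vertices pointing at a third, the target receives all of the authority score and the two sources share the hub score equally.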

in_degree_centrality(graph, dc='in_degree')

Measure the in-degree centrality of the vertices based on their in-degree.

This lets you see how a vertex influences its neighborhood.

Parameters

graph (PgxGraph) – Input graph
dc (Union[VertexProperty, str]) – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

Example

graph = ....
degree = analyst.in_degree_centrality(
    graph, dc='in_deg_centrality')
result_set = graph.query_pgql(
    "SELECT x, x.in_deg_centrality MATCH (x) "
    "ORDER BY x.in_deg_centrality DESC")
result_set.print()

Return type

VertexProperty

in_degree_distribution(graph, dist_map=None)

Calculate the in-degree distribution.

In-degree distribution gives information about the incoming flows in a graph.

Parameters

graph (PgxGraph) – Input graph
dist_map (Optional[Union[PgxMap, str]]) – (Out argument) map holding a histogram of the vertex degrees in the graph

Returns

Map holding a histogram of the vertex degrees in the graph

Return type

PgxMap

k_core(graph, min_core=0, max_core=2147483647, kcore='kcore')

k-core decomposes a graph into layers revealing subgraphs with particular properties.

Parameters

graph (PgxGraph) – Input graph
min_core (int) – Minimum k-core value
max_core (int) – Maximum k-core value
kcore (Union[VertexProperty, str]) – Vertex property holding the result value

Returns

Pair holding the maximum core found and a node property with the largest k-core value for each node.

Example

graph = ....
kcore = analyst.k_core(graph, kcore='kcore')
result_set = graph.query_pgql(
    "SELECT x, x.kcore MATCH (x) ORDER BY x.kcore DESC")
result_set.print()

Return type

Tuple[int, VertexProperty]
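The layering that k-core produces can be reproduced with the classic peeling procedure: repeatedly remove a vertex of minimum remaining degree and record the largest minimum seen so far as its core number. The sketch below is a simple quadratic-time illustration in plain Python, not the PGX algorithm; it assumes an undirected simple graph and does not model min_core or max_core.

```python
def core_numbers(adj):
    """Return (maximum core, core number per vertex) by peeling."""
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # vertex of minimum remaining degree
        k = max(k, degree[v])               # core values never decrease
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return max(core.values(), default=0), core
```

The result has the same shape as the documented return value: the maximum core found, plus a per-vertex value that would be stored in the kcore property.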

learned_embedding_categorical_property_config(property_name=None, shared=True, max_vocabulary_size=10000, embedding_dim=None, oov_probability=0.0)

Build a learned embedding table configuration for a categorical feature and return it.

Parameters

property_name (Optional[str]) – Name of the feature that the config will apply to
shared (bool) – Whether the feature is treated as shared globally among vertex/edge types or considered as separate features per type
max_vocabulary_size (int) – Maximum vocabulary size for categories. The most frequent categories numbering “max_vocabulary_size” are kept. Category values below this cutoff are not recorded and set to the OOV token.
embedding_dim (Optional[int]) – the dimension of the vectors encoding categories in the embedding table.
oov_probability (float) – the probability to set category values in the input data to the OOV token randomly during training to learn a meaningful OOV embedding. This procedure is disabled during inference.

Returns

Built EmbeddingTableConfig

Return type

EmbeddingTableConfig

limited_shortest_path_hop_dist(graph, src, dst, max_hops, high_degree_vertex_mapping, high_degree_vertices, index, path_vertices=None, path_edges=None)

Compute the shortest path between the source and destination vertex.

The algorithm only considers paths up to a length of max_hops.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – The source vertex
dst (PgxVertex) – The destination vertex
max_hops (int) – The maximum number of edges to follow when trying to find a path
high_degree_vertex_mapping (PgxMap) – Map with the top k high-degree vertices and their indices
high_degree_vertices (VertexSet) – The high-degree vertices
index (VertexProperty) – Index containing distances to high-degree vertices
path_vertices (Optional[Union[VertexSequence, str]]) – (out-argument) will contain the vertices on the found path or will be empty if there is none
path_edges (Optional[Union[EdgeSequence, str]]) – (out-argument) will contain the edges on the found path or will be empty if there is none

Returns

A tuple containing the vertices in the shortest path from src to dst and the edges on the path. Both will be empty if there is no path within max_hops steps

Return type

Tuple[VertexSequence, EdgeSequence]
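The contract of this function, a shortest path bounded by max_hops with empty results when none exists, can be sketched with a depth-limited BFS. The code below is plain Python, not the PGX implementation: it does not use the high-degree vertex index that the library relies on, and the function name and adjacency dictionary are hypothetical.

```python
from collections import deque


def limited_shortest_path(adj, src, dst, max_hops):
    """Return (vertices, edges) of a shortest src->dst path within max_hops."""
    parent = {src: None}
    depth = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        if depth[u] == max_hops:
            continue  # hop budget used up on this branch
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    if dst not in parent:
        return [], []  # no path within max_hops
    # Walk the parent pointers back from dst to rebuild the path.
    vertices = []
    v = dst
    while v is not None:
        vertices.append(v)
        v = parent[v]
    vertices.reverse()
    return vertices, list(zip(vertices, vertices[1:]))
```

A precomputed distance index, as built by create_distance_index, presumably lets the library avoid expanding high-degree vertices during such a search; the plain BFS here only shows the expected inputs and outputs.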

limited_shortest_path_hop_dist_filtered(graph, src, dst, max_hops, high_degree_vertex_mapping, high_degree_vertices, index, filter, path_vertices=None, path_edges=None)

Compute the shortest path between the source and destination vertex.

The algorithm only considers paths up to a length of max_hops.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – The source vertex
dst (PgxVertex) – The destination vertex
max_hops (int) – The maximum number of edges to follow when trying to find a path
high_degree_vertex_mapping (PgxMap) – Map with the top k high-degree vertices and their indices
high_degree_vertices (VertexSet) – The high-degree vertices
index (VertexProperty) – Index containing distances to high-degree vertices
filter (EdgeFilter) – Filter to be evaluated on the edges when searching for a path
path_vertices (Optional[Union[VertexSequence, str]]) – (out-argument) will contain the vertices on the found path or will be empty if there is none
path_edges (Optional[Union[EdgeSequence, str]]) – (out-argument) will contain the edges on the found path or will be empty if there is none

Returns

A tuple containing the vertices in the shortest path from src to dst and the edges on the path. Both will be empty if there is no path within max_hops steps

Return type

Tuple[VertexSequence, EdgeSequence]

load_deepwalk_model(path, key)

Load an encrypted DeepWalk model.

Parameters

path (str) – Path to model
key (Optional[str]) – The decryption key, or None if no encryption was used

Returns

Loaded model

Return type

DeepWalkModel

load_pg2vec_model(path, key)

Load an encrypted pg2vec model.

Parameters

path (str) – Path to model
key (Optional[str]) – The decryption key, or None if no encryption was used

Returns

Loaded model

Return type

Pg2vecModel

load_supervised_edgewise_model(path, key)

Load an encrypted SupervisedEdgeWise model.

Parameters

path (str) – Path to model
key (Optional[str]) – The decryption key, or None if no encryption was used

Returns

Loaded model

Return type

SupervisedEdgeWiseModel

load_supervised_graphwise_model(path, key)

Load an encrypted SupervisedGraphWise model.

Parameters

path (str) – Path to model
key (Optional[str]) – The decryption key, or None if no encryption was used

Returns

Loaded model

Return type

SupervisedGraphWiseModel

load_unsupervised_edgewise_model(path, key)

Load an encrypted UnsupervisedEdgeWise model.

Parameters

path (str) – Path to model
key (Optional[str]) – The decryption key, or None if no encryption was used

Returns

Loaded model

Return type

UnsupervisedEdgeWiseModel

load_unsupervised_graphwise_model(path, key)

Load an encrypted UnsupervisedGraphWise model.

Parameters

path (str) – Path to model
key (Optional[str]) – The decryption key, or None if no encryption was used

Returns

Loaded model

Return type

UnsupervisedGraphWiseModel

local_clustering_coefficient(graph, lcc='lcc', ignore_edge_direction=False)

LCC gives information about potential clustering options in a graph.

Parameters

graph (PgxGraph) – Input graph
lcc (Union[VertexProperty, str]) – Vertex property holding the lcc value for each vertex
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges during the search

Returns

Vertex property holding the lcc value for each vertex

Example

graph = ....
lcc = analyst.local_clustering_coefficient(
    graph, lcc='lcc', ignore_edge_direction=True)
result_set = graph.query_pgql(
    "SELECT x, x.lcc MATCH (x) ORDER BY x.lcc DESC")
result_set.print()
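To make the quantity concrete, here is a minimal pure-Python sketch of the local clustering coefficient on an undirected adjacency map. It is an illustration under stated assumptions, not the PGX implementation: the function and the `adj` structure are hypothetical, and edge direction is ignored throughout.

```python
from itertools import combinations


def local_clustering_coefficient(adj):
    """adj maps each vertex to the set of its neighbors (undirected)."""
    lcc = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            # Fewer than two neighbors: no neighbor pair can exist.
            lcc[v] = 0.0
            continue
        # Count neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        lcc[v] = 2.0 * links / (k * (k - 1))
    return lcc


# Triangle 1-2-3 plus a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
scores = local_clustering_coefficient(adj)
print(scores)
```

Vertices inside the triangle score higher than vertex 3, whose extra neighbor 4 is not linked to the other two.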

Return type

louvain(graph, weight, max_iter=100, nbr_pass=1, tol=0.0001, community='community')

Louvain detects communities in a graph.

Parameters

graph (PgxGraph) – Input graph.
weight (EdgeProperty) – Weights of the edges of the graph.
max_iter (int) – Maximum number of iterations that will be performed during each pass.
nbr_pass (int) – Number of passes that will be performed.
tol (float) – Maximum tolerated error value. The algorithm will stop once the graph’s total modularity gain becomes smaller than this value.
community (Union[VertexProperty, str]) – Vertex property holding the community ID assigned to each vertex

Returns

Community IDs vertex property

Example

graph = ....
cost = graph.get_or_create_edge_property("double", "cost")
partition = analyst.louvain(
    graph, cost, community="community")
result_set = graph.query_pgql(
    "SELECT x, x.community MATCH (x) ORDER BY x.community DESC")
result_set.print()

Return type

VertexProperty

matrix_factorization_gradient_descent(bipartite_graph, weight, learning_rate=0.001, change_per_step=1.0, lbd=0.15, max_iter=100, vector_length=10, features='features')

Note

The input graph must be bipartite; you can verify this with analyst.bipartite_check.

Parameters

bipartite_graph (BipartiteGraph) – Bipartite input graph
learning_rate (float) – Learning rate for the optimization process
change_per_step (float) – Parameter used to modulate the learning rate during the optimization process
lbd (float) – Penalization parameter to avoid over-fitting during the optimization process
max_iter (int) – Maximum number of iterations that will be performed
vector_length (int) – Size of the feature vectors to be generated for the factorization
features (Union[VertexProperty, str]) – Vertex property holding the generated feature vectors for each vertex. This function accepts names and VertexProperty objects.
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph. If the weights are not between 1 and 5, the result will become inaccurate.

Returns

Matrix factorization model holding the feature vectors found by the algorithm

Example

graph = ....
cost = graph.get_or_create_edge_property("double", "cost")
model = analyst.matrix_factorization_gradient_descent(graph, cost)
v = graph.get_vertex("1")
print(model.get_estimated_ratings(v))

Return type

MatrixFactorizationModel

matrix_factorization_recommendations(bipartite_graph, user, vector_length, feature, estimated_rating=None)

Complement for Matrix Factorization.

The generated feature vectors are used to make predictions for items that the given user vertex is not yet related to. As with the recommendations from matrix factorization, this algorithm computes the dot product between the feature vector of the given user vertex and that of every other vertex in the graph. Items that are already related to the user, and all other user vertices, receive a score of 0, so only the unrelated item vertices keep their dot-product results. These scores can be interpreted as the predicted ratings of the unrelated items for the given user vertex.
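The scoring rule described above can be sketched in a few lines of plain Python. This is only an illustration: the `recommend` helper, the `features` dictionary and the vertex names are hypothetical and do not reflect how PGX stores feature vectors.

```python
def recommend(features, user, items, related):
    """Score every item for `user` by the dot product of feature vectors.

    Items in `related` (already linked to the user) get a score of 0.
    """
    user_vec = features[user]
    scores = {}
    for item in items:
        if item in related:
            scores[item] = 0.0
        else:
            scores[item] = sum(a * b for a, b in zip(user_vec, features[item]))
    return scores


# Hypothetical 2-dimensional feature vectors.
features = {
    "u1": [1.0, 0.5],
    "i1": [2.0, 0.0],
    "i2": [0.0, 4.0],
    "i3": [1.0, 1.0],
}
scores = recommend(features, "u1", ["i1", "i2", "i3"], related={"i1"})
print(scores)
```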

Parameters

bipartite_graph (BipartiteGraph) – Bipartite input graph
user (PgxVertex) – Vertex from the left (user) side of the graph
vector_length (int) – size of the feature vectors
feature (VertexProperty) – vertex property holding the feature vectors for each vertex
estimated_rating (Optional[Union[VertexProperty, str]]) – (out argument) vertex property holding the estimated rating score for each vertex

Returns

vertex property holding the estimated rating score for each vertex

Return type

model_repository()

Get model repository builder for CRUD access to model stores.

Return type

ModelRepositoryBuilder

one_hot_encoding_categorical_property_config(property_name=None, shared=True, max_vocabulary_size=10000)

Build a one-hot encoding configuration for a categorical feature and return it.

Parameters

max_vocabulary_size (int) – Maximum vocabulary size for categories. Only the max_vocabulary_size most frequent categories are kept; less frequent category values are not recorded and are set to the OOV token.
property_name (Optional[str]) – Name of the feature that the config will apply to
shared (bool) – Whether the feature is treated as shared globally among vertex/edge types or considered as separate features per type

Returns

Built OneHotEncodingConfig

Return type

OneHotEncodingConfig
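The effect of a vocabulary cutoff can be illustrated with a small standalone sketch. It is not the PGX encoder: the helper names are hypothetical, and reserving the last vector slot for out-of-vocabulary (OOV) values is an assumption made for the example.

```python
from collections import Counter


def build_vocabulary(values, max_vocabulary_size):
    """Keep only the most frequent categories, up to the given size."""
    counts = Counter(values)
    return [cat for cat, _ in counts.most_common(max_vocabulary_size)]


def one_hot(value, vocabulary):
    """Encode a value; the last slot is reserved for OOV values (assumption)."""
    vec = [0] * (len(vocabulary) + 1)
    idx = vocabulary.index(value) if value in vocabulary else len(vocabulary)
    vec[idx] = 1
    return vec


values = ["a", "a", "a", "b", "b", "c"]
vocabulary = build_vocabulary(values, max_vocabulary_size=2)
print(vocabulary)
print(one_hot("b", vocabulary))
print(one_hot("c", vocabulary))
```

With a cutoff of 2, the rare category "c" falls outside the vocabulary and shares the OOV slot with any other unseen value.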

out_degree_centrality(graph, dc='out_degree')

Measure the out-degree centrality of each vertex based on its out-degree.

This lets you see how a vertex influences its neighborhood.

Parameters

graph (PgxGraph) – Input graph
dc (Union[VertexProperty, str]) – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

Example

graph = ....
degree = analyst.out_degree_centrality(
    graph, dc='out_deg_centrality')
result_set = graph.query_pgql(
    "SELECT x, x.out_deg_centrality MATCH (x) "
    "ORDER BY x.out_deg_centrality DESC")
result_set.print()

Return type

out_degree_distribution(graph, dist_map=None)

Parameters

graph (PgxGraph) – Input graph
dist_map (Optional[Union[PgxMap, str]]) – (Out argument) map holding a histogram of the vertex degrees in the graph

Returns

Map holding a histogram of the vertex degrees in the graph

Example

graph = ....
distribution = graph.create_map('integer', 'long')
analyst.out_degree_distribution(graph, distribution)
for d in distribution:
    print(d)
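For intuition, the histogram that this function fills can be reproduced on a plain adjacency map. The sketch below is a pure-Python illustration with hypothetical names; it does not use PgxMap.

```python
from collections import Counter


def out_degree_distribution(out_edges):
    """Map each out-degree to the number of vertices that have it."""
    return Counter(len(targets) for targets in out_edges.values())


# out_edges maps each vertex to the list of its outgoing neighbors.
out_edges = {1: [2, 3], 2: [3], 3: [], 4: [1]}
histogram = out_degree_distribution(out_edges)
for degree in sorted(histogram):
    print(degree, histogram[degree])
```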

Return type

pagerank(graph, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='pagerank')

Parameters

graph (PgxGraph) – Input graph
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor
max_iter (int) – Maximum number of iterations that will be performed
norm (bool) – Determine whether the algorithm will take into account dangling vertices for the ranking scores.
rank (Union[VertexProperty, str]) – Vertex property holding the PageRank value for each vertex, or name for a new property

Returns

Vertex property holding the PageRank value for each vertex

Example

graph = ....
pagerank = analyst.pagerank(graph, rank='pagerank')
result_set = graph.query_pgql(
    "SELECT x, x.pagerank MATCH (x)"
    " ORDER BY x.pagerank DESC")
result_set.print()
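The roles of tol, damping and max_iter can be seen in a minimal power-iteration sketch. This is a textbook formulation in pure Python, not the PGX implementation; in particular, spreading the rank of dangling vertices evenly is one common convention chosen here as an assumption, and it is not tied to the norm flag.

```python
def pagerank(out_edges, tol=0.001, damping=0.85, max_iter=100):
    """out_edges maps every vertex to the list of its outgoing neighbors."""
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(max_iter):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v, t in out_edges.items() if not t)
        new = {v: (1 - damping) / n + damping * dangling / n for v in out_edges}
        for v, targets in out_edges.items():
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        # Stop once the summed per-vertex change drops below tol.
        error = sum(abs(new[v] - rank[v]) for v in out_edges)
        rank = new
        if error < tol:
            break
    return rank


graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(graph)
print(ranks)
```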

Return type

pagerank_approximate(graph, tol=0.001, damping=0.85, max_iter=100, rank='approx_pagerank')

Parameters

graph (PgxGraph) – Input graph
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor
max_iter (int) – Maximum number of iterations that will be performed
rank (Union[VertexProperty, str]) – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the PageRank value for each vertex

Example

graph = ....
pagerank = analyst.pagerank_approximate(
    graph, rank='approx_pagerank')
result_set = graph.query_pgql(
    "SELECT x, x.approx_pagerank MATCH (x) "
    "ORDER BY x.approx_pagerank DESC")
result_set.print()

Return type

partition_conductance(graph, partition)

Partition conductance assesses the quality of many partitions in a graph.

Parameters

graph (PgxGraph) – Input graph
partition (PgxPartition) – Partition of the graph with the corresponding node collections

Example

graph = ....
partition = analyst.communities_conductance_minimization(graph)
conductance = analyst.partition_conductance(graph, partition)
print(conductance)

Return type

Tuple[float, float]
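As a rough illustration of what is being measured, the sketch below computes the conductance of each part of a partition on an undirected adjacency map and returns two floats. Interpreting the pair as (average, minimum) conductance is an assumption made for this example, and all names are hypothetical.

```python
def conductance(adj, part):
    """Cut size divided by the smaller of the two side volumes."""
    part = set(part)
    cut = sum(1 for u in part for v in adj[u] if v not in part)
    vol_in = sum(len(adj[u]) for u in part)
    vol_out = sum(len(adj[u]) for u in adj if u not in part)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def partition_conductance(adj, partition):
    """Return (average, minimum) conductance over all parts (assumption)."""
    values = [conductance(adj, p) for p in partition]
    return sum(values) / len(values), min(values)


# Two triangles joined by the single edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
avg_c, min_c = partition_conductance(adj, [{1, 2, 3}, {4, 5, 6}])
print(avg_c, min_c)
```

A low value indicates that few edges leave each part relative to the edges it contains.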

partition_modularity(graph, partition)

Modularity summarizes information about the quality of components in a graph.

Parameters

graph (PgxGraph) – Input graph
partition (PgxPartition) – Partition of the graph with the corresponding node collections

Returns

Scalar (double) holding the modularity value of the given partition

Example

graph = ....
partition = analyst.communities_conductance_minimization(graph)
modularity = analyst.partition_modularity(graph, partition)
print(modularity)

Return type

float
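The standard modularity formula for an undirected, unweighted graph can be written out directly. The sketch below is a pure-Python illustration with hypothetical names and does not claim to match PGX's handling of direction or weights.

```python
def modularity(adj, partition):
    """Sum over parts of (internal edge share - squared degree share)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for part in partition:
        part = set(part)
        # Each internal edge is seen from both endpoints, hence the / 2.
        internal = sum(1 for u in part for v in adj[u] if v in part) / 2
        degree = sum(len(adj[u]) for u in part)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


# Two triangles joined by the single edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
q_split = modularity(adj, [{1, 2, 3}, {4, 5, 6}])
q_single = modularity(adj, [{1, 2, 3, 4, 5, 6}])
print(q_split, q_single)
```

Splitting the graph along the bridge scores higher than putting every vertex in one part, which always yields 0.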

periphery(graph, periphery=None)

Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.

Parameters

graph (PgxGraph) – Input graph
periphery (Optional[Union[VertexSet, str]]) – (Out argument) vertex set holding the vertices from the periphery of the graph

Returns

Vertex set holding the vertices from the periphery of the graph

Example

graph = ....
periphery = analyst.periphery(graph)
for vertex in periphery:
    print(vertex)

Return type

VertexSet

personalized_pagerank(graph, v, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_pagerank')

Personalized PageRank for a vertex of interest.

Compares and spots important vertices in a graph.

Parameters

graph (PgxGraph) – Input graph
v (Union[VertexSet, PgxVertex]) – The chosen vertex from the graph for personalization
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor
max_iter (int) – Maximum number of iterations that will be performed
norm (bool) – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores.
rank (Union[VertexProperty, str]) – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the computed scores

Example

graph = ....
vertex = graph.get_vertex("1")
pagerank = analyst.personalized_pagerank(
    graph, vertex, rank='perso_pagerank')
result_set = graph.query_pgql(
    "SELECT x, x.perso_pagerank MATCH (x) "
    "ORDER BY x.perso_pagerank DESC")
result_set.print()

Return type

personalized_salsa(bipartite_graph, v, tol=0.001, damping=0.85, max_iter=100, rank='personalized_salsa')

Personalized SALSA for a vertex of interest.

Assesses the quality of information and references in linked structures.

Note

The input graph must be bipartite; you can verify this with analyst.bipartite_check.

Parameters

bipartite_graph (BipartiteGraph) – Bipartite graph
v (Union[VertexSet, PgxVertex]) – The chosen vertex from the graph for personalization
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor to modulate the degree of personalization of the scores by the algorithm
max_iter (int) – Maximum number of iterations that will be performed
rank (Union[VertexProperty, str]) – (Out argument) vertex property holding the normalized authority/hub ranking score for each vertex

Returns

Vertex property holding the computed scores

Example

graph = ....
vertex = graph.get_vertex("1")
salsa = analyst.personalized_salsa(
    graph, vertex, rank='personalized_salsa')
result_set = graph.query_pgql(
    "SELECT x, x.personalized_salsa MATCH (x) "
    "ORDER BY x.personalized_salsa DESC")
result_set.print()

Return type

personalized_weighted_pagerank(graph, v, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_weighted_pagerank')

Parameters

graph (PgxGraph) – Input graph
v (Union[VertexSet, PgxVertex]) – The chosen vertex from the graph for personalization
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor
max_iter (int) – Maximum number of iterations that will be performed
norm (bool) – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores
rank (Union[VertexProperty, str]) – Vertex property holding the PageRank value for each vertex

Example

graph = ....
vertex = graph.get_vertex("1")
cost = graph.get_or_create_edge_property("double", "cost")
pagerank = analyst.personalized_weighted_pagerank(
    graph, vertex, cost, norm=False, rank="pagerank")
result_set = graph.query_pgql(
    "SELECT x, x.pagerank MATCH (x) ORDER BY x.pagerank DESC")
result_set.print()

Return type

pg2vec_builder(graphlet_id_property_name, vertex_property_names, min_word_frequency=1, batch_size=128, num_epochs=5, layer_size=200, learning_rate=0.04, min_learning_rate=0.0001, window_size=4, walk_length=8, walks_per_vertex=5, graphlet_size_property_name='graphletSize-Pg2vec', use_graphlet_size=True, validation_fraction=None, seed=None)

Build a pg2vec model and return it.

Parameters

graphlet_id_property_name (str) – Property name of the graphlet-id in the input graph
vertex_property_names (List[str]) – Property names to consider for pg2vec model training
min_word_frequency (int) – Minimum word frequency to consider before pruning
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
layer_size (int) – Number of dimensions for the output vectors
learning_rate (float) – Initial learning rate
min_learning_rate (float) – Minimum learning rate
window_size (int) – Window size to consider while training the model
walk_length (int) – Length of the walks
walks_per_vertex (int) – Number of walks to consider per vertex
graphlet_size_property_name (str) – Property name for graphlet size
use_graphlet_size (bool) – Whether to use the graphlet size
validation_fraction (Optional[float]) – Fraction of training data on which to compute final loss (deprecated since 23.4)
seed (Optional[int]) – Seed

Returns

Built pg2vec model

Return type

Pg2vecModel

prim(graph, weight, mst='mst')

Prim reveals tree structures with shortest paths in a graph.

Parameters

graph (PgxGraph) – Input graph
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
mst (Union[EdgeProperty, str]) – Edge property holding the edges belonging to the minimum spanning tree of the graph

Returns

Edge property holding the edges belonging to the minimum spanning tree of the graph (i.e. all the edges with in_mst=true)

Example

graph = ....
cost = graph.get_or_create_edge_property("double", "cost")
prim = analyst.prim(graph, cost, mst="mst")
result_set = graph.query_pgql(
    "SELECT e, e.mst MATCH () -[e]-> () ORDER BY e.mst DESC")
result_set.print()

Return type

radius(graph, eccentricity='eccentricity')

Radius gives an overview of the distances in a graph. It is computed as the minimum graph eccentricity.

Parameters

graph (PgxGraph) – Input graph
eccentricity (Union[VertexProperty, str]) – (Out argument) vertex property holding the eccentricity value for each vertex

Returns

Pair holding the radius of the graph and a node property with eccentricity value for each node

Example

graph = ....
radius = analyst.radius(
    graph, eccentricity='eccentricity')
result_set = graph.query_pgql(
    "SELECT x, x.eccentricity MATCH (x) "
    "ORDER BY x.eccentricity DESC")
result_set.print()

Return type

Tuple[int, VertexProperty]
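The relationship between eccentricity, radius and periphery can be sketched with a breadth-first search from every vertex. The code below is a pure-Python illustration with hypothetical names; it assumes a connected, undirected graph.

```python
from collections import deque


def eccentricities(adj):
    """Eccentricity of a vertex = largest hop distance to any other vertex."""
    ecc = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        ecc[src] = max(dist.values())
    return ecc


# Path graph 1 - 2 - 3 - 4 - 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ecc = eccentricities(adj)
radius = min(ecc.values())
diameter = max(ecc.values())
periphery = sorted(v for v, e in ecc.items() if e == diameter)
print(radius, diameter, periphery)
```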

random_walk_with_restart(graph, source, length, reset_prob, visit_count=None)

Perform a random walk over the graph.

The walk will start at the given source vertex and will randomly visit neighboring vertices in the graph, with a probability equal to the value of reset_prob of going back to the starting point. The random walk will also go back to the starting point every time it reaches a vertex with no outgoing edges. The algorithm will stop once it reaches the specified walk length.

Parameters

graph (PgxGraph) – Input graph
source (PgxVertex) – Starting point of the random walk
length (int) – Length (number of steps) of the random walk
reset_prob (float) – Probability value for resetting the random walk
visit_count (Optional[PgxMap]) – (out argument) map holding the number of visits during the random walk for each vertex in the graph

Returns

map holding the number of visits during the random walk for each vertex in the graph

Example

graph = ....
src = graph.get_vertex("1")
visits = analyst.random_walk_with_restart(
    graph, src, length=100, reset_prob=0.6)
for visit in visits:
    print(visit)
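The walk described above can be sketched in pure Python as follows. This is an illustration only: the function, the `seed` parameter and the choice to count the vertex reached after each step are assumptions, not details of the PGX algorithm.

```python
import random


def random_walk_with_restart(out_edges, source, length, reset_prob, seed=None):
    """Count visits per vertex during a walk of `length` steps."""
    rng = random.Random(seed)
    visits = {v: 0 for v in out_edges}
    current = source
    for _ in range(length):
        neighbors = out_edges[current]
        # Restart at dead ends, or with probability reset_prob.
        if not neighbors or rng.random() < reset_prob:
            current = source
        else:
            current = rng.choice(neighbors)
        visits[current] += 1
    return visits


out_edges = {1: [2], 2: [3], 3: []}
visits = random_walk_with_restart(out_edges, 1, length=100, reset_prob=0.6, seed=7)
print(visits)
```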

Return type

reachability(graph, src, dst, max_hops, ignore_edge_direction)

Reachability is a fast way to check if two vertices are reachable from each other.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source vertex for the search
dst (PgxVertex) – Destination vertex for the search
max_hops (int) – Maximum hop distance between the source and destination vertices
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges during the search

Returns

The number of hops between the vertices. It will return -1 if the vertices are not connected or are not reachable given the condition of the maximum hop distance allowed.

Example

graph = ....
src = graph.get_vertex("1")
dst = graph.get_vertex("2")
result = analyst.reachability(
    graph, src, dst, 2, ignore_edge_direction=False)
print(result)

Return type

int
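The return convention (hop count, or -1 when the destination is out of reach) can be illustrated with a bounded breadth-first search. The sketch below is pure Python with hypothetical names and assumes every vertex appears as a key of `out_edges`.

```python
from collections import deque


def reachability(out_edges, src, dst, max_hops, ignore_edge_direction=False):
    """Return the hop distance from src to dst, or -1 if beyond max_hops."""
    adj = {v: set(targets) for v, targets in out_edges.items()}
    if ignore_edge_direction:
        # Add the reverse of every edge so the search treats it as undirected.
        for v, targets in out_edges.items():
            for t in targets:
                adj[t].add(v)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        if dist[u] == max_hops:
            continue  # do not expand past the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return -1


out_edges = {1: [2], 2: [3], 3: [], 4: [1]}
print(reachability(out_edges, 1, 3, max_hops=2))
print(reachability(out_edges, 1, 3, max_hops=1))
print(reachability(out_edges, 1, 4, max_hops=5, ignore_edge_direction=True))
```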

salsa(bipartite_graph, tol=0.001, max_iter=100, rank='salsa')

Stochastic Approach for Link-Structure Analysis (SALSA) computes ranking scores.

It assesses the quality of information and references in linked structures.

Note

The input graph must be bipartite; you can verify this with analyst.bipartite_check.

Parameters

bipartite_graph (BipartiteGraph) – Bipartite graph
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
max_iter (int) – Maximum number of iterations that will be performed
rank (Union[VertexProperty, str]) – Vertex property holding the value for each vertex in the graph

Returns

Vertex property holding the computed scores

Example

graph = ....
analyst.salsa(graph, rank='salsa')
result_set = graph.query_pgql(
    "SELECT x, x.salsa MATCH (x) ORDER BY x.salsa DESC")
result_set.print()

Return type

scc_kosaraju(graph, label='scc_kosaraju')

Kosaraju finds strongly connected components in a graph.

Parameters

graph (PgxGraph) – Input graph
label (Union[VertexProperty, str]) – Vertex property holding the label of the component that each vertex belongs to. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm

Example

graph = ....
scc = analyst.scc_kosaraju(graph, label='scc')
result_set = graph.query_pgql(
    "SELECT x, x.scc MATCH (x) ORDER BY x.scc DESC")
result_set.print()
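For readers unfamiliar with the technique, here is a compact pure-Python sketch of Kosaraju's two-pass algorithm. It is a textbook version with hypothetical names, written iteratively to avoid recursion limits; labeling each component by the vertex that started its second-pass sweep is a choice made for this example.

```python
def scc_kosaraju(out_edges):
    """Return a dict mapping each vertex to a component label."""
    # First pass: record vertices in order of DFS completion.
    order, seen = [], set()
    for root in out_edges:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(out_edges[root]))]
        while stack:
            v, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                stack.pop()
                order.append(v)
            else:
                seen.add(nxt)
                stack.append((nxt, iter(out_edges[nxt])))
    # Second pass: sweep the reversed graph in reverse completion order.
    reverse = {v: [] for v in out_edges}
    for v, targets in out_edges.items():
        for t in targets:
            reverse[t].append(v)
    label = {}
    for root in reversed(order):
        if root in label:
            continue
        label[root] = root
        stack = [root]
        while stack:
            v = stack.pop()
            for w in reverse[v]:
                if w not in label:
                    label[w] = root
                    stack.append(w)
    return label


# Cycle 1 -> 2 -> 3 -> 1, an edge 3 -> 4, and cycle 4 -> 5 -> 4.
out_edges = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
labels = scc_kosaraju(out_edges)
print(labels)
```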

Return type

scc_tarjan(graph, label='scc_tarjan')

Tarjan finds strongly connected components in a graph.

Parameters

graph (PgxGraph) – Input graph
label (Union[VertexProperty, str]) – Vertex property holding the label of the component that each vertex belongs to. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm

Example

graph = ....
scc = analyst.scc_tarjan(graph, label='scc_tarjan')
result_set = graph.query_pgql(
    "SELECT x, x.scc_tarjan MATCH (x) "
    "ORDER BY x.scc_tarjan DESC")
result_set.print()

Return type

shortest_path_bellman_ford(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge', ignore_edge_direction=False)

Bellman-Ford finds multiple shortest paths at the same time.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
distance (Union[VertexProperty, str]) – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges during the search
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

Example

graph = ....
src = graph.get_vertex("1")
cost = graph.get_or_create_edge_property("double", "cost")
paths = analyst.shortest_path_bellman_ford(
    graph, src, cost, distance="bellman_ford_distance")
result_set = graph.query_pgql(
    "SELECT x, x.bellman_ford_distance MATCH (x) "
    "ORDER BY x.bellman_ford_distance DESC")
result_set.print()
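The distance and parent outputs can be understood from a minimal Bellman-Ford sketch. The version below is a textbook formulation in pure Python with hypothetical names; it assumes there are no negative cycles and does not model the parent_edge property.

```python
import math


def bellman_ford(vertices, edges, src):
    """edges is a list of (u, v, weight) triples for directed edges."""
    distance = {v: math.inf for v in vertices}
    parent = {v: None for v in vertices}
    distance[src] = 0.0
    # At most |V| - 1 rounds of relaxing every edge.
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if distance[u] + w < distance[v]:
                distance[v] = distance[u] + w
                parent[v] = u
                changed = True
        if not changed:
            break  # distances are already final
    return distance, parent


vertices = [1, 2, 3, 4]
edges = [(1, 2, 4.0), (1, 3, 1.0), (3, 2, 2.0), (2, 4, 1.0)]
distance, parent = bellman_ford(vertices, edges, src=1)
print(distance)
print(parent)
```

Following the parent entries back from any vertex reconstructs its shortest path to the source.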

Return type

shortest_path_bellman_ford_reversed(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')

Reversed Bellman-Ford finds multiple shortest paths at the same time.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
distance (Union[VertexProperty, str]) – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path.
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path.

Returns

AllPaths holding the information of the possible shortest paths from the source node.

Return type

shortest_path_bidirectional_dijkstra(graph, src, dst, weight, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge', ignore_edge_direction=False)

Bidirectional Dijkstra is a fast algorithm for finding a shortest path in a graph.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
dst (PgxVertex) – Destination node
weight (EdgeProperty) – Edge property holding the (positive) weight of each edge in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges

Returns

PgxPath holding the information of the shortest path, if it exists

Example

graph = ....
src = graph.get_vertex("1")
dst = graph.get_vertex("2")
cost = graph.get_or_create_edge_property("double", "cost")
path = analyst.shortest_path_bidirectional_dijkstra(
    graph, src, dst, cost)
print(path)

Return type

shortest_path_dijkstra(graph, src, dst, weight, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge', ignore_edge_direction=False)

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
dst (PgxVertex) – Destination node
weight (EdgeProperty) – Edge property holding the (positive) weight of each edge in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges

Returns

PgxPath holding the information of the shortest path, if it exists

Example

graph = ....
src = graph.get_vertex("1")
dst = graph.get_vertex("2")
cost = graph.get_or_create_edge_property("double", "cost")
path = analyst.shortest_path_dijkstra(
    graph, src, dst, cost)
print(path)
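How the parent bookkeeping yields a path can be seen in a short Dijkstra sketch. This is a textbook version in pure Python with hypothetical names; it assumes positive weights, as the weight parameter requires, and returns None when no path exists.

```python
import heapq


def dijkstra(out_edges, src, dst):
    """out_edges maps a vertex to a list of (neighbor, weight) pairs."""
    distance = {src: 0.0}
    parent = {}
    done = set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == dst:
            break
        for v, w in out_edges.get(u, []):
            nd = d + w
            if nd < distance.get(v, float("inf")):
                distance[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in done:
        return None
    # Walk the parent pointers back from dst to src.
    path = [dst]
    while path[-1] != src:
        path.append(parent[path[-1]])
    return path[::-1], distance[dst]


out_edges = {1: [(2, 4.0), (3, 1.0)], 2: [(4, 1.0)], 3: [(2, 2.0)], 4: []}
print(dijkstra(out_edges, 1, 4))
print(dijkstra(out_edges, 4, 1))
```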

Return type

shortest_path_filtered_bidirectional_dijkstra(graph, src, dst, weight, filter_expression, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge', ignore_edge_direction=False)

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
dst (PgxVertex) – Destination node
weight (EdgeProperty) – Edge property holding the (positive) weight of each edge in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
filter_expression (EdgeFilter) – GraphFilter object for filtering
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges

Returns

PgxPath holding the information of the shortest path, if it exists

Return type

shortest_path_filtered_dijkstra(graph, src, dst, weight, filter_expression, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge', ignore_edge_direction=False)

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
dst (PgxVertex) – Destination node
weight (EdgeProperty) – Edge property holding the (positive) weight of each edge in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path
filter_expression (EdgeFilter) – GraphFilter object for filtering
ignore_edge_direction (bool) – Boolean flag for ignoring the direction of the edges

Returns

PgxPath holding the information of the shortest path, if it exists

Example

from pypgx.api.filters import EdgeFilter
graph = ....
src = graph.get_vertex("1")
dst = graph.get_vertex("2")
cost = graph.get_or_create_edge_property("double", "cost")
edge_filter = EdgeFilter("edge.cost > 5")
path = analyst.shortest_path_filtered_dijkstra(
    graph, src, dst, cost, edge_filter)
print(path)

Return type

shortest_path_hop_distance(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Hop distance can give a relatively fast insight on the distances in a graph.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
distance (Union[VertexProperty, str]) – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

Return type

shortest_path_hop_distance_reversed(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Backwards hop distance can give a relatively fast insight on the distances in a graph.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
distance (Union[VertexProperty, str]) – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

Return type

shortest_path_hop_distance_undirected(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Undirected hop distance can give a relatively fast insight on the distances in a graph.

Parameters

graph (PgxGraph) – Input graph
src (PgxVertex) – Source node
distance (Union[VertexProperty, str]) – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph
parent (Union[VertexProperty, str]) – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path
parent_edge (Union[VertexProperty, str]) – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

Return type

speaker_listener_label_propagation(graph, labels, max_iter=100, threshold=0.0, delimiter='|')

Speaker-listener label propagation (SLLP) detects overlapping communities.

Parameters

graph (PgxGraph) – Input graph.
max_iter (int) – Maximum number of iterations that will be performed.
threshold (float) – The probability of dropping a label during post-processing.
delimiter (str) – Delimiter used to separate the labels.
labels (Union[VertexProperty, str]) – Vertex property holding distinct nodes in the memory with probability greater than or equal to threshold.

Returns

Vertex property holding distinct nodes in the memory with probability greater than or equal to threshold.

Example

graph = ....
labels = analyst.speaker_listener_label_propagation(
    graph, 'labels', max_iter=100, threshold=0.0, delimiter='|')

Return type

VertexProperty

supervised_edgewise_builder(edge_target_property_name, *, vertex_input_property_names=[], edge_input_property_names=[], target_edge_labels=[], loss_fn='softmax_cross_entropy', batch_gen='standard', batch_gen_params=[], pred_layer_config=None, conv_layer_config=None, vertex_input_property_configs=None, edge_input_property_configs=None, batch_size=128, num_epochs=3, learning_rate=0.01, layer_size=128, class_weights=None, seed=None, weight_decay=0.0, standardize=False, normalize=True, edge_combination_method=None)

Build a SupervisedEdgeWise model and return it.

Parameters

edge_target_property_name (str) – Target property name
vertex_input_property_names (List[str]) – Vertex input feature names
edge_input_property_names (List[str]) – Edge input feature names
target_edge_labels (List[str]) – Set the target edge labels for the algorithm. Only the related edges need to have the target property. Training and inference will be done on the edges with those labels
loss_fn (Union[LossFunction, str]) – Loss function. Supported: String (‘softmax_cross_entropy’, ‘sigmoid_cross_entropy’) or LossFunction object
batch_gen (str) – Batch generator. Supported: ‘standard’, ‘stratified_oversampling’
batch_gen_params (List[Any]) – List of parameters passed to the batch generator
pred_layer_config (Optional[Iterable[GraphWisePredictionLayerConfig]]) – Prediction layer configuration as list of PredLayerConfig, or default if None
conv_layer_config (Optional[Union[Iterable[GraphWiseConvLayerConfig], Iterable[GraphWiseAttentionLayerConfig]]]) – Conv layer configuration as list of ConvLayerConfig, or default if None
vertex_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Vertex input property configuration as list of InputPropertyConfig, or default if None
edge_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Edge input property configuration as list of InputPropertyConfig, or default if None
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
learning_rate (float) – Learning rate
layer_size (int) – Number of dimensions for the output vectors
class_weights (Optional[Union[Mapping[str, float], Mapping[int, float]]]) – Class weights to be used in the loss function. The loss for the corresponding class will be multiplied by the factor given in this map. If None, uniform class weights will be used.
seed (Optional[int]) – Seed
weight_decay (float) – Weight decay
standardize (bool) – apply batch normalization
normalize (bool) – apply l2 normalization after each convolutional layer
edge_combination_method (Optional[Union[ConcatEdgeCombinationMethod, ProductEdgeCombinationMethod]]) – combination method to apply to vertex embeddings and edge features to compute the edge embedding
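
As a rough illustration of the class_weights description above, the following plain-Python sketch multiplies the softmax cross-entropy loss of one sample by the weight of its true class. It is not the library's implementation; the function name weighted_softmax_cross_entropy and the sample values are hypothetical.

```python
import math
from typing import Mapping, Optional, Sequence


def weighted_softmax_cross_entropy(
    logits: Sequence[float],
    true_class: int,
    class_weights: Optional[Mapping[int, float]] = None,
) -> float:
    """Softmax cross-entropy for one sample, scaled by its class weight."""
    # Numerically stable log-sum-exp: subtract the largest logit first.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    loss = log_sum_exp - logits[true_class]
    # None stands for uniform weights, i.e. a factor of 1.0 for every class.
    weight = 1.0 if class_weights is None else class_weights.get(true_class, 1.0)
    return weight * loss


# Hypothetical two-class sample whose true class is the rarer class 1.
plain = weighted_softmax_cross_entropy([2.0, 0.5], true_class=1)
boosted = weighted_softmax_cross_entropy(
    [2.0, 0.5], true_class=1, class_weights={0: 1.0, 1: 3.0}
)
```

With these inputs, the weighted loss is three times the unweighted one, which is how a larger weight makes mistakes on a rare class more costly during training.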

Returns

Built SupervisedEdgeWise model

Return type

supervised_graphwise_builder(vertex_target_property_name, vertex_input_property_names=[], edge_input_property_names=[], target_vertex_labels=[], loss_fn='softmax_cross_entropy', batch_gen='standard', batch_gen_params=[], pred_layer_config=None, conv_layer_config=None, vertex_input_property_configs=None, edge_input_property_configs=None, batch_size=128, num_epochs=3, learning_rate=0.01, layer_size=128, class_weights=None, seed=None, weight_decay=0.0, standardize=False, normalize=True)

Build a SupervisedGraphWise model and return it.

Parameters

vertex_target_property_name (str) – Target property name
vertex_input_property_names (List[str]) – Vertices Input feature names
edge_input_property_names (List[str]) – Edges Input feature names
target_vertex_labels (List[str]) – Set the target vertex labels for the algorithm. Only the related vertices need to have the target property. Training and inference will be done on the vertices with those labels
loss_fn (Union[LossFunction, str]) – Loss function. Supported: String (‘softmax_cross_entropy’, ‘sigmoid_cross_entropy’) or LossFunction object
batch_gen (str) – Batch generator. Supported: ‘standard’, ‘stratified_oversampling’
batch_gen_params (List[Any]) – List of parameters passed to the batch generator
pred_layer_config (Optional[Iterable[GraphWisePredictionLayerConfig]]) – Prediction layer configuration as list of PredLayerConfig, or default if None
conv_layer_config (Optional[Union[Iterable[GraphWiseConvLayerConfig], Iterable[GraphWiseAttentionLayerConfig]]]) – Conv layer configuration as list of ConvLayerConfig, or default if None
vertex_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Vertex input property configuration as list of InputPropertyConfig, or default if None
edge_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Edge input property configuration as list of InputPropertyConfig, or default if None
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
learning_rate (float) – Learning rate
layer_size (int) – Number of dimensions for the output vectors
class_weights (Optional[Union[Mapping[str, float], Mapping[int, float]]]) – Class weights to be used in the loss function. The loss for the corresponding class will be multiplied by the factor given in this map. If None, uniform class weights will be used.
seed (Optional[int]) – Seed
weight_decay (float) – Weight decay
standardize (bool) – apply batch normalization
normalize (bool) – apply l2 normalization after each convolutional layer

Returns

Built SupervisedGraphWise model

Return type

SupervisedGraphWiseModel

topological_schedule(graph, vs, topo_sched='topo_sched')

Topological schedule gives an order of visit for the reachable vertices from the source.

Parameters

graph (PgxGraph) – Input graph
vs (VertexSet) – Set of vertices to be used as the starting points for the scheduling order
topo_sched (Union[VertexProperty, str]) – (Out argument) vertex property holding the scheduled order of each vertex

Returns

Vertex property holding the scheduled order of each vertex.

Example

graph = ....
source = graph.get_vertices(
    VertexFilter("vertex.prop1 < 10"))
topo_sched = analyst.topological_schedule(
    graph, source, topo_sched='topo_sched')
result_set = graph.query_pgql(
    "SELECT x, x.topo_sched MATCH (x) "
    "ORDER BY x.topo_sched DESC")
result_set.print()

Return type

topological_sort(graph, topo_sort='topo_sort')

Topological sort gives an order of visit for vertices in directed acyclic graphs.

Parameters

graph (PgxGraph) – Input graph
topo_sort (Union[VertexProperty, str]) – (Out argument) vertex property holding the topological order of each vertex
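
To make the notion of a topological order concrete, here is a minimal plain-Python sketch using Kahn's algorithm. It only illustrates the concept and is not claimed to be how PGX computes the property; the function name topological_order and the sample graph are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Tuple


def topological_order(
    vertices: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, int]:
    """Map each vertex of a DAG to its position in one valid topological order."""
    successors = {v: [] for v in vertices}
    in_degree = {v: 0 for v in successors}
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1

    # Vertices without incoming edges can be visited immediately.
    ready = deque(v for v, degree in in_degree.items() if degree == 0)
    order = {}
    while ready:
        v = ready.popleft()
        order[v] = len(order)
        for w in successors[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                ready.append(w)

    if len(order) != len(successors):
        raise ValueError("graph contains a cycle, so no topological order exists")
    return order


# Hypothetical diamond-shaped DAG: a -> b -> d and a -> c -> d.
order = topological_order("abcd", [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")])
```

In the resulting mapping every edge points from a lower position to a higher one, which is the defining property of a topological order.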

Return type

UnsupervisedAnomalyDetectionGraphWiseModel

unsupervised_anomaly_detection_graphwise_builder(vertex_input_property_names=[], edge_input_property_names=[], target_vertex_labels=[], loss_fn='sigmoid_cross_entropy', conv_layer_config=None, vertex_input_property_configs=None, edge_input_property_configs=None, batch_size=128, num_epochs=3, learning_rate=0.001, layer_size=128, seed=None, weight_decay=0.0, standardize=False, embedding_config=None)

Build an UnsupervisedAnomalyDetectionGraphWise model and return it.

Parameters

vertex_input_property_names (List[str]) – Vertices Input feature names
edge_input_property_names (List[str]) – Edges Input feature names
target_vertex_labels (List[str]) – Set the target vertex labels for the algorithm. Only the related vertices need to have the target property. Training and inference will be done on the vertices with those labels
loss_fn (str) – Loss function. Supported: sigmoid_cross_entropy
conv_layer_config (Optional[Iterable[GraphWiseConvLayerConfig]]) – Conv layer configuration as list of ConvLayerConfig, or default if None
vertex_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Vertex input property configuration as list of InputPropertyConfig, or default if None
edge_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Edge input property configuration as list of InputPropertyConfig, or default if None
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
learning_rate (float) – Learning rate
layer_size (int) – Number of dimensions for the output vectors
seed (Optional[int]) – Seed
weight_decay (float) – weight decay
standardize (bool) – apply batch normalization
embedding_config (Optional[GraphWiseEmbeddingConfig]) – the embedding configuration as a GraphWiseEmbeddingConfig object, default is None

Returns

Built UnsupervisedAnomalyDetectionGraphWise model

Return type

unsupervised_edgewise_builder(*, vertex_input_property_names=[], edge_input_property_names=[], target_edge_labels=[], loss_fn='sigmoid_cross_entropy', conv_layer_config=None, vertex_input_property_configs=None, edge_input_property_configs=None, batch_size=128, num_epochs=3, learning_rate=0.01, layer_size=128, dgi_layer_config=None, seed=None, weight_decay=0.0, standardize=False, normalize=True, edge_combination_method=None)

Build an UnsupervisedEdgeWise model and return it.

Parameters

vertex_input_property_names (List[str]) – Vertices Input feature names
edge_input_property_names (List[str]) – Edges Input feature names
target_edge_labels (List[str]) – Set the target edge labels for the algorithm. Only the related edges need to have the target property. Training and inference will be done on the edges with those labels
loss_fn (str) – Loss function. Supported: String (‘sigmoid_cross_entropy’)
conv_layer_config (Optional[Union[Iterable[GraphWiseConvLayerConfig], Iterable[GraphWiseAttentionLayerConfig]]]) – Conv layer configuration as list of ConvLayerConfig, or default if None
vertex_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Vertex input property configuration as list of InputPropertyConfig, or default if None
edge_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Edge input property configuration as list of InputPropertyConfig, or default if None
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
learning_rate (float) – Learning rate
layer_size (int) – Number of dimensions for the output vectors
dgi_layer_config (Optional[GraphWiseDgiLayerConfig]) – Dgi layer configuration as DgiLayerConfig object, or default if None
seed (Optional[int]) – Seed
weight_decay (float) – Weight decay
standardize (bool) – apply batch normalization
normalize (bool) – apply l2 normalization after each convolutional layer
edge_combination_method (Optional[Union[ConcatEdgeCombinationMethod, ProductEdgeCombinationMethod]]) – combination method to apply to vertex embeddings and edge features to compute the edge embedding

Returns

Built UnsupervisedEdgeWise model

Return type

UnsupervisedEdgeWiseModel

unsupervised_graphwise_builder(vertex_input_property_names=[], edge_input_property_names=[], target_vertex_labels=[], loss_fn='sigmoid_cross_entropy', conv_layer_config=None, vertex_input_property_configs=None, edge_input_property_configs=None, batch_size=128, num_epochs=3, learning_rate=0.001, layer_size=128, seed=None, dgi_layer_config=None, weight_decay=0.0, standardize=False, embedding_config=None, normalize=True)

Build an UnsupervisedGraphWise model and return it.

Parameters

vertex_input_property_names (List[str]) – Vertices Input feature names
edge_input_property_names (List[str]) – Edges Input feature names
target_vertex_labels (List[str]) – Set the target vertex labels for the algorithm. Only the related vertices need to have the target property. Training and inference will be done on the vertices with those labels
loss_fn (str) – Loss function. Supported: sigmoid_cross_entropy
conv_layer_config (Optional[Union[Iterable[GraphWiseConvLayerConfig], Iterable[GraphWiseAttentionLayerConfig]]]) – Conv layer configuration as list of ConvLayerConfig, or default if None
vertex_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Vertex input property configuration as list of InputPropertyConfig, or default if None
edge_input_property_configs (Optional[Iterable[InputPropertyConfig]]) – Edge input property configuration as list of InputPropertyConfig, or default if None
batch_size (int) – Batch size for training the model
num_epochs (int) – Number of epochs to train the model
learning_rate (float) – Learning rate
layer_size (int) – Number of dimensions for the output vectors
seed (Optional[int]) – Seed
dgi_layer_config (Optional[GraphWiseDgiLayerConfig]) – Dgi layer configuration as DgiLayerConfig object, or default if None
weight_decay (float) – weight decay
standardize (bool) – apply batch normalization
embedding_config (Optional[GraphWiseEmbeddingConfig]) – the embedding configuration as a GraphWiseEmbeddingConfig object, default is None
normalize (bool) – apply l2 normalization after each convolutional layer

Returns

Built UnsupervisedGraphWise model

Return type

UnsupervisedGraphWiseModel

vertex_betweenness_centrality(graph, bc='betweenness')

Parameters

graph (PgxGraph) – Input graph
bc (Union[VertexProperty, str]) – Vertex property holding the betweenness centrality value for each vertex

Returns

Vertex property holding the computed scores

Example

graph = ....
betweenness = analyst.vertex_betweenness_centrality(
    graph, bc='betweenness')
result_set = graph.query_pgql(
    "SELECT x, x.betweenness MATCH (x) "
    "ORDER BY x.betweenness DESC")
result_set.print()

Return type

wcc(graph, label='wcc')

Identify weakly connected components.

This can be useful for clustering graph data.

Parameters

graph (PgxGraph) – Input graph
label (Union[VertexProperty, str]) – Vertex property holding the value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm.

Example

graph = ....
wcc = analyst.wcc(graph, label='wcc')
result_set = graph.query_pgql(
    "SELECT x, x.wcc MATCH (x) ORDER BY x.wcc DESC")
result_set.print()
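
For readers who want to see what a weakly connected component is, the plain-Python sketch below labels vertices with a union-find structure while ignoring edge direction. It is only a conceptual illustration, not PGX's algorithm; the function name and sample data are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Tuple


def weakly_connected_components(
    vertices: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, int]:
    """Assign the same integer label to vertices in the same weak component."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the root, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for src, dst in edges:
        # Direction is ignored: an edge in either direction joins both ends.
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src

    # Turn the roots into dense component ids 0, 1, 2, ...
    component_ids = {}
    labels = {}
    for v in parent:
        labels[v] = component_ids.setdefault(find(v), len(component_ids))
    return labels


# Hypothetical graph with two components: {1, 2, 3} and {4, 5}.
labels = weakly_connected_components([1, 2, 3, 4, 5], [(1, 2), (3, 2), (4, 5)])
```

Vertices 1, 2 and 3 end up with one label even though the edges 1 -> 2 and 3 -> 2 do not form a directed path between 1 and 3.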

Return type

weighted_closeness_centrality(graph, weight, cc='weighted_closeness')

Measure the centrality of the vertices based on weighted distances, which makes it possible to find well-connected vertices.

Parameters

graph (PgxGraph) – Input graph
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
cc (Union[VertexProperty, str]) – (Out argument) vertex property holding the closeness centrality value for each vertex

Returns

Vertex property holding the computed scores
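
The sketch below illustrates the idea with one common definition of closeness: the reciprocal of the sum of weighted shortest-path distances from a vertex to every vertex it can reach. The exact formula used by PGX may differ, so treat this as an assumption; the function names and the sample graph are hypothetical.

```python
import heapq
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def weighted_distances(graph: Graph, source: Hashable) -> Dict[Hashable, float]:
    """Dijkstra's algorithm; edge weights are assumed to be non-negative."""
    dist = {source: 0.0}
    heap = [(0.0, 0, source)]
    counter = 1  # tie-breaker so that vertices are never compared directly
    while heap:
        d, _, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w, weight in graph.get(v, []):
            candidate = d + weight
            if candidate < dist.get(w, float("inf")):
                dist[w] = candidate
                heapq.heappush(heap, (candidate, counter, w))
                counter += 1
    return dist


def weighted_closeness(graph: Graph) -> Dict[Hashable, float]:
    """Reciprocal of the summed distances; 0.0 for vertices that reach nothing."""
    scores = {}
    for v in graph:
        total = sum(weighted_distances(graph, v).values())
        scores[v] = 1.0 / total if total > 0 else 0.0
    return scores


# Hypothetical weighted graph given as adjacency lists of (neighbor, weight).
graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("a", 4.0), ("b", 1.0)],
}
scores = weighted_closeness(graph)
```

Vertex b sits between the other two via cheap edges, so it receives the highest score in this example.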

Return type

weighted_pagerank(graph, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='weighted_pagerank')

Parameters

graph (PgxGraph) – Input graph
weight (EdgeProperty) – Edge property holding the weight of each edge in the graph
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor
max_iter (int) – Maximum number of iterations that will be performed
rank (Union[VertexProperty, str]) – Vertex property holding the PageRank value for each vertex
norm (bool) –

Returns

Vertex property holding the computed PageRank value

Example

graph = ....
cost = graph.get_or_create_edge_property("double", "cost")
pagerank = analyst.weighted_pagerank(
    graph, cost, norm=False, rank="weighted_pagerank")
result_set = graph.query_pgql(
    "SELECT x, x.weighted_pagerank "
    "MATCH (x) ORDER BY x.weighted_pagerank DESC")
result_set.print()
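
To show how tol, damping and max_iter interact, here is a minimal plain-Python power-iteration sketch of a weighted PageRank. It assumes positive edge weights and ignores vertices without outgoing edges for brevity, so it does not model the norm flag, and it is not PGX's implementation. The function name and the sample graph are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple


def weighted_pagerank_sketch(
    vertices: List[Hashable],
    edges: List[Tuple[Hashable, Hashable, float]],
    tol: float = 0.001,
    damping: float = 0.85,
    max_iter: int = 100,
) -> Dict[Hashable, float]:
    """Iterate until the summed per-vertex change drops below tol."""
    n = len(vertices)
    out_weight = {v: 0.0 for v in vertices}
    for src, _dst, weight in edges:
        out_weight[src] += weight  # assumed positive

    rank = {v: 1.0 / n for v in vertices}
    for _ in range(max_iter):
        new_rank = {v: (1.0 - damping) / n for v in vertices}
        for src, dst, weight in edges:
            # A vertex shares its rank in proportion to the edge weights.
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        error = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if error < tol:
            break
    return rank


# Hypothetical graph: a links to b (weight 1) and c (weight 3); both link back.
ranks = weighted_pagerank_sketch(
    ["a", "b", "c"],
    [("a", "b", 1.0), ("a", "c", 3.0), ("b", "a", 1.0), ("c", "a", 1.0)],
)
```

Because the edge from a to c carries three times the weight of the edge from a to b, c ends up with a higher rank than b.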

Return type

MutationStrategyBuilder

whom_to_follow(graph, v, top_k=100, size_circle_of_trust=500, tol=0.001, damping=0.85, max_iter=100, salsa_tol=0.001, salsa_max_iter=100, hubs=None, auth=None)

Whom-to-follow (WTF) is a recommendation algorithm.

It returns two vertex sequences: one of similar users (hubs) and a second one with users to follow (auth).

Parameters

graph (PgxGraph) – Input graph
v (PgxVertex) – The chosen vertex from the graph for personalization of the recommendations
top_k (int) – The maximum number of recommendations that will be returned
size_circle_of_trust (int) – The maximum size of the circle of trust
tol (float) – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.
damping (float) – Damping factor for the Pagerank stage
max_iter (int) – Maximum number of iterations that will be performed for the Pagerank stage
salsa_tol (float) – Maximum tolerated error value for the SALSA stage
salsa_max_iter (int) – Maximum number of iterations that will be performed for the SALSA stage
hubs (Optional[Union[VertexSequence, str]]) – (Out argument) vertex sequence holding the top rated hub vertices (similar users) for the recommendations
auth (Optional[Union[VertexSequence, str]]) – (Out argument) vertex sequence holding the top rated authority vertices (users to follow) for the recommendations

Returns

Vertex sequences holding hubs and auth

Example

graph = ....
vertex = graph.get_vertex("1")
hubs, auths = analyst.whom_to_follow(graph, vertex)
for hub in hubs:
    print(hub)
for auth in auths:
    print(auth)

Return type

Tuple[VertexSequence, VertexSequence]

class pypgx.api.BipartiteGraph(session, java_graph)

Bases: PgxGraph

A bipartite PgxGraph.

Parameters: session (PgxSession) –

get_is_left_property()

Get the ‘is Left’ vertex property of the graph.

Return type: VertexProperty

class pypgx.api.CompiledProgram(session, java_program)

Bases: PgxContextManager

A compiled Green-Marl program.

Constructor arguments:

Parameters

session (PgxSession) – Pgx Session
java_program (oracle.pgx.api.CompiledProgram) – Java compiledProgram

property compiler_output: Optional[str]

Get the compiler output of the compiled program.

Returns: The compiler output.
Return type: Optional[str]

destroy()

Free resources on the server taken up by this Program.

Return type: None

get_return_type()

Get the return type of the compiled program.

Returns: The return type of the compiled program.
Return type: str

property id: str

Get the id of the compiled program.

Returns: The id of the compiled program.
Return type: str

run(*argv)

Run the compiled program with the given parameters. If the Green-Marl procedure of this compiled program looks like this: procedure pagerank(graph G, e double, max int, nodeProp){…}

Parameters

argv (Any) – All the arguments required by specified procedure.

Raises

TypeError – If the number of arguments is wrong.
TypeError – If the type of one of the arguments does not match.

Returns

Result of the analysis (an AnalysisResult) as a dict.

Return type

Dict[str, Optional[int]]

class pypgx.api.CpuEnvironment(java_cpu_environment)

Bases: object

A sub environment for CPU bound tasks

get_max_num_threads()

Get the maximum number of threads that can be used by the CPU environment.

Returns: the maximum number of threads.
Return type: int

get_priority()

Get the priority of the CPU environment.

Returns: the environment priority.
Return type: str

get_relevant_fields()

Get the relevant fields of the CPU environment.

Returns: the relevant fields of the CPU environment.
Return type: List[str]

get_values()

Return values of class

Returns: values
Return type: Dict[Any, Any]

get_weight()

Get the weight of the CPU environment.

Returns: the weight of the CPU environment.
Return type: int

reset()

Reset environment

Return type: None

set_max_num_threads(max_num_threads)

Set the maximum number of threads that can be used by the CPU environment.

Parameters: max_num_threads (int) – the maximum number of threads.
Return type: None

set_priority(priority)

Set the priority of the CPU environment.

Parameters: priority (str) – the environment priority.
Return type: None

set_weight(weight)

Set the weight of the CPU environment.

Parameters: weight (int) – the weight of the CPU environment.
Return type: None
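
The getters and setters above can be combined to change a limit temporarily. The sketch below is one possible pattern, relying only on get_max_num_threads() and set_max_num_threads() as documented here; the helper name limited_threads is hypothetical and not part of pypgx.

```python
from contextlib import contextmanager


@contextmanager
def limited_threads(cpu_env, max_num_threads: int):
    """Temporarily cap the thread count of a CpuEnvironment-like object."""
    previous = cpu_env.get_max_num_threads()
    cpu_env.set_max_num_threads(max_num_threads)
    try:
        yield cpu_env
    finally:
        # Restore the earlier limit even if the body raised an exception.
        cpu_env.set_max_num_threads(previous)
```

Here cpu_env could be, for instance, the object returned by ExecutionEnvironment.get_analysis_environment(); code inside a `with limited_threads(cpu_env, 2):` block would then run under the lower limit.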

class pypgx.api.DbConnectionConfig(java_graph_config)

Bases: object

A class for representing the database connection config interface

get_data_source_id()

Get the data source id to use to connect to an RDBMS instance.

Returns: the data source id
Return type: Optional[str]

get_jdbc_url()

Get the JDBC URL pointing to an RDBMS instance.

Returns: the JDBC URL
Return type: Optional[str]

get_max_prefetched_rows()

Get the maximum number of rows prefetched during each round trip between the result set and the database.

Returns: the maximum number of prefetched rows
Return type: int

get_schema()

Get the schema to use when reading/writing RDBMS objects.

Returns: the schema
Return type: Optional[str]

get_username()

Get the username to use when connecting to an RDBMS instance.

Returns: the username
Return type: Optional[str]

class pypgx.api.EdgeBuilder(session, java_edge_builder, id_type)

Bases: GraphBuilder

An edge builder for defining edges added with the GraphBuilder.

Parameters

session (PgxSession) –
id_type (str) –

property id: int: Get the id of the element (vertex or edge) this builder belongs to.

is_ignored()

Whether this edge builder ignores method calls (True) or performs calls as usual (False). Some issues, such as incompatible changes in a ChangeSet, can be configured to be ignored. In that case, additional method calls on the returned edge builder object will be ignored.

Return type: bool

set_label(label)

Set the new value of the label.

Parameters: label (str) – The new value of the label
Returns: The EdgeBuilder object
Return type: EdgeBuilder

set_property(key, value)

Set the property value of this edge with the given key to the given value.

The first time this method is called, the type of value defines the type of the property.

Changed in version 22.3: If the type of value is Python’s int, the resulting property now always has PGX’s property type long (64 bits).

Parameters

key (str) – The property key
value (Any) – The value of the edge property

Returns

The EdgeBuilder object

Return type

EdgeBuilder

class pypgx.api.EdgeCollection(graph, java_collection)

Bases: GraphEntityCollection

A collection of edges.

Parameters: graph (PgxGraph) –

add(e)

Add one or multiple edges to the collection.

Parameters: e (Union[PgxEdge, int, Iterable[Union[PgxEdge, int]]]) – Edge or edge id. Can also be an iterable of edge/edge ids.

add_all(edges)

Add multiple edges to the collection.

Parameters: edges (Iterable[Union[PgxEdge, int]]) – Iterable of edges or edge ids
Return type: None

contains(e)

Check if the collection contains edge e.

Parameters: e (Union[PgxEdge, int]) – PgxEdge object or edge id
Returns: Boolean
Return type: bool

remove(e)

Remove one or multiple edges from the collection.

Parameters: e (Union[PgxEdge, int, Iterable[Union[PgxEdge, int]]]) – Edge or edge id. Can also be an iterable of edges or edge ids.

remove_all(edges)

Remove multiple edges from the collection.

Parameters: edges (Iterable[Union[PgxEdge, int]]) – Iterable of edges or edge ids

class pypgx.api.EdgeModifier(session, java_edge_modifier, id_type='integer')

Bases: GraphChangeSet, EdgeBuilder

A class to modify existing edges of a graph.

Parameters

session (PgxSession) –
id_type (str) –

class pypgx.api.EdgeProviderMetaData(name, id_type, directed, labels, properties, source_vertex_provider_name, destination_vertex_provider_name)

Bases: EntityProviderMetaData

Meta information about an edge provider in a PgxGraph

Parameters

name (str) –
id_type (str) –
directed (bool) –
labels (Set[str]) –
properties (List[PropertyMetaData]) –
source_vertex_provider_name (str) –
destination_vertex_provider_name (str) –

get_destination_vertex_provider_name()

Return the name of the vertex provider for the destinations of the edges of this edge provider.

Returns: the name of the vertex provider
Return type: str

get_source_vertex_provider_name()

Return the name of the vertex provider for the sources of the edges of this edge provider.

Returns: the name of the vertex provider
Return type: str

is_directed()

Indicate whether the edge table is directed.

Returns: whether the edge table is directed
Return type: bool

set_directed(directed)

Set whether the edge table is directed.

Parameters: directed (bool) – A Boolean value
Return type: None

class pypgx.api.EdgeSequence(graph, java_collection)

Bases: EdgeCollection

An ordered sequence of edges which may contain duplicates.

Parameters: graph (PgxGraph) –

class pypgx.api.EdgeSet(graph, java_collection)

Bases: EdgeCollection

An unordered set of edges (no duplicates).

Parameters: graph (PgxGraph) –
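
The difference between EdgeSequence and EdgeSet mirrors the difference between Python's built-in list and set. The snippet below is only an analogy using plain integers as stand-ins for edge ids; it does not use pypgx.

```python
# Hypothetical edge ids, with id 7 added twice.
edge_ids = [7, 3, 7, 1]

# Sequence-like behavior: insertion order and duplicates are preserved.
as_sequence = list(edge_ids)

# Set-like behavior: duplicates collapse and no order is guaranteed.
as_set = set(edge_ids)
```

The sequence keeps all four entries in their original order, while the set holds each of the three distinct ids once.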

class pypgx.api.EntityProviderMetaData(java_entity_provider_meta_data)

Bases: object

Abstraction of the meta information about an edge or vertex provider.

get_id_type()

Get the ID type of this entity table.

Returns: the id type.
Return type: str

get_labels()

Return the set of provider labels (“type labels”).

Returns: the set of provider labels
Return type: Set[str]

get_name()

Get the name of this entity table.

Returns: the table name
Return type: str

get_properties()

Return a list containing the metadata for the properties associated to this provider.

Returns: the list of the properties’ metadata
Return type: List[PropertyMetaData]

set_id_type(id_type)

Set the ID type of this entity table.

Parameters: id_type (str) – the new id type
Return type: None

class pypgx.api.ExecutionEnvironment(java_environment, session)

Bases: object

A session bound environment holding the execution configuration for each task type.

Parameters: session (PgxSession) –

allows_concurrent_tasks()

Check if the session allows the tasks to run concurrently.

Returns: True if the session allows the tasks to run concurrently.
Return type: bool

get_analysis_environment()

Get the analysis environment.

Returns: the analysis environment.
Return type: CpuEnvironment

get_fast_analysis_environment()

Get the fast analysis environment.

Returns: the fast analysis environment.
Return type: CpuEnvironment

get_io_environment()

Get the IO environment.

Returns: the IO environment.
Return type: IoEnvironment

get_session()

Get the PGX session associated with this execution environment.

Returns: the PGX session associated with this execution environment.
Return type: PgxSession

get_update_consistency_model()

Get the update consistency model.

Returns: the update consistency model.
Return type: str

get_values()

Get the values of class

Returns: values
Return type: Dict[Any, Any]

reset()

Reset environment

Return type: None

reset_update_consistency_model()

Reset the update consistency model.

Return type: None

set_update_consistency_model(model)

Set the update consistency model.

Parameters: model (str) – the update consistency model.
Return type: None

class pypgx.api.FileGraphConfig(java_graph_config)

Bases: GraphConfig

A class for representing File graph configurations

get_edge_destination_column()

Get the name or index (starting from 1) of column corresponding to edge destination (for CSV format only).

Returns: name or index of column corresponding to edge destination
Return type: Optional[Union[int, str]]

get_edge_id_column()

Get the name or index (starting from 1) of column corresponding to edge id (for CSV format only).

Returns: name or index of column corresponding to edge id
Return type: Optional[Union[int, str]]

get_edge_label_column()

Get the name or index (starting from 1) of column corresponding to edge label (for CSV format only).

Returns: name or index of column corresponding to edge label
Return type: Optional[Union[int, str]]

get_edge_source_column()

Get the name or index (starting from 1) of column corresponding to edge source (for CSV format only).

Returns: name or index of column corresponding to edge source
Return type: Optional[Union[int, str]]

get_edge_uris()

Get the uniform resource identifiers (URIs) for the files with the graph edge data.

Returns: the list of URIs
Return type: List[str]

get_separator()

Get the separator of this graph configuration.

Returns: the separator
Return type: str

get_storing()

Get the storing-specific configuration.

Returns: the storing configuration
Return type: Dict[str, Any]

get_storing_options()

Get the storing configuration.

Returns: the storing configuration
Return type: Dict[str, Any]

get_uri()

Get the uniform resource identifier (URI) for the file with the graph data.

Returns: the uniform resource identifier
Return type: str

get_uris()

Get the uniform resource identifiers (URIs) for the files with the graph data.

Returns: the uniform resource identifiers
Return type: List[str]

get_vertex_id_column()

Get the name or index (starting from 1) of column corresponding to vertex id (for CSV format only).

Returns: name or index of column corresponding to vertex id
Return type: Optional[Union[int, str]]

get_vertex_labels_column()

Get the name or index (starting from 1) of column corresponding to vertex labels (for CSV format only).

Returns: name or index of column corresponding to vertex labels
Return type: Optional[Union[int, str]]

get_vertex_uris()

Get the uniform resource identifiers (URIs) for the files with the graph vertex data.

Returns: the list of URIs
Return type: List[str]

is_detect_gzip()

Whether GZip file automatic detection is enabled or not.

Returns: True if GZip file automatic detection is enabled, False otherwise.
Return type: bool

is_header()

Whether the file has a header, i.e. whether the first line of the file holds column names, e.g. ‘EdgeId, SourceId, DestId, EdgeProp1, EdgeProp2’.

Returns: Whether the file has a header or not
Return type: bool
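
Several getters above return either a column name or an index that starts from 1. The plain-Python sketch below shows what that means for a CSV file with a header line. It uses the standard csv module rather than PGX's loader, and the helper name resolve_column and the sample data are hypothetical.

```python
import csv
import io
from typing import List, Union


def resolve_column(header: List[str], column: Union[int, str]) -> int:
    """Turn a column name or 1-based index into a 0-based list position."""
    if isinstance(column, int):
        if not 1 <= column <= len(header):
            raise IndexError(f"column index {column} is out of range")
        return column - 1
    return header.index(column)


# Hypothetical edge file whose first line is a header.
data = "EdgeId,SourceId,DestId,EdgeProp1\n10,1,2,0.5\n11,2,3,0.7\n"
rows = list(csv.reader(io.StringIO(data), delimiter=","))
header, records = rows[0], rows[1:]

src_pos = resolve_column(header, "SourceId")  # column given by name
dst_pos = resolve_column(header, 3)  # column given by 1-based index
edges = [(row[src_pos], row[dst_pos]) for row in records]
```

Index 3 refers to the third column, DestId, because counting starts at 1 rather than 0.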

class pypgx.api.GraphAlterationBuilder(java_graph_alteration_builder, session)

Bases: object

Builder to describe the alterations (graph schema modification) to perform to a graph.

It is for example possible to add or remove vertex and edge providers.

Parameters: session (PgxSession) –

add_edge_provider(path_to_edge_provider_config)

Add an edge provider for which the configuration is in a file at the specified path.

Parameters: path_to_edge_provider_config (str) – the path to the JSON configuration of the edge provider
Returns: a GraphAlterationBuilder instance containing the added edge provider.
Return type: GraphAlterationBuilder

add_empty_edge_provider(provider_name, source_provider, dest_provider, label=None, key_type=None, key_column=None, create_key_mapping=None, properties=None)

Add an empty edge provider

Parameters

provider_name (str) – the name of the edge provider to add
source_provider (str) – the name of the vertex provider for the source of the edges
dest_provider (str) – the name of the vertex provider for the destination of the edges
label (Optional[str]) – the label to associate to the provider
key_type (Optional[str]) – the key type
key_column (Optional[Union[int, str]]) – the key column name or index
create_key_mapping (Optional[bool]) – boolean indicating if the provider key mapping should be created
properties (Optional[List[Union[Tuple[str, str], Tuple[str, str, int], GraphPropertyConfig]]]) – the property configurations, which can take any of the following forms: a length 2 tuple (name, type), a length 3 tuple (name, type, dimension) or a GraphPropertyConfig object

Returns

Return type

None

add_empty_vertex_provider(provider_name, label=None, key_type=None, key_column=None, create_key_mapping=None, properties=None)

Add an empty vertex provider

Parameters

provider_name (str) – the name of the vertex provider to add
label (Optional[str]) – the label to associate to the provider
key_type (Optional[str]) – the key type
key_column (Optional[Union[int, str]]) – the key column name or index
create_key_mapping (Optional[bool]) – boolean indicating if the provider key mapping should be created
properties (Optional[List[Union[Tuple[str, str], Tuple[str, str, int], GraphPropertyConfig]]]) – the property configurations, which can take any of the following forms: a length 2 tuple (name, type), a length 3 tuple (name, type, dimension) or a GraphPropertyConfig object

Returns

Return type

None

add_vertex_provider(path_to_vertex_provider_config)

Add a vertex provider for which the configuration is in a file at the specified path.

Parameters: path_to_vertex_provider_config (str) – the path to the JSON configuration of the vertex provider.
Returns: a GraphAlterationBuilder instance with the added vertex provider.
Return type: GraphAlterationBuilder

build(new_graph_name=None)

Create a new graph that is the result of the alteration of the current graph.

Parameters: new_graph_name (Optional[str]) – name of the new graph to create.
Returns: a PgxGraph instance of the current alteration builder.
Return type: PgxGraph

build_new_snapshot()

Create a new snapshot for the current graph that is the result of the alteration of the current snapshot.

Returns: a PgxGraph instance of the current alteration builder.
Return type: PgxGraph

cascade_edge_provider_removals(cascade_edge_provider_removals)

Specify whether the edge providers associated with a vertex provider being removed (that is, the edge providers for which it is the source or destination provider) should be removed automatically as well. By default, edge providers are not automatically removed when an associated vertex provider is removed. In that setting, if the associated edge providers are not explicitly removed, an exception will be thrown.

Parameters: cascade_edge_provider_removals (bool) – whether or not to automatically remove associated edge providers of removed vertex providers.
Returns: a GraphAlterationBuilder instance with new changes.
Return type: GraphAlterationBuilder

remove_edge_provider(edge_provider_name)

Remove the edge provider that has the given name.

Parameters: edge_provider_name (str) – the name of the provider to remove.
Returns: a GraphAlterationBuilder instance with the edge_provider removed.
Return type: GraphAlterationBuilder

remove_vertex_provider(vertex_provider_name)

Remove the vertex provider that has the given name. Also removes the associated edge providers if True was specified when calling cascade_edge_provider_removals(boolean).

Parameters: vertex_provider_name (str) – the name of the provider to remove.
Returns: a GraphAlterationBuilder instance with the vertex_provider removed.
Return type: GraphAlterationBuilder
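
The interplay between cascade_edge_provider_removals() and remove_vertex_provider() can be sketched as follows. This is a minimal illustration, assuming alteration_builder is an already obtained GraphAlterationBuilder; the helper name drop_vertex_provider and its arguments are hypothetical and not part of pypgx.

```python
def drop_vertex_provider(alteration_builder, provider_name, new_graph_name=None):
    """Remove a vertex provider together with its associated edge providers.

    `alteration_builder` is assumed to be an existing GraphAlterationBuilder;
    this helper and its argument names are illustrative, not part of pypgx.
    """
    # Opt in to cascading first. Without it, leaving the associated edge
    # providers in place causes an exception to be thrown.
    alteration_builder.cascade_edge_provider_removals(True)
    alteration_builder.remove_vertex_provider(provider_name)
    # build() creates a new graph; build_new_snapshot() would instead create
    # a new snapshot of the current graph.
    return alteration_builder.build(new_graph_name)
```

Passing None as the graph name simply forwards the default of build().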

set_data_source_version(data_source_version)

Set the version information for the built graph or snapshot.

Parameters: data_source_version (str) – the version information.
Return type: None

class pypgx.api.GraphBuilder(session, java_graph_builder, id_type)

Bases: object

A graph builder for constructing a PgxGraph.

Parameters

session (PgxSession) –
id_type (str) –

add_edge(src, dst, edge_id=None)

Add an edge with the given edge ID and the given source and destination vertices.

Parameters

src (Union[VertexBuilder, str, int]) – Source VertexBuilder or ID
dst (Union[VertexBuilder, str, int]) – Destination VertexBuilder or ID
edge_id (Optional[int]) – the ID of the new edge

Returns

An ‘EdgeBuilder’ instance containing the added edge.

Return type

EdgeBuilder

add_vertex(vertex=None)

Add the vertex with the given id to the graph builder.

If the vertex doesn’t exist, it is added; if it already exists, a builder for that vertex is returned. Throws an UnsupportedOperationException if the vertex ID generation strategy is set to IdGenerationStrategy.AUTO_GENERATED.

Parameters: vertex (Optional[Union[int, str]]) – The ID of the new vertex
Returns: A VertexBuilder instance
Return type: VertexBuilder

build(name=None)

Parameters: name (Optional[str]) – The new name of the graph. If None, a name is generated.
Returns: PgxGraph object
Return type: PgxGraph
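
A typical add_vertex() / add_edge() / build() sequence might look like the sketch below. It assumes builder is an existing GraphBuilder and that vertex IDs are supplied explicitly; the helper name build_path_graph is hypothetical.

```python
def build_path_graph(builder, vertex_ids, graph_name=None):
    """Add the given vertices and connect consecutive ones with an edge.

    `builder` is assumed to be an existing GraphBuilder and `vertex_ids` a
    list of IDs; the helper itself is illustrative, not part of pypgx.
    """
    for vertex_id in vertex_ids:
        # Adds the vertex, or returns its builder if it was already added.
        builder.add_vertex(vertex_id)
    for src, dst in zip(vertex_ids, vertex_ids[1:]):
        # No edge_id is passed, so it keeps its default of None.
        builder.add_edge(src, dst)
    # With graph_name=None, a name is generated for the new graph.
    return builder.build(graph_name)
```

For the list [1, 2, 3] this adds three vertices and the edges 1 -> 2 and 2 -> 3 before building.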

get_config_parameter(parameter)

Retrieve the value for the given config parameter

Parameters: parameter (str) – the config parameter to get the value for
Returns: the value for the given config parameter
Return type: Union[bool, str]

reset_edge(edge)

Reset any change for the given edge.

Parameters: edge (Union[EdgeBuilder, str, int]) – The id or the EdgeBuilder object to reset
Returns: self

reset_vertex(vertex)

Reset any change for the given vertex.

Parameters: vertex (Union[VertexBuilder, str, int]) – The id or the vertexBuilder object to reset
Returns: self

set_config_parameter(parameter, value)

Set the given configuration parameter to the given value

Parameters

parameter (str) – the config parameter to set
value (Union[bool, str]) – the new value for the config parameter

Return type

None

set_data_source_version(version)

Set the version information for the built graph or snapshot.

Parameters: version (str) –
Return type: None

set_retain_edge_ids(retain_edge_ids)

Control whether the edge ids provided in this graph builder are retained in the final graph. If True, retain the edge ids; if False, use internally generated edge ids.

Parameters: retain_edge_ids (bool) – Whether or not to retain edge ids
Returns: self
Return type: GraphBuilder

set_retain_ids(retain_ids)

Control for both vertex and edge ids whether to retain them in the final graph.

Parameters: retain_ids (bool) – Whether or not to retain vertex and edge ids
Returns: self
Return type: GraphBuilder

set_retain_vertex_ids(retain_vertex_ids)

Control whether the vertex ids provided in this graph builder are retained in the final graph. If True, retain the vertex ids; if False, use internally generated vertex ids of type Integer.

Parameters: retain_vertex_ids (bool) – Whether or not to retain vertex ids
Returns: self
Return type: GraphBuilder

class pypgx.api.GraphChangeSet(session, java_graph_change_set, id_type='integer')

Bases: GraphBuilder

Class which stores changes of a particular graph.

Changed in version 22.3: The parameter names of the add_edge() method are now src, dst, and edge_id, like in the superclass GraphBuilder.

Changed in version 22.4: Like in the superclass GraphBuilder, the parameter names are now as follows:

add_vertex() method parameters: vertex.
reset_edge() method parameters: edge.

Parameters

session (PgxSession) –
id_type (str) –

build_new_snapshot()

Build a new snapshot of the graph out of this GraphChangeSet.

The resulting PgxGraph is a new snapshot of the PgxGraph object this was created from.

Returns: A new object of type ‘PgxGraph’
Return type: PgxGraph

remove_edge(edge_id)

Remove an edge from the graph.

Parameters: edge_id (int) – The edge id of the edge to remove.
Returns: self
Return type: GraphChangeSet

remove_vertex(vertex_id)

Remove a vertex from the graph.

Parameters: vertex_id (Union[int, str]) – The vertex id of the vertex to remove.
Returns: self
Return type: GraphChangeSet
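
As a rough sketch of how removals are collected and then published, the helper below queues edge and vertex removals on a change set and builds a new snapshot. It assumes change_set is an existing GraphChangeSet; the helper name remove_elements is hypothetical.

```python
def remove_elements(change_set, vertex_ids=(), edge_ids=()):
    """Queue removals on a change set and publish them as a new snapshot.

    `change_set` is assumed to be an existing GraphChangeSet; this helper is
    illustrative and not part of pypgx.
    """
    for edge_id in edge_ids:
        change_set.remove_edge(edge_id)
    for vertex_id in vertex_ids:
        change_set.remove_vertex(vertex_id)
    # The result is a new snapshot of the graph the change set was created from.
    return change_set.build_new_snapshot()
```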

set_add_existing_edge_policy(add_existing_edge_policy)

Set the policy on what to do when an edge is added that already exists

Parameters: add_existing_edge_policy (str) – the new policy
Returns: this graph builder
Return type: GraphChangeSet

set_add_existing_vertex_policy(add_existing_vertex_policy)

Set the policy on what to do when a vertex is added that already exists

Parameters: add_existing_vertex_policy (str) – the new policy
Returns: this graph builder
Return type: GraphChangeSet

set_invalid_change_policy(invalid_change_policy)

Set the policy on what to do when an invalid action is added

Parameters: invalid_change_policy (str) – the new policy
Returns: this graph builder
Return type: GraphChangeSet

set_required_conversion_policy(required_conversion_policy)

Set the policy on what to do when an invalid type is encountered

Parameters: required_conversion_policy (str) – the new policy
Returns: this graph builder
Return type: GraphChangeSet

update_edge(edge_id)

Return an ‘EdgeModifier’ with which you can update edge properties and the edge label.

Parameters: edge_id (int) – The edge id of the edge to be updated
Returns: An ‘EdgeModifier’
Return type: EdgeModifier

update_vertex(vertex_id)

Return a ‘VertexModifier’ with which you can update vertex properties.

Parameters: vertex_id (Union[int, str]) – The vertex id of the vertex to be updated
Returns: A ‘VertexModifier’
Return type: VertexModifier

class pypgx.api.GraphConfig(java_graph_config)

Bases: object

A class for representing graph configurations.

Variables

is_file_format (bool) – whether the format is a file-based format
has_vertices_and_edges_separated_file_format (bool) – whether given format has vertices and edges separated in different files
is_single_file_format (bool) – whether given format has vertices and edges combined in same file
is_multiple_file_format (bool) – whether given format has vertices and edges separated in different files
supports_edge_label (bool) – whether given format supports edge label
supports_vertex_labels (bool) – whether given format supports vertex labels
supports_vector_properties (bool) – whether given format supports vector properties
supports_property_column (bool) – whether given format supports property columns
name (str) – the graph name of this graph configuration. Note: for file-based graph configurations, this is the file name of the URI this configuration points to.
num_vertex_properties (int) – the number of vertex properties in this graph configuration
num_edge_properties (int) – the number of edge properties in this graph configuration
format (str) – Graph data format. The possible formats are in the table below.

Format string	Description
PGB	PGX Binary File Format (formerly EBIN)
EDGE_LIST	Edge List file format
TWO_TABLES	Two-Tables format (vertices, edges)
ADJ_LIST	Adjacency List File Format
FLAT_FILE	Flat File Format
GRAPHML	GraphML File Format
PG	Property Graph (PG) Database Format
RDF	Resource Description Framework (RDF) Database Format
CSV	Comma-Separated Values (CSV) Format

can_serialize()

Get the serializable property of this config.

Returns: True if it is serializable, False otherwise.
Return type: bool

property edge_id_type: Optional[str]

Get the type of the edge ID.

Returns: a str indicating the type of the edge ID, one of “integer”, “long” or “string”, or None

property edge_property_types: Dict[str, str]

Get the edge property types as a dictionary.

Returns: dict mapping property names to their types

property edge_props: List[str]

Get the edge property names as a list.

get_array_compaction_threshold()

Get the array compaction threshold.

For graphs optimized for updates, the value corresponds to the ratio at which the delta-logs are compacted into new arrays.

Returns: the compaction threshold
Return type: float

get_attributes()

Get the specific additional attributes needed to read/write the graph data.

Returns: the map of attributes
Return type: Dict[Any, Any]

get_config_fields()

Get the fields of the graph config.

Returns: the fields of the graph config
Return type: List[str]

get_edge_id_strategy()

Get the ID strategy that should be used for the edges of this graph. If not specified (or set to null), the strategy will be determined during loading or using a default value.

Returns: the edge ID strategy
Return type: Optional[str]

get_edge_property_default(i)

Get the default value of an edge property by index.

Parameters: i (int) – the 0-based index of the edge property
Returns: the default value of the edge property
Return type: Any

get_edge_property_dimension(i)

Get the dimension of an edge property by index.

Parameters: i (int) – the 0-based index of the edge property
Returns: the dimension of the edge property
Return type: int

get_edge_property_name(i)

Get the name of an edge property by index.

Parameters: i (int) – the 0-based index of the edge property
Returns: the name of the edge property
Return type: str

get_edge_property_type(i)

Get the type of an edge property by index.

Parameters: i (int) – the 0-based index of the edge property
Returns: the type of the edge property, can be “integer”, “long”, “string”, etc.
Return type: str
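
The index-based getters can be combined with num_edge_properties to list every edge property. The following is a minimal sketch, assuming config is an existing GraphConfig; the helper name describe_edge_properties and the dictionary keys are hypothetical.

```python
def describe_edge_properties(config):
    """Collect name, type and dimension for every edge property of a config.

    `config` is assumed to be an existing GraphConfig; this helper and the
    keys of the returned dictionaries are illustrative, not part of pypgx.
    """
    return [
        {
            "name": config.get_edge_property_name(i),
            "type": config.get_edge_property_type(i),
            "dimension": config.get_edge_property_dimension(i),
        }
        for i in range(config.num_edge_properties)  # indices are 0-based
    ]
```

The same pattern applies to vertex properties through num_vertex_properties and the get_vertex_property_* methods.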

get_edge_props()

Get the edge properties

Return type: List[GraphPropertyConfig]

get_error_handling()

Get the error handling configuration of this graph configuration.

Returns: the error handling configuration
Return type: Dict[str, Any]

get_external_stores()

Get the list of external stores.

Returns: the list of external stores
Return type: List[Dict[str, Any]]

get_keystore_alias()

Get the keystore alias.

Returns: the keystore alias or None if underlying format does not require a keystore
Return type: Optional[str]

get_leftover_values()

Get the values that do not belong to any field.

Returns: the values that do not belong to any field
Return type: Dict[str, Any]

get_loading()

Get the loading-specific configuration.

Returns: the loading-specific configuration
Return type: Dict[str, Any]

get_loading_filter()

Get the loading graph filter if one is present.

Filtered loading is deprecated since 22.3, use Dynamic Subgraph Loading instead.

Returns: the loading graph filter if one is present, None otherwise
Return type: Optional[GraphFilter]

get_loading_options()

Get the loading configuration of this graph configuration.

Returns: the loading configuration, as a dict
Return type: Dict[str, Any]

get_local_date_format()

Get the list of date formats to use when loading and storing local_date properties.

Please see DateTimeFormatter for documentation of the format string.

Returns: the date format
Return type: List[str]

get_optimized_for()

Indicate if the graph is optimized for reads or updates.

Returns: if the graph is optimized for reads (“read”) or updates (“updates”)
Return type: str

get_partition_while_loading()

Indicate if the graph should be partitioned during loading.

Returns: “by_label” if the graph should be partitioned during loading, “no” otherwise
Return type: str

get_time_format()

Get the list of time formats to use when loading and storing time properties.

Please see DateTimeFormatter for documentation of the format string.

Returns: the time format
Return type: List[str]

get_time_with_timezone_format()

Get the list of time with timezone formats to use when loading and storing time with timezone properties.

Please see DateTimeFormatter for documentation of the format string.

Returns: the time with timezone format
Return type: List[str]

get_timestamp_format()

Get the list of timestamp formats to use when loading and storing timestamp properties.

Please see DateTimeFormatter for documentation of the format string.

Returns: the timestamp format
Return type: List[str]

get_timestamp_with_timezone_format()

Get the list of timestamp with timezone formats to use when loading and storing timestamp with timezone properties.

Please see DateTimeFormatter for documentation of the format string.

Returns: the timestamp with timezone format
Return type: List[str]

get_validated_edge_id_strategy()

Validate and return the ID strategy used for edges (checking if the strategy is compatible with the rest of the graph configuration).

Returns: the ID strategy that can be used for the edges of the graph, one of “no_ids”, “keys_as_ids”, “unstable_generated_ids”, “partitioned_ids”
Return type: str

get_validated_edge_id_type()

Validate and return the ID type used for edges (checking if the type is compatible with the rest of the configuration).

Returns: the ID type that can be used for the edges of the graph, can be “integer”, “long”, “string”, etc.
Return type: str

get_validated_vertex_id_strategy()

Validate and return the ID strategy used for vertices (checking if the strategy is compatible with the rest of the graph configuration).

Returns: the ID strategy that can be used for the vertices of the graph, one of “no_ids”, “keys_as_ids”, “unstable_generated_ids”, “partitioned_ids”
Return type: str

get_validated_vertex_id_type()

Validate and return the ID type used for vertices (checking if the type is compatible with the rest of the configuration).

Returns: the ID type that can be used for the vertices of the graph, can be “integer”, “long”, “string”, etc.
Return type: str

static get_value_from_environment(key)

Look up a value by a key from java properties and the system environment.

Looks up the provided key first in the java system properties prefixed with SYSTEM_PROPERTY_PREFIX and returns the value if present. If it is not present, looks it up in the system environment prefixed with ENV_VARIABLE_PREFIX and returns this one if present. Returns None if the key is neither found in the properties nor in the environment.

Parameters: key (str) – the key to look up
Returns: the found value or None if the key is not available
Return type: Optional[str]
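
The lookup order described above can be illustrated in plain Python. This is only a sketch of the documented order, not the library's implementation: the function name lookup_value is hypothetical, the two mappings stand in for the Java system properties and the system environment, and joining prefix and key by simple concatenation is an assumption.

```python
def lookup_value(key, properties, environment, property_prefix, env_prefix):
    """Mimic the documented lookup order for a configuration key.

    `properties` and `environment` are plain mappings standing in for the
    Java system properties and the system environment. The prefixes are
    passed in because their actual values are not stated here.
    """
    prefixed_property = property_prefix + key  # assumption: plain concatenation
    if prefixed_property in properties:
        # Properties take precedence over the environment.
        return properties[prefixed_property]
    # Fall back to the environment; None if the key is absent there too.
    return environment.get(env_prefix + key)
```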

get_values()

Return values of class

Returns: values
Return type: Dict[Any, Any]

get_values_without_defaults()

Return values without defaults.

Returns: values
Return type: Dict[Any, Any]

get_vector_component_delimiter()

Get delimiter for the different components of vector properties.

Returns: the delimiter
Return type: str

get_vertex_id_strategy()

Get the ID strategy that should be used for the vertices of this graph. If not specified (or set to null), the strategy will be automatically detected.

Returns: the vertex id strategy
Return type: Optional[str]

get_vertex_property_default(i)

Get the default value of a vertex property by index.

Parameters: i (int) – the 0-based index of the vertex property
Returns: the default value of the vertex property
Return type: Any

get_vertex_property_dimension(i)

Get the dimension of a vertex property by index.

Parameters: i (int) – the 0-based index of the vertex property
Returns: the dimension of the vertex property
Return type: int

get_vertex_property_name(i)

Get the name of a vertex property by index.

Parameters: i (int) – the 0-based index of the vertex property
Returns: the name of the vertex property
Return type: str

get_vertex_property_type(i)

Get the type of a vertex property by index.

Parameters: i (int) – the 0-based index of the vertex property
Returns: the type of the vertex property, can be “integer”, “long”, “string”, etc.
Return type: str

get_vertex_props()

Get the vertex properties

Return type: List[GraphPropertyConfig]

has_default_value(field)

Check if field has a default value.

Parameters: field (str) – the field
Returns: True, if value for given field is the default value
Return type: bool

is_edge_label_loading_enabled()

Check if edge label loading is enabled.

Returns: True if edge label loading is enabled, False otherwise.
Return type: bool

is_empty()

Check if it’s empty.

Returns: True if it’s empty
Return type: bool

is_load_edge_keys()

Whether to load edge keys.

Returns: True if we should load the edge keys.
Return type: bool

is_load_vertex_keys()

Whether to load vertex keys.

Returns: True if we should load the vertex keys.
Return type: bool

is_vertex_labels_loading_enabled()

Check if vertex labels loading is enabled.

Returns: True if vertex label loading is enabled, False otherwise.
Return type: bool

skip_edge_loading()

Whether to skip edge loading.

Returns: True if we should skip edge loading.
Return type: bool

skip_vertex_loading()

Whether to skip vertex loading.

Returns: True if we should skip vertex loading.
Return type: bool

property vertex_id_type: Optional[str]

Get the type of the vertex ID.

Returns: a str indicating the type of the vertex ID, one of “integer”, “long” or “string”, or None

property vertex_property_types: Dict[str, str]

Get the vertex property types as a dictionary.

Returns: dict mapping property names to their types

property vertex_props: List[str]

Get the vertex property names as a list.

class pypgx.api.GraphConfigFactory(java_graph_config_factory, graph_config_class)

Bases: Generic[_GraphConfig]

A factory class for creating graph configs.

static for_any_format()

Return a new factory to parse any graph config from various input sources.

Return type: GraphConfigFactory[GraphConfig]

static for_file_formats()

Return a new graph config factory to parse file-based graph configs from various input sources.

Return type: GraphConfigFactory[FileGraphConfig]

static for_partitioned()

Return a new graph config factory to parse partitioned graph config.

Return type: GraphConfigFactory[PartitionedGraphConfig]

static for_property_graph_hbase()

Return a new graph config factory to create graph configs targeting the Apache HBase database in the property graph format.

Return type: GraphConfigFactory[PgHbaseGraphConfig]

static for_property_graph_nosql()

Return a new graph config factory to create graph configs targeting the Oracle NoSQL database in the property graph format.

Return type: GraphConfigFactory[PgNosqlGraphConfig]

static for_property_graph_rdbms()

Return a new graph config factory to create graph configs targeting the Oracle RDBMS database in the property graph format.

Return type: GraphConfigFactory[PgRdbmsGraphConfig]

static for_rdf()

Return a new RDF graph config factory.

Return type: GraphConfigFactory[RdfGraphConfig]

static for_two_tables_rdbms()

Return a new graph config factory to create graph configs targeting the Oracle RDBMS database in the two-tables format.

Return type: GraphConfigFactory[TwoTablesRdbmsGraphConfig]

static for_two_tables_text()

Return a new graph config factory to create graph configs targeting files in the two-tables format.

Return type: GraphConfigFactory[TwoTablesTextGraphConfig]

from_file_path(path)

Parse a configuration object given as path to a JSON file.

Relative paths found in JSON are resolved relative to given file.

Parameters: path (str) – The path to the JSON file
Return type: _GraphConfig

from_input_stream(stream)

Parse a configuration object given an input stream.

Parameters: stream – A JAVA ‘InputStream’ object from where to read the configuration
Return type: _GraphConfig

from_json(json)

Parse a configuration object given a JSON string.

Parameters: json (str) – The input JSON string
Return type: _GraphConfig
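
When a configuration is assembled in Python rather than read from a file, it can be serialized and passed to from_json(). The sketch below assumes factory is an existing GraphConfigFactory (for example one returned by for_any_format()); the helper name config_from_dict is hypothetical, and which keys the dictionary must contain depends on the graph format and is not specified here.

```python
import json


def config_from_dict(factory, settings):
    """Serialize a plain dict to JSON and hand it to a GraphConfigFactory.

    `factory` is assumed to be an existing GraphConfigFactory; this helper is
    illustrative and not part of pypgx.
    """
    # from_json() expects a JSON string, not a Python dict.
    return factory.from_json(json.dumps(settings))
```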

from_path(path)

Parse a configuration object given a path.

Parameters: path (str) – The path from where to parse the configuration.
Return type: _GraphConfig

from_properties(properties)

Parse a configuration object from a properties object.

Parameters: properties – A JAVA ‘Properties’ object
Return type: _GraphConfig

static init(want_strict_mode=True)

Setter function for the ‘strictMode’ class variable.

Parameters: want_strict_mode (bool) – A boolean value which will be assigned to ‘strictMode’ (Default value = True)
Return type: None

class pypgx.api.GraphDelta(java_graph_delta)

Bases: object

Represents a delta since the last synchronization operation

property num_added_edges: int

Get the number of added edges

Returns: number of added edges
Return type: int

property num_added_vertices: int

Get the number of added vertices

Returns: number of added vertices
Return type: int

property num_removed_edges: int

Get the number of removed edges

Returns: number of removed edges
Return type: int

property num_removed_vertices: int

Get the number of removed vertices

Returns: number of removed vertices
Return type: int

property num_updated_edges: int

Get the number of updated edges

Returns: number of updated edges
Return type: int

property num_updated_vertices: int

Get the number of updated vertices

Returns: number of updated vertices
Return type: int

property total_num_changes: int

Get the total number of changes

Returns: total number of changes
Return type: int
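
For logging or monitoring, the counters of a GraphDelta can be gathered into a single dictionary. This is a minimal sketch, assuming delta is an existing GraphDelta; the helper name summarize_delta is hypothetical.

```python
DELTA_COUNTERS = (
    "num_added_vertices",
    "num_removed_vertices",
    "num_updated_vertices",
    "num_added_edges",
    "num_removed_edges",
    "num_updated_edges",
    "total_num_changes",
)


def summarize_delta(delta):
    """Read all counters of a GraphDelta into a plain dictionary.

    `delta` is assumed to be an existing GraphDelta; this helper is
    illustrative and not part of pypgx.
    """
    return {name: getattr(delta, name) for name in DELTA_COUNTERS}
```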

class pypgx.api.GraphMetaData(java_graph_meta_data=None, vertex_id_type=None, edge_id_type=None)

Bases: object

Meta information about PgxGraph.

Parameters

vertex_id_type (Optional[str]) –
edge_id_type (Optional[str]) –

get_config()

Get the graph configuration object used to specify the data source of this graph.

Returns: The ‘GraphConfig’ object of this ‘GraphMetaData’ object.
Return type: Optional[GraphConfig]

get_creation_request_timestamp()

Get the timestamp (milliseconds since Jan 1st 1970) when this graph was requested to be created.

Returns: A long value containing the timestamp.
Return type: int

get_creation_timestamp()

Get the timestamp (milliseconds since Jan 1st 1970) when this graph finished creation.

Returns: A long value containing the timestamp.
Return type: int
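
Because both timestamps are expressed in milliseconds since Jan 1st 1970, they can be converted to Python datetime objects and subtracted to see how long graph creation took. The sketch below assumes meta_data is an existing GraphMetaData and that the epoch is meant in UTC; the helper name creation_times is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the timestamps count from the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def creation_times(meta_data):
    """Return (requested, created, duration) for a graph's metadata.

    `meta_data` is assumed to be an existing GraphMetaData; this helper is
    illustrative and not part of pypgx.
    """
    requested_ms = meta_data.get_creation_request_timestamp()
    created_ms = meta_data.get_creation_timestamp()
    # Adding a timedelta keeps millisecond precision exact.
    requested = EPOCH + timedelta(milliseconds=requested_ms)
    created = EPOCH + timedelta(milliseconds=created_ms)
    return requested, created, created - requested
```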

get_data_source_version()

Get the format-specific version identifier provided by the data-source.

Returns: A string containing the version.
Return type: str

get_edge_id_strategy()

Get the ID strategy used for the edges of this graph.

Returns: the edge id strategy.
Return type: str

get_edge_id_type()

Get the edge ID type of this graph.

Returns: the edge id type.
Return type: str

get_edge_providers_meta_data()

Get the edge tables metadata.

Returns: the edge tables metadata
Return type: Mapping[str, EdgeProviderMetaData]

get_main_edge_provider_meta_data()

Get the main edge table metadata. This is only valid for non-partitioned graphs.

Returns: the main edge table metadata
Return type: EdgeProviderMetaData

get_main_vertex_provider_meta_data()

Get the main vertex table metadata. This is only valid for non-partitioned graphs.

Returns: the main vertex table metadata
Return type: VertexProviderMetaData

get_memory_mb()

Get the estimated amount of memory (in megabytes) that this graph, including its properties, consumes.

Returns: A long value containing the estimated amount of memory.
Return type: int

get_num_edges()

Get the number of edges.

Returns: A long value containing the number of edges.
Return type: int

get_num_vertices()

Get the number of vertices.

Returns: A long value containing the number of vertices.
Return type: int

get_vertex_id_strategy()

Get the ID strategy used for the vertices of this graph.

Returns: the vertex id strategy.
Return type: str

get_vertex_id_type()

Get the vertex ID type of this graph.

Returns: the vertex id type.
Return type: str

get_vertex_providers_meta_data()

Get the vertex tables metadata.

Returns: the vertex tables metadata
Return type: Mapping[str, VertexProviderMetaData]

hash_code()

Return the hash code of this object.

Returns: An int value containing the hash code.
Return type: int

is_directed()

Return if the graph is directed.

Returns: ‘True’ if the graph is directed and ‘False’ otherwise.
Return type: bool

is_graph_pinned()

Return if the graph is pinned.

Returns: ‘True’ if the graph is pinned and ‘False’ otherwise.
Return type: bool

is_partitioned()

Return if the graph is partitioned or not.

Returns: ‘True’ if the graph is partitioned and ‘False’ otherwise.
Return type: bool

is_snapshot_pinned()

Return if the snapshot is pinned.

Returns: ‘True’ if the snapshot is pinned and ‘False’ otherwise.
Return type: bool

set_config(config)

Set a new ‘GraphConfig’.

Parameters: config (GraphConfig) – An object of type ‘GraphConfig’.
Return type: None

set_creation_request_timestamp(creation_request_timestamp)

Set a new creation-request timestamp.

Parameters: creation_request_timestamp (int) – A long value containing the new creation-request timestamp.
Return type: None

set_creation_timestamp(creation_timestamp)

Set a new creation timestamp.

Parameters: creation_timestamp (int) – A long value containing the new creation timestamp.
Return type: None

set_data_source_version(data_source_version)

Set a new data source version.

Parameters: data_source_version (str) – A string containing the new version.
Return type: None

set_directed(directed)

Assign a new truth value to the ‘directed’ variable.

Parameters: directed (bool) – A boolean value
Return type: None

set_edge_id_strategy(id_strategy)

Set the ID strategy used for the edges of this graph.

Parameters: id_strategy (str) – the new edge id strategy
Return type: None

set_edge_providers_meta_data(edge_provider_meta_data_map)

Set the edge tables metadata.

Parameters: edge_provider_meta_data_map (Mapping[str, EdgeProviderMetaData]) – the edge tables metadata
Return type: None

set_graph_pinned(pinned)

Set whether the graph is pinned or not.

Parameters: pinned (bool) – ‘True’ to pin the graph, ‘False’ otherwise
Return type: None

set_memory_mb(memory_mb)

Set a new amount of memory usage.

Parameters: memory_mb (int) – A long value containing the new amount of memory.
Return type: None

set_num_edges(num_edges)

Set a new number of edges.

Parameters: num_edges (int) – A long value containing the new number of edges.
Return type: None

set_num_vertices(num_vertices)

Set a new number of vertices.

Parameters: num_vertices (int) – A long value containing the new number of vertices.
Return type: None

set_snapshot_pinned(snapshot_pinned)

Set whether the snapshot is pinned or not.

Parameters: snapshot_pinned (bool) – ‘True’ to pin the snapshot, ‘False’ otherwise
Return type: None

set_vertex_id_strategy(id_strategy)

Set the ID strategy used for the vertices of this graph.

Parameters: id_strategy (str) – the new vertex id strategy
Return type: None

set_vertex_providers_meta_data(vertex_provider_meta_data_map)

Set the vertex tables metadata.

Parameters: vertex_provider_meta_data_map (Mapping[str, VertexProviderMetaData]) – the vertex tables metadata
Return type: None

class pypgx.api.GraphPropertyConfig(name, property_type, dimension=None, formats=None, default=None, column=None, stores=None, max_distinct_strings_per_pool=None, string_pooling_strategy=None, aggregate=None, field=None, group_key=None, drop_after_loading=None)

Bases: object

A class for representing graph property configurations.

Parameters

name (str) –
property_type (str) –
dimension (Optional[int]) –
formats (Optional[List[str]]) –
default (Optional[Any]) –
column (Optional[Union[int, str]]) –
stores (Optional[List[Mapping[str, str]]]) –
max_distinct_strings_per_pool (Optional[int]) –
string_pooling_strategy (Optional[str]) –
aggregate (Optional[str]) –
field (Optional[str]) –
group_key (Optional[str]) –
drop_after_loading (Optional[bool]) –

get_aggregate()

Get the aggregation function to use. Aggregation always happens by vertex key.

Returns: Aggregation function to use.
Return type: Optional[str]

get_column()

Get name or index (starting from 1) of the column holding the property data. If it is not specified, the loader will try to use the property name as column name (for CSV format only).

Returns: Column of property.
Return type: Optional[Union[int, str]]

static get_config_fields()

Return the config fields

Returns: Collection of config fields.
Return type: Collection[str]

get_default()

Get the default value to be assigned to this property if the data source does not provide it. For the date type, the string is expected to be formatted as yyyy-MM-dd HH:mm:ss. If no default is present, non-existent properties will contain default Java types (primitives), an empty string, or 01.01.1970 00:00.

Returns: Default of property.
Return type: Any

get_dimension()

Get dimension of property

Returns: Dimension of property.
Return type: int

get_field()

Get the name of the JSON field holding the property data. Nesting is denoted by dot-separation. Field names containing dots are possible; in this case, the dots need to be escaped using backslashes to resolve ambiguities. Only the exactly specified objects are loaded; if they do not exist, the default value is used.

Returns: Name of the JSON field.
Return type: Optional[str]
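
The dot-separated notation with backslash-escaped dots can be illustrated with a small pure-Python resolver. This is a simplified sketch of the convention described above, not the loader's actual parser: the function names split_field_path and resolve_field are hypothetical, and escaped backslashes are deliberately not handled.

```python
import re


def split_field_path(field):
    """Split a field path on dots that are not escaped with a backslash."""
    parts = re.split(r"(?<!\\)\.", field)
    # Turn each escaped dot back into a literal dot inside the key.
    return [part.replace("\\.", ".") for part in parts]


def resolve_field(document, field, default=None):
    """Walk a nested dict along the field path, falling back to `default`."""
    current = document
    for key in split_field_path(field):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

With this sketch, the path address.geo\.point.lat is read as the keys address, geo.point and lat.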

get_format()

Get list of formats of property

Returns: List of formats of property.
Return type: List[str]

get_group_key()

Can only be used if the property / key is part of the grouping expression.

Returns: Group key.
Return type: Optional[str]

get_max_distinct_strings_per_pool()

Get the number of distinct strings per property after which to stop pooling. If the limit is reached, an exception is thrown. If set to null, the default value from the global PGX configuration will be used.

Returns: Amount of distinct strings per property after which to stop pooling.
Return type: Optional[int]

get_name()

Get name of property.

Returns: Name of property.
Return type: str

get_parsed_default_value()

Get the parsed default value, guaranteed to match the property type (with the exception of type node/edge). If no default is specified, the default value for the property type is returned.

Returns: The parsed default value.
Return type: Any

get_source_column()

Return column if indicated, otherwise return the property name.

Returns: The source column.
Return type: Optional[Union[int, str]]

get_stores()

Get list of storage identifiers that indicate where this property resides.

Returns: List of storage identifiers.
Return type: List[Mapping[str, str]]

get_string_pooling_strategy()

Get which string pooling strategy to use. If set to null, the default value from the global PGX configuration will be used.

Returns: String pooling strategy to use.
Return type: Optional[str]

get_type()

Get type of property

Returns: Type of property.
Return type: Optional[str]

static get_value_from_environment(key)

Look up a value by a key from java properties and the system environment. Looks up the provided key first in the java system properties prefixed with SYSTEM_PROPERTY_PREFIX and returns the value if present. If it is not present, looks it up in the system environment prefixed with ENV_VARIABLE_PREFIX and returns this one if present. Returns None if the key is neither found in the properties nor in the environment.

Parameters: key (str) – The key to look up.
Returns: The found value or None if the key is not available.
Return type: Optional[str]

has_default_value(field)

Check if field has a default value.

Parameters: field (str) – The field.
Raises: ValueError – If the field is not a part of the graph property config fields.
Returns: True, if value for given field is the default value.
Return type: bool

is_drop_after_loading()

Whether this is a helper property that is only used for aggregation and is dropped after loading.

Returns: True, if helper properties are dropped after loading.
Return type: bool

is_empty()

Check if it’s empty.

Returns: True, if the Map ‘values’ is empty.
Return type: bool

is_external()

Whether the property is external

Returns: True if the property is external, False otherwise.
Return type: bool

is_in_memory()

Whether the property is in memory

Returns: True if the property is in memory, False otherwise.
Return type: bool

is_string_pool_enabled()

Whether the string pool is enabled

Returns: True if the string pool is enabled, False otherwise.
Return type: bool

set_serializable(serializable)

Set this config to be serializable

Parameters: serializable (bool) – True if serializable, False otherwise.
Return type: None

class pypgx.api.IoEnvironment(java_io_environment)

Bases: object

A sub environment for IO tasks

get_num_threads_per_task()

Get the number of threads per task.

Returns: the number of threads per task.
Return type: int

get_relevant_fields()

Get the relevant fields of the IO environment.

Returns: the relevant fields of the IO environment.
Return type: List[str]

get_values()

Return values of class

Returns: values
Return type: Dict[Any, Any]

reset()

Reset environment

Return type: None

set_num_threads_per_task(num_threads_per_task)

Set the number of threads per task.

Parameters: num_threads_per_task (int) – the number of threads per task.
Return type: None

class pypgx.api.MatrixFactorizationModel(graph, java_mfm, features)

Bases: object

Object that holds the state for repeatedly returning estimated ratings.

Parameters: graph (PgxGraph) –

get_estimated_ratings(v)

Return estimated ratings for a specific vertex.

Parameters: v (Union[PgxVertex, str, int]) – The vertex to get estimated ratings for.
Returns: The VertexProperty containing the estimated ratings.
Return type: VertexProperty

property root_mean_square_error: float

Get the root mean square error of the model.

Returns: The root mean square error.

class pypgx.api.MergingStrategyBuilder(java_mutation_strategy_builder)

Bases: MutationStrategyBuilder

A class for defining a merging strategy on a graph.

set_keep_user_defined_edge_keys(keep_user_defined_edge_keys)

If set to True, the user-defined edge keys are kept as far as possible.

If multiple edges A and B are merged into one edge, a new key is generated for this edge.

Parameters: keep_user_defined_edge_keys (bool) – whether to keep user-defined edge keys
Returns: the MergingStrategyBuilder itself
Return type: MergingStrategyBuilder

set_label_merging_strategy(label_merging_function)

Define a merging function for the edge labels.

By default (without calling this), the labels will be merged using the “max” function.

Parameters: label_merging_function (str) – available functions are: “min” and “max”
Returns: the MergingStrategyBuilder itself
Return type: MergingStrategyBuilder

set_property_merging_strategy(prop, merging_function)

Define a merging function for the given edge property.

All properties for which no merging_function was defined will be merged using the “max” function.

This strategy can be used to merge the properties of multi-edges. PGX allows the user to define a merging_function for every property.

Parameters

prop (Union[str, PgxId, EdgeProperty]) – a property name, PgxId, or EdgeProperty
merging_function (str) – available functions are: “min”, “max”, and “sum”

Returns

the MergingStrategyBuilder itself

Return type

MergingStrategyBuilder
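
To illustrate what per-property merging of multi-edges means, the following sketch collapses parallel edges in a plain Python edge list. It is an illustrative model only and does not call pypgx: the function merge_multi_edges and the tuple-based edge representation are hypothetical, and the sketch assumes that two edges are parallel when they share the same source and destination.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# The three merging functions named in the reference above.
MERGING_FUNCTIONS: Dict[str, Callable[[List[float]], float]] = {
    "min": min,
    "max": max,
    "sum": sum,
}

# Hypothetical edge representation: (source, destination, properties).
Edge = Tuple[str, str, Dict[str, float]]


def merge_multi_edges(
    edges: List[Edge],
    strategies: Dict[str, str],
    default: str = "max",
) -> List[Edge]:
    """Collapse parallel edges, merging each property with its function."""
    grouped = defaultdict(list)
    for src, dst, props in edges:
        grouped[(src, dst)].append(props)

    merged = []
    for (src, dst), prop_dicts in grouped.items():
        names = sorted({name for props in prop_dicts for name in props})
        combined = {}
        for name in names:
            values = [props[name] for props in prop_dicts if name in props]
            # Properties without an explicit strategy use the default.
            func = MERGING_FUNCTIONS[strategies.get(name, default)]
            combined[name] = func(values)
        merged.append((src, dst, combined))
    return merged


edges = [
    ("a", "b", {"cost": 3.0, "weight": 1.0}),
    ("a", "b", {"cost": 5.0, "weight": 4.0}),
    ("b", "c", {"cost": 2.0, "weight": 2.0}),
]
# "cost" is summed; "weight" has no explicit strategy and falls back to "max".
result = merge_multi_edges(edges, {"cost": "sum"})
```

Here the two edges from a to b become one edge whose cost is the sum of both costs and whose weight is the larger of the two weights.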

class pypgx.api.MutationStrategyBuilder(java_mutation_strategy_builder)

Bases: object

A class for defining a mutation strategy on a graph.

build()

Build the MutationStrategy object with the chosen parameters.

Parameters that were not set are instantiated with default values.

Returns: a MutationStrategy instance with the chosen parameters
Return type: MutationStrategy

drop_edge_properties(edge_properties)

Set edge properties that will be dropped after the mutation.

By default (without calling this), all edge properties will be kept.

Parameters: edge_properties (List[EdgeProperty]) – list of EdgeProperty objects to drop
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

drop_edge_property(edge_property)

Set an edge property that will be dropped after the mutation.

By default (without calling this), all edge properties will be kept.

Parameters: edge_property (EdgeProperty) – EdgeProperty object to drop
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

drop_vertex_properties(vertex_properties)

Set vertex properties that will be dropped after the mutation.

By default (without calling this), all vertex properties will be kept.

Parameters: vertex_properties (List[VertexProperty]) – list of VertexProperty objects to drop
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

drop_vertex_property(vertex_property)

Set a vertex property that will be dropped after the mutation.

By default (without calling this), all vertex properties will be kept.

Parameters: vertex_property (VertexProperty) – VertexProperty object to drop
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_copy_mode(copy_mode)

Define whether the mutation should occur on the original graph or on a copy.

If set to True, the mutation will occur on the original graph without creating a new instance. If set to False, a new graph instance will be created. The default copy mode is False.

Parameters: copy_mode (bool) – whether to mutate the original graph or create a new one
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_kept_edge_properties(props_to_keep)

Set edge properties that will be kept after the mutation.

By default (without calling this), all edge properties will be kept.

Parameters: props_to_keep (List[EdgeProperty]) – list of EdgeProperty objects to keep
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_kept_vertex_properties(props_to_keep)

Set vertex properties that will be kept after the mutation.

By default (without calling this), all vertex properties will be kept.

Parameters: props_to_keep (List[VertexProperty]) – list of VertexProperty objects to keep
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_multi_edges(keep_multi_edges)

Define if multi edges should be kept in the result.

By default (without calling this), multi edges will be removed.

Parameters: keep_multi_edges (bool) – whether to keep or remove multi edges in the result
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_new_graph_name(new_graph_name)

Set a new graph name. If None, a new graph name will be generated.

Parameters: new_graph_name (Optional[str]) – a new graph name
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_self_edges(keep_self_edges)

Define if self edges should be kept in the result.

By default (without calling this), self edges will be removed.

Parameters: keep_self_edges (bool) – whether to keep or remove self edges in the result
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder

set_trivial_vertices(keep_trivial_vertices)

Define if isolated nodes should be kept in the result.

By default (without calling this), isolated nodes will be kept.

Parameters: keep_trivial_vertices (bool) – whether to keep or remove trivial vertices in the result
Returns: the MutationStrategyBuilder instance itself
Return type: MutationStrategyBuilder
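
The documented defaults (multi edges removed, self edges removed, isolated vertices kept) can be pictured with a small model on a plain edge list. This is a sketch under stated assumptions rather than the PGX algorithm: the function simplify is hypothetical, and it assumes that two edges count as multi edges only when they share the same source and the same destination.

```python
from typing import List, Set, Tuple


def simplify(
    vertices: Set[str],
    edges: List[Tuple[str, str]],
    keep_multi_edges: bool = False,
    keep_self_edges: bool = False,
    keep_trivial_vertices: bool = True,
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Apply the three structural options to a plain edge list."""
    kept = []
    seen = set()
    for src, dst in edges:
        if not keep_self_edges and src == dst:
            continue  # drop self edges
        if not keep_multi_edges and (src, dst) in seen:
            continue  # drop repeated edges between the same endpoints
        seen.add((src, dst))
        kept.append((src, dst))

    if keep_trivial_vertices:
        remaining = set(vertices)
    else:
        # Keep only the vertices that still touch at least one kept edge.
        remaining = {vertex for edge in kept for vertex in edge}
    return remaining, kept


vertices = {"a", "b", "c", "d"}
edges = [("a", "b"), ("a", "b"), ("b", "b"), ("b", "c")]

# Defaults: duplicate and self edges are dropped, isolated "d" is kept.
default_vertices, default_edges = simplify(vertices, edges)
# Dropping trivial vertices removes the isolated vertex "d".
pruned_vertices, _ = simplify(vertices, edges, keep_trivial_vertices=False)
```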

class pypgx.api.Namespace(java_namespace)

Bases: object

Represents a namespace for objects (e.g. graphs, properties) in PGX.

Note

This class is just a thin wrapper and does not check if the input is actually a java namespace.

static from_id(namespace_id)

Get the Python namespace object.

Parameters: namespace_id (PgxId) – A new namespace object will be created for this ID.
Returns: The Python namespace object.
Return type: oracle.pgx.api.Namespace

get_java_namespace()

Get the java namespace object.

Returns: The java namespace object.
Return type: oracle.pgx.api.Namespace

get_namespace_id()

Get the Python PgxId object.

Returns: The Python PgxId object.
Return type: PgxId

class pypgx.api.Operation(java_operation)

Bases: object

An operation is part of an execution plan for executing a PGQL query.

The execution plan is composed of a tree of operations.

property cardinality_estimate: float

Estimate the cardinality.

Returns: An estimation of the cardinality after executing this operation.
Return type: float

property children: List[Union[Operation, Any]]

Return the children of this operation. Non-leaf operations can have multiple child operations, which will be returned by this property.

Returns: A list of operations which are the children of this operation.
Return type: List[Union[Operation, Any]]

property cost_estimate: float

Estimate the cost of this operation.

Returns: An estimation of the cost of executing this operation.
Return type: float

get_cardinality_estimate()

Estimate the cardinality.

Returns: An estimation of the cardinality after executing this operation.
Return type: float

get_children()

Return the children of this operation. Non-leaf operations can have multiple child operations, which will be returned by this function.

Returns: A list of operations which are the children of this operation.
Return type: List[Union[Operation, Any]]

get_cost_estimate()

Estimate the cost of this operation.

Returns: An estimation of the cost of executing this operation.
Return type: float

get_filters()

Return the filters that apply to this operation.

These are the filters specified in WHERE clauses or through label expressions.

Returns: a set of filters that apply to this operation
Return type: Set[str]

get_graph_id()

Return the graph used in the operation.

Returns: The id of the graph used in the operation.
Return type: str

get_operation_type()

Return the type of operation.

Returns: OperationType of this operation as an enum value.
Return type: str

get_pattern_info()

Return the pattern info.

Returns: A string indicating the pattern that will be matched by this operation.
Return type: Optional[str]

get_total_cost_estimate()

Estimate the cost of this operation and all its children.

Returns: An estimation of the cost of executing this operation and all its children.
Return type: float
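
The relationship between a per-operation cost and a total cost over the operation tree can be sketched as a recursive sum. PlanNode is a hypothetical stand-in for Operation, and the additive rule (own cost plus the totals of all children) is an assumption made for illustration; the reference does not state how PGX combines the estimates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for an operation in an execution plan."""

    cost_estimate: float
    children: List["PlanNode"] = field(default_factory=list)

    def total_cost_estimate(self) -> float:
        # Assumption: the total is the node's own cost plus its subtrees.
        return self.cost_estimate + sum(
            child.total_cost_estimate() for child in self.children
        )


leaf_a = PlanNode(cost_estimate=10.0)
leaf_b = PlanNode(cost_estimate=5.0)
join = PlanNode(cost_estimate=2.5, children=[leaf_a, leaf_b])
root = PlanNode(cost_estimate=1.0, children=[join])

# 1.0 for the root, 2.5 for the join, 10.0 and 5.0 for the two leaves.
total = root.total_cost_estimate()
```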

property graph_id: str

Return the graph used in the operation.

Returns: The id of the graph used in the operation.
Return type: str

is_same_query_plan(other)

Check if the query plan with this operation as root node is equal to the query plan with ‘other’ as root node. This will only check if the operationType and the pattern are the same for each node in both query plans.

Parameters: other (Union[Operation, str]) – The query plan.
Raises: TypeError – other must be an Operation.
Returns: True if both execution plans are the same, False otherwise.
Return type: bool
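
A node-by-node comparison of this kind might look as follows. The sketch uses a hypothetical PlanNode class in place of Operation, made-up operation type names, and the assumption that children are compared in order; it only shows that cost estimates play no part in the comparison, while operation type and pattern do.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for an operation in an execution plan."""

    operation_type: str
    pattern_info: Optional[str]
    cost_estimate: float = 0.0
    children: List["PlanNode"] = field(default_factory=list)


def is_same_plan(left: PlanNode, right: object) -> bool:
    """Compare two plans by operation type and pattern at every node."""
    if not isinstance(right, PlanNode):
        raise TypeError("other must be a PlanNode")
    if left.operation_type != right.operation_type:
        return False
    if left.pattern_info != right.pattern_info:
        return False
    if len(left.children) != len(right.children):
        return False
    # Assumption: children are compared pairwise in their listed order.
    return all(
        is_same_plan(l_child, r_child)
        for l_child, r_child in zip(left.children, right.children)
    )


# The operation type names below are illustrative, not PGX enum values.
plan_a = PlanNode("PROJECT", None, 1.0, [PlanNode("MATCH", "(x)", 10.0)])
plan_b = PlanNode("PROJECT", None, 7.0, [PlanNode("MATCH", "(x)", 99.0)])
plan_c = PlanNode("PROJECT", None, 1.0, [PlanNode("MATCH", "(y)", 10.0)])

same_despite_costs = is_same_plan(plan_a, plan_b)
differs_on_pattern = is_same_plan(plan_a, plan_c)
```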

property operation_type: str

Return the type of operation.

Returns: OperationType of this operation as an enum value.
Return type: str

property pattern_info: Optional[str]

Return the pattern info.

Returns: A string indicating the pattern that will be matched by this operation.
Return type: Optional[str]

print(file=None)

Print the current operation and all its children to the given file (standard output by default).

Parameters: file (Optional[TextIO]) – File to which results are printed (default is sys.stdout).
Return type: None

property total_cost_estimate: float

Estimate the cost of this operation and all its children.

Returns: An estimation of the cost of executing this operation and all its children.
Return type: float

class pypgx.api.PartitionedGraphConfig(java_graph_config)

Bases: GraphConfig, DbConnectionConfig

A class for representing partitioned graph configurations

get_edge_providers()

Get the edge providers of this graph configuration.

Returns: the list of URIs
Return type: List[Dict[str, Any]]

get_es_index_name()

Get the ES Index name.

Returns: the ES Index
Return type: Optional[str]

get_es_url()

Get the ES URL pointing to an ES instance.

Returns: the ES URL
Return type: Optional[str]

get_max_batch_size()

Get the maximum number of docs requested during each ES request; this is the ES default.

Returns: the maximum number of requested docs
Return type: int

get_num_connections()

Get the number of connections to read/write data from/to the RDBMS table.

Returns: the number of connections
Return type: int

get_pg_view_name()

Get the name of the PG view in the database to load the graph from.

Returns: the name of the PG view
Return type: Optional[str]

get_prepared_queries()

Get an additional list of prepared queries with arguments, working in the same way as ‘queries’.

Returns: the list of prepared queries
Return type: Optional[List[Dict[str, Any]]]

get_proxy_url()

Get the proxy server URL to be used for connection to es_url.

Returns: the proxy URL
Return type: Optional[str]

get_queries()

Get a list of queries used to determine which data to load from the database.

Returns: a list of queries
Return type: Optional[List[str]]

get_redaction_rules()

Get the redaction rules from this graph configuration.

Returns: the list of PgxRedactionRuleConfig
Return type: List[PgxRedactionRuleConfig]

get_rules_mapping()

Get the mapping between redaction rules and users/roles.

Returns: the list of PgxRedactionRuleMappingConfig
Return type: List[Dict[str, Any]]

get_scroll_time()

Get the ES scroll time.

Returns: the ES scroll time
Return type: str

get_username()

Get the username to use when connecting to an ES instance.

Returns: the username
Return type: Optional[str]

get_vertex_providers()

Get the vertex providers of this graph configuration.

Returns: the list of URIs
Return type: List[Dict[str, Any]]

class pypgx.api.PgGraphConfig(java_graph_config)

Bases: GraphConfig

A class for representing PG graph configurations

get_db_engine()

Get the target database engine of this configuration.

Returns: the target database engine
Return type: str

get_max_num_connections()

Get the maximum number of connections of this configuration.

Returns: the maximum number of connections
Return type: int

class pypgx.api.PgHbaseGraphConfig(java_graph_config)

Bases: PgGraphConfig

A class for representing PG HBase graph configurations

get_block_cache_size()

Get the block cache size.

Returns: the block cache size
Return type: int

get_compression()

Get the HBase compression algorithm to use.

Returns: the HBase compression algorithm to use.
Return type: str

get_data_block_encoding()

Get the datablock encoding algorithm to use.

Returns: the datablock encoding algorithm to use
Return type: str

get_hadoop_sec_auth()

Get the Hadoop authentication string.

Returns: the Hadoop authentication string
Return type: str

get_hbase_sec_auth()

Get the HBase authentication string.

Returns: the HBase authentication string
Return type: str

get_hm_kerberos_principal()

Get the HM Kerberos principal.

Returns: the HM Kerberos principal
Return type: str

get_initial_edge_num_regions()

Get the number of initial edge regions defined for the HBase tables.

Returns: the number of initial edge regions defined for the HBase tables
Return type: int

get_initial_vertex_num_regions()

Get the number of initial vertex regions defined for the HBase tables.

Returns: the number of initial vertex regions defined for the HBase tables
Return type: int

get_keytab()

Get the path to keytab file.

Returns: path to keytab file
Return type: str

get_rs_kerberos_principal()

Get the RS Kerberos principal.

Returns: the RS Kerberos principal
Return type: str

get_splits_per_region()

Get the splits per region.

Returns: the splits per region
Return type: int

get_user_principal()

Get the user principal.

Returns: the user principal
Return type: str

get_zk_client_port()

Get the ZooKeeper client port.

Returns: the ZooKeeper client port
Return type: int

get_zk_node_parent()

Get the ZooKeeper parent node.

Returns: the ZooKeeper parent node
Return type: str

get_zk_quorum()

Get the ZooKeeper Quorum.

Returns: the ZooKeeper Quorum
Return type: str

get_zk_session_timeout()

Get the ZooKeeper session timeout.

Returns: the ZooKeeper session timeout
Return type: int

class pypgx.api.PgNosqlGraphConfig(java_graph_config)

Bases: PgGraphConfig

A class for representing PG NoSQL graph configurations

get_hosts()

Get the list of hosts.

Returns: the hosts
Return type: List[str]

get_request_timeout_ms()

Get the NoSQL request timeout in milliseconds

Returns: the NoSQL request timeout in milliseconds
Return type: int

get_store_name()

Get the store name.

Returns: the store name
Return type: str

get_username()

Get the name of a NoSQL user.

Returns: the name of a NoSQL user
Return type: Optional[str]

class pypgx.api.PgRdbmsGraphConfig(java_graph_config)

Bases: PgGraphConfig, DbConnectionConfig

A class for representing PG RDBMS graph configurations

get_edges_view_name()

Get the name of view for edges.

Returns: the name of view for edges
Return type: Optional[str]

get_label()

Get the label.

Returns: the label
Return type: Optional[str]

get_options()

Get the parameter that is used by the data access layer (and the underlying database) to change default behaviors of graph instance creation or initialization.

Returns: the parameter
Return type: Optional[str]

get_owner()

Get the owner.

Returns: the owner
Return type: Optional[str]

get_row_label()

Get the row label.

Returns: the row label
Return type: Optional[str]

get_security_policy()

Get the security policy for the label or row label.

Returns: the policy
Return type: Optional[str]

get_vertices_view_name()

Get the name of view for vertices

Returns: the name of view for vertices
Return type: Optional[str]

get_view_parallel_hint_degree()

If view names are given, the resulting query will be hinted to run in parallel with the given degree.

Returns: the view parallel hint degree
Return type: int

class pypgx.api.PgqlResultElement(java_pgql_result_elem)

Bases: object

Type and variable name information on a pattern matching result element

property collection_element_type: Optional[str]

Get the type of the elements stored in the collection if the result element is a collection

Returns: type of the elements stored in the collection

property element_type: Optional[str]

Get the type of this result element

Returns: result element type

property variable_name: str

Get the variable name of the result element

Returns: the variable name

property vertex_edge_id_type: Optional[str]

Get the type of vertex/edge result elements

Returns: type of vertex/edge result elements or None if not vertex/edge.

class pypgx.api.PgqlResultSet(graph, java_pgql_result_set)

Bases: PgxContextManager

Result set of a pattern matching query.

Note: retrieving results from the server is not thread-safe.

Parameters: graph (Optional[PgxGraph]) –

absolute(row)

Move the cursor to the given row number in this ResultSet object.

If the row number is positive, the cursor moves to the given row number with respect to the beginning of the result set. The first row is 1, so absolute(1) moves the cursor to the first row.

If the row number is negative, the cursor moves to the given row number with respect to the end of the result set. So absolute(-1) moves the cursor to the last row.

Parameters: row (int) – Row to move to
Returns: True if the cursor is moved to a position in the ResultSet object; False if the cursor is moved before the first or after the last row
Return type: bool
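
The positive and negative row numbering can be modeled with a small function over a result set of known size. This is an illustrative sketch, not the pypgx implementation: the function absolute below is a hypothetical stand-in, positions 0 and num_rows + 1 are used to represent "before the first row" and "after the last row", and treating row 0 as "before the first row" is an assumption.

```python
from typing import Tuple


def absolute(row: int, num_rows: int) -> Tuple[int, bool]:
    """Return (cursor position, is_on_a_row) for a 1-based cursor.

    Position 0 means "before the first row" and num_rows + 1 means
    "after the last row".
    """
    if row > 0:
        target = row  # counted from the beginning; the first row is 1
    elif row < 0:
        target = num_rows + 1 + row  # counted from the end; -1 is the last row
    else:
        target = 0  # assumption: 0 is treated as "before the first row"
    # Positions outside the result set collapse to the two sentinel positions.
    position = max(0, min(target, num_rows + 1))
    return position, 1 <= position <= num_rows


first = absolute(1, 5)
last = absolute(-1, 5)
past_end = absolute(9, 5)
before_start = absolute(-6, 5)
```

With five rows, absolute(1) and absolute(-1) land on rows 1 and 5, while row numbers beyond either end leave the cursor off the result set and yield False.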

after_last()

Place the cursor after the last row

Return type: None

before_first()

Set the cursor before the first row

Return type: None

close()

Free resources on the server taken up by this result set.

Return type: None

property col_count: int: Get the number of columns

property columns: List[str]: Get the column names

first()

Move the cursor to the first row in the result set

Returns: True if the cursor points to a valid row; False if the result set does not have any results
Return type: bool

get(element)

Get the value of the designated element by element index or name

Parameters: element (Union[str, int]) – Integer or string representing index or name
Returns: Content of cell
Return type: Any

get_boolean(element)

Get the value of the designated element by element index or name as a Boolean

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: Boolean
Return type: Optional[bool]

get_date(element)

Get the value of the designated element by element index or name as a datetime Date

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: datetime.date
Return type: Optional[date]

get_double(element)

Get the value of the designated element by element index or name as a float.

This method is for precision, as Java floats and doubles have different precisions.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: Float
Return type: Optional[float]

get_edge(element)

Get the value of the designated element by element index or name as a PgxEdge.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: PgxEdge
Return type: Optional[PgxEdge]

get_float(element)

Get the value of the designated element by element index or name as a float.

This method returns a value with less precision than a double usually has.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: Float
Return type: Optional[float]

get_integer(element)

Get the value of the designated element by element index or name as an int.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: Integer
Return type: Optional[int]

get_legacy_datetime(element)

Get the value of the designated element by element index or name as a datetime.

Works with most time and date type cells. If the date is not specified, it defaults to Jan 1 1970.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: datetime.datetime
Return type: Optional[datetime]
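
As a rough picture of the default date, a time-only value combined with Jan 1 1970 yields the kind of datetime this description suggests. The helper with_default_date is hypothetical and uses only the standard library; it is not how pypgx performs the conversion.

```python
from datetime import date, datetime, time

# The default date named in the description above.
DEFAULT_DATE = date(1970, 1, 1)


def with_default_date(value: time) -> datetime:
    """Attach the default date to a time-only value."""
    return datetime.combine(DEFAULT_DATE, value)


result = with_default_date(time(14, 30, 15))
```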

get_list(element)

Get the value of the designated element by element index or name as a list.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: List
Return type: Optional[List[str]]

get_long(element)

Get the value of the designated element by element index or name as an int.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: Long
Return type: Optional[int]

get_point2d(element)

Get the value of the designated element by element index or name as a 2D tuple.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: (X, Y)
Return type: Optional[Tuple[float, float]]

get_row(row)

Get row from result_set. This method may change result_set cursor.

Parameters: row (int) – Row index
Return type: Any

get_slice(start, stop, step=1)

Get slice from result_set. This method may change result_set cursor.

Parameters

start (int) – Start index
stop (int) – Stop index
step (int) – Step size

Return type

List[list]

get_string(element)

Get the value of the designated element by element index or name as a string.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: String
Return type: Optional[str]

get_time(element)

Get the value of the designated element by element index or name as a datetime Time.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: datetime.time
Return type: Optional[time]

get_time_with_timezone(element)

Get the value of the designated element by element index or name as a datetime Time that includes timezone.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: datetime.time
Return type: Optional[time]

get_timestamp(element)

Get the value of the designated element by element index or name as a datetime.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: datetime.datetime
Return type: Optional[datetime]

get_timestamp_with_timezone(element)

Get the value of the designated element by element index or name as a datetime.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: datetime.datetime
Return type: Optional[datetime]

get_vertex(element)

Get the value of the designated element by element index or name as a PgxVertex.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: PgxVertex
Return type: Optional[PgxVertex]

get_vertex_labels(element)

Get the value of the designated element by element index or name as a collection of labels.

Note: This method currently returns a list, but this behavior should not be relied upon. In a future version, a set will be returned instead.

Parameters: element (Union[str, int]) – Integer or String representing index or name
Returns: collection of labels
Return type: Collection[str]
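
Because the concrete collection type may change from list to set, calling code is safer when it relies only on operations that both types support. The sketch below uses plain Python values as stand-ins for the returned labels; the helper names has_label and normalized are hypothetical.

```python
from typing import Collection, Set


def has_label(labels: Collection[str], wanted: str) -> bool:
    """Membership tests work the same way for lists and sets."""
    return wanted in labels


def normalized(labels: Collection[str]) -> Set[str]:
    """Convert to a set so later code does not depend on ordering."""
    return set(labels)


# Stand-ins for the current (list) and announced (set) return values.
labels_as_list = ["Person", "Employee"]
labels_as_set = {"Person", "Employee"}

same_content = normalized(labels_as_list) == normalized(labels_as_set)
found_in_list = has_label(labels_as_list, "Person")
found_in_set = has_label(labels_as_set, "Person")
```

Indexing such as labels[0] works only on the list form, so the sketch avoids it.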

property id: str: Get the id of this result set

last()

Move the cursor to the last row in the result set

Returns: True if the cursor points to a valid row; False if the result set does not have any results
Return type: bool

next()

Move the cursor forward one row from its current position

Returns: True if the cursor points to a valid row; False if the new cursor is positioned after the last row
Return type: bool

property num_results: int: Get the number of results

property pgql_result_elements: Dict: Get the result elements of this result set

previous()

Move the cursor to the previous row from its current position

Returns: True if the cursor points to a valid row; False if the new cursor is positioned before the first row
Return type: bool

print(file=None, num_results=1000, start=0)

Print the result set.

Parameters

file (Optional[TextIO]) – File to which results are printed (default is sys.stdout)
num_results (int) – Number of results to be printed
start (int) – Index of the first result to be printed

Return type

None

relative(rows)

Move the cursor a relative number of rows with respect to the current position. A negative number will move the cursor backwards.

Note: Calling relative(1) is equal to next() and relative(-1) is equal to previous(). Calling relative(0) is possible when the cursor is positioned at a row, not when it is positioned before the first or after the last row. However, relative(0) will not update the position of the cursor.

Parameters: rows (int) – Relative number of rows to move from current position
Returns: True if the cursor is moved to a position in the ResultSet object; False if the cursor is moved before the first or after the last row
Return type: bool
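
The equivalence between relative(1), relative(-1), next() and previous() can be seen in a small in-memory cursor model. The Cursor class is a hypothetical stand-in, not the pypgx class. It assumes a 1-based cursor that starts before the first row, and for relative(0) outside the rows it simply returns False, since the behavior in that case is not specified here.

```python
from typing import List, Tuple


class Cursor:
    """Hypothetical in-memory model of a 1-based result set cursor."""

    def __init__(self, num_rows: int) -> None:
        self.num_rows = num_rows
        # 0 = before the first row, num_rows + 1 = after the last row.
        self.position = 0

    def relative(self, rows: int) -> bool:
        target = self.position + rows
        # Moving past either end parks the cursor just outside the rows.
        self.position = max(0, min(target, self.num_rows + 1))
        return 1 <= self.position <= self.num_rows

    def next(self) -> bool:
        return self.relative(1)

    def previous(self) -> bool:
        return self.relative(-1)


cursor = Cursor(num_rows=3)
steps: List[Tuple[bool, int]] = []
steps.append((cursor.next(), cursor.position))        # onto row 1
steps.append((cursor.relative(2), cursor.position))   # onto row 3
steps.append((cursor.relative(0), cursor.position))   # stays on row 3
steps.append((cursor.next(), cursor.position))        # after the last row
steps.append((cursor.previous(), cursor.position))    # back onto row 3
steps.append((cursor.relative(-5), cursor.position))  # before the first row
```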

to_frame()

Copy the content of this result set into a new PgxFrame

Returns: a new PgxFrame containing the content of the result set
Return type: PgxFrame

to_pandas()

Convert to pandas DataFrame. This method may change result_set cursor. This method requires pandas.

Returns: PgqlResultSet as a pandas DataFrame

class pypgx.api.Pgx(java_pgx_class)

Bases: object

Main entry point for PGX applications.

create_session(source=None, base_url=None)

Create and return a session.

Parameters

source (Optional[str]) – The session source string. Default value is “pgx_python”.
base_url (Optional[str]) – The base URL in the format host [ : port][ /path] of the PGX server REST end-point. If base_url is None, the default will be used, which points to the embedded PGX instance.

Return type

PgxSession

property default_url: str: Get the default URL of the embedded PGX instance.

get_instance(base_url=None, token=None)

Get a handle to a PGX instance.

Parameters

base_url (Optional[str]) – The base URL in the format host [ : port][ /path] of the PGX server REST end-point. If base_url is None, the default will be used, which points to the embedded PGX instance.
token (Optional[str]) – The access token

Return type

ServerInstance

set_default_url(url)

Set the default base URL used by invocations of get_instance().

The new default URL affects subsequent calls of get_instance().

Parameters: url (str) – New URL
Return type: None

class pypgx.api.PgxCollection(java_collection)

Bases: PgxContextManager

Superclass for Pgx collections.

add_all_elements(source)

Add elements to an existing collection.

Parameters: source (Iterable[Union[PgxEdge, PgxVertex]]) – Elements to add
Return type: None

clear()

Clear an existing collection.

Returns: None
Return type: None

clone(name=None)

Clone and rename existing collection.

Parameters: name (Optional[str]) – New name of the collection. If None, the old name is not changed.
Return type: PgxCollection

close()

Request destruction of this object. After this method returns, the behavior of any method of this class becomes undefined.

Return type: None

property collection_type: str: Get the type of this collection.

property content_type: str: Get the content type of this collection.

destroy()

Request destruction of this object.

After this method returns, the behavior of any method of this class becomes undefined.

Returns: None
Return type: None

get_id()

Return the string representation of an internal identifier for this collection. Only meant for internal usage.

Returns: a string representation of the internal identifier of this collection
Return type: str

get_pgx_id()

Return an internal identifier for this collection. Only meant for internal usage.

Returns: the internal identifier of this collection
Return type: PgxId

property id_type: Optional[str]: Get the id type of this collection.

property is_mutable: bool: Return True if this collection is mutable, False otherwise.

property name: str: Get the name of this collection.

remove_all_elements(source)

Remove elements from an existing collection.

Parameters: source (Iterable[Union[PgxEdge, PgxVertex]]) – Elements to remove
Return type: None

property size: int: Get the number of elements in this collection.

to_mutable(name=None)

Create a mutable copy of an existing collection.

Parameters: name (Optional[str]) – New name of the collection. If None, the old name is not changed.
Return type: PgxCollection

class pypgx.api.PgxEdge(graph, java_edge)

Bases: PgxEntity

An edge of a PgxGraph.

Parameters: graph (PgxGraph) –

property destination: PgxVertex: Get the destination vertex of the edge.

property label: str: Return the edge label.

property source: PgxVertex: Get the source vertex of the edge.

property vertices: Tuple[PgxVertex, PgxVertex]: Return the source and the destination vertex.

class pypgx.api.PgxEntity(graph, java_entity)

Bases: object

An abstraction of vertex and edge.

Parameters: graph (PgxGraph) –

get_property(property_name)

Get a property by name.

Parameters: property_name (str) – Property name
Return type: Any

property id: Get the entity id.

set_property(property_name, value)

Set an entity property.

Parameters

property_name (str) – Property name
value (Any) – New value

Return type

None

property type: str: Get the entity type.

class pypgx.api.PgxGraph(session, java_graph)

Bases: PgxContextManager

A reference to a graph on the server side.

Operations on instances of this class are executed on the server side onto the referenced graph. Note that a session can have multiple objects referencing the same graph: the result of any operation mutating the graph on any of those references will be visible on all of them.

Parameters: session (PgxSession) –

add_redaction_rule(redaction_rule_config, authorization_type, *names)

Add a redaction rule for authorization_type names.

Possible authorization types are: [‘user’, ‘role’]

Parameters

authorization_type (str) – the authorization type of the rule to be added
names (str) – the names of the users or roles for which the rule should be added
redaction_rule_config (PgxRedactionRuleConfig) –

Return type

None

alter_graph()

Create a graph alteration builder to define the graph schema alterations to perform on the graph.

Returns: an empty graph alteration builder
Return type: GraphAlterationBuilder

bipartite_sub_graph_from_in_degree(vertex_properties=True, edge_properties=True, name=None, is_left_name=None, in_place=False)

Create a bipartite version of this graph with all vertices of in-degree = 0 being the left set.

Parameters

vertex_properties (Union[List[VertexProperty], bool]) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (bool) – List of edge properties belonging to graph specified to be kept in the new graph
name (Optional[str]) – New graph name
is_left_name (Optional[str]) – Name of the boolean isLeft vertex property of the new graph. If None, a name will be generated.
in_place (bool) – Whether to create a new copy (False) or overwrite this graph (True)

Return type

BipartiteGraph
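
Example

A minimal sketch, in plain Python rather than pypgx, of how the left set described above can be derived: a vertex is on the left side when no edge points to it. The function name left_side_from_in_degree and the tuple-based edge list are hypothetical and only illustrate the rule.

```python
def left_side_from_in_degree(vertices, edges):
    """Map each vertex to True (left) when its in-degree is 0, else False."""
    # Every vertex that appears as a destination has in-degree >= 1.
    has_incoming = {dst for _, dst in edges}
    return {vertex: vertex not in has_incoming for vertex in vertices}


edges = [(1, 2), (3, 2), (2, 4)]
print(left_side_from_in_degree([1, 2, 3, 4], edges))
```

Here vertices 1 and 3 have no incoming edges, so they form the left set.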

bipartite_sub_graph_from_left_set(vset, vertex_properties=True, edge_properties=True, name=None, is_left_name=None)

Create a bipartite version of this graph with the given vertex set being the left set.

Parameters

vset (Union[str, VertexSet]) – Vertex set representing the left side
vertex_properties (Union[List[VertexProperty], bool]) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (bool) – List of edge properties belonging to graph specified to be kept in the new graph
name (Optional[str]) – name of the new graph. If None, a name will be generated.
is_left_name (Optional[str]) – Name of the boolean isLeft vertex property of the new graph. If None, a name will be generated.

Return type

BipartiteGraph

clone(vertex_properties=True, edge_properties=True, name=None)

Return a copy of this graph.

Parameters

vertex_properties (bool) – List of vertex properties belonging to graph specified to be cloned as well
edge_properties (bool) – List of edge properties belonging to graph specified to be cloned as well
name (Optional[str]) – Name of the new graph

Return type

clone_and_execute_pgql(pgql_query, new_graph_name=None)

Create a deep copy of the graph and execute the PGQL query on it.

Parameters

pgql_query (str) – Query string in PGQL
new_graph_name (Optional[str]) – Name given to the newly created PgxGraph

Returns

A cloned PgxGraph with the pgql query executed

Return type

Throws InterruptedException if the caller thread gets interrupted while waiting for completion.
Throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

close()

Destroy without waiting for completion.

Return type: None

combine_edge_properties_into_vector_property(properties, name=None)

Take a list of scalar edge properties of same type and create a new edge vector property by combining them.

The dimension of the vector property will be equal to the number of properties.

Parameters

properties (List[Union[EdgeProperty, str]]) – List of scalar edge properties
name (Optional[str]) – Name for the vector property. If not None, the vector property is given this name. If that results in a name conflict, the returned future will complete exceptionally.

Return type

combine_vertex_properties_into_vector_property(properties, name=None)

Take a list of scalar vertex properties of same type and create a new vertex vector property by combining them.

The dimension of the vector property will be equal to the number of properties.

Parameters

properties (List[Union[VertexProperty, str]]) – List of scalar vertex properties
name (Optional[str]) – Name for the vector property. If not None, the vector property is given this name. If that results in a name conflict, the returned future will complete exceptionally.

Return type
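
Example

A minimal sketch of what combining scalar properties into a vector property means, written in plain Python rather than pypgx. Each scalar property is modeled as a dict from vertex to value; combine_into_vectors is a hypothetical name, and the sketch assumes every property has a value for every vertex.

```python
def combine_into_vectors(properties):
    """Combine scalar properties (vertex -> value dicts) into vertex -> list."""
    if not properties:
        return {}
    # The vector dimension equals the number of scalar properties.
    return {
        vertex: [prop[vertex] for prop in properties]
        for vertex in properties[0]
    }


x = {"a": 1.0, "b": 2.0}
y = {"a": 10.0, "b": 20.0}
print(combine_into_vectors([x, y]))
```

The position of a value inside each vector follows the order of the input list.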

property config: Optional[GraphConfig]: Get the GraphConfig object.

create_all_paths(src, cost, dist, parent, parent_edge)

Create an AllPaths object representing all the shortest paths from a single source to all the possible destinations (shortest regarding the given edge costs).

Parameters

src (Union[str, PgxVertex]) – Source vertex of the path
cost (Optional[Union[str, EdgeProperty]]) – Property holding the edge costs. If None, the resulting cost will equal the hop distance
dist (Union[VertexProperty, str]) – Property holding the distance to the source vertex for each vertex in the graph
parent (Union[VertexProperty, str]) – Property holding the parent vertices of all the shortest paths. For example, if the shortest path is A -> B -> C, then parent[C] -> B and parent[B] -> A
parent_edge (Union[VertexProperty, str]) – Property holding the parent edges for each vertex of the shortest path

Returns

The AllPaths object

Return type

create_change_set(vertex_id_generation_strategy='user_ids', edge_id_generation_strategy='auto_generated')

Create a change set for updating the graph.

Uses auto generated IDs for the edges.

Note

This is currently not supported for undirected graphs.

Returns

an empty change set

Return type

GraphChangeSet

Parameters

vertex_id_generation_strategy (str) –
edge_id_generation_strategy (str) –

create_components(components, num_components)

Create a Partition object holding a collection of vertex sets, one for each component.

Parameters

components (Union[VertexProperty, str]) – Vertex property mapping each vertex to its component ID. Note that only component IDs in the range of [0..numComponents-1] are allowed. The returned future will complete exceptionally with an IllegalArgumentException if an invalid component ID is encountered. Gaps are supported: IDs that are not associated with any vertex will yield empty components.
num_components (int) – How many different components the components property contains

Returns

The Partition object

Return type
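
Example

A minimal sketch of the grouping rule described for the components parameter, in plain Python rather than pypgx. The name group_by_component is hypothetical, a dict stands in for the vertex property, and ValueError stands in for the IllegalArgumentException mentioned above.

```python
def group_by_component(components, num_components):
    """Return one vertex set per component ID in [0, num_components - 1]."""
    groups = [set() for _ in range(num_components)]
    for vertex, component_id in components.items():
        if not 0 <= component_id < num_components:
            raise ValueError(
                f"invalid component ID {component_id} for vertex {vertex!r}"
            )
        groups[component_id].add(vertex)
    return groups


# Component ID 1 is a gap: no vertex uses it, so its set stays empty.
groups = group_by_component({"a": 0, "b": 2, "c": 0}, 3)
print([sorted(group) for group in groups])  # [['a', 'c'], [], ['b']]
```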

create_edge_property(data_type, name=None)

Create a session-bound edge property.

Parameters

data_type (str) – Type of the edge property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’)
name (Optional[str]) – Name of the edge property to be created

Return type

create_edge_sequence(name=None)

Create a new edge sequence.

Parameters: name (Optional[str]) – Sequence name
Return type: EdgeSequence

create_edge_set(name=None)

Create a new edge set.

Parameters: name (Optional[str]) – Edge set name
Return type: EdgeSet

create_edge_vector_property(data_type, dim, name=None)

Create a session-bound edge vector property.

Parameters

data_type (str) – Type of the vector property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’)
dim (int) – Dimension of the vector property to be created
name (Optional[str]) – Name of the vector property to be created

Return type

create_map(key_type, val_type, name=None)

Create a session-bound map.

Possible types are: [‘integer’,’long’,’double’,’boolean’,’string’,’vertex’,’edge’, ‘local_date’,’time’,’timestamp’,’time_with_timezone’,’timestamp_with_timezone’]

Parameters

key_type (str) – Property type of the keys that are going to be stored inside the map
val_type (str) – Property type of the values that are going to be stored inside the map
name (Optional[str]) – Map name

Return type

create_merging_strategy_builder()

Create a new MergingStrategyBuilder that can be used to build a new MutationStrategy to simplify this graph.

Return type: MergingStrategyBuilder

create_path(src, dst, cost, parent, parent_edge)

Parameters

src (PgxVertex) – Source vertex of the path
dst (PgxVertex) – Destination vertex of the path
cost (EdgeProperty) – Property holding the edge costs. If None, the resulting cost will equal the hop distance.
parent (VertexProperty) – Property holding the parent vertices for each vertex of the shortest path. For example, if the shortest path is A -> B -> C, then parent[C] -> B and parent[B] -> A.
parent_edge (VertexProperty) – Property holding the parent edges for each vertex of the shortest path

Returns

The PgxPath object

Return type
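
Example

The parent property can be read as a chain leading from the destination back to the source. The sketch below is plain Python, not pypgx: walk_parents and the dict standing in for the parent property are hypothetical names used only to illustrate the A -> B -> C case described above.

```python
def walk_parents(parent, src, dst):
    """Rebuild the vertex sequence from src to dst using child -> parent links."""
    path = [dst]
    while path[-1] != src:
        current = path[-1]
        if current not in parent:
            return None  # dst cannot be traced back to src
        path.append(parent[current])
    path.reverse()
    return path


# Shortest path A -> B -> C, so parent[C] is B and parent[B] is A.
parent = {"C": "B", "B": "A"}
print(walk_parents(parent, "A", "C"))  # ['A', 'B', 'C']
```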

create_picking_strategy_builder()

Create a new PickingStrategyBuilder that can be used to build a new PickingStrategy to simplify this graph.

Return type: PickingStrategyBuilder

create_scalar(data_type, name=None)

Create a new Scalar.

Parameters

data_type (str) – Scalar type
name (Optional[str]) – Name of the scalar to be created

Return type

Scalar

create_synchronizer(*, synchronizer_class='oracle.pgx.api.FlashbackSynchronizer', invalid_change_policy=None, graph_config=None, jdbc_url=None, username=None, password=None)

Create a synchronizer object which can be used to keep this graph in sync with changes happening in its original data source. Only partitioned graphs with all providers loaded from Oracle Database are supported.

Parameters

synchronizer_class (str) – string representing the Java class including its package; currently ‘oracle.pgx.api.FlashbackSynchronizer’ is the only available option
invalid_change_policy (Optional[str]) – sets the OnInvalidChange parameter to the Synchronizer ChangeSet. Possible values are: ‘ignore’, ‘ignore_and_log’, ‘ignore_and_log_once’, ‘error’.
graph_config (Optional[GraphConfig]) – the graph configuration to use for synchronization
jdbc_url (Optional[str]) – jdbc url of database
username (Optional[str]) – username in database
password (Optional[str]) – password of username in database

Returns

a synchronizer

Return type

Synchronizer

Changed in version 23.4: The parameter connection has been removed. Use jdbc_url, username, and password instead.

create_vector_scalar(data_type, dimension=0, name=None)

Create a new vector scalar.

Parameters

data_type (str) – Property type
dimension (int) – the dimension of the vector scalar
name (Optional[str]) – Name of the scalar to be created

Return type

Scalar

create_vertex_property(data_type, name=None)

Create a session-bound vertex property.

Parameters

data_type (str) – Type of the vertex property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’)
name (Optional[str]) – Name of the vertex property to be created

Return type

create_vertex_sequence(name=None)

Create a new vertex sequence.

Parameters: name (Optional[str]) – Sequence name
Return type: VertexSequence

create_vertex_set(name=None)

Create a new vertex set.

Parameters: name (Optional[str]) – Set name
Return type: VertexSet

create_vertex_vector_property(data_type, dim, name=None)

Create a session-bound vertex vector property.

Parameters

data_type (str) – Type of the vector property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’)
dim (int) – Dimension of the vector property to be created
name (Optional[str]) – Name of the vector property to be created

Return type

property creation_request_timestamp: str: Get the timestamp of the creation request.

property creation_timestamp: str: Get the timestamp of the creation.

property data_source_version: str: Get the version of the data source that the graph is based on.

destroy()

Destroy the graph with all its properties.

After this operation, neither the graph nor its properties can be used anymore within this session.

Note

If you have multiple PgxGraph objects referencing the same graph (e.g. because you called PgxSession.get_graph() multiple times with the same argument), they will ALL become invalid after calling this method; therefore, subsequent operations on ANY of them will result in an exception.

Return type: None

destroy_edge_property_if_exists(name)

Destroy a specific edge property if it exists.

Parameters: name (str) – Property name
Return type: None

destroy_vertex_property_if_exists(name)

Destroy a specific vertex property if it exists.

Parameters: name (str) – Property name
Return type: None

property edge_id_strategy: str: Get the strategy of the edge id.

execute_pgql(pgql_query)

Execute a PGQL query.

Parameters: pgql_query (str) – Query string in PGQL
Returns: The query result set as PgqlResultSet object
Return type: Optional[PgqlResultSet]

expand_with_pgql(pgql_queries, new_graph_name=None, pg_view_name=None, as_snapshot=False, config=None, *, num_connections=None, data_source_id=None, jdbc_url=None, keystore_alias=None, owner=None, password=None, schema=None, username=None, edge_properties_merging_strategy=None, vertex_properties_merging_strategy=None, pg_sql_name=None)

Expand this graph with data matching one or more PGQL queries. Given a list of either queries or prepared queries (with arguments), this will load all data matching at least one of the queries and merge it with the data from this graph. By default, this will expand from the same graph source as the original graph. To load data from another graph, specify either the pg_view_name or the pg_sql_name parameter.

Parameters

pgql_queries (Union[str, PreparedPgqlQuery, List[Union[str, PreparedPgqlQuery]]]) – One or more PGQL queries (or prepared queries).
new_graph_name (Optional[str]) – An optional name for the new graph.
pg_view_name (Optional[str]) – The PG View name from which to load the data.
scn – The SCN as of which the data should be loaded (optional).
as_snapshot (bool) – Expand as a new snapshot, instead of new graph.
config (Optional[GraphConfig]) – An optional config used to describe how to load the additional graph data.
num_connections (Optional[int]) – The number of connections to open to load the data in parallel.
data_source_id (Optional[str]) – The dataSourceId to which to connect.
jdbc_url (Optional[str]) – The jdbcUrl to use for connection to the DB.
keystore_alias (Optional[str]) – The key store alias to retrieve the password from the keystore.
owner (Optional[str]) – The owner (schema) of the PG view from which to load the data.
password (Optional[str]) – The password to use for connecting to the database.
schema (Optional[str]) – The schema from which to load the PG view.
username (Optional[str]) – The username of the DB user to use to connect to the DB.
edge_properties_merging_strategy (Optional[str]) – The strategy to specify how edge properties of duplicate elements are handled. Allowed values: ‘keep_current_values’, ‘update_with_new_values’.
vertex_properties_merging_strategy (Optional[str]) – The strategy to specify how vertex properties of duplicate elements are handled. Allowed values: ‘keep_current_values’, ‘update_with_new_values’.
pg_sql_name (Optional[str]) – The name of the SQL property graph from which to load data.

Returns

The graph containing data both from this graph and the external source.

Return type

explain_pgql(pgql_query)

Explain the execution plan of a pattern matching query.

Note: Different PGX versions may return different execution plans.

Parameters: pgql_query (str) – Query string in PGQL
Returns: The query plan
Return type: Operation

filter(graph_filter, vertex_properties=True, edge_properties=True, name=None)

Create a subgraph of this graph.

To create the subgraph, a given filter expression is used to determine which parts of the graph will be part of the subgraph.

Parameters

graph_filter (Union[VertexFilter, EdgeFilter, PathFindingFilter]) – Object representing a filter expression that is applied to create the subgraph
vertex_properties (bool) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (bool) – List of edge properties belonging to graph specified to be kept in the new graph
name (Optional[str]) – Filtered graph name

Return type

get_collections()

Retrieve all currently allocated collections associated with the graph.

Return type: Dict[str, PgxCollection]

get_edge(eid)

Get an edge with a specified id.

Parameters: eid (int) – edge id
Return type: PgxEdge

get_edge_label()

Get the edge labels belonging to this graph.

Return type: EdgeLabel

get_edge_properties()

Get the set of edge properties belonging to this graph.

This list might contain transient, private and published properties.

Return type: List[EdgeProperty]

get_edge_property(name)

Get an edge property by name.

Parameters: name (str) – Property name
Return type: Optional[EdgeProperty]

get_edges(filter_expr=None, name=None)

Create a new edge set containing edges according to the given filter expression.

Parameters

filter_expr (Optional[Union[str, EdgeFilter]]) – EdgeFilter object with the filter expression. If None, all the edges are returned.
name (Optional[str]) – the name of the collection to be created. If None, a name will be generated.

Return type

EdgeSet

get_id()

Get the Graph id.

Returns: A string representation of the id of this graph.
Return type: str

get_meta_data()

Get the GraphMetaData object.

Returns: A ‘GraphMetaData’ object of this graph.
Return type: GraphMetaData

get_or_create_edge_property(type, /, name)

Get an edge property if it exists or create a new one otherwise.

Parameters

type (str) – Type of the property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’)
name (str) – Name of the property to be created

Returns

The edge property

Return type

Changed in version 23.4: For consistency, type is now the first parameter of the method. It is no longer optional. The keyword arguments data_type and dim are deprecated.

get_or_create_edge_vector_property(type, dimension, /, name)

Get an edge vector property if it exists or create a new one otherwise.

Parameters

type (str) – Type of the vector property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’)
dimension (int) – Dimension of the vector property to be created
name (str) – Name of the vector property to be created

Return type

Changed in version 23.4: name is no longer optional. The keyword arguments data_type and dim are deprecated.

get_or_create_vertex_property(type, /, name)

Get a vertex property if it exists or create a new one otherwise.

Parameters

type (str) – Type of the property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’)
name (str) – Name of the property to be created

Returns

The vertex property

Return type

Changed in version 23.4: For consistency, type is now the first parameter of the method. It is no longer optional. The keyword arguments data_type and dim are deprecated.
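
Example

A minimal sketch of calling this method for several properties at once. It assumes graph is a PgxGraph; ensure_vertex_properties is a hypothetical helper, not part of pypgx. The type is passed positionally because the signature above marks it as positional-only (the "/" after it).

```python
def ensure_vertex_properties(graph, specs):
    """Get or create one vertex property per (type, name) pair.

    `graph` is assumed to expose get_or_create_vertex_property(type, /, name)
    as documented above. Returns a dict mapping each name to its property.
    """
    properties = {}
    for type_name, name in specs:
        # The type must be passed positionally; the name may be positional too.
        properties[name] = graph.get_or_create_vertex_property(type_name, name)
    return properties
```

A call such as ensure_vertex_properties(graph, [("boolean", "is_left"), ("double", "bc")]) would then return both properties keyed by name.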

get_or_create_vertex_vector_property(type, dimension, /, name)

Get a vertex vector property if it exists or create a new one otherwise.

Parameters

type (str) – Type of the vector property to be created (one of ‘integer’, ‘long’, ‘float’, ‘double’)
dimension (int) – Dimension of the vector property to be created
name (str) – Name of the vector property to be created

Return type

Changed in version 23.4: name is no longer optional. The keyword arguments data_type and dim are deprecated.

get_permission()

Return permission object for the graph.

Return type: PgxResourcePermission

get_pgx_id()

Get the Graph id.

Returns: The id of this graph.
Return type: PgxId

get_random_edge()

Get a random edge from the graph.

Return type: PgxEdge

get_random_vertex()

Get a random vertex from the graph.

Return type: PgxVertex

get_redaction_rules(authorization_type, name)

Get the redaction rules for an authorization_type name.

Possible authorization types are: [‘user’, ‘role’]

Parameters

authorization_type (str) – the authorization type of the rules to be returned
name (str) – the name of the user or role for which the rules should be returned

Returns

a list of redaction rules for the given name of type authorization_type

Return type

List[PgxRedactionRuleConfig]

get_vertex(vid)

Get a vertex with a specified id.

Parameters: vid (Union[str, int]) – Vertex id
Returns: PgxVertex object
Return type: PgxVertex

get_vertex_labels()

Get the vertex labels belonging to this graph.

Return type: VertexLabels

get_vertex_properties()

Get the set of vertex properties belonging to this graph.

This list might contain transient, private and published properties.

Return type: List[VertexProperty]

get_vertex_property(name)

Get a vertex property by name.

Parameters: name (str) – Property name
Return type: Optional[VertexProperty]

get_vertices(filter_expr=None, name=None)

Create a new vertex set containing vertices according to the given filter expression.

Parameters

filter_expr (Optional[Union[str, VertexFilter]]) – VertexFilter object with the filter expression. If None, all the vertices are returned.
name (Optional[str]) – The name of the collection to be created. If None, a name will be generated.

Return type

VertexSet

grant_permission(permission_entity, pgx_resource_permission)

Grant a permission on this graph to the given entity.

Possible PGXResourcePermission types are: [‘none’, ‘read’, ‘write’, ‘export’, ‘manage’]. Possible PermissionEntity objects are: PgxUser and PgxRole.

Cannot grant ‘manage’.

Parameters

permission_entity (PermissionEntity) – the entity the rule is granted to
pgx_resource_permission (str) – the permission type

Return type

None

has_edge(eid)

Check if the edge with id eid is in the graph.

Parameters: eid (int) – Edge id
Return type: bool

has_edge_label()

Return True if the graph has edge labels, False if not.

Return type: bool

has_vertex(vid)

Check if the vertex with id vid is in the graph.

Parameters: vid (Union[str, int]) – vertex id
Return type: bool

has_vertex_labels()

Return True if the graph has vertex labels, False if not.

Return type: bool

is_bipartite(is_left)

Check whether a given graph is a bipartite graph.

A graph is considered a bipartite graph if all nodes can be divided in a ‘left’ and a ‘right’ side where edges only go from nodes on the ‘left’ side to nodes on the ‘right’ side.

Parameters: is_left (Union[VertexProperty, str]) – Boolean vertex property that, if the method returns True, will contain for each node whether it is on the ‘left’ side of the bipartite graph. If the method returns False, the content is undefined.
Return type: int
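
Example

A minimal sketch, in plain Python rather than pypgx, of the condition stated above: every edge must go from a ‘left’ node to a ‘right’ node. The function edges_respect_sides is a hypothetical name; it only verifies a given left/right assignment and does not search for one.

```python
def edges_respect_sides(edges, is_left):
    """Return True if every edge goes from a left vertex to a right vertex."""
    return all(is_left[src] and not is_left[dst] for src, dst in edges)


is_left = {1: True, 2: False, 3: True}
print(edges_respect_sides([(1, 2), (3, 2)], is_left))  # True
print(edges_respect_sides([(1, 2), (2, 3)], is_left))  # False
```

The second call fails because the edge (2, 3) starts on the right side.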

property is_directed: bool: Whether the graph is directed.

property is_fresh: bool: Check whether an in-memory representation of a graph is fresh.

is_pinned()

For a published graph, indicates if the graph is pinned. A pinned graph will stay published even if no session is using it.

Return type: bool

property is_published: bool: Check if this graph is published with snapshots.

is_published_with_snapshots()

Check if this graph is published with snapshots.

Returns: True if this graph is published, False otherwise
Return type: bool

property is_transient: bool: Whether the graph is transient.

property memory_mb: int: Get the amount of memory in megabytes that the graph consumes.

property name: str: Get the name of the graph.

property num_edges: int: Get the number of edges in the graph.

property num_vertices: int: Get the number of vertices in the graph.

property pgx_instance: ServerInstance: Get the server instance.

pick_random_vertex()

Select a random vertex from the graph.

Returns: The PgxVertex object
Return type: PgxVertex

pin()

For a published graph, pin the graph so that it stays published even if no session uses it. This call pins the graph lineage, which ensures that at least the latest available snapshot stays published when no session uses the graph.

Return type: None

prepare_pgql(pgql_query)

Prepare a PGQL query.

Parameters: pgql_query (str) – Query string in PGQL
Returns: A prepared statement object
Return type: PreparedStatement

publish(vertex_properties=False, edge_properties=False)

Publish the graph so it can be shared between sessions.

This moves the graph name from the private into the public namespace.

Parameters

vertex_properties (Union[List[VertexProperty], bool]) – List of vertex properties belonging to graph specified to be published as well
edge_properties (Union[List[EdgeProperty], bool]) – List of edge properties belonging to graph specified to be published as well

Return type

None

publish_with_snapshots()

Publish the graph and all its snapshots so they can be shared between sessions.

Return type: None

query_pgql(query)

Submit a pattern matching select only query.

Parameters: query (str) – Query string in PGQL
Returns: PgqlResultSet with the result
Return type: PgqlResultSet
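
Example

A minimal sketch of building a select-only query and submitting it with query_pgql. It assumes graph is a PgxGraph and reuses the query form from the analyst examples earlier in this reference; rank_vertices_by is a hypothetical helper, not part of pypgx.

```python
def rank_vertices_by(graph, property_name):
    """Return all vertices ordered by a vertex property, highest value first.

    `property_name` is interpolated into the query text, so it must be a
    trusted identifier, never raw user input.
    """
    query = (
        f"SELECT x, x.{property_name} MATCH (x) "
        f"ORDER BY x.{property_name} DESC"
    )
    return graph.query_pgql(query)
```

With a vertex property named bc, rank_vertices_by(graph, "bc") submits the same query as the betweenness centrality example, and the returned result set can be printed with result_set.print().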

remove_redaction_rule(redaction_rule_config, authorization_type, *names)

Remove a redaction rule for authorization_type names.

Possible authorization types are: [‘user’, ‘role’]

Parameters

authorization_type (str) – the authorization type of the rule to be removed
names (str) – the names of the users or roles for which the rule should be removed
redaction_rule_config (PgxRedactionRuleConfig) –

Return type

None

rename(name)

Rename this graph.

Parameters: name (str) – New name
Return type: None

revoke_permission(permission_entity)

Revoke all permissions on this graph from the given entity.

Possible PermissionEntity objects are: PgxUser and PgxRole.

Parameters: permission_entity (PermissionEntity) – the entity for which all permissions will be revoked
Return type: None

simplify(vertex_properties=True, edge_properties=True, keep_multi_edges=False, keep_self_edges=False, keep_trivial_vertices=False, in_place=False, name=None)

Create a simplified version of a graph.

Note that the returned graph and properties are transient and therefore session bound. They can be explicitly destroyed and get automatically freed once the session dies.

Parameters

vertex_properties (Union[bool, List[VertexProperty]]) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (Union[bool, List[EdgeProperty]]) – List of edge properties belonging to graph specified to be kept in the new graph
keep_multi_edges (bool) – Defines if multi-edges should be kept in the result
keep_self_edges (bool) – Defines if self-edges should be kept in the result
keep_trivial_vertices (bool) – Defines if isolated nodes should be kept in the result
in_place (bool) – Whether the operation should be done in place or a new graph should be created
name (Optional[str]) – New graph name. If None, a name will be generated. Only relevant if a new graph is to be created.

Return type
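
Example

A rough sketch of what the three keep_* flags control, in plain Python rather than pypgx. It rests on assumptions that the text above does not confirm: multi-edges are taken to be repeated (source, destination) pairs, and isolated vertices are determined after edges have been removed. The name simplify_edges is hypothetical.

```python
def simplify_edges(vertices, edges, keep_multi_edges=False,
                   keep_self_edges=False, keep_trivial_vertices=False):
    """Return (vertices, edges) after applying the three keep_* flags."""
    kept_edges = []
    seen = set()
    for src, dst in edges:
        if src == dst and not keep_self_edges:
            continue  # drop self-edges
        if (src, dst) in seen and not keep_multi_edges:
            continue  # drop repeated (src, dst) pairs
        seen.add((src, dst))
        kept_edges.append((src, dst))
    if keep_trivial_vertices:
        kept_vertices = list(vertices)
    else:
        # Assumption: a vertex is isolated if no remaining edge touches it.
        touched = {v for edge in kept_edges for v in edge}
        kept_vertices = [v for v in vertices if v in touched]
    return kept_vertices, kept_edges


print(simplify_edges([1, 2, 3, 4], [(1, 2), (1, 2), (3, 3), (2, 1)]))
```

With the default flags, the duplicate edge (1, 2) and the self-edge (3, 3) are dropped, which leaves vertices 3 and 4 isolated and therefore removed as well.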

simplify_with_strategy(mutation_strategy)

Create a simplified version of a graph using a custom mutation strategy.

Note that the returned graph and properties are transient and therefore session bound. They can be explicitly destroyed and get automatically freed once the session dies.

Parameters: mutation_strategy (MutationStrategy) – Defines a custom strategy for dealing with multi-edges.
Return type: PgxGraph

sort_by_degree(vertex_properties=True, edge_properties=True, ascending=True, in_degree=True, in_place=False, name=None)

Create a sorted version of a graph and all its properties.

The returned graph is sorted such that the node numbering is ordered by the degree of the nodes. Note that the returned graph and properties are transient.

Parameters

vertex_properties (Union[List[VertexProperty], bool]) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (Union[List[EdgeProperty], bool]) – List of edge properties belonging to graph specified to be kept in the new graph
ascending (bool) – Sorting order
in_degree (bool) – If in_degree should be used for sorting. Otherwise use out degree.
in_place (bool) – If the sorting should be done in place or a new graph should be created
name (Optional[str]) – New graph name

Return type

sparsify(sparsification, vertex_properties=True, edge_properties=True, name=None)

Sparsify the given graph and return a new graph with fewer edges.

Parameters

sparsification (float) – The sparsification coefficient. Must be between 0.0 and 1.0.
vertex_properties (bool) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (bool) – List of edge properties belonging to graph specified to be kept in the new graph
name (Optional[str]) – Filtered graph name

Return type

store(format, path, num_partitions=None, vertex_properties=True, edge_properties=True, overwrite=False)

Store graph in a file.

This method works for both partitioned and homogeneous graphs. Depending on whether the graph is partitioned or not, the format parameter accepts different values. See the documentation for the format parameter below.

Changed in version 22.3.1: Added support for storing partitioned graphs.

Parameters

format (str) – One of [‘pgb’, ‘edge_list’, ‘two_tables’, ‘adj_list’, ‘flat_file’, ‘graphml’, ‘csv’] for a homogeneous graph or one of [‘pgb’, ‘csv’] for a partitioned graph
path (str) – Path to which graph will be stored
num_partitions (Optional[int]) – The number of partitions that should be created, when exporting to multiple files
vertex_properties (bool) – The collection of vertex properties to store together with the graph data. If not specified, all the vertex properties are stored
edge_properties (bool) – The collection of edge properties to store together with the graph data. If not specified, all the edge properties are stored
overwrite (bool) – Overwrite if existing

Return type

GraphConfig
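
Example

A minimal sketch of validating the format argument against the two lists documented above before calling store. It assumes graph is a PgxGraph and that the caller knows whether the graph is partitioned; store_checked and the two constants are hypothetical names, not part of pypgx.

```python
HOMOGENEOUS_FORMATS = {
    "pgb", "edge_list", "two_tables", "adj_list",
    "flat_file", "graphml", "csv",
}
PARTITIONED_FORMATS = {"pgb", "csv"}


def store_checked(graph, fmt, path, partitioned, overwrite=False):
    """Call graph.store only if fmt is documented for this kind of graph."""
    allowed = PARTITIONED_FORMATS if partitioned else HOMOGENEOUS_FORMATS
    if fmt not in allowed:
        raise ValueError(
            f"format {fmt!r} is not supported here; expected one of {sorted(allowed)}"
        )
    return graph.store(fmt, path, overwrite=overwrite)
```

Failing early on the client side avoids a round trip to the server for a format that the documentation already rules out.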

transpose(vertex_properties=True, edge_properties=True, edge_label_mapping=None, in_place=False, name=None)

Create a transpose of this graph.

A transpose of a directed graph is another directed graph on the same set of vertices with all of the edges reversed. If this graph contains an edge (u,v), then the returned graph will contain an edge (v,u) and vice versa. If this graph is undirected (is_directed is False), this operation has no effect and will either return a copy or act as the identity function, depending on the in_place parameter.

Parameters

vertex_properties (bool) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (bool) – List of edge properties belonging to graph specified to be kept in the new graph
edge_label_mapping (Optional[Mapping[str, str]]) – Can be used to rename edge labels. For example, an edge (John,Mary) labeled “fatherOf” can be transformed to be labeled “hasFather” on the transpose graph’s edge (Mary,John) by passing in a dict-like object {"fatherOf": "hasFather"}.
in_place (bool) – If the transpose should be done in place or a new graph should be created
name (Optional[str]) – New graph name

Return type
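
Example

A minimal sketch of the transpose rule and the edge_label_mapping parameter, in plain Python rather than pypgx. The name transpose_edges and the (source, destination, label) tuples are hypothetical; the sketch assumes that labels missing from the mapping are kept unchanged.

```python
def transpose_edges(edges, edge_label_mapping=None):
    """Reverse every (src, dst, label) edge and rename labels via the mapping."""
    mapping = edge_label_mapping or {}
    return [(dst, src, mapping.get(label, label)) for src, dst, label in edges]


edges = [("John", "Mary", "fatherOf")]
print(transpose_edges(edges, {"fatherOf": "hasFather"}))
# [('Mary', 'John', 'hasFather')]
```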

undirect(vertex_properties=True, edge_properties=True, keep_multi_edges=True, keep_self_edges=True, keep_trivial_vertices=True, in_place=False, name=None)

Create an undirected version of the graph.

An undirected graph has some restrictions. Some algorithms are only supported on directed graphs or are not yet supported for undirected graphs. Further, PGX supports neither storing undirected graphs nor reading from undirected formats. Since the edges do not have a direction anymore, the behavior of PgxEdge.source or PgxEdge.destination can be ambiguous. In order to provide deterministic results, PGX will always return the vertex with the smaller internal id as source and the other as destination vertex.

Parameters

vertex_properties (Union[bool, List[VertexProperty]]) – List of vertex properties belonging to graph specified to be kept in the new graph
edge_properties (Union[bool, List[EdgeProperty]]) – List of edge properties belonging to graph specified to be kept in the new graph
keep_multi_edges (bool) – Defines if multi-edges should be kept in the result
keep_self_edges (bool) – Defines if self-edges should be kept in the result
keep_trivial_vertices (bool) – Defines if isolated nodes should be kept in the result
in_place (bool) – If the operation should be done in place or if a new graph has to be created
name (Optional[str]) – New graph name
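
The deterministic source/destination rule and the keep_* flags can be pictured with a small pure-Python sketch. It is an illustration under stated assumptions, not the PGX implementation: plain integers stand in for internal vertex ids, undirect_edges and non_trivial_vertices are hypothetical helpers, and the way PGX treats the properties of collapsed multi-edges is not modeled.

```python
from typing import Iterable, List, Tuple

# Hypothetical edge layout for this sketch: (source id, destination id).
Edge = Tuple[int, int]


def undirect_edges(
    edges: Iterable[Edge],
    keep_multi_edges: bool = True,
    keep_self_edges: bool = True,
) -> List[Edge]:
    """Drop direction: the smaller id is always reported as the source."""
    result = []
    seen = set()
    for src, dst in edges:
        if src == dst and not keep_self_edges:
            continue  # self-edge filtered out
        normalized = (min(src, dst), max(src, dst))
        if not keep_multi_edges:
            if normalized in seen:
                continue  # parallel edge filtered out
            seen.add(normalized)
        result.append(normalized)
    return result


def non_trivial_vertices(vertices: Iterable[int], edges: Iterable[Edge]) -> List[int]:
    """Keep only vertices touched by at least one edge (keep_trivial_vertices=False)."""
    touched = {v for edge in edges for v in edge}
    return [v for v in vertices if v in touched]


directed = [(2, 1), (1, 2), (3, 3)]
all_kept = undirect_edges(directed)
simple = undirect_edges(directed, keep_multi_edges=False, keep_self_edges=False)
```

In this sketch the edges (2,1) and (1,2) both become (1,2), so they count as multi-edges once direction is dropped.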

Return type

undirect_with_strategy(mutation_strategy)

Create an undirected version of the graph using a custom mutation strategy.

An undirected graph has some restrictions. Some algorithms are only supported on directed graphs or are not yet supported for undirected graphs. Further, PGX does not support storing undirected graphs nor reading from undirected formats. Since the edges do not have a direction anymore, the behavior of pgxEdge.source() or pgxEdge.destination() can be ambiguous. In order to provide deterministic results, PGX will always return the vertex with the smaller internal id as source and the other as destination vertex.

Parameters: mutation_strategy (MutationStrategy) – Defines a custom strategy for dealing with multi-edges.
Return type: PgxGraph

unpin()

For a published graph, unpin the graph so that if no snapshot of the graph is used by any session or pinned, the graph and all its snapshots can be removed.

Return type: None

property vertex_id_strategy: str: Get the strategy of the vertex id.

property vertex_id_type: str: Get the type of the vertex id.

class pypgx.api.PgxMap(graph, java_map)

Bases: PgxContextManager

A map is a collection of key-value pairs.

Parameters: graph (Optional[PgxGraph]) –

contains_key(key)

Return True if this map contains the given key.

Parameters: key – Key of the entry
Return type: bool

destroy()

Destroy this map.

Return type: None

entries()

Return an entry set.

Return type: dict

get(key)

Get the entry with the specified key.

Parameters: key – Key of the entry
Returns: Value
Return type: Any

property key_type: str: Type of the keys.

keys()

Return a key set.

Return type: list

property name: str: Name of the map.

put(key, value)

Set the value for a key in the map specified by the given name.

Parameters

key – Key of the entry
value – New value

Return type

None

remove(key)

Remove the entry specified by the given key from the map with the given name.

Returns true if the map did contain an entry with the given key, false otherwise.

Parameters: key – Key of the entry
Returns: True if the map contained the key
Return type: bool
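
As a usage sketch, the helper below relies only on the contains_key, get, put and remove methods documented above. The InMemoryMap class is a hypothetical stand-in so that the example runs without a PGX server; a real PgxMap would come from PgxSession.create_map. Since the text does not say what get returns for a missing key, the helper checks contains_key first.

```python
class InMemoryMap:
    """Hypothetical stand-in that mimics the PgxMap methods documented above."""

    def __init__(self):
        self._data = {}

    def contains_key(self, key):
        return key in self._data

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def remove(self, key):
        existed = key in self._data
        self._data.pop(key, None)
        return existed


def increment(counter_map, key, delta=1):
    """Add delta to the value stored under key, starting from 0 if absent."""
    current = counter_map.get(key) if counter_map.contains_key(key) else 0
    counter_map.put(key, current + delta)
    return current + delta


visits = InMemoryMap()
increment(visits, "v1")
increment(visits, "v1", 2)
```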

property session_id: int: Session id.

property size: int: Map size.

property value_type: str: Type of the values.

class pypgx.api.PgxPartition(graph, java_partition, property)

Bases: PgxContextManager

A vertex partition of a graph. Each partition is a set of vertices.

destroy()

Destroy the partition object.

Return type: None

get_components_property()

Return the property that contains for each vertex, its associated component ID.

Returns: The property that contains, for each vertex, its associated component ID.
Return type: VertexProperty

get_partition_by_index(idx)

Get a partition by index.

Parameters: idx (int) – The index. Must be between 0 and size() - 1.
Raises: RuntimeError – If the index is out of bounds.
Returns: The set of vertices representing the partition.
Return type: VertexSet

get_partition_by_vertex(v)

Get the partition a particular vertex belongs to.

Parameters: v (Union[PgxVertex, int, str]) – The vertex.
Returns: The set of vertices representing the partition the given vertex belongs to.
Return type: VertexSet

get_partition_index_of_vertex(v)

Get the index of the partition a particular vertex belongs to.

Parameters: v (Union[PgxVertex, int, str]) – The vertex.
Returns: The index of the partition the given vertex belongs to.
Return type: Any

property size: int

Return the size of the partition.

Returns: The size of the partition.
Return type: int
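
The relationship between the components property and the per-index lookups can be sketched in pure Python. This is an illustration only: build_partition, partition_by_index and partition_index_of_vertex are hypothetical helpers, and the sketch assumes that partition indices follow the sorted component IDs, which the text above does not state.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set


def build_partition(component_of: Dict[Hashable, int]) -> List[Set[Hashable]]:
    """Group vertices by component ID, ordered by ascending component ID."""
    groups = defaultdict(set)
    for vertex, component in component_of.items():
        groups[component].add(vertex)
    return [groups[component] for component in sorted(groups)]


def partition_by_index(parts: List[Set[Hashable]], idx: int) -> Set[Hashable]:
    """Return the vertex set at idx; idx must be between 0 and len(parts) - 1."""
    if not 0 <= idx < len(parts):
        raise RuntimeError(f"index {idx} is out of bounds")
    return parts[idx]


def partition_index_of_vertex(parts: List[Set[Hashable]], v: Hashable) -> int:
    """Return the index of the vertex set that contains v."""
    for idx, members in enumerate(parts):
        if v in members:
            return idx
    raise KeyError(v)


parts = build_partition({"a": 0, "b": 0, "c": 1})
```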

class pypgx.api.PgxPath(graph, java_path)

Bases: PgxContextManager

A path from a source to a destination vertex in a PgxGraph.

Parameters: graph (PgxGraph) –

property cost: float: Get the cost of the path.

property destination: Optional[PgxVertex]: Get the destination vertex.

destroy()

Destroy this path.

Return type: None

property edges: List[PgxEdge]: Return a list of edges in the path.

property exists: bool: Whether the path exists.

property hops: int: Get the number of hops in the path.

property path: List[Tuple[PgxVertex, Optional[PgxEdge]]]: Return path as a list of (vertex,edge) tuples.

property source: Optional[PgxVertex]: Get the source vertex.

property vertices: List[PgxVertex]: Return a list of vertices in the path.

class pypgx.api.PgxProperty(graph, java_prop)

Bases: PgxContextManager

A property of a PgxGraph.

Parameters: graph (PgxGraph) –

clone(name=None)

Create a copy of this property.

Parameters: name (Optional[str]) – Name of the copy to be created. If None, a guaranteed unique name will be generated.
Returns: property result
Return type: this class

close()

Free resources on the server taken up by this Property.

Returns: None
Return type: None

destroy()

Free resources on the server taken up by this Property.

Returns: None
Return type: None

property dimension: int: Return the dimension of this property.

property entity_type: str: Entity type of this property.

expand()

If this is a vector property, expands this property into a list of scalar properties of same type.

The first property will contain the first element of the vector, the second property the second element and so on.

Return type: Union[PgxProperty, List[PgxProperty]]
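
The expansion of a vector property into scalar properties can be pictured with plain dictionaries. The sketch below is not the PGX implementation: expand_vector_property is a hypothetical helper, and a dict from entity to vector stands in for a vector property.

```python
from typing import Any, Dict, Hashable, List, Sequence


def expand_vector_property(
    vector_values: Dict[Hashable, Sequence[Any]],
) -> List[Dict[Hashable, Any]]:
    """Split a vector-valued mapping into one scalar mapping per dimension."""
    dimensions = {len(vector) for vector in vector_values.values()}
    if len(dimensions) > 1:
        raise ValueError("all vectors must have the same dimension")
    dimension = dimensions.pop() if dimensions else 0
    return [
        {entity: vector[i] for entity, vector in vector_values.items()}
        for i in range(dimension)
    ]


scalars = expand_vector_property({"a": (1, 2, 3), "b": (4, 5, 6)})
```

The first resulting mapping holds the first element of every vector, the second mapping the second element, and so on.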

fill(value)

Fill this property with a given value.

Parameters: value (Any) – The value
Return type: None

get(key)

Get a property value.

Parameters: key (Union[PgxEntity, int, str]) – The key (vertex/edge) whose property to get
Return type: Any

get_bottom_k_values(k)

Get the bottom k vertex/edge value pairs according to their value.

Parameters: k (int) – How many bottom values to retrieve, must be in the range between 0 and number of nodes/edges (inclusive)
Return type: List[Tuple[PgxEntity, Any]]

get_property_id()

Get an internal identifier for this property.

Only meant for internal usage.

Returns: the internal identifier of this property
Return type: PgxId

get_top_k_values(k)

Get the top k vertex/edge value pairs according to their value.

Parameters: k (int) – How many top values to retrieve, must be in the range between 0 and number of nodes/edges (inclusive)
Returns: list of k key-value tuples where the keys are vertices/edges and the values are property values, sorted in ascending order
Return type: list of tuple(PgxVertex or PgxEdge, Any)
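
The selection performed by get_top_k_values and get_bottom_k_values can be sketched in pure Python with the standard heapq module. This is an illustration, not the PGX implementation: top_k and bottom_k are hypothetical helpers, a dict stands in for the property, and the order of the returned pairs is a choice of this sketch that may differ from what PGX returns.

```python
import heapq
from typing import Any, Dict, Hashable, List, Tuple


def _check_k(values: Dict[Hashable, Any], k: int) -> None:
    # k must lie between 0 and the number of entities (inclusive).
    if not 0 <= k <= len(values):
        raise ValueError(f"k must be between 0 and {len(values)}")


def top_k(values: Dict[Hashable, Any], k: int) -> List[Tuple[Hashable, Any]]:
    """Return the k (entity, value) pairs with the largest values."""
    _check_k(values, k)
    return heapq.nlargest(k, values.items(), key=lambda pair: pair[1])


def bottom_k(values: Dict[Hashable, Any], k: int) -> List[Tuple[Hashable, Any]]:
    """Return the k (entity, value) pairs with the smallest values."""
    _check_k(values, k)
    return heapq.nsmallest(k, values.items(), key=lambda pair: pair[1])


scores = {"v1": 0.5, "v2": 0.9, "v3": 0.1}
best = top_k(scores, 2)
worst = bottom_k(scores, 2)
```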

get_values()

Get the values of this property as a list.

Returns: a list of key-value tuples, where each key is a vertex and each value is the value assigned to that vertex
Return type: list of tuple(PgxVertex, set of str)

property is_published: bool

Check if this property is published.

Returns: True if this property is published, False otherwise.
Return type: bool

property is_transient: bool: Whether this property is transient.

property is_vector_property: bool: Whether this property is a vector property.

property name: str: Name of this property.

publish()

Publish the property into a shared graph so it can be shared between sessions.

Returns: None
Return type: None

rename(name)

Rename this property.

Parameters: name (str) – New name
Returns: None
Return type: None

set(key, value)

Set a property value.

Parameters

key (Union[PgxEntity, int, str]) – The key (vertex/edge) whose property to set
value (Any) – The property value

Return type

None

set_values(values)

Set the property values.

Parameters: values (PgxMap) – PgxMap with ids and values
Return type: None

property size: int: Return the number of elements in this property.

property type: str: Return the type of this property.

wrap(property_value, property_type)

Take a property value and wrap it in PGX entities if applicable.

Parameters

property_value (Any) – property value
property_type (str) – A valid property type.

Return type

Any

class pypgx.api.PgxSession(java_session)

Bases: PgxContextManager

A PGX session represents an active user connected to a ServerInstance.

Every session gets a workspace assigned on the server, which can be used to read graph data, create transient data or custom algorithms for the sake of graph analysis. Once a session gets destroyed, all data in the session workspace is freed.

Variables: LATEST_SNAPSHOT – The timestamp of the most recent snapshot, used to easily move to the newest snapshot (see set_snapshot())

close()

Close this session object.

Return type: None

compile_program(path, overwrite=False)

Compile a Green-Marl program for parallel execution with all optimizations enabled.

Parameters

path (str) – Path to program
overwrite (bool) – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise

Return type

CompiledProgram

compile_program_code(code, overwrite=False, parrallel=True, disabled_optimizations=None, verbose=False)

Compile a Green-Marl program (if it is supported by the corresponding PyPGX distribution). Otherwise compile a Java program.

Parameters

code (str) – The Green-Marl/Java code to compile
overwrite (bool) – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise
parrallel (bool) – if False, the compiled program will be optimized for sequential execution
disabled_optimizations (Optional[List[str]]) – list of compiler optimizations to disable
verbose (bool) – if True, the compiler will output compilation stages

Return type

CompiledProgram

create_analyst()

Create and return a new analyst.

Returns: An analyst object
Return type: Analyst

create_frame(schema, column_data, frame_name)

Create a frame with the specified data

Parameters

schema (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
column_data (Dict[str, List]) – Map of iterables, columnName -> columnData
frame_name (str) – Name of the frame

Returns

A frame containing the given data

Return type

PgxFrame
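
The shape of the schema and column_data arguments can be checked before they are sent to the server. The helper below is a hedged sketch: check_frame_arguments is a hypothetical function, the column type strings are placeholders rather than confirmed PGX type names, and the requirement that all columns have the same length is an assumption based on a frame being a table.

```python
from typing import Dict, List, Tuple


def check_frame_arguments(
    schema: List[Tuple[str, str]],
    column_data: Dict[str, List],
) -> int:
    """Validate that schema and column_data agree; return the row count."""
    names = [name for name, _column_type in schema]
    missing = [name for name in names if name not in column_data]
    extra = [name for name in column_data if name not in names]
    if missing or extra:
        raise ValueError(f"missing columns: {missing}, unexpected columns: {extra}")
    lengths = {len(column_data[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    return lengths.pop() if lengths else 0


# Placeholder type names; check the PGX documentation for the accepted values.
schema = [("name", "string"), ("age", "integer")]
column_data = {"name": ["Ann", "Bob"], "age": [31, 27]}
row_count = check_frame_arguments(schema, column_data)
```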

create_frame_builder(schema)

Create a frame builder initialized with the given schema

Parameters: schema (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
Returns: A frame builder initialized with the given schema
Return type: PgxFrameBuilder

create_graph_builder(id_type='integer', vertex_id_generation_strategy='user_ids', edge_id_generation_strategy='auto_generated')

Create a graph builder with the given vertex ID type and ID generation strategies.

Parameters

id_type (str) – The type of the vertex ID
vertex_id_generation_strategy (str) – The vertices Id generation strategy to be used
edge_id_generation_strategy (str) – The edges Id generation strategy to be used

Return type

GraphBuilder

create_map(key_type, value_type, name=None)

Create a map.

Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']

Parameters

key_type (str) – Property type of the keys that are going to be stored inside the map
value_type (str) – Property type of the values that are going to be stored inside the map
name (Optional[str]) – Map name

Returns

A named PgxMap of key content type key_type and value content type value_type

Return type

create_sequence(content_type, name=None)

Create a sequence of scalars.

Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']

Parameters

content_type (str) – Property type of the elements in the sequence
name (Optional[str]) – Sequence name

Returns

A named ScalarSequence of content type content_type

Return type

ScalarSequence

create_set(content_type, name=None)

Create a set of scalars.

Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']

Parameters

content_type (str) – content type of the set
name (Optional[str]) – the set’s name

Returns

A named ScalarSet of content type content_type

Return type

ScalarSet

describe_graph_file(file_path)

Describe the graph contained in the file at the given path.

Parameters: file_path (str) – Graph file path
Returns: The configuration which can be used to load the graph
Return type: GraphConfig

describe_graph_files(files_path)

Describe the graph contained in the files at the given paths.

Parameters: files_path (str) – Paths to the files
Returns: The configuration which can be used to load the graph
Return type: GraphConfig

destroy()

Destroy this session object.

Return type: None

edge_provider_from_frame(provider_name, source_provider, destination_provider, frame, source_vertex_column='src', destination_vertex_column='dst')

Create an edge provider from a PgxFrame to later build a PgxGraph

Parameters

provider_name (str) – edge provider name
source_provider (str) – vertex source provider name
destination_provider (str) – vertex destination provider name
frame (PgxFrame) – PgxFrame to use
source_vertex_column (str) – column to use as source keys. Defaults to “src”
destination_vertex_column (str) – column to use as destination keys. Defaults to “dst”

Returns

the EdgeFrameDeclaration object

Return type

EdgeFrameDeclaration

execute_pgql(pgql_query)

Submit any query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraphAsync(String).

Parameters: pgql_query (str) – Query string in PGQL
Returns: The query result set
Return type: Optional[PgqlResultSet]

Throws InterruptedException if the caller thread gets interrupted while waiting for completion. Throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

explain_pgql(pgql_query)

Explain the execution plan of a pattern matching query.

Note: Different PGX versions may return different execution plans.

Parameters: pgql_query (str) – Query string in PGQL
Returns: The query plan
Return type: Operation

get_available_compiled_program_ids()

Get the set of available compiled program IDs.

Return type: Set[str]

get_available_snapshots(snapshot)

Return a list of all available snapshots of the given input graph.

Parameters: snapshot (PgxGraph) – A ‘PgxGraph’ object for which the available snapshots shall be retrieved
Returns: A list of ‘GraphMetaData’ objects, each corresponding to a snapshot of the input graph
Return type: List[GraphMetaData]

get_compiled_program(id)

Get a compiled program by ID.

Parameters: id (str) – The id of the compiled program
Return type: CompiledProgram

get_execution_environment()

Get the execution environment for this session.

Returns: the execution environment

get_graph(name, namespace=None)

Find and return a graph with name name within the given namespace loaded inside PGX.

The search for the snapshot to return is done according to the following rules:

if namespace is private, then the search occurs on already referenced snapshots of the graph with name name and the most recent snapshot is returned

if namespace is public, then the search occurs on published graphs and the most recent snapshot of the published graph with name name is returned

if namespace is None, then the private namespace is searched first and, if no snapshot is found, the public namespace is then searched

Multiple calls of this method with the same parameters will return different PgxGraph objects referencing the same graph, with the server keeping track of how many references a session has to each graph.

Therefore, a graph is released within the server either if:

all the references are moved to another graph (e.g. via set_snapshot())

the PgxGraph.destroy() method is called on one reference. Note that this invalidates all references

Parameters

name (str) – The name of the graph
namespace (Namespace or None) – The namespace where to look up the graph

Returns

The graph with the given name

Return type

PgxGraph or None
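
The lookup order described by the rules above can be sketched in pure Python. This is an illustration, not the PGX implementation: resolve_graph is a hypothetical helper, the strings "private" and "public" stand in for Namespace values, and the two dicts stand in for the session-private and published graphs.

```python
from typing import Dict, Optional


def resolve_graph(
    name: str,
    namespace: Optional[str],
    private_graphs: Dict[str, str],
    public_graphs: Dict[str, str],
) -> Optional[str]:
    """Return the graph found under the given namespace, or None."""
    if namespace == "private":
        return private_graphs.get(name)
    if namespace == "public":
        return public_graphs.get(name)
    if namespace is None:
        # Private namespace first, public namespace as the fallback.
        found = private_graphs.get(name)
        return found if found is not None else public_graphs.get(name)
    raise ValueError(f"unknown namespace: {namespace!r}")


private_graphs = {"sales": "private sales graph"}
public_graphs = {"sales": "published sales graph", "hr": "published hr graph"}
default_lookup = resolve_graph("sales", None, private_graphs, public_graphs)
fallback_lookup = resolve_graph("hr", None, private_graphs, public_graphs)
```

With namespace left as None, a private graph shadows a published graph of the same name in this sketch, which follows from the private namespace being searched first.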

get_graphs(namespace=None)

Return a collection of graph names accessible under the given namespace.

Parameters: namespace (Optional[Namespace]) – The namespace where to look up the graphs
Return type: List[str]

get_idle_timeout()

Get the idle timeout of this session

Returns: the idle timeout in seconds
Return type: Optional[int]

get_name()

Get the identifier of the current session.

Returns: identifier of this session
Return type: str

get_pgql_result_set(id)

Get a PGQL result set by ID.

Parameters: id (str) – The PGQL result set ID
Returns: The requested PGQL result set or None if no such result set exists for this session
Return type: Optional[PgqlResultSet]

get_session_context()

Get the context describing the current session.

Returns: context of this session
Return type: SessionContext

get_source()

Get the current session source

Returns: session source
Return type: str

get_task_timeout()

Get the task timeout of this session

Returns: the task timeout in seconds
Return type: Optional[int]

graph_from_frames(graph_name, vertex_providers, edge_providers, partitioned=True)

Create PgxGraph from vertex providers and edge providers.

partitioned must be set to True if multiple vertex or edge providers are given

Parameters

graph_name (str) – graph name
vertex_providers (List[VertexFrameDeclaration]) – list of vertex providers
edge_providers (List[EdgeFrameDeclaration]) – list of edge providers
partitioned (bool) – whether the graph is partitioned or not. Defaults to True

Returns

the PgxGraph object

Return type

property id: str

Get the ID of the session.

Returns: The ID of the session.

property idle_timeout: Optional[int]

Get the idle timeout in seconds.

Returns: the idle timeout.

pandas_to_pgx_frame(pandas_dataframe, frame_name)

Create a frame from a pandas dataframe.

Duplicate columns will be renamed. Mixed column types are not supported.

This method requires pandas.

Parameters

pandas_dataframe – The Pandas dataframe to use
frame_name (str) – Name of the frame

Returns

the frame created

Return type

PgxFrame

prepare_pgql(pgql_query)

Prepare a pattern matching query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as getGraph(String).

Parameters: pgql_query (str) – Query string in PGQL
Returns: A prepared statement object
Return type: PreparedStatement

query_pgql(pgql_query)

Submit a pattern matching query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraph(String).

Parameters: pgql_query (str) – Query string in PGQL
Returns: The query result set
Return type: Optional[PgqlResultSet]

Throws InterruptedException if the caller thread gets interrupted while waiting for completion. Throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

read_frame()

Create a new frame reader with which it is possible to parameterize the loading of the row frame.

Returns: A frame reader object with which it is possible to parameterize the loading
Return type: PgxGenericFrameReader

read_graph_as_of(config, meta_data=None, creation_timestamp=None, new_graph_name=None)

Read a graph and its properties of a specific version (metaData or creationTimestamp) into memory.

The creationTimestamp must be a valid version of the graph.

Parameters

config (GraphConfig) – The graph config
meta_data (Optional[GraphMetaData]) – The metaData object returned by get_available_snapshots(GraphConfig) identifying the version
creation_timestamp (Optional[int]) – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out
new_graph_name (Optional[str]) – How the graph should be named. If None, a name will be generated.

Returns

The PgxGraph object

Return type

read_graph_by_name(graph_name, graph_source, schema=None, options=())

Parameters

graph_name (str) – Name of graph
graph_source (str) – Source of graph
schema (Optional[str]) – Schema of graph
options (Tuple[str, ...]) – Tuple of read graph options as strings. The possible options are in the list below.

Return type

List of options:

optimized_for_updates: Specify if the loaded graph will be optimized for updates (Default).

optimized_for_read: Specify if the loaded graph will be optimized for reads.

synchronizable: If used and graph cannot be synchronized, PGX will throw an Exception.

on_missing_vertex_ignore_edge: Ignore edges with missing source/destination vertex (without logging).

on_missing_vertex_ignore_edge_log: Ignore edges with missing source/destination vertex and log every ignored edge.

on_missing_vertex_ignore_edge_log_once: Ignore edges with missing source/destination vertex and log the first ignored edge.

on_missing_vertex_error: Throw an error when an edge misses source/destination vertex (Default).

Changed in version 23.4.1: Added the options parameter.
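
Because the options are passed as plain strings, a misspelled option is easy to overlook. The following sketch checks an options tuple against the list above before it is passed on. check_read_options is a hypothetical helper, and whether PGX itself rejects unknown option strings is not stated here.

```python
from typing import Iterable, Tuple

# Option names copied from the list of options documented above.
KNOWN_READ_OPTIONS = frozenset({
    "optimized_for_updates",
    "optimized_for_read",
    "synchronizable",
    "on_missing_vertex_ignore_edge",
    "on_missing_vertex_ignore_edge_log",
    "on_missing_vertex_ignore_edge_log_once",
    "on_missing_vertex_error",
})


def check_read_options(options: Iterable[str]) -> Tuple[str, ...]:
    """Return the options as a tuple, raising ValueError on unknown names."""
    options = tuple(options)
    unknown = sorted(set(options) - KNOWN_READ_OPTIONS)
    if unknown:
        raise ValueError(f"unknown read option(s): {', '.join(unknown)}")
    return options


options = check_read_options(
    ("optimized_for_read", "on_missing_vertex_ignore_edge_log_once")
)
```

The resulting tuple could then be passed as the options argument of read_graph_by_name.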

read_graph_file(file_path, file_format=None, graph_name=None)

Parameters

file_path (str) – File path
file_format (Optional[str]) – File format of graph
graph_name (Optional[str]) – Name of graph

Return type

read_graph_files(file_paths, edge_file_paths=None, file_format=None, graph_name=None)

Load the graph contained in the files at the given paths.

Parameters

file_paths (Union[str, Iterable[str]]) – Paths to the vertex files
edge_file_paths (Optional[Union[str, Iterable[str]]]) – Paths to the edge files
file_format (Optional[str]) – File format
graph_name (Optional[str]) – Loaded graph name

Return type

read_graph_with_properties(config, max_age=9223372036854775807, max_age_time_unit='days', block_if_full=False, update_if_not_fresh=True, graph_name=None)

Read a graph and its properties, specified in the graph config, into memory.

Parameters

config (Union[str, PathLike, Dict[str, Any], GraphConfig]) – The graph config
max_age (int) – If another snapshot of the given graph already exists, the age of the latest existing snapshot will be compared to the given maxAge. If the latest snapshot is in the given range, it will be returned, otherwise a new snapshot will be created.
max_age_time_unit (str) – The time unit of the maxAge parameter
block_if_full (bool) – If true and a new snapshot needs to be created but no more snapshots are allowed by the server configuration, the returned future will not complete until space becomes available. If full and this flag is false, the returned future will complete exceptionally instead.
update_if_not_fresh (bool) – If a newer data version exists in the backing data source (see PgxGraph.is_fresh()), this flag tells whether to read it and create another snapshot inside PGX. If the “snapshots_source” field of config is SnapshotsSource.REFRESH, the returned graph may have multiple snapshots, depending on whether previous reads with the same config occurred; otherwise, if the “snapshots_source” field is SnapshotsSource.CHANGE_SET, only the most recent snapshot (either pre-existing or freshly read) will be visible.
graph_name (Optional[str]) – How the graph should be named. If null, a name will be generated. If a graph with that name already exists, the returned future will complete exceptionally.
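
The max_age rule can be read as a simple age comparison. The sketch below is a hedged illustration, not the PGX implementation: should_reuse_snapshot is a hypothetical helper, durations are modeled with datetime.timedelta, and treating the boundary as inclusive is an assumption. The update_if_not_fresh and block_if_full flags are not modeled.

```python
from datetime import timedelta
from typing import Optional


def should_reuse_snapshot(
    latest_snapshot_age: Optional[timedelta],
    max_age: timedelta,
) -> bool:
    """Return True if an existing snapshot is recent enough to be returned."""
    if latest_snapshot_age is None:
        return False  # no snapshot exists yet, so a new one must be created
    return latest_snapshot_age <= max_age


reuse_recent = should_reuse_snapshot(timedelta(minutes=5), timedelta(minutes=10))
reuse_stale = should_reuse_snapshot(timedelta(hours=1), timedelta(minutes=10))
```

With the very large default max_age, an existing snapshot would always count as recent enough under this reading.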

Return type

read_subgraph_from_pg_sql(sql_graph_name, queries=None, config=None, *, num_connections=None, data_source_id=None, jdbc_url=None, keystore_alias=None, owner=None, password=None, schema=None, username=None, graph_name=None)

Load a subgraph of a SQL property graph.

Parameters

sql_graph_name (str) – The name of SQL graph.
graph_name (Optional[str]) – the name of the resulting graph.
queries (Union[None, str, PreparedPgqlQuery, List[Union[str, PreparedPgqlQuery]]]) – A query or queries used to specify which data is to be loaded.
config (Optional[GraphConfig]) – An optional config used to describe how data should be loaded.
num_connections (Optional[int]) – The number of connections to open to load the data in parallel.
data_source_id (Optional[str]) – The dataSourceId to which to connect.
jdbc_url (Optional[str]) – The jdbcUrl to use for connection to the DB.
keystore_alias (Optional[str]) – The key store alias to retrieve the password from the keystore.
owner (Optional[str]) – The owner (schema) of the PG view from which to load the data.
password (Optional[str]) – The password to use for connecting to the database.
schema (Optional[str]) – The schema from which to load the PG view.
username (Optional[str]) – The username of the DB user to use to connect to the DB.

Returns

The graph.

Return type

read_subgraph_from_pg_view(view, queries=None, config=None, *, num_connections=None, data_source_id=None, jdbc_url=None, keystore_alias=None, owner=None, password=None, schema=None, username=None, graph_name=None)

Load a graph from PG Views.

Parameters

view (str) – The name of the PG View.
graph_name (Optional[str]) – the name of the resulting graph.
queries (Union[None, str, PreparedPgqlQuery, List[Union[str, PreparedPgqlQuery]]]) – A query or queries used to specify which data is to be loaded.
config (Optional[GraphConfig]) – An optional config used to describe how data should be loaded.
num_connections (Optional[int]) – The number of connections to open to load the data in parallel.
data_source_id (Optional[str]) – The dataSourceId to which to connect.
jdbc_url (Optional[str]) – The jdbcUrl to use for connection to the DB.
keystore_alias (Optional[str]) – The key store alias to retrieve the password from the keystore.
owner (Optional[str]) – The owner (schema) of the PG view from which to load the data.
password (Optional[str]) – The password to use for connecting to the database.
schema (Optional[str]) – The schema from which to load the PG view.
username (Optional[str]) – The username of the DB user to use to connect to the DB.

Returns

The graph.

Return type