PyPGX API

Public API for the PGX client.

class pypgx.api.AllPaths(graph, java_all_paths)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

The paths from one source vertex to all other vertices.

destroy()

Destroy this object.

get_path(destination)

Get the path.

Parameters

destination – The destination node

Returns

The path result to the destination node

class pypgx.api.Analyst(session, java_analyst)

Bases: object

The Analyst gives access to all built-in algorithms of PGX.

Unlike some of the other classes inside this package, the Analyst is not stateless. It creates session-bound transient data to hold the result of algorithms and keeps track of them.

adamic_adar_counting(graph, aa='adamic_adar')

Adamic-Adar counting compares the number of neighbors shared between vertices; this measure can be used with communities.

Parameters
  • graph – Input graph

  • aa – Edge property holding the Adamic-Adar index for each edge in the graph. Can be a string or an EdgeProperty object.

Returns

Edge property holding the computed scores
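To make the measure concrete, the following is a minimal pure-Python sketch of the Adamic-Adar index for one pair of vertices: the sum of 1 / log(degree) over their shared neighbors. It is not the PGX implementation; the helper name adamic_adar_index and the adjacency-dictionary graph are assumptions made only for this illustration.

```python
import math

def adamic_adar_index(adj, u, v):
    """Sum 1 / log(deg(w)) over the neighbors w shared by u and v."""
    shared = adj[u] & adj[v]
    # A shared neighbor has degree >= 2 in an undirected graph, so log(deg) > 0.
    return sum(1.0 / math.log(len(adj[w])) for w in shared)

# Undirected toy graph stored as vertex -> set of neighbors.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}
score = adamic_adar_index(adj, "a", "b")
```

Shared neighbors with few connections contribute more to the score than hubs do, which is why the degree appears inside the logarithm in the denominator.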

all_reachable_vertices_edges(graph, src, dst, k, filter=None)

Find all the vertices and edges on paths between the source and destination vertices whose length is smaller than or equal to k.

Parameters
  • graph – Input graph

  • src – The source vertex

  • dst – The destination vertex

  • k – Maximum length of the paths to be considered

  • filter – The filter to be used on edges when searching for a path

Returns

The vertices on the path, the edges on the path and a map containing the distances from the source vertex for each vertex on the path

approximate_vertex_betweenness_centrality(graph, seeds, bc='approx_betweenness')
Parameters
  • graph – Input graph

  • seeds – The (unique) chosen nodes to be used to compute the approximated betweenness centrality coefficients

  • bc – Vertex property holding the betweenness centrality value for each vertex

Returns

Vertex property holding the computed scores

bipartite_check(graph, is_left='is_left')

Verify whether a graph is bipartite.

Parameters
  • graph – Input graph

  • is_left – (out-argument) vertex property holding the side of each vertex in a bipartite graph (true for left, false for right).

Returns

vertex property holding the side of each vertex in a bipartite graph (true for left, false for right).
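As an illustration of what the is_left property encodes, here is a small sketch of a bipartite check by breadth-first two-coloring. It works on an undirected adjacency dictionary and is only an assumption about the general technique, not PGX's algorithm; the function name bipartite_sides is hypothetical.

```python
from collections import deque

def bipartite_sides(adj):
    """Return {vertex: True for left / False for right}, or None if not bipartite."""
    is_left = {}
    for start in adj:
        if start in is_left:
            continue
        is_left[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in is_left:
                    is_left[w] = not is_left[u]
                    queue.append(w)
                elif is_left[w] == is_left[u]:
                    return None  # odd cycle: both ends of an edge on the same side
    return is_left

square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
sides = bipartite_sides(square)
```

A cycle of even length such as the square can be split into two sides, while the triangle cannot.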

center(graph, center=None)

Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.

The center comprises the set of vertices with eccentricity equal to the radius of the graph.

Parameters
  • graph – Input graph

  • center – (Out argument) vertex set holding the vertices from the periphery or center of the graph
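The relationship between eccentricity, radius, diameter, center and periphery can be sketched in plain Python as below. The sketch assumes a connected, undirected graph given as an adjacency dictionary and takes the periphery to be the vertices whose eccentricity equals the diameter; it does not reflect how PGX computes these values internally.

```python
from collections import deque

def eccentricities(adj):
    """Eccentricity of each vertex: the largest hop distance to any other vertex."""
    ecc = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        ecc[src] = max(dist.values())
    return ecc

# Path graph 1 - 2 - 3 - 4 - 5 (connected, undirected).
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ecc = eccentricities(adj)
radius = min(ecc.values())
diameter = max(ecc.values())
center = {v for v, e in ecc.items() if e == radius}
periphery = {v for v, e in ecc.items() if e == diameter}
```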

close()

Destroy without waiting for completion.

closeness_centrality(graph, cc='closeness')
Parameters
  • graph – Input graph

  • cc – Vertex property holding the closeness centrality

communities_conductance_minimization(graph, max_iter=100, label='conductance_minimization')

Soman and Narang can find communities in a graph taking weighted edges into account.

Parameters
  • graph – Input graph

  • max_iter – Maximum number of iterations that will be performed

  • label – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

communities_infomap(graph, rank, weight, tau=0.15, tol=0.0001, max_iter=100, label='infomap')

Infomap can find high quality communities in a graph.

Parameters
  • graph – Input graph

  • rank – Vertex property holding the normalized PageRank value for each vertex

  • weight – Edge property holding the weight of each edge in the graph

  • tau – Damping factor

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • max_iter – Maximum iteration number

  • label – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

communities_label_propagation(graph, max_iter=100, label='label_propagation')

Label propagation can find communities in a graph relatively fast.

Parameters
  • graph – Input graph

  • max_iter – Maximum number of iterations that will be performed

  • label – Vertex property holding the label of the community assigned to each vertex. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the communities found by the algorithm

compute_high_degree_vertices(graph, k, high_degree_vertex_mapping=None, high_degree_vertices=None)

Compute the k vertices with the highest degrees in the graph.

Parameters
  • graph – Input graph

  • k – Number of high-degree vertices to be computed

  • high_degree_vertex_mapping – (out argument) map with the top k high-degree vertices and their indices

  • high_degree_vertices – (out argument) the high-degree vertices

Returns

a map with the top k high-degree vertices and their indices and a vertex set containing the same vertices

conductance(graph, partition, partition_idx)

Conductance assesses the quality of a partition in a graph.

Parameters
  • graph – Input graph

  • partition – Partition of the graph with the corresponding node collections

  • partition_idx – Number of the component to be used for computing its conductance
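For orientation, one common definition of conductance is the number of edges leaving a component divided by the smaller of the two volumes (sum of degrees) on either side of the cut. The sketch below uses that definition on an undirected adjacency dictionary; whether PGX uses exactly this formula is an assumption, and the function shown is a hypothetical stand-in, not the API method.

```python
def conductance(adj, component):
    """Cut size divided by the smaller volume of the component and its complement."""
    component = set(component)
    cut = sum(1 for u in component for w in adj[u] if w not in component)
    vol_in = sum(len(adj[u]) for u in component)
    vol_out = sum(len(adj[u]) for u in adj if u not in component)
    return cut / min(vol_in, vol_out)

# Two triangles joined by the single edge 3-4.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
partition = [{1, 2, 3}, {4, 5, 6}]
value = conductance(adj, partition[0])
```

A low value means the component is well separated from the rest of the graph; here a single edge crosses the cut against a volume of 7 on each side.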

count_triangles(graph, sort_vertices_by_degree)

Triangle counting gives an overview of the number of connections between vertices in neighborhoods.

Parameters
  • graph – Input graph

  • sort_vertices_by_degree – Boolean flag for sorting the nodes by their degree as preprocessing step

Returns

The total number of triangles found
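A minimal sketch of triangle counting on an undirected adjacency dictionary follows. Each triangle is counted once by only following neighbors with a larger id; ordering by degree instead of by id (as the sort_vertices_by_degree flag suggests) would be a performance choice and would not change the total. The code is an illustration, not the PGX implementation.

```python
def count_triangles(adj):
    """Count each triangle once by only looking at neighbors with a larger id."""
    total = 0
    for u in adj:
        higher = {w for w in adj[u] if w > u}
        for v in higher:
            # Common neighbors above v close a triangle u < v < w.
            total += sum(1 for w in adj[v] if w > v and w in higher)
    return total

# Two triangles sharing the edge 1-2: (0, 1, 2) and (1, 2, 3).
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
triangles = count_triangles(adj)
```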

create_distance_index(graph, high_degree_vertex_mapping, high_degree_vertices, index=None)

Compute an index with distances to each high-degree vertex

Parameters
  • graph – Input graph

  • high_degree_vertex_mapping – Map with the top k high-degree vertices and their indices

  • high_degree_vertices – the high-degree vertices

  • index – (out-argument) the index containing the distances to each high-degree vertex for all vertices

Returns

the index containing the distances to each high-degree vertex for all vertices

deepwalk_builder(min_word_frequency=1, batch_size=128, num_epochs=2, layer_size=200, learning_rate=0.025, min_learning_rate=0.0001, window_size=5, walk_length=5, walks_per_vertex=4, sample_rate=1e-05, negative_sample=10, validation_fraction=0.05, seed=None)

Build a Deepwalk model and return it.

Parameters
  • min_word_frequency – Minimum word frequency to consider before pruning

  • batch_size – Batch size for training the model

  • num_epochs – Number of epochs to train the model

  • layer_size – Number of dimensions for the output vectors

  • learning_rate – Initial learning rate

  • min_learning_rate – Minimum learning rate

  • window_size – Window size to consider while training the model

  • walk_length – Length of the walks

  • walks_per_vertex – Number of walks to consider per vertex

  • sample_rate – Sample rate

  • negative_sample – Number of negative samples

  • validation_fraction – Fraction of training data on which to compute final loss

  • seed – Random seed for training the model

Returns

Built deepwalk model

degree_centrality(graph, dc='degree')

Measure the centrality of the vertices based on their degree, letting you see how a vertex influences its neighborhood.

Parameters
  • graph – Input graph

  • dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

destroy()

Destroy and wait for completion.

diameter(graph, eccentricity='eccentricity')

Diameter/radius gives an overview of the distances in a graph.

Parameters
  • graph – Input graph

  • eccentricity – (Out argument) vertex property holding the eccentricity value for each vertex

Returns

Pair holding the diameter of the graph and a node property with eccentricity value for each node

eigenvector_centrality(graph, tol=0.001, max_iter=100, l2_norm=False, in_edges=False, ec='eigenvector')

Eigenvector centrality computes the centrality of each vertex in an intricate way based on its neighbors, which makes it possible to find well-connected vertices.

Parameters
  • graph – Input graph

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • max_iter – Maximum iteration number

  • l2_norm – Boolean flag to determine whether the algorithm will use the l2 norm (Euclidean norm) or the l1 norm (absolute value) to normalize the centrality scores

  • in_edges – Boolean flag to determine whether the algorithm will use the incoming or the outgoing edges in the graph for the computations

  • ec – Vertex property holding the resulting score for each vertex

Returns

Vertex property holding the computed scores

enumerate_simple_paths(graph, src, dst, k, vertices_on_path, edges_on_path, dist)

Enumerate simple paths between the source and destination vertex.

Parameters
  • graph – Input graph

  • src – The source vertex

  • dst – The destination vertex

  • k – maximum number of iterations

  • vertices_on_path – VertexSet containing all vertices to be considered while enumerating paths

  • edges_on_path – EdgeSet containing all edges to be considered while enumerating paths

  • dist – map containing the hop-distance from the source vertex to each vertex that is to be considered while enumerating the paths

Returns

Triple containing the path lengths, a vertex sequence containing the vertices on the paths and an edge sequence containing the edges on the paths

fattest_path(graph, root, capacity, distance='fattest_path_distance', parent='fattest_path_parent', parent_edge='fattest_path_parent_edge')

Fattest path is a fast algorithm for finding a shortest path that adds constraints for flow-related matters.

Parameters
  • graph – Input graph

  • root – The source vertex from the graph for the path

  • capacity – Edge property holding the capacity of each edge in the graph

  • distance – Vertex property holding the capacity value of the fattest path up to the current vertex

  • parent – Vertex property holding the parent vertex of each vertex in the fattest path

  • parent_edge – Vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths object holding the information of the possible fattest paths from the source node

filtered_bfs(graph, root, filter, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')

Breadth-first search with an option to filter edges during the traversal of the graph.

Parameters
  • graph – Input graph

  • root – The source vertex from the graph for the path.

  • filter – GraphFilter object used to filter out undesired nodes

  • navigator – Navigator expression to be evaluated on the vertices during the graph traversal

  • init_with_inf – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.

  • max_depth – Maximum depth limit for the BFS traversal

  • distance – Vertex property holding the hop distance for each reachable vertex in the graph

  • parent – Vertex property holding the parent vertex of each reachable vertex in the path

Returns

Distance and parent vertex properties
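The interplay of the filter, max_depth, distance and parent arguments can be sketched with a plain breadth-first search. In this illustration the filter is an ordinary Python predicate on vertices (the parameter name keep is hypothetical) and unreached vertices are simply absent from the result instead of being set to INF or -1; real PGX filters and navigators are expressions evaluated by the engine and are not modeled here.

```python
from collections import deque

def filtered_bfs(out_edges, root, keep, max_depth):
    """BFS from root that skips vertices rejected by `keep` and stops at max_depth."""
    distance = {root: 0}
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if distance[u] == max_depth:
            continue  # do not expand vertices at the depth limit
        for w in out_edges[u]:
            if w not in distance and keep(w):
                distance[w] = distance[u] + 1
                parent[w] = u
                queue.append(w)
    return distance, parent

out_edges = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
# Hypothetical filter: never walk through vertex 2.
distance, parent = filtered_bfs(out_edges, 1, keep=lambda v: v != 2, max_depth=2)
```

Vertex 5 is three hops away from the root, so it stays unreached with a depth limit of 2.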

filtered_dfs(graph, root, filter, navigator, init_with_inf=True, max_depth=2147483647, distance='distance', parent='parent')

Depth-first search with an option to filter edges during the traversal of the graph.

Parameters
  • graph – Input graph

  • root – The source vertex from the graph for the path

  • filter – GraphFilter object used to filter out undesired nodes

  • navigator – Navigator expression to be evaluated on the vertices during the graph traversal

  • init_with_inf – Boolean flag to set the initial distance values of the vertices. If set to true, it will initialize the distances as INF, and -1 otherwise.

  • max_depth – Maximum search depth

  • distance – Vertex property holding the hop distance for each reachable vertex in the graph

  • parent – Vertex property holding the parent vertex of each reachable vertex in the path

Returns

Distance and parent vertex properties

find_cycle(graph, src=None, vertex_seq=None, edge_seq=None)

Find cycle looks for any loop in the graph.

Parameters
  • graph – Input graph

  • src – Source vertex for the search

  • vertex_seq – (Out argument) vertex sequence holding the vertices in the cycle

  • edge_seq – (Out argument) edge sequence holding the edges in the cycle

Returns

PgxPath representing the cycle as a path, if one exists.
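A cycle search from a source vertex can be sketched as a depth-first search that reports the first back edge it meets. The code below is an illustrative assumption about the technique on a directed adjacency dictionary; it returns a plain list of vertices rather than a PgxPath.

```python
def find_cycle(out_edges, src):
    """Return the vertices of a directed cycle reachable from src, or None."""
    on_stack = []  # current DFS path
    state = {}     # vertex -> "active" (on the path) or "done"

    def visit(u):
        state[u] = "active"
        on_stack.append(u)
        for w in out_edges[u]:
            if state.get(w) == "active":
                return on_stack[on_stack.index(w):] + [w]  # back edge closes a cycle
            if w not in state:
                cycle = visit(w)
                if cycle:
                    return cycle
        on_stack.pop()
        state[u] = "done"
        return None

    return visit(src)

out_edges = {1: [2], 2: [3], 3: [4, 2], 4: []}
cycle = find_cycle(out_edges, 1)
```

The returned list starts and ends at the same vertex, mirroring the idea of a vertex sequence describing the cycle.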

get_deepwalk_model_loader()

Return a ModelLoader that can be used for loading a DeepWalkModel.

Returns

ModelLoader

get_pg2vec_model_loader()

Return a ModelLoader that can be used for loading a Pg2vecModel.

Returns

ModelLoader

get_supervised_graphwise_model_loader()

Return a ModelLoader that can be used for loading a SupervisedGraphWiseModel.

Returns

ModelLoader

get_unsupervised_graphwise_model_loader()

Return a ModelLoader that can be used for loading an UnsupervisedGraphWiseModel.

Returns

ModelLoader

graphwise_conv_layer_config(num_sampled_neighbors=10, neighbor_weight_property_name=None, activation_fn='ReLU', weight_init_scheme='XAVIER_UNIFORM')

Build a GraphWise conv layer configuration and return it.

Parameters
  • num_sampled_neighbors – Number of neighbors to sample

  • neighbor_weight_property_name – Neighbor weight property name.

  • activation_fn – Activation function. Supported functions: RELU, LEAKY_RELU, TANH, LINEAR. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g. softmax or sigmoid.

  • weight_init_scheme – Initialization scheme for the weights in the layer. Supported schemes: XAVIER, XAVIER_UNIFORM, ONES, ZEROS. Note that biases are always initialized with zeros.

Returns

Built GraphWiseConvLayerConfig

graphwise_dgi_layer_config(corruption_function=None, readout_function='MEAN', discriminator='BILINEAR')

Build a GraphWise DGI layer configuration and return it.

Parameters
  • corruption_function(CorruptionFunction) – Corruption Function to use

  • readout_function(str) – Readout function. Supported functions: MEAN

  • discriminator(str) – discriminator function. Supported functions: BILINEAR

Returns

GraphWiseDgiLayerConfig object

graphwise_pred_layer_config(hidden_dim=None, activation_fn='ReLU', weight_init_scheme='XAVIER_UNIFORM')

Build a GraphWise prediction layer configuration and return it.

Parameters
  • hidden_dim – Hidden dimension. If this is the last layer, this setting will be ignored and replaced by the number of classes.

  • activation_fn – Activation function. Supported functions: RELU, LEAKY_RELU, TANH, LINEAR. If this is the last layer, this setting will be ignored and replaced by the activation function of the loss function, e.g. softmax or sigmoid.

  • weight_init_scheme – Initialization scheme for the weights in the layer. Supported schemes: XAVIER, XAVIER_UNIFORM, ONES, ZEROS. Note that biases are always initialized with zeros.

Returns

Built GraphWisePredictionLayerConfig

hits(graph, max_iter=100, auth='authorities', hubs='hubs')

Hyperlink-Induced Topic Search (HITS) assigns ranking scores to the vertices, aimed to assess the quality of information and references in linked structures.

Parameters
  • graph – Input graph

  • max_iter – Number of iterations that will be performed

  • auth – Vertex property holding the authority score for each vertex

  • hubs – Vertex property holding the hub score for each vertex

Returns

Vertex property holding the computed scores

in_degree_centrality(graph, dc='in_degree')

Measure the in-degree centrality of the vertices based on their degree, letting you see how a vertex influences its neighborhood.

Parameters
  • graph – Input graph

  • dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

in_degree_distribution(graph, dist_map=None)
Parameters
  • graph – Input graph

  • dist_map – (Out argument) map holding a histogram of the vertex degrees in the graph

Returns

Map holding a histogram of the vertex degrees in the graph

k_core(graph, min_core=0, max_core=2147483647, kcore='kcore')

k-core decomposes a graph into layers revealing subgraphs with particular properties.

Parameters
  • graph – Input graph

  • min_core – Minimum k-core value

  • max_core – Maximum k-core value

  • kcore – Vertex property holding the result value

Returns

Pair holding the maximum core found and a node property with the largest k-core value for each node.
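The layering that k-core produces can be illustrated with the classic peeling procedure: repeatedly remove the vertex with the smallest remaining degree and record the largest degree seen so far as its core number. This is a simple quadratic-time sketch on an undirected adjacency dictionary, given as an assumption about the general technique rather than PGX's implementation.

```python
def core_numbers(adj):
    """Peel vertices in order of remaining degree to get each vertex's core number."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # vertex with the smallest remaining degree
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core

# Triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
core = core_numbers(adj)
max_core = max(core.values())
```

The pair (max_core, core) corresponds in spirit to the documented return value: the maximum core found and the per-vertex k-core value.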

limited_shortest_path_hop_dist(graph, src, dst, max_hops, high_degree_vertex_mapping, high_degree_vertices, index, path_vertices=None, path_edges=None)

Compute the shortest path between the source and destination vertex. The algorithm only considers paths up to a length of max_hops.

Parameters
  • graph – Input graph

  • src – The source vertex

  • dst – The destination vertex

  • max_hops – The maximum number of edges to follow when trying to find a path

  • high_degree_vertex_mapping – Map with the top k high-degree vertices and their indices

  • high_degree_vertices – The high-degree vertices

  • index – Index containing distances to high-degree vertices

  • path_vertices – (out-argument) will contain the vertices on the found path or will be empty if there is none

  • path_edges – (out-argument) will contain the edges on the found path or will be empty if there is none

Returns

A tuple containing the vertices in the shortest path from src to dst and the edges on the path. Both will be empty if there is no path within max_hops steps

limited_shortest_path_hop_dist_filtered(graph, src, dst, max_hops, high_degree_vertex_mapping, high_degree_vertices, index, filter, path_vertices=None, path_edges=None)

Compute the shortest path between the source and destination vertex. The algorithm only considers paths up to a length of max_hops.

Parameters
  • graph – Input graph

  • src – The source vertex

  • dst – The destination vertex

  • max_hops – The maximum number of edges to follow when trying to find a path

  • high_degree_vertex_mapping – Map with the top k high-degree vertices and their indices

  • high_degree_vertices – The high-degree vertices

  • index – Index containing distances to high-degree vertices

  • filter – Filter to be evaluated on the edges when searching for a path

  • path_vertices – (out-argument) will contain the vertices on the found path or will be empty if there is none

  • path_edges – (out-argument) will contain the edges on the found path or will be empty if there is none

Returns

A tuple containing the vertices in the shortest path from src to dst and the edges on the path. Both will be empty if there is no path within max_hops steps

load_deepwalk_model(path, key)

Load an encrypted DeepWalk model.

Parameters
  • path – Path to model

  • key – The decryption key, or None if no encryption was used

Returns

Loaded model

load_pg2vec_model(path, key)

Load an encrypted pg2vec model.

Parameters
  • path – Path to model

  • key – The decryption key, or None if no encryption was used

Returns

Loaded model

load_supervised_graphwise_model(path, key)

Load an encrypted SupervisedGraphWise model.

Parameters
  • path – Path to model

  • key – The decryption key, or None if no encryption was used

Returns

Loaded model

load_unsupervised_graphwise_model(path, key)

Load an encrypted UnsupervisedGraphWise model.

Parameters
  • path – Path to model

  • key – The decryption key, or None if no encryption was used

Returns

Loaded model

local_clustering_coefficient(graph, lcc='lcc')

LCC gives information about potential clustering options in a graph.

Parameters
  • graph – Input graph

  • lcc – Vertex property holding the lcc value for each vertex

Returns

Vertex property holding the lcc value for each vertex

louvain(graph, weight, max_iter=100, nbr_pass=1, tol=0.0001, community='community')

Louvain detects communities in a graph.

Parameters
  • graph – Input graph.

  • weight – Weights of the edges of the graph.

  • max_iter – Maximum number of iterations that will be performed during each pass.

  • nbr_pass – Number of passes that will be performed.

  • tol – Maximum tolerated error value. The algorithm will stop once the graph’s total modularity gain becomes smaller than this value.

  • community – Vertex property holding the community ID assigned to each vertex

Returns

Community IDs vertex property

matrix_factorization_gradient_descent(bipartite_graph, weight, learning_rate=0.001, change_per_step=1.0, lbd=0.15, max_iter=100, vector_length=10, features='features')
Parameters
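  • bipartite_graph – Input graph

  • weight – Edge property holding the weight of each edge in the graph. If the weight values are not between 1 and 5, the result will become inaccurate.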

  • learning_rate – Learning rate for the optimization process

  • change_per_step – Parameter used to modulate the learning rate during the optimization process

  • lbd – Penalization parameter to avoid over-fitting during optimization process

  • max_iter – Maximum number of iterations that will be performed

  • vector_length – Size of the feature vectors to be generated for the factorization

  • features – Vertex property holding the generated feature vectors for each vertex. This function accepts names and VertexProperty objects.

Returns

Matrix factorization model holding the feature vectors found by the algorithm

matrix_factorization_recommendations(bipartite_graph, user, vector_length, feature, estimated_rating=None)

Complement for Matrix Factorization. The generated feature vectors will be used for making predictions in cases where the given user vertex has not been related to a particular item from the item set. Similarly to the recommendations from matrix factorization, this algorithm will perform dot products between the given user vertex and the rest of vertices in the graph, giving a score of 0 to the items that are already related to the user and to the products with other user vertices, hence returning the results of the dot products for the unrelated item vertices. The scores from those dot products can be interpreted as the predicted scores for the unrelated items given a particular user vertex.

Parameters
  • bipartite_graph – Bipartite input graph

  • user – Vertex from the left (user) side of the graph

  • vector_length – size of the feature vectors

  • feature – vertex property holding the feature vectors for each vertex

  • estimated_rating – (out argument) vertex property holding the estimated rating score for each vertex

Returns

vertex property holding the estimated rating score for each vertex

out_degree_centrality(graph, dc='out_degree')

Measure the out-degree centrality of the vertices based on their degree, letting you see how a vertex influences its neighborhood.

Parameters
  • graph – Input graph

  • dc – Vertex property holding the degree centrality value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Vertex property holding the computed scores

out_degree_distribution(graph, dist_map=None)
Parameters
  • graph – Input graph

  • dist_map – (Out argument) map holding a histogram of the vertex degrees in the graph

Returns

Map holding a histogram of the vertex degrees in the graph

pagerank(graph, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='pagerank')
Parameters
  • graph – Input graph

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor

  • max_iter – Maximum number of iterations that will be performed

  • norm – Determine whether the algorithm will take into account dangling vertices for the ranking scores.

  • rank – Vertex property holding the PageRank value for each vertex, or name for a new property

Returns

Vertex property holding the PageRank value for each vertex
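How tol, damping and max_iter interact can be sketched with a plain power iteration: ranks are updated until the summed per-vertex change falls below tol or max_iter is reached. The sketch always spreads the rank of vertices without outgoing edges evenly over the graph; how the norm flag treats such dangling vertices in PGX is not specified here, so that choice is an assumption, and the function is an illustration rather than the API method.

```python
def pagerank(out_edges, tol=0.001, damping=0.85, max_iter=100):
    """Power iteration that stops when the summed per-vertex change drops below tol."""
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(max_iter):
        # Rank held by vertices without outgoing edges is spread evenly over all vertices.
        dangling = sum(rank[v] for v, outs in out_edges.items() if not outs)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in out_edges}
        for v, outs in out_edges.items():
            for w in outs:
                new_rank[w] += damping * rank[v] / len(outs)
        error = sum(abs(new_rank[v] - rank[v]) for v in out_edges)
        rank = new_rank
        if error < tol:
            break
    return rank

# Directed toy graph: vertex -> list of successors.
out_edges = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
rank = pagerank(out_edges)
```

Vertex "c" collects links from three vertices and ends up with the highest rank, while "d" has no incoming edges and keeps only the baseline share.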

pagerank_approximate(graph, tol=0.001, damping=0.85, max_iter=100, rank='approx_pagerank')
Parameters
  • graph – Input graph

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor

  • max_iter – Maximum number of iterations that will be performed

  • rank – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the PageRank value for each vertex

partition_conductance(graph, partition)

Partition conductance assesses the quality of many partitions in a graph.

Parameters
  • graph – Input graph

  • partition – Partition of the graph with the corresponding node collections

partition_modularity(graph, partition)

Modularity summarizes information about the quality of components in a graph.

Parameters
  • graph – Input graph

  • partition – Partition of the graph with the corresponding node collections

Returns

Scalar (double) holding the modularity value of the given partition

periphery(graph, periphery=None)

Periphery/center gives an overview of the extreme distances and the corresponding vertices in a graph.

Parameters
  • graph – Input graph

  • periphery – (Out argument) vertex set holding the vertices from the periphery or center of the graph

Returns

Vertex set holding the vertices from the periphery or center of the graph

personalized_pagerank(graph, v, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_pagerank')

Personalized PageRank for a vertex of interest.

Compares and spots important vertices in a graph.

Parameters
  • graph – Input graph

  • v – The chosen vertex from the graph for personalization

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor

  • max_iter – Maximum number of iterations that will be performed

  • norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores.

  • rank – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the computed scores

personalized_salsa(bipartite_graph, v, tol=0.001, damping=0.85, max_iter=100, rank='personalized_salsa')

Personalized SALSA for a vertex of interest.

Assesses the quality of information and references in linked structures.

Parameters
  • bipartite_graph – Bipartite graph

  • v – The chosen vertex from the graph for personalization

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor to modulate the degree of personalization of the scores by the algorithm

  • max_iter – Maximum number of iterations that will be performed

  • rank – (Out argument) vertex property holding the normalized authority/hub ranking score for each vertex

Returns

Vertex property holding the computed scores

personalized_weighted_pagerank(graph, v, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='personalized_weighted_pagerank')
Parameters
  • graph – Input graph

  • v – The chosen vertex from the graph for personalization

  • weight – Edge property holding the weight of each edge in the graph

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor

  • max_iter – Maximum number of iterations that will be performed

  • norm – Boolean flag to determine whether the algorithm will take into account dangling vertices for the ranking scores

  • rank – Vertex property holding the PageRank value for each vertex

pg2vec_builder(graphlet_id_property_name, vertex_property_names, min_word_frequency=1, batch_size=128, num_epochs=5, layer_size=200, learning_rate=0.04, min_learning_rate=0.0001, window_size=4, walk_length=8, walks_per_vertex=5, graphlet_size_property_name='graphletSize-Pg2vec', use_graphlet_size=True, validation_fraction=0.05, seed=None)

Build a pg2vec model and return it.

Parameters
  • graphlet_id_property_name – Property name of the graphlet-id in the input graph

  • vertex_property_names – Property names to consider for pg2vec model training

  • min_word_frequency – Minimum word frequency to consider before pruning

  • batch_size – Batch size for training the model

  • num_epochs – Number of epochs to train the model

  • layer_size – Number of dimensions for the output vectors

  • learning_rate – Initial learning rate

  • min_learning_rate – Minimum learning rate

  • window_size – Window size to consider while training the model

  • walk_length – Length of the walks

  • walks_per_vertex – Number of walks to consider per vertex

  • graphlet_size_property_name – Property name for graphlet size

  • use_graphlet_size – Whether to use the graphlet size

  • validation_fraction – Fraction of training data on which to compute final loss

  • seed – Seed

Returns

Built pg2vec model

prim(graph, weight, mst='mst')

Prim reveals tree structures with shortest paths in a graph.

Parameters
  • graph – Input graph

  • weight – Edge property holding the weight of each edge in the graph

  • mst – Edge property holding the edges belonging to the minimum spanning tree of the graph

Returns

Edge property holding the edges belonging to the minimum spanning tree of the graph (i.e. all the edges with in_mst=true)
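For readers unfamiliar with the algorithm, a compact sketch of Prim's method with a binary heap is shown below. It assumes a connected, undirected graph given as nested dictionaries of edge weights and returns the tree edges as a list, whereas the API marks them in an edge property; the helper name prim_mst is hypothetical.

```python
import heapq

def prim_mst(adj, start):
    """Return the edges of a minimum spanning tree of a connected weighted graph."""
    visited = {start}
    heap = [(weight, start, w) for w, weight in adj[start].items()]
    heapq.heapify(heap)
    mst = []
    while heap and len(visited) < len(adj):
        weight, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # edge would close a cycle
        visited.add(v)
        mst.append((u, v, weight))
        for w, next_weight in adj[v].items():
            if w not in visited:
                heapq.heappush(heap, (next_weight, v, w))
    return mst

# Undirected weighted graph: vertex -> {neighbor: weight}.
adj = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 3},
    "d": {"b": 5, "c": 3},
}
mst = prim_mst(adj, "a")
```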

radius(graph, eccentricity='eccentricity')

Radius gives an overview of the distances in a graph. It is computed as the minimum graph eccentricity.

Parameters
  • graph – Input graph

  • eccentricity – (Out argument) vertex property holding the eccentricity value for each vertex

Returns

Pair holding the radius of the graph and a node property with eccentricity value for each node

random_walk_with_restart(graph, source, length, reset_prob, visit_count=None)

Perform a random walk over the graph. The walk will start at the given source vertex and will randomly visit neighboring vertices in the graph, with a probability equal to the value of reset_prob of going back to the starting point. The random walk will also go back to the starting point every time it reaches a vertex with no outgoing edges. The algorithm will stop once it reaches the specified walk length.

Parameters
  • graph – Input graph

  • source – Starting point of the random walk

  • length – Length (number of steps) of the random walk

  • reset_prob – Probability value for resetting the random walk

  • visit_count – (out argument) map holding the number of visits during the random walk for each vertex in the graph

Returns

map holding the number of visits during the random walk for each vertex in the graph
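The walk described above can be simulated in a few lines. The sketch below follows the stated rules (restart with probability reset_prob, restart at dead ends, stop after length steps) on a directed adjacency dictionary. The seed argument and the decision to count only the vertices entered by a step, not the initial position, are assumptions of this illustration.

```python
import random

def random_walk_with_restart(out_edges, source, length, reset_prob, seed=None):
    """Count visits per vertex during a walk of `length` steps starting at source."""
    rng = random.Random(seed)
    visit_count = {v: 0 for v in out_edges}
    current = source
    for _ in range(length):
        if not out_edges[current] or rng.random() < reset_prob:
            current = source  # restart: dead end reached or reset drawn
        else:
            current = rng.choice(out_edges[current])
        visit_count[current] += 1
    return visit_count

# Vertex "c" has no outgoing edges, so the walk restarts from there.
out_edges = {"a": ["b", "c"], "b": ["a"], "c": []}
visits = random_walk_with_restart(out_edges, "a", length=1000, reset_prob=0.2, seed=7)
```

Vertices close to the source accumulate the most visits, which is what makes the visit counts usable as a proximity measure.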

reachability(graph, src, dst, max_hops, ignore_edge_direction)

Reachability is a fast way to check if two vertices are reachable from each other.

Parameters
  • graph – Input graph

  • src – Source vertex for the search

  • dst – Destination vertex for the search

  • max_hops – Maximum hop distance between the source and destination vertices

  • ignore_edge_direction – Boolean flag for ignoring the direction of the edges during the search

Returns

The number of hops between the vertices. It will return -1 if the vertices are not connected or are not reachable given the condition of the maximum hop distance allowed.
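The return convention (hop count, or -1 when the destination is not reachable within max_hops) can be illustrated with a small breadth-first search. The sketch below is plain Python over an adjacency dict, which is an assumed input format; it does not use PGX.

```python
from collections import deque


def reachability_sketch(adj, src, dst, max_hops, ignore_edge_direction=False):
    """Return the hop distance from src to dst, or -1 if it exceeds max_hops."""
    neighbors = {v: set(out) for v, out in adj.items()}
    if ignore_edge_direction:
        # Add the reverse of every edge so the search can traverse both ways.
        for v, out in adj.items():
            for w in out:
                neighbors.setdefault(w, set()).add(v)
    hops = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return hops[v]
        if hops[v] == max_hops:
            continue  # do not expand beyond the hop limit
        for w in neighbors.get(v, ()):
            if w not in hops:
                hops[w] = hops[v] + 1
                queue.append(w)
    return -1


adj = {1: [2], 2: [3], 3: []}
```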

salsa(bipartite_graph, tol=0.001, max_iter=100, rank='salsa')

Stochastic Approach for Link-Structure Analysis (SALSA) computes ranking scores.

It assesses the quality of information and references in linked structures.

Parameters
  • bipartite_graph – Bipartite graph

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • max_iter – Maximum number of iterations that will be performed

  • rank – Vertex property holding the value for each vertex in the graph

Returns

Vertex property holding the computed scores

scc_kosaraju(graph, label='scc_kosaraju')

Kosaraju finds strongly connected components in a graph.

Parameters
  • graph – Input graph

  • label – Vertex property holding the component label for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm

scc_tarjan(graph, label='scc_tarjan')

Tarjan finds strongly connected components in a graph.

Parameters
  • graph – Input graph

  • label – Vertex property holding the component label for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm

shortest_path_bellman_ford(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')

Bellman-Ford finds multiple shortest paths at the same time.

Parameters
  • graph – Input graph

  • src – Source node

  • distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph

  • weight – Edge property holding the weight of each edge in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node
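To make the roles of the distance and parent out arguments concrete, here is a minimal pure-Python Bellman-Ford sketch. It is an illustration only: the edge-list format and the helper path_to are hypothetical, and negative cycles are not detected.

```python
import math


def bellman_ford_sketch(vertices, edges, src):
    """edges is a list of (u, v, weight). Returns (distance, parent) dicts."""
    distance = {v: math.inf for v in vertices}
    parent = {v: None for v in vertices}
    distance[src] = 0
    # At most |V| - 1 rounds of relaxing every edge.
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if distance[u] + w < distance[v]:
                distance[v] = distance[u] + w
                parent[v] = u
                changed = True
        if not changed:
            break
    return distance, parent


def path_to(parent, src, dst):
    """Rebuild the path src..dst by following parent links; None if unreachable."""
    path = [dst]
    while path[-1] != src:
        prev = parent[path[-1]]
        if prev is None:
            return None
        path.append(prev)
    return path[::-1]


vertices = ["a", "b", "c", "d"]
edges = [("a", "b", 4), ("a", "c", 1), ("c", "b", 2), ("b", "d", 1)]
distance, parent = bellman_ford_sketch(vertices, edges, "a")
```

A single run yields the shortest path from the source to every reachable vertex, which is what the AllPaths result represents.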

shortest_path_bellman_ford_reversed(graph, src, weight, distance='bellman_ford_distance', parent='bellman_ford_parent', parent_edge='bellman_ford_parent_edge')

Reversed Bellman-Ford finds multiple shortest paths at the same time.

Parameters
  • graph – Input graph

  • src – Source node

  • distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph

  • weight – Edge property holding the weight of each edge in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path.

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path.

Returns

AllPaths holding the information of the possible shortest paths from the source node.

shortest_path_bidirectional_dijkstra(graph, src, dst, weight, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge')

Bidirectional Dijkstra is a fast algorithm for finding a shortest path in a graph.

Parameters
  • graph – Input graph

  • src – Source node

  • dst – Destination node

  • weight – Edge property holding the (positive) weight of each edge in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_dijkstra(graph, src, dst, weight, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge')
Parameters
  • graph – Input graph

  • src – Source node

  • dst – Destination node

  • weight – Edge property holding the (positive) weight of each edge in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_filtered_bidirectional_dijkstra(graph, src, dst, weight, filter_expression, parent='bidirectional_dijkstra_parent', parent_edge='bidirectional_dijkstra_parent_edge')
Parameters
  • graph – Input graph

  • src – Source node

  • dst – Destination node

  • weight – Edge property holding the (positive) weight of each edge in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

  • filter_expression – GraphFilter object for filtering

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_filtered_dijkstra(graph, src, dst, weight, filter_expression, parent='dijkstra_parent', parent_edge='dijkstra_parent_edge')
Parameters
  • graph – Input graph

  • src – Source node

  • dst – Destination node

  • weight – Edge property holding the (positive) weight of each edge in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

  • filter_expression – GraphFilter object for filtering

Returns

PgxPath holding the information of the shortest path, if it exists

shortest_path_hop_distance(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Hop distance can give a relatively fast insight on the distances in a graph.

Parameters
  • graph – Input graph

  • src – Source node

  • distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

shortest_path_hop_distance_reversed(graph, src, distance='hop_dist_distance', parent='hop_dist_parent', parent_edge='hop_dist_edge')

Backwards hop distance can give a relatively fast insight on the distances in a graph.

Parameters
  • graph – Input graph

  • src – Source node

  • distance – (Out argument) vertex property holding the distance to the source vertex for each vertex in the graph

  • parent – (Out argument) vertex property holding the parent vertex of each vertex in the shortest path

  • parent_edge – (Out argument) vertex property holding the edge ID linking the current vertex in the path with the previous vertex in the path

Returns

AllPaths holding the information of the possible shortest paths from the source node

supervised_graphwise_builder(vertex_target_property_name, vertex_input_property_names=[], edge_input_property_names=[], loss_fn='SOFTMAX_CROSS_ENTROPY', pred_layer_config=None, conv_layer_config=None, batch_size=128, num_epochs=3, learning_rate=0.01, layer_size=128, class_weights=None, seed=None)

Build a SupervisedGraphWise model and return it.

Parameters
  • vertex_target_property_name – Target property name

  • vertex_input_property_names – Vertex input feature names

  • edge_input_property_names – Edge input feature names

  • loss_fn – Loss function. Supported: SOFTMAX_CROSS_ENTROPY, SIGMOID_CROSS_ENTROPY

  • pred_layer_config – Prediction layer configuration as list of PredLayerConfig, or default if None

  • conv_layer_config – Conv layer configuration as list of ConvLayerConfig, or default if None

  • batch_size – Batch size for training the model

  • num_epochs – Number of epochs to train the model

  • learning_rate – Learning rate

  • layer_size – Number of dimensions for the output vectors

  • class_weights – Class weights to be used in the loss function. The loss for the corresponding class will be multiplied by the factor given in this map. If None, uniform class weights will be used.

  • seed – Seed

Returns

Built SupervisedGraphWise model

topological_schedule(graph, vs, topo_sched='topo_sched')

Topological schedule gives an order of visit for the reachable vertices from the source.

Parameters
  • graph – Input graph

  • vs – Set of vertices to be used as the starting points for the scheduling order

  • topo_sched – (Out argument) vertex property holding the scheduled order of each vertex

Returns

Vertex property holding the scheduled order of each vertex.

topological_sort(graph, topo_sort='topo_sort')

Topological sort gives an order of visit for vertices in directed acyclic graphs.

Parameters
  • graph – Input graph

  • topo_sort – (Out argument) vertex property holding the topological order of each vertex
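
As an illustration of what the topological order means, the sketch below assigns each vertex of a small directed acyclic graph a position such that every edge points from a lower to a higher position. It uses Kahn's algorithm in plain Python; the adjacency-dict input and the ValueError on cycles are assumptions of this example, not PGX behavior.

```python
from collections import deque


def topological_sort_sketch(adj):
    """Return {vertex: position}. Every vertex must appear as a key of adj."""
    in_degree = {v: 0 for v in adj}
    for out in adj.values():
        for w in out:
            in_degree[w] += 1
    # Start with the vertices that have no incoming edges.
    queue = deque(v for v in adj if in_degree[v] == 0)
    order = {}
    while queue:
        v = queue.popleft()
        order[v] = len(order)
        for w in adj[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                queue.append(w)
    if len(order) != len(adj):
        raise ValueError("graph contains a cycle")
    return order


adj = {"shirt": ["tie"], "tie": ["jacket"], "trousers": ["jacket"], "jacket": []}
order = topological_sort_sketch(adj)
```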

unsupervised_graphwise_builder(vertex_input_property_names=[], edge_input_property_names=[], loss_fn='SIGMOID_CROSS_ENTROPY', conv_layer_config=None, batch_size=128, num_epochs=3, learning_rate=0.001, layer_size=128, class_weights=None, seed=None, dgi_layer_config=None)

Build an UnsupervisedGraphWise model and return it.

Parameters
  • vertex_input_property_names – Vertex input feature names

  • edge_input_property_names – Edge input feature names

  • loss_fn – Loss function. Supported: SIGMOID_CROSS_ENTROPY

  • conv_layer_config – Conv layer configuration as list of ConvLayerConfig, or default if None

  • batch_size – Batch size for training the model

  • num_epochs – Number of epochs to train the model

  • learning_rate – Learning rate

  • layer_size – Number of dimensions for the output vectors

  • seed – Seed

  • dgi_layer_config – DGI layer configuration as DgiLayerConfig object, or default if None

Returns

Built UnsupervisedGraphWise model

vertex_betweenness_centrality(graph, bc='betweenness')
Parameters
  • graph – Input graph

  • bc – Vertex property holding the betweenness centrality value for each vertex

Returns

Vertex property holding the computed scores

wcc(graph, label='wcc')

Identify weakly connected components.

This can be useful for clustering graph data.

Parameters
  • graph – Input graph

  • label – Vertex property holding the value for each vertex in the graph. Can be a string or a VertexProperty object.

Returns

Partition holding the node collections corresponding to the components found by the algorithm.
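
The idea of weakly connected components (edge direction is ignored when grouping vertices) can be sketched with a union-find structure. This is a plain-Python illustration under assumed inputs (a vertex list and a directed edge list) and is not how PGX computes or represents the partition.

```python
def wcc_sketch(vertices, edges):
    """Group vertices into components, treating every edge as undirected."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)

    components = {}
    for v in vertices:
        components.setdefault(find(v), set()).add(v)
    return list(components.values())


components = wcc_sketch([1, 2, 3, 4, 5], [(1, 2), (3, 2), (4, 5)])
```

Vertices 1 and 3 end up in the same component even though no directed path connects them, because both have an edge to vertex 2.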

weighted_closeness_centrality(graph, weight, cc='weighted_closeness')

Measure the centrality of the vertices based on weighted distances, which makes it possible to find well-connected vertices.

Parameters
  • graph – Input graph

  • weight – Edge property holding the weight of each edge in the graph

  • cc – (Out argument) vertex property holding the closeness centrality value for each vertex

Returns

Vertex property holding the computed scores

weighted_pagerank(graph, weight, tol=0.001, damping=0.85, max_iter=100, norm=False, rank='weighted_pagerank')
Parameters
  • graph – Input graph

  • weight – Edge property holding the weight of each edge in the graph

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor

  • max_iter – Maximum number of iterations that will be performed

  • rank – Vertex property holding the PageRank value for each vertex

Returns

Vertex property holding the computed PageRank values

whom_to_follow(graph, v, top_k=100, size_circle_of_trust=500, tol=0.001, damping=0.85, max_iter=100, salsa_tol=0.001, salsa_max_iter=100, hubs=None, auth=None)

Whom-to-follow (WTF) is a recommendation algorithm.

It returns two vertex sequences: one of similar users (hubs) and a second one with users to follow (auth).

Parameters
  • graph – Input graph

  • v – The chosen vertex from the graph for personalization of the recommendations

  • top_k – The maximum number of recommendations that will be returned

  • size_circle_of_trust – The maximum size of the circle of trust

  • tol – Maximum tolerated error value. The algorithm will stop once the sum of the error values of all vertices becomes smaller than this value.

  • damping – Damping factor for the PageRank stage

  • max_iter – Maximum number of iterations that will be performed for the PageRank stage

  • salsa_tol – Maximum tolerated error value for the SALSA stage

  • salsa_max_iter – Maximum number of iterations that will be performed for the SALSA stage

  • hubs – (Out argument) vertex sequence holding the top rated hub vertices (similar users) for the recommendations

  • auth – (Out argument) vertex sequence holding the top rated authority vertices (users to follow) for the recommendations

Returns

Vertex sequences holding hubs and auth

class pypgx.api.BipartiteGraph(session, java_graph)

Bases: pypgx.api._pgx_graph.PgxGraph

A bipartite PgxGraph.

get_is_left_property()

Get the ‘is Left’ vertex property of the graph.

class pypgx.api.CompiledProgram(session, java_program)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

A compiled Green-Marl program.

Parameters
  • session – PGX session

  • java_program – Java compiledProgram

destroy()

Free resources on the server taken up by this Program.

run(*argv)

Run the compiled program with the given parameters.

If the Green-Marl procedure of this compiled program looks like this: procedure pagerank(G: Graph, e double, max int, nodePorp){…}

Parameters

argv – All the arguments required by specified procedure

Returns

Result of the analysis (an AnalysisResult) as a dict

class pypgx.api.EdgeCollection(graph, java_collection)

Bases: pypgx.api._pgx_collection.PgxCollection

A collection of edges.

add(e)

Add one or multiple edges to the collection.

Parameters

e – Edge or edge ID. Can also be an iterable of edges or edge IDs.

add_all(edges)

Add multiple edges to the collection.

Parameters

edges – Iterable of edges or edge IDs

contains(e)

Check if the collection contains edge e.

Parameters

e – PgxEdge object or ID

Returns

Boolean

remove(e)

Remove one or multiple edges from the collection.

Parameters

e – Edge or edge ID. Can also be an iterable of edges or edge IDs.

remove_all(edges)

Remove multiple edges from the collection.

Parameters

edges – Iterable of edges or edge IDs

class pypgx.api.EdgeSequence(graph, java_collection)

Bases: pypgx.api._pgx_collection.EdgeCollection

An ordered sequence of edges which may contain duplicates.

class pypgx.api.EdgeSet(graph, java_collection)

Bases: pypgx.api._pgx_collection.EdgeCollection

An unordered set of edges (no duplicates).

class pypgx.api.FlashbackSynchronizer(java_flashback_synchronizer)

Bases: object

Synchronizes a PGX graph with an Oracle Database using Flashback queries.

class pypgx.api.GraphAlterationBuilder(java_graph_alteration_builder)

Bases: object

Builder to describe the alterations (graph schema modification) to perform to a graph.

It is for example possible to add or remove vertex and edge providers.

class pypgx.api.MatrixFactorizationModel(graph, java_mfm, features)

Bases: object

Object that holds the state for repeatedly returning estimated ratings.

get_estimated_ratings(v)

Return estimated ratings for a specific vertex.

Parameters

v – The vertex to get estimated ratings for.

Returns

The VertexProperty containing the estimated ratings.

class pypgx.api.MergingStrategyBuilder(java_mutation_strategy_builder)

Bases: pypgx.api._mutation_strategy_builder.MutationStrategyBuilder

A class for defining a merging strategy on a graph.

class pypgx.api.MutationStrategyBuilder(java_mutation_strategy_builder)

Bases: object

A class for defining a mutation strategy on a graph.

class pypgx.api.Namespace(java_namespace)

Bases: object

Represents a namespace for objects (e.g. graphs, properties) in PGX

class pypgx.api.PgqlResultSet(graph, java_pgql_result_set)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

Result set of a pattern matching query.

Note: retrieving results from the server is not thread-safe.

absolute(row)

Move the cursor to the given row number in this ResultSet object.

If the row number is positive, the cursor moves to the given row number with respect to the beginning of the result set. The first row is 1, so absolute(1) moves the cursor to the first row.

If the row number is negative, the cursor moves to the given row number with respect to the end of the result set. So absolute(-1) moves the cursor to the last row.

Parameters

row – Row to move to

Returns

True if the cursor is moved to a position in the ResultSet object; False if the cursor is moved before the first or after the last row
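
The 1-based and negative row numbering can be modelled with a small stand-in class. The class below is a hypothetical toy, not part of pypgx; in particular, clamping out-of-range targets to "before first" or "after last" is an assumption of the sketch.

```python
class CursorSketch:
    """Toy cursor over num_rows rows; position 0 is before the first row,
    position num_rows + 1 is after the last row."""

    def __init__(self, num_rows):
        self.num_rows = num_rows
        self.position = 0

    def absolute(self, row):
        if row < 0:
            # -1 is the last row, -2 the one before it, and so on.
            row = self.num_rows + 1 + row
        # Assumed behavior: clamp targets that fall outside the result set.
        self.position = max(0, min(row, self.num_rows + 1))
        return 1 <= self.position <= self.num_rows


cursor = CursorSketch(3)
on_first = cursor.absolute(1)
on_last = cursor.absolute(-1)
last_position = cursor.position
past_end = cursor.absolute(4)
```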

after_last()

Place the cursor after the last row

before_first()

Set the cursor before the first row

close()

Free resources on the server taken up by this result set.

destroy()

Free resources on the server taken up by this result set.

first()

Move the cursor to the first row in the result set

Returns

True if the cursor points to a valid row; False if the result set does not have any results

get(element)

Get the value of the designated element by element index or name

Parameters

element – Integer or string representing index or name

Returns

Content of cell

get_boolean(element)

Get the value of the designated element by element index or name as a Boolean

Parameters

element – Integer or String representing index or name

Returns

Boolean

get_date(element)

Get the value of the designated element by element index or name as a datetime Date

Parameters

element – Integer or String representing index or name

Returns

datetime.date

get_double(element)

Get the value of the designated element by element index or name as a Float

This method exists for precision, as Java floats and doubles have different precisions.

Parameters

element – Integer or String representing index or name

Returns

Float

get_edge(element)

Get the value of the designated element by element index or name as a PgxEdge

Parameters

element – Integer or String representing index or name

Returns

PgxEdge

get_float(element)

Get the value of the designated element by element index or name as a Float

Parameters

element – Integer or String representing index or name

Returns

Float

get_integer(element)

Get the value of the designated element by element index or name as an Integer

Parameters

element – Integer or String representing index or name

Returns

Integer

get_legacy_datetime(element)

Get the value of the designated element by element index or name as a Datetime. Works with most time and date type cells. If the date is not specified, the default is set to Jan 1 1970.

Parameters

element – Integer or String representing index or name

Returns

datetime.datetime

get_list(element)

Get the value of the designated element by element index or name as a List

Parameters

element – Integer or String representing index or name

Returns

List

get_long(element)

Get the value of the designated element by element index or name as a Long

Parameters

element – Integer or String representing index or name

Returns

Long

get_point2d(element)

Get the value of the designated element by element index or name as a 2D tuple

Parameters

element – Integer or String representing index or name

Returns

(X, Y)

get_row(row)

Get row from result_set. This method may change result_set cursor.

Parameters

row – Row index

get_slice(start, stop, step=1)

Get slice from result_set. This method may change result_set cursor.

Parameters
  • start – Start index

  • stop – Stop index

  • step – Step size

get_string(element)

Get the value of the designated element by element index or name as a String

Parameters

element – Integer or String representing index or name

Returns

String

get_time(element)

Get the value of the designated element by element index or name as a datetime Time

Parameters

element – Integer or String representing index or name

Returns

datetime.time

get_time_with_timezone(element)

Get the value of the designated element by element index or name as a datetime Time that includes timezone

Parameters

element – Integer or String representing index or name

Returns

datetime.time

get_timestamp(element)

Get the value of the designated element by element index or name as a Datetime

Parameters

element – Integer or String representing index or name

Returns

datetime.datetime

get_timestamp_with_timezone(element)

Get the value of the designated element by element index or name as a Datetime

Parameters

element – Integer or String representing index or name

Returns

datetime.datetime

get_vertex(element)

Get the value of the designated element by element index or name as a PgxVertex

Parameters

element – Integer or String representing index or name

Returns

PgxVertex

get_vertex_labels(element)

Get the value of the designated element by element index or name as a list of labels

Parameters

element – Integer or String representing index or name

Returns

list

last()

Move the cursor to the last row in the result set

Returns

True if the cursor points to a valid row; False if the result set does not have any results

next()

Move the cursor forward one row from its current position

Returns

True if the cursor points to a valid row; False if the new cursor is positioned after the last row

previous()

Move the cursor to the previous row from its current position

Returns

True if the cursor points to a valid row; False if the new cursor is positioned before the first row

print(file=None, num_results=1000, start=0)

Print the result set.

Parameters
  • file – File to which results are printed (default is sys.stdout)

  • num_results – Number of results to be printed

  • start – Index of the first result to be printed

relative(rows)

Move the cursor a relative number of rows with respect to the current position. A negative number will move the cursor backwards.

Note: Calling relative(1) is equal to next() and relative(-1) is equal to previous(). Calling relative(0) is possible when the cursor is positioned at a row, not when it is positioned before the first or after the last row. However, relative(0) will not update the position of the cursor.

Parameters

rows – Relative number of rows to move from current position

Returns

True if the cursor is moved to a position in the ResultSet object; False if the cursor is moved before the first or after the last row
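
The equivalence between relative(1) and next(), and between relative(-1) and previous(), can be shown with a hypothetical stand-in class. It is not the pypgx implementation: out-of-range moves are clamped by assumption, and the restriction on relative(0) outside a valid row is not modelled.

```python
class RelativeCursorSketch:
    """Toy cursor; position 0 is before the first row, num_rows + 1 after the last."""

    def __init__(self, num_rows):
        self.num_rows = num_rows
        self.position = 0

    def relative(self, rows):
        target = self.position + rows
        # Assumed behavior: stop just outside the result set.
        self.position = max(0, min(target, self.num_rows + 1))
        return 1 <= self.position <= self.num_rows

    def next(self):
        return self.relative(1)

    def previous(self):
        return self.relative(-1)


cursor = RelativeCursorSketch(2)
steps = [cursor.next(), cursor.next(), cursor.next()]
back_on_row = cursor.previous()
before_first = cursor.relative(-2)
```

The return value of next() is what makes the usual "loop while next() is true" iteration pattern terminate after the last row.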

to_frame()

Copy the content of this result set into a new PgxFrame

Returns

a new PgxFrame containing the content of the result set

to_pandas()

Convert to pandas DataFrame.

This method may change result_set cursor.

This method requires pandas.

Returns

PgqlResultSet as a pandas DataFrame

class pypgx.api.Pgx(java_pgx_class)

Bases: object

Main entry point for PGX applications.

create_session(source=None, base_url=None)

Create and return a session.

Parameters
  • source – The session source string. Default value is “pgx_python”.

  • base_url – The base URL in the format host [ : port][ /path] of the PGX server REST end-point. If base_url is None, the default will be used, which points to an embedded PGX instance.

get_instance(base_url=None, token=None)

Get a handle to a PGX instance.

Parameters
  • base_url – The base URL in the format host [ : port][ /path] of the PGX server REST end-point. If base_url is None, the default will be used, which points to an embedded PGX instance.

  • token – The access token

set_default_url(url)

Set the default base URL used by invocations of get_instance().

The new default URL affects subsequent calls of get_instance().

Parameters

url – New URL

class pypgx.api.PgxCollection(graph, java_collection)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

Superclass for Pgx collections.

add_all_elements(source)

Add elements to an existing collection.

Parameters

source – Elements to add

clone(name=None)

Clone and rename existing collection.

Parameters

name – New name of the collection. If None, the old name is not changed.

close()

Request destruction of this object. After this method returns, the behavior of any method of this class becomes undefined.

get_id()

Return the string representation of an internal identifier for this collection. Only meant for internal usage.

Returns

a string representation of the internal identifier of this collection

get_pgx_id()

Return an internal identifier for this collection. Only meant for internal usage.

Returns

the internal identifier of this collection

remove_all_elements(source)

Remove elements from an existing collection.

Parameters

source – Elements to remove

property size

Get the number of elements in this collection.

to_mutable(name=None)

Create a mutable copy of an existing collection.

Parameters

name – New name of the collection. If None, the old name is not changed.

class pypgx.api.PgxEntity(graph, java_entity)

Bases: object

An abstraction of vertex and edge.

get_property(property_name)

Get a property by name.

Parameters

property_name – Property name

set_property(property_name, value)

Set an entity property.

Parameters
  • property_name – Property name

  • value – New value

class pypgx.api.PgxFrame(java_pgx_frame)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

clone()

Create a new PgxFrame with the same content as the current frame

Returns

PgxFrame

close()

Free resources on the server taken up by this frame.

count()

Count number of elements in the frame.

destroy()

Free resources on the server taken up by this frame.

flatten(*columns, inplace=False)

Create a new PgxFrame with all the specified columns and vector columns flattened into multiple columns.

Parameters
  • columns – Column names

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

flatten_all(inplace=False)

Create a new PgxFrame with all nested columns and vector columns flattened into multiple columns.

Parameters

inplace – Apply the changes inplace and return self

Returns

PgxFrame

get_column(name)

Return a PgxFrameColumn.

Parameters

name – Column name

Returns

PgxFrameColumn

get_column_descriptors()

Return a list containing the description of the different columns of the frame.

head(num_rows=10, inplace=False)

Return the first num_rows elements of the frame

Parameters
  • num_rows – Number of rows to take

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

join(right, join_key_column=None, left_join_key_column=None, right_join_key_column=None, left_prefix=None, right_prefix=None, inplace=False)
Create a new PgxFrame by adding the columns of the right frame to this frame, aligned on equality of entries in column left_join_key_column for this frame and column right_join_key_column for the right frame, or join_key_column on both frames. The resulting frame will contain the columns of this frame prefixed by left_prefix and the columns of the right frame prefixed by right_prefix (if the prefixes are not None). Prefixes must either both be set or both be left unset.

Parameters
  • right – PgxFrame whose columns will be added to the columns of this PgxFrame

  • join_key_column – Column of both frames on which the equality test will be performed

  • left_join_key_column – Column of this frame on which the equality test will be performed with right_join_key_column

  • right_join_key_column – Column of the right frame on which the equality test will be performed with left_join_key_column

  • left_prefix – Prefix of the column names of this frame in the resulting frame

  • right_prefix – Prefix of the column names of the right frame in the resulting frame

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame
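
The key matching and prefixing rules can be sketched on plain Python data. This is a loose model only: rows are dicts, the function join_sketch is hypothetical, and inner-join semantics (unmatched rows are dropped) are an assumption that the description above does not confirm.

```python
def join_sketch(left, right, left_key, right_key, left_prefix=None, right_prefix=None):
    """Equi-join two lists of dict rows, optionally prefixing the column names."""
    if (left_prefix is None) != (right_prefix is None):
        raise ValueError("prefixes must either both be set or both be unset")
    lp, rp = left_prefix or "", right_prefix or ""
    # Index the right rows by their join key for fast lookup.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[left_key], []):
            out = {lp + k: v for k, v in lrow.items()}
            out.update({rp + k: v for k, v in rrow.items()})
            joined.append(out)
    return joined


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
right = [{"id": 1, "score": 0.5}]
joined = join_sketch(left, right, "id", "id", left_prefix="l_", right_prefix="r_")
```

Prefixes keep same-named columns from the two frames apart; without them, the right-hand value would overwrite the left-hand one in this model.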

property length

Return the number of rows in the frame.

Returns

number of rows

print(file=None, num_results=1000, start=0)

Print the frame.

Parameters
  • file – File to which results are printed (default is sys.stdout)

  • num_results – Number of results to be printed

  • start – Index of the first result to be printed

rename_column(old_column_name, new_column_name, inplace=False)

Return a PgxFrame with the column name modified.

Parameters
  • old_column_name – name of the column to rename

  • new_column_name – name of the column after the operation

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

rename_columns(column_renaming, inplace=False)

Return a PgxFrame with the column names modified.

Parameters
  • column_renaming – dict-like holding old column names as keys and new column names as values

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

select(*columns, inplace=False)

Select multiple columns by column name.

Parameters
  • columns – Column names

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

store(path, file_format='csv', overwrite=True)

Store the frame in a file.

Parameters
  • path – Path where to store the frame

  • file_format – Storage format

  • overwrite – Overwrite current file

tail(num_rows=10, inplace=False)

Return the last num_rows elements of the frame

Parameters
  • num_rows – Number of rows to take

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

to_pandas()

Convert to pandas DataFrame.

This method may change result_set cursor.

This method requires pandas.

Returns

PgxFrame as a pandas DataFrame

to_pgql_result_set()

Create a new PgqlResultSet having the same content as this frame.

Returns

PgqlResultSet

union(*frames, inplace=False)

Create a PgxFrame by concatenating the rows of this frame with the rows of the frames in frames. The different frames should have the same columns (same names, types and dimensions), in the same order. The resulting frame is not guaranteed to have any specific ordering of its rows.

Parameters
  • frames – Frames to add through union

  • inplace – Apply the changes inplace and return self

Returns

PgxFrame

write()

Get the PgxFrame storer

Returns

PgxGenericFrameStorer

class pypgx.api.PgxGraph(session, java_graph)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

A reference to a graph on the server side.

Operations on instances of this class are executed on the server side onto the referenced graph. Note that a session can have multiple objects referencing the same graph: the result of any operation mutating the graph on any of those references will be visible on all of them.

add_redaction_rule(redaction_rule_config, authorization_type, *names)

Add a redaction rule for authorization_type names.

Possible authorization types are: [‘user’, ‘role’]

Parameters
  • authorization_type – the authorization type of the rule to be added

  • names – the names of the users or roles for which the rule should be added

alter_graph()

Create a graph alteration builder to define the graph schema alterations to perform on the graph.

Returns

an empty graph alteration builder

bipartite_sub_graph_from_in_degree(vertex_properties=True, edge_properties=True, name=None, is_left_name=None, in_place=False)

Create a bipartite version of this graph with all vertices of in-degree = 0 being the left set.

Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • name – New graph name

  • is_left_name – Name of the boolean isLeft vertex property of the new graph. If None, a name will be generated.

  • in_place – Whether to create a new copy (False) or overwrite this graph (True)
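
How the left set is chosen can be sketched in plain Python. This is an illustration of the "in-degree = 0" rule only, not PyPGX code; left_set_from_in_degree and the edge-list representation are hypothetical.

```python
def left_set_from_in_degree(vertices, edges):
    """Return {vertex: is_left}, where is_left means the vertex has in-degree 0."""
    has_incoming = {dst for _src, dst in edges}
    return {v: v not in has_incoming for v in vertices}


vertices = ["a", "b", "c", "d"]
edges = [("a", "c"), ("b", "c"), ("c", "d")]  # (source, destination) pairs

is_left = left_set_from_in_degree(vertices, edges)
print(is_left)  # {'a': True, 'b': True, 'c': False, 'd': False}
```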

bipartite_sub_graph_from_left_set(vset, vertex_properties=True, edge_properties=True, name=None, is_left_name=None)

Create a bipartite version of this graph with the given vertex set being the left set.

Parameters
  • vset – Vertex set representing the left side

  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • name – name of the new graph. If None, a name will be generated.

  • is_left_name – Name of the boolean isLeft vertex property of the new graph. If None, a name will be generated.

clone(vertex_properties=True, edge_properties=True, name=None)

Return a copy of this graph.

Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be cloned as well

  • edge_properties – List of edge properties belonging to graph specified to be cloned as well

  • name – Name of the new graph

clone_and_execute_pgql(pgql_query)

Create a deep copy of the graph and execute the PGQL query on it.

Parameters

pgql_query – Query string in PGQL

Returns

A cloned PgxGraph with the pgql query executed

throws InterruptedException if the caller thread gets interrupted while waiting for completion.

throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

close()

Destroy without waiting for completion.

combine_edge_properties_into_vector_property(properties, name=None)

Take a list of scalar edge properties of same type and create a new edge vector property by combining them.

The dimension of the vector property will be equal to the number of properties.

Parameters
  • properties – List of scalar edge properties

  • name – Name for the vector property. If not null, vector property will be named. If that results in a name conflict, the returned future will complete exceptionally.

combine_vertex_properties_into_vector_property(properties, name=None)

Take a list of scalar vertex properties of same type and create a new vertex vector property by combining them.

The dimension of the vector property will be equal to the number of properties.

Parameters
  • properties – List of scalar vertex properties

  • name – Name for the vector property. If not null, vector property will be named. If that results in a name conflict, the returned future will complete exceptionally.
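
The relationship between the scalar inputs and the resulting vector property can be sketched in plain Python. This is not PyPGX code: properties are modelled here as hypothetical dicts keyed by vertex ID, and combine_into_vector_property is an illustrative name.

```python
def combine_into_vector_property(properties):
    """Combine scalar properties (dicts keyed by vertex) into one vector property.

    The vector for each vertex has one component per input property,
    in the order the properties were given.
    """
    if not properties:
        raise ValueError("at least one scalar property is required")
    vertices = properties[0].keys()
    return {v: [prop[v] for prop in properties] for v in vertices}


x = {1: 0.5, 2: 1.5}
y = {1: 2.0, 2: 3.0}

vec = combine_into_vector_property([x, y])
print(vec)          # {1: [0.5, 2.0], 2: [1.5, 3.0]}
print(len(vec[1]))  # 2
```

The second print shows the dimension rule: two input properties give vectors of dimension 2.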

property config

Get the GraphConfig object.

create_all_paths(src, cost, dist, parent, parent_edge)

Create an AllPaths object representing all the shortest paths from a single source to all the possible destinations (shortest regarding the given edge costs).

Parameters
  • src – Source vertex of the path

  • cost – Property holding the edge costs. If None, the resulting cost will equal the hop distance

  • dist – Property holding the distance to the source vertex for each vertex in the graph

  • parent – Property holding the parent vertices of all the shortest paths. For example, if the shortest path is A -> B -> C, then parent[C] -> B and parent[B] -> A

  • parent_edge – Property holding the parent edges for each vertex of the shortest path

Returns

The AllPaths object
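
The parent encoding used by the parent parameter can be made concrete with a small plain-Python sketch. It is not PyPGX code; the dict-based parent map and path_from_parents are hypothetical, and the sketch assumes the parent pointers contain no cycles.

```python
def path_from_parents(parent, src, dst):
    """Walk parent pointers from dst back to src and return the path src..dst."""
    path = [dst]
    while path[-1] != src:
        if path[-1] not in parent:
            raise ValueError(f"{dst!r} is not reachable from {src!r}")
        path.append(parent[path[-1]])
    path.reverse()
    return path


# Shortest path A -> B -> C, encoded as parent[C] -> B and parent[B] -> A.
parent = {"C": "B", "B": "A"}
print(path_from_parents(parent, "A", "C"))  # ['A', 'B', 'C']
```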

create_components(components, num_components)

Create a Partition object holding a collection of vertex sets, one for each component.

Parameters
  • components – Vertex property mapping each vertex to its component ID. Note that only component IDs in the range of [0..numComponents-1] are allowed. The returned future will complete exceptionally with an IllegalArgumentException if an invalid component ID is encountered. Gaps are supported: IDs that are not associated with any vertices will yield empty components.

  • num_components – How many different components the components property contains

Returns

The Partition object
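
The ID range rule and the handling of gaps can be sketched in plain Python. This is an illustration, not PyPGX code: the components property is modelled as a hypothetical dict, and ValueError stands in for the IllegalArgumentException mentioned above.

```python
def partition_from_components(components, num_components):
    """Group vertices into one set per component ID in [0, num_components - 1]."""
    partition = [set() for _ in range(num_components)]
    for vertex, component_id in components.items():
        if not 0 <= component_id < num_components:
            raise ValueError(
                f"invalid component ID {component_id} for vertex {vertex!r}"
            )
        partition[component_id].add(vertex)
    return partition


components = {"a": 0, "b": 0, "c": 2}
partition = partition_from_components(components, 3)

# ID 1 is a gap, so the middle component is empty.
print([len(vertex_set) for vertex_set in partition])  # [2, 0, 1]
```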

create_edge_property(data_type, name=None)

Create a session-bound edge property.

Parameters
  • data_type – Type of the property to create

  • name – Name of the property to be created

create_edge_sequence(name=None)

Create a new edge sequence.

Parameters

name – Sequence name

create_edge_set(name=None)

Create a new edge set.

Parameters

name – Edge set name

create_edge_vector_property(data_type, dim, name=None)

Create a session-bound edge vector property.

Parameters
  • data_type – Type of the vector property to create

  • dim – Dimension of vector property to be created

  • name – Name of vector property to be created

create_map(key_type, val_type, name=None)

Create a session-bound map.

Possible types are: [‘integer’, ‘long’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’]

Parameters
  • key_type – Property type of the keys that are going to be stored inside the map

  • val_type – Property type of the values that are going to be stored inside the map

  • name – Map name

create_merging_strategy_builder()

Create a new MergingStrategyBuilder that can be used to build a new MutationStrategy to simplify this graph.

create_path(src, dst, cost, parent, parent_edge)
Parameters
  • src – Source vertex of the path

  • dst – Destination vertex of the path

  • cost – Property holding the edge costs. If null, the resulting cost will equal the hop distance.

  • parent – Property holding the parent vertices for each vertex of the shortest path. For example, if the shortest path is A -> B -> C, then parent[C] -> B and parent[B] -> A.

  • parent_edge – Property holding the parent edges for each vertex of the shortest path

Returns

The PgxPath object

create_picking_strategy_builder()

Create a new PickingStrategyBuilder that can be used to build a new PickingStrategy to simplify this graph.

create_scalar(data_type, name=None)

Create a new Scalar.

Parameters
  • data_type – Scalar type

  • name – Name of the scalar to be created

create_synchronizer(synchronizer_class, connection=None, invalid_change_policy=None)

Create a synchronizer object which can be used to keep this graph in sync with changes happening in its original data source. Only partitioned graphs with all providers loaded from Oracle Database are supported.

Possible invalid_change_policy types are: [‘ignore’, ‘ignore_and_log’, ‘ignore_and_log_once’, ‘error’]

Parameters
  • synchronizer_class – the implementation class of the Synchronizer

  • connection – the connection object to the RDBMS

  • invalid_change_policy – sets the OnInvalidChange parameter to the Synchronizer ChangeSet

Returns

a synchronizer

create_vector_scalar(data_type, name=None)

Create a new vector scalar.

Parameters
  • data_type – Property type

  • name – Name of the scalar to be created

create_vertex_property(data_type, name=None)

Create a new vertex property.

Parameters
  • data_type – Property type

  • name – Name of the property to be created

create_vertex_sequence(name=None)

Create a new vertex sequence.

Parameters

name – Sequence name

create_vertex_set(name=None)

Create a new vertex set.

Parameters

name – Set name

create_vertex_vector_property(data_type, dim, name=None)

Create a session-bound vertex vector property.

Parameters
  • data_type – Type of the vector property to create

  • dim – Dimension of vector property to be created

  • name – Name of vector property to be created

destroy_edge_property_if_exists(name)

Destroy a specific edge property if it exists.

Parameters

name – Property name

destroy_vertex_property_if_exists(name)

Destroy a specific vertex property if it exists.

Parameters

name – Property name

execute_pgql(pgql_query)

(BETA) Blocking version of cloneAndExecutePgqlAsync(String).

Calls cloneAndExecutePgqlAsync(String) and waits for the returned PgxFuture to complete.

throws InterruptedException if the caller thread gets interrupted while waiting for completion.

throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

Parameters

pgql_query – Query string in PGQL

Returns

The query result set as PgqlResultSet object

explain_pgql(pgql_query)

Explain the execution plan of a pattern matching query.

Note: Different PGX versions may return different execution plans.

Parameters

pgql_query – Query string in PGQL

Returns

The query plan

filter(graph_filter, vertex_properties=True, edge_properties=True, name=None)

Create a subgraph of this graph.

To create the subgraph, a given filter expression is used to determine which parts of the graph will be part of the subgraph.

Parameters
  • graph_filter – Object representing a filter expression that is applied to create the subgraph

  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • name – Filtered graph name

get_collections()

Retrieve all currently allocated collections associated with the graph.

get_edge(eid)

Get an edge with a specified id.

Parameters

eid – edge id

get_edge_label()

Get the edge labels belonging to this graph.

get_edge_properties()

Get the set of edge properties belonging to this graph.

This list might contain transient, private and published properties.

get_edge_property(name)

Get an edge property by name.

Parameters

name – Property name

get_edges(filter_expr=None, name=None)

Create a new edge set containing edges according to the given filter expression.

Parameters
  • filter_expr – EdgeFilter object with the filter expression. If None, all the edges are returned.

  • name – the name of the collection to be created. If None, a name will be generated.

get_id()

Get the Graph id.

Returns

A string representation of the id of this graph.

get_meta_data()

Get the GraphMetaData object.

Returns

A ‘GraphMetaData’ object of this graph.

get_or_create_edge_property(name, data_type=None, dim=0)

Get or create an edge property.

Parameters
  • name – Property name

  • data_type – Property type

  • dim – Dimension of vector property to be created

get_or_create_edge_vector_property(data_type, dim, name=None)

Get or create a session-bound edge property.

Parameters
  • data_type – Type of the vector property to create

  • dim – Dimension of vector property to be created

  • name – Name of vector property to be created

get_or_create_vertex_property(name, data_type=None, dim=0)

Get or create a vertex property.

Parameters
  • name – Property name

  • data_type – Property type

  • dim – Dimension of vector property to be created

get_or_create_vertex_vector_property(data_type, dim, name=None)

Get or create a session-bound vertex vector property.

Parameters
  • data_type – Type of the vector property to create

  • dim – Dimension of vector property to be created

  • name – Name of vector property to be created

get_pgx_id()

Get the Graph id.

Returns

The id of this graph.

get_random_edge()

Get a random edge from the graph.

get_random_vertex()

Get a random vertex from the graph.

get_redaction_rules(authorization_type, name)

Get the redaction rules for an authorization_type name.

Possible authorization types are: [‘user’, ‘role’]

Parameters
  • authorization_type – the authorization type of the rules to be returned

  • name – the name of the user or role for which the rules should be returned

Returns

a list of redaction rules for the given name of type authorization_type

get_vertex(vid)

Get a vertex with a specified id.

Parameters

vid – Vertex id

Returns

PgxVertex object
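
The behaviour of get_vertex for an ID that is not in the graph is not stated here, so a caller may prefer to guard the lookup with has_vertex. A minimal sketch follows, using only the two documented method names; get_vertex_or_none is a hypothetical helper, and FakeGraph is a stand-in used so the example runs without a PGX server.

```python
def get_vertex_or_none(graph, vid):
    """Return graph.get_vertex(vid), or None when the graph has no such vertex."""
    if graph.has_vertex(vid):
        return graph.get_vertex(vid)
    return None


class FakeGraph:
    """Hypothetical stand-in for PgxGraph, backed by a dict of vertices."""

    def __init__(self, vertices):
        self._vertices = vertices

    def has_vertex(self, vid):
        return vid in self._vertices

    def get_vertex(self, vid):
        return self._vertices[vid]


g = FakeGraph({1: "vertex-1"})
print(get_vertex_or_none(g, 1))   # vertex-1
print(get_vertex_or_none(g, 99))  # None
```

With a real session, the same helper would be called with a PgxGraph in place of FakeGraph.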

get_vertex_labels()

Get the vertex labels belonging to this graph.

get_vertex_properties()

Get the set of vertex properties belonging to this graph.

This list might contain transient, private and published properties.

get_vertex_property(name)

Get a vertex property by name.

Parameters

name – Property name

get_vertices(filter_expr=None, name=None)

Create a new vertex set containing vertices according to the given filter expression.

Parameters
  • filter_expr – VertexFilter object with the filter expression. If None, all the vertices are returned.

  • name – The name of the collection to be created. If None, a name will be generated.

grant_permission(permission_entity, pgx_resource_permission)

Grant a permission on this graph to the given entity.

Possible PGXResourcePermission types are: [‘none’, ‘read’, ‘write’, ‘export’, ‘manage’]

Possible PermissionEntity objects are: PgxUser and PgxRole.

Cannot grant ‘manage’.

Parameters
  • permission_entity – the entity the rule is granted to

  • pgx_resource_permission – the permission type
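
Since ‘manage’ is listed as a permission type but cannot be granted, a caller may want to reject it before contacting the server. The sketch below assumes that permission types are passed as the strings listed above; grant_checked is a hypothetical helper, FakeGraph is a stand-in for PgxGraph, and the string "analyst_role" is only a placeholder for a PgxUser or PgxRole object.

```python
GRANTABLE_PERMISSIONS = ("none", "read", "write", "export")


def grant_checked(graph, permission_entity, permission):
    """Call graph.grant_permission after rejecting values that cannot be granted."""
    if permission == "manage":
        raise ValueError("'manage' cannot be granted")
    if permission not in GRANTABLE_PERMISSIONS:
        raise ValueError(f"unknown permission type: {permission!r}")
    graph.grant_permission(permission_entity, permission)


class FakeGraph:
    """Hypothetical stand-in for PgxGraph that records granted permissions."""

    def __init__(self):
        self.granted = []

    def grant_permission(self, permission_entity, pgx_resource_permission):
        self.granted.append((permission_entity, pgx_resource_permission))


g = FakeGraph()
grant_checked(g, "analyst_role", "read")
print(g.granted)  # [('analyst_role', 'read')]
```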

has_edge(eid)

Check if the edge with id eid is in the graph.

Parameters

eid – Edge id

has_edge_label()

Return True if the graph has edge labels, False if not.

has_vertex(vid)

Check if the vertex with id vid is in the graph.

Parameters

vid – vertex id

has_vertex_labels()

Return True if the graph has vertex labels, False if not.

is_bipartite(is_left)

Check whether a given graph is a bipartite graph.

A graph is considered a bipartite graph if all nodes can be divided in a ‘left’ and a ‘right’ side where edges only go from nodes on the ‘left’ side to nodes on the ‘right’ side.

Parameters

is_left – Boolean vertex property that - if the method returns true - will contain for each node whether it is on the ‘left’ side of the bipartite graph. If the method returns False, the content is undefined.
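
The condition that defines a bipartite graph here (every edge goes from the ‘left’ side to the ‘right’ side) can be checked for a given side assignment in a few lines of plain Python. This sketch only verifies an assignment; it does not compute one as is_bipartite does, and edges_go_left_to_right and the dict-based is_left map are hypothetical.

```python
def edges_go_left_to_right(edges, is_left):
    """Return True if every edge goes from a left vertex to a right vertex."""
    return all(is_left[src] and not is_left[dst] for src, dst in edges)


is_left = {"u1": True, "u2": True, "item": False}

print(edges_go_left_to_right([("u1", "item"), ("u2", "item")], is_left))  # True
print(edges_go_left_to_right([("u1", "u2")], is_left))                    # False
```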

property is_fresh

Check whether an in-memory representation of a graph is fresh.

is_pinned()

For a published graph, indicates if the graph is pinned. A pinned graph will stay published even if no session is using it.

property is_published

Check if this graph is published with snapshots.

is_published_with_snapshots()

Check if this graph is published with snapshots.

Returns

True if this graph is published, False otherwise

property pgx_instance

Get the server instance.

pick_random_vertex()

Select a random vertex from the graph.

Returns

The PgxVertex object

pin()

For a published graph, pin the graph so that it stays published even if no session uses it. This call pins the graph lineage, which ensures that at least the latest available snapshot stays published when no session uses the graph.

prepare_pgql(pgql_query)

Prepare a PGQL query.

Parameters

pgql_query – Query string in PGQL

Returns

A prepared statement object

publish(vertex_properties=True, edge_properties=True)

Publish the graph so it can be shared between sessions.

This moves the graph name from the private into the public namespace.

Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be published as well

  • edge_properties – List of edge properties belonging to graph specified to be published as well

publish_with_snapshots()

Publish the graph and all its snapshots so they can be shared between sessions.

query_pgql(query)

Submit a pattern matching select-only query.

Parameters

query – Query string in PGQL

Returns

PgqlResultSet with the result

remove_redaction_rule(redaction_rule_config, authorization_type, *names)

Remove a redaction rule for authorization_type names.

Possible authorization types are: [‘user’, ‘role’]

Parameters
  • authorization_type – the authorization type of the rule to be removed

  • names – the names of the users or roles for which the rule should be removed

rename(name)

Rename this graph.

Parameters

name – New name

revoke_permission(permission_entity)

Revoke all permissions on this graph from the given entity.

Possible PermissionEntity objects are: PgxUser and PgxRole.

Parameters

permission_entity – the entity for which all permissions will be revoked

simplify(vertex_properties=True, edge_properties=True, keep_multi_edges=False, keep_self_edges=False, keep_trivial_vertices=False, in_place=False, name=None)

Create a simplified version of a graph.

Note that the returned graph and properties are transient and therefore session bound. They can be explicitly destroyed and get automatically freed once the session dies.

Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • keep_multi_edges – Defines if multi-edges should be kept in the result

  • keep_self_edges – Defines if self-edges should be kept in the result

  • keep_trivial_vertices – Defines if isolated nodes should be kept in the result

  • in_place – If the operation should be done in place or if a new graph has to be created

  • name – New graph name
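
The effect of the three keep_* flags can be sketched on a plain edge list. This is an illustration under stated assumptions, not PyPGX code: multi-edges are taken to be repeated (source, destination) pairs, trivial vertices are taken to be vertices left without any edge after filtering, and simplify_edges is a hypothetical name. PGX may apply these steps differently.

```python
def simplify_edges(vertices, edges, keep_multi_edges=False,
                   keep_self_edges=False, keep_trivial_vertices=False):
    """Return (vertices, edges) after dropping what the flags do not keep."""
    result = []
    seen = set()
    for src, dst in edges:
        if not keep_self_edges and src == dst:
            continue  # drop self-edges
        if not keep_multi_edges:
            if (src, dst) in seen:
                continue  # drop repeated edges
            seen.add((src, dst))
        result.append((src, dst))
    if keep_trivial_vertices:
        kept_vertices = list(vertices)
    else:
        touched = {v for edge in result for v in edge}
        kept_vertices = [v for v in vertices if v in touched]
    return kept_vertices, result


vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 2), (2, 2), (2, 3)]

print(simplify_edges(vertices, edges))
# ([1, 2, 3], [(1, 2), (2, 3)])
```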

sort_by_degree(vertex_properties=True, edge_properties=True, ascending=True, in_degree=True, in_place=False, name=None)

Create a sorted version of a graph and all its properties.

The returned graph is sorted such that the node numbering is ordered by the degree of the nodes. Note that the returned graph and properties are transient.

Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • ascending – Sorting order

  • in_degree – If in_degree should be used for sorting. Otherwise use out degree.

  • in_place – If the sorting should be done in place or a new graph should be created

  • name – New graph name
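
The roles of ascending and in_degree can be sketched in plain Python. This is not PyPGX code; degree_order is a hypothetical helper that only computes the vertex order, and how PGX breaks ties between vertices of equal degree is not specified here (the sketch keeps the input order for ties).

```python
def degree_order(vertices, edges, ascending=True, in_degree=True):
    """Return vertices ordered by in-degree (or out-degree if in_degree is False)."""
    degree = {v: 0 for v in vertices}
    for src, dst in edges:
        degree[dst if in_degree else src] += 1
    return sorted(vertices, key=lambda v: degree[v], reverse=not ascending)


vertices = ["a", "b", "c"]
edges = [("a", "b"), ("c", "b"), ("a", "c")]

# In-degrees: a=0, b=2, c=1
print(degree_order(vertices, edges))  # ['a', 'c', 'b']
# Out-degrees: a=2, b=0, c=1
print(degree_order(vertices, edges, ascending=False, in_degree=False))  # ['a', 'c', 'b']
```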

sparsify(sparsification, vertex_properties=True, edge_properties=True, name=None)

Sparsify the given graph and return a new graph with fewer edges.

Parameters
  • sparsification – The sparsification coefficient. Must be between 0.0 and 1.0.

  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • name – Filtered graph name

store(format, path, num_partitions=None, vertex_properties=True, edge_properties=True, overwrite=False)

Store graph in a file.

Parameters
  • format – One of [‘pgb’, ‘edge_list’, ‘two_tables’, ‘adj_list’, ‘flat_file’, ‘graphml’, ‘pg’, ‘rdf’, ‘csv’]

  • path – Path to which graph will be stored

  • num_partitions – The number of partitions that should be created, when exporting to multiple files

  • vertex_properties – The collection of vertex properties to store together with the graph data. If not specified all the vertex properties are stored

  • edge_properties – The collection of edge properties to store together with the graph data. If not specified all the edge properties are stored

  • overwrite – Overwrite if existing

transpose(vertex_properties=True, edge_properties=True, edge_label_mapping=None, in_place=False, name=None)

Create a transpose of this graph.

A transpose of a directed graph is another directed graph on the same set of vertices with all of the edges reversed. If this graph contains an edge (u,v) then the returned graph will contain an edge (v,u) and vice versa. If this graph is undirected (isDirected() returns false), this operation has no effect and will either return a copy or act as identity function depending on the in_place parameter.

Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • edge_label_mapping – Can be used to rename edge labels. For example, an edge (John,Mary) labeled “fatherOf” can be transformed to be labeled “hasFather” on the transpose graph’s edge (Mary,John) by passing in a dict-like object {“fatherOf”: “hasFather”}.

  • in_place – If the transpose should be done in place or a new graph should be created

  • name – New graph name
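
Edge reversal combined with label renaming can be sketched in plain Python. This is an illustration of the semantics, not PyPGX code; transpose_edges and the (source, destination, label) tuples are hypothetical.

```python
def transpose_edges(edges, edge_label_mapping=None):
    """Reverse every edge and rename labels found in edge_label_mapping."""
    mapping = edge_label_mapping or {}
    return [(dst, src, mapping.get(label, label)) for src, dst, label in edges]


edges = [("John", "Mary", "fatherOf"), ("Mary", "Ann", "knows")]

print(transpose_edges(edges, {"fatherOf": "hasFather"}))
# [('Mary', 'John', 'hasFather'), ('Ann', 'Mary', 'knows')]
```

Labels that do not appear in the mapping, such as "knows" above, are kept unchanged in this sketch.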

undirect(vertex_properties=True, edge_properties=True, keep_multi_edges=True, keep_self_edges=True, keep_trivial_vertices=True, in_place=False, name=None)
Parameters
  • vertex_properties – List of vertex properties belonging to graph specified to be kept in the new graph

  • edge_properties – List of edge properties belonging to graph specified to be kept in the new graph

  • keep_multi_edges – Defines if multi-edges should be kept in the result

  • keep_self_edges – Defines if self-edges should be kept in the result

  • keep_trivial_vertices – Defines if isolated nodes should be kept in the result

  • in_place – If the operation should be done in place or if a new graph has to be created

  • name – New graph name

unpin()

For a published graph, unpin the graph so that if no snapshot of the graph is used by any session or pinned, the graph and all its snapshots can be removed.

class pypgx.api.PgxMap(graph, java_map)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

A map is a collection of key-value pairs.

contains_key(key)
Parameters

key – Key of the entry

entries()

Return an entry set.

get(key)

Get the entry with the specified key.

Parameters

key – Key of the entry

Returns

Value

keys()

Return a key set.

put(key, value)

Set the value for a key in the map specified by the given name.

Parameters
  • key – Key of the entry

  • value – New value

remove(key)

Remove the entry specified by the given key from the map with the given name.

Returns true if the map did contain an entry with the given key, false otherwise.

Parameters

key – Key of the entry

Returns

True if the map contained the key

property size

Map size.

class pypgx.api.PgxPath(graph, java_path)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

property edges

Return a list of edges in the path.

property path

Return path as a list of (vertex,edge) tuples.

property vertices

Return a list of vertices in the path.

class pypgx.api.PgxSession(java_session)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

A PGX session represents an active user connected to a ServerInstance.

Every session gets a workspace assigned on the server, which can be used to read graph data, create transient data or custom algorithms for the sake of graph analysis. Once a session gets destroyed, all data in the session workspace is freed.

close()

Close this session object.

compile_program(path, overwrite=False)

Compile a Green-Marl program for parallel execution with all optimizations enabled.

Parameters
  • path – Path to program

  • overwrite – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise

compile_program_code(code, overwrite=False)

Compile a Green-Marl program.

Parameters
  • code – The Green-Marl code to compile

  • overwrite – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise

create_analyst()

Create and return a new analyst.

Returns

An analyst object

create_frame(schema, column_data, frame_name)

Create a frame with the specified data

Parameters
  • schema – List of tuples (columnName, columnType)

  • column_data – Map of iterables, columnName -> columnData

  • frame_name – Name of the frame

Returns

The created frame
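
The shapes of schema and column_data can be checked locally before calling create_frame. The sketch below is not part of PyPGX: check_frame_input is a hypothetical helper, the column type names ‘integer’ and ‘string’ are assumed, and the requirement that all columns have the same length is an assumption about what a well-formed frame needs.

```python
def check_frame_input(schema, column_data):
    """Validate that column_data matches schema and return the row count."""
    names = [name for name, _column_type in schema]
    missing = [n for n in names if n not in column_data]
    extra = [n for n in column_data if n not in names]
    if missing or extra:
        raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
    lengths = {len(column_data[n]) for n in names}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of values")
    return lengths.pop() if lengths else 0


# schema: list of (columnName, columnType); column_data: columnName -> values
schema = [("id", "integer"), ("name", "string")]
column_data = {"id": [1, 2, 3], "name": ["a", "b", "c"]}

print(check_frame_input(schema, column_data))  # 3
```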

create_frame_builder(schema)

Create a frame builder initialized with the given schema

Parameters

schema – List of tuples (columnName, columnType)

Returns

A frame builder initialized with the given schema

create_graph_builder(id_type='integer', vertex_id_generation_strategy='user_ids', edge_id_generation_strategy='auto_generated')

Create a graph builder with integer vertex IDs.

Parameters
  • id_type – The type of the vertex ID

  • vertex_id_generation_strategy – The vertex ID generation strategy to be used

  • edge_id_generation_strategy – The edge ID generation strategy to be used

create_map(key_type, value_type, name=None)

Create a map.

Possible types are: [‘integer’, ‘long’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’]

Parameters
  • key_type – Property type of the keys that are going to be stored inside the map

  • value_type – Property type of the values that are going to be stored inside the map

  • name – Map name

Returns

A named PgxMap of key content type key_type and value content type value_type

create_sequence(content_type, name=None)

Create a sequence of scalars.

Possible types are: [‘integer’, ‘long’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’]

Parameters
  • content_type – Property type of the elements in the sequence

  • name – Sequence name

Returns

A named ScalarSequence of content type content_type

create_set(content_type, name=None)

Create a set of scalars.

Possible types are: [‘integer’, ‘long’, ‘double’, ‘boolean’, ‘string’, ‘vertex’, ‘edge’, ‘local_date’, ‘time’, ‘timestamp’, ‘time_with_timezone’, ‘timestamp_with_timezone’]

Parameters
  • content_type – content type of the set

  • name – the set’s name

Returns

A named ScalarSet of content type content_type

describe_graph_file(file_path)

Describe the graph contained in the file at the given path.

Parameters

file_path – Graph file path

Returns

The configuration which can be used to load the graph

describe_graph_files(files_path)

Describe the graph contained in the files at the given paths.

Parameters

files_path – Paths to the files

Returns

The configuration which can be used to load the graph

destroy()

Destroy this session object.

edge_provider_from_frame(provider_name, source_provider, destination_provider, frame, source_vertex_column='src', destination_vertex_column='dst')

Create an edge provider from a PgxFrame to later build a PgxGraph

Parameters
  • provider_name – edge provider name

  • source_provider – vertex source provider name

  • destination_provider – vertex destination provider name

  • frame – PgxFrame to use

  • source_vertex_column – column to use as source keys. Defaults to “src”

  • destination_vertex_column – column to use as destination keys. Defaults to “dst”

Returns

the EdgeFrameDeclaration object

execute_pgql(pgql_query)

Submit any query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraphAsync(String).

Parameters

pgql_query – Query string in PGQL

Returns

The query result set

throws InterruptedException if the caller thread gets interrupted while waiting for completion.

throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

explain_pgql(pgql_query)

Explain the execution plan of a pattern matching query.

Note: Different PGX versions may return different execution plans.

Parameters

pgql_query – Query string in PGQL

Returns

The query plan

get_available_compiled_program_ids()

Get the set of available compiled program IDs.

get_available_snapshots(snapshot)

Return a list of all available snapshots of the given input graph.

Parameters

snapshot – A ‘PgxGraph’ object for which the available snapshots shall be retrieved

Returns

A list of ‘GraphMetaData’ objects, each corresponding to a snapshot of the input graph

get_compiled_program(id)

Get a compiled program by ID.

Parameters

id – The id of the compiled program

get_graph(name, namespace=None)

Find and return a graph with name name within the given namespace loaded inside PGX.

The search for the snapshot to return is done according to the following rules:

  • if namespace is private, then the search occurs on already referenced snapshots of the graph with name name and the most recent snapshot is returned

  • if namespace is public, then the search occurs on published graphs and the most recent snapshot of the published graph with name name is returned

  • if namespace is null, then the private namespace is searched first and, if no snapshot is found, the public namespace is then searched

Multiple calls of this method with the same parameters will return different PgxGraph objects referencing the same graph, with the server keeping track of how many references a session has to each graph.

Therefore, a graph is released within the server either if:

  • all the references are moved to another graph (e.g. via setSnapshot(PgxGraph, long))

  • the Destroyable.destroy() method is called on one reference: note that this invalidates all references

Parameters
  • name – The name of the graph

  • namespace – The namespace where to look up the graph

Returns

The graph with the given name
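
The namespace search rules can be sketched with two dictionaries standing in for the private and public namespaces. This is a plain-Python illustration, not PyPGX code: resolve_graph is hypothetical, the strings "private" and "public" are assumed placeholders for the real namespace values, and snapshot selection is omitted.

```python
def resolve_graph(name, private_graphs, public_graphs, namespace=None):
    """Look up a graph name following the private/public/None search rules."""
    if namespace == "private":
        return private_graphs.get(name)
    if namespace == "public":
        return public_graphs.get(name)
    if namespace is None:
        # Private namespace first, then fall back to the public one.
        if name in private_graphs:
            return private_graphs[name]
        return public_graphs.get(name)
    raise ValueError(f"unknown namespace: {namespace!r}")


private_graphs = {"social": "private social"}
public_graphs = {"social": "published social", "roads": "published roads"}

print(resolve_graph("social", private_graphs, public_graphs))            # private social
print(resolve_graph("roads", private_graphs, public_graphs))             # published roads
print(resolve_graph("social", private_graphs, public_graphs, "public"))  # published social
```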

get_graphs(namespace=None)

Return a collection of graph names accessible under the given namespace.

Parameters

namespace – The namespace where to look up the graphs

get_idle_timeout()

Get the idle timeout of this session

Returns

the idle timeout in seconds

get_name()

Get the identifier of the current session.

Returns

identifier of this session

get_pattern_matching_semantic()
Returns

The current pattern matching semantic. If the return value is None, the current session respects the pattern matching configuration of the engine.

get_pgql_result_set(id)

Get a PGQL result set by ID.

Parameters

id – The PGQL result set ID

Returns

The requested PGQL result set or None if no such result set exists for this session

get_session_context()

Get the context describing the current session.

Returns

context of this session

get_source()

Get the current session source

Returns

session source

get_task_timeout()

Get the task timeout of this session

Returns

the task timeout in seconds

graph_from_frames(graph_name, vertex_providers, edge_providers, partitioned=True)

Create PgxGraph from vertex providers and edge providers.

partitioned must be set to True if multiple vertex or edge providers are given

Parameters
  • graph_name – graph name

  • vertex_providers – list of vertex providers

  • edge_providers – list of edge providers

  • partitioned – whether the graph is partitioned or not. Defaults to True

Returns

the PgxGraph object

pandas_to_pgx_frame(pandas_dataframe, frame_name)

Create a frame from a pandas dataframe. Duplicate columns will be renamed. Mixed column types are not supported.

This method requires pandas

Parameters
  • frame_name – Name of the frame

  • pandas_dataframe – The Pandas dataframe to use

Returns

the frame created

prepare_pgql(pgql_query)

Prepare a pattern matching query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as getGraph(String).

Parameters

pgql_query – Query string in PGQL

Returns

A prepared statement object

query_pgql(pgql_query)

Submit a pattern matching query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraph(String).

Parameters

pgql_query – Query string in PGQL

Returns

The query result set

throws InterruptedException if the caller thread gets interrupted while waiting for completion.

throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.

read_frame()

Create a new frame reader with which it is possible to parameterize the loading of the row frame.

Returns

A frame reader object with which it is possible to parameterize the loading

read_graph_as_of(config, meta_data=None, creation_timestamp=None, new_graph_name=None)

Read a graph and its properties of a specific version (metaData or creationTimestamp) into memory.

The creationTimestamp must be a valid version of the graph.

Parameters
  • config – The graph config

  • meta_data – The metaData object returned by get_available_snapshots(GraphConfig) identifying the version

  • creation_timestamp – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out

  • new_graph_name – How the graph should be named. If None, a name will be generated.

Returns

The PgxGraph object
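
creation_timestamp is expressed in milliseconds since January 1, 1970. The sketch below converts between that unit and Python datetime objects using only the standard library; to_epoch_millis and from_epoch_millis are hypothetical helper names, not PyPGX functions, and UTC is assumed as the reference time zone. The value passed to read_graph_as_of must still correspond to an existing version of the graph, so in practice it would come from the snapshot metadata rather than from an arbitrary date.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Dividing two timedeltas with // yields an int and avoids float rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis):
    """Convert milliseconds since the epoch to a UTC datetime, e.g. for display."""
    return EPOCH + timedelta(milliseconds=millis)


snapshot_time = datetime(2021, 1, 1, tzinfo=timezone.utc)
print(to_epoch_millis(snapshot_time))  # 1609459200000
```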

read_graph_by_name(graph_name, graph_source)
Parameters
  • graph_name – Name of graph

  • graph_source – Source of graph

read_graph_file(file_path, file_format=None, graph_name=None)
Parameters
  • file_path – File path

  • file_format – File format of graph

  • graph_name – Name of graph

read_graph_files(file_paths, edge_file_paths=None, file_format=None, graph_name=None)

Load the graph contained in the files at the given paths.

Parameters
  • file_paths – Paths to the vertex files

  • edge_file_paths – Paths to the edge files

  • file_format – File format

  • graph_name – Loaded graph name

read_graph_with_properties(config, max_age=9223372036854775807, max_age_time_unit='days', block_if_full=False, update_if_not_fresh=True, graph_name=None)

Read a graph and its properties, specified in the graph config, into memory.

Parameters
  • config – The graph config

  • max_age – If another snapshot of the given graph already exists, the age of the latest existing snapshot will be compared to the given maxAge. If the latest snapshot is in the given range, it will be returned, otherwise a new snapshot will be created.

  • max_age_time_unit – The time unit of the maxAge parameter

  • block_if_full – If true and a new snapshot needs to be created but no more snapshots are allowed by the server configuration, the returned future will not complete until space becomes available. If full and this flag is false, the returned future will complete exceptionally instead.

  • update_if_not_fresh – If a newer data version exists in the backing data source (see PgxGraph.is_fresh()), this flag tells whether to read it and create another snapshot inside PGX. If the “snapshots_source” field of config is SnapshotsSource.REFRESH, the returned graph may have multiple snapshots, depending on whether previous reads with the same config occurred; otherwise, if the “snapshots_source” field is SnapshotsSource.CHANGE_SET, only the most recent snapshot (either pre-existing or freshly read) will be visible.

  • graph_name – How the graph should be named. If None, a name will be generated. If a graph with that name already exists, the returned future will complete exceptionally.
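
The max_age rule can be restated as a small decision function. This is only a model of the behaviour described above, not PGX code: reuses_latest_snapshot is a hypothetical name, treating the bound as inclusive is an assumption, and the interaction with update_if_not_fresh is not modelled.

```python
# Default from the signature above; it equals 2**63 - 1.
DEFAULT_MAX_AGE = 9223372036854775807


def reuses_latest_snapshot(latest_snapshot_age, max_age=DEFAULT_MAX_AGE):
    """Return True if the latest snapshot would be returned instead of a new one.

    Both arguments must use the same time unit (see max_age_time_unit).
    """
    return latest_snapshot_age <= max_age


print(DEFAULT_MAX_AGE == 2**63 - 1)            # True
print(reuses_latest_snapshot(30))              # True
print(reuses_latest_snapshot(30, max_age=10))  # False
```

With the default max_age, any existing snapshot falls inside the range, so the age check by itself never forces a new snapshot.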

register_keystore(keystore_path, keystore_password)

Register a keystore.

Parameters
  • keystore_path – The path to the keystore which shall be registered

  • keystore_password – The password of the provided keystore

property server_instance

Get the server instance.

Returns

The server instance

set_pattern_matching_semantic(pattern_matching_semantic)

Set the pattern matching semantic of the session.

Parameters

pattern_matching_semantic – Pattern matching semantic. If None is passed, the session respects the pattern matching semantic of the engine. Should be either ‘HOMOMORPHISM’ or ‘ISOMORPHISM’.
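
A minimal sketch of validating the argument before it is sent to the session. normalize_semantic is a hypothetical helper; whether the API itself accepts lower-case spellings is not stated here, so the helper always produces one of the two documented upper-case values or None.

```python
ALLOWED_SEMANTICS = ("HOMOMORPHISM", "ISOMORPHISM")


def normalize_semantic(value):
    """Return a documented semantic name, or None to defer to the engine setting."""
    if value is None:
        return None
    normalized = str(value).strip().upper()
    if normalized not in ALLOWED_SEMANTICS:
        raise ValueError(
            f"expected one of {ALLOWED_SEMANTICS} or None, got {value!r}"
        )
    return normalized


def apply_semantic(session, value):
    """session is assumed to be a PgxSession."""
    session.set_pattern_matching_semantic(normalize_semantic(value))
```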

set_snapshot(graph, meta_data=None, creation_timestamp=None, force_delete_properties=False)

Set a graph to a specific snapshot.

You can use this method to jump back and forth in time between various snapshots of the same graph. If successful, the given graph will point to the requested snapshot after the returned future completes.

Parameters
  • graph – Input graph

  • meta_data – A GraphMetaData object used to identify the snapshot

  • creation_timestamp – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out

  • force_delete_properties – Graphs with transient properties cannot be checked out to a different version. If this flag is set to true, the checked out graph will no longer contain any transient properties. If false, the returned future will complete exceptionally with an UnsupportedOperationException as its cause.

vertex_provider_from_frame(provider_name, frame, vertex_key_column='id')

Create a vertex provider from a PgxFrame to later build a PgxGraph.

Parameters
  • provider_name – vertex provider name

  • frame – PgxFrame to use

  • vertex_key_column – column to use as keys. Defaults to “id”

Returns

the VertexFrameDeclaration object

class pypgx.api.PickingStrategyBuilder(java_mutation_strategy_builder)

Bases: pypgx.api._mutation_strategy_builder.MutationStrategyBuilder

A class for defining a picking strategy on a graph.

class pypgx.api.Scalar(graph, java_scalar)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

A scalar value.

destroy()

Free resources on the server taken up by this Scalar.

get()

Get scalar value.

set(value)

Set the scalar value.

Parameters

value – Value to be assigned
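
Since a Scalar holds resources on the server until destroy() is called, a try/finally block is one way to make sure it is released. The sketch below uses only the set, get and destroy methods documented above; set_and_read is a hypothetical helper and scalar is assumed to be a Scalar obtained elsewhere.

```python
def set_and_read(scalar, value):
    """Assign a value, read it back, and always free the scalar afterwards."""
    try:
        scalar.set(value)
        return scalar.get()
    finally:
        # Runs on success and on error, so the server-side resources are released.
        scalar.destroy()
```

Scalar also derives from PgxContextManager, so a with statement may be an alternative; its exit behaviour is not described in this section.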

class pypgx.api.ScalarCollection(java_scalar_collection)

Bases: pypgx.api._pgx_collection.PgxCollection

A collection of scalars.

class pypgx.api.ScalarSequence(java_scalar_collection)

Bases: pypgx.api._pgx_collection.ScalarCollection

An ordered sequence of scalars which may contain duplicates.

class pypgx.api.ScalarSet(java_scalar_collection)

Bases: pypgx.api._pgx_collection.ScalarCollection

An unordered set of scalars that does not contain duplicates.

class pypgx.api.ServerInstance(java_server_instance)

Bases: object

A PGX server instance.

create_session(source, idle_timeout=None, task_timeout=None, time_unit='milliseconds')
Parameters
  • source – A descriptive string identifying the client

  • idle_timeout – If not None, tries to overwrite server default idle timeout

  • task_timeout – If not None, tries to overwrite server default task timeout

  • time_unit – Time unit of idle_timeout and task_timeout (‘days’, ‘hours’, ‘microseconds’, ‘milliseconds’, ‘minutes’, ‘nanoseconds’, ‘seconds’)

Returns

PgxSession
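
A minimal sketch of calling create_session with explicit timeouts. open_session is a hypothetical wrapper and instance is assumed to be a ServerInstance; the set of time units is copied from the parameter description above.

```python
TIME_UNITS = frozenset({
    "days", "hours", "microseconds", "milliseconds",
    "minutes", "nanoseconds", "seconds",
})


def open_session(instance, source, idle_timeout=None, task_timeout=None,
                 time_unit="milliseconds"):
    """Create a session, rejecting time units that the documentation does not list."""
    if time_unit not in TIME_UNITS:
        raise ValueError(f"unsupported time unit: {time_unit!r}")
    # Both timeouts are interpreted in the same time_unit.
    return instance.create_session(
        source,
        idle_timeout=idle_timeout,
        task_timeout=task_timeout,
        time_unit=time_unit,
    )
```

For example, open_session(instance, "my-client", idle_timeout=30, time_unit="minutes") would request a 30 minute idle timeout and keep the server default for the task timeout.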

get_pgx_config()

Get the PGX config.

Returns

Dict containing current config

get_server_state()
Returns

Server state as a dict

get_session(session_id)

Get a session by ID.

Parameters

session_id – Id of the session

Returns

PgxSession

get_version()

Get the PGX extended version of this instance.

Returns

VersionInfo object

kill_session(session_id)

Kill a session.

Parameters

session_id – Session id

class pypgx.api.Synchronizer(java_synchronizer)

Bases: object

A class for synchronizing changes in an external data source with a PGX graph.

class pypgx.api.VertexCollection(graph, java_collection)

Bases: pypgx.api._pgx_collection.PgxCollection

A collection of vertices.

add(v)

Add one or multiple vertices to the collection.

Parameters

v – Vertex or vertex id. Can also be an iterable of vertices or vertex ids.

add_all(vertices)

Add multiple vertices to the collection.

Parameters

vertices – Iterable of vertices or vertex ids

contains(v)

Check if the collection contains vertex v.

Parameters

v – PgxVertex object or id

remove(v)

Remove one or multiple vertices from the collection.

Parameters

v – Vertex or vertex id. Can also be an iterable of vertices or vertex ids.

remove_all(vertices)

Remove multiple vertices from the collection.

Parameters

vertices – Iterable of vertices or vertex ids
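
The methods above can be combined, for example to add only those vertices that are not yet present. This matters for collections that allow duplicates, such as VertexSequence. add_missing is a hypothetical helper; collection is assumed to be a VertexCollection, and contains is assumed to return a truthy value when the vertex is present.

```python
def add_missing(collection, vertices):
    """Add each vertex (or vertex id) at most once and return those that were added."""
    missing = []
    for v in vertices:
        # Skip repeats within the input as well as vertices already stored.
        if v not in missing and not collection.contains(v):
            missing.append(v)
    if missing:
        collection.add_all(missing)
    return missing
```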

class pypgx.api.VertexLabels(graph, java_labels)

Bases: pypgx.api._property.VertexProperty

class pypgx.api.VertexProperty(graph, java_prop)

Bases: pypgx.api._property.PgxProperty

get(key)
Parameters

key – The key (vertex/edge) whose property to get

set(key, value)

Set a property value.

Parameters
  • key – The key (vertex/edge) whose property to set

  • value – The property value

set_values(values)

Set the labels values.

Parameters

values – pgxmap with ids and values

class pypgx.api.VertexSequence(graph, java_collection)

Bases: pypgx.api._pgx_collection.VertexCollection

An ordered sequence of vertices which may contain duplicates.

class pypgx.api.VertexSet(graph, java_collection)

Bases: pypgx.api._pgx_collection.VertexCollection

An unordered set of vertices (no duplicates).

extract_top_k_from_map(pgx_map, k)

Extract the top k keys from the given map and put them into this collection.

Parameters
  • pgx_map – the map to extract the keys from

  • k – how many keys to extract
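
As a local illustration of a top-k extraction, the sketch below selects keys from a plain Python dict. It assumes that "top" means the keys with the largest values, which this section does not state; top_k_keys is a hypothetical function and does not call PGX.

```python
import heapq


def top_k_keys(mapping, k):
    """Return the k keys with the largest values, largest first."""
    return heapq.nlargest(k, mapping, key=mapping.get)


scores = {"a": 0.1, "b": 0.7, "c": 0.4}
print(top_k_keys(scores, 2))  # ['b', 'c']
```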