Graph Builder and Graph Change Set

Creating Graphs from Scratch (GraphBuilder)

To create a graph from scratch, the user first needs to acquire a graph builder, which accumulates all of the new vertices and edges. The basic workflow for creating graphs from scratch is:

  1. Acquire a modifiable graph builder

  2. Add vertices and edges to the graph builder

  3. Create a fixed PgxGraph out of the accumulated changes

The following method in PgxSession creates a graph builder:

builder = session.create_graph_builder(
    id_type="string",
    vertex_id_generation_strategy="user_ids",
    edge_id_generation_strategy="user_ids"
)

All arguments are optional. Called without arguments, create_graph_builder() creates a default graph builder with the vertex ID type integer. The id_type argument defines the vertex ID type of the resulting GraphBuilder instance. The two additional arguments vertex_id_generation_strategy and edge_id_generation_strategy define the ID generation strategy for vertices and edges, respectively.

The GraphBuilder class itself has the following methods for adding vertices and edges, as well as for building the resulting graph.

graph_builder = session.create_graph_builder(
    edge_id_generation_strategy="user_ids")
graph_builder.add_vertex(0)  # adding a vertex
graph_builder.add_vertex(1)
graph_builder.add_edge(src=0, dst=1, edge_id=7)  # adding an edge
graph_builder.reset_edge(7)  # resetting an edge
graph_builder.reset_vertex(0)  # resetting a vertex

Adding vertices and edges

PGX supports two different ID generation strategies for adding vertices and edges to a graph builder, selected through the vertex_id_generation_strategy and edge_id_generation_strategy parameters: user_ids or auto_generated. In the user_ids strategy, the user is responsible for passing the desired ID for the newly added entity, as the first argument of add_vertex() or as the edge_id argument of add_edge(). The chosen ID must be unique in the graph (i.e., no other entity with the same ID may exist in the graph already). In the auto_generated strategy, the user must omit the ID argument of the add_vertex() or add_edge() methods and PGX takes care of generating a unique ID for the newly added entity.
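
The difference between the two strategies can be illustrated with a small toy model. This is only a sketch and not PGX code: the class ToyIdAllocator, its allocate() method, the counter-based ID generation, and the use of ValueError are all assumptions made for illustration.

```python
import itertools


class ToyIdAllocator:
    """Toy model of the two ID generation strategies (not PGX code)."""

    def __init__(self, strategy="user_ids"):
        if strategy not in ("user_ids", "auto_generated"):
            raise ValueError(f"unknown strategy: {strategy!r}")
        self.strategy = strategy
        self.used = set()
        self._counter = itertools.count()

    def allocate(self, requested_id=None):
        if self.strategy == "user_ids":
            # The caller must supply an ID, and it must still be unused.
            if requested_id is None:
                raise ValueError("user_ids requires an explicit ID")
            if requested_id in self.used:
                raise ValueError(f"ID {requested_id!r} already exists")
            new_id = requested_id
        else:
            # The caller must not supply an ID; one is generated instead.
            if requested_id is not None:
                raise ValueError("auto_generated forbids an explicit ID")
            new_id = next(self._counter)
        self.used.add(new_id)
        return new_id
```

With ToyIdAllocator("user_ids"), a second allocate(7) fails because the ID is taken; with ToyIdAllocator("auto_generated"), passing any ID fails and allocate() hands out fresh IDs on its own.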

Edges can be added with the add_edge() method. Like a vertex ID, an edge ID needs to be unique.

The src and dst arguments of add_edge() describe the source and destination vertices, respectively. There are two ways to specify these vertices:

  • By their vertex ID

  • By an instance of a VertexBuilder which was acquired previously with a call to add_vertex()

The second method can be useful if the vertex IDs are particularly complicated or long. For example:

builder = session.create_graph_builder(
    id_type="string",
    edge_id_generation_strategy="user_ids"
)
v1 = builder.add_vertex("long named vertex 1")
v2 = builder.add_vertex("long named vertex 2")
builder.add_edge(v1, v2, 0)

If add_edge() is called with vertices that have not been created with a call to add_vertex() before, they will be added on the fly.
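
Because missing endpoints are added on the fly, an edge list can be loaded without any explicit add_vertex() calls. The helper below is a minimal sketch: the function name add_edges_from_pairs is hypothetical, and it assumes the builder was created with edge_id_generation_strategy="user_ids" so that the list index can serve as the edge ID.

```python
def add_edges_from_pairs(builder, pairs):
    """Add one edge per (src, dst) pair, using the list index as edge ID.

    `builder` is expected to be a graph builder created with
    edge_id_generation_strategy="user_ids" (an assumption of this sketch).
    """
    for edge_id, (src, dst) in enumerate(pairs):
        # Vertices that do not exist yet are created by add_edge() itself.
        builder.add_edge(src=src, dst=dst, edge_id=edge_id)
    return builder
```

For example, add_edges_from_pairs(builder, [("a", "b"), ("b", "c")]) on a builder with id_type="string" would register the vertices "a", "b" and "c" along with the two edges.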

Resetting added vertices and edges

Added vertices and edges can be removed from the graph builder by calling one of the appropriate reset methods. Calling reset_vertex() will remove the vertex with the given vertex_id or the given VertexBuilder instance and all connected edges from the graph builder.

Similarly, calling reset_edge() will remove the edge with the given edge_id from the graph builder. After a vertex or edge has been reset, the vertex_id or edge_id is again free to use for this graph builder instance.
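
The reset semantics can be pictured with a toy stand-in. This is a sketch, not the PGX implementation: the class ToyBuilder and its internal data structures are hypothetical, and silently ignoring a reset of an unknown ID is an assumption of the sketch.

```python
class ToyBuilder:
    """Toy stand-in that mimics the reset behaviour described above."""

    def __init__(self):
        self.vertices = set()
        self.edges = {}  # edge_id -> (src, dst)

    def add_vertex(self, vertex_id):
        if vertex_id in self.vertices:
            raise ValueError(f"vertex {vertex_id!r} already exists")
        self.vertices.add(vertex_id)

    def add_edge(self, src, dst, edge_id):
        if edge_id in self.edges:
            raise ValueError(f"edge {edge_id!r} already exists")
        # Unknown endpoints are added on the fly.
        self.vertices.update((src, dst))
        self.edges[edge_id] = (src, dst)

    def reset_edge(self, edge_id):
        # After this call the edge ID is free to use again.
        self.edges.pop(edge_id, None)

    def reset_vertex(self, vertex_id):
        self.vertices.discard(vertex_id)
        # Drop every edge that touches the removed vertex.
        self.edges = {
            e: (s, d) for e, (s, d) in self.edges.items()
            if vertex_id not in (s, d)
        }
```

After reset_vertex(0) in this model, both vertex 0 and every edge connected to it are gone, so add_vertex(0) succeeds again.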

Adding properties and labels

The method add_vertex() will return a VertexBuilder for that specific vertex. It can be used to add property values to the vertex. The VertexBuilder provides the following methods:

set_property(key, value)
add_label(label)

Property values can be added to the given vertex using the set_property() method. It takes the name of the property as first argument and the value of the property as the second.

Labels can be added using the add_label() method which takes the new label as argument.

Properties are not required to be declared before values can be assigned to them. The first mention of a property will also define its type, which is deduced from the value.

Example:

builder = session.create_graph_builder()
vertex = builder.add_vertex(1)
vertex.set_property("intProp", 10)
vertex.set_property("strProp", "str")

The property intProp will now have the type integer for this graph builder, as the first value assigned to it was an integer. Similarly the property strProp will have the type string.

Once the type of a property has been defined, it cannot be changed anymore and any calls to set_property() with a value of the wrong type will result in an exception.
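
The rule "first value defines the type" can be modelled in a few lines. The class below is a toy illustration only: ToyPropertyTypes is a hypothetical name, Python types stand in for PGX property types, and raising TypeError is an assumption (the text above only says that an exception is raised).

```python
class ToyPropertyTypes:
    """Toy model of property type deduction (not PGX code)."""

    def __init__(self):
        self.types = {}  # property name -> type fixed by the first value

    def set_property(self, key, value):
        # The first mention of a property fixes its type.
        declared = self.types.setdefault(key, type(value))
        if type(value) is not declared:
            raise TypeError(
                f"property {key!r} has type {declared.__name__}, "
                f"got {type(value).__name__}"
            )
        return self
```

In this model, set_property("intProp", 10) followed by set_property("intProp", "ten") fails on the second call, while further integer values are accepted.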

Similarly, the method add_edge() returns an EdgeBuilder, which works in the same way as the VertexBuilder. This is the interface of EdgeBuilder:

set_property(key, value)
set_label(label)

The method set_property() is already known from the VertexBuilder and works the same way. One additional method, set_label(), is provided, which can be used to set an optional edge label for each edge.

Building the graph

After all vertices, edges and property values have been added, the graph can be constructed using the build() method:

build(name=None)

Optionally a name for the graph can be specified using the first argument of build(). If no name is provided or the name is None, a unique name will be generated.

Builder interface

The GraphBuilder API is designed as a builder similar to the API for building graph configuration objects. Multiple calls to add vertices and edges and setting their properties can be chained together.

For example, adding three vertices and three edges, each with one property, can be written as follows:

graph = session \
    .create_graph_builder(edge_id_generation_strategy="user_ids") \
    .add_vertex(1).set_property("prop", 1) \
    .add_vertex(2).set_property("prop", 2) \
    .add_vertex(3).set_property("prop", 3) \
    .add_edge(src=1, dst=2, edge_id=1).set_property("cost", 0.1) \
    .add_edge(src=2, dst=3, edge_id=2).set_property("cost", 0.2) \
    .add_edge(src=3, dst=1, edge_id=3).set_property("cost", 0.3) \
    .build()
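
The same calls also work in a loop when the data comes from a list. The helper below is a sketch under stated assumptions: the function name build_weighted_graph and the (src, dst, cost) tuple layout are hypothetical, and only the methods already shown in this section are used.

```python
def build_weighted_graph(session, weighted_edges, name=None):
    """Build a graph from (src, dst, cost) tuples.

    Edge IDs are taken from the position in the list, so the builder is
    created with edge_id_generation_strategy="user_ids".
    """
    builder = session.create_graph_builder(
        edge_id_generation_strategy="user_ids"
    )
    for edge_id, (src, dst, cost) in enumerate(weighted_edges):
        # add_edge() returns an EdgeBuilder, so the property call can be chained.
        builder.add_edge(src=src, dst=dst, edge_id=edge_id) \
            .set_property("cost", cost)
    return builder.build(name)
```

A call such as build_weighted_graph(session, [(1, 2, 0.1), (2, 3, 0.2), (3, 1, 0.3)]) would produce the same edges and costs as the chained example, relying on add_edge() to create the vertices on the fly (without the "prop" vertex property).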

Modifying Loaded Graphs (GraphChangeSet)

A loaded graph can be modified using a graph change set, which is a concept similar to the graph builder, but with extended capabilities for modifying and removing vertices and edges. The basic workflow for changing a graph is:

  1. Get a graph change set for the graph that needs to be modified

  2. Apply changes to the graph change set

  3. Create a fixed PgxGraph out of the graph change set

The class PgxGraph offers the following method to create an empty graph change set:

# Get a graph change set for the graph that needs to be modified
change_set = graph.create_change_set(
    vertex_id_generation_strategy="user_ids",
    edge_id_generation_strategy="user_ids"
)

The GraphChangeSet interface derives directly from GraphBuilder, which means that all operations that are available in the graph builder are also available in the graph change set. Additionally, methods for updating and removing vertices and edges are available.

Updating vertices and edges

The following methods are available for updating vertices and edges:

change_set.update_vertex(vertex)
change_set.update_edge(edge)

To update vertices, update_vertex() needs to be called with the ID of the vertex that is updated as first argument. The method returns a VertexModifier which offers the same methods for changing properties as the VertexBuilder. Similarly the update_edge() method takes the ID of the edge that is updated as first argument and returns an EdgeModifier. Note that if the vertex or edge ID does not exist in the original graph, an exception is thrown as soon as the graph is built.

Removing vertices and edges

change_set.remove_vertex(vertex)
change_set.remove_edge(edge)

Vertices can be removed using the remove_vertex() method, which takes the ID of the vertex to be removed as first argument. A vertex with the given ID needs to exist in the original graph, otherwise an exception is thrown once the graph is built. Additionally, this call will remove all adjacent edges of this vertex. The adjacent edges will be removed once the graph is built. Edges can be removed similarly with the remove_edge() method, which takes an edge ID as the first argument instead. Likewise, an edge with the given ID needs to exist in the original graph.
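
The deferred behaviour (removals are only recorded at first, then validated and applied at build time) can be illustrated with a toy model. This is not the PGX implementation: the class ToyChangeSet, its constructor arguments, and the use of KeyError are assumptions made for this sketch.

```python
class ToyChangeSet:
    """Toy model of deferred validation in a change set (not PGX code)."""

    def __init__(self, vertices, edges):
        self._vertices = set(vertices)
        self._edges = dict(edges)  # edge_id -> (src, dst)
        self._removed_vertices = []
        self._removed_edges = []

    def remove_vertex(self, vertex_id):
        # Only recorded here; nothing is checked until build().
        self._removed_vertices.append(vertex_id)
        return self

    def remove_edge(self, edge_id):
        self._removed_edges.append(edge_id)
        return self

    def build(self):
        # Validation happens now, against the original graph.
        for v in self._removed_vertices:
            if v not in self._vertices:
                raise KeyError(f"vertex {v!r} does not exist")
        for e in self._removed_edges:
            if e not in self._edges:
                raise KeyError(f"edge {e!r} does not exist")
        gone = set(self._removed_vertices)
        vertices = self._vertices - gone
        # Edges adjacent to a removed vertex disappear as well.
        edges = {
            e: (s, d) for e, (s, d) in self._edges.items()
            if e not in self._removed_edges and s not in gone and d not in gone
        }
        return vertices, edges
```

In this model, ToyChangeSet({1}, {}).remove_vertex(99) succeeds, and the error only surfaces when build() is called.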

Partitioned graphs

It is possible to update partitioned graphs using a GraphChangeSet. The following limitations apply:

  • Only changesets with vertex_id_generation_strategy and edge_id_generation_strategy both set to user_ids are supported.

  • It is not possible to add new properties.

  • It is also not possible to add new label values to vertices or edges.

  • A tight coupling exists between vertex/edge labels and the original data source providers:
    • When adding a vertex (resp. edge), you must specify all the labels associated with the vertex provider (resp. edge provider) to which it should be added.

    • Additionally, for edges, the source and destination vertices must have all the labels associated with the respective vertex providers they are in.

Example: Given a graph that was loaded from the vertex providers “student” and “university”, and the edge providers “knows” (from “student” to “student”) and “studiesAt” (from “student” to “university”):

The following example shows a well-formed changeset:

change_set = graph.create_change_set(
    vertex_id_generation_strategy="user_ids",
    edge_id_generation_strategy="user_ids"
)
src_vertex = change_set.add_vertex(123) \
    .add_label("student") \
    .set_property("name", "Jane")

dst_vertex = change_set.add_vertex(125) \
    .add_label("university") \
    .set_property("location", "SF")

change_set.add_edge(
    src_vertex,
    dst_vertex,
    edge_id=5
).set_label("studiesAt")
change_set.remove_vertex(99)
change_set.remove_edge(3)
new_graph = change_set.build()

The following examples are invalid:

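# "person" is not a label of any vertex provider; "age" is presumably a new property
change_set.add_vertex(1).add_label("person")
change_set.add_vertex(2).add_label("student").set_property("age", 23)

# the vertex and the edge are added without the labels of their providers
change_set.add_vertex(3)
change_set.add_edge(src_vertex, dst_vertex, 4)

# the labels do not match the providers ("knows" connects "student" to "student")
src_vertex = change_set.add_vertex(123).add_label("person")
dst_vertex = change_set.add_vertex("UCSF").add_label("university")
change_set.add_edge(src_vertex, dst_vertex, 5).add_label("knows")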
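
The label rules can be checked before anything is added to the changeset. The following pre-check is a sketch only and not part of PGX: the names VERTEX_LABELS, EDGE_ENDPOINTS, check_vertex and check_edge are hypothetical, and it assumes that every provider in the example carries exactly one label equal to its name.

```python
# Hypothetical description of the providers from the example above.
VERTEX_LABELS = {"student", "university"}
EDGE_ENDPOINTS = {
    "knows": ("student", "student"),
    "studiesAt": ("student", "university"),
}


def check_vertex(label):
    """Return a list of problems for a vertex that is about to be added."""
    if label is None:
        return ["vertex has no label"]
    if label not in VERTEX_LABELS:
        return [f"unknown vertex label {label!r}"]
    return []


def check_edge(label, src_label, dst_label):
    """Return a list of problems for an edge that is about to be added."""
    if label is None:
        return ["edge has no label"]
    if label not in EDGE_ENDPOINTS:
        return [f"unknown edge label {label!r}"]
    expected = EDGE_ENDPOINTS[label]
    if (src_label, dst_label) != expected:
        return [f"{label!r} must connect {expected[0]!r} to {expected[1]!r}"]
    return []
```

An empty list means the entity passes the label rules of this sketch; for instance, check_edge("studiesAt", "student", "university") returns no problems, while check_vertex("person") reports an unknown label.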

Building the graph

After all changes have been made to the graph change set, the graph can be constructed using the same build methods as used by the graph builder. Note that building a graph change set will create a new copy of the graph with all the modifications applied. The changes will not be applied in-place.
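
Since building never modifies the original graph, a change can be wrapped in a function that returns the new graph and leaves its input untouched. The helper below is a minimal sketch: the function name remove_vertices is hypothetical, and it uses only the change set methods described in this section.

```python
def remove_vertices(graph, vertex_ids):
    """Return a new graph without the given vertices; `graph` is left as is."""
    change_set = graph.create_change_set(
        vertex_id_generation_strategy="user_ids",
        edge_id_generation_strategy="user_ids",
    )
    for vertex_id in vertex_ids:
        # Adjacent edges are removed together with each vertex at build time.
        change_set.remove_vertex(vertex_id)
    # build() creates a modified copy; the changes are not applied in-place.
    return change_set.build()
```

After smaller = remove_vertices(graph, [99]), both graph and smaller are available, with graph still containing vertex 99.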