Graph Builder and Graph Change Set
Creating Graphs from Scratch (GraphBuilder)
To create a graph from scratch, the user first needs to acquire a graph builder, which accumulates all of the new vertices and edges. The basic workflow for creating graphs from scratch is:
- Acquire a modifiable graph builder
- Add vertices and edges to the graph builder
- Create a fixed PgxGraph out of the accumulated changes
The following method in PgxSession can be used to create a graph builder:
builder = session.create_graph_builder(
    id_type="string",
    vertex_id_generation_strategy="user_ids",
    edge_id_generation_strategy="user_ids"
)
Called without arguments, create_graph_builder() creates a default graph builder with the vertex ID type integer.
The optional id_type argument defines the vertex ID type of the resulting GraphBuilder instance.
The optional vertex_id_generation_strategy and edge_id_generation_strategy arguments define the ID generation strategy for vertices and edges, respectively.
The GraphBuilder class itself has the following methods for adding vertices and edges, as well as for building the resulting graph.
graph_builder = session.create_graph_builder(
    edge_id_generation_strategy="user_ids")
graph_builder.add_vertex(0)  # adding a vertex
graph_builder.add_vertex(1)
graph_builder.add_edge(src=0, dst=1, edge_id=7)  # adding an edge
graph_builder.reset_edge(7)  # resetting an edge
graph_builder.reset_vertex(0)  # resetting a vertex
Adding vertices and edges
PGX supports two different ID generation strategies for adding vertices and edges to a graph builder, selected with the vertex_id_generation_strategy and edge_id_generation_strategy parameters: user_ids or auto_generated.
In the user_ids strategy, the user is responsible for passing the desired ID for the newly added entity: as the first parameter of add_vertex(), or as the edge_id argument of add_edge().
The chosen ID must be unique in the graph (i.e., no other entity with the same ID may exist in the graph already).
In the auto_generated strategy, the user must omit the ID argument of add_vertex() or add_edge(), and PGX takes care of generating a unique ID for the newly added entity.
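The two strategies can be illustrated with a small, self-contained sketch. The IdAllocator class below is hypothetical and is not part of PGX; it only models the rules described above. The sequential counter used for auto_generated is an assumption, since the text does not say how PGX chooses generated IDs.

```python
import itertools


class IdAllocator:
    """Hypothetical helper that models the two ID strategies (not PGX code)."""

    def __init__(self, strategy="user_ids"):
        if strategy not in ("user_ids", "auto_generated"):
            raise ValueError(f"unknown strategy: {strategy}")
        self.strategy = strategy
        self.used = set()
        self._counter = itertools.count()  # assumption: simple sequential IDs

    def allocate(self, requested_id=None):
        if self.strategy == "user_ids":
            # The caller must supply an ID, and it must not be taken already.
            if requested_id is None:
                raise ValueError("user_ids requires an explicit ID")
            if requested_id in self.used:
                raise ValueError(f"ID {requested_id!r} already exists")
            new_id = requested_id
        else:
            # The caller must not supply an ID; one is generated instead.
            if requested_id is not None:
                raise ValueError("auto_generated requires the ID to be omitted")
            new_id = next(self._counter)
        self.used.add(new_id)
        return new_id


user_ids = IdAllocator("user_ids")
user_ids.allocate(7)          # accepted: 7 was free
generated = IdAllocator("auto_generated")
first, second = generated.allocate(), generated.allocate()
```

In this sketch a duplicate or missing ID under user_ids, or an explicit ID under auto_generated, is rejected with a ValueError; the exception type PGX itself raises may differ.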
Edges can be added with the add_edge() method.
Like a vertex_id, an edge_id needs to be unique.
The first two arguments of add_edge() describe the source and destination vertices, respectively.
There are two ways to specify these vertices:
- By their vertex ID
- By an instance of a VertexBuilder which was acquired previously with a call to add_vertex()
The second method can be useful if the vertex IDs are particularly complicated or long. For example:
builder = session.create_graph_builder(
    id_type="string",
    edge_id_generation_strategy="user_ids"
)
v1 = builder.add_vertex("long named vertex 1")
v2 = builder.add_vertex("long named vertex 2")
builder.add_edge(v1, v2, 0)
If add_edge() is called with vertices that have not been created with a call to add_vertex() before, they will be added on the fly.
Resetting added vertices and edges
Added vertices and edges can be removed from the graph builder by calling one of the appropriate reset methods.
Calling reset_vertex() will remove the vertex with the given vertex_id or the given VertexBuilder instance and all connected edges from the graph builder.
Similarly, calling reset_edge() will remove the edge with the given edge_id from the graph builder.
After a vertex or edge has been reset, the vertex_id or edge_id is again free to use for this graph builder instance.
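The reset behaviour can be sketched with a toy in-memory builder. ToyBuilder is a hypothetical class written only for illustration and does not reproduce PGX internals; treating a reset of an unknown ID as a no-op is an assumption of this sketch.

```python
class ToyBuilder:
    """Hypothetical in-memory model of add/reset semantics (not PGX code)."""

    def __init__(self):
        self.vertices = set()
        self.edges = {}  # edge_id -> (src, dst)

    def add_vertex(self, vertex_id):
        if vertex_id in self.vertices:
            raise ValueError(f"vertex {vertex_id!r} already exists")
        self.vertices.add(vertex_id)

    def add_edge(self, src, dst, edge_id):
        if edge_id in self.edges:
            raise ValueError(f"edge {edge_id!r} already exists")
        # Endpoints that were not added before are created on the fly.
        self.vertices.update((src, dst))
        self.edges[edge_id] = (src, dst)

    def reset_vertex(self, vertex_id):
        # Remove the vertex together with every edge connected to it.
        self.vertices.discard(vertex_id)
        self.edges = {
            eid: (s, d)
            for eid, (s, d) in self.edges.items()
            if vertex_id not in (s, d)
        }

    def reset_edge(self, edge_id):
        self.edges.pop(edge_id, None)


builder = ToyBuilder()
builder.add_vertex(0)
builder.add_vertex(1)
builder.add_edge(0, 1, edge_id=7)
builder.reset_vertex(0)            # also drops edge 7
builder.add_vertex(0)              # ID 0 is free again
builder.add_edge(0, 1, edge_id=7)  # edge ID 7 is free again
```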
Adding properties and labels
The method add_vertex() will return a VertexBuilder for that specific vertex. It can be used to add property values to the vertex.
The VertexBuilder provides the following methods:
set_property(key, value)
add_label(label)
Property values can be added to the given vertex using the set_property() method. It takes the name of the property as first argument and the value of the property as the second.
Labels can be added using the add_label() method, which takes the new label as argument.
Properties are not required to be declared before values can be assigned to them. The first mention of a property will also define its type, which is deduced from the value.
Example:
builder = session.create_graph_builder()
vertex = builder.add_vertex(1)
vertex.set_property("intProp", 10)
vertex.set_property("strProp", "str")
The property intProp will now have the type integer for this graph builder, as the first value assigned to it was an integer.
Similarly, the property strProp will have the type string.
Once the type of a property has been defined, it cannot be changed anymore, and any calls to set_property() with a value of the wrong type will result in an exception.
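The "first value fixes the type" rule can be modelled in a few lines of plain Python. TypedProperties is a hypothetical class used only to illustrate the rule; it compares Python types directly and does not model how PGX maps Python values to its own property types.

```python
class TypedProperties:
    """Hypothetical model of property type deduction (not PGX code)."""

    def __init__(self):
        self.types = {}   # property name -> type fixed by the first value
        self.values = {}  # (entity_id, property name) -> value

    def set_property(self, entity_id, key, value):
        # The first assignment records the type; later ones must match it.
        expected = self.types.setdefault(key, type(value))
        if type(value) is not expected:
            raise TypeError(
                f"property {key!r} has type {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        self.values[(entity_id, key)] = value


props = TypedProperties()
props.set_property(1, "intProp", 10)     # intProp is now an int property
props.set_property(1, "strProp", "str")  # strProp is now a str property
try:
    props.set_property(2, "intProp", "ten")
except TypeError as error:
    print(error)
```

The TypeError here stands in for whatever exception PGX raises on a type mismatch.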
Similarly, the method add_edge() returns an EdgeBuilder, which works in the same way as the VertexBuilder.
This is the interface of EdgeBuilder:
set_property(key, value)
set_label(label)
The method set_property() is already known from the VertexBuilder and works in the same way.
One additional method, set_label(), is provided, which can be used to set an optional edge label for each edge.
Building the graph
After all vertices, edges and property values have been added, the graph can be constructed using the build() method:
build(name=None)
Optionally, a name for the graph can be specified using the first argument of build(). If no name is provided or the name is None, a unique name will be generated.
Builder interface
The GraphBuilder API is designed as a builder similar to the API for building graph configuration objects. Multiple calls to add vertices and edges and setting their properties can be chained together.
For example, adding three vertices and three edges, each with one property, can be written as follows:
graph = session \
    .create_graph_builder(edge_id_generation_strategy="user_ids") \
    .add_vertex(1).set_property("prop", 1) \
    .add_vertex(2).set_property("prop", 2) \
    .add_vertex(3).set_property("prop", 3) \
    .add_edge(1, 2, edge_id=1).set_property("cost", 0.1) \
    .add_edge(2, 3, edge_id=2).set_property("cost", 0.2) \
    .add_edge(3, 1, edge_id=3).set_property("cost", 0.3) \
    .build()
Modifying Loaded Graphs (GraphChangeSet)
A loaded graph can be modified using a graph change set, which is a concept similar to the graph builder, but with extended capabilities for modifying and removing vertices and edges. The basic workflow for changing a graph is:
- Get a graph change set for the graph that needs to be modified
- Apply changes to the graph change set
- Create a fixed PgxGraph out of the graph change set
The class PgxGraph offers the following method to create an empty graph change set:
# Get a graph change set for the graph that needs to be modified
change_set = graph.create_change_set(
    vertex_id_generation_strategy="user_ids",
    edge_id_generation_strategy="user_ids"
)
The GraphChangeSet interface derives directly from GraphBuilder, which means that all operations that are available in the graph builder are also available in the graph change set.
Additionally, methods for updating and removing vertices and edges are available.
Updating vertices and edges
The following methods are available for updating vertices and edges:
change_set.update_vertex(vertex)
change_set.update_edge(edge)
To update a vertex, update_vertex() needs to be called with the ID of the vertex to update as first argument.
The method returns a VertexModifier, which offers the same methods for changing properties as the VertexBuilder.
Similarly, the update_edge() method takes the ID of the edge to update as first argument and returns an EdgeModifier.
Note that if the vertex or edge ID does not exist in the original graph, an exception is thrown as soon as the graph is built.
Removing vertices and edges
change_set.remove_vertex(vertex)
change_set.remove_edge(edge)
Vertices can be removed using the remove_vertex() method, which takes the ID of the vertex to be removed as first argument.
A vertex with the given ID needs to exist in the original graph, otherwise an exception is thrown once the graph is built.
Additionally, this call will remove all adjacent edges of this vertex. The adjacent edges will be removed once the graph is built.
Edges can be removed similarly with the remove_edge() method, which takes an edge ID as the first argument instead.
Likewise, an edge with the given ID needs to exist in the original graph.
Partitioned graphs
It is possible to update partitioned graphs using a GraphChangeSet. The following limitations apply:
- Only change sets with vertex_id_generation_strategy and edge_id_generation_strategy set to user_ids are supported.
- It is not possible to add new properties.
- It is also not possible to add new label values to vertices or edges.
- A tight coupling between vertex/edge labels and the original data source providers exists:
  - When adding a vertex (resp. edge), you must indicate all the labels that are associated with the vertex provider (resp. edge provider) to which it should be added.
  - Additionally, for edges, the source and destination vertices must have all the labels associated with the respective vertex provider they are in.
Example: Given a graph that was loaded from the vertex providers “student” and “university”, and the edge providers “knows” (from “student” to “student”) and “studiesAt” (from “student” to “university”):
The following example shows a well-formed changeset:
change_set = graph.create_change_set(
    vertex_id_generation_strategy="user_ids",
    edge_id_generation_strategy="user_ids"
)
src_vertex = change_set.add_vertex(123) \
    .add_label("student") \
    .set_property("name", "Jane")

dst_vertex = change_set.add_vertex(125) \
    .add_label("university") \
    .set_property("location", "SF")

change_set.add_edge(
    src_vertex,
    dst_vertex,
    edge_id=5
).set_label("studiesAt")
change_set.remove_vertex(99)
change_set.remove_edge(3)
new_graph = change_set.build()
The following examples are invalid:
change_set.add_vertex(1).add_label("person")
change_set.add_vertex(2).add_label("student").set_property("age", 23)

change_set.add_vertex(3)
change_set.add_edge(src_vertex, dst_vertex, 4)

src_vertex = change_set.add_vertex(123).add_label("person")
dst_vertex = change_set.add_vertex("UCSF").add_label("university")
change_set.add_edge(src_vertex, dst_vertex, 5).set_label("knows")
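The label rules for partitioned graphs can be sketched as a simple check against the providers named in the example (student, university, knows, studiesAt). The dictionaries and function names below are hypothetical, and the sketch assumes that each provider is associated with exactly one label, which is a simplification of the rules above.

```python
# Providers from the example; each provider is assumed to carry one label.
VERTEX_PROVIDERS = {"student", "university"}
EDGE_PROVIDERS = {
    "knows": ("student", "student"),
    "studiesAt": ("student", "university"),
}


def vertex_label_is_valid(label):
    """A new vertex must carry the label of an existing vertex provider."""
    return label in VERTEX_PROVIDERS


def edge_is_valid(label, src_label, dst_label):
    """A new edge must match an edge provider and its endpoint providers."""
    return EDGE_PROVIDERS.get(label) == (src_label, dst_label)


print(vertex_label_is_valid("student"))                        # known provider
print(vertex_label_is_valid("person"))                         # unknown label
print(edge_is_valid("studiesAt", "student", "university"))     # matches provider
print(edge_is_valid("knows", "person", "university"))          # endpoints do not match
```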
Building the graph
After all changes have been made to the graph change set, the graph can be constructed using the same build() method as used by the graph builder.
Note that building a graph change set will create a new copy of the graph with all the modifications applied.
The changes will not be applied in-place.
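The two properties described for change sets, validation deferred until build time and a new copy instead of in-place changes, can be illustrated with a toy model. ToyChangeSet is hypothetical, handles vertices only, and represents a graph as a plain dictionary of vertex ID to properties; it is a sketch of the semantics, not of the PGX implementation.

```python
class ToyChangeSet:
    """Hypothetical change set over a dict of vertex_id -> properties."""

    def __init__(self, graph):
        self._original = graph
        self._ops = []  # recorded changes, validated and applied in build()

    def update_vertex(self, vertex_id, **properties):
        self._ops.append(("update", vertex_id, properties))
        return self

    def remove_vertex(self, vertex_id):
        self._ops.append(("remove", vertex_id, None))
        return self

    def build(self):
        # Unknown IDs are only reported here, not when the change is recorded.
        for _op, vertex_id, _props in self._ops:
            if vertex_id not in self._original:
                raise KeyError(f"vertex {vertex_id!r} is not in the original graph")
        # Work on a copy so that the original graph is left untouched.
        result = {vid: dict(props) for vid, props in self._original.items()}
        for op, vertex_id, properties in self._ops:
            if op == "update" and vertex_id in result:
                result[vertex_id].update(properties)
            elif op == "remove":
                result.pop(vertex_id, None)
        return result


original = {1: {"name": "Jane"}, 2: {"name": "Bob"}}
updated = (
    ToyChangeSet(original)
    .update_vertex(1, name="Janet")
    .remove_vertex(2)
    .build()
)
```

After this runs, updated holds the modified copy while original still contains both vertices with their initial properties.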