.. _builder-changeset:

**********************************
Graph Builder and Graph Change Set
**********************************

Creating Graphs from Scratch (GraphBuilder)
-------------------------------------------

To create a graph from scratch, first acquire a graph builder, which accumulates all of the new vertices and edges.

The basic workflow for creating graphs from scratch is:

1. Acquire a modifiable graph builder
2. Add vertices and edges to the graph builder
3. Create a fixed :class:`PgxGraph` out of the accumulated changes

The :meth:`create_graph_builder` method in :class:`PgxSession` creates a graph builder:

.. code-block:: python
    :linenos:

    builder = session.create_graph_builder(
        id_type="string",
        vertex_id_generation_strategy="user_ids",
        edge_id_generation_strategy="user_ids"
    )

All arguments are optional. Called without arguments, the method creates a default graph builder with the vertex ID type ``integer``. The ``id_type`` argument defines the vertex ID type of the resulting :class:`GraphBuilder` instance. The ``vertex_id_generation_strategy`` and ``edge_id_generation_strategy`` arguments define the ID generation strategy for vertices and edges, respectively.

The :class:`GraphBuilder` class itself has the following methods for adding and resetting vertices and edges.

.. code-block:: python
    :linenos:

    graph_builder = session.create_graph_builder(
        edge_id_generation_strategy="user_ids")
    graph_builder.add_vertex(0)  # adding a vertex
    graph_builder.add_vertex(1)
    graph_builder.add_edge(src=0, dst=1, edge_id=7)  # adding an edge
    graph_builder.reset_edge(7)  # resetting an edge
    graph_builder.reset_vertex(0)  # resetting a vertex

Adding vertices and edges
~~~~~~~~~~~~~~~~~~~~~~~~~

PGX supports two different ID generation strategies for adding vertices and edges to a graph builder, selected with the ``vertex_id_generation_strategy`` and ``edge_id_generation_strategy`` parameters: ``user_ids`` or ``auto_generated``.

In the ``user_ids`` strategy, the user is responsible for passing the desired ID for the newly added entity to the :meth:`add_vertex()` or :meth:`add_edge()` method. The chosen ID must be unique in the graph (i.e., no other entity with the same ID may exist in the graph already). In the ``auto_generated`` strategy, the user *must* omit the ID argument of the :meth:`add_vertex()` or :meth:`add_edge()` method, and PGX takes care of generating a unique ID for the newly added entity.

Edges can be added with the :meth:`add_edge` method. Like a ``vertex_id``, an ``edge_id`` needs to be unique. The ``src`` and ``dst`` arguments of :meth:`add_edge` describe the source and destination vertices, respectively. There are two ways to specify these vertices:

* By their vertex ID
* By an instance of a :class:`VertexBuilder` which was acquired previously with a call to :meth:`add_vertex`

The second method can be useful if the vertex IDs are particularly complicated or long. For example:

.. code-block:: python
    :linenos:

    builder = session.create_graph_builder(
        id_type="string",
        edge_id_generation_strategy="user_ids"
    )
    v1 = builder.add_vertex("long named vertex 1")
    v2 = builder.add_vertex("long named vertex 2")
    builder.add_edge(v1, v2, 0)

If :meth:`add_edge` is called with vertices that have not been created with a previous call to :meth:`add_vertex`, they will be added on the fly.

Resetting added vertices and edges
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Added vertices and edges can be removed from the graph builder by calling the appropriate reset method. Calling :meth:`reset_vertex` will remove the vertex identified by the given ``vertex_id`` or :class:`VertexBuilder` instance, along with all connected edges, from the graph builder. Similarly, calling :meth:`reset_edge` will remove the edge with the given ``edge_id`` from the graph builder. After a vertex or edge has been reset, its ``vertex_id`` or ``edge_id`` is free to use again for this graph builder instance.

Adding properties and labels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The method :meth:`add_vertex` returns a :class:`VertexBuilder` for that specific vertex. It can be used to add property values to the vertex. The :class:`VertexBuilder` provides the following methods:

.. code-block:: python
    :linenos:

    set_property(key, value)
    add_label(label)

Property values can be added to the given vertex using the :meth:`set_property` method. It takes the name of the property as the first argument and the value of the property as the second. Labels can be added using the :meth:`add_label` method, which takes the new label as its argument. Properties do not need to be declared before values are assigned to them. The first mention of a property also defines its type, which is deduced from the value.

Example:

.. code-block:: python
    :linenos:

    builder = session.create_graph_builder()
    vertex = builder.add_vertex(1)
    vertex.set_property("intProp", 10)
    vertex.set_property("strProp", "str")

The property ``intProp`` will now have the type ``integer`` for this graph builder, as the first value assigned to it was an integer. Similarly, the property ``strProp`` will have the type ``string``. Once the type of a property has been defined, it cannot be changed, and any call to :meth:`set_property` with a value of the wrong type will result in an exception.

Similarly, the method :meth:`add_edge` returns an :class:`EdgeBuilder`, which works in the same way as the :class:`VertexBuilder`. This is the interface of :class:`EdgeBuilder`:

.. code-block:: python
    :linenos:

    set_property(key, value)
    set_label(label)

The :meth:`set_property` method works in the same way as on the :class:`VertexBuilder`. One additional method, :meth:`set_label`, is provided, which can be used to set an optional edge label for each edge.

Building the graph
~~~~~~~~~~~~~~~~~~

After all vertices, edges and property values have been added, the graph can be constructed using the :meth:`build` method:

.. code-block:: python
    :linenos:

    build(name=None)

Optionally, a name for the graph can be specified using the first argument of :meth:`build`. If no name is provided or the name is ``None``, a unique name will be generated.

Builder interface
~~~~~~~~~~~~~~~~~

The GraphBuilder API is designed as a builder, similar to the API for building graph configuration objects. Multiple calls that add vertices and edges and set their properties can be chained together. For example, three vertices and three edges, each with one property, can be added as follows:

.. code-block:: python
    :linenos:

    graph = session \
        .create_graph_builder(edge_id_generation_strategy="user_ids") \
        .add_vertex(1).set_property("prop", 1) \
        .add_vertex(2).set_property("prop", 2) \
        .add_vertex(3).set_property("prop", 3) \
        .add_edge(1, 2, edge_id=1).set_property("cost", 0.1) \
        .add_edge(2, 3, edge_id=2).set_property("cost", 0.2) \
        .add_edge(3, 1, edge_id=3).set_property("cost", 0.3) \
        .build()

Modifying Loaded Graphs (GraphChangeSet)
----------------------------------------

A loaded graph can be modified using a graph change set, which is a concept similar to the graph builder, but with extended capabilities for modifying and removing vertices and edges.

The basic workflow for changing a graph is:

1. Get a graph change set for the graph that needs to be modified
2. Apply changes to the graph change set
3. Create a fixed :class:`PgxGraph` out of the graph change set

The class :class:`PgxGraph` offers the following method to create an empty graph change set:

.. code-block:: python
    :linenos:

    # Get a graph change set for the graph that needs to be modified
    change_set = graph.create_change_set(
        vertex_id_generation_strategy='user_ids',
        edge_id_generation_strategy='user_ids'
    )

The :class:`GraphChangeSet` interface derives directly from :class:`GraphBuilder`, which means that all operations that are available in the graph builder are also available in the graph change set. Additionally, methods for updating and removing vertices and edges are available.

Updating vertices and edges
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following methods are available for updating vertices and edges:

.. code-block:: python
    :linenos:

    change_set.update_vertex(vertex)
    change_set.update_edge(edge)

To update a vertex, call :meth:`update_vertex` with the ID of the vertex to update as the first argument. The method returns a :class:`VertexModifier`, which offers the same methods for changing properties as the :class:`VertexBuilder`.
Similarly, the :meth:`update_edge` method takes the ID of the edge to update as the first argument and returns an :class:`EdgeModifier`. Note that if the vertex or edge ID does not exist in the original graph, an exception is thrown as soon as the graph is built.

Removing vertices and edges
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python
    :linenos:

    change_set.remove_vertex(vertex)
    change_set.remove_edge(edge)

Vertices can be removed using the :meth:`remove_vertex` method, which takes the ID of the vertex to be removed as the first argument. A vertex with the given ID needs to exist in the original graph, otherwise an exception is thrown once the graph is built. Additionally, this call will remove **all adjacent edges** of this vertex. The adjacent edges will be removed once the graph is built.

Edges can be removed similarly with the :meth:`remove_edge` method, which takes an edge ID as the first argument instead. Likewise, an edge with the given ID needs to exist in the original graph.

Partitioned graphs
~~~~~~~~~~~~~~~~~~

It is possible to update partitioned graphs using a :class:`GraphChangeSet`. The following limitations apply:

- Only changesets with ``vertex_id_generation_strategy`` and ``edge_id_generation_strategy`` set to ``user_ids`` are supported.
- It is not possible to add new properties.
- It is also not possible to add new label values to vertices or edges.
- A tight coupling between the vertex/edge labels and the original data source providers exists:

  - When adding a vertex (resp. edge), you *must* indicate all the labels that are associated with the vertex provider (resp. edge provider) to which it should be added.
  - Additionally, for edges, the source and destination vertices must have all the labels associated with the respective vertex provider they are in.

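The label rule above can be checked on the client side before a changeset is submitted. The following sketch is plain Python and not part of the PGX API: the helper name ``find_provider`` and the ``provider_labels`` mapping are hypothetical, and it assumes you already know which labels each provider carries.

```python
# Hypothetical mapping from provider name to the labels that provider carries.
provider_labels = {
    "student": {"student"},
    "university": {"university"},
}


def find_provider(labels, providers):
    """Return the provider whose label set equals `labels`, or None."""
    wanted = frozenset(labels)
    for name, required in providers.items():
        # All labels of the provider must be given, and no others.
        if wanted == frozenset(required):
            return name
    return None


print(find_provider(["student"], provider_labels))  # student
print(find_provider(["person"], provider_labels))   # None (unknown label)
print(find_provider([], provider_labels))           # None (no label given)
```

A vertex whose labels match no provider would be rejected, so a check like this can surface the problem before the graph is built.
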

Example: Consider a graph that was loaded from the vertex providers "student" and "university", and the edge providers "knows" (from "student" to "student") and "studiesAt" (from "student" to "university").

The following example shows a well-formed changeset:

.. code-block:: python
    :linenos:

    change_set = graph.create_change_set(
        vertex_id_generation_strategy='user_ids',
        edge_id_generation_strategy='user_ids'
    )
    src_vertex = change_set.add_vertex(123) \
        .add_label("student") \
        .set_property("name", "Jane")
    dst_vertex = change_set.add_vertex(125) \
        .add_label("university") \
        .set_property("location", "SF")
    change_set.add_edge(
        src_vertex, dst_vertex, edge_id=5
    ).set_label("studiesAt")
    change_set.remove_vertex(99)
    change_set.remove_edge(3)
    new_graph = change_set.build()

The following examples are invalid:

.. code-block:: python
    :linenos:

    # no vertex provider carries the label "person"
    change_set.add_vertex(1).add_label("person")
    # invalid if "age" is not an existing property: new properties cannot be added
    change_set.add_vertex(2).add_label("student").set_property("age", 23)
    # no vertex label given
    change_set.add_vertex(3)
    # no edge label given
    change_set.add_edge(src_vertex, dst_vertex, 4)
    # "knows" connects "student" to "student", not "person" to "university"
    src_vertex = change_set.add_vertex(123).add_label("person")
    dst_vertex = change_set.add_vertex("UCSF").add_label("university")
    change_set.add_edge(src_vertex, dst_vertex, 5).set_label("knows")

Building the graph
~~~~~~~~~~~~~~~~~~

After all changes have been made to the graph change set, the graph can be constructed using the same ``build`` method as used by the graph builder. Note that building a graph change set creates a new copy of the graph with all the modifications applied. The changes are **not** applied in-place.
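
Because the changes are not applied in-place, the caller ends up with two graph handles: the original and the modified copy. The following is a minimal sketch that uses only the calls shown in this section. The function name ``with_extra_vertex`` and its parameters are illustrative assumptions, and ``graph`` is assumed to be an already loaded :class:`PgxGraph`.

```python
def with_extra_vertex(graph, vertex_id, prop_name, prop_value):
    """Return a copy of `graph` with one more vertex (hypothetical helper)."""
    change_set = graph.create_change_set(
        vertex_id_generation_strategy="user_ids",
        edge_id_generation_strategy="user_ids",
    )
    # Queue the change; nothing is applied to `graph` at this point.
    change_set.add_vertex(vertex_id).set_property(prop_name, prop_value)
    # build() returns the modified copy and leaves `graph` as it was.
    return change_set.build()
```

After a call such as ``new_graph = with_extra_vertex(graph, 42, "prop", 1)``, the original ``graph`` can still be used unchanged alongside ``new_graph``.
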