************************
Graph Management in PGX
************************

Graph Loading
-------------

To analyze a graph with PGX, you must first read it into PGX. The following method in :class:`PgxSession`, as well as its blocking variants, can be used to load graphs into memory:

.. code-block:: python
    :linenos:

    # graph_path is the path to a graph config file
    graph = session.read_graph_with_properties(
        graph_path,
        max_age=9223372036854775807,
        max_age_time_unit='days',
        block_if_full=False,
        update_if_not_fresh=True,
        graph_name="my_graph"
    )

The first argument (``path`` to a graph config file or a parsed ``config`` object) is the meta-data of the graph to be read. The meta-data includes the following information:

  * Location of the graph data: file location and name, DB location and connection information, etc.
  * Format of the graph data: plain text formats, XML-based formats, binary formats, etc.
  * Types and names of the properties to be loaded
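
As a rough illustration of such meta-data, the sketch below builds a file-based graph config as a Python dictionary and serializes it to JSON. The file name and property names are hypothetical, and the field names (``uris``, ``format``, ``vertex_props``, ``edge_props``) are assumptions that should be checked against the graph config reference of the PGX version in use.

.. code-block:: python
    :linenos:

    import json

    # Hypothetical graph config: the file name and property names are made up,
    # and the field names are assumptions to verify against your PGX version.
    graph_config = {
        # Location of the graph data
        "uris": ["sample_graph.edge_list"],
        # Format of the graph data
        "format": "edge_list",
        # Types and names of the properties to be loaded
        "vertex_props": [{"name": "age", "type": "integer"}],
        "edge_props": [{"name": "weight", "type": "double"}],
    }

    config_json = json.dumps(graph_config, indent=2)
    print(config_json)

Each of the three entries in the list above maps to one or more keys of the dictionary.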

The ``update_if_not_fresh`` and ``max_age`` arguments fine-tune the age of the snapshot to be read. PGX returns an existing graph snapshot if the given graph specification was already loaded into memory by a different session, so ``max_age`` matters when reading from a database whose data might change frequently. If neither ``update_if_not_fresh`` nor ``max_age`` is specified, PGX favors cached data over reading new snapshots into memory.
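
The interplay of these two arguments can be pictured with a small stand-alone model. This is only one plausible reading of the paragraph above, not PGX's actual decision logic: the function ``needs_new_snapshot``, its parameters ``cached_age_seconds`` and ``source_has_newer_data``, and the unit table are all hypothetical.

.. code-block:: python
    :linenos:

    # Seconds per time unit; this small table is an illustrative assumption.
    SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


    def needs_new_snapshot(cached_age_seconds, max_age, max_age_time_unit,
                           source_has_newer_data, update_if_not_fresh):
        """Return True if a new snapshot should be read instead of reusing the cache."""
        if cached_age_seconds is None:
            # Nothing is cached yet, so the graph has to be read.
            return True
        if cached_age_seconds > max_age * SECONDS_PER_UNIT[max_age_time_unit]:
            # The cached snapshot is older than the caller tolerates.
            return True
        # The cache is young enough; refresh only if the source changed
        # and the caller asked for updates.
        return source_has_newer_data and update_if_not_fresh


    # A one-hour-old snapshot with the very large max_age from the loading example
    print(needs_new_snapshot(3600, 9223372036854775807, "days", False, True))
    # The same snapshot with a tolerance of one minute
    print(needs_new_snapshot(3600, 1, "minutes", False, False))

Under this model, a very large ``max_age`` means the cached snapshot is reused unless the data source changed and ``update_if_not_fresh`` is set.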

Graph Names
-----------

Graph names belong to a session-private namespace unless the graph is explicitly shared via the :meth:`publish_with_snapshots()` or :meth:`publish()` methods; at that point, the published graph name moves into the public namespace, which any session can see.
Names are unique within a given namespace, and methods throw an exception in case of a name clash.
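
The namespace rules can be sketched with a toy registry. This is a simplified model for illustration only, not how PGX stores names; ``GraphNames``, ``NameClashError`` and the session ids are hypothetical.

.. code-block:: python
    :linenos:

    class NameClashError(Exception):
        """Raised when a name is already taken in the target namespace."""


    class GraphNames:
        """Toy registry: one public namespace, one private namespace per session."""

        def __init__(self):
            self.public = {}
            self.private = {}  # session id -> {graph name: graph}

        def add_private(self, session_id, name, graph):
            names = self.private.setdefault(session_id, {})
            if name in names:
                raise NameClashError(f"{name!r} already exists in this session")
            names[name] = graph

        def publish(self, session_id, name):
            if name in self.public:
                raise NameClashError(f"{name!r} is already published")
            # The name leaves the private namespace and becomes public.
            self.public[name] = self.private[session_id].pop(name)

        def visible_names(self, session_id):
            return sorted(set(self.private.get(session_id, {})) | set(self.public))


    registry = GraphNames()
    registry.add_private("session_a", "my_graph", object())
    # No clash: each session has its own private namespace.
    registry.add_private("session_b", "my_graph", object())
    registry.publish("session_a", "my_graph")
    print(registry.visible_names("session_b"))

In this model, two sessions may use the same private name, but only one graph with that name can live in the public namespace.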

Graph Publishing
----------------

The :meth:`publish()` method in :class:`PgxGraph` publishes the currently selected snapshot of the graph. To make all snapshots of the graph visible to other sessions, use the :meth:`publish_with_snapshots()` method instead.

.. code-block:: python
    :linenos:

    graph.publish(vertex_properties=True, edge_properties=True)

You can publish specific properties using :class:`PgxProperty` methods.
Publishing properties requires the corresponding graph to be already published.

.. code-block:: python
    :linenos:

    # Create an edge property on the published graph, then publish it
    height = graph.create_edge_property('integer', 'height')
    height.publish()
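
Because publishing a property requires its graph to be published already, a small guard can make that precondition explicit. The helper below is a hypothetical sketch: ``publish_property`` is not part of the PGX API, it assumes ``is_published`` can be read as an attribute, and the stub objects only stand in for real :class:`PgxGraph` and :class:`PgxProperty` instances so that the example runs without a server.

.. code-block:: python
    :linenos:

    from types import SimpleNamespace


    def publish_property(graph, prop):
        """Publish ``prop`` only if ``graph`` has already been published."""
        # Assumption: is_published is readable as an attribute.
        if not graph.is_published:
            raise RuntimeError("publish the graph before publishing its properties")
        prop.publish()


    # Stand-in objects so that the sketch runs without a PGX server.
    publish_calls = []
    stub_graph = SimpleNamespace(is_published=True)
    stub_property = SimpleNamespace(publish=lambda: publish_calls.append("height"))

    publish_property(stub_graph, stub_property)
    print(publish_calls)

Failing early with a clear message is a design choice here; letting the call fail inside PGX would work as well.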

Checking if Graphs or Properties are Published
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Both :class:`PgxGraph` and :class:`PgxProperty` offer these methods:

.. code-block:: python
    :linenos:

    # Check if graph or properties are published
    graph.is_published

You can also check whether a graph has been published with its snapshots in a similar way:

.. code-block:: python
    :linenos:

    # Check if graph has been published with its snapshots
    graph.is_published_with_snapshots


Reading Loaded or Published Graphs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To check which graphs are currently loaded or published in a session, use the following method from :class:`PgxSession`:

.. code-block:: python
    :linenos:

    # Check which graphs are currently loaded or published in a session
    session.get_graphs()

The returned list contains the graph names in the given namespace.
It is also possible to reference one loaded/published graph with :class:`PgxSession` methods:

.. code-block:: python
    :linenos:

    # Look up a loaded or published graph by name
    session.get_graph('pgql_lang_test_graph_with_labels')
    session.get_graph('sample_vertices.csv')

Providing ``None`` for the ``namespace`` parameter, or calling a :meth:`get_graph()` variant without a :class:`Namespace` parameter, looks for a graph with the given name in both the private and the public namespace. If a graph with that name exists in both namespaces, the one in the private namespace is returned.
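
The lookup order can be summarized with a short stand-alone function. It models the rule described above with plain dictionaries and is not the PGX implementation; ``resolve_graph``, the string values ``"private"`` and ``"public"``, and returning ``None`` for a missing graph are all assumptions made for illustration.

.. code-block:: python
    :linenos:

    def resolve_graph(name, private_graphs, public_graphs, namespace=None):
        """Look up ``name``; with namespace=None the private namespace wins."""
        if namespace == "private":
            return private_graphs.get(name)
        if namespace == "public":
            return public_graphs.get(name)
        # No namespace given: search both, preferring the private one.
        if name in private_graphs:
            return private_graphs[name]
        return public_graphs.get(name)


    private_graphs = {"my_graph": "private copy"}
    public_graphs = {"my_graph": "published copy", "shared_graph": "published only"}

    print(resolve_graph("my_graph", private_graphs, public_graphs))
    print(resolve_graph("shared_graph", private_graphs, public_graphs))
    print(resolve_graph("my_graph", private_graphs, public_graphs, namespace="public"))

Passing an explicit namespace is the way to reach a published graph whose name is shadowed by a private one.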

If you invoke these methods multiple times with the same graph name, you get multiple different :class:`PgxGraph` objects, all pointing to the same graph. Any modification made through one of those objects (e.g. adding a property) is therefore visible through all of them:

.. code-block:: python
    :linenos:

    graph1 = session.get_graph("pgql_lang_test_graph_with_labels")
    # graph2 points to the same graph as graph1
    graph2 = session.get_graph("pgql_lang_test_graph_with_labels")

    graph1.create_vertex_property("boolean", "Bool_property")
    # returns the property just created
    graph2.get_vertex_property("Bool_property")

Graph Storing
-------------

The session can serialize a loaded graph instance to a file via the following method in the
:class:`PgxGraph` class:

.. code-block:: python
    :linenos:

    graph_config = graph.store(
        format="pgb",
        path="/tmp/myGraph.pgb",
        num_partitions=None,
        vertex_properties=True,
        edge_properties=True,
        overwrite=True
    )

The first two arguments, ``format`` and ``path``, specify the format and the location on the local file system to which the graph is written.
The ``overwrite`` argument determines whether an existing file is overwritten. It defaults to ``False`` if omitted.

The session can select which vertex or edge properties are stored with
the graph. The optional arguments ``vertex_properties`` and ``edge_properties`` can be used to specify a list
of vertex and edge properties. If these arguments are omitted, all properties are stored by default.

Finally, :meth:`store()` returns a :class:`GraphConfig` object, which contains the meta-data of the stored graph instance. That object can be used to read the serialized graph into memory at a later point. All :class:`GraphConfig` objects can themselves be serialized easily, as shown in the following example:

.. code-block:: python
    :linenos:

    graph_config = graph.store("pgb", "/tmp/myGraph.pgb", overwrite=True)
    # str() returns a JSON representation of the config object
    config_json = str(graph_config)
    with open("/tmp/myGraph.pgb.json", "w") as f:
        f.write(config_json)
    graph_config_2 = GraphConfigFactory.for_any_format().from_file_path(
        "/tmp/myGraph.pgb.json"
    )


.. _graph-deletion:

Graph Deletion
--------------

To reduce the memory usage of PGX, the session should drop unused :class:`PgxGraph` objects that it obtained via :meth:`PgxSession.get_graph()`
by invoking their :meth:`destroy()` method.
This step destroys not only the specified graph but also all of its associated properties, including *transient* properties.
In addition, all collections related to the graph instance (e.g. a :class:`VertexSet`) are destroyed automatically.
If a session holds multiple :class:`PgxGraph` objects referencing the same graph, invoking :meth:`destroy()` on *any* of them invalidates *all* the :class:`PgxGraph` objects referencing that graph, making any operation on those objects fail:

.. code-block:: python
    :linenos:

    graph1 = session.get_graph("sample_vertices.csv")

    # graph2 references the same graph as graph1
    graph2 = session.get_graph("sample_vertices.csv")

    graph1.destroy()
    # Neither reference is valid anymore, so both calls raise an exception
    for stale_graph in (graph1, graph2):
        try:
            stale_graph.get_vertex_properties()
        except Exception as error:
            print("operation on destroyed graph failed:", error)