This page presents the API used to load, publish, store and delete graphs. For examples of these operations and further details, see the child pages of this guide.
In order to perform graph analysis with PGX, the user must first
read a graph into PGX. The following methods in PgxSession
can be used to load graphs into memory:
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(String path)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(String path, String newGraphName)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(GraphConfig config)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(GraphConfig config, String newGraphName)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(GraphConfig config, boolean forceUpdateIfNotFresh)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(GraphConfig config, boolean forceUpdateIfNotFresh, String newGraphName)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(GraphConfig config, long maxAge, TimeUnit maxAgeTimeUnit)
PgxFuture<PgxGraph> readGraphWithPropertiesAsync(GraphConfig config, long maxAge, TimeUnit maxAgeTimeUnit, boolean blockIfFull, String newGraphName)
as well as their blocking variants:
PgxGraph readGraphWithProperties(String path)
PgxGraph readGraphWithProperties(String path, String newGraphName)
PgxGraph readGraphWithProperties(GraphConfig config)
PgxGraph readGraphWithProperties(GraphConfig config, String newGraphName)
PgxGraph readGraphWithProperties(GraphConfig config, boolean forceUpdateIfNotFresh)
PgxGraph readGraphWithProperties(GraphConfig config, boolean forceUpdateIfNotFresh, String newGraphName)
PgxGraph readGraphWithProperties(GraphConfig config, long maxAge, TimeUnit maxAgeTimeUnit)
PgxGraph readGraphWithProperties(GraphConfig config, long maxAge, TimeUnit maxAgeTimeUnit, boolean blockIfFull, String newGraphName)
read_graph_with_properties(self, config, max_age=9223372036854775807, max_age_time_unit='days', block_if_full=False, update_if_not_fresh=True, graph_name=None)
The first argument (path
to a graph config file or a parsed config
object) is the meta-data of the graph
to be read. The meta-data includes the following information:
Refer to the Graph Loading Guide for detailed information about the different data formats PGX supports and their configurations.
The forceUpdateIfNotFresh and maxAge arguments give fine-grained control over the age of the snapshot to be read. PGX returns an existing graph snapshot if the given graph specification was already loaded into memory by a different session, so the maxAge argument becomes important when reading from a database whose data might change frequently. If neither forceUpdateIfNotFresh nor maxAge is specified, PGX favors cached data over reading new snapshots into memory.
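The caching behavior described above can be pictured with a small toy model. This is only a sketch in plain Python, not the PGX implementation: `ToySnapshotCache`, its `read` method and the `loader` callback are hypothetical names, and the current time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToySnapshot:
    data: str
    loaded_at: float  # time (in seconds) at which the snapshot was read


class ToySnapshotCache:
    """Hypothetical model: reuse a cached snapshot unless it is older than max_age."""

    def __init__(self, loader: Callable[[str], str]):
        self._loader = loader
        self._cache: Dict[str, ToySnapshot] = {}

    def read(self, config: str, now: float, max_age: float = float("inf")) -> ToySnapshot:
        cached = self._cache.get(config)
        if cached is not None and now - cached.loaded_at <= max_age:
            return cached  # fresh enough: favor the cached snapshot
        # Missing or too old: read a new snapshot from the data source.
        snapshot = ToySnapshot(self._loader(config), loaded_at=now)
        self._cache[config] = snapshot
        return snapshot
```

With the default (unbounded) `max_age`, a second `read` of the same config returns the cached snapshot no matter how old it is. Passing a small `max_age` forces a reload once the cached copy has aged past it.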
For more details, check the javadoc and the guide about loading custom graph data.
Graph names follow the rules described in the Namespaces and Sharing page.
In brief, graph names are part of a session-private namespace unless explicitly shared via the publishWithSnapshots() or publish() methods; at that point, the published graph name moves into the public namespace, which any session can see.
Names are unique within a given namespace and methods will throw an exception in case of name clashes.
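The naming rules above can be sketched with a toy model in plain Python. It is an illustration under stated assumptions, not PGX code: `ToyNamespaces`, `NameClashError` and all method names are hypothetical, and sessions are identified by simple strings.

```python
class NameClashError(Exception):
    """Raised when a name is already taken within one namespace."""


class ToyNamespaces:
    """Hypothetical model: one public namespace plus one private namespace per session."""

    def __init__(self):
        self.public = {}   # graph name -> graph, visible to every session
        self.private = {}  # session id -> {graph name -> graph}

    def load(self, session, name, graph):
        """Register a freshly loaded graph in the session-private namespace."""
        graphs = self.private.setdefault(session, {})
        if name in graphs:
            raise NameClashError(f"{name!r} already exists in the private namespace")
        graphs[name] = graph

    def publish(self, session, name):
        """Move a graph name from the session-private to the public namespace."""
        if name in self.public:
            raise NameClashError(f"{name!r} already exists in the public namespace")
        self.public[name] = self.private[session].pop(name)

    def visible_names(self, session):
        """Names a session can see: its own private names plus all public ones."""
        return sorted(set(self.private.get(session, {})) | set(self.public))
```

In this model a graph loaded by session `"A"` is invisible to session `"B"` until `"A"` publishes it. A second attempt to publish the same name raises `NameClashError` and leaves the private entry untouched.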
PGQL supports selecting a graph to query using the FROM-statement.
The graph name mentioned in the FROM-statement is resolved with the same semantics as retrieving a graph without a namespace; see Retrieving Graphs by Name.
In the PGQL query SELECT * FROM MATCH (v) ON myGraph, the graph myGraph will be retrieved with the same semantics as session.getGraph("myGraph").
Code examples for getting graphs from different namespaces:
// look up graph "myGraph" in session private namespace:
PgxGraph privateGraph = session.getGraph(Namespace.PRIVATE, "myGraph");
// look up graph "myGraph" in public namespace:
PgxGraph publicGraph = session.getGraph(Namespace.PUBLIC, "myGraph");
// look up "myGraph" in both namespaces, where PRIVATE takes precedence over PUBLIC:
PgxGraph anyGraph = session.getGraph("myGraph");
PgxGraph sameLookup = session.getGraph(null, "myGraph");
g = session.get_graph("myGraph")
Code example for getting a list of graph names in the private namespace:
// getGraphs() returns a List, so elements can be accessed by index
List<String> privateGraphs = session.getGraphs(Namespace.PRIVATE);
session.getGraph(Namespace.PRIVATE, privateGraphs.get(0));
The publish() methods in PgxGraph can be used to publish the currently selected snapshot of the graph. If you want to make all snapshots of the graph visible to other sessions, use the publishWithSnapshots() methods instead.
PgxFuture<Void> publishAsync()
void publish() // synchronous variant
PgxFuture<Void> publishAsync(Collection<VertexProperty<?, ?>> vertexProps, Collection<EdgeProperty<?>> edgeProps)
void publish(Collection<VertexProperty<?, ?>> vertexProps, Collection<EdgeProperty<?>> edgeProps) // synchronous variant
publish(self, vertex_properties=True, edge_properties=True)
You can publish specific properties using Property
methods.
Publishing properties requires the corresponding graph to be already published.
PgxFuture<Void> publishAsync()
void publish() // synchronous variant
publish(self)
If a private graph already has snapshots, publishWithSnapshots() will publish them all under the same name:
PgxFuture<Void> publishWithSnapshotsAsync()
void publishWithSnapshots() // synchronous variant
PgxFuture<Void> publishWithSnapshotsAsync(Collection<VertexProperty<?, ?>> vertexProps, Collection<EdgeProperty<?>> edgeProps)
void publishWithSnapshots(Collection<VertexProperty<?, ?>> vertexProps, Collection<EdgeProperty<?>> edgeProps) // synchronous variant
For more information, you can refer to the page dedicated to publishing a graph.
Both PgxGraph
and Property
offer these methods:
PgxFuture<Boolean> isPublishedAsync()
boolean isPublished()
is_published(self)
You can also check whether a graph has been published with its snapshots in a similar way:
PgxFuture<Boolean> isPublishedWithSnapshotsAsync()
boolean isPublishedWithSnapshots()
To check which graphs are currently loaded or published in a session, you can use the following API methods from PgxSession:
PgxFuture<List<String>> getGraphsAsync(Namespace namespace)
List<String> getGraphs(Namespace namespace)
get_graphs(self)
The returned list contains the graph names in the given namespace.
It is also possible to reference one loaded/published graph with PgxSession
methods:
PgxFuture<PgxGraph> getGraphAsync(Namespace namespace, String name)
PgxFuture<PgxGraph> getGraphAsync(String name)
PgxGraph getGraph(Namespace namespace, String name)
PgxGraph getGraph(String name)
get_graph(self, graph_name)
Providing null for the namespace parameter, or calling one of the getGraph() methods that do not have a Namespace parameter, will look for a graph with the given name in both the private and public namespaces. If a graph with the given name is found in both namespaces, the graph found in the private namespace is returned.
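The lookup order can be illustrated with a minimal sketch in plain Python. It is a toy model rather than the PGX implementation: `resolve_graph`, the local `Namespace` enum and the two dictionaries are hypothetical stand-ins, and returning `None` for a missing name is a choice made for this sketch only.

```python
from enum import Enum


class Namespace(Enum):
    PRIVATE = "private"
    PUBLIC = "public"


def resolve_graph(name, private, public, namespace=None):
    """Toy lookup: search one namespace if given, else private first, then public."""
    if namespace is Namespace.PRIVATE:
        return private.get(name)
    if namespace is Namespace.PUBLIC:
        return public.get(name)
    # No namespace given: the private namespace takes precedence.
    if name in private:
        return private[name]
    return public.get(name)


private_graphs = {"myGraph": "private copy"}
public_graphs = {"myGraph": "public copy", "shared": "public only"}

# Found in both namespaces, so the private entry wins.
both = resolve_graph("myGraph", private_graphs, public_graphs)
# An explicit namespace restricts the search to that namespace.
explicit = resolve_graph("myGraph", private_graphs, public_graphs, Namespace.PUBLIC)
# Not in the private namespace, so the public entry is returned.
fallback = resolve_graph("shared", private_graphs, public_graphs)
```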
If you invoke these methods multiple times with the same graph name, you will get multiple different PgxGraph
objects, all pointing to the same graph;
therefore, if you make any modification to the graph through any of those objects (e.g. you add a property), you will see it on all the objects pointing to the same graph:
PgxGraph graph1 = session.getGraph("myGraphName");
// graph2 points to the same graph as graph1
PgxGraph graph2 = session.getGraph("myGraphName");
graph1.createVertexProperty(PropertyType.BOOLEAN, "BoolProperty");
// returns the property just created
VertexProperty<Object, Boolean> property = graph2.getVertexProperty("BoolProperty");
graph1 = session.get_graph("myGraphName")
# graph2 points to the same graph as graph1
graph2 = session.get_graph("myGraphName")
graph1.create_vertex_property("boolean", "BoolProperty")
# returns the property just created through graph1
graph2.get_vertex_property("BoolProperty")
Note that the server keeps track of how many PgxGraph objects per session point to each snapshot: in this way, if a PgxGraph object is modified to point to a different snapshot via PgxSession.setSnapshot(), the other objects still point to the initial snapshot:
// get a snapshot of "myGraphName"
PgxGraph graph1 = session.getGraph("myGraphName");
// graph2 points to the same snapshot as graph1
PgxGraph graph2 = session.getGraph("myGraphName");
// we assume another snapshot is created ...
// create a property on the snapshot pointed to by graph1
graph1.createVertexProperty(PropertyType.BOOLEAN, "BoolProperty");
// make graph2 point to the latest snapshot available, which is different from graph1
session.setSnapshot(graph2, PgxSession.LATEST_SNAPSHOT);
// returns the property just created: graph1 is still a valid reference to the original snapshot we got
VertexProperty<Object, Boolean> property1 = graph1.getVertexProperty("BoolProperty");
// returns NULL, because graph2 points to the new snapshot, which does not have this property
VertexProperty<Object, Boolean> property2 = graph2.getVertexProperty("BoolProperty");
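The relationship between PgxGraph objects and snapshots can also be pictured with a toy model in plain Python. None of these classes exist in PGX: `ToySnapshot` and `ToyGraphHandle` are hypothetical stand-ins that only mimic the sharing behavior described above, with properties stored on the snapshot rather than on the handle.

```python
class ToySnapshot:
    """Hypothetical snapshot: owns the properties created on it."""

    def __init__(self, version):
        self.version = version
        self.properties = {}  # property name -> property type


class ToyGraphHandle:
    """Hypothetical stand-in for a graph object: a pointer to one snapshot."""

    def __init__(self, snapshot):
        self.snapshot = snapshot

    def create_vertex_property(self, prop_type, name):
        self.snapshot.properties[name] = prop_type

    def get_vertex_property(self, name):
        # Returns None when the snapshot has no such property.
        return self.snapshot.properties.get(name)


original, latest = ToySnapshot(1), ToySnapshot(2)
graph1 = ToyGraphHandle(original)
graph2 = ToyGraphHandle(original)  # second handle on the same snapshot

graph1.create_vertex_property("boolean", "BoolProperty")
seen_by_graph2 = graph2.get_vertex_property("BoolProperty")  # visible through graph2

graph2.snapshot = latest  # retarget graph2 only; graph1 is unaffected
after_switch_1 = graph1.get_vertex_property("BoolProperty")
after_switch_2 = graph2.get_vertex_property("BoolProperty")
```

Because the property lives on the snapshot, both handles see it while they share that snapshot. Once `graph2` is retargeted, only `graph1` still finds it.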
For a detailed explanation of graph versioning, you can refer to the Graph Versioning guide.
When you are done with your work on a graph snapshot, you should release it, as explained in the Graph Deletion section.
The session can serialize a loaded graph instance to a file via one of the following methods in the
PgxGraph
class:
PgxFuture<FileGraphConfig> storeAsync(Format targetFormat, String targetPath)
PgxFuture<FileGraphConfig> storeAsync(Format targetFormat, String targetPath, boolean overwrite)
PgxFuture<FileGraphConfig> storeAsync(Format targetFormat, String targetPath, Collection<VertexProperty<?, ?>> vertexProps, Collection<EdgeProperty<?>> edgeProps, boolean overwrite)
or their blocking variants:
FileGraphConfig store(Format targetFormat, String targetPath)
FileGraphConfig store(Format targetFormat, String targetPath, boolean overwrite)
FileGraphConfig store(Format targetFormat, String targetPath, Collection<VertexProperty<?, ?>> vertexProps, Collection<EdgeProperty<?>> edgeProps, boolean overwrite)
store(self, format, path, num_partitions=None, vertex_properties=True, edge_properties=True, overwrite=False)
The first two arguments targetFormat
and targetPath
specify the format and the location on the
local file system where the graph should be written to.
Embedded PGX instance only
Note: The above methods will throw an UnsupportedOperationException
if connected to a remote PGX
instance. Writing to a remote server's file system is not permitted for security reasons.
The overwrite
argument determines whether or not an existing file should be overwritten. It defaults to false
if omitted.
The session can select which vertex or edge properties should be stored with
the graph. The optional arguments vertexProps
and edgeProps
can be used to specify a list
of vertex and edge properties. If these arguments are omitted, all the properties are stored by default.
PGX provides the convenience constants VertexProperty.ALL, EdgeProperty.ALL and VertexProperty.NONE, EdgeProperty.NONE to specify that all properties or no properties should be stored, respectively.
Finally, the above methods return a FileGraphConfig
object, which contains the
meta-data of the stored graph instance. That object can be used to read the serialized graph into
memory at a later point. Note that all GraphConfig
objects can be serialized easily as well, as shown
in the following example:
import ...
import org.apache.commons.io.FileUtils; // commons-io is packaged with PGX
PgxGraph myGraph = ...
GraphConfig graphConfig = myGraph.store(Format.PGB, "/tmp/myGraph.pgb");
File configFile = new File("/tmp/myGraph.pgb.json");
String json = graphConfig.toString(); // returns a JSON representation of the config object
FileUtils.write(configFile, json);
// read config back into memory
GraphConfig graphConfig2 = GraphConfigFactory.forAnyFormat().fromFile(configFile);
assert(graphConfig.equals(graphConfig2));
my_graph = ...
graph_config = my_graph.store("pgb", "/tmp/myGraph.pgb")
# returns a JSON representation of the config object
json = str(graph_config)
with open("/tmp/myGraph.pgb.json", "w") as f:
    f.write(json)
graph_config_2 = GraphConfigFactory.for_any_format().from_file_path("/tmp/myGraph.pgb.json")
Check the javadoc for details.
In order to reduce the memory usage of PGX, the session should drop the unused PgxGraph objects that it created via PgxSession.getGraph() by invoking their destroyAsync() (or destroy()) method.
This step destroys not only the specified graph but also all of its associated properties, including transient properties.
In addition, all of the collections related to the graph instance (e.g. a VertexSet
) are also destroyed automatically.
If a session holds multiple PgxGraph
objects referencing the same graph, invoking destroyAsync()
(or destroy()
) on any of
them will invalidate all the PgxGraph
objects referencing that graph, making any operation on those objects fail:
PgxGraph graph1 = session.getGraph("myGraphName");
// graph2 references the same graph as graph1
PgxGraph graph2 = session.getGraph("myGraphName");
// destroy the graph through one of the references
graph2.destroy();
// both calls throw an exception, as both references are not valid anymore
Set<VertexProperty<?, ?>> properties = graph1.getVertexProperties();
properties = graph2.getVertexProperties();
graph1 = session.get_graph("myGraphName")
# graph2 references the same graph as graph1
graph2 = session.get_graph("myGraphName")
# destroy the graph through one of the references
graph2.destroy()
# both calls throw an exception, as both references are not valid anymore
properties = graph1.get_vertex_properties()
properties = graph2.get_vertex_properties()
The same behavior occurs when multiple PgxGraph objects reference the same snapshot: since a snapshot is effectively a graph, destroying a PgxGraph object referencing a certain snapshot invalidates all PgxGraph objects referencing the same snapshot, but does not invalidate those referencing other snapshots:
// get a snapshot of "myGraphName"
PgxGraph graph1 = session.getGraph("myGraphName");
// graph2 and graph3 reference the same snapshot as graph1
PgxGraph graph2 = session.getGraph("myGraphName");
PgxGraph graph3 = session.getGraph("myGraphName");
// we assume another snapshot is created ...
// make graph3 reference the latest snapshot available
session.setSnapshot(graph3, PgxSession.LATEST_SNAPSHOT);
graph2.destroy();
// both calls throw an exception, as both references are not valid anymore
Set<VertexProperty<?, ?>> properties = graph1.getVertexProperties();
properties = graph2.getVertexProperties();
// graph3 is still valid, so the call succeeds
properties = graph3.getVertexProperties();
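The invalidation rule can be sketched with a toy model in plain Python. It is an illustration only: `ToySnapshot`, `ToyGraphHandle`, `DestroyedError` and `is_valid` are hypothetical names, and the model ignores sharing across sessions.

```python
class DestroyedError(Exception):
    """Raised when a handle is used after its snapshot was destroyed."""


class ToySnapshot:
    def __init__(self, version):
        self.version = version
        self.destroyed = False
        self.properties = set()


class ToyGraphHandle:
    """Hypothetical stand-in for a graph object: a pointer to one snapshot."""

    def __init__(self, snapshot):
        self.snapshot = snapshot

    def destroy(self):
        # Destroying through any handle affects the shared snapshot itself.
        self.snapshot.destroyed = True

    def get_vertex_properties(self):
        if self.snapshot.destroyed:
            raise DestroyedError(f"snapshot {self.snapshot.version} was destroyed")
        return set(self.snapshot.properties)


def is_valid(handle):
    """Return True if the handle can still be used."""
    try:
        handle.get_vertex_properties()
        return True
    except DestroyedError:
        return False


first, latest = ToySnapshot(1), ToySnapshot(2)
graph1, graph2, graph3 = (ToyGraphHandle(first) for _ in range(3))
graph3.snapshot = latest  # graph3 now references a different snapshot
graph2.destroy()          # invalidates every handle on the first snapshot

validity = [is_valid(graph1), is_valid(graph2), is_valid(graph3)]
```

Here `graph1` and `graph2` share the destroyed snapshot and become unusable, while `graph3` stays valid because it was retargeted before the `destroy()` call.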
Note that even if a graph is destroyed by a session, the graph data may still remain in the server memory if the graph is currently shared by other sessions.
In such a case, the graph may still be visible among the available graphs via PgxSession.getGraphs().
As a safe alternative to manual destruction of each graph, the PGX API supports some implicit resource management features which allow developers to safely
omit the destroy()
call. You can refer to the dedicated section in the PGX API Design chapter.