A graph can have multiple snapshots associated with it, reflecting different versions of the graph. All snapshots of a graph have the same graph config associated.
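This guide describes the sources of snapshots, how to create a snapshot via refreshing and via ChangeSet, how to check out a specific snapshot, and how to load a specific snapshot directly.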
Starting from PGX version 19.4, snapshots can be published to other sessions.
For more information, see publish a graph with snapshots.
Starting from PGX 20.0.0, snapshots can be created from two sources: Refreshing and ChangeSet. Prior to version 20.0.0, only refreshing was available.
Refreshing is available for graphs that are read from a persistent data source, e.g. a file.
When the data source has changed with respect to the version stored in PGX, it can be read again manually by calling the PgxSession.readGraphWithProperties() method; similarly, if auto-refresh is set for the graph, the PGX server automatically reads the data source and creates new snapshots when the data source has changed (see Auto-refreshing graphs).
A ChangeSet, by contrast, is a set of changes to a graph that the user creates and populates via the PGX ChangeSet API (more information in the dedicated page and in the API docs).
Once a ChangeSet is created and populated with the desired changes, the user can call GraphChangeSet.buildNewSnapshot() to create a new snapshot for the graph.
In this way, PGX users can integrate changes coming from any source into the graph and build snapshots out of them with full control.
Only one source of snapshots is allowed for a single graph. It is chosen during graph configuration via the snapshots_source option, which can be set to either REFRESH or CHANGE_SET (refer to Graph Configuration for the complete list of options).
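As a rough illustration of where this option lives, the sketch below builds a graph configuration as a Python dictionary and serializes it to JSON. Only snapshots_source and its two values come from this guide; the format, vertex_uris and edge_uris keys and the file names are assumptions used for illustration, so check them against Graph Configuration before relying on them.

```python
import json

# Illustrative graph configuration. Only "snapshots_source" is taken from
# this guide; the remaining keys and file names are assumptions.
graph_config = {
    "format": "csv",
    "vertex_uris": ["sample.vertices.csv"],
    "edge_uris": ["sample.edges.csv"],
    "snapshots_source": "CHANGE_SET",  # the other admissible value is "REFRESH"
}

# Serialize the dictionary so it can be saved as a JSON configuration file.
config_json = json.dumps(graph_config, indent=2)
print(config_json)
```

Keeping the option in the configuration file means the snapshot source is fixed when the graph is first loaded, which matches the rule that a graph has exactly one source of snapshots.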
If the snapshots_source option is not explicitly set by the user, the following default settings apply:
- For graphs read from a persistent data source, the default is REFRESH, so that snapshots can be created only by calling PgxSession.readGraphWithProperties() (or via auto-refresh, if configured).
- For transient graphs, the default is CHANGE_SET, since the graph is not backed by a persistent data source to read changes from; for the same reason, CHANGE_SET is the only admissible value for transient graphs.
Additionally, the following restrictions apply:
- If auto-refresh is configured, only REFRESH is admissible for the snapshots_source option.
- If a snapshot is requested from a source other than the configured one (e.g. by calling GraphChangeSet.buildNewSnapshot() when the graph's snapshots_source is REFRESH), the operation is invalid and an exception is thrown.
Here we show how to create a snapshot both via refreshing and via ChangeSet.
First, load a graph into memory (see the graph loading tutorial for a complete explanation of loading graphs): call the PgxSession.readGraphWithProperties() method and pass it the graph configuration.
pgx> var G = session.readGraphWithProperties("examples/graphs/sample.csv.json")
==> PGX Graph named 'sample' bound to PGX session 'a1744e86-65fb-4bd1-b2dc-5458b20954a9' registered at PGX Server Instance running in embedded mode

import oracle.pgx.api.*;
PgxSession session = Pgx.createSession("tutorial");
PgxGraph G = session.readGraphWithProperties("examples/graphs/sample.csv.json");

import pypgx
session = pypgx.get_session(session_name="my-session")
G = session.read_graph_with_properties("examples/graphs/sample.csv.json")
Now you can check the available snapshots of the graph with PgxSession.getAvailableSnapshots().
Since you just loaded the graph, there is only one snapshot available:
pgx> session.getAvailableSnapshots(G)
==> GraphMetaData [getNumVertices()=4, getNumEdges()=4, memoryMb=0, dataSourceVersion=1453315103000, creationRequestTimestamp=1453315122669 (2016-01-20 10:38:42.669), creationTimestamp=1453315122685 (2016-01-20 10:38:42.685), vertexIdType=integer, edgeIdType=long]

Deque<GraphMetaData> snapshots = session.getAvailableSnapshots(G);
for (GraphMetaData metaData : snapshots) {
    System.out.println(metaData);
}

snapshots = session.get_available_snapshots(G)
for metadata in snapshots:
    print(metadata)
Now you can edit the source file to contain an additional vertex and an additional edge.
For example, add the vertex "42" with vertex property "7" and an edge from "42" to "333" with the edge property "10.0".
To do this, add the line 42,7 at the end of examples/graphs/sample.vertices.csv, and the line 42,333,10.0 at the end of examples/graphs/sample.edges.csv.
When you now load the updated graph within the same session as you loaded the original graph, a new snapshot is created.
pgx> var G = session.readGraphWithProperties( G.getConfig(), true )
==> PGX Graph named 'sample_2' bound to PGX session 'a1744e86-65fb-4bd1-b2dc-5458b20954a9' registered at PGX Server Instance running in embedded mode
pgx> session.getAvailableSnapshots(G)
==> GraphMetaData [getNumVertices()=4, getNumEdges()=4, memoryMb=0, dataSourceVersion=1453315103000, creationRequestTimestamp=1453315122669 (2016-01-20 10:38:42.669), creationTimestamp=1453315122685 (2016-01-20 10:38:42.685), vertexIdType=integer, edgeIdType=long]
==> GraphMetaData [getNumVertices()=5, getNumEdges()=5, memoryMb=3, dataSourceVersion=1452083654000, creationRequestTimestamp=1453314938744 (2016-01-20 10:35:38.744), creationTimestamp=1453314938833 (2016-01-20 10:35:38.833), vertexIdType=integer, edgeIdType=long]

G = session.readGraphWithProperties( G.getConfig(), true );
Deque<GraphMetaData> snapshots = session.getAvailableSnapshots( G );

G = session.read_graph_with_properties(G.config, update_if_not_fresh=True)
Notice that the call for available snapshots now returns two GraphMetaData objects, one with 4 vertices and 4 edges and one with 5 vertices and 5 edges.
The variable G now points to the newest loaded graph with 5 vertices and 5 edges. You can check this with the getNumVertices() and getNumEdges() methods.
pgx> G.getNumVertices()
==> 5
pgx> G.getNumEdges()
==> 5

int vertices = G.getNumVertices();
long edges = G.getNumEdges();

vertices = G.num_vertices
edges = G.num_edges
With ChangeSets, all operations are done via the PGX Java API.
If you want to create the graph from a persistent data source, you can again use PgxSession.readGraphWithProperties() as in the previous example, with the snapshots_source configuration option set to CHANGE_SET.
For the sake of example, here we create the first graph snapshot of a transient graph via a graph builder as in the graph builder example.
var builder = session.createGraphBuilder()
builder.addEdge(1, 2)
builder.addEdge(2, 3)
builder.addEdge(2, 4)
builder.addEdge(3, 4)
builder.addEdge(4, 2)
var graph = builder.build()

import oracle.pgx.api.*;
GraphBuilder<Integer> builder = session.createGraphBuilder();
builder.addEdge(1, 2);
builder.addEdge(2, 3);
builder.addEdge(2, 4);
builder.addEdge(3, 4);
builder.addEdge(4, 2);
PgxGraph graph = builder.build();

builder = session.create_graph_builder()
builder.add_edge(1, 2)
builder.add_edge(2, 3)
builder.add_edge(2, 4)
builder.add_edge(3, 4)
builder.add_edge(4, 2)
graph = builder.build()
Regardless of how the first snapshot has been created, the next step is to create a ChangeSet from graph and populate it; here, we add a new edge between vertices 1 and 4.
var changeSet = graph.<Integer>createChangeSet()
changeSet.addEdge(6, 1, 4)

import oracle.pgx.api.*;
GraphChangeSet<Integer> changeSet = graph.createChangeSet();
changeSet.addEdge(6, 1, 4);

change_set = graph.create_change_set()
change_set.add_edge(1, 4, 6)
Finally, the second snapshot is created by invoking GraphChangeSet.buildNewSnapshot(), which returns the reference to the second snapshot.
var secondSnapshot = changeSet.buildNewSnapshot()
session.getAvailableSnapshots(secondSnapshot).size()
==> 2

PgxGraph secondSnapshot = changeSet.buildNewSnapshot();
System.out.println( session.getAvailableSnapshots(secondSnapshot).size() );

second_snapshot = change_set.build_new_snapshot()
print(len(session.get_available_snapshots(second_snapshot)))
We finally see that two snapshots exist, referenced via the variables graph and secondSnapshot.
When multiple snapshots of a graph are available, regardless of their source, you can check out a specific snapshot using the PgxSession.setSnapshot() method; in particular, you can use the LATEST_SNAPSHOT constant of PgxSession to check out the latest available snapshot, as in the following example.
pgx> session.setSnapshot( G, PgxSession.LATEST_SNAPSHOT )
==> null
pgx> G.getCreationTimestamp()
==> 1453315122685

session.setSnapshot( G, PgxSession.LATEST_SNAPSHOT );
System.out.println( G.getCreationTimestamp() );
Note the printed timestamp is that of the most recent snapshot.
You can also check out a specific snapshot, again using PgxSession.setSnapshot().
Following the refresh example from above, you have two snapshots of the sample graph loaded:
==> GraphMetaData [getNumVertices()=4, getNumEdges()=4, memoryMb=0, dataSourceVersion=1453315103000, creationRequestTimestamp=1453315122669 (2016-01-20 10:38:42.669), creationTimestamp=1453315122685 (2016-01-20 10:38:42.685), vertexIdType=integer, edgeIdType=long]
==> GraphMetaData [getNumVertices()=5, getNumEdges()=5, memoryMb=3, dataSourceVersion=1452083654000, creationRequestTimestamp=1453314938744 (2016-01-20 10:35:38.744), creationTimestamp=1453314938833 (2016-01-20 10:35:38.833), vertexIdType=integer, edgeIdType=long]
To check out a specific snapshot of the graph, pass the creationTimestamp of the snapshot you want to load to setSnapshot().
For example, if G is pointing to the newest graph with 5 vertices and 5 edges but you want to analyze the older graph, you need to set the snapshot to 1453315122685.
pgx> G.getNumVertices()
==> 5
pgx> G.getNumEdges()
==> 5
pgx> session.setSnapshot( G, 1453315122685 )
==> null
pgx> G.getNumVertices()
==> 4
pgx> G.getNumEdges()
==> 4

session.setSnapshot( G, 1453315122685 );

session.set_snapshot(G, creation_timestamp=1453315122685)
Notice how, after setting the snapshot, the number of vertices and edges changed from 5 to 4.
Here, we manually passed the creation timestamp we printed to setSnapshot() for the sake of example.
In general, you can retrieve the creation timestamp of each snapshot from its associated GraphMetaData object via the GraphMetaData.getCreationTimestamp() method.
The easiest way to get the GraphMetaData information of all the snapshots is to use the PgxSession.getAvailableSnapshots() method, which returns a collection with the GraphMetaData information of each snapshot, ordered by creation timestamp from the most recent to the oldest.
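The lookup described above can be sketched in a few lines of Python. This is a toy model, not PGX code: SnapshotInfo and find_creation_timestamp are hypothetical names, and the records only mirror the vertex count, edge count and creation timestamp that GraphMetaData reports for the two snapshots in this guide. The returned value is what you would then pass to setSnapshot().

```python
from typing import NamedTuple, Optional, Sequence


class SnapshotInfo(NamedTuple):
    """Hypothetical stand-in for the fields reported by GraphMetaData."""
    creation_timestamp: int
    num_vertices: int
    num_edges: int


def find_creation_timestamp(
    snapshots: Sequence[SnapshotInfo], num_vertices: int, num_edges: int
) -> Optional[int]:
    """Return the creation timestamp of the first snapshot with the given size.

    Returns None when no snapshot matches.
    """
    for info in snapshots:
        if info.num_vertices == num_vertices and info.num_edges == num_edges:
            return info.creation_timestamp
    return None


# The two snapshots from the refresh example, most recent timestamp first.
snapshots = [
    SnapshotInfo(creation_timestamp=1453315122685, num_vertices=4, num_edges=4),
    SnapshotInfo(creation_timestamp=1453314938833, num_vertices=5, num_edges=5),
]

timestamp = find_creation_timestamp(snapshots, num_vertices=4, num_edges=4)
print(timestamp)  # prints 1453315122685
```

Looking the timestamp up from the metadata avoids copying it by hand from printed output, which is what the example above does only for brevity.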
You can also load a specific snapshot of a graph directly using the PgxSession.readGraphAsOf() method. This is a shortcut for loading a graph with readGraphWithProperties() followed by a setSnapshot().
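To make that equivalence concrete, the sketch below writes the two-step version as a small Python helper. It is not part of the PGX API: the helper name read_graph_as_of_two_steps is hypothetical, and passing the timestamp to set_snapshot() through a creation_timestamp keyword is an assumption modeled on the read_graph_as_of() call used in this guide. The session argument is expected to be a pypgx session.

```python
def read_graph_as_of_two_steps(session, config, creation_timestamp):
    """Load a graph, then check out one of its snapshots.

    Hypothetical helper that spells out the two calls read_graph_as_of()
    is described as a shortcut for. `session` is assumed to be a pypgx
    session; the `creation_timestamp` keyword is an assumption.
    """
    # Step 1: load the graph described by the configuration.
    graph = session.read_graph_with_properties(config)
    # Step 2: point the graph reference at the requested snapshot.
    session.set_snapshot(graph, creation_timestamp=creation_timestamp)
    return graph
```

In practice the single readGraphAsOf() call is preferable, since it expresses the intent directly and leaves no intermediate state in which the graph reference points at a different snapshot.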
Imagine two snapshots of a graph are already loaded into the PGX session, and you want to get a reference to a specific snapshot. First you need to get a graph configuration for this graph:
pgx> var config = GraphConfigFactory.forAnyFormat().fromPath("examples/graphs/sample.adj.json")
==> {"format":"adj_list", ... }
GraphConfig config = GraphConfigFactory.forAnyFormat().fromPath("examples/graphs/sample.csv.json");
config = GraphConfigFactory.for_any_format().from_path("examples/graphs/sample.csv.json")
Then you can check the loaded snapshots for this graph config using getAvailableSnapshots():
pgx> session.getAvailableSnapshots(G)
==> GraphMetaData [getNumVertices()=4, getNumEdges()=4, memoryMb=0, dataSourceVersion=1453315103000, creationRequestTimestamp=1453315122669 (2016-01-20 10:38:42.669), creationTimestamp=1453315122685 (2016-01-20 10:38:42.685), vertexIdType=integer, edgeIdType=long]
==> GraphMetaData [getNumVertices()=5, getNumEdges()=5, memoryMb=3, dataSourceVersion=1452083654000, creationRequestTimestamp=1453314938744 (2016-01-20 10:35:38.744), creationTimestamp=1453314938833 (2016-01-20 10:35:38.833), vertexIdType=integer, edgeIdType=long]
Deque<GraphMetaData> snapshots = session.getAvailableSnapshots(G);
session.get_available_snapshots(G)
Now you want to check out the snapshot of the graph with 4 vertices and 4 edges, whose timestamp is 1453315122685.
pgx> var G = session.readGraphAsOf( config, 1453315122685 )
==> PGX Graph named 'sample' bound to PGX session 'a1744e86-65fb-4bd1-b2dc-5458b20954a9' registered at PGX Server Instance running in embedded mode
pgx> G.getNumVertices()
==> 4
pgx> G.getNumEdges()
==> 4

PgxGraph G = session.readGraphAsOf( config, 1453315122685 );

G = session.read_graph_as_of(config, creation_timestamp=1453315122685)
You now know how to create snapshots, check out different snapshots of the same graph, and load specific snapshots directly. You can now learn about the Auto-Refresh Mechanism, which automatically creates snapshots of your loaded graph on a regular basis.