4.8 Storing a Graph Snapshot on Disk

After reading a graph into memory using either Java or the Shell, you may modify it, for example by running the PageRank algorithm and storing the resulting values as vertex properties. You can then store this snapshot of the graph on disk.

This is helpful if you want to preserve the in-memory state of the graph, for example when you must shut down the graph server (PGX) to migrate to a newer version, or when you must shut it down for some other reason.

(Storing graphs over HTTP/REST is currently not supported.)

A snapshot of a graph can be saved as a file in a binary format, called a PGB file.

In general, we recommend that you keep a record of the graph queries and analytics APIs that were executed, and that after the graph server (PGX) has been restarted, you reload the graph and re-execute them, as sketched below. But if you must save the exact state of the graph, you can use the logic in the following examples to save a graph snapshot from the shell.
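For illustration, the restart-and-recompute approach might look like the following in the shell. The configuration file name bank_graph.json and the re-executed calls are hypothetical stand-ins for whatever you originally ran:

opg4j> var graph = session.readGraphWithProperties("/scratch/PG/bank_graph.json")   // reload from the original source
opg4j> analyst.pagerank(graph)                                                      // re-execute the recorded analytics
opg4j> graph.queryPgql("SELECT x.pagerank MATCH (x)").print().close()               // re-run the recorded queries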

In a three-tier deployment, the file is written to the server-side file system. You must also ensure that the target file location is among the locations the graph server (PGX) is allowed to write to. (As explained in Three-Tier Deployments of Oracle Graph with Autonomous Database, in a three-tier deployment, access to the graph server (PGX) file system requires a list of allowed locations to be specified.)
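For reference, such allowed locations might be listed in the graph server configuration (pgx.conf) roughly as follows. This is only a sketch: the parameter names allowed_remote_loading_locations and allowed_remote_storing_locations are assumptions based on recent graph server versions, so verify the exact names for your version in Three-Tier Deployments of Oracle Graph with Autonomous Database:

{
  "allowed_remote_loading_locations": ["/scratch/PG/snapshot"],
  "allowed_remote_storing_locations": ["/scratch/PG/snapshot"]
}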

opg4j> var graph = session.createGraphBuilder().addVertex(1).addVertex(2).addVertex(3).addEdge(1,2).addEdge(2,3).addEdge(3, 1).build()
graph ==> PgxGraph[name=anonymous_graph_1,N=3,E=3,created=1581623669674]

opg4j> analyst.pagerank(graph)
$3 ==> VertexProperty[name=pagerank,type=double,graph=anonymous_graph_1]

// Now save the state of this graph
opg4j> graph.store(Format.PGB, "/scratch/PG/snapshot/snapshot.pgb")
$16 ==> {"edge_uris":[],"error_handling":{},"attributes":{},"format":"pgb","vertex_props":[{"type":"double","name":"pagerank","dimension":0}],"edge_props":[],"vertex_
id_type":"integer","vertex_uris":["/scratch/PG/snapshot/snapshot.pgb"],"loading":{}}

// Reload from disk
opg4j> var graphFromDisk = session.readGraphFile("/scratch/PG/snapshot/snapshot.pgb", Format.PGB)
graphFromDisk ==> PgxGraph[name=snapshot,N=3,E=3,created=1635791377443]

// Previously computed properties are still part of the graph and can be queried
opg4j> graphFromDisk.queryPgql("SELECT x.pagerank MATCH (x)").print().close()
+--------------------+
| x.pagerank         |
+--------------------+
| 0.3333333333333333 |
| 0.3333333333333333 |
| 0.3333333333333333 |
+--------------------+
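If you are scripting this in a standalone Java application rather than in the shell, the same calls apply, but you must obtain the session and analyst explicitly and import the API classes. The following is a minimal sketch, assuming an embedded session created with Pgx.createSession; in a client-server deployment you would obtain the session from your server connection instead:

import oracle.pgx.api.Analyst;
import oracle.pgx.api.Pgx;
import oracle.pgx.api.PgxGraph;
import oracle.pgx.api.PgxSession;
import oracle.pgx.config.Format;

public class StoreSnapshot {
  public static void main(String[] args) throws Exception {
    try (PgxSession session = Pgx.createSession("snapshot-session")) {
      // Build the same three-vertex cycle as in the shell example
      PgxGraph graph = session.createGraphBuilder()
          .addVertex(1).addVertex(2).addVertex(3)
          .addEdge(1, 2).addEdge(2, 3).addEdge(3, 1)
          .build();
      // Compute PageRank so the values become a vertex property of the graph
      Analyst analyst = session.createAnalyst();
      analyst.pagerank(graph);
      // Persist the snapshot, including the computed property, as a PGB file
      graph.store(Format.PGB, "/scratch/PG/snapshot/snapshot.pgb");
    }
  }
}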

The following example is essentially the same as the preceding one, but it uses a partitioned graph. You must first load a partitioned graph into the graph server (PGX); see Loading a Partitioned Graph into the Graph Server (PGX) Using GraphConfigBuilder for an example, and the brief sketch below for an alternative.
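For instance, assuming the partitioned graph configuration has already been written to a file (the file name below is a hypothetical placeholder; the referenced topic shows how to build the configuration programmatically with GraphConfigBuilder), the graph can be loaded like this:

opg4j> var graph = session.readGraphWithProperties("/scratch/PG/bank_graph_partitioned.json")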

opg4j> analyst.pagerank(graph)
$8 ==> VertexProperty[name=pagerank,type=double,graph=bank_graph_partitioned]

// Now save the state of this graph
opg4j> var storedPgbConfig = graph.store(ProviderFormat.PGB, "/scratch/PG/snapshot/snapshot_")
storedPgbConfig ==> {"vertex_id_strategy":"KEYS_AS_IDS","loading":{},"array_compaction_threshold":0.2,
"time_with_timezone_format":["h[h]:m[m][:s[s]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"name":"bank_graph_partitioned",
"timestamp_with_timezone_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"edge_providers":[{"loading":{"create_key_mapping":true},"destination_vertex_provider":"node","destination_column":3,"format":"pgb",
"time_format":["h[h]:m[m][:s[s][.SSS]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"key_type":"long","has_keys":true,"label":"transfer","source_column":2,
"storing":{"base_path":"/scratch/PG/snapshot/snapshot_transfer","edge_extension":"pgb"},"attributes":{},
"timestamp_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"time_with_timezone_format":["h[h]:m[m][:s[s]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"timestamp_with_timezone_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"props":[{"drop_after_loading":false,"type":"long","column":4,"name":"FROM_ACCT_ID","dimension":0},
{"drop_after_loading":false,"type":"long","column":5,"name":"TO_ACCT_ID","dimension":0},
{"drop_after_loading":false,"type":"long","column":6,"name":"AMOUNT","dimension":0}],
"header":false,"source_vertex_provider":"node","error_handling":{},"uris":["/scratch/PG/snapshot/snapshot_transfer.pgb"],"key_column":1,
"local_date_format":["yyyy-M[M]-d[d]","M[M]/d[d]/yyyy","d[d]-MMM-yyyy","d[d]-M[M]-yyyy","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"name":"transfer"}],
"local_date_format":["yyyy-M[M]-d[d]","M[M]/d[d]/yyyy","d[d]-MMM-yyyy","d[d]-M[M]-yyyy","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"vertex_id_type":"long","optimized_for":"READ","max_prefetched_rows":10000,"num_connections":2,"max_batch_size":10000,
"error_handling":{"on_missing_vertex":"ERROR"},"scroll_time":"1m",
"time_format":["h[h]:m[m][:s[s][.SSS]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"edge_id_strategy":"KEYS_AS_IDS","attributes":{},
"vertex_providers":[{"storing":{"vertex_extension":"pgb","base_path":"/scratch/PG/snapshot/snapshot_node"},"attributes":{},"loading":{"create_key_mapping":true},
"timestamp_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"time_with_timezone_format":["h[h]:m[m][:s[s]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"timestamp_with_timezone_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"props":[{"drop_after_loading":false,"type":"integer","column":2,"name":"ID","dimension":0},
{"drop_after_loading":false,"type":"double","column":3,"name":"pagerank","dimension":0}],
"format":"pgb","time_format":["h[h]:m[m][:s[s][.SSS]] a[ XXX]","[yyyy-MM-dd'T']H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"key_type":"long","header":false,"has_keys":true,"error_handling":{},"uris":["/scratch/PG/snapshot/snapshot_node.pgb"],"label":"node","key_column":1,
"local_date_format":["yyyy-M[M]-d[d]","M[M]/d[d]/yyyy","d[d]-MMM-yyyy","d[d]-M[M]-yyyy","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"name":"node"}],
"timestamp_format":["yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd H[H]:m[m][:s[s][.SSS[SSS]]][XXX]","yyyy-MM-dd'T'H[H]:m[m][:s[s][.SSSSSS]][XXX]"],
"edge_id_type":"long"}

// Reload from disk 
opg4j> var graphFromDisk = session.readGraphWithProperties(storedPgbConfig)
graphFromDisk ==> PgxGraph[name=bank_graph_partitioned_3,N=1000,E=5001,created=1635790679596]

// Previously computed properties are still part of the graph and can be queried
opg4j> graphFromDisk.queryPgql("SELECT x.pagerank MATCH(x) LIMIT 5").print().close()
+-----------------------+
| x.pagerank            |
+-----------------------+
| 9.748675243734766E-4  |
| 0.004576976478097233  |
| 5.353395438612155E-4  |
| 0.0013043345794813302 |
| 0.001502117014779663  |
+-----------------------+
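Note that storedPgbConfig exists only in the current shell session. If you plan to restart the graph server, one possible way to keep the reload configuration available is to write its JSON form (shown above as the return value of store) to a file, and later pass that file to readGraphWithProperties. The file name below is a hypothetical placeholder:

opg4j> import java.nio.file.*
opg4j> Files.writeString(Path.of("/scratch/PG/snapshot/snapshot_config.json"), storedPgbConfig.toString())
// After a restart, in a new session:
opg4j> var reloaded = session.readGraphWithProperties("/scratch/PG/snapshot/snapshot_config.json")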

Note:

In the case of partitioned graphs, multiple PGB files are generated, one for each vertex or edge provider (partition) in the graph. In the preceding example, the vertex provider is written to snapshot_node.pgb and the edge provider to snapshot_transfer.pgb, as shown in the uris fields of the returned configuration.