4.3 About Vertex and Edge IDs
Generating vertex and edge IDs when loading from database tables into PGX
By default, PGX requires every vertex and edge in a graph to have a unique identifier, so that it can be retrieved by using PgxGraph.getVertex(ID id) and PgxGraph.getEdge(ID id), or by PGQL queries using the built-in id() function.
The ID generation strategies can be selected through the configuration parameters vertex_id_strategy and edge_id_strategy.
Using keys to generate IDs
The default strategy for generating vertex IDs is to use the keys provided when the graph is loaded (keys_as_ids). In that case, each vertex must have a vertex key that is unique across all providers.
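The uniqueness requirement of keys_as_ids can be checked on the source data before loading. The following is a minimal sketch in plain Java, not part of the PGX API: the class GlobalKeyCheck, the method duplicateKeys, and the provider name Customers are hypothetical names used only for illustration. It reports every key that occurs more than once in the combined key set of all providers.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; PGX does not ship this class.
final class GlobalKeyCheck {

    // Returns, in ascending order, every key that occurs more than once
    // when the keys of all providers are taken together.
    static Set<Long> duplicateKeys(Map<String, List<Long>> keysByProvider) {
        Set<Long> seen = new HashSet<>();
        Set<Long> duplicates = new TreeSet<>();
        for (List<Long> keys : keysByProvider.values()) {
            for (Long key : keys) {
                // Set.add returns false when the key was already present.
                if (!seen.add(key)) {
                    duplicates.add(key);
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> keysByProvider = new LinkedHashMap<>();
        keysByProvider.put("Accounts", List.of(1L, 2L, 3L));
        keysByProvider.put("Customers", List.of(3L, 4L)); // assumed second provider
        System.out.println(duplicateKeys(keysByProvider)); // prints [3]
    }
}
```

A non-empty result means the data does not meet the keys_as_ids requirement as described above.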
For edges, no keys are required in the edge data by default, and PGX generates the edge IDs automatically (unstable_generated_ids). Note that the generation of edge IDs is not guaranteed to be deterministic. If required, edge keys can also be loaded as IDs.
The partitioned_ids strategy requires keys to be unique only within a vertex or edge provider (data source). The keys do not have to be globally unique. Globally unique IDs are derived from a combination of the provider name and the key inside the provider, as <provider_name>(<unique_key_within_provider>). For example, Account(1).
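The ID format described above can be illustrated with a few lines of plain Java. This is only a sketch of the textual form <provider_name>(<unique_key_within_provider>); the class PartitionedIdFormat and the method partitionedId are hypothetical, and the sketch makes no claim about how PGX represents these IDs internally.

```java
// Hypothetical helper for illustration; not part of the PGX API.
final class PartitionedIdFormat {

    // Composes "<provider_name>(<unique_key_within_provider>)".
    static String partitionedId(String providerName, Object keyWithinProvider) {
        return providerName + "(" + keyWithinProvider + ")";
    }

    public static void main(String[] args) {
        // The same key in two different providers yields two distinct IDs.
        System.out.println(partitionedId("Accounts", 1));  // prints Accounts(1)
        System.out.println(partitionedId("Transfers", 1)); // prints Transfers(1)
    }
}
```

Because the provider name is part of the ID, key 1 can appear in several providers without a collision.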
The partitioned_ids strategy can be set through the configuration fields vertex_id_strategy and edge_id_strategy. For example:
{
  "name": "bank_graph_analytics",
  "optimized_for": "updates",
  "vertex_id_strategy": "partitioned_ids",
  "edge_id_strategy": "partitioned_ids",
  "vertex_providers": [
    {
      "name": "Accounts",
      "format": "rdbms",
      "database_table_name": "BANK_NODES",
      "key_column": "ID",
      "key_type": "integer",
      "props": [
        {
          "name": "keyProp",
          "type": "long",
          "column": 1
        },
        {
          "name": "number",
          "type": "long",
          "column": 2
        }
      ],
      "loading": {
        "create_key_mapping": true
      }
    }
  ],
  "edge_providers": [
    {
      "name": "Transfers",
      "format": "rdbms",
      "database_table_name": "BANK_EDGES_AMT",
      "key_column": "ID",
      "source_column": "SRC_ID",
      "destination_column": "DEST_ID",
      "source_vertex_provider": "Accounts",
      "destination_vertex_provider": "Accounts",
      "props": [
        {
          "name": "keyProp",
          "type": "long",
          "column": 1
        },
        {
          "name": "amount",
          "type": "double",
          "column": 4
        }
      ],
      "loading": {
        "create_key_mapping": true
      }
    }
  ]
}
Note:
All available key types are supported in combination with partitioned IDs.
After the graph is loaded, PGX maintains information about which property of a provider corresponds to the key of the provider. In the preceding example, the vertex property keyProp happens to correspond to the vertex key ("column": 1), and the edge property keyProp happens to correspond to the edge key (again, "column": 1). Each provider can have at most one such "key property", and the property can have any name.
The vertex key property cannot be updated.
Using an auto-incrementer to generate IDs
It is recommended to always set create_key_mapping to true to benefit from performance optimizations. However, if edges have no single-column keys, create_key_mapping can be set to false. Similarly, create_key_mapping can also be set to false for vertex providers. In that case, IDs are generated by an auto-incrementer, for example Accounts(1), Accounts(2), Accounts(3).
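The behavior of an auto-incrementer can be sketched in plain Java as follows. This is an illustration under stated assumptions, not PGX's implementation: the class AutoIncrementIds and the method nextId are hypothetical, and both the starting value of 1 and the use of a separate counter per provider are assumptions inferred from the example IDs above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch for illustration; not part of the PGX API.
final class AutoIncrementIds {

    // Last number issued for each provider (assumption: one counter per provider).
    private final Map<String, Long> lastIssued = new HashMap<>();

    // Returns the next ID for the provider, counting from 1 (assumed start value).
    String nextId(String providerName) {
        // merge stores 1 for a new provider, otherwise adds 1 to the stored value,
        // and returns the updated value.
        long next = lastIssued.merge(providerName, 1L, Long::sum);
        return providerName + "(" + next + ")";
    }

    public static void main(String[] args) {
        AutoIncrementIds ids = new AutoIncrementIds();
        System.out.println(ids.nextId("Accounts")); // prints Accounts(1)
        System.out.println(ids.nextId("Accounts")); // prints Accounts(2)
        System.out.println(ids.nextId("Accounts")); // prints Accounts(3)
    }
}
```

IDs produced this way depend on the order in which entities are encountered, so they carry no meaning from the source data.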