12.1.2 Graph Configuration Options

The following table lists the JSON fields that are common to all graph configurations:

Table 12-1 Graph Config JSON Fields

Field Type Description Default
name string Name of the graph. Required
array_compaction_threshold number [only relevant if the graph is optimized for updates] Threshold used to determine when to compact the delta-logs into a new array. If lower than the engine min_array_compaction_threshold value, min_array_compaction_threshold is used instead. 0.2
attributes object Additional attributes needed to read and write the graph data. null
edge_id_strategy enum[no_ids, keys_as_ids, unstable_generated_ids] Indicates what ID strategy should be used for the edges of this graph. If not specified (or set to null), the strategy will be determined during loading or using a default value. null
edge_id_type enum[long] Type of the edge ID. Setting it to long requires the IDs in the edge providers to be unique across the graphs; those IDs will be used as global IDs. Setting it to null (or omitting it) will allow repeated IDs across different edge providers and PGX will automatically generate globally-unique IDs for the edges. null
edge_providers array of object List of edge providers in this graph. []
error_handling object Error handling configuration. null
external_stores array of object Specification of the external stores where external string properties reside. []
jdbc_url string JDBC URL pointing to an RDBMS instance. null
keystore_alias string Alias to the keystore to use when connecting to database. null
loading object Loading-specific configuration to use. null
local_date_format array of string Array of local_date formats to use when loading and storing local_date properties. See DateTimeFormatter for more details on the format string. []
max_prefetched_rows integer Maximum number of rows prefetched during each round trip between the result set and the database. 10000
num_connections integer Number of connections to read and write data from or to the RDBMS table. <no-of-cpus>
optimized_for enum[read, updates] Indicates if the graph should use data-structures optimized for read-intensive scenarios or for fast updates. read
password string Password to use when connecting to database. null
point2d string Longitude and latitude as floating point values separated by a space. 0.0 0.0
redaction_rules array of object Array of redaction rules. []
rules_mapping array of object Mapping for redaction rules to users and roles. []
schema string Schema to use when reading or writing RDBMS objects. null
time_format array of string The time format to use when loading and storing time properties. See DateTimeFormatter for documentation of the format string. []
time_with_timezone_format array of string The time with timezone format to use when loading and storing time with timezone properties. See DateTimeFormatter for more information on the format string. []
timestamp_format array of string The timestamp format to use when loading and storing timestamp properties. See DateTimeFormatter for more information on the format string. []
timestamp_with_timezone_format array of string The timestamp with timezone format to use when loading and storing timestamp with timezone properties. See DateTimeFormatter for more information on the format string. []
username string Username to use when connecting to an RDBMS instance. null
vector_component_delimiter character Delimiter for the different components of vector properties. ;
vertex_id_strategy enum[no_ids, keys_as_ids, unstable_generated_ids] Indicates what ID strategy should be used for the vertices of this graph. If not specified (or set to null), the strategy will be automatically detected. null
vertex_id_type enum[int, integer, long, string] Type of the vertex ID. For homogeneous graphs, if not specified (or set to null), it will default to a specific value (depending on the origin of the data). null
vertex_providers array of object List of vertex providers in this graph. []
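
As a minimal sketch of how the fields in Table 12-1 fit together, the following Python snippet parses a small graph configuration and applies the documented lower bound for array_compaction_threshold. The graph name, the provider names, and the engine minimum of 0.3 are made-up values for illustration, and effective_compaction_threshold is a hypothetical helper, not a PGX function. Format-specific provider fields (such as the data location) are omitted because this section does not cover them.

```python
import json

# Hypothetical graph configuration that only uses fields listed in
# Table 12-1 and Table 12-2. All names and values are illustrative.
GRAPH_CONFIG_JSON = """
{
  "name": "example_graph",
  "optimized_for": "updates",
  "array_compaction_threshold": 0.1,
  "vertex_id_strategy": "keys_as_ids",
  "vertex_providers": [
    {"name": "person", "format": "csv", "key_type": "long"}
  ],
  "edge_providers": [
    {
      "name": "knows",
      "format": "csv",
      "source_vertex_provider": "person",
      "destination_vertex_provider": "person"
    }
  ]
}
"""


def effective_compaction_threshold(config, engine_min):
    """Return the threshold after applying the engine's lower bound.

    Mirrors the table text: a configured value below the engine's
    min_array_compaction_threshold is replaced by that minimum.
    """
    configured = config.get("array_compaction_threshold", 0.2)  # table default
    return max(configured, engine_min)


config = json.loads(GRAPH_CONFIG_JSON)
threshold = effective_compaction_threshold(config, engine_min=0.3)
print(config["name"], threshold)
```

Here the configured value of 0.1 is below the assumed engine minimum, so the minimum is the value that takes effect.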

Note:

Database connection fields specified in the graph configuration are used as defaults when the underlying data provider configuration does not specify them.
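
The fallback described in this note can be sketched as a simple merge. The snippet below is illustrative only: the set of fields treated as "database connection fields" is an assumption drawn from Table 12-1, the JDBC URL and user names are placeholders, and with_graph_defaults is a hypothetical helper rather than part of PGX.

```python
# Assumed set of database connection fields, taken from Table 12-1.
CONNECTION_FIELDS = ("jdbc_url", "username", "password", "keystore_alias", "schema")


def with_graph_defaults(graph_config, provider_config):
    """Return a copy of the provider config in which missing connection
    fields are filled in from the graph config."""
    merged = dict(provider_config)
    for field in CONNECTION_FIELDS:
        if merged.get(field) is None and graph_config.get(field) is not None:
            merged[field] = graph_config[field]
    return merged


# Placeholder values; replace with the details of a real database.
graph = {
    "name": "example_graph",
    "jdbc_url": "jdbc:oracle:thin:@example-host:1521/example_service",
    "username": "graph_user",
}
provider = {"name": "person", "format": "rdbms", "username": "provider_user"}

resolved = with_graph_defaults(graph, provider)
print(resolved["jdbc_url"], resolved["username"])
```

The provider keeps its own username, while jdbc_url is inherited from the graph configuration because the provider does not set it.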

Provider Configuration JSON File Options

You can specify the meta-information about each provider's data using provider configurations. Provider configurations include the following information about the provider data:

  • Location of the data: a file, multiple files or database providers
  • Information about the properties: name and type of the property

Table 12-2 Provider Configuration JSON File Options

Field Type Description Default
format enum[pgb, csv, rdbms] Provider format. Required
name string Entity provider name. Required
attributes object Additional attributes needed to read and write the graph data. null
destination_vertex_provider string Name of the destination vertex provider to be used for this edge provider. null
error_handling object Error handling configuration. null
has_keys boolean Indicates if the provided entities data have keys. true
key_type enum[int, integer, long, string] Type of the keys. long
keystore_alias string Alias to the keystore to use when connecting to database. null
label string Label for the entities loaded from this provider. null
loading object Loading-specific configuration. null
local_date_format array of string Array of local_date formats to use when loading and storing local_date properties. See DateTimeFormatter for documentation of the format string. []
password string Password to use when connecting to database. null
point2d string Longitude and latitude as floating point values separated by a space. 0.0 0.0
props array of object Specification of the properties associated with this entity provider. []
source_vertex_provider string Name of the source vertex provider to be used for this edge provider. null
time_format array of string The time format to use when loading and storing time properties. See DateTimeFormatter for documentation of the format string. []
time_with_timezone_format array of string The time with timezone format to use when loading and storing time with timezone properties. See DateTimeFormatter for documentation of the format string. []
timestamp_format array of string The timestamp format to use when loading and storing timestamp properties. See DateTimeFormatter for documentation of the format string. []
timestamp_with_timezone_format array of string The timestamp with timezone format to use when loading and storing timestamp with timezone properties. See DateTimeFormatter for documentation of the format string. []
vector_component_delimiter character Delimiter for the different components of vector properties. ;

Provider Labels

The label field in the provider configuration can be used to set a label for the entities loaded from the provider. If no label is specified, all entities from the provider are labeled with the name of the provider. It is only possible to set the same label for two different providers if they have exactly the same properties (same names and same types).
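
The labeling rule above can be illustrated with a short check. This is a sketch of the described behavior under stated assumptions, not PGX's implementation: effective_label, property_signature, and check_shared_labels are hypothetical helpers, and the provider and property names are invented.

```python
def effective_label(provider):
    """Use the explicit label if present, otherwise the provider name."""
    label = provider.get("label")
    return label if label is not None else provider["name"]


def property_signature(provider):
    """Sorted (name, type) pairs, so property order does not matter."""
    return sorted((p["name"], p["type"]) for p in provider.get("props", []))


def check_shared_labels(providers):
    """Map each provider name to its label.

    Raises ValueError if two providers share a label but do not have
    exactly the same property names and types.
    """
    signature_by_label = {}
    labels = {}
    for provider in providers:
        label = effective_label(provider)
        signature = property_signature(provider)
        known = signature_by_label.setdefault(label, signature)
        if known != signature:
            raise ValueError(
                f"label {label!r} is shared by providers with different properties"
            )
        labels[provider["name"]] = label
    return labels


providers = [
    {"name": "person", "props": [{"name": "age", "type": "integer"}]},
    {"name": "employee", "label": "person",
     "props": [{"name": "age", "type": "integer"}]},
]
print(check_shared_labels(providers))
```

Both providers end up with the label person: the first because it has no label field and falls back to its name, the second because it sets the label explicitly and has matching properties.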

Property Configuration

Each element of the props array in the provider configuration is an object with the following JSON fields:

Table 12-3 Property Configuration

Field Type Description Default
name string Name of the property. Required
type enum[boolean, integer, vertex, edge, float, long, double, string, date, local_date, time, timestamp, time_with_timezone, timestamp_with_timezone, point2d] Type of the property. Note: date is deprecated; use one of local_date, time, timestamp, time_with_timezone, or timestamp_with_timezone instead. vertex and edge are placeholders for the types specified in the vertex_id_type and edge_id_type fields. Required
aggregate enum[identity, group_key, min, max, avg, sum, concat, count] [currently unsupported] Aggregation function to use. Aggregation always happens by vertex key. null
column value Name or index (starting from 0) of the column holding the property data. If it is not specified, the loader will try to use the property name as column name (for CSV format only). null
default value Default value to be assigned to this property if the data source does not provide it. For the date type, the string is expected to be formatted as yyyy-MM-dd HH:mm:ss. If no default is present (null), non-existent properties contain the Java default value (primitives), an empty string (string), or 01.01.1970 00:00 (date). null
dimension integer Dimension of property. 0
drop_after_loading boolean [currently unsupported] Indicates helper properties that are only used for aggregation and are dropped after loading. false
field value Name of the JSON field holding the property data. Nesting is denoted by dot separation. Field names containing dots are possible; in that case, the dots must be escaped with backslashes to resolve ambiguities. Only the exactly specified objects are loaded; if they do not exist, the default value is used. null
format array of string Array of formats of property. []
group_key string [currently unsupported] Can only be used if the property / key is part of the grouping expression. null
max_distinct_strings_per_pool integer [only relevant if string_pooling_strategy is indexed] Number of distinct strings per property after which to stop pooling. If the limit is reached, an exception is thrown. If set to null, the default value from the global PGX configuration will be used. null
stores array of object A list of storage identifiers that indicate where this property resides. []
string_pooling_strategy enum[indexed, on_heap, none] Indicates which string pooling strategy to use. If set to null, the default value from the global PGX configuration will be used. null
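
To make the field and default entries more concrete, the following sketch resolves a dot-separated field path against a nested JSON record, treating backslash-escaped dots as part of the field name and falling back to a default when the path does not exist. It only illustrates the rules stated in the table; split_field_path and read_field are hypothetical helpers, the sample record is invented, and the sketch does not handle an escaped backslash directly before a dot.

```python
import re


def split_field_path(path):
    """Split on dots that are not preceded by a backslash, then unescape."""
    parts = re.split(r"(?<!\\)\.", path)
    return [part.replace("\\.", ".") for part in parts]


def read_field(record, path, default=None):
    """Follow the path through nested objects; return default if any
    step of the path is missing."""
    current = record
    for key in split_field_path(path):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Invented sample record: one nested object and one key containing a dot.
record = {"address": {"city": "Oslo"}, "geo.lat": 59.9}

print(read_field(record, "address.city"))                    # nested lookup
print(read_field(record, r"geo\.lat"))                       # escaped dot
print(read_field(record, "address.zip", default="unknown"))  # missing field
```

A property configured with a field of address.city would read the nested value, while geo\.lat addresses the single top-level key whose name contains a dot.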

Loading Configuration

The loading entry is a JSON object with the following fields:

Table 12-4 Loading Configuration

Field Type Description Default
create_key_mapping boolean If true, a mapping between entity keys and internal IDs is prepared during loading. true
filter string [currently unsupported] The filter expression. null
grouping_by array of string [currently unsupported] Array of edge properties used for the aggregator. For vertices, only the ID can be used (default). []
load_labels boolean Whether or not to load the entity label if it is available. false
strict_mode boolean If true, exceptions are thrown and logged at ERROR level whenever the loader encounters problems with the input file, such as invalid format, repeated keys, missing fields, mismatches, and other potential errors. If false, the loader may use less memory during the loading phase, but may behave unexpectedly with erratic input files. true
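
As an illustration of Table 12-4, the sketch below parses a partial loading object and fills in the remaining fields with the defaults listed in the table. resolve_loading is a hypothetical helper, not a PGX function, and rejecting unknown field names is a choice made for this example only.

```python
import json

# Defaults copied from Table 12-4.
LOADING_DEFAULTS = {
    "create_key_mapping": True,
    "filter": None,
    "grouping_by": [],
    "load_labels": False,
    "strict_mode": True,
}


def resolve_loading(loading):
    """Combine a partial loading object with the table defaults."""
    unknown = set(loading) - set(LOADING_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown loading fields: {sorted(unknown)}")
    return {**LOADING_DEFAULTS, **loading}


# Only two fields are set explicitly; the rest come from the defaults.
loading = json.loads('{"load_labels": true, "strict_mode": false}')
resolved = resolve_loading(loading)
print(json.dumps(resolved, indent=2))
```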

Error Handling Configuration

The error_handling entry is a JSON object with the following fields:

Table 12-5 Error Handling Configuration

Field Type Description Default
on_missed_prop_key enum[silent, log_warn, log_warn_once, error] Error handling for a missing property key. log_warn_once
on_missing_vertex enum[ignore_edge, ignore_edge_log, ignore_edge_log_once, create_vertex, create_vertex_log, create_vertex_log_once, error] Error handling for a missing source or destination vertex of an edge in a vertex data source. error
on_parsing_issue enum[silent, log_warn, log_warn_once, error] Error handling for incorrect data parsing. If set to silent, log_warn or log_warn_once, the loader will attempt to continue loading. Some parsing issues may not be recoverable and will cause loading to stop. error
on_prop_conversion enum[silent, log_warn, log_warn_once, error] Error handling when encountering a property type different from the one specified, where coercion is possible. log_warn_once
on_type_mismatch enum[silent, log_warn, log_warn_once, error] Error handling when encountering a property type different from the one specified, where coercion is not possible. error
on_vector_length_mismatch enum[silent, log_warn, log_warn_once, error] Error handling for a vector property that does not have the correct dimension. error

Note:

The only supported setting for the on_missing_vertex error handling configuration is ignore_edge.
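
A configuration can be checked against Table 12-5 and the note above before it is used. The following is a rough sketch rather than PGX's own validation: find_problems is a hypothetical helper, and the sample error_handling object is invented. The allowed values come from the table, with on_missing_vertex narrowed to the single setting the note reports as supported.

```python
import json

LOG_LEVELS = {"silent", "log_warn", "log_warn_once", "error"}

# Allowed values per field, from Table 12-5. on_missing_vertex is limited
# to ignore_edge, following the note on supported settings.
ALLOWED_VALUES = {
    "on_missed_prop_key": LOG_LEVELS,
    "on_missing_vertex": {"ignore_edge"},
    "on_parsing_issue": LOG_LEVELS,
    "on_prop_conversion": LOG_LEVELS,
    "on_type_mismatch": LOG_LEVELS,
    "on_vector_length_mismatch": LOG_LEVELS,
}


def find_problems(error_handling):
    """Return a list of messages for unknown fields or unsupported values."""
    problems = []
    for field, value in error_handling.items():
        if field not in ALLOWED_VALUES:
            problems.append(f"unknown field: {field}")
        elif value not in ALLOWED_VALUES[field]:
            problems.append(f"unsupported value for {field}: {value}")
    return problems


error_handling = json.loads("""
{
  "on_missing_vertex": "ignore_edge",
  "on_parsing_issue": "log_warn",
  "on_type_mismatch": "error"
}
""")

print(find_problems(error_handling))
print(find_problems({"on_missing_vertex": "create_vertex"}))
```

The first call reports no problems; the second flags create_vertex, which appears in the enum but is not a supported setting according to the note.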