PGX 20.1.1
Documentation

Engine and Runtime Configuration Guide

This chapter explains the configuration options that the PGX engine and runtime provide and how to set them. The PGX runtime is the library that generated Green-Marl code requires to run. The PGX engine uses the runtime to execute built-in and runtime-generated algorithms.

What Can Be Configured?

You can configure both the engine and the runtime by providing a single JSON file to the PGX engine at startup time.

PGX Engine Fields

Field | Type | Description | Default
admin_request_cache_timeout | integer | after how many seconds admin request results get removed from the cache. Requests which are not done or not yet consumed are excluded from this timeout. Note: this is only relevant if PGX is deployed as a webapp. | 60
allow_idle_timeout_overwrite | boolean | if true, sessions can overwrite default idle timeout | true
allow_local_filesystem (deprecated) | boolean | (This flag reduces security, enable only if you know what you are doing!) Allow loading from local filesystem, if in client/server mode. The list of directories that are allowed to be read should be listed in datasource_dir_whitelist. WARNING: This should only be enabled if you want to explicitly allow users of the PGX remote interface to access files on the local filesystem. Deprecated: since 20.1.1, define file-locations and use permissions instead | false
allow_override_scheduling_information | boolean | if true, allow all users to override scheduling information like task weight, task priority and number of threads | true
allow_task_timeout_overwrite | boolean | if true, sessions can overwrite default task timeout | true
allow_user_auto_refresh | boolean | if true, users may enable auto refresh for graphs they load. If false, only graphs mentioned in preload_graphs can have auto refresh enabled | false
allowed_remote_loading_locations | array of string | (This option may reduce security, use it only if you know what you are doing!) Allow loading graphs into the PGX engine from remote locations. If empty, as by default, no remote location is allowed. Any of the following locations can be listed: "https", "ftps", "s3", "hdfs" (all without colon ":"). Alternatively, "*" can be used to enable all locations at once; no other value is allowed with "*". Note that pre-loaded graphs are loaded from any location, regardless of the value of this setting | []
authorization | array of object | mapping of users and roles to resources and permissions for authorization | []
authorization_session_create_allow_all | boolean | if true, allow all users to create a PGX session regardless of permissions granted to them | false
basic_scheduler_config | object | configuration parameters for the fork join pool backend | null
bfs_iterate_que_task_size | integer | task size for BFS iterate QUE phase | 128
bfs_threshold_parent_read_based | number | threshold of BFS traversal level items above which to switch to parent-read-based visiting strategy | 0.05
bfs_threshold_read_based | integer | threshold of BFS traversal level items above which to switch to read-based visiting strategy | 1024
bfs_threshold_single_threaded | integer | until what number of BFS traversal level items vertices are visited single-threaded | 128
cctrace (deprecated) | boolean | if true, log every call to a Control or Core interface. Deprecated: (ignored since 2.7.0) use the client configuration enable_cctrace instead | false
cctrace_out (deprecated) | string | [relevant for cctrace] when cctrace is enabled, specifies a path to a file where cctrace should log to. If null, it will use the default PGX logger on level TRACE. If it is the special value :stderr:, it will log to stderr. Deprecated: (ignored since 2.7.0) use the corresponding client configuration instead | null
cctrace_print_stacktraces (deprecated) | boolean | [relevant for cctrace] when cctrace is enabled, print the stacktrace for each request and result. Deprecated: (ignored since 2.7.0) use the corresponding client configuration instead | false
character_set | string | standard charset to use throughout PGX, UTF-8 will be used as default. Note: Some formats may not be compatible. | utf-8
cni_diff_factor_default | integer | default diff factor value used in the common neighbor iterator implementations. | 8
cni_small_default | integer | default value used in the common neighbor iterator implementations, to indicate below which threshold a subarray is considered small. | 128
cni_stop_recursion_default | integer | default value used in the common neighbor iterator implementations, to indicate the minimum size where the binary search approach is applied. | 96
datasource_dir_whitelist (deprecated) | array of string | if allow_local_filesystem is set, the list of directories from which it is allowed to read files. Deprecated: since 20.1.1, define file-locations and use permissions instead | []
dfs_threshold_large | integer | value that determines at which number of visited vertices the DFS implementation will switch to data-structures that are more optimized for larger numbers of vertices. | 4096
enable_csrf_token_checks | boolean | if true, the PGX webapp will verify CSRF token cookie and request parameters sent by the client exist and match. This is to prevent CSRF attacks. | true
enable_gm_compiler | boolean | if true, enable dynamic compilation of Green-Marl code during runtime | true
enable_graph_loading_cache | boolean | if true, activate the graph loading cache that will accelerate loading of graphs that were previously loaded (can only be disabled in embedded mode) | true
enable_shutdown_cleanup_hook | boolean | if true, PGX will add a JVM shutdown hook that will automatically shut down PGX at JVM shutdown. Notice: Having the shutdown hook deactivated and not shutting down PGX explicitly may result in pollution of your temp directory. | true
enable_solaris_studio_labeling (deprecated) | boolean | [relevant when profiling with solaris studio] when enabled, label experiments using the 'er_label' command. Deprecated: since 2.6.0, this feature is not available anymore | false
enterprise_scheduler_config | object | configuration parameters for the enterprise scheduler | null
enterprise_scheduler_flags | object | [relevant for enterprise_scheduler] enterprise scheduler specific settings. | null
explicit_spin_locks | boolean | true means spin explicitly in a loop until lock becomes available. false means using JDK locks which rely on the JVM to decide whether to context switch or spin. Our experiments showed that setting this value to true results in better performance. | true
file_locations | array of object | the file-locations that can be used in the authorization-config | []
graph_algorithm_language | enum[GM_LEGACY, GM, JAVA] | Frontend compiler to use. | gm
graph_validation_level | enum[low, high] | level of validation performed on newly loaded or created graphs | low
graphs (deprecated) | array of string | list of paths to graph configs to be registered at start-up. This field is deprecated and replaced by preload_graphs field. Deprecated: use preload_graphs instead | []
ignore_incompatible_backend_operations | boolean | if true, only log when encountering incompatible operations and configuration values in RTS or FJ pool. If false, throw exceptions | false
in_place_update_consistency_model | enum[ALLOW_INCONSISTENCIES, CANCEL_TASKS] | Consistency model used when in-place updates occur. Only relevant if in-place updates are enabled. Currently updates are only applied in place if the updates are not structural (that is, they only modify properties). Two models are currently implemented: one only delays new tasks when an update occurs, the other also affects running tasks. | allow_inconsistencies
init_pgql_on_startup | boolean | if true, PGQL is directly initialized on start-up of PGX. Otherwise, it is initialized during the first use of PGQL. | true
interval_to_poll_max | integer | Exponential backoff upper bound (in ms). Once this bound is reached, the job status polling interval stays fixed at it | 1000
java_home_dir | string | The path to Java's home directory. If set to <system-java-home-dir>, use the java.home system property. | null
large_array_threshold | integer | threshold when the size of an array is too big to use a normal Java array. This depends on the used JVM. (defaults to Integer.MAX_VALUE - 3) | 2147483644
max_active_sessions | integer | the maximum number of sessions allowed to be active at a time | 1024
max_distinct_strings_per_pool | integer | [only relevant if string_pooling_strategy is indexed] amount of distinct strings per property after which to stop pooling. If the limit is reached, an exception is thrown | 65536
max_http_client_request_size | long | maximum size in bytes of any http request sent to the PGX server over the REST API. Setting it to -1 allows requests of any size. | 10485760
max_off_heap_size | integer | maximum amount of off-heap memory PGX is allowed to allocate in megabytes, before an OutOfMemoryError will be thrown. Note: this limit is not guaranteed to never be exceeded because of rounding and synchronization trade-offs. It only serves as threshold when PGX starts to reject new memory allocation requests. | <available-physical-memory>
max_queue_size_per_session | integer | the maximum number of pending tasks allowed to be in the queue, per session. If a session reaches the maximum, new incoming requests of that session get rejected. A negative value means infinity / unlimited. | -1
max_snapshot_count | integer | Number of snapshots that may be loaded in the engine at the same time. New snapshots can be created via auto or forced update. If the number of snapshots of a graph reaches this threshold, no more auto-updates will be performed and a forced update will result in an exception until one or more snapshots are removed from memory. A value of zero indicates to support an unlimited amount of snapshots. | 0
memory_allocator | enum[basic_allocator, enterprise_allocator] | which memory allocator to use | basic_allocator
memory_cleanup_interval | integer | memory cleanup tick in seconds | 600
min_array_compaction_threshold | number | [only relevant for graphs optimized for updates] minimum value that can be used for the array_compaction_threshold value in graph configuration. If a graph configuration attempts to use a value lower than the one specified by min_array_compaction_threshold, it will use min_array_compaction_threshold instead | 0.2
min_fetch_interval_sec | integer | (only relevant if the graph format supports delta updates) for delta-refresh, the lowest interval at which a graph source is queried for changes. You can tune this value to prevent PGX from hanging due to too frequent graph delta-refreshing. | 2
min_update_interval_sec | integer | for auto-refresh, the lowest interval after which a new snapshot is created, either by reloading the entire graph or, if the format supports delta-updates, out of the cached changes (only relevant if the format supports delta updates). You can tune this value to prevent PGX from hanging due to too frequent graph auto-refreshing. | 2
ms_bfs_frontier_type_strategy | enum[auto_grow, short, int] | the type strategy to use for MS-BFS frontiers | auto_grow
num_spin_locks | integer | how many spin locks each generated app will create at instantiation. Trade-off: small number implies less memory consumption. Big number implies faster execution (if algorithm uses spin locks) | 1024
num_workers_analysis (deprecated) | integer | how many worker threads to use for analysis tasks. Deprecated: since 2.3.0, use fj_pool_config.num_workers_analysis instead | <no-of-cpus>
num_workers_fast_track_analysis (deprecated) | integer | how many worker threads to use for fast-track analysis tasks. Deprecated: since 2.3.0, use fj_pool_config.num_workers_fast_track_analysis instead | 1
num_workers_io (deprecated) | integer | how many worker threads to use for I/O tasks (load/refresh/write from/to disk). This value won't affect file-based loaders, as they're always single-threaded. Database loaders will open a new connection for each I/O worker. Deprecated: since 2.3.0, use fj_pool_config.num_workers_io instead | <no-of-cpus>
parallelism | integer | number of worker threads to be used in thread pool. Note: if caller thread is part of another thread-pool, this value is ignored and parallelism of parent pool is used. | <no-of-cpus>
path_to_gm_compiler (deprecated) | string | if set, use this path to gm_comp binary for dynamic compilations. If not set and enable_gm_compiler is true, PGX will try to utilize one of the built-in compiler binaries. Deprecated: since 3.2.0, gm_comp is not supported anymore | null
pattern_matching_semantic (deprecated) | enum[isomorphism, homomorphism] | the graph pattern matching semantic which is either homomorphism or isomorphism. This field is deprecated and replaced by the built-in PGQL function all_different(v1, v2, ...) that allows for specifying that the pattern and the match should be isomorphic to each other. Deprecated: use the built-in PGQL function all_different(v1, v2, ...) instead | homomorphism
pattern_matching_supernode_cache_threshold | integer | minimum number of neighbors for a node to be considered a supernode. This applies to the pattern matching engine. | 1000
pgx_realm | object | configuration parameters for the realm | null
pooling_factor | number | [only relevant if string_pooling_strategy is on_heap] this value prevents the string pool from growing as big as the property size, which could render the pooling ineffective | 0.25
preload_graphs | array of object | list of graph configs to be registered at start-up. Each item includes path to a graph config, the name of the graph and whether it should be published. | []
random_generator_strategy | enum[non_deterministic, deterministic] | method of generating random numbers in pgx | non_deterministic
random_seed | long | [relevant for deterministic random number generator only] seed for the deterministic random number generator used in pgx. The default is -24466691093057031 | -24466691093057031
release_memory_threshold | number | threshold percentage of used memory after which the engine starts freeing un-used graphs. Examples: A value of 0.0 means graphs get freed as soon as their reference count becomes zero. That is, all sessions which loaded that graph were destroyed/timed out. A value of 1.0 means graphs never get freed. Engine will throw OutOfMemoryErrors as soon as a graph is needed which doesn't fit in memory anymore. A value of 0.7 means the engine keeps all graphs in memory as long as total memory consumption is below 70% of total available memory, even if there is currently no session using them. Once the 70% are surpassed and another graph needs to get loaded, un-used graphs get freed until memory consumption is below 70% again. | 0.85
revisit_threshold | integer | maximum number of matched results from a node to be cached | 4096
scheduler | enum[basic_scheduler, enterprise_scheduler, low_latency_scheduler] | which scheduler to use. basic_scheduler = use scheduler with basic features. enterprise_scheduler = use scheduler with advanced, enterprise features for running multiple tasks concurrently and increased performance. low_latency_scheduler = use scheduler that privileges latency of tasks over throughput or fairness across multiple sessions. The low_latency_scheduler is only available in embedded mode | enterprise_scheduler
session_idle_timeout_secs | integer | timeout of idling sessions in seconds. Set to zero for infinity/no timeout | 0
session_task_timeout_secs | integer | timeout to interrupt long-running tasks submitted by sessions (algorithms, I/O tasks) in seconds. Set to zero for infinity/no timeout | 0
small_task_length | integer | task length if total amount of work is smaller than default task length (only relevant for task-stealing strategies) | 128
strict_mode | boolean | if true, exceptions are thrown and logged with ERROR level whenever engine encounters configuration problems, such as invalid keys, mismatches and other potential errors. If false, engine logs problems with ERROR/WARN level (depending on severity) and makes best guesses / uses sensible defaults instead of throwing exceptions. | true
string_pooling_strategy | enum[indexed, on_heap, none] | [only relevant if use_string_pool is enabled] which string pooling strategy to use | on_heap
task_length | integer | default task length (only relevant for task-stealing strategies). F/J pool documentation says this value should be between 100 and 10000. Trade-off: small number implies more fine-grained tasks are generated, higher stealing throughput. High number implies less memory consumption and GC activity | 4096
tmp_dir | string | Use this path as temporary directory to store compilation artifacts and other temporary data. If set to <system-tmp-dir>, use standard tmp directory of underlying system (/tmp on Linux) | null
udf_config_directory | string | directory path containing udf config files | null
use_memory_mapper_for_reading_pgb | boolean | if true, use memory mapped files for reading graphs in PGB format if possible; if false, always use a stream based implementation | true
use_memory_mapper_for_storing_pgb | boolean | if true, use memory mapped files for storing in PGB format if possible; if false, always use a stream based implementation | true
use_string_pool (deprecated) | boolean | If true, PGX will store string properties in a pool in order to consume less memory on string properties. Deprecated: since 3.3.0, this setting is no longer valid, to disable the string pooling set string_pooling_strategy to none | true
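
The release_memory_threshold description in the table above amounts to a comparison between the used-memory fraction and the threshold. The following is a minimal sketch of that comparison only. The class and method names are hypothetical, and the actual eviction logic inside PGX is not documented here and may differ.

```java
// Illustrative only: not PGX code. Models the documented threshold check.
final class ReleaseThresholdSketch {

    // Returns true when un-used graphs would be freed before another graph is
    // loaded, i.e. when the used fraction of memory exceeds the threshold.
    static boolean shouldFreeUnusedGraphs(long usedBytes, long totalBytes,
                                          double releaseMemoryThreshold) {
        double usedFraction = (double) usedBytes / totalBytes;
        return usedFraction > releaseMemoryThreshold;
    }

    public static void main(String[] args) {
        // 80% used with a threshold of 0.7: un-used graphs get freed
        System.out.println(shouldFreeUnusedGraphs(80, 100, 0.7));
        // A threshold of 1.0 can never be exceeded: graphs never get freed
        System.out.println(shouldFreeUnusedGraphs(100, 100, 1.0));
    }
}
```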

Deprecated fields

Beginning with PGX version 2.3.0, the fields num_workers_io, num_workers_analysis and num_workers_fast_track_analysis have been moved to the basic scheduler settings (basic_scheduler_config). For more information, see the changelog.

Enterprise Scheduler Fields

Ignored for basic_scheduler

When the scheduler is set to basic_scheduler, these fields are ignored.

Field | Type | Description | Default
analysis_task_config | object | configuration for analysis tasks | priority: medium, weight: <no-of-CPUs>, max_threads: <no-of-CPUs>
fast_analysis_task_config | object | configuration for fast analysis tasks | priority: high, weight: 1, max_threads: <no-of-CPUs>
max_num_concurrent_io_tasks | integer | maximum number of concurrent io tasks at a time | 3
num_io_threads_per_task | integer | number of io threads to use per task | <no-of-cpus>

For more information on how to configure the enterprise scheduler see the enterprise scheduler configuration reference.

Basic Scheduler Fields

Ignored for enterprise_scheduler

When the scheduler is set to enterprise_scheduler, these fields are ignored.

Field | Type | Description | Default
num_workers_analysis | integer | how many worker threads to use for analysis tasks | <no-of-cpus>
num_workers_fast_track_analysis | integer | how many worker threads to use for fast-track analysis tasks | 1
num_workers_io | integer | how many worker threads to use for I/O tasks (load/refresh/write from/to disk). This value won't affect file-based loaders, as they're always single-threaded. Database loaders will open a new connection for each I/O worker. | <no-of-cpus>

PGX Runtime Fields

Field | Type | Description | Default
bfs_iterate_que_task_size | integer | task size for BFS iterate QUE phase | 128
bfs_threshold_parent_read_based | number | threshold of BFS traversal level items above which to switch to parent-read-based visiting strategy | 0.05
bfs_threshold_read_based | integer | threshold of BFS traversal level items above which to switch to read-based visiting strategy | 1024
bfs_threshold_single_threaded | integer | until what number of BFS traversal level items vertices are visited single-threaded | 128
character_set | string | standard charset to use throughout PGX, UTF-8 will be used as default. Note: Some formats may not be compatible. | utf-8
cni_diff_factor_default | integer | default diff factor value used in the common neighbor iterator implementations. | 8
cni_small_default | integer | default value used in the common neighbor iterator implementations, to indicate below which threshold a subarray is considered small. | 128
cni_stop_recursion_default | integer | default value used in the common neighbor iterator implementations, to indicate the minimum size where the binary search approach is applied. | 96
dfs_threshold_large | integer | value that determines at which number of visited vertices the DFS implementation will switch to data-structures that are more optimized for larger numbers of vertices. | 4096
enable_solaris_studio_labeling (deprecated) | boolean | [relevant when profiling with solaris studio] when enabled, label experiments using the 'er_label' command. Deprecated: since 2.6.0, this feature is not available anymore | false
enterprise_scheduler_flags | object | [relevant for enterprise_scheduler] enterprise scheduler specific settings. | null
explicit_spin_locks | boolean | true means spin explicitly in a loop until lock becomes available. false means using JDK locks which rely on the JVM to decide whether to context switch or spin. Our experiments showed that setting this value to true results in better performance. | true
graph_validation_level | enum[low, high] | level of validation performed on newly loaded or created graphs | low
max_distinct_strings_per_pool | integer | [only relevant if string_pooling_strategy is indexed] amount of distinct strings per property after which to stop pooling. If the limit is reached, an exception is thrown | 65536
max_off_heap_size | integer | maximum amount of off-heap memory PGX is allowed to allocate in megabytes, before an OutOfMemoryError will be thrown. Note: this limit is not guaranteed to never be exceeded because of rounding and synchronization trade-offs. It only serves as threshold when PGX starts to reject new memory allocation requests. | <available-physical-memory>
memory_allocator | enum[basic_allocator, enterprise_allocator] | which memory allocator to use | basic_allocator
ms_bfs_frontier_type_strategy | enum[auto_grow, short, int] | the type strategy to use for MS-BFS frontiers | auto_grow
num_spin_locks | integer | how many spin locks each generated app will create at instantiation. Trade-off: small number implies less memory consumption. Big number implies faster execution (if algorithm uses spin locks) | 1024
pattern_matching_supernode_cache_threshold | integer | minimum number of neighbors for a node to be considered a supernode. This applies to the pattern matching engine. | 1000
pooling_factor | number | [only relevant if string_pooling_strategy is on_heap] this value prevents the string pool from growing as big as the property size, which could render the pooling ineffective | 0.25
random_generator_strategy | enum[non_deterministic, deterministic] | method of generating random numbers in pgx | non_deterministic
random_seed | long | [relevant for deterministic random number generator only] seed for the deterministic random number generator used in pgx. The default is -24466691093057031 | -24466691093057031
revisit_threshold | integer | maximum number of matched results from a node to be cached | 4096
scheduler | enum[basic_scheduler, enterprise_scheduler, low_latency_scheduler] | which scheduler to use. basic_scheduler = use scheduler with basic features. enterprise_scheduler = use scheduler with advanced, enterprise features for running multiple tasks concurrently and increased performance. low_latency_scheduler = use scheduler that privileges latency of tasks over throughput or fairness across multiple sessions. The low_latency_scheduler is only available in embedded mode | enterprise_scheduler
small_task_length | integer | task length if total amount of work is smaller than default task length (only relevant for task-stealing strategies) | 128
string_pooling_strategy | enum[indexed, on_heap, none] | [only relevant if use_string_pool is enabled] which string pooling strategy to use | on_heap
task_length | integer | default task length (only relevant for task-stealing strategies). F/J pool documentation says this value should be between 100 and 10000. Trade-off: small number implies more fine-grained tasks are generated, higher stealing throughput. High number implies less memory consumption and GC activity | 4096
use_memory_mapper_for_reading_pgb | boolean | if true, use memory mapped files for reading graphs in PGB format if possible; if false, always use a stream based implementation | true
use_memory_mapper_for_storing_pgb | boolean | if true, use memory mapped files for storing in PGB format if possible; if false, always use a stream based implementation | true
use_string_pool (deprecated) | boolean | If true, PGX will store string properties in a pool in order to consume less memory on string properties. Deprecated: since 3.3.0, this setting is no longer valid, to disable the string pooling set string_pooling_strategy to none | true

Examples

Minimal Example

This configuration example causes PGX to initialize its analysis thread pool with a maximum of 32 workers. For all other fields, defaults will be used (see the tables above).

{
  "enterprise_scheduler_config": {
    "analysis_task_config": {
      "max_threads": 32
    }
  }
}

Example with Two Fixed Graphs

This example sets more fields and specifies two fixed graphs to load into memory at PGX startup. Referencing graph configuration files avoids redundancy when you need the same graph configuration both for preloading and as a stand-alone file to reference the graph later. Find details on how to specify graphs here.

{ 
  "enterprise_scheduler_config": {
    "analysis_task_config": {
      "max_threads": 32
    },
    "fast_analysis_task_config": {
      "max_threads": 32
    }
  }, 
  "memory_cleanup_interval": 600,
  "max_active_sessions": 1, 
  "release_memory_threshold": 0.2, 
  "preload_graphs": [
    {
      "path": "graph-configs/my-graph.bin.json",
      "name": "my-graph"
    },
    {
      "path": "graph-configs/my-other-graph.adj.json",
      "name": "my-other-graph",
      "publish": false
    }
  ]
}

How relative paths are resolved

Relative paths in PGX configuration files are always resolved relative to the parent directory of the configuration file in which they are specified. If the JSON above were in a file /pgx/conf/pgx.conf, then the path graph-configs/my-graph.bin.json inside that file would be resolved to /pgx/conf/graph-configs/my-graph.bin.json.
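
The resolution rule can be reproduced with the standard java.nio.file API. This is a minimal sketch, not PGX code: ConfigPathResolver is a hypothetical helper, and leaving absolute paths untouched is an assumption, since the text above only covers relative paths.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the PGX API: mimics resolving a relative
// path against the parent directory of the configuration file containing it.
final class ConfigPathResolver {

    static Path resolve(String configFile, String pathInConfig) {
        Path candidate = Paths.get(pathInConfig);
        if (candidate.isAbsolute()) {
            // Assumption: absolute paths are used as they are
            return candidate.normalize();
        }
        // resolveSibling swaps the file name of configFile for the relative path
        return Paths.get(configFile).resolveSibling(candidate).normalize();
    }

    public static void main(String[] args) {
        System.out.println(
            resolve("/pgx/conf/pgx.conf", "graph-configs/my-graph.bin.json"));
    }
}
```

On a Unix-like system, the main method prints /pgx/conf/graph-configs/my-graph.bin.json, matching the example above.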

Publish a preloaded graph

Preloaded graphs are published by default. See the Publish a graph tutorial for more details.

Example with Non-default Runtime Fields

This example also configures some fields of the PGX runtime.

{ 
  "basic_scheduler_config": {
    "num_workers_analysis": 32
  },
  "scheduler": "basic_scheduler",
  "num_spin_locks": 128,
  "task_length": 1024
}

Performance tuning

The defaults of the runtime configuration fields are optimized in order to deliver the best performance across a broad set of algorithms. Depending on your workload, you may be able to further improve performance by experimenting with different strategies, sizes and thresholds.

How to Pass Configuration to PGX

The PGX engine config file is parsed by PGX at startup-time whenever ServerInstance#startEngine() (or any of its variants) is called. You can either pass the path to your configuration file to PGX or provide the configuration programmatically:

Programmatically

All configuration fields exist as Java enums:

Map<PgxConfig.Field, Object> pgxCfg = new HashMap<>();
pgxCfg.put(PgxConfig.Field.MEMORY_CLEANUP_INTERVAL, 600);

ServerInstance instance = ...
instance.startEngine(pgxCfg);

All fields not explicitly set will get default values.
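
The fallback to defaults can be pictured as overlaying the explicitly set fields on a full map of defaults. The sketch below uses a stand-in enum with two fields and their defaults taken from the table above. It is not the real PgxConfig.Field, and how PGX merges values internally is an assumption.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative only: a stand-in enum, not the real PgxConfig.Field.
final class DefaultsSketch {

    enum Field {
        MEMORY_CLEANUP_INTERVAL(600),
        MAX_ACTIVE_SESSIONS(1024);

        final Object defaultValue;

        Field(Object defaultValue) {
            this.defaultValue = defaultValue;
        }
    }

    // Start from the defaults, then overlay whatever was set explicitly.
    static Map<Field, Object> effectiveConfig(Map<Field, Object> overrides) {
        Map<Field, Object> effective = new EnumMap<>(Field.class);
        for (Field field : Field.values()) {
            effective.put(field, field.defaultValue);
        }
        effective.putAll(overrides);
        return effective;
    }

    public static void main(String[] args) {
        Map<Field, Object> overrides = new EnumMap<>(Field.class);
        overrides.put(Field.MAX_ACTIVE_SESSIONS, 1);
        // MEMORY_CLEANUP_INTERVAL keeps its default, MAX_ACTIVE_SESSIONS is overridden
        System.out.println(effectiveConfig(overrides));
    }
}
```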

Explicit via File

Instead of a map, you can pass the path to a PGX configuration JSON file:

instance.startEngine("path/to/pgx.conf"); // file on local file system
instance.startEngine("hdfs:/path/to/pgx.conf"); // file on HDFS (requires $HADOOP_CONF_DIR on the classpath)
instance.startEngine("https:/path/to/pgx.conf"); // file on HTTPS
instance.startEngine("classpath:/path/to/pgx.conf"); // file on current classpath

For all other protocols, you can pass an input stream of the JSON configuration directly:

InputStream is = ...
instance.startEngine(is); 
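
One way to obtain such a stream, for example when the JSON is assembled in memory, is to wrap the text in a ByteArrayInputStream. This is a sketch using only the JDK. InMemoryConfigStream is a hypothetical name, and the stream is read back here instead of being handed to PGX so that the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: turns a JSON configuration string into an InputStream.
final class InMemoryConfigStream {

    // UTF-8 is chosen to match the default character_set listed above.
    static InputStream asStream(String json) {
        return new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String json = "{ \"memory_cleanup_interval\": 600 }";
        try (InputStream is = asStream(json)) {
            // With PGX on the classpath, the stream would be passed to
            // instance.startEngine(is); here it is only read back.
            System.out.println(new String(is.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```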

Implicit via File

If startEngine() is called without an argument, PGX looks for a configuration file at the following places. In case of a conflict, priority is from top to bottom:

  • File path as found in Java system property pgx_conf. Example: java -Dpgx_conf=conf/my.pgx.config.json ...
  • A file named pgx.conf in the root directory of the current classpath
  • A file named pgx.conf in the root directory relative to the current System.getProperty("user.dir") directory

Note: Providing a PGX configuration is optional. A default value for each field (see table above) will be used if the field cannot be found in the given configuration file, or if no configuration file is provided.
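
The lookup order listed above can be sketched with plain JDK calls. This is an illustration of the documented order only, not the PGX implementation. ConfigLookupSketch and its return values are hypothetical, and the sketch does not check whether the file named by pgx_conf exists.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Illustrative only: mirrors the documented lookup order, it is not PGX code.
final class ConfigLookupSketch {

    static Optional<String> locate() {
        // 1. Path given through the pgx_conf system property
        String fromProperty = System.getProperty("pgx_conf");
        if (fromProperty != null) {
            return Optional.of(fromProperty);
        }
        // 2. A file named pgx.conf in the root of the current classpath
        if (ClassLoader.getSystemResource("pgx.conf") != null) {
            return Optional.of("classpath:/pgx.conf");
        }
        // 3. A file named pgx.conf in the user.dir directory
        Path inWorkingDir = Paths.get(System.getProperty("user.dir"), "pgx.conf");
        if (Files.isRegularFile(inWorkingDir)) {
            return Optional.of(inWorkingDir.toString());
        }
        // Nothing found: every field falls back to its default
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(locate().orElse("<defaults only>"));
    }
}
```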

Local PGX Shell

To change how the shell configures the local PGX instance, edit $PGX_HOME/conf/pgx.conf. Changes will be reflected the next time you invoke $PGX_HOME/bin/pgx-jshell. You may also change the location of the configuration file with

./bin/pgx-jshell --pgx_conf path/to/my/other/pgx.conf

System Properties

Also, any PGX engine or runtime field can be set via Java system properties by passing -Dpgx.<FIELD>=<VALUE> arguments to the JVM that PGX is running on. Note: Setting system properties will overwrite any other configuration. For example, to set the max off-heap size to 256GB, regardless of what any other configuration says, use

java -Dpgx.max_off_heap_size=256000 ...

You can also use system properties to set nested configuration fields, such as those of the enterprise scheduler configuration. The <FIELD> is formed as <CONFIG_FIELD1>__<CONFIG_FIELD2>.

In the following example we will set the default number of IO threads to 4 using system properties:

java -Dpgx.enterprise_scheduler_config__num_io_threads_per_task=4 ...

You can use system properties to provide a single value for an array. Providing multiple values is not possible.
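
The naming rule for system properties can be expressed as a small string helper. The class below is a hypothetical illustration of the pgx.<FIELD> and double-underscore conventions described above. It is not part of PGX.

```java
// Hypothetical helper: builds the system property name for a PGX field.
final class SystemPropertyKeys {

    // Nested field names are joined with a double underscore and prefixed with "pgx."
    static String key(String... fieldPath) {
        return "pgx." + String.join("__", fieldPath);
    }

    // Full JVM argument in the form -Dpgx.<FIELD>=<VALUE>
    static String jvmArgument(String value, String... fieldPath) {
        return "-D" + key(fieldPath) + "=" + value;
    }

    public static void main(String[] args) {
        System.out.println(jvmArgument("256000", "max_off_heap_size"));
        System.out.println(
            jvmArgument("4", "enterprise_scheduler_config", "num_io_threads_per_task"));
    }
}
```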

Environment Variables

In addition to system properties, any PGX engine or runtime field can also be set via environment variables by adding PGX_<FIELD>=<VALUE> to the environment of the JVM that PGX runs in. Note: Setting environment variables will overwrite any other configuration. If a system property and an environment variable are set for the same field, the system property takes priority over the environment variable. For example, to set the max off-heap size to 256GB using environment variables, use

PGX_MAX_OFF_HEAP_SIZE=256000 java ...

Nested configuration fields can be set in the same way as when using system properties (see section above).

To set the default number of IO threads to 4 using environment variables you can do:

PGX_ENTERPRISE_SCHEDULER_CONFIG__NUM_IO_THREADS_PER_TASK=4 java ...

You can use environment variables to provide a single value for an array. Providing multiple values is not possible.
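
The precedence described in this section (system property first, then environment variable, then the value from the configuration file) can be sketched as follows. OverridePrecedence is a hypothetical class. The maps stand in for the real JVM properties and process environment so that the example is self-contained, and a null file value stands for "not set in the file".

```java
import java.util.Locale;
import java.util.Map;

// Illustrative only: models the documented override precedence, it is not PGX code.
final class OverridePrecedence {

    // PGX_<FIELD> in upper case, nested fields joined with a double underscore
    static String envName(String... fieldPath) {
        return "PGX_" + String.join("__", fieldPath).toUpperCase(Locale.ROOT);
    }

    // System property wins over environment variable, which wins over the file value.
    static String resolve(Map<String, String> systemProperties,
                          Map<String, String> environment,
                          String fileValue,
                          String... fieldPath) {
        String fromProperty = systemProperties.get("pgx." + String.join("__", fieldPath));
        if (fromProperty != null) {
            return fromProperty;
        }
        String fromEnvironment = environment.get(envName(fieldPath));
        if (fromEnvironment != null) {
            return fromEnvironment;
        }
        return fileValue; // may be null, in which case the default applies
    }

    public static void main(String[] args) {
        System.out.println(envName("enterprise_scheduler_config", "num_io_threads_per_task"));
        System.out.println(resolve(
            Map.of("pgx.max_off_heap_size", "256000"),
            Map.of("PGX_MAX_OFF_HEAP_SIZE", "128000"),
            "64000",
            "max_off_heap_size"));
    }
}
```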

How to Get the Configuration of the Current PGX Server

You can get the current PGX Server configuration by calling ServerInstance#getPgxConfigObject() (or any of its variants) like so:

//blocking variant
PgxConfig pgxConfig = instance.getPgxConfigObject();

//or the async variant
PgxFuture<PgxConfig> future = instance.getPgxConfigObjectAsync();

Do not use PgxConfig.getInstance() to get the current PGX Server configuration. It returns a new PgxConfig instance initialized with default values, which is a separate configuration and not the one the PGX Server is actually using.