3. Managing Sessions

The PgxSession class can be used to manage and execute operations on the underlying session.

class pypgx.api.PgxSession(java_session)

A PGX session represents an active user connected to a ServerInstance.

Every session gets a workspace assigned on the server, which can be used to read graph data, create transient data or custom algorithms for the sake of graph analysis. Once a session gets destroyed, all data in the session workspace is freed.

Variables

LATEST_SNAPSHOT – The timestamp of the most recent snapshot, used to easily move to the newest snapshot (see set_snapshot())

Return type

None

close()

Close this session object.

Return type

None
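A session holds server-side workspace data until it is closed, so it can be convenient to guarantee that close() is always called. The following is a minimal sketch, not part of PyPGX: managed_session is a hypothetical helper name, and the only thing it assumes about its argument is the close() method documented above. Whether PgxSession supports the with statement on its own is not stated in this section.

```python
from contextlib import contextmanager


@contextmanager
def managed_session(session):
    """Yield the session and close it afterwards, even if the body raises.

    `session` is assumed to be any object exposing the documented close()
    method, for example a PgxSession.
    """
    try:
        yield session
    finally:
        # close() takes no arguments and returns None.
        session.close()
```

Inside a "with managed_session(session) as s:" block, s is the same session object, and close() runs when the block exits.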

compile_program(path, overwrite=False)

Compile a Green-Marl program for parallel execution with all optimizations enabled.

Parameters
  • path (str) – Path to program

  • overwrite (bool) – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise

Return type

pypgx.api._compiled_program.CompiledProgram

compile_program_code(code, overwrite=False)

Compile a Green-Marl program (if it is supported by the corresponding PyPGX distribution). Otherwise compile a Java program.

Parameters
  • code (str) – The Green-Marl/Java code to compile

  • overwrite (bool) – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise

Return type

pypgx.api._compiled_program.CompiledProgram

create_analyst()

Create and return a new analyst.

Returns

An analyst object

Return type

pypgx.api._analyst.Analyst

create_frame(schema, column_data, frame_name)

Create a frame with the specified data

Parameters
  • schema (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

  • column_data (Dict[str, List]) – Map of iterables, columnName -> columnData

  • frame_name (str) – Name of the frame

Returns

A frame with the specified data

Return type

pypgx.api.frames._pgx_frame.PgxFrame
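Because create_frame expects column-oriented data (columnName -> columnData), row-oriented records have to be transposed first. The sketch below shows one way to do that. rows_to_columns and frame_from_rows are hypothetical helper names, and the column type strings accepted by the server are not listed in this section, so any type names passed in the schema are the caller's responsibility.

```python
from typing import Any, Dict, List, Sequence, Tuple


def rows_to_columns(
    schema: List[Tuple[str, str]], rows: Sequence[Sequence[Any]]
) -> Dict[str, List[Any]]:
    """Transpose row tuples into the columnName -> columnData mapping."""
    names = [name for name, _column_type in schema]
    columns: Dict[str, List[Any]] = {name: [] for name in names}
    for index, row in enumerate(rows):
        if len(row) != len(names):
            raise ValueError(
                f"row {index} has {len(row)} values, expected {len(names)}"
            )
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


def frame_from_rows(session, schema, rows, frame_name):
    """Call the documented create_frame(schema, column_data, frame_name)."""
    return session.create_frame(schema, rows_to_columns(schema, rows), frame_name)
```

The helper only prepares the arguments; validation of the column types themselves is left to create_frame.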

create_frame_builder(schema)

Create a frame builder initialized with the given schema

Parameters

schema (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

A frame builder initialized with the given schema

Return type

pypgx.api.frames._pgx_frame_builder.PgxFrameBuilder

create_graph_builder(id_type='integer', vertex_id_generation_strategy='user_ids', edge_id_generation_strategy='auto_generated')

Create a graph builder with the given vertex ID type and ID generation strategies.

Parameters
  • id_type (str) – The type of the vertex ID

  • vertex_id_generation_strategy (str) – The vertex ID generation strategy to be used

  • edge_id_generation_strategy (str) – The edge ID generation strategy to be used

Return type

pypgx.api._graph_builder.GraphBuilder

create_map(key_type, value_type, name=None)

Create a map.

Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']

Parameters
  • key_type (str) – Property type of the keys that are going to be stored inside the map

  • value_type (str) – Property type of the values that are going to be stored inside the map

  • name (Optional[str]) – Map name

Returns

A named PgxMap of key content type key_type and value content type value_type

Return type

pypgx.api._pgx_map.PgxMap
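Since key_type and value_type are plain strings, a typo only surfaces once the call reaches the server. As a sketch, a client-side check against the type names listed above can fail earlier with a clearer message. create_checked_map and PROPERTY_TYPES are hypothetical names introduced here, not part of PyPGX.

```python
# Type names copied from the list of possible types documented for create_map.
PROPERTY_TYPES = frozenset(
    {
        "integer", "long", "double", "boolean", "string", "vertex", "edge",
        "local_date", "time", "timestamp", "time_with_timezone",
        "timestamp_with_timezone",
    }
)


def create_checked_map(session, key_type, value_type, name=None):
    """Validate both type names locally, then call session.create_map."""
    for label, value in (("key_type", key_type), ("value_type", value_type)):
        if value not in PROPERTY_TYPES:
            raise ValueError(
                f"{label}={value!r} is not one of {sorted(PROPERTY_TYPES)}"
            )
    return session.create_map(key_type, value_type, name)
```

The same check applies unchanged to the content_type argument of create_sequence and create_set, which document the same list of types.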

create_sequence(content_type, name=None)

Create a sequence of scalars.

Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']

Parameters
  • content_type (str) – Property type of the elements in the sequence

  • name (Optional[str]) – Sequence name

Returns

A named ScalarSequence of content type content_type

Return type

pypgx.api._pgx_collection.ScalarSequence

create_set(content_type, name=None)

Create a set of scalars.

Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']

Parameters
  • content_type (str) – content type of the set

  • name (Optional[str]) – the set’s name

Returns

A named ScalarSet of content type content_type

Return type

pypgx.api._pgx_collection.ScalarSet

describe_graph_file(file_path)

Describe the graph contained in the file at the given path.

Parameters

file_path (str) – Graph file path

Returns

The configuration which can be used to load the graph

Return type

pypgx.api._graph_config.GraphConfig

describe_graph_files(files_path)

Describe the graph contained in the files at the given paths.

Parameters

files_path (str) – Paths to the files

Returns

The configuration which can be used to load the graph

destroy()

Destroy this session object.

Return type

None

edge_provider_from_frame(provider_name, source_provider, destination_provider, frame, source_vertex_column='src', destination_vertex_column='dst')

Create an edge provider from a PgxFrame to later build a PgxGraph

Parameters
  • provider_name (str) – edge provider name

  • source_provider (str) – vertex source provider name

  • destination_provider (str) – vertex destination provider name

  • frame (pypgx.api.frames._pgx_frame.PgxFrame) – PgxFrame to use

  • source_vertex_column (str) – column to use as source keys. Defaults to “src”

  • destination_vertex_column (str) – column to use as destination keys. Defaults to “dst”

Returns

the EdgeFrameDeclaration object

Return type

pypgx.api.frames._edge_frame_declaration.EdgeFrameDeclaration

execute_pgql(pgql_query)

Submit any query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraphAsync(String).

Parameters

pgql_query (str) – Query string in PGQL

Returns

The query result set

Return type

Optional[pypgx.api._pgql_result_set.PgqlResultSet]

Raises InterruptedException if the caller thread is interrupted while waiting for completion. Raises ExecutionException if any exception occurs during asynchronous execution; the actual exception is nested.

explain_pgql(pgql_query)

Explain the execution plan of a pattern matching query.

Note: Different PGX versions may return different execution plans.

Parameters

pgql_query (str) – Query string in PGQL

Returns

The query plan

Return type

pypgx.api._operation.Operation

get_available_compiled_program_ids()

Get the set of available compiled program IDs.

Return type

Set[str]

get_available_snapshots(snapshot)

Return a list of all available snapshots of the given input graph.

Parameters

snapshot (pypgx.api._pgx_graph.PgxGraph) – A ‘PgxGraph’ object for which the available snapshots shall be retrieved

Returns

A list of ‘GraphMetaData’ objects, each corresponding to a snapshot of the input graph

Return type

List[pypgx.api._graph_meta_data.GraphMetaData]

get_compiled_program(id)

Get a compiled program by ID.

Parameters

id (str) – The id of the compiled program

Return type

pypgx.api._compiled_program.CompiledProgram
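The compile and lookup methods can be combined so that a program is only compiled when the session does not already hold it. The sketch below uses only methods documented in this section. get_or_compile is a hypothetical helper, and it assumes the caller already knows the ID under which the program is registered, since this section does not say how IDs are derived.

```python
def get_or_compile(session, program_id, path):
    """Reuse an already compiled program, or compile it from `path`.

    `program_id` is assumed to be the ID under which the program would be
    registered; the mapping from source file to ID is not described here.
    """
    if program_id in session.get_available_compiled_program_ids():
        return session.get_compiled_program(program_id)
    # overwrite=False (the default) raises if the procedure already exists.
    return session.compile_program(path, overwrite=False)
```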

get_graph(name, namespace=None)

Find and return a graph with name name within the given namespace loaded inside PGX.

The search for the snapshot to return is done according to the following rules:

  • if namespace is private, then the search occurs on already referenced snapshots of the graph with name name and the most recent snapshot is returned

  • if namespace is public, then the search occurs on published graphs and the most recent snapshot of the published graph with name name is returned

  • if namespace is None, then the private namespace is searched first and, if no snapshot is found, the public namespace is then searched

Multiple calls of this method with the same parameters will return different PgxGraph objects referencing the same graph, with the server keeping track of how many references a session has to each graph.

Therefore, a graph is released within the server if either:

  • all the references are moved to another graph (e.g. via set_snapshot())

  • the PgxGraph.destroy() method is called on one reference. Note that this invalidates all references

Parameters
  • name (str) – The name of the graph

  • namespace (Namespace or None) – The namespace where to look up the graph

Returns

The graph with the given name

Return type

PgxGraph or None
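The namespace lookup order described above can be modelled in a few lines. This is only an illustration of the documented rules, not how the server implements them: the two dictionaries stand in for the session-private and published graphs, and the strings "private" and "public" are placeholders for the real Namespace values.

```python
def resolve_graph(name, namespace, private_graphs, public_graphs):
    """Model of the get_graph lookup order.

    private_graphs / public_graphs: dicts mapping graph name to the most
    recent snapshot visible in that namespace (stand-ins for server state).
    """
    if namespace == "private":
        return private_graphs.get(name)
    if namespace == "public":
        return public_graphs.get(name)
    if namespace is None:
        # Private namespace first, then fall back to the public one.
        found = private_graphs.get(name)
        return found if found is not None else public_graphs.get(name)
    raise ValueError(f"unknown namespace: {namespace!r}")
```

With a graph of the same name in both namespaces, namespace=None resolves to the private one, which is why passing an explicit namespace is the safer choice when both may exist.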

get_graphs(namespace=None)

Return a collection of graph names accessible under the given namespace.

Parameters

namespace (Optional[pypgx.api._namespace.Namespace]) – The namespace where to look up the graphs

Return type

List[str]

get_idle_timeout()

Get the idle timeout of this session

Returns

the idle timeout in seconds

Return type

int

get_name()

Get the identifier of the current session.

Returns

identifier of this session

Return type

str

get_pgql_result_set(id)

Get a PGQL result set by ID.

Parameters

id (str) – The PGQL result set ID

Returns

The requested PGQL result set or None if no such result set exists for this session

Return type

Optional[pypgx.api._pgql_result_set.PgqlResultSet]

get_session_context()

Get the context describing the current session.

Returns

context of this session

Return type

pypgx.api._session_context.SessionContext

get_source()

Get the current session source

Returns

session source

Return type

str

get_task_timeout()

Get the task timeout of this session

Returns

the task timeout in seconds

Return type

int

graph_from_frames(graph_name, vertex_providers, edge_providers, partitioned=True)

Create PgxGraph from vertex providers and edge providers.

partitioned must be set to True if multiple vertex or edge providers are given.

Parameters
  • graph_name (str) – graph name

  • vertex_providers (List[pypgx.api.frames._vertex_frame_declaration.VertexFrameDeclaration]) – list of vertex providers

  • edge_providers (List[pypgx.api.frames._edge_frame_declaration.EdgeFrameDeclaration]) – list of edge providers

  • partitioned (bool) – whether the graph is partitioned or not. Defaults to True

Returns

the PgxGraph object

Return type

pypgx.api._pgx_graph.PgxGraph
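Building a graph from frames involves three calls: declare a vertex provider, declare an edge provider that refers to it by name, then combine them. The sketch below chains those calls using the signatures documented in this section. build_graph and the default provider names "vertices" and "edges" are illustrative choices, and both frames are assumed to be existing PgxFrame objects with the default key columns "id", "src" and "dst".

```python
def build_graph(
    session,
    graph_name,
    vertex_frame,
    edge_frame,
    vertex_provider_name="vertices",
    edge_provider_name="edges",
):
    """Create a graph with one vertex provider and one edge provider."""
    vertex_provider = session.vertex_provider_from_frame(
        vertex_provider_name, vertex_frame, vertex_key_column="id"
    )
    # Source and destination both point at the single vertex provider.
    edge_provider = session.edge_provider_from_frame(
        edge_provider_name,
        vertex_provider_name,
        vertex_provider_name,
        edge_frame,
        source_vertex_column="src",
        destination_vertex_column="dst",
    )
    return session.graph_from_frames(
        graph_name, [vertex_provider], [edge_provider], partitioned=True
    )
```

With more than one vertex or edge provider, the lists passed to graph_from_frames grow accordingly and partitioned has to stay True.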

pandas_to_pgx_frame(pandas_dataframe, frame_name)

Create a frame from a pandas dataframe.

Duplicate columns will be renamed. Mixed column types are not supported.

This method requires pandas.

Parameters
  • pandas_dataframe – The Pandas dataframe to use

  • frame_name (str) – Name of the frame

Returns

the frame created

Return type

pypgx.api.frames._pgx_frame.PgxFrame

prepare_pgql(pgql_query)

Prepare a pattern matching query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as getGraph(String).

Parameters

pgql_query (str) – Query string in PGQL

Returns

A prepared statement object

Return type

pypgx.api._prepared_statement.PreparedStatement

query_pgql(pgql_query)

Submit a pattern matching query with an ON-clause.

The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as PgxSession.getGraph(String).

Parameters

pgql_query (str) – Query string in PGQL

Returns

The query result set

Return type

Optional[pypgx.api._pgql_result_set.PgqlResultSet]

Raises InterruptedException if the caller thread is interrupted while waiting for completion. Raises ExecutionException if any exception occurs during asynchronous execution; the actual exception is nested.
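Because the graph is named inside the query text rather than passed as an argument, the ON-clause has to be assembled into the string. The following sketch builds a simple counting query and submits it with query_pgql. The helper names are hypothetical, the query assumes PGQL syntax of the form SELECT ... FROM MATCH ... ON graph, and the identifier check is a conservative assumption made here, not a rule stated by PGX.

```python
import re

# Conservative pattern for an unquoted graph name (an assumption made here).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def count_vertices_query(graph_name):
    """Return a PGQL query counting all vertices of the named graph."""
    if not _IDENTIFIER.fullmatch(graph_name):
        raise ValueError(f"unsupported graph name: {graph_name!r}")
    return f"SELECT COUNT(*) FROM MATCH (v) ON {graph_name}"


def count_vertices(session, graph_name):
    """Submit the query; the documented return type is an optional result set."""
    return session.query_pgql(count_vertices_query(graph_name))
```

Rejecting names that are not plain identifiers keeps arbitrary text from being spliced into the query string.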

read_frame()

Create a new frame reader with which it is possible to parameterize the loading of the row frame.

Returns

A frame reader object with which it is possible to parameterize the loading

Return type

pypgx.api.frames._pgx_frame_reader.PgxGenericFrameReader

read_graph_as_of(config, meta_data=None, creation_timestamp=None, new_graph_name=None)

Read a graph and its properties of a specific version (metaData or creationTimestamp) into memory.

The creationTimestamp must be a valid version of the graph.

Parameters
  • config (pypgx.api._graph_config.GraphConfig) – The graph config

  • meta_data (Optional[pypgx.api._graph_meta_data.GraphMetaData]) – The metaData object returned by get_available_snapshots(GraphConfig) identifying the version

  • creation_timestamp (Optional[int]) – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out

  • new_graph_name (Optional[str]) – How the graph should be named. If None, a name will be generated.

Returns

The PgxGraph object

Return type

pypgx.api._pgx_graph.PgxGraph
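The creation_timestamp argument is an integer count of milliseconds since January 1, 1970, whereas Python code usually holds datetime objects. As a sketch, the conversion can be done with integer arithmetic to avoid floating-point rounding. to_epoch_millis is a hypothetical helper, and it assumes the timestamp is meant in UTC, which this section does not state explicitly.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # timedelta // timedelta yields an int, so no float rounding is involved.
    return (moment - _EPOCH) // timedelta(milliseconds=1)
```

The result can be passed as creation_timestamp to read_graph_as_of or set_snapshot; it still has to match an existing version of the graph.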

read_graph_by_name(graph_name, graph_source)

Parameters
  • graph_name (str) – Name of graph

  • graph_source (str) – Source of graph

Return type

pypgx.api._pgx_graph.PgxGraph

read_graph_file(file_path, file_format=None, graph_name=None)

Parameters
  • file_path (str) – File path

  • file_format (Optional[str]) – File format of graph

  • graph_name (Optional[str]) – Name of graph

Return type

pypgx.api._pgx_graph.PgxGraph

read_graph_files(file_paths, edge_file_paths=None, file_format=None, graph_name=None)

Load the graph contained in the files at the given paths.

Parameters
  • file_paths (Union[str, Iterable[str]]) – Paths to the vertex files

  • edge_file_paths (Optional[Union[str, Iterable[str]]]) – Paths to the edge files

  • file_format (Optional[str]) – File format

  • graph_name (Optional[str]) – Loaded graph name

Return type

pypgx.api._pgx_graph.PgxGraph

read_graph_with_properties(config, max_age=9223372036854775807, max_age_time_unit='days', block_if_full=False, update_if_not_fresh=True, graph_name=None)

Read a graph and its properties, specified in the graph config, into memory.

Parameters
  • config (Union[str, pathlib.PurePath, Dict[str, Any], pypgx.api._graph_config.GraphConfig]) – The graph config

  • max_age (int) – If another snapshot of the given graph already exists, the age of the latest existing snapshot will be compared to the given maxAge. If the latest snapshot is in the given range, it will be returned, otherwise a new snapshot will be created.

  • max_age_time_unit (str) – The time unit of the maxAge parameter

  • block_if_full (bool) – If true and a new snapshot needs to be created but no more snapshots are allowed by the server configuration, the returned future will not complete until space becomes available. If full and this flag is false, the returned future will complete exceptionally instead.

  • update_if_not_fresh (bool) – If a newer data version exists in the backing data source (see PgxGraph.is_fresh()), this flag tells whether to read it and create another snapshot inside PGX. If the “snapshots_source” field of config is SnapshotsSource.REFRESH, the returned graph may have multiple snapshots, depending on whether previous reads with the same config occurred; otherwise, if the “snapshots_source” field is SnapshotsSource.CHANGE_SET, only the most recent snapshot (either pre-existing or freshly read) will be visible.

  • graph_name (Optional[str]) – How the graph should be named. If None, a name will be generated. If a graph with that name already exists, the returned future will complete exceptionally.

Return type

pypgx.api._pgx_graph.PgxGraph

read_subgraph_from_pg_view(view, queries=None, config=None)

Load a graph from PG Views.

Parameters
Returns

The graph.

Return type

pypgx.api._pgx_graph.PgxGraph

register_keystore(keystore_path, keystore_password)

Register a keystore.

Parameters
  • keystore_path (str) – The path to the keystore which shall be registered

  • keystore_password (str) – The password of the provided keystore

Return type

None

property server_instance: pypgx.api._server_instance.ServerInstance

Get the server instance.

Returns

The server instance

set_snapshot(graph, meta_data=None, creation_timestamp=None, force_delete_properties=False)

Set a graph to a specific snapshot.

You can use this method to jump back and forth in time between various snapshots of the same graph. If successful, the given graph will point to the requested snapshot after the returned future completes.

Parameters
  • graph (Union[str, pypgx.api._pgx_graph.PgxGraph]) – Input graph

  • meta_data (Optional[pypgx.api._graph_meta_data.GraphMetaData]) – A GraphMetaData object used to identify the snapshot

  • creation_timestamp (Optional[int]) – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out

  • force_delete_properties (bool) – Graphs with transient properties cannot be checked out to a different version. If this flag is set to true, the checked out graph will no longer contain any transient properties. If false, the returned future will complete exceptionally with an UnsupportedOperationException as its cause.

Return type

None

vertex_provider_from_frame(provider_name, frame, vertex_key_column='id')

Create a vertex provider from a PgxFrame to later build a PgxGraph

Parameters
  • provider_name (str) – vertex provider name

  • frame (pypgx.api.frames._pgx_frame.PgxFrame) – PgxFrame to use

  • vertex_key_column (str) – column to use as keys. Defaults to “id”

Returns

the VertexFrameDeclaration object

Return type

pypgx.api.frames._vertex_frame_declaration.VertexFrameDeclaration