3. Managing Sessions
The PgxSession class can be used to manage and execute operations on the underlying session.
- class pypgx.api.PgxSession(java_session)
A PGX session represents an active user connected to a ServerInstance.
Every session gets a workspace assigned on the server, which can be used to read graph data, create transient data or custom algorithms for the sake of graph analysis. Once a session gets destroyed, all data in the session workspace is freed.
- Variables
LATEST_SNAPSHOT – The timestamp of the most recent snapshot, used to easily move to the newest snapshot (see set_snapshot())
- Return type
None
- close()
Close this session object.
- Return type
None
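Since a session holds server-side workspace data, it can be useful to guarantee that close() runs even when an analysis step fails. The following is a minimal sketch, assuming `session` is an already connected PgxSession obtained elsewhere; the helper name `run_and_close` is hypothetical and not part of PyPGX.

```python
def run_and_close(session, work):
    """Run work(session) and always close the session afterwards.

    `session` is assumed to be a connected PgxSession; `work` is any
    callable that takes the session. Illustrative helper only.
    """
    try:
        return work(session)
    finally:
        # close() is the method documented above; the finally clause
        # makes sure it runs even if work() raises an exception.
        session.close()
```

A call such as `run_and_close(session, lambda s: s.get_name())` returns the session name and then closes the session.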
- compile_program(path, overwrite=False)
Compile a Green-Marl program for parallel execution with all optimizations enabled.
- Parameters
path (str) – Path to program
overwrite (bool) – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise
- Return type
- compile_program_code(code, overwrite=False)
Compile a Green-Marl program (if it is supported by the corresponding PyPGX distribution). Otherwise compile a Java program.
- Parameters
code (str) – The Green-Marl/Java code to compile
overwrite (bool) – If the procedure in the given code already exists, overwrite if true, throw an exception otherwise
- Return type
- create_analyst()
Create and return a new analyst.
- Returns
An analyst object
- Return type
- create_frame(schema, column_data, frame_name)
Create a frame with the specified data
- Parameters
schema (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
column_data (Dict[str, List]) – Map of iterables, columnName -> columnData
frame_name (str) – Name of the frame
- Returns
The frame created from the given schema and data
- Return type
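As an illustration of how the `schema` and `column_data` arguments of create_frame() fit together, here is a small sketch. It assumes `session` is a connected PgxSession and that the column type names 'integer', 'string' and 'double' are accepted for frames; the column names, values and the helper name are made up for the example.

```python
def build_people_frame(session, frame_name="people"):
    """Create a small frame; all data here is illustrative."""
    # One (columnName, columnType) tuple per column.
    schema = [("id", "integer"), ("name", "string"), ("score", "double")]
    # columnName -> list of values, one list per schema column.
    column_data = {
        "id": [1, 2, 3],
        "name": ["Alice", "Bob", "Carol"],
        "score": [0.5, 0.7, 0.9],
    }
    # Client-side sanity checks (not required by the API).
    if set(column_data) != {name for name, _ in schema}:
        raise ValueError("column_data keys must match the schema columns")
    if len({len(values) for values in column_data.values()}) != 1:
        raise ValueError("all columns must have the same number of rows")
    return session.create_frame(schema, column_data, frame_name)
```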
- create_frame_builder(schema)
Create a frame builder initialized with the given schema
- Parameters
schema (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
- Returns
A frame builder initialized with the given schema
- Return type
- create_graph_builder(id_type='integer', vertex_id_generation_strategy='user_ids', edge_id_generation_strategy='auto_generated')
Create a graph builder with the given vertex ID type and ID generation strategies.
- Parameters
id_type (str) – The type of the vertex ID
vertex_id_generation_strategy (str) – The vertex ID generation strategy to be used
edge_id_generation_strategy (str) – The edge ID generation strategy to be used
- Return type
- create_map(key_type, value_type, name=None)
Create a map.
Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']
- Parameters
key_type (str) – Property type of the keys that are going to be stored inside the map
value_type (str) – Property type of the values that are going to be stored inside the map
name (Optional[str]) – Map name
- Returns
A named PgxMap of key content type key_type and value content type value_type
- Return type
- create_sequence(content_type, name=None)
Create a sequence of scalars.
Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']
- Parameters
content_type (str) – Property type of the elements in the sequence
name (Optional[str]) – Sequence name
- Returns
A named ScalarSequence of content type content_type
- Return type
- create_set(content_type, name=None)
Create a set of scalars.
Possible types are: ['integer', 'long', 'double', 'boolean', 'string', 'vertex', 'edge', 'local_date', 'time', 'timestamp', 'time_with_timezone', 'timestamp_with_timezone']
- Parameters
content_type (str) – content type of the set
name (Optional[str]) – the set’s name
- Returns
A named ScalarSet of content type content_type
- Return type
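The three collection factories (create_map(), create_sequence(), create_set()) share the same list of type names. The sketch below validates a type name on the client before calling them. It assumes `session` is a connected PgxSession; `check_type`, `create_scratch_collections` and the collection names are hypothetical.

```python
# Type names copied from the "Possible types" list in this section.
PROPERTY_TYPES = {
    "integer", "long", "double", "boolean", "string", "vertex", "edge",
    "local_date", "time", "timestamp", "time_with_timezone",
    "timestamp_with_timezone",
}


def check_type(type_name):
    """Fail early for a type name that is not in the documented list."""
    if type_name not in PROPERTY_TYPES:
        raise ValueError(f"unsupported property type: {type_name!r}")
    return type_name


def create_scratch_collections(session, prefix="demo"):
    """Create one map, one set and one sequence with prefixed names."""
    scores = session.create_map(
        check_type("string"), check_type("double"), name=f"{prefix}_scores"
    )
    visited = session.create_set(check_type("long"), name=f"{prefix}_visited")
    order = session.create_sequence(check_type("long"), name=f"{prefix}_order")
    return scores, visited, order
```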
- describe_graph_file(file_path)
Describe the graph contained in the file at the given path.
- Parameters
file_path (str) – Graph file path
- Returns
The configuration which can be used to load the graph
- Return type
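Because describe_graph_file() returns a configuration "which can be used to load the graph", one plausible workflow is to describe a file and then pass the result to read_graph_with_properties(). This is a sketch under that assumption, with `session` taken to be a connected PgxSession; the helper name is hypothetical.

```python
def load_graph_from_file(session, file_path, graph_name=None):
    """Describe a graph file, then load it with the resulting config."""
    # Step 1: let the server infer a graph configuration from the file.
    config = session.describe_graph_file(file_path)
    # Step 2: load the graph using that configuration (assumed to be an
    # acceptable value for the `config` parameter).
    return session.read_graph_with_properties(config, graph_name=graph_name)
```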
- describe_graph_files(files_path)
Describe the graph contained in the files at the given paths.
- Parameters
files_path (str) – Paths to the files
- Returns
The configuration which can be used to load the graph
- destroy()
Destroy this session object.
- Return type
None
- edge_provider_from_frame(provider_name, source_provider, destination_provider, frame, source_vertex_column='src', destination_vertex_column='dst')
Create an edge provider from a PgxFrame to later build a PgxGraph
- Parameters
provider_name (str) – edge provider name
source_provider (str) – vertex source provider name
destination_provider (str) – vertex destination provider name
frame (pypgx.api.frames._pgx_frame.PgxFrame) – PgxFrame to use
source_vertex_column (str) – column to use as source keys. Defaults to “src”
destination_vertex_column (str) – column to use as destination keys. Defaults to “dst”
- Returns
the EdgeFrameDeclaration object
- Return type
pypgx.api.frames._edge_frame_declaration.EdgeFrameDeclaration
- execute_pgql(pgql_query)
Submit any query with an ON-clause.
The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as get_graph().
- Parameters
pgql_query (str) – Query string in PGQL
- Returns
The query result set
- Return type
Optional[pypgx.api._pgql_result_set.PgqlResultSet]
Throws InterruptedException if the caller thread gets interrupted while waiting for completion.
Throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.
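The return type of execute_pgql() is Optional, so callers should be prepared for a missing result set. The sketch below runs several statements and keeps only the result sets that were actually returned. It assumes `session` is a connected PgxSession and that None means the statement produced no result set; the helper name is hypothetical.

```python
def run_statements(session, statements):
    """Execute PGQL statements in order and collect non-empty results."""
    results = []
    for statement in statements:
        result_set = session.execute_pgql(statement)
        # The documented return type is Optional, so None is possible.
        if result_set is not None:
            results.append(result_set)
    return results
```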
- explain_pgql(pgql_query)
Explain the execution plan of a pattern matching query.
Note: Different PGX versions may return different execution plans.
- Parameters
pgql_query (str) – Query string in PGQL
- Returns
The query plan
- Return type
pypgx.api._operation.Operation
- get_available_compiled_program_ids()
Get the set of available compiled program IDs.
- Return type
Set[str]
- get_available_snapshots(snapshot)
Return a list of all available snapshots of the given input graph.
- Parameters
snapshot (pypgx.api._pgx_graph.PgxGraph) – A ‘PgxGraph’ object for which the available snapshots shall be retrieved
- Returns
A list of ‘GraphMetaData’ objects, each corresponding to a snapshot of the input graph
- Return type
- get_compiled_program(id)
Get a compiled program by ID.
- Parameters
id (str) – The id of the compiled program
- Return type
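The two methods above can be combined to look up a compiled program only when its ID is known to the session. A minimal sketch, assuming `session` is a connected PgxSession; the helper name is hypothetical, and returning None for an unknown ID is a choice made for this example.

```python
def find_compiled_program(session, program_id):
    """Return the compiled program with this ID, or None if unavailable."""
    # get_available_compiled_program_ids() returns a set of strings.
    if program_id not in session.get_available_compiled_program_ids():
        return None
    return session.get_compiled_program(program_id)
```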
- get_graph(name, namespace=None)
Find and return a graph with name name within the given namespace loaded inside PGX.
The search for the snapshot to return is done according to the following rules:
if namespace is private, then the search occurs on already referenced snapshots of the graph with name name and the most recent snapshot is returned
if namespace is public, then the search occurs on published graphs and the most recent snapshot of the published graph with name name is returned
if namespace is None, then the private namespace is searched first and, if no snapshot is found, the public namespace is then searched
Multiple calls of this method with the same parameters will return different PgxGraph objects referencing the same graph, with the server keeping track of how many references a session has to each graph. Therefore, a graph is released within the server either if:
all the references are moved to another graph (e.g. via set_snapshot())
the PgxGraph.destroy() method is called on one reference. Note that this invalidates all references.
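The namespace search rules of get_graph() can be modelled in plain Python. This is only an illustrative model, not how PGX implements the lookup: the strings "private" and "public" stand in for namespace objects, and each dictionary maps a graph name to a list of snapshot timestamps (both are assumptions for this example).

```python
def resolve_snapshot(name, namespace, private_graphs, public_graphs):
    """Return the most recent snapshot timestamp for `name`, or None.

    private_graphs / public_graphs: dict mapping graph name to a list
    of snapshot timestamps (hypothetical stand-in for server state).
    """
    if namespace == "private":
        search_order = [private_graphs]
    elif namespace == "public":
        search_order = [public_graphs]
    elif namespace is None:
        # Private namespace first, public only as a fallback.
        search_order = [private_graphs, public_graphs]
    else:
        raise ValueError(f"unknown namespace: {namespace!r}")
    for graphs in search_order:
        snapshots = graphs.get(name)
        if snapshots:
            return max(snapshots)  # most recent snapshot wins
    return None
```

With `private_graphs = {"g": [1, 5]}` and `public_graphs = {"g": [9]}`, a lookup of "g" with namespace None resolves to the private snapshot 5, while namespace "public" resolves to 9.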
- get_graphs(namespace=None)
Return a collection of graph names accessible under the given namespace.
- Parameters
namespace (Optional[pypgx.api._namespace.Namespace]) – The namespace where to look up the graphs
- Return type
List[str]
- get_idle_timeout()
Get the idle timeout of this session
- Returns
the idle timeout in seconds
- Return type
int
- get_name()
Get the identifier of the current session.
- Returns
identifier of this session
- Return type
str
- get_pgql_result_set(id)
Get a PGQL result set by ID.
- Parameters
id (str) – The PGQL result set ID
- Returns
The requested PGQL result set or None if no such result set exists for this session
- Return type
Optional[pypgx.api._pgql_result_set.PgqlResultSet]
- get_session_context()
Get the context describing the current session.
- Returns
context of this session
- Return type
pypgx.api._session_context.SessionContext
- get_source()
Get the current session source
- Returns
session source
- Return type
str
- get_task_timeout()
Get the task timeout of this session
- Returns
the task timeout in seconds
- Return type
int
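The getters above can be gathered into a single summary, for example for logging. A small sketch, assuming `session` is a connected PgxSession; the helper name and dictionary keys are hypothetical.

```python
def describe_session(session):
    """Collect basic session information into a plain dictionary."""
    return {
        "name": session.get_name(),
        "source": session.get_source(),
        # Both timeouts are documented as being expressed in seconds.
        "idle_timeout_s": session.get_idle_timeout(),
        "task_timeout_s": session.get_task_timeout(),
    }
```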
- graph_from_frames(graph_name, vertex_providers, edge_providers, partitioned=True)
Create PgxGraph from vertex providers and edge providers.
partitioned must be set to True if multiple vertex or edge providers are given
- Parameters
graph_name (str) – graph name
vertex_providers (List[pypgx.api.frames._vertex_frame_declaration.VertexFrameDeclaration]) – list of vertex providers
edge_providers (List[pypgx.api.frames._edge_frame_declaration.EdgeFrameDeclaration]) – list of edge providers
partitioned (bool) – whether the graph is partitioned or not. Defaults to True
- Returns
the PgxGraph object
- Return type
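The frame-based graph construction involves three calls: vertex_provider_from_frame(), edge_provider_from_frame() and graph_from_frames(). The sketch below chains them, assuming `session` is a connected PgxSession and that `vertex_frame` and `edge_frame` are existing PgxFrame objects with an "id" column and "src"/"dst" columns respectively. The provider names "person" and "knows" and the helper name are made up for the example.

```python
def graph_from_two_frames(session, vertex_frame, edge_frame,
                          graph_name="people_graph"):
    """Build a graph from one vertex frame and one edge frame."""
    # Declare the vertex provider; "id" is the documented default key column.
    vertices = session.vertex_provider_from_frame(
        "person", vertex_frame, vertex_key_column="id"
    )
    # Declare the edge provider; source and destination both refer to the
    # "person" vertex provider declared above.
    edges = session.edge_provider_from_frame(
        "knows", "person", "person", edge_frame,
        source_vertex_column="src", destination_vertex_column="dst",
    )
    # partitioned=True is the default and is required as soon as more
    # than one vertex or edge provider is passed.
    return session.graph_from_frames(
        graph_name, [vertices], [edges], partitioned=True
    )
```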
- pandas_to_pgx_frame(pandas_dataframe, frame_name)
Create a frame from a pandas dataframe.
Duplicate columns will be renamed. Mixed column types are not supported.
This method requires pandas.
- Parameters
pandas_dataframe – The Pandas dataframe to use
frame_name (str) – Name of the frame
- Returns
the frame created
- Return type
- prepare_pgql(pgql_query)
Prepare a pattern matching query with an ON-clause.
The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as get_graph().
- Parameters
pgql_query (str) – Query string in PGQL
- Returns
A prepared statement object
- Return type
- query_pgql(pgql_query)
Submit a pattern matching query with an ON-clause.
The ON-clause indicates the graph on which the query will be executed. The graph name in the ON-clause is evaluated with the same semantics as get_graph().
- Parameters
pgql_query (str) – Query string in PGQL
- Returns
The query result set
- Return type
Optional[pypgx.api._pgql_result_set.PgqlResultSet]
Throws InterruptedException if the caller thread gets interrupted while waiting for completion.
Throws ExecutionException if any exception occurred during asynchronous execution. The actual exception will be nested.
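Since query_pgql() is called on the session rather than on a graph, the query text itself has to name the graph in an ON-clause. The sketch below builds such a query for a given graph name. It assumes `session` is a connected PgxSession; the helper names are hypothetical, and the restriction to simple identifiers is a precaution chosen for this example rather than a PGQL rule.

```python
import re


def count_vertices_query(graph_name):
    """Build a PGQL query counting the vertices of the named graph."""
    # Only accept plain identifiers so that the name can be embedded
    # into the query text without quoting (a choice made for this sketch).
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", graph_name):
        raise ValueError(f"unsupported graph name: {graph_name!r}")
    return f"SELECT COUNT(*) FROM MATCH (v) ON {graph_name}"


def count_vertices(session, graph_name):
    """Submit the counting query and return whatever query_pgql() returns."""
    return session.query_pgql(count_vertices_query(graph_name))
```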
- read_frame()
Create a new frame reader with which it is possible to parameterize the loading of the row frame.
- Returns
A frame reader object with which it is possible to parameterize the loading
- Return type
- read_graph_as_of(config, meta_data=None, creation_timestamp=None, new_graph_name=None)
Read a graph and its properties of a specific version (metaData or creationTimestamp) into memory.
The creationTimestamp must be a valid version of the graph.
- Parameters
config (pypgx.api._graph_config.GraphConfig) – The graph config
meta_data (Optional[pypgx.api._graph_meta_data.GraphMetaData]) – The metaData object returned by get_available_snapshots(GraphConfig) identifying the version
creation_timestamp (Optional[int]) – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out
new_graph_name (Optional[str]) – How the graph should be named. If None, a name will be generated.
- Returns
The PgxGraph object
- Return type
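The `creation_timestamp` argument is expressed in milliseconds since January 1, 1970. The sketch below converts a timezone-aware `datetime` into that unit using exact integer arithmetic and passes it to read_graph_as_of(). It assumes `session` is a connected PgxSession and that the chosen moment matches an existing version of the graph, as the method requires; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Floor division of two timedeltas yields an exact integer.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def read_graph_version(session, config, moment):
    """Read the graph version created at `moment` (must be a valid version)."""
    return session.read_graph_as_of(
        config, creation_timestamp=to_epoch_millis(moment)
    )
```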
- read_graph_by_name(graph_name, graph_source)
- Parameters
graph_name (str) – Name of graph
graph_source (str) – Source of graph
- Return type
- read_graph_file(file_path, file_format=None, graph_name=None)
- Parameters
file_path (str) – File path
file_format (Optional[str]) – File format of graph
graph_name (Optional[str]) – Name of graph
- Return type
- read_graph_files(file_paths, edge_file_paths=None, file_format=None, graph_name=None)
Load the graph contained in the files at the given paths.
- Parameters
file_paths (Union[str, Iterable[str]]) – Paths to the vertex files
edge_file_paths (Optional[Union[str, Iterable[str]]]) – Path to the edge file
file_format (Optional[str]) – File format
graph_name (Optional[str]) – Loaded graph name
- Return type
- read_graph_with_properties(config, max_age=9223372036854775807, max_age_time_unit='days', block_if_full=False, update_if_not_fresh=True, graph_name=None)
Read a graph and its properties, specified in the graph config, into memory.
- Parameters
config (Union[str, pathlib.PurePath, Dict[str, Any], pypgx.api._graph_config.GraphConfig]) – The graph config
max_age (int) – If another snapshot of the given graph already exists, the age of the latest existing snapshot will be compared to the given maxAge. If the latest snapshot is in the given range, it will be returned, otherwise a new snapshot will be created.
max_age_time_unit (str) – The time unit of the maxAge parameter
block_if_full (bool) – If true and a new snapshot needs to be created but no more snapshots are allowed by the server configuration, the returned future will not complete until space becomes available. If full and this flag is false, the returned future will complete exceptionally instead.
update_if_not_fresh (bool) – If a newer data version exists in the backing data source (see PgxGraph.is_fresh()), this flag tells whether to read it and create another snapshot inside PGX. If the “snapshots_source” field of config is SnapshotsSource.REFRESH, the returned graph may have multiple snapshots, depending on whether previous reads with the same config occurred; otherwise, if the “snapshots_source” field is SnapshotsSource.CHANGE_SET, only the most recent snapshot (either pre-existing or freshly read) will be visible.
graph_name (Optional[str]) – How the graph should be named. If None, a name will be generated. If a graph with that name already exists, the returned future will complete exceptionally.
- Return type
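The default `max_age` of 9223372036854775807 is the largest signed 64-bit integer, so by default any existing snapshot counts as recent enough. The sketch below tightens that to one day. It assumes `session` is a connected PgxSession and `config` is any of the accepted config forms (for example a path string); the helper name and constant name are hypothetical.

```python
# The documented default for max_age: the largest signed 64-bit integer.
DEFAULT_MAX_AGE = 2**63 - 1


def read_recent_graph(session, config, graph_name=None):
    """Read a graph, reusing an existing snapshot only if under one day old."""
    return session.read_graph_with_properties(
        config,
        max_age=1,
        max_age_time_unit="days",  # the documented default unit
        block_if_full=False,       # fail instead of waiting for space
        update_if_not_fresh=True,  # read newer data if the source changed
        graph_name=graph_name,
    )
```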
- read_subgraph_from_pg_view(view, queries=None, config=None)
Load a graph from PG Views.
- Parameters
view (str) – The name of the PG View.
queries (Union[None, str, pypgx.api._graph_offloading.PreparedPgqlQuery, List[Union[str, pypgx.api._graph_offloading.PreparedPgqlQuery]]]) – A query or queries used to specify which data is to be loaded.
config (Optional[pypgx.api._graph_config.GraphConfig]) – An optional config used to describe how data should be loaded.
- Returns
The graph.
- Return type
- register_keystore(keystore_path, keystore_password)
Register a keystore.
- Parameters
keystore_path (str) – The path to the keystore which shall be registered
keystore_password (str) – The password of the provided keystore
- Return type
None
- property server_instance: pypgx.api._server_instance.ServerInstance
Get the server instance.
- Returns
The server instance
- set_snapshot(graph, meta_data=None, creation_timestamp=None, force_delete_properties=False)
Set a graph to a specific snapshot.
You can use this method to jump back and forth in time between various snapshots of the same graph. If successful, the given graph will point to the requested snapshot after the returned future completes.
- Parameters
graph (Union[str, pypgx.api._pgx_graph.PgxGraph]) – Input graph
meta_data (Optional[pypgx.api._graph_meta_data.GraphMetaData]) – A GraphMetaData object used to identify the snapshot
creation_timestamp (Optional[int]) – The creation timestamp (milliseconds since January 1, 1970) identifying the version to be checked out
force_delete_properties (bool) – Graphs with transient properties cannot be checked out to a different version. If this flag is set to true, the checked out graph will no longer contain any transient properties. If false, the returned future will complete exceptionally with an UnsupportedOperationException as its cause.
- Return type
None
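A snapshot can be selected either by metadata or by creation timestamp. The sketch below shows both, assuming `session` is a connected PgxSession and `graph` is a PgxGraph or graph name. Passing LATEST_SNAPSHOT as `creation_timestamp` is inferred from the description of that variable and should be treated as an assumption; the helper names are hypothetical, and the order of the snapshot list is not documented here, so the caller supplies the selection logic.

```python
def move_to_latest(session, graph):
    """Point `graph` at its newest snapshot (assumed use of LATEST_SNAPSHOT)."""
    session.set_snapshot(graph, creation_timestamp=session.LATEST_SNAPSHOT)


def move_to_snapshot(session, graph, choose):
    """Point `graph` at the snapshot picked by choose(list_of_metadata)."""
    snapshots = session.get_available_snapshots(graph)
    chosen = choose(snapshots)
    session.set_snapshot(graph, meta_data=chosen)
    return chosen
```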
- vertex_provider_from_frame(provider_name, frame, vertex_key_column='id')
Create a vertex provider from a PgxFrame to later build a PgxGraph
- Parameters
provider_name (str) – vertex provider name
frame (pypgx.api.frames._pgx_frame.PgxFrame) – PgxFrame to use
vertex_key_column (str) – column to use as keys. Defaults to “id”
- Returns
the VertexFrameDeclaration object
- Return type
pypgx.api.frames._vertex_frame_declaration.VertexFrameDeclaration