Frames

PgxFrame and other classes related to frames.

class pypgx.api.frames.DataTypes

Bases: object

This class can be used to construct parametrized data types (e.g., VectorType).

static vector(componenttype, dimension)

Create and return a new VectorType.

Parameters
  • componenttype (str) – type of the vector components

  • dimension (int) – number of components in the vector

Return type

pypgx.api.frames._pgx_data_types.VectorType

class pypgx.api.frames.PgxCsvFrameReader(java_pgx_csv_frame_reader)

Bases: object

Class for reading PgxFrame objects from CSV files.

Return type

None

auto_detect_columns(auto_detect)

Enable or disable the autodetection of columns from the table.

Executing this function clears the currently loaded column descriptors.

Parameters

auto_detect (bool) – True if the columns should be autodetected, False otherwise

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxCsvFrameReader

clear_columns()

Clear the current configuration of which columns should be loaded and how.

Returns

None

Return type

None

columns(column_descriptors)

Set the columns to be loaded from their columnDescriptors.

Parameters

column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxCsvFrameReader

load(uris)

Load a PgxFrame from the provided URIs.

Parameters

uris (str) – the URIs from which to load the frame

Returns

PgxFrame instance

Return type

pypgx.api.frames._pgx_frame.PgxFrame

load_async(uris)

Load a PgxFrame from the provided URIs.

Parameters

uris (str) – the URIs from which to load the frame

Returns

PgxFrame instance

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – New name for the PgxFrame

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxCsvFrameReader

separator(sep)

Set the separator for CSV parsing to sep.

Parameters

sep (str) – char denoting the separator

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxCsvFrameReader
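The setters above each return the reader itself, so a CSV load can be written as one chained expression. The following is a minimal sketch, not the library's prescribed usage: `load_csv_frame` is a hypothetical helper, and `generic_reader` is assumed to be a PgxGenericFrameReader obtained elsewhere (its `csv()` method is described later in this section).

```python
def load_csv_frame(generic_reader, uri, column_descriptors, frame_name, sep=","):
    """Configure a CSV reader step by step and load one frame.

    Hypothetical helper: `generic_reader` is assumed to be a
    PgxGenericFrameReader; how it is created is outside this sketch.
    """
    return (
        generic_reader.csv()          # switch to a PgxCsvFrameReader
        .name(frame_name)             # documented setters return the reader
        .separator(sep)
        .columns(column_descriptors)  # list of (columnName, columnType) tuples
        .load(uri)                    # returns the loaded PgxFrame
    )
```

The column type strings are passed through unchanged, so the caller decides which type names to use.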

class pypgx.api.frames.PgxCsvFrameStorer(java_pgx_csv_frame_storer)

Bases: object

Class for configuring the storing operation of a PgxFrame to a CSV file and then triggering it.

Return type

None

clear_columns()

Clear the configured columns.

Returns

None

Return type

None

columns(column_descriptors)

Set the columns to be stored from their column descriptors.

Parameters

column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – frame name.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

overwrite(overwrite_bool)

Set whether existing data should be overwritten.

Parameters

overwrite_bool (bool) – denotes if the table should be overwritten.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

partition_extension(file_extension)

Set the file extension of the created CSV files.

Parameters

file_extension (str) – string denoting the file extension for the created files.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

partitions(num_partitions)

Set the number of files to be created.

Parameters

num_partitions (int) – number of partitions created.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

separator(sep)

Set the separator for CSV file to sep.

Parameters

sep (str) – char denoting the separator

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

store()

Store the PgxFrame.

Returns

None

Return type

None

store_async()

Store PgxFrame

Returns

PgxFrame instance

table_name(table_name)

Set the table name in the database.

Parameters

table_name (str) – name of the table.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxCsvFrameStorer

class pypgx.api.frames.PgxDbFrameReader(java_pgx_db_frame_reader)

Bases: object

Class for reading PgxFrame objects from a database.

Return type

None

auto_detect_columns(auto_detect)

Enable or disable the autodetection of columns from the table.

Executing this function clears the currently loaded column descriptors.

Parameters

auto_detect (bool) – True if the columns should be autodetected, False otherwise

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

clear_columns()

Clear the current configuration of which columns should be loaded and how.

Returns

None

Return type

None

columns(column_descriptors)

Set the columns to be loaded from their columnDescriptors.

Executing this function disables autodetection of columns.

Parameters

column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

connections(connections)

Set the number of connections to read/write data from/to the database provider

Parameters

connections (int) – number of connections

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

data_source_id(data_source_id)

Set the datasource ID.

Parameters

data_source_id (str) – the datasource ID

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

jdbc_url(jdbc_url)

Set the JDBC URL to use for connecting to the DB.

Parameters

jdbc_url (str) – the JDBC URL

Returns

self

keystore_alias(keystore_alias)

Set the keystore alias.

Parameters

keystore_alias (str) – the keystore alias.

Returns

self

load()

Load a PgxFrame from the database.

Returns

PgxFrame instance

Return type

pypgx.api.frames._pgx_frame.PgxFrame

load_async()

Load a PgxFrame from the database.

Returns

PgxFrame instance

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – New name for the PgxFrame

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

owner(owner)

Set the owner of the table.

Parameters

owner (str) – the owner

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

password(password)

Set the password of the database.

Parameters

password (str) – the password

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

schema(schema)

Set the schema of the table.

Parameters

schema (str) – the schema.

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

table_name(table_name)

Set the table name in the database.

Parameters

table_name (str) – name of the table.

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader

username(username)

Set the username of the database.

Parameters

username (str) – username

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxDbFrameReader
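Because every setter of PgxDbFrameReader returns the reader, a database load can be configured in one chain that ends with `load()`. The sketch below is illustrative only: `load_frame_from_db` is a hypothetical helper, `generic_reader` is assumed to be a PgxGenericFrameReader, and the order of the setter calls is an arbitrary choice rather than a documented requirement.

```python
def load_frame_from_db(generic_reader, jdbc_url, table_name, frame_name,
                       username, password, connections=1):
    """Read one database table into a PgxFrame.

    Hypothetical helper built only from the setters documented above.
    """
    return (
        generic_reader.db()            # switch to a PgxDbFrameReader
        .name(frame_name)
        .jdbc_url(jdbc_url)
        .username(username)
        .password(password)
        .table_name(table_name)
        .auto_detect_columns(True)     # let the reader infer the columns
        .connections(connections)
        .load()                        # takes no arguments for DB readers
    )
```

In practice the password would typically come from a secret store or an environment variable rather than from source code.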

class pypgx.api.frames.PgxDbFrameStorer(java_pgx_db_frame_storer)

Bases: object

Class for configuring the storing operation of a PgxFrame to a database and then triggering it.

Return type

None

clear_columns()

Clear the configured columns.

Returns

None

Return type

None

columns(column_descriptors)

Set the columns to be stored from their column descriptors.

Parameters

column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

connections(connections)

Set the number of connections to read/write data from/to the database provider

Parameters

connections (int) – number of connections

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

data_source_id(data_source_id)

Set the datasource ID.

Parameters

data_source_id (str) – the datasource ID

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

jdbc_url(jdbc_url)

Set the JDBC URL to use for connecting to the DB.

Parameters

jdbc_url (str) – the JDBC URL

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

keystore_alias(keystore_alias)

Set the keystore alias.

Parameters

keystore_alias (str) – the keystore alias.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – frame name.

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

overwrite(overwrite_bool)

Set whether an existing table should be overwritten.

Parameters

overwrite_bool (bool) – denotes if the table should be overwritten.

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

owner(owner)

Set the owner of the table.

Parameters

owner (str) – the owner

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

password(password)

Set the password of the database.

Parameters

password (str) – the password

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

schema(schema)

Set the schema of the table.

Parameters

schema (str) – the schema.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

store()

Store the PgxFrame.

Returns

None

Return type

None

store_async()

Store the PgxFrame.

Returns

None

Return type

None

table_name(table_name)

Set the table name in the database.

Parameters

table_name (str) – name of the table.

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer

username(username)

Set the username of the database.

Parameters

username (str) – username

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxDbFrameStorer
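Storing to a database follows the same configure-then-trigger pattern: obtain a storer from the frame, set the connection details, then call `store()`. This is a rough sketch under the assumption that `frame` is a PgxFrame; `store_frame_to_db` is a hypothetical name, and the setter order shown is not a documented requirement.

```python
def store_frame_to_db(frame, jdbc_url, table_name, username, password,
                      overwrite=False):
    """Write a PgxFrame to a database table.

    Hypothetical helper; uses PgxFrame.write() and the PgxDbFrameStorer
    setters documented above.
    """
    (
        frame.write().db()         # PgxGenericFrameStorer -> PgxDbFrameStorer
        .table_name(table_name)
        .overwrite(overwrite)      # False keeps an existing table untouched
        .jdbc_url(jdbc_url)
        .username(username)
        .password(password)
        .store()                   # triggers the write; returns None
    )
```

Defaulting `overwrite` to False is a design choice of this sketch, so that replacing an existing table has to be requested explicitly.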

class pypgx.api.frames.PgxFrame(java_pgx_frame)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

Data-structure to load/store and manipulate tabular data.

It contains rows and columns. A PgxFrame can contain multiple columns, where each column consists of elements of the same data type and has a name. The list of the columns with their names and data types defines the schema of the frame. (The number of rows in the PgxFrame is not part of the schema of the frame.)

Return type

None

clone()

Create a new PgxFrame with the same content as the current frame.

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame

close()

Free resources on the server taken up by this frame.

Return type

None

property columns: List[str]

Get the names of the columns contained in the PgxFrame.

Return type

list

count()

Count number of elements in the frame.

Return type

int

destroy()

Free resources on the server taken up by this frame.

Return type

None

flatten(*columns, inplace=False)

Create a new PgxFrame with all the specified columns and vector columns flattened into multiple columns.

Parameters
  • columns (str) – Column names

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

flatten_all(inplace=False)

Create a new PgxFrame with all nested columns and vector columns flattened into multiple columns.

Parameters

inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame

get_column(name)

Return a PgxFrameColumn.

Parameters

name (str) – Column name

Returns

PgxFrameColumn

Return type

pypgx.api.frames._pgx_frame.PgxFrameColumn

get_column_descriptors()

Return a list containing the descriptions of the columns of the frame.

Return type

List[Tuple[str, str]]
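Since the descriptors are plain (name, type) tuples, they convert directly into a dictionary, which is convenient for schema checks before further processing. A small sketch, assuming `frame` is a PgxFrame; both helper names are hypothetical.

```python
def schema_of(frame):
    """Return {column name: column type} built from the frame's descriptors."""
    return dict(frame.get_column_descriptors())


def missing_columns(frame, required):
    """Return the names in `required` that the frame does not contain."""
    present = set(frame.columns)  # `columns` is a property, not a method
    return [name for name in required if name not in present]
```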

head(num_rows=10, inplace=False)

Return the first num_rows elements of the frame

Parameters
  • num_rows (int) – Number of rows to take

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame

join(right, join_key_column=None, left_join_key_column=None, right_join_key_column=None, left_prefix=None, right_prefix=None, inplace=False)

Create a new PgxFrame by performing a join operation.

Create a new PgxFrame by adding the columns of the right frame to this frame, aligned on equality of entries in column left_join_key_column for this frame and column right_join_key_column for the right frame, or in column join_key_column on both frames. The resulting frame will contain the columns of this frame prefixed by left_prefix and the columns of the right frame prefixed by right_prefix (if the prefixes are not None). Prefixes must either both be set or both be left unset.

Parameters
  • right (pypgx.api.frames._pgx_frame.PgxFrame) – PgxFrame whose columns will be added to the columns of this PgxFrame

  • join_key_column (Optional[str]) – Column of both frames on which the equality test will be performed

  • left_join_key_column (Optional[str]) – Column of this frame on which the equality test will be performed with right_join_key_column

  • right_join_key_column (Optional[str]) – Column of right frame on which the equality test will be performed with left_join_key_column

  • left_prefix (Optional[str]) – Prefix for the column names of this frame in the resulting frame

  • right_prefix (Optional[str]) – Prefix for the column names of the right frame in the resulting frame

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame
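The argument rules for join() can be checked on the client before the call is made. The wrapper below is a sketch, not part of pypgx: `join_frames` and its shortened parameter names are hypothetical, and it assumes that exactly one way of naming the key (one shared column, or one column per side) should be used. The rule that prefixes are set together comes from the description above.

```python
def join_frames(left, right, on=None, left_on=None, right_on=None,
                left_prefix=None, right_prefix=None):
    """Validate the join arguments, then delegate to PgxFrame.join()."""
    shared = on is not None
    split = left_on is not None or right_on is not None
    if shared == split:  # both styles used, or no key given at all
        raise ValueError("pass either `on`, or both `left_on` and `right_on`")
    if split and (left_on is None or right_on is None):
        raise ValueError("`left_on` and `right_on` must be given together")
    if (left_prefix is None) != (right_prefix is None):
        raise ValueError("set both prefixes or neither")
    return left.join(
        right,
        join_key_column=on,
        left_join_key_column=left_on,
        right_join_key_column=right_on,
        left_prefix=left_prefix,
        right_prefix=right_prefix,
    )
```

Failing early with a ValueError keeps the mistake local instead of surfacing it as a server-side error.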

property length: int

Return the number of rows in the frame.

Returns

number of rows

print(file=None, num_results=1000, start=0)

Print the frame.

Parameters
  • file (Optional[TextIO]) – File to which results are printed (default is sys.stdout)

  • num_results (int) – Number of results to be printed

  • start (int) – Index of the first result to be printed

Return type

None

rename_column(old_column_name, new_column_name, inplace=False)

Return a PgxFrame with the column name modified.

Parameters
  • old_column_name (str) – name of the column to rename

  • new_column_name (str) – name of the column after the operation

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame

rename_columns(column_renaming, inplace=False)

Return a PgxFrame with the column names modified.

Parameters
  • column_renaming (Mapping[str, str]) – dict-like holding old_column names as keys and new column names as values

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame
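The renaming map can be computed from the frame's own column list. As an illustration, the hypothetical helper below prefixes every column except those listed in `skip`; it assumes `frame` is a PgxFrame and relies only on the `columns` property and rename_columns().

```python
def add_prefix(frame, prefix, skip=()):
    """Return a frame whose column names carry `prefix`, except those in `skip`."""
    renaming = {
        name: prefix + name        # old name -> new name
        for name in frame.columns
        if name not in skip
    }
    return frame.rename_columns(renaming)
```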

select(*columns, inplace=False)

Select multiple columns by column name.

Parameters
  • columns (str) – Column names

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame

store(path, file_format='csv', overwrite=True)

Store the frame in a file.

Parameters
  • path (str) – Path where to store the frame

  • file_format (str) – Storage format

  • overwrite (bool) – Overwrite current file

Return type

None

tail(num_rows=10, inplace=False)

Return the last num_rows elements of the frame

Parameters
  • num_rows (int) – Number of rows to take

  • inplace (bool) – Apply the changes inplace and return self

Returns

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame

to_pandas()

Convert to pandas DataFrame.

This method may change result_set cursor.

This method requires pandas.

Returns

PgxFrame as a pandas DataFrame
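Converting a whole frame can be expensive, so it is often reasonable to narrow it first. The sketch below chains select(), head() and to_pandas(); `preview_as_pandas` is a hypothetical helper and `frame` is assumed to be a PgxFrame. With the default inplace=False, select() and head() each return a new frame and leave the original unchanged.

```python
def preview_as_pandas(frame, columns, num_rows=5):
    """Return the first `num_rows` rows of the chosen columns as a DataFrame.

    Requires pandas to be installed, as to_pandas() does.
    """
    return frame.select(*columns).head(num_rows).to_pandas()
```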

to_pgql_result_set()

Create a new PgqlResultSet having the same content as this frame.

Returns

PgqlResultSet

Return type

pypgx.api._pgql_result_set.PgqlResultSet

union(*frames, inplace=False)

Create a PgxFrame by concatenating the rows of this frame with the rows of the frames in frames. The different frames should have the same columns (same names, types and dimensions), in the same order. The resulting frame is not guaranteed to have any specific ordering of its rows.

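Parameters
  • frames (pypgx.api.frames._pgx_frame.PgxFrame) – Frames whose rows are concatenated with the rows of this frame

  • inplace (bool) – Apply the changes inplace and return self

Returns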

PgxFrame

Return type

pypgx.api.frames._pgx_frame.PgxFrame
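The requirement that all frames share the same columns in the same order can be verified up front by comparing column descriptors. This is a sketch only: `union_checked` is a hypothetical helper, and it assumes that the descriptors returned by get_column_descriptors() capture everything that must match, including vector dimensions.

```python
def union_checked(first, *others):
    """Union several frames after confirming that their schemas are identical."""
    expected = first.get_column_descriptors()
    for index, other in enumerate(others, start=1):
        # List comparison is order-sensitive, matching the "same order" rule.
        if other.get_column_descriptors() != expected:
            raise ValueError(f"frame #{index} has a different schema")
    return first.union(*others)
```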

write()

Get a storer for this PgxFrame.

Returns

PgxGenericFrameStorer

Return type

pypgx.api.frames._pgx_frame_storer.PgxGenericFrameStorer

class pypgx.api.frames.PgxFrameBuilder(java_pgx_frame_builder)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

A frame builder for constructing a PgxFrame.

Return type

None

add_rows(column_data)

Add the data to the frame builder.

Parameters

column_data (Dict[str, Any]) – the column data in a dictionary

Returns

self

Return type

pypgx.api.frames._pgx_frame_builder.PgxFrameBuilder

build(frame_name)

Build the frame with the given frame name.

Parameters

frame_name (str) – the name of the frame to create

Returns

the newly created frame

Return type

pypgx.api.frames._pgx_frame.PgxFrame
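Data often arrives as a list of row records rather than per column. The sketch below pivots such records into a dictionary before passing it to add_rows(). It rests on an assumption that this section does not confirm: that `column_data` maps each column name to a list of values of equal length. Both helper names are hypothetical, and `builder` is assumed to be a PgxFrameBuilder created elsewhere.

```python
def rows_to_column_data(rows, column_names):
    """Pivot a list of row dicts into {column name: [values...]}."""
    return {name: [row[name] for row in rows] for name in column_names}


def build_frame(builder, rows, column_names, frame_name):
    """Add all rows to the builder, then build the named frame."""
    column_data = rows_to_column_data(rows, column_names)  # assumed layout
    return builder.add_rows(column_data).build(frame_name)
```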

close()

Free resources on the server taken up by this frame builder. After this method returns, the behaviour of any methods of this class becomes undefined.

Return type

None

destroy()

Free resources on the server taken up by this frame builder. After this method returns, the behaviour of any methods of this class becomes undefined.

Return type

None

class pypgx.api.frames.PgxFrameColumn(java_pgx_frame_column)

Bases: pypgx.api._pgx_context_manager.PgxContextManager

Class representing one column of a PgxFrame.

Return type

None

close()

Destroy/close the object.

Base implementation, overridable by subclasses.

Return type

None

destroy()

Free resources on the server taken up by this column.

Return type

None

get_descriptor()

Return a description of the column.

Return type

Tuple[str, str]

class pypgx.api.frames.PgxGenericFrameReader(java_pgx_generic_frame_reader)

Bases: object

A generic class for reading PgxFrame objects from various sources.

The class allows configuration of how the data should be read and facilitates the creation of specialized frame readers (PgxCsvFrameReader and PgxPgbFrameReader).

Return type

None

auto_detect_columns(auto_detect)

Enable or disable the autodetection of columns from the table.

Not all formats support autodetection of the columns; in fact, only the DB format does.

Executing this function clears the currently loaded column descriptors.

Parameters

auto_detect (bool) – True if the columns should be autodetected, False otherwise

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxGenericFrameReader

clear_columns()

Clear the current configuration of which columns should be loaded and how.

Returns

None

Return type

None

columns(column_descriptors)

Set the columns to be loaded from their columnDescriptors.

Executing this function disables autodetection of columns.

Parameters

column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxGenericFrameReader

csv(uris=None)

Create a PgxCsvFrameReader object for loading CSV files from URIs.

Parameters

uris (Optional[str]) – List of paths to the csv files

Return type

PgxCsvFrameReader

csv_async(uris)

Read a PgxFrame from a list of URIs to CSV files.

Parameters

uris (str) – list denoting the URIs

Returns

the read frame

db()

Create a PgxDbFrameReader object to load PgxFrame from a database.

Return type

PgxDbFrameReader

format(format)

Return a frame reader for the type format specified.

Parameters

format (str) – format to be loaded. Can be one of ‘csv’ or ‘pgb’

Returns

a loader for the format

Return type

PgxCsvFrameReader or PgxPgbFrameReader
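Because format() takes the format as a string, the reader type can be chosen at run time, for example from a file extension. A minimal sketch: `reader_for` is a hypothetical helper, `generic_reader` is assumed to be a PgxGenericFrameReader, and deriving the format from the extension is a convention of this example rather than library behaviour.

```python
import os


def reader_for(generic_reader, uri):
    """Pick the CSV or PGB reader based on the extension of `uri`."""
    extension = os.path.splitext(uri)[1].lower().lstrip(".")
    if extension not in ("csv", "pgb"):
        raise ValueError(f"unsupported frame format: {extension!r}")
    return generic_reader.format(extension)
```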

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – New name for the PgxFrame

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxGenericFrameReader

pgb(uris=None)

Create a PgxPgbFrameReader object for loading PGB files.

Parameters

uris (Optional[str]) – List of paths to the PGB files

Return type

PgxPgbFrameReader

pgb_async(uris)

Read a PgxFrame from a list of URIs to PGB files.

Parameters

uris (str) – list denoting the URIs

Returns

the read frame

class pypgx.api.frames.PgxGenericFrameStorer(java_pgx_generic_frame_storer)

Bases: object

Class for configuring the storing operation of a PgxFrame and then triggering it.

Return type

None

csv()

Get a PgxCsvFrameStorer instance for the PgxFrame

Return type

PgxCsvFrameStorer

db()

Get a PgxDbFrameStorer instance for the PgxFrame

Return type

PgxDbFrameStorer

format(format)

Create a specialized frame storer for saving the PgxFrame in a given format.

Parameters

format (str) – identifier of the wanted format. Can be one of ‘csv’, ‘pgb’ or ‘db’

Return type

PgxCsvFrameStorer, PgxPgbFrameStorer or PgxDbFrameStorer

name(frame_name)

Set the name of the stored frame.

Parameters

frame_name (str) – the new frame name

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxGenericFrameStorer

overwrite(overwrite_bool)

Set overwrite

Parameters

overwrite_bool (bool) – denotes if the table should be overwritten.

Returns

self

Return type

pypgx.api.frames._pgx_frame_storer.PgxGenericFrameStorer

pgb()

Get a PgxPgbFrameStorer instance for the PgxFrame.

Return type

PgxPgbFrameStorer

class pypgx.api.frames.PgxPgbFrameReader(java_pgx_pgb_frame_reader)

Bases: object

Class for reading PgxFrame objects from PGB files.

Return type

None

auto_detect_columns(auto_detect)

Enable or disable the autodetection of columns from the table.

Executing this function clears the currently loaded column descriptors.

Parameters

auto_detect (bool) – True if the columns should be autodetected, False otherwise

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxPgbFrameReader

clear_columns()

Clear the current configuration of which columns should be loaded and how.

Returns

None

Return type

None

columns(column_descriptors)

Set the columns to be loaded from their columnDescriptors.

Executing this function disables autodetection of columns.

Parameters

column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxPgbFrameReader

load(uris)

Load a PgxFrame from the provided URIs.

Parameters

uris (str) – the URIs from which to load the frame

Returns

PgxFrame instance

Return type

pypgx.api.frames._pgx_frame.PgxFrame

load_async(uris)

Load a PgxFrame from the provided URIs.

Parameters

uris (str) – the URIs from which to load the frame

Returns

PgxFrame instance

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – New name for the PgxFrame

Returns

self

Return type

pypgx.api.frames._pgx_frame_reader.PgxPgbFrameReader

class pypgx.api.frames.PgxPgbFrameStorer(java_pgx_pgb_frame_storer)

Bases: object

Class for configuring the storing operation of a PgxFrame to a PGB file and then triggering it.

Return type

None

name(frame_name)

Set the frame name.

Parameters

frame_name (str) – frame name.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxPgbFrameStorer

overwrite(overwrite_bool)

Set whether existing data should be overwritten.

Parameters

overwrite_bool (bool) – denotes if the table should be overwritten.

Returns

this storer

Return type

pypgx.api.frames._pgx_frame_storer.PgxPgbFrameStorer

store()

Store the PgxFrame.

Returns

None

Return type

None

store_async()

Store PgxFrame

Returns

PgxFrame instance

class pypgx.api.frames.VectorType(componenttype, dimension)

Bases: object

Represents a vector type which can be used in a PgxFrame.

Parameters
  • componenttype (str) – type of the vector components

  • dimension (int) – number of components in the vector

Return type

None

get_value_type()

Get the type of component.

Returns

the component type

Return type

str

simple_string()

Get the string representation of a vector.

Returns

simple string of a vector

Return type

str