Frames
PgxFrame and other classes related to frames.
- class pypgx.api.frames.DataTypes
Bases: object
This class can be used to construct parametrized data types (e.g., VectorType).
- static vector(componenttype, dimension)
Create and return a new VectorType.
- Parameters
componenttype (str) –
dimension (int) –
- Return type
VectorType
- class pypgx.api.frames.PgxCsvFrameReader(java_pgx_csv_frame_reader)
Bases: object
Class for reading PgxFrame objects from CSV files.
- auto_detect_columns(auto_detect)
Enable or disable the autodetection of columns from the table.
Executing this function clears the currently loaded column descriptors.
- Parameters
auto_detect (bool) – True if the columns should be autodetected, False otherwise
- Returns
self
- Return type
PgxCsvFrameReader
- clear_columns()
Clear the current configuration of which columns should be loaded and how.
- Returns
None
- Return type
None
- columns(column_descriptors)
Set the columns to be loaded from their columnDescriptors.
- Parameters
column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
- Returns
self
- Return type
PgxCsvFrameReader
- load(uris)
Load a PgxFrame from the provided URIs.
- Parameters
uris (str) – the URIs from which to load the frame
- Returns
PgxFrame instance
- Return type
PgxFrame
- load_async(uris)
Load a PgxFrame from the provided URIs.
- Parameters
uris (str) – the URIs from which to load the frame
- Returns
PgxFrame instance
- name(frame_name)
Set the frame name.
- Parameters
frame_name (str) – New name for the PgxFrame
- Returns
self
- Return type
PgxCsvFrameReader
- separator(sep)
Set the separator for CSV parsing to sep.
- Parameters
sep (str) – char denoting the separator
- Returns
self
- Return type
PgxCsvFrameReader
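Since the configuration methods above are documented as returning self, they can be chained before calling load. The following is a minimal sketch of that pattern, not the library's own example: the helper load_csv_frame, the file name, the column type strings, and the RecordingReader class are all hypothetical. RecordingReader only imitates the documented method names so that the sketch runs without a PGX server; with pypgx, a real PgxCsvFrameReader would be passed instead.

```python
from typing import Any, List, Tuple


def load_csv_frame(reader: Any, uri: str, frame_name: str, sep: str,
                   column_descriptors: List[Tuple[str, str]]) -> Any:
    """Chain the documented setters (each returns the reader), then load."""
    return (
        reader
        .name(frame_name)
        .separator(sep)
        .columns(column_descriptors)
        .load(uri)
    )


class RecordingReader:
    """Hypothetical stand-in, NOT part of pypgx: records calls so the sketch runs offline."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Any]] = []

    def _record(self, method: str, arg: Any) -> "RecordingReader":
        self.calls.append((method, arg))
        return self

    def name(self, frame_name: str) -> "RecordingReader":
        return self._record("name", frame_name)

    def separator(self, sep: str) -> "RecordingReader":
        return self._record("separator", sep)

    def columns(self, column_descriptors: List[Tuple[str, str]]) -> "RecordingReader":
        return self._record("columns", column_descriptors)

    def load(self, uris: str) -> str:
        self.calls.append(("load", uris))
        return "<frame>"


reader = RecordingReader()
# "integer" and "string" are placeholder type names, not confirmed PGX type strings.
result = load_csv_frame(reader, "people.csv", "people", ";",
                        [("id", "integer"), ("name", "string")])
print([method for method, _ in reader.calls])
```

Keeping the chain in one helper makes the order of configuration explicit: the name, separator and columns are set first, and load is the only call that triggers work.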
- class pypgx.api.frames.PgxCsvFrameStorer(java_pgx_csv_frame_storer)
Bases: object
Class for configuring the storing operation of a PgxFrame to a CSV file and then triggering it.
- name(frame_name)
Set the frame name.
- Parameters
frame_name (str) – frame name.
- Returns
this storer
- Return type
PgxCsvFrameStorer
- overwrite(overwrite_bool)
Set the overwrite flag.
- Parameters
overwrite_bool (bool) – denotes if the table should be overwritten.
- Returns
this storer
- Return type
PgxCsvFrameStorer
- partition_extension(file_extension)
Set the fileExtension of the created CSV files.
- Parameters
file_extension (str) – string denoting the file extension for the created files.
- Returns
this storer
- Return type
PgxCsvFrameStorer
- partitions(num_partitions)
Set the number of files to be created.
- Parameters
num_partitions (int) – number of partitions created.
- Returns
this storer
- Return type
PgxCsvFrameStorer
- separator(sep)
Set the separator for CSV file to sep.
- Parameters
sep (str) – char denoting the separator
- Returns
self
- Return type
PgxCsvFrameStorer
- store()
Store the PgxFrame.
- Returns
None
- Return type
None
- class pypgx.api.frames.PgxDbFrameReader(java_pgx_db_frame_reader)
Bases: object
Class for reading PgxFrame objects from a database.
- auto_detect_columns(auto_detect)
Enable or disable the autodetection of columns from the table.
Executing this function clears the currently loaded column descriptors.
- Parameters
auto_detect (bool) – True if the columns should be autodetected, False otherwise
- Returns
self
- Return type
PgxDbFrameReader
- clear_columns()
Clear the current configuration of which columns should be loaded and how.
- Returns
None
- Return type
None
- columns(column_descriptors)
Set the columns to be loaded from their columnDescriptors.
Executing this function disables autodetection of columns.
- Parameters
column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
- Returns
self
- Return type
PgxDbFrameReader
- connections(connections)
Set the number of connections to read/write data from/to the database provider
- Parameters
connections (int) – number of connections
- Returns
self
- Return type
PgxDbFrameReader
- data_source_id(data_source_id)
Set the datasource ID.
- Parameters
data_source_id (str) – the datasource ID
- Returns
self
- Return type
PgxDbFrameReader
- jdbc_url(jdbc_url)
Set the JDBC URL to use for connecting to the DB.
- Parameters
jdbc_url – the JDBC URL
- Returns
self
- keystore_alias(keystore_alias)
Set the keystore alias.
- Parameters
keystore_alias – the keystore alias.
- Returns
self
- load()
Load a PgxFrame from the database.
- Returns
PgxFrame instance
- Return type
PgxFrame
- load_async()
Load a PgxFrame from the database.
- Returns
PgxFrame instance
- name(frame_name)
Set the frame name.
- Parameters
frame_name (str) – New name for the PgxFrame
- Returns
self
- Return type
PgxDbFrameReader
- owner(owner)
Set the owner of the table.
- Parameters
owner (str) – the owner
- Returns
self
- Return type
PgxDbFrameReader
- password(password)
Set the password of the database.
- Parameters
password (str) – the password
- Returns
self
- Return type
PgxDbFrameReader
- schema(schema)
Set the schema of the table.
- Parameters
schema (str) – the schema.
- Returns
self
- Return type
PgxDbFrameReader
- table_name(table_name)
Set the table name in the database.
- Parameters
table_name (str) – the table name.
- Returns
self
- Return type
PgxDbFrameReader
- username(username)
Set the username of the database.
- Parameters
username (str) – username
- Returns
self
- Return type
PgxDbFrameReader
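The setters of the database reader all take one argument and are documented as returning self, so a connection configuration can be applied from a dictionary. The sketch below assumes exactly that and nothing more: configure_db_reader and DB_READER_SETTERS are hypothetical names, and the setter names in the tuple are copied from the method list above.

```python
from typing import Any, Dict

# Setter names copied from the PgxDbFrameReader method list above.
DB_READER_SETTERS = (
    "connections", "data_source_id", "jdbc_url", "keystore_alias",
    "name", "owner", "password", "schema", "table_name", "username",
)


def configure_db_reader(reader: Any, settings: Dict[str, Any]) -> Any:
    """Apply each setting by calling the setter of the same name on the reader."""
    unknown = set(settings) - set(DB_READER_SETTERS)
    if unknown:
        # Fail early on typos instead of silently ignoring a setting.
        raise ValueError(f"unsupported settings: {sorted(unknown)}")
    for key, value in settings.items():
        # Each setter is documented to return the reader itself.
        reader = getattr(reader, key)(value)
    return reader
```

A caller would typically follow this with reader.load(), which takes no arguments for the database reader. Credentials are best supplied from the environment or a keystore rather than written into source files.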
- class pypgx.api.frames.PgxDbFrameStorer(java_pgx_db_frame_storer)
Bases: object
Class for configuring the storing operation of a PgxFrame to a database and then triggering it.
- connections(connections)
Set the number of connections to read/write data from/to the database provider
- Parameters
connections (int) – number of connections
- Returns
this storer
- Return type
PgxDbFrameStorer
- data_source_id(data_source_id)
Set the datasource ID.
- Parameters
data_source_id (str) – the datasource ID
- Returns
this storer
- Return type
PgxDbFrameStorer
- jdbc_url(jdbc_url)
Set the JDBC URL to use for connecting to the DB.
- Parameters
jdbc_url (str) – the JDBC URL
- Returns
this storer
- Return type
PgxDbFrameStorer
- keystore_alias(keystore_alias)
Set the keystore alias.
- Parameters
keystore_alias (str) – the keystore alias.
- Returns
this storer
- Return type
PgxDbFrameStorer
- name(frame_name)
Set the frame name.
- Parameters
frame_name (str) – frame name.
- Returns
self
- Return type
PgxDbFrameStorer
- overwrite(overwrite_bool)
Set the overwrite flag.
- Parameters
overwrite_bool (bool) – denotes if the table should be overwritten.
- Returns
self
- Return type
PgxDbFrameStorer
- owner(owner)
Set the owner of the table.
- Parameters
owner (str) – the owner
- Returns
this storer
- Return type
PgxDbFrameStorer
- password(password)
Set the password of the database.
- Parameters
password (str) – the password
- Returns
this storer
- Return type
PgxDbFrameStorer
- schema(schema)
Set the schema of the table.
- Parameters
schema (str) – the schema.
- Returns
this storer
- Return type
PgxDbFrameStorer
- table_name(table_name)
Set the table name in the database.
- Parameters
table_name (str) – the table name.
- Returns
self
- Return type
PgxDbFrameStorer
- username(username)
Set the username of the database.
- Parameters
username (str) – username
- Returns
this storer
- Return type
PgxDbFrameStorer
- class pypgx.api.frames.PgxFrame(java_pgx_frame)
Bases: PgxContextManager
Data-structure to load/store and manipulate tabular data.
It contains rows and columns. A PgxFrame can contain multiple columns, where each column consists of elements of the same data type and has a name. The list of the columns with their names and data types defines the schema of the frame. (The number of rows in the PgxFrame is not part of the schema of the frame.)
- clone()
Create a new PgxFrame with the same content as the current frame
- Returns
PgxFrame
- Return type
PgxFrame
- close()
Free resources on the server taken up by this frame.
- Return type
None
- property columns: List[str]
Get the names of the columns contained in the PgxFrame.
- Return type
list
- count()
Count number of elements in the frame.
- Return type
int
- destroy()
Free resources on the server taken up by this frame.
- Return type
None
- flatten(*columns, inplace=False)
Create a new PgxFrame with all the specified columns and vector columns flattened into multiple columns.
- Parameters
columns (str) – Column names
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- flatten_all(inplace=False)
Create a new PgxFrame with all nested columns and vector columns flattened into multiple columns.
- Parameters
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
- get_column(name)
Return a PgxFrameColumn.
- Parameters
name (str) – Column name
- Returns
PgxFrameColumn
- Return type
PgxFrameColumn
- get_column_descriptors()
Return a list containing the description of the different columns of the frames.
- Return type
List[Tuple[str, str]]
- head(num_rows=10, inplace=False)
Return the first num_rows elements of the frame
- Parameters
num_rows (int) – Number of rows to take
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
- join(right, join_key_column=None, left_join_key_column=None, right_join_key_column=None, left_prefix=None, right_prefix=None, inplace=False)
Create a new PgxFrame by performing a join operation.
Create a new PgxFrame by adding the columns of the right frame to this frame, aligned on equality of entries in column left_join_key_column for this frame and column right_join_key_column for the right frame, or in column join_key_column on both frames. The resulting frame will contain the columns of this frame prefixed by left_prefix and the columns of the right frame prefixed by right_prefix (if the prefixes are not None). The prefixes must either both be set or both be left unset.
- Parameters
right (PgxFrame) – PgxFrame whose columns will be added to the columns of this PgxFrame
join_key_column (Optional[str]) – Column of both frames on which the equality test will be performed
left_join_key_column (Optional[str]) – Column of this frame on which the equality test will be performed with right_join_key_column
right_join_key_column (Optional[str]) – Column of right frame on which the equality test will be performed with left_join_key_column
left_prefix (Optional[str]) – Prefix of the columns name of this frame in the resulting frame
right_prefix (Optional[str]) – Prefix of the columns name of right frame in the resulting frame
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
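The join description above can be modelled in plain Python on lists of dictionaries. This is only an illustrative sketch of the stated semantics, not how PGX implements joins: join_rows is a hypothetical helper, dropping unmatched rows (inner-join behaviour) is an assumption, and so is letting right-hand values win when column names collide without prefixes.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def join_rows(left: List[Row], right: List[Row],
              left_key: str, right_key: str,
              left_prefix: Optional[str] = None,
              right_prefix: Optional[str] = None) -> List[Row]:
    """Combine rows whose key columns hold equal values."""
    # The documentation requires both prefixes to be set, or neither.
    if (left_prefix is None) != (right_prefix is None):
        raise ValueError("prefixes must either both be set or both be unset")
    lp, rp = left_prefix or "", right_prefix or ""

    # Index the right rows by key so each left row is matched in one lookup.
    index: Dict[Any, List[Row]] = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)

    joined: List[Row] = []
    for l in left:
        # Assumption: left rows without a match are dropped.
        for r in index.get(l[left_key], []):
            row = {lp + k: v for k, v in l.items()}
            # Assumption: on a name collision the right-hand value wins.
            row.update({rp + k: v for k, v in r.items()})
            joined.append(row)
    return joined


people = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
cities = [{"pid": 1, "city": "Oslo"}]
print(join_rows(people, cities, "id", "pid", "l_", "r_"))
```

Using prefixes keeps the two key columns distinguishable in the result, which is why the sketch rejects a call that sets only one of them.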
- property length: int
Return the number of rows in the frame.
- Returns
number of rows
- print(file=None, num_results=1000, start=0)
Print the frame.
- Parameters
file (Optional[TextIO]) – File to which results are printed (default is sys.stdout)
num_results (int) – Number of results to be printed
start (int) – Index of the first result to be printed
- Return type
None
- rename_column(old_column_name, new_column_name, inplace=False)
Return a PgxFrame with the column name modified.
- Parameters
old_column_name (str) – name of the column to rename
new_column_name (str) – name of the column after the operation
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
- rename_columns(column_renaming, inplace=False)
Return a PgxFrame with the columns renamed according to column_renaming.
- Parameters
column_renaming (Mapping[str, str]) – dict-like holding old column names as keys and new column names as values
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
- select(*columns, inplace=False)
Select multiple columns by column name.
- Parameters
columns (str) – Column names
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
- store(path, file_format='csv', overwrite=True)
Store the frame in a file.
- Parameters
path (str) – Path where to store the frame
file_format (str) – Storage format
overwrite (bool) – Overwrite current file
- Return type
None
- tail(num_rows=10, inplace=False)
Return the last num_rows elements of the frame
- Parameters
num_rows (int) – Number of rows to take
inplace (bool) – Apply the changes inplace and return self
- Returns
PgxFrame
- Return type
PgxFrame
- to_pandas()
Convert to pandas DataFrame.
This method may change result_set cursor.
This method requires pandas.
- Returns
PgxFrame as a pandas DataFrame
- to_pgql_result_set()
Create a new PgqlResultSet having the same content as this frame.
- Returns
PgqlResultSet
- Return type
PgqlResultSet
- union(*frames, inplace=False)
Create a PgxFrame by concatenating the rows of this frame with the rows of the frames in frames. The different frames should have the same columns (same names, types and dimensions), in the same order. The resulting frame is not guaranteed to have any specific ordering of its rows.
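The schema requirement for union can be illustrated with a small plain-Python model in which a frame is a pair of column descriptors and rows. This is a sketch of the rule stated above and not PGX code: the Frame alias, union_frames, and the type strings are hypothetical. The model returns rows in input order, whereas the documentation makes no ordering guarantee for the real result.

```python
from typing import Any, List, Tuple

Descriptor = Tuple[str, str]               # (column name, column type)
Frame = Tuple[List[Descriptor], List[Tuple[Any, ...]]]


def union_frames(first: Frame, *others: Frame) -> Frame:
    """Concatenate rows of frames that share exactly the same schema."""
    schema, rows = first
    result = list(rows)
    for other_schema, other_rows in others:
        # Same names and types, in the same order, as the documentation requires.
        if list(other_schema) != list(schema):
            raise ValueError(f"schema mismatch: {other_schema} != {schema}")
        result.extend(other_rows)
    return list(schema), result


# "integer" and "string" are placeholder type names.
schema = [("id", "integer"), ("name", "string")]
a = (schema, [(1, "Ann")])
b = (schema, [(2, "Bob"), (3, "Cy")])
merged_schema, merged_rows = union_frames(a, b)
print(len(merged_rows))
```

Because row order is unspecified for the real operation, code that consumes a union result should sort or key the rows itself if it depends on order.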
- write()
Get a storer for this frame.
- Returns
PgxGenericFrameStorer
- Return type
PgxGenericFrameStorer
- class pypgx.api.frames.PgxFrameBuilder(java_pgx_frame_builder)
Bases: PgxContextManager
A frame builder for constructing a PgxFrame.
- add_rows(column_data)
Add the data to the frame builder.
- Parameters
column_data (Dict[str, Any]) – The column data in a dictionary.
- Raises
TypeError – column_data must be a dictionary.
- Returns
The current frame builder.
- Return type
PgxFrameBuilder
- build(frame_name)
Build the frame with the given frame name.
- Parameters
frame_name (str) – The name of the frame to create.
- Raises
TypeError – frame_name must be a string.
- Returns
The newly created frame.
- Return type
PgxFrame
- close()
Free resources on the server taken up by this frame builder. After this method returns, the behaviour of any methods of this class becomes undefined.
- Return type
None
- destroy()
Free resources on the server taken up by this frame builder. After this method returns, the behaviour of any methods of this class becomes undefined.
- Return type
None
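A typical use of the builder is to add column data and then build a named frame. The sketch below wraps those two documented calls and checks the argument types first, mirroring the TypeError conditions listed above. The helper build_frame is hypothetical, and so is the layout of column_data shown in the usage note (column name mapped to a list of values); the documentation only states that it is a dictionary.

```python
from typing import Any, Dict


def build_frame(builder: Any, column_data: Dict[str, Any], frame_name: str) -> Any:
    """Add column data to a frame builder and build the named frame."""
    # Mirror the documented TypeError conditions before calling the builder.
    if not isinstance(column_data, dict):
        raise TypeError("column_data must be a dictionary")
    if not isinstance(frame_name, str):
        raise TypeError("frame_name must be a string")
    builder.add_rows(column_data)
    return builder.build(frame_name)
```

A call might look like build_frame(builder, {"id": [1, 2], "name": ["Ann", "Bob"]}, "people"), where builder is a PgxFrameBuilder obtained from the session. The builder is not closed here because the documentation does not say whether closing it affects frames it has already built.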
- class pypgx.api.frames.PgxFrameColumn(java_pgx_frame_column)
Bases: PgxContextManager
Class representing one column of a PgxFrame.
- close()
Destroy/close the object. Base implementation, overridable by subclasses.
- Raises
NotImplementedError – Not implemented.
- Return type
None
- destroy()
Free resources on the server taken up by this column.
- Return type
None
- get_descriptor()
Return a description of the column.
- Return type
Tuple[str, str]
- class pypgx.api.frames.PgxGenericFrameReader(java_pgx_generic_frame_reader)
Bases: object
A generic class for reading PgxFrame objects from various sources.
The class allows configuration of how the data should be read and facilitates the creation of specialized frame readers (PgxCsvFrameReader and PgxPgbFrameReader).
- auto_detect_columns(auto_detect)
Enable or disable the autodetection of columns from the table.
Not all formats support autodetection of the columns (only DB in fact).
Executing this function clears the currently loaded column descriptors.
- Parameters
auto_detect (bool) – True if the columns should be autodetected, False otherwise
- Returns
self
- Return type
PgxGenericFrameReader
- clear_columns()
Clear the current configuration of which columns should be loaded and how.
- Returns
None
- Return type
None
- columns(column_descriptors)
Set the columns to be loaded from their columnDescriptors.
Executing this function disables autodetection of columns.
- Parameters
column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
- Returns
self
- Return type
PgxGenericFrameReader
- csv(uris=None)
Create a PgxCsvFrameReader object for loading CSV files from URIs.
- Parameters
uris (Optional[str]) – List of paths to the CSV files
- Return type
PgxCsvFrameReader
- csv_async(uris)
Read a PgxFrame from a list of URIs to CSV files.
- Parameters
uris (str) – list denoting the URIs
- Returns
the read frame
- db()
Create a PgxDbFrameReader object to load a PgxFrame from a database.
- Return type
PgxDbFrameReader
- format(format)
Return a frame reader for the type format specified.
- Parameters
format (str) – format to be loaded. Can be one of ‘csv’ or ‘pgb’
- Returns
a loader for the format
- Return type
- name(frame_name)
Set the frame name.
- Parameters
frame_name (str) – New name for the PgxFrame
- Returns
self
- Return type
PgxGenericFrameReader
- pgb(uris=None)
Create a PgxPgbFrameReader object for loading PGB files.
- Parameters
uris (Optional[str]) – List of paths to the PGB files
- Return type
PgxPgbFrameReader
- class pypgx.api.frames.PgxGenericFrameStorer(java_pgx_generic_frame_storer)
Bases: object
Class for configuring the storing operation of a PgxFrame and then triggering it.
- csv()
Get a PgxCsvFrameStorer instance for the PgxFrame.
- Return type
PgxCsvFrameStorer
- db()
Get a PgxDbFrameStorer instance for the PgxFrame.
- Return type
PgxDbFrameStorer
- format(format)
Create a specialized frame storer for saving the PgxFrame in a given format.
- Parameters
format (str) – identifier of the wanted format. Can be one of ‘csv’, ‘pgb’ or ‘db’
- Return type
- name(frame_name)
Set the name of the stored frame.
- Parameters
frame_name (str) – the new frame name
- Returns
self
- Return type
PgxGenericFrameStorer
- overwrite(overwrite_bool)
Set overwrite
- Parameters
overwrite_bool (bool) – denotes if the table should be overwritten.
- Returns
self
- Return type
PgxGenericFrameStorer
- pgb()
Get a PgxPgbFrameStorer instance for the PgxFrame.
- Return type
PgxPgbFrameStorer
- class pypgx.api.frames.PgxPgbFrameReader(java_pgx_pgb_frame_reader)
Bases: object
Class for reading PgxFrame objects from PGB files.
- auto_detect_columns(auto_detect)
Enable or disable the autodetection of columns from the table.
Executing this function clears the currently loaded column descriptors.
- Parameters
auto_detect (bool) – True if the columns should be autodetected, False otherwise
- Returns
self
- Return type
PgxPgbFrameReader
- clear_columns()
Clear the current configuration of which columns should be loaded and how.
- Returns
None
- Return type
None
- columns(column_descriptors)
Set the columns to be loaded from their columnDescriptors.
Executing this function disables autodetection of columns.
- Parameters
column_descriptors (List[Tuple[str, str]]) – List of tuples (columnName, columnType)
- Returns
self
- Return type
PgxPgbFrameReader
- load(uris)
Load a PgxFrame from the provided URIs.
- Parameters
uris (str) – the URIs from which to load the frame
- Returns
PgxFrame instance
- Return type
PgxFrame
- load_async(uris)
Load a PgxFrame from the provided URIs.
- Parameters
uris (str) – the URIs from which to load the frame
- Returns
PgxFrame instance
- class pypgx.api.frames.PgxPgbFrameStorer(java_pgx_pgb_frame_storer)
Bases: object
Class for configuring the storing operation of a PgxFrame to a PGB file and then triggering it.
- name(frame_name)
Set the frame name.
- Parameters
frame_name (str) – frame name.
- Returns
this storer
- Return type
PgxPgbFrameStorer
- overwrite(overwrite_bool)
Set the overwrite flag.
- Parameters
overwrite_bool (bool) – denotes if the table should be overwritten.
- Returns
this storer
- Return type
PgxPgbFrameStorer
- store()
Store the PgxFrame.
- Returns
None
- Return type
None
- class pypgx.api.frames.VectorType(componenttype, dimension)
Bases: object
Represents a vector type which can be used in a PgxFrame.
- Parameters
componenttype (str) –
dimension (int) –
- get_value_type()
Get the type of component.
- Returns
the component type
- Return type
str
- simple_string()
Get the string representation of a vector.
- Returns
simple string of a vector
- Return type
str