PGX can automatically detect the graph configuration for GraphML files (non-partitioned graphs only) and CSV files. Both loading the graph immediately and creating a graph config object for a graph file are supported. If the expected format is not explicitly specified, PGX tries to detect the format from the file extension and by looking for known magic words inside the file.
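The two-step detection described above (file extension first, then known markers inside the file) can be illustrated with a minimal sketch. This is not PGX's actual detection code: the class name, the `Format` enum, the `.graphml` and `.csv` extensions, and the `<graphml` marker (the root element of a GraphML document) are assumptions chosen for illustration only.

```java
import java.util.Locale;

// Hypothetical sketch of extension-then-content format detection.
// None of these names come from the PGX API.
class FormatDetectionSketch {

    enum Format { GRAPHML, CSV, UNKNOWN }

    // Step 1: consult the file extension.
    // Step 2: fall back to scanning the start of the content for a marker.
    static Format detect(String fileName, String contentPrefix) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".graphml")) {
            return Format.GRAPHML;
        }
        if (lower.endsWith(".csv")) {
            return Format.CSV;
        }
        // Assumed marker: GraphML documents have a <graphml> root element.
        if (contentPrefix.contains("<graphml")) {
            return Format.GRAPHML;
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(detect("social.graphml", ""));
        System.out.println(detect("export.xml", "<?xml version=\"1.0\"?><graphml>"));
        System.out.println(detect("notes.txt", "plain text"));
    }
}
```

Checking the extension before the content keeps the common case cheap, since the file only has to be opened when its name is inconclusive.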
Property names may be assigned automatically
PGB files do not contain any names for the vertex and edge properties. Therefore, PGX will generate names for all properties found in the file.
CSV format detection
Format detection for CSV files requires the files to contain a header, and this header must follow the syntax described here.
PgxSession methods can be used to load a graph without an explicit graph configuration:
PgxFuture<PgxGraph> readGraphFileAsync(String path)
PgxFuture<PgxGraph> readGraphFilesAsync(List<String> paths)
If the vertices and edges are located in separate files, the following methods should be used:
PgxFuture<PgxGraph> readGraphFilesAsync(String vertexPath, String edgePath)
PgxFuture<PgxGraph> readGraphFilesAsync(List<String> vertexPaths, List<String> edgePaths)
PgxSession methods can be used to generate a graph config for a specific graph:
PgxFuture<GraphConfig> describeGraphFileAsync(String path)
PgxFuture<GraphConfig> describeGraphFilesAsync(List<String> paths)
As above, the following methods should be used when vertices and edges do not share the same file:
PgxFuture<GraphConfig> describeGraphFilesAsync(String vertexPath, String edgePath)
PgxFuture<GraphConfig> describeGraphFilesAsync(List<String> vertexPaths, List<String> edgePaths)
Variants of these methods are available. See the Javadocs for a complete overview.
When loading a graph from CSV files with automatic configuration detection, the files have to conform to the following syntax.
Headers for special columns (ID, labels, edge source and destination) are of the form
The name can be omitted, but if it is present the column will be loaded as a property with this name in addition to its special purpose.
The columns holding property data have a header of the form
Columns where no property type is specified default to
A column can be skipped by the loader by using the
Here is an example of a vertex table header:
This header defines four columns:
1. the vertex ID column, of type
2. a property named prop1 of type
3. a property named prop2 of type
4. the vertex labels column
The following keywords are recognized for vertex tables:
:VID(type;table_name): vertex ID
type: type of the vertex ID (
table_name (only for partitioned graphs, optional): if it is omitted, the table name will be generated from the file name by removing the extension and an optional _n suffix, where n is an integer partition number.
(e.g. a table in
people_2.csv will be called
Files with the same table name are loaded into the same vertex table.
In that case, their structure has to be identical.
:LABELS(separator) (only in non-partitioned graphs, optional): vertex labels
separator (optional). Specifies a different character to separate labels in the column than the default
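The rule for generating a table name from a file name (drop the extension, then drop an optional `_n` partition suffix) and the grouping of files that share a table name can be sketched as follows. This is an illustration under stated assumptions, not the PGX implementation: the class and method names are hypothetical, and the handling of directory separators is a guess at reasonable behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical sketch of table-name generation from CSV file names.
class TableNameSketch {

    // Optional "_n" suffix, where n is an integer partition number.
    private static final Pattern PARTITION_SUFFIX = Pattern.compile("_\\d+$");

    static String tableNameFor(String path) {
        // Keep only the last path segment (assumption: directories are ignored).
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String base = path.substring(slash + 1);
        // Remove the file extension, if there is one.
        int dot = base.lastIndexOf('.');
        if (dot > 0) {
            base = base.substring(0, dot);
        }
        // Remove the optional partition suffix.
        return PARTITION_SUFFIX.matcher(base).replaceFirst("");
    }

    // Files that map to the same table name end up in the same table.
    static Map<String, List<String>> groupByTable(List<String> paths) {
        Map<String, List<String>> tables = new TreeMap<>();
        for (String p : paths) {
            tables.computeIfAbsent(tableNameFor(p), k -> new ArrayList<>()).add(p);
        }
        return tables;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("people_1.csv", "people_2.csv", "cities.csv");
        System.out.println(groupByTable(files));
    }
}
```

In this sketch, two partition files of the same table are grouped under one key, which is why their column structure would need to be identical.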
Here is an example of an edge table header:
This header defines four columns:
1. a column whose contents will be ignored when loading the table
2. the edge ID column, which defines the edge table name to be relationships and will also be loaded as a property named id (of type
3. the source vertex ID column, which points to the vertex table
4. the destination vertex ID column, which also points to the vertex table
The following keywords are recognized for edge tables:
:SRC(source_table): source vertex table
source_table (only for partitioned graphs): specifies the name of the table containing the source vertices of the edges in this table.
:DST(destination_table): destination vertex table
destination_table (only for partitioned graphs): specifies the name of the table containing the destination vertices of the edges in this table.
:EID(table_name): edge ID
table_name (only for partitioned graphs, optional): if it is omitted or the :EID field is not present in the header, the table name is generated the same way as for vertex tables.
:LABEL (only for non-partitioned graphs, optional): edge label
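To make the keyword notation above more concrete, here is a small sketch of how a single header cell could be split into an optional property name, a keyword, and its arguments. It assumes a header shape of `name:KEYWORD(arguments)`, inferred from the keyword listings; the class, the parsing rules, and the sample values `long`, `people`, and `relationships` used in `main` are illustrative assumptions and not the PGX loader's actual behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for special-column headers such as "id:EID(relationships)".
class HeaderKeywordSketch {

    // Assumed shape: [name]:KEYWORD[(arguments)]
    private static final Pattern SPECIAL =
        Pattern.compile("^([^:]*):(VID|LABELS|SRC|DST|EID|LABEL)(?:\\(([^)]*)\\))?$");

    static final class SpecialColumn {
        final String name;       // empty when the name is omitted
        final String keyword;    // e.g. "VID"
        final List<String> args; // empty when no parentheses are given

        SpecialColumn(String name, String keyword, List<String> args) {
            this.name = name;
            this.keyword = keyword;
            this.args = args;
        }
    }

    // Returns null when the header is not a special column under the assumed shape.
    static SpecialColumn parse(String header) {
        Matcher m = SPECIAL.matcher(header.trim());
        if (!m.matches()) {
            return null;
        }
        String keyword = m.group(2);
        String raw = m.group(3);
        List<String> args;
        if (raw == null || raw.isEmpty()) {
            args = Collections.emptyList();
        } else if (keyword.equals("VID")) {
            // :VID(type;table_name) is the only keyword listed with two parameters.
            args = Arrays.asList(raw.split(";", -1));
        } else {
            // Other keywords take a single argument, kept verbatim.
            args = Collections.singletonList(raw);
        }
        return new SpecialColumn(m.group(1), keyword, args);
    }

    public static void main(String[] args) {
        SpecialColumn c = parse("id:EID(relationships)");
        System.out.println(c.name + " " + c.keyword + " " + c.args);
    }
}
```

Only the `:VID` argument list is split on `;` here, so that a `:LABELS` separator argument is passed through unchanged even if it happens to be a semicolon.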
It is possible to load graphs from CSV files exported from or created for (i.e., with header formats supported by) Neo4j and Amazon Neptune. Currently, properties of type point and duration (for Neo4j) as well as array properties (for both formats) are not supported.
More examples are provided in this tutorial