16.1.4.5 Flat File (FLAT_FILE)
The Flat File format is a text file format containing two description files, one for vertices and one for edges. Each file consists of a list of properties with the following format:
vertices.opv
vertex_ID, key_name, value_type, value, value, value
<V-1> <V-1, VPK-1> <V-1, VPT-1> [<V-1, VP-1> <V-1, VP-1> <V-1, VP-1>]
...
<V-1> <V-1, VPK-N> <V-1, VPT-1> [<V-1, VP-N> <V-1, VP-N> <V-1, VP-N>]
<V-2> <V-2, VPK-1> <V-2, VPT-1> [<V-2, VP-1> <V-2, VP-1> <V-2, VP-1>]
...
<V-2> <V-2, VPK-N> <V-2, VPT-N> [<V-2, VP-N> <V-2, VP-N> <V-2, VP-N>]
...
<V-V> <V-V, VPK-N> <V-V, VPT-N> [<V-V, VP-N> <V-V, VP-N> <V-V, VP-N>]
edges.ope
edge_ID, source_vertex_ID, destination_vertex_ID, edge_label, key_name, value_type, value, value, value
<E-1> <V-1, VG-1> <E-1, EL-1> <E-1, EPK-1> <E-1, EPT-1> [<E-1, EP-1> <E-1, EP-1> <E-1, EP-1>]
...
<E-1> <V-N, VG-N> <E-1, EL-N> <E-1, EPK-N> <E-1, EPT-N> [<E-1, EP-N> <E-1, EP-N> <E-1, EP-N>]
<E-2> <V-1, VG-1> <E-2, EL-1> <E-2, EPK-1> <E-2, EPT-1> [<E-2, EP-1> <E-2, EP-1> <E-2, EP-1>]
...
<E-2> <V-N, VG-N> <E-2, EL-N> <E-2, EPK-N> <E-2, EPT-N> [<E-2, EP-N> <E-2, EP-N> <E-2, EP-N>]
...
<E-E> <V-N, VG-N> <E-E, EL-N> <E-E, EPK-N> <E-E, EPT-N> [<E-E, EP-N> <E-E, EP-N> <E-E, EP-N>]
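As a rough illustration of the two layouts above, the following Python sketch splits one record from each file into named fields. The class and function names are hypothetical and are not part of PGX. The sketch assumes one record per line and relies on the rule, listed in the special considerations for this format, that commas inside values are encoded as %2C, so a plain split on comma is sufficient.

```python
from typing import NamedTuple, Tuple


class VertexRecord(NamedTuple):
    """One line of vertices.opv (hypothetical helper type)."""
    vertex_id: str
    key_name: str
    value_type: str
    values: Tuple[str, ...]  # the three trailing value fields


class EdgeRecord(NamedTuple):
    """One line of edges.ope (hypothetical helper type)."""
    edge_id: str
    source_id: str
    destination_id: str
    label: str
    key_name: str
    value_type: str
    values: Tuple[str, ...]  # the three trailing value fields


def split_vertex_line(line: str) -> VertexRecord:
    # A vertex record has 3 leading fields plus 3 value fields.
    fields = line.rstrip("\n").split(",")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    return VertexRecord(fields[0], fields[1], fields[2], tuple(fields[3:]))


def split_edge_line(line: str) -> EdgeRecord:
    # An edge record has 6 leading fields plus 3 value fields.
    fields = line.rstrip("\n").split(",")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    return EdgeRecord(*fields[:6], tuple(fields[6:]))
```

All fields are kept as raw strings here; converting them according to value_type is a separate step.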
Special Considerations when Using Flat File Format
- When no properties are defined for a certain vertex or edge, %20 is used instead of the key name:
  Vertices: 1,%20,,,,
  Edges: 1,2,1,"label",%20,,,,
- Values that are neither numeric nor dates go in the first value field; numeric values go in the second, and dates in the third.
- The following shows the mapping between PGX property type and flat file value_type:

Table 16-3 Mapping between PGX Property Type and Flat File value_type

PGX property type          Flat file value_type
STRING                     1
INTEGER                    2
FLOAT                      3
DOUBLE                     4
DATE                       5
LOCAL_DATE                 5
TIME                       5
TIMESTAMP                  5
TIME_WITH_TIMEZONE         5
TIMESTAMP_WITH_TIMEZONE    5
BOOLEAN                    6
LONG                       7
POINT2D                    20
Note:
When loading a graph in flat file format into PGX, the graph configuration is used to find the right temporal or spatial type.
- The standard for the flat file format defines comma as the only valid delimiter; therefore, any delimiter set in the graph configuration is ignored and comma is used instead.
- Strings must not be quoted. However, the following characters must be encoded:
- '%' -> '%25'
- '\t' -> '%09'
- ' ' -> '%20'
- '\n' -> '%0A'
- ',' -> '%2C'
- When storing a graph into flat file format, vertex labels will be ignored. Also, when a graph has no edge label, an empty string ("") will be stored instead.
- When loading a graph in parallel using flat file format, all information regarding a specific vertex or edge must be contained in the same partition; otherwise, unexpected behavior might occur.
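The encoding, value placement, and value_type rules above can be sketched in Python as follows. This is a minimal illustration, not PGX code: every function and constant name is hypothetical. It assumes that key names contain no characters that need encoding, that all temporal types are placed in the third value field like dates, and that booleans are written as lowercase true or false in the first field. POINT2D is left out because the text does not state which field it uses.

```python
# Characters that must be encoded. '%' comes first so that percent signs
# introduced by the later replacements are not encoded a second time.
ENCODINGS = [("%", "%25"), ("\t", "%09"), (" ", "%20"), ("\n", "%0A"), (",", "%2C")]

# Subset of Table 16-3 (POINT2D omitted, see the assumptions above).
VALUE_TYPES = {
    "STRING": 1, "INTEGER": 2, "FLOAT": 3, "DOUBLE": 4,
    "DATE": 5, "LOCAL_DATE": 5, "TIME": 5, "TIMESTAMP": 5,
    "TIME_WITH_TIMEZONE": 5, "TIMESTAMP_WITH_TIMEZONE": 5,
    "BOOLEAN": 6, "LONG": 7,
}
NUMERIC_TYPES = {"INTEGER", "FLOAT", "DOUBLE", "LONG"}


def encode(text):
    """Apply the character encodings required by the flat file format."""
    for raw, escaped in ENCODINGS:
        text = text.replace(raw, escaped)
    return text


def decode(text):
    """Undo encode(); '%25' is handled last so decoded '%' is not re-read."""
    for raw, escaped in reversed(ENCODINGS):
        text = text.replace(escaped, raw)
    return text


def value_fields(pgx_type, value):
    """Return the three value fields with the value in the right slot."""
    if pgx_type == "BOOLEAN":
        text = "true" if value else "false"
    else:
        text = encode(str(value))
    if pgx_type in NUMERIC_TYPES:
        return ["", text, ""]      # numeric values: second field
    if VALUE_TYPES[pgx_type] == 5:
        return ["", "", text]      # dates (assumed: all temporal types): third field
    return [text, "", ""]          # everything else: first field


def vertex_line(vertex_id, key=None, pgx_type=None, value=None):
    """Build one vertices.opv record."""
    if key is None:
        # No properties: %20 replaces the key name, remaining fields stay empty.
        return f"{vertex_id},%20,,,,"
    fields = [str(vertex_id), key, str(VALUE_TYPES[pgx_type])]
    return ",".join(fields + value_fields(pgx_type, value))
```

For example, vertex_line(1, "stringProp", "STRING", "foo") produces the record 1,stringProp,1,foo,, and vertex_line(1) produces 1,%20,,,, for a vertex without properties.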
Example 16-6 Graph in Flat File Text format
The following example shows a graph of 4 vertices (1, 2, 3 and 4), each having a double and a string property, and 3 edges, each having a boolean and a date property, encoded in Flat File Text format:
vertices.opv:
1,doubleProp,4,,8.0,
1,stringProp,1,foo,,
2,doubleProp,4,,4.3,
2,stringProp,1,bar,,
3,doubleProp,4,,6.1,
3,stringProp,1,bax,,
4,doubleProp,4,,17.78,
4,stringProp,1,f00,,
edges.ope:
1,2,1,label,boolProp,6,false,,
1,2,1,label,dateProp,5,,,1985-10-18%2010:00:00
2,3,2,label,boolProp,6,true,,
2,3,2,label,dateProp,5,,,1961-12-30%2014:45:14
3,3,4,label,boolProp,6,false,,
3,3,4,label,dateProp,5,,,2001-01-15%2007:00:43
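To make the example concrete, the following Python sketch reads these records and collects the typed properties per vertex and per edge. It only illustrates the file layout and is not how PGX loads graphs. The function names are hypothetical, IDs are kept as strings, and the date pattern %Y-%m-%d %H:%M:%S is an assumption taken from the sample values (in PGX the temporal type is determined by the graph configuration).

```python
import re
from datetime import datetime

VERTEX_LINES = """\
1,doubleProp,4,,8.0,
1,stringProp,1,foo,,
2,doubleProp,4,,4.3,
2,stringProp,1,bar,,
3,doubleProp,4,,6.1,
3,stringProp,1,bax,,
4,doubleProp,4,,17.78,
4,stringProp,1,f00,,
""".splitlines()

EDGE_LINES = """\
1,2,1,label,boolProp,6,false,,
1,2,1,label,dateProp,5,,,1985-10-18%2010:00:00
2,3,2,label,boolProp,6,true,,
2,3,2,label,dateProp,5,,,1961-12-30%2014:45:14
3,3,4,label,boolProp,6,false,,
3,3,4,label,dateProp,5,,,2001-01-15%2007:00:43
""".splitlines()

_DECODE = {"%25": "%", "%09": "\t", "%20": " ", "%0A": "\n", "%2C": ","}


def decode(text):
    """Replace the five encoded sequences with their original characters."""
    return re.sub(r"%(?:25|09|20|0A|2C)", lambda m: _DECODE[m.group(0)], text)


def convert(value_type, values):
    """Pick the populated value field for value_type and convert it."""
    first, second, third = values
    if value_type == 1:                       # STRING: first field
        return decode(first)
    if value_type == 6:                       # BOOLEAN: first field
        return first == "true"
    if value_type in (2, 7):                  # INTEGER, LONG: second field
        return int(second)
    if value_type in (3, 4):                  # FLOAT, DOUBLE: second field
        return float(second)
    if value_type == 5:                       # dates: third field (assumed pattern)
        return datetime.strptime(decode(third), "%Y-%m-%d %H:%M:%S")
    raise ValueError(f"unsupported value_type {value_type}")


def read_vertices(lines):
    vertices = {}
    for line in lines:
        vid, key, vtype, *values = line.split(",")
        props = vertices.setdefault(vid, {})
        if key != "%20":                      # %20 marks "no properties"
            props[key] = convert(int(vtype), values)
    return vertices


def read_edges(lines):
    edges = {}
    for line in lines:
        eid, src, dst, label, key, vtype, *values = line.split(",")
        edge = edges.setdefault(
            eid, {"source": src, "destination": dst, "label": label, "properties": {}}
        )
        if key != "%20":
            edge["properties"][key] = convert(int(vtype), values)
    return edges
```

Calling read_vertices(VERTEX_LINES) yields one entry per vertex ID with both properties merged, which reflects how several records with the same ID describe a single vertex.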