PGX 20.2.2
Binary Formats

PGX Binary Format (PGB)

The PGX binary format (.pgb) is the proprietary binary format of PGX, designed for fast and efficient file processing. Fundamentally, the file is a binary dump of the graph and property data. Bytes are written in network byte order (big endian).

Type Encoding

Value Type Size in bytes
0 Boolean 1
1 Integer 4
2 Long 8
3 Float 4
4 Double 8
7 String varies
11 Vertex labels varies
13 Local date 4
14 Time 4
15 Timestamp 8
16 Time with time zone 8
17 Timestamp with time zone 12
18 Vector property variable: <sizeof component-type> * <dimension>
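
To illustrate how a reader might use the fixed-size entries of this table, here is a minimal Python sketch. The names `PGB_SCALAR_FORMATS`, `type_size` and `decode_values` are hypothetical and not part of PGX. The sketch assumes that integer and long values are signed and that float and double values are IEEE 754, which the table does not state, and it omits the temporal types because only their sizes are given here.

```python
import struct

# Type code -> struct format character for one value.
# Assumption: integer/long are signed, float/double are IEEE 754.
PGB_SCALAR_FORMATS = {
    0: "?",  # boolean, 1 byte
    1: "i",  # integer, 4 bytes
    2: "q",  # long, 8 bytes
    3: "f",  # float, 4 bytes
    4: "d",  # double, 8 bytes
}

def type_size(type_code):
    """Return the size in bytes of one value of a fixed-size scalar type."""
    return struct.calcsize(">" + PGB_SCALAR_FORMATS[type_code])

def decode_values(type_code, data):
    """Decode consecutive big-endian values of the given type from bytes."""
    fmt = PGB_SCALAR_FORMATS[type_code]
    count = len(data) // type_size(type_code)
    return list(struct.unpack(">%d%s" % (count, fmt), data))
```

The `>` prefix selects big-endian byte order without padding, which matches the network byte order used throughout the file.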

File Layout

Size in bytes Description Required Comment
4 magic word Yes 0x99191191
4 vertex size Yes allowed values are 4 and 8
4 edge size Yes allowed values are 4 and 8
<vertex size> number of vertices Yes
<edge size> number of edges Yes
<edge size> * (<numVertices> + 1) edge begin array Yes
<vertex size> * <numEdges> destination vertex array Yes
1 component bitmap Yes 0x0001: vertex keys, 0x0002: vertex labels, 0x0004: edge label, 0x0008: edge keys, 0x0010: property names, other bits: reserved
4 vertexKey type No Only present if component bitmap & 0x0001 == 0x0001. See table above for type encoding
<vertex key layout> vertex keys No Only present if component bitmap & 0x0001 == 0x0001.
4 edgeKey type No Only present if component bitmap & 0x0008 == 0x0008. See table above for type encoding
<numEdges> * 8 edge keys No Only present if component bitmap & 0x0008 == 0x0008.
4 number of vertex properties Yes
<num vertex properties> * <property layout> vertex property data Yes (see Property Layout below)
4 number of edge properties Yes
<num edge properties> * <property layout> edge property data Yes (see Property Layout below)
<vertex labels layout> vertex labels No Only present if component bitmap & 0x0002 == 0x0002
<edge labels layout> edge label No Only present if component bitmap & 0x0004 == 0x0004
4 number of shared pools Yes
<shared pools size> shared pools No (see Shared Pools Layout below)
<property names size> property names No Only present if component bitmap & 0x0010 == 0x0010. See Property Names Layout below
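
The required leading fields of the file layout could be read along the following lines. This is a sketch only: `read_uint`, `read_topology` and `out_neighbors` are hypothetical names, counts are read as unsigned integers (an assumption that makes no difference for non-negative values), and `out_neighbors` assumes that the edge begin array and the destination vertex array together form a compressed sparse row structure, which the table suggests but does not spell out.

```python
import io
import struct

PGB_MAGIC = 0x99191191

# Component bitmap flags as listed in the File Layout table.
HAS_VERTEX_KEYS = 0x01
HAS_VERTEX_LABELS = 0x02
HAS_EDGE_LABEL = 0x04
HAS_EDGE_KEYS = 0x08
HAS_PROPERTY_NAMES = 0x10

def read_uint(stream, size):
    """Read one big-endian unsigned integer of `size` bytes."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of PGB data")
    return int.from_bytes(data, "big")

def read_topology(stream):
    """Read all fields up to and including the component bitmap."""
    magic = read_uint(stream, 4)
    if magic != PGB_MAGIC:
        raise ValueError("bad magic word 0x%08x" % magic)
    vertex_size = read_uint(stream, 4)
    edge_size = read_uint(stream, 4)
    if vertex_size not in (4, 8) or edge_size not in (4, 8):
        raise ValueError("vertex size and edge size must be 4 or 8")
    num_vertices = read_uint(stream, vertex_size)
    num_edges = read_uint(stream, edge_size)
    # Edge begin array: numVertices + 1 entries of <edge size> bytes.
    begin = [read_uint(stream, edge_size) for _ in range(num_vertices + 1)]
    # Destination vertex array: numEdges entries of <vertex size> bytes.
    dest = [read_uint(stream, vertex_size) for _ in range(num_edges)]
    bitmap = read_uint(stream, 1)
    return {"num_vertices": num_vertices, "num_edges": num_edges,
            "begin": begin, "dest": dest, "bitmap": bitmap}

def out_neighbors(graph, v):
    """Assumed CSR reading: edges of v are dest[begin[v]:begin[v + 1]]."""
    begin = graph["begin"]
    return graph["dest"][begin[v]:begin[v + 1]]
```

A caller would then test the returned bitmap against the flag constants to decide which optional sections to parse next.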

Vertex Key Layout

The layout of vertex keys depends on the vertexKey type. PGB supports integer, long and string vertex keys.

Integer Vertex Keys

Size in bytes Description Required Comment
<numVertices> * 4 key data Yes for each vertex, the corresponding integer key value

Long Vertex Keys

Size in bytes Description Required Comment
<numVertices> * 8 key data Yes for each vertex, the corresponding long key value

String Vertex Keys

Size in bytes Description Required Comment
4 compression scheme Yes reserved (must be 0)
8 property size Yes size of each element in bytes in the following data
<number of keys> * <string key element layout> string key data Yes content of the vertex keys (see table below)

String Key Element Layout

Size in bytes Description Required Comment
4 string length Yes length of the string in bytes
<string length> string key data Yes content of the string as bytes, without a terminating zero character
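
As a rough illustration, string vertex keys could be decoded as below. The function names are hypothetical, and the sketch assumes the string bytes are UTF-8, which this page does not specify. The 8-byte size field is read but not interpreted.

```python
import io
import struct

def read_string_element(stream):
    """Read one length-prefixed string (4-byte length, then the bytes)."""
    (length,) = struct.unpack(">i", stream.read(4))
    # Assumption: the bytes are UTF-8 encoded.
    return stream.read(length).decode("utf-8")

def read_string_vertex_keys(stream, num_vertices):
    """Read the string vertex key section and return one key per vertex."""
    compression, _size = struct.unpack(">iq", stream.read(12))
    if compression != 0:
        raise ValueError("compression scheme is reserved and must be 0")
    return [read_string_element(stream) for _ in range(num_vertices)]
```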

Property Layout

String properties and vector properties each have a special layout (see below).

Primitive Type Layout

Applies to all property types other than string and vector properties:

Size in bytes Description Required Comment
4 property type Yes see table above for type encoding
8 property size Yes size of the property data in bytes
<property size> property data Yes stored as <numVertices/numEdges> * <type size>
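
A primitive property could be read roughly as follows. This is a sketch under assumptions: `read_primitive_property` and `SCALAR_FORMATS` are hypothetical names, only the five basic scalar types are handled, and integer and long values are assumed to be signed.

```python
import io
import struct

# Type code -> struct format character (assumed signed integers).
SCALAR_FORMATS = {0: "?", 1: "i", 2: "q", 3: "f", 4: "d"}

def read_primitive_property(stream, num_entities):
    """Read one primitive property holding `num_entities` values."""
    type_code, size = struct.unpack(">iq", stream.read(12))
    fmt = SCALAR_FORMATS[type_code]
    # The data is stored as <numVertices/numEdges> * <type size> bytes.
    expected = num_entities * struct.calcsize(">" + fmt)
    if size != expected:
        raise ValueError("property size %d, expected %d" % (size, expected))
    values = struct.unpack(">%d%s" % (num_entities, fmt), stream.read(size))
    return type_code, list(values)
```

Checking the stored size against the expected size is optional, but it catches a mismatched vertex or edge count early.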

Vector Property Layout

Size in bytes Description Comment
4 vector type mark always equal to 18
8 size of vector property data and extra fields dataSize = <sizeof component-type> * <dimension> + 8 (the 8 extra bytes account for the two header fields that follow)
4 vector component data type valid types are integer, long, float and double, encoded with the value specified in the Type Encoding table
4 vector dimension number of components per vector value. Must be greater than 0 to be a valid vector property
dataSize - 8 data stored as an array of length <numVertices/numEdges> * <dimension>, in which the value of the j-th component of the vector for the i-th entity is at position i * <dimension> + j
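
The indexing rule for vector data can be sketched as follows. The names are hypothetical, signed integer components are an assumption, and the payload length is taken from the stored size field (dataSize - 8) rather than recomputed.

```python
import io
import struct

# Valid component types: integer, long, float, double.
VECTOR_COMPONENT_FORMATS = {1: "i", 2: "q", 3: "f", 4: "d"}

def read_vector_property(stream):
    """Read one vector property; return (dimension, flat list of values)."""
    mark, data_size = struct.unpack(">iq", stream.read(12))
    if mark != 18:
        raise ValueError("not a vector property")
    comp_type, dimension = struct.unpack(">ii", stream.read(8))
    if dimension <= 0:
        raise ValueError("vector dimension must be greater than 0")
    fmt = VECTOR_COMPONENT_FORMATS[comp_type]
    comp_size = struct.calcsize(">" + fmt)
    payload = stream.read(data_size - 8)
    if len(payload) % comp_size != 0:
        raise ValueError("payload is not a whole number of components")
    count = len(payload) // comp_size
    return dimension, list(struct.unpack(">%d%s" % (count, fmt), payload))

def component(flat, dimension, i, j):
    """Value of the j-th component of the vector of the i-th entity."""
    return flat[i * dimension + j]
```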

String Type Layout

Size in bytes Description Required Comment
4 property type Yes must be 7
8 property size Yes size of the following data in bytes
1 reserved Yes reserved (must be 0)
<dictionary layout> dictionary Yes string dictionary used in the property
<numVertices/numEdges> * 8 property content Yes content of the string property, stored as IDs that refer to the strings in the dictionary

String Dictionary Layout

Size in bytes Description Required Comment
1 reserved Yes reserved (must be 0)
8 number of strings Yes number of strings in the following dictionary
<number of strings> * <dictionary element layout> dictionary data Yes (see table below)

String Dictionary Element Layout

Size in bytes Description Required Comment
8 string id Yes unique id of the string
4 string length Yes length of the string in bytes
<string length> string data Yes content of the string as bytes, without a terminating zero character
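
Putting the string type layout and the dictionary layout together, a reader might resolve a string property as sketched below. `read_dictionary` and `read_string_property` are hypothetical names, UTF-8 is an assumed encoding, and the size and reserved fields are read without being validated.

```python
import io
import struct

def read_dictionary(stream):
    """Read a string dictionary and return a mapping of id -> string."""
    _reserved, count = struct.unpack(">bq", stream.read(9))
    dictionary = {}
    for _ in range(count):
        string_id, length = struct.unpack(">qi", stream.read(12))
        # Assumption: the bytes are UTF-8 encoded.
        dictionary[string_id] = stream.read(length).decode("utf-8")
    return dictionary

def read_string_property(stream, num_entities):
    """Read a string property and return one string per vertex or edge."""
    type_code, _size, _reserved = struct.unpack(">iqb", stream.read(13))
    if type_code != 7:
        raise ValueError("string property type must be 7")
    dictionary = read_dictionary(stream)
    # Property content: one 8-byte dictionary id per entity.
    ids = struct.unpack(">%dq" % num_entities, stream.read(8 * num_entities))
    return [dictionary[string_id] for string_id in ids]
```

Because each distinct string is stored once in the dictionary, repeated values cost only 8 bytes per entity in the content array.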

Vertex Labels Layout

Size in bytes Description Required Comment
4 type Yes must be 11
8 size Yes size of the following data in bytes
<dictionary layout> dictionary Yes string dictionary used in the vertex labels
<numVertices + 1> * 8 string id begin array Yes offset into the <string ids> array for each vertex
8 number of string ids Yes the number of string ids
<number of string ids> * 8 string ids Yes array of string ids in the string dictionary
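
A possible way to turn this layout into a list of labels per vertex is sketched below. It assumes that the labels of vertex v are the string ids in positions begin[v] up to, but not including, begin[v + 1], and that strings are UTF-8; both are assumptions. The dictionary reader is repeated so the block stands alone, and all names are hypothetical.

```python
import io
import struct

def read_dictionary(stream):
    """Read a string dictionary and return a mapping of id -> string."""
    _reserved, count = struct.unpack(">bq", stream.read(9))
    dictionary = {}
    for _ in range(count):
        string_id, length = struct.unpack(">qi", stream.read(12))
        dictionary[string_id] = stream.read(length).decode("utf-8")
    return dictionary

def read_vertex_labels(stream, num_vertices):
    """Read the vertex labels section; return a list of labels per vertex."""
    type_code, _size = struct.unpack(">iq", stream.read(12))
    if type_code != 11:
        raise ValueError("vertex labels type must be 11")
    dictionary = read_dictionary(stream)
    n = num_vertices + 1
    begin = struct.unpack(">%dq" % n, stream.read(8 * n))
    (num_ids,) = struct.unpack(">q", stream.read(8))
    ids = struct.unpack(">%dq" % num_ids, stream.read(8 * num_ids))
    # Assumed reading: vertex v owns ids[begin[v]:begin[v + 1]].
    return [[dictionary[i] for i in ids[begin[v]:begin[v + 1]]]
            for v in range(num_vertices)]
```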

Edge Label Layout

The edge label layout follows the string type layout.

Shared Pools Layout

Size in bytes Description Required Comment
1 type Yes 1: enum, 2: prefixed

Type == Enum

Size in bytes Description Required Comment
8 num strings Yes
<number of strings> * <string table layout> dictionary data Yes (See table below)

Type == Prefix

Size in bytes Description Required Comment
8 num prefixes Yes
<number of prefixes> * <string table layout> dictionary data Yes (See table below)
8 num suffixes Yes
<number of suffixes> * <string table layout> dictionary data Yes (See table below)

String Table for Shared Pools

Size in bytes Description Required Comment
8 string id Yes string can be literal (in case of enum) or prefix/suffix (in case of prefix)
4 string length Yes
<string length> string data Yes
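
The two shared pool variants could be parsed along these lines. This is a sketch with hypothetical function names and an assumed UTF-8 encoding; how prefixes and suffixes are later combined into full strings is not described on this page, so the sketch only returns the raw tables.

```python
import io
import struct

def read_count(stream):
    """Read an 8-byte big-endian count."""
    (count,) = struct.unpack(">q", stream.read(8))
    return count

def read_string_table(stream, count):
    """Read `count` entries of (8-byte id, 4-byte length, string bytes)."""
    table = {}
    for _ in range(count):
        string_id, length = struct.unpack(">qi", stream.read(12))
        table[string_id] = stream.read(length).decode("utf-8")
    return table

def read_shared_pool(stream):
    """Read one shared pool, dispatching on its 1-byte type."""
    (pool_type,) = struct.unpack(">b", stream.read(1))
    if pool_type == 1:  # enum: a single table of literal strings
        strings = read_string_table(stream, read_count(stream))
        return {"type": "enum", "strings": strings}
    if pool_type == 2:  # prefixed: a prefix table, then a suffix table
        prefixes = read_string_table(stream, read_count(stream))
        suffixes = read_string_table(stream, read_count(stream))
        return {"type": "prefix", "prefixes": prefixes, "suffixes": suffixes}
    raise ValueError("unknown shared pool type %d" % pool_type)
```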

Property Names Layout

Size in bytes Description Required Comment
8 size Yes Total size in bytes of property name section
<sum of size of vertex property names> vertex property names No (Follows the String Key Element Layout, see table above)
<sum of size of edge property names> edge property names No (Follows the String Key Element Layout, see table above)
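
For completeness, the property names section might be read as follows. The sketch assumes that one name is stored per vertex property followed by one per edge property, using the counts from the file layout, and that names are UTF-8 encoded; neither is stated explicitly above. `read_property_names` is a hypothetical name, and the 8-byte size field is read but not checked.

```python
import io
import struct

def read_property_names(stream, num_vertex_props, num_edge_props):
    """Return (vertex property names, edge property names)."""
    (_size,) = struct.unpack(">q", stream.read(8))

    def read_names(count):
        names = []
        for _ in range(count):
            # String Key Element Layout: 4-byte length, then the bytes.
            (length,) = struct.unpack(">i", stream.read(4))
            names.append(stream.read(length).decode("utf-8"))
        return names

    vertex_names = read_names(num_vertex_props)
    edge_names = read_names(num_edge_props)
    return vertex_names, edge_names
```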