15.1.6 Binary File Formats

PGX Binary Format (PGB)

PGX binary format (.pgb) is the proprietary binary format of the graph server (PGX), designed for fast and efficient file processing. Fundamentally, a PGB file is a binary dump of the graph and its property data. All multi-byte values are written in network byte order (big endian).
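
As a minimal illustration of network byte order, the following Python sketch packs the magic word listed in the File Layout table. Treating the value as an unsigned 32-bit integer is an assumption; the format description does not state signedness.

```python
import struct

# Magic word from the File Layout table, treated as unsigned 32-bit (assumption).
MAGIC = 0x99191191

# ">" selects big-endian (network) byte order; "I" is an unsigned 4-byte integer.
magic_bytes = struct.pack(">I", MAGIC)
print(magic_bytes.hex())  # 99191191: most significant byte first

# Decoding must use the same byte order, otherwise the value is scrambled.
(decoded_big,) = struct.unpack(">I", magic_bytes)
(decoded_little,) = struct.unpack("<I", magic_bytes)
```

Here `decoded_big` recovers the magic word, while `decoded_little` shows what a reader would see if it wrongly assumed little-endian data.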

Type Encoding

Table 15-3 Type Encoding

| Value | Type | Size in bytes |
|---|---|---|
| 0 | Boolean | 1 |
| 1 | Integer | 4 |
| 2 | Long | 8 |
| 3 | Float | 4 |
| 4 | Double | 8 |
| 7 | String | varies |
| 11 | Vertex labels | varies |
| 13 | Local date | 4 |
| 14 | Time | 4 |
| 15 | Timestamp | 8 |
| 16 | Time with time zone | 8 |
| 17 | Timestamp with time zone | 12 |
| 18 | Vector property | variable: <sizeof component-type> * <dimension> |
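
A reader implementation would typically keep this table as a lookup structure. The sketch below is one possible way to do that; the names `FIXED_SIZE_TYPES`, `VARIABLE_SIZE_TYPES`, and `type_size` are hypothetical and not part of PGX.

```python
# Fixed-width type codes from Table 15-3: code -> (name, size in bytes).
FIXED_SIZE_TYPES = {
    0: ("boolean", 1),
    1: ("integer", 4),
    2: ("long", 8),
    3: ("float", 4),
    4: ("double", 8),
    13: ("local date", 4),
    14: ("time", 4),
    15: ("timestamp", 8),
    16: ("time with time zone", 8),
    17: ("timestamp with time zone", 12),
}

# Type codes whose size depends on the stored data.
VARIABLE_SIZE_TYPES = {7: "string", 11: "vertex labels", 18: "vector property"}


def type_size(code):
    """Return the fixed size in bytes for a type code, or None if it varies."""
    if code in FIXED_SIZE_TYPES:
        return FIXED_SIZE_TYPES[code][1]
    if code in VARIABLE_SIZE_TYPES:
        return None
    raise ValueError(f"unknown PGB type code: {code}")
```

Codes that do not appear in the table (for example 5 or 6) are rejected rather than guessed.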

File Layout

Table 15-4 File Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | magic word | Yes | 0x99191191 |
| 4 | vertex size | Yes | Allowed values are 4 and 8. |
| 4 | edge size | Yes | Allowed values are 4 and 8. |
| <vertex size> | number of vertices | Yes | |
| <edge size> | number of edges | Yes | |
| <edge size> * (<numVertices> + 1) | edge begin array | Yes | |
| <vertex size> * <numEdges> | destination vertex array | Yes | |
| 1 | component bitmap | Yes | Bit flags. 0x0001: vertex keys; 0x0002: vertex labels; 0x0004: edge label; 0x0008: edge keys; other bits: reserved. |
| 4 | vertexKey type | No | Only present if component bitmap & 0x0001 == 0x0001. See Table 15-3 for type encoding. |
| <vertex key layout> | vertex keys | No | Only present if component bitmap & 0x0001 == 0x0001. |
| 4 | edgeKey type | No | Only present if component bitmap & 0x0008 == 0x0008. See Table 15-3 for type encoding. |
| <numEdges> * 8 | edge keys | No | Only present if component bitmap & 0x0008 == 0x0008. |
| 4 | number of vertex properties | Yes | |
| <num vertex properties> * <property layout> | vertex property data | Yes | See Property Layout. |
| 4 | number of edge properties | Yes | |
| <num edge properties> * <property layout> | edge property data | Yes | See Property Layout. |
| <vertex labels layout> | vertex labels | No | Only present if component bitmap & 0x0002 == 0x0002. |
| <edge labels layout> | edge label | No | Only present if component bitmap & 0x0004 == 0x0004. |
| 4 | number of shared pools | Yes | |
| <shared pools size> | shared pools | No | |
| <property names size> | property names | No | Only present if component bitmap & 0x0010 == 0x0010. See Table 15-19. |
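
The required fields at the start of the file can be read sequentially. The following Python sketch reads everything up to and including the component bitmap. It rests on several assumptions that the table does not state: integers are treated as unsigned, and the edge begin array together with the destination vertex array is interpreted as a compressed sparse row (CSR) adjacency structure, in which the out-neighbors of vertex `v` occupy positions `begin[v]` to `begin[v + 1] - 1` of the destination array. The function names are hypothetical.

```python
import io
import struct

MAGIC = 0x99191191
UINT_FORMATS = {4: ">I", 8: ">Q"}  # big-endian; unsigned is an assumption


def read_uint(stream, size):
    """Read one big-endian integer of `size` bytes (4 or 8)."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of PGB data")
    return struct.unpack(UINT_FORMATS[size], data)[0]


def read_topology(stream):
    """Read the required fields up to and including the component bitmap."""
    if read_uint(stream, 4) != MAGIC:
        raise ValueError("not a PGB file: bad magic word")
    vertex_size = read_uint(stream, 4)
    edge_size = read_uint(stream, 4)
    if vertex_size not in (4, 8) or edge_size not in (4, 8):
        raise ValueError("vertex size and edge size must be 4 or 8")
    num_vertices = read_uint(stream, vertex_size)
    num_edges = read_uint(stream, edge_size)
    begin = [read_uint(stream, edge_size) for _ in range(num_vertices + 1)]
    dest = [read_uint(stream, vertex_size) for _ in range(num_edges)]
    bitmap_bytes = stream.read(1)  # the table lists the bitmap as 1 byte
    if len(bitmap_bytes) != 1:
        raise EOFError("unexpected end of PGB data")
    bitmap = bitmap_bytes[0]
    return {
        "num_vertices": num_vertices,
        "num_edges": num_edges,
        "begin": begin,
        "dest": dest,
        "has_vertex_keys": bool(bitmap & 0x01),
        "has_vertex_labels": bool(bitmap & 0x02),
        "has_edge_label": bool(bitmap & 0x04),
        "has_edge_keys": bool(bitmap & 0x08),
    }


def out_neighbors(topology, v):
    """Destination vertices of v, assuming a CSR interpretation of the arrays."""
    begin, dest = topology["begin"], topology["dest"]
    return dest[begin[v]:begin[v + 1]]


# Hand-built sample: 3 vertices, edges 0->1, 0->2, 1->2, vertex labels flag set.
sample = (
    struct.pack(">IIIII", MAGIC, 4, 4, 3, 3)
    + struct.pack(">4I", 0, 2, 3, 3)   # edge begin array
    + struct.pack(">3I", 1, 2, 2)      # destination vertex array
    + bytes([0x02])                    # component bitmap
)
topology = read_topology(io.BytesIO(sample))
```

The sample bytes are constructed by hand for illustration and were not produced by PGX. A full reader would continue with the optional and required sections that follow the bitmap.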

Vertex Key Layout

The layout of vertex keys depends on the vertexKey type. PGB supports integer, long and string vertex keys.

Table 15-5 Integer Vertex Keys

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| <numVertices> * 4 | key data | Yes | For each vertex, the corresponding integer key value. |

Table 15-6 Long Vertex Keys

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| <numVertices> * 8 | key data | Yes | For each vertex, the corresponding long key value. |

Table 15-7 String Vertex Keys

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | compression scheme | Yes | Reserved (must be 0). |
| 8 | property size | Yes | Size in bytes of the data that follows. |
| <number of keys> * <string key element layout> | string key data | Yes | Content of the vertex keys (see Table 15-8). |

Table 15-8 String Key Element Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | string length | Yes | Length of the string in bytes. |
| <string length> | string key data | Yes | Content of the string as bytes, with no terminating zero character. |
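
A length-prefixed string element of this kind can be written and read as sketched below. The character encoding is not specified in this section, so UTF-8 is an assumption, as are the helper names.

```python
import io
import struct


def write_string_element(text):
    """Encode one string as a 4-byte big-endian length followed by the raw bytes."""
    raw = text.encode("utf-8")  # assumption: strings are UTF-8 encoded
    return struct.pack(">I", len(raw)) + raw


def read_string_element(stream):
    """Decode one length-prefixed string; there is no terminating zero byte."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("unexpected end of data while reading string length")
    (length,) = struct.unpack(">I", header)
    raw = stream.read(length)
    if len(raw) != length:
        raise EOFError("unexpected end of data while reading string bytes")
    return raw.decode("utf-8")


# Two elements back to back, as they would appear in the string key data.
encoded = write_string_element("v1") + write_string_element("München")
stream = io.BytesIO(encoded)
keys = [read_string_element(stream), read_string_element(stream)]
```

Note that the length counts bytes, not characters, so a key with non-ASCII characters occupies more bytes than it has characters.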

Property Layout

The layout of a property depends on its type. The following tables show the layout for primitive types and the special layouts for vector and string properties:

Table 15-9 Primitive Type Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | property type | Yes | See Table 15-3 for type encoding. |
| 8 | property size | Yes | Size of the property data in bytes. |
| <property size> | property data | Yes | Stored as <numVertices/numEdges> * <type size>. |
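
The following Python sketch reads one primitive property under this layout. It handles only the numeric types integer, long, float, and double, because the byte-level encoding of the date and time types is not described here. The function name and the consistency check between the property size and the entity count are assumptions.

```python
import io
import struct

# struct format characters for the numeric type codes in Table 15-3:
# type code -> (format character, size in bytes)
NUMERIC_FORMATS = {1: ("i", 4), 2: ("q", 8), 3: ("f", 4), 4: ("d", 8)}


def read_primitive_property(stream, num_entities):
    """Read a numeric property holding one value per vertex or edge."""
    header = stream.read(12)
    if len(header) != 12:
        raise EOFError("unexpected end of data in property header")
    # 4-byte property type followed by 8-byte property size, both big-endian.
    type_code, prop_size = struct.unpack(">iq", header)
    if type_code not in NUMERIC_FORMATS:
        raise ValueError(f"type code {type_code} is not handled by this sketch")
    char, type_size = NUMERIC_FORMATS[type_code]
    if prop_size != num_entities * type_size:
        raise ValueError("property size does not match entity count")
    data = stream.read(prop_size)
    if len(data) != prop_size:
        raise EOFError("unexpected end of data in property values")
    return type_code, list(struct.unpack(f">{num_entities}{char}", data))


# Hand-built sample: an integer property with three values.
sample = struct.pack(">iq", 1, 12) + struct.pack(">3i", 10, 20, 30)
type_code, values = read_primitive_property(io.BytesIO(sample), 3)
```

Passing `numVertices` or `numEdges` as `num_entities` selects whether the data is interpreted as a vertex or an edge property.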

Table 15-10 Vector Property Layout

| Size in bytes | Description | Comment |
|---|---|---|
| 4 | vector type mark | Always equal to 18. |
| 8 | size of vector property data and extra fields | dataSize = <sizeof component-type> * <dimension> + 8. The 8 extra bytes cover the two header fields that follow (component data type and dimension). |
| 4 | vector component data type | Valid types are integer, long, float, and double, encoded with the values specified in Table 15-3. |
| 4 | vector dimension | Number of components per vector value. Must be greater than 0 for a valid vector property. |
| dataSize - 8 | data | Stored as an array of length <numVertices/numEdges> * <dimension>, in which the value of the j-th component of the vector for the i-th entity is at position i * <dimension> + j. |
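
The indexing rule for vector data can be sketched as follows, assuming the flat array is laid out entity by entity so that component `j` of entity `i` sits at position `i * dimension + j`. The function name is hypothetical.

```python
def vector_component(flat, dimension, i, j):
    """Return component j of the vector belonging to entity i."""
    if dimension <= 0:
        raise ValueError("dimension must be greater than 0")
    if not 0 <= j < dimension:
        raise IndexError("component index out of range")
    return flat[i * dimension + j]


def vector_of(flat, dimension, i):
    """Return the whole vector of entity i as a list."""
    return flat[i * dimension:(i + 1) * dimension]


# Two entities with three components each, stored back to back.
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
second_vector = vector_of(flat, 3, 1)
first_component_of_second = vector_component(flat, 3, 1, 0)
```

With this layout, all components of one entity are contiguous, so a whole vector can be sliced out in one step.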

Table 15-11 String Type Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | property type | Yes | Must be 7. |
| 8 | property size | Yes | Size of the following data in bytes. |
| 1 | reserved | Yes | Reserved (must be 0). |
| <dictionary layout> | dictionary | Yes | String dictionary used in the property. |
| <numVertices/numEdges> * 8 | property content | Yes | Content of the string property, stored as IDs that refer to the strings in the dictionary. |

Table 15-12 String Dictionary Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 1 | reserved | Yes | Reserved (must be 0). |
| 8 | number of strings | Yes | Number of strings in the following dictionary. |
| <number of strings> * <dictionary element layout> | dictionary data | Yes | See Table 15-13. |

Table 15-13 String Dictionary Element Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | string id | Yes | Unique ID of the string. |
| 4 | string length | Yes | Length of the string in bytes. |
| <string length> | string data | Yes | Content of the string as bytes, with no terminating zero character. |
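
To show how the dictionary and the property content fit together, the sketch below reads a string dictionary and then resolves a list of 8-byte IDs against it. UTF-8 encoding, signed integers, and all function names are assumptions, and the sample bytes are hand-built.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or fail."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("unexpected end of data")
    return data


def read_string_dictionary(stream):
    """Read a dictionary: reserved byte, string count, then the elements."""
    (reserved,) = struct.unpack(">B", read_exact(stream, 1))
    if reserved != 0:
        raise ValueError("reserved byte must be 0")
    (count,) = struct.unpack(">q", read_exact(stream, 8))
    dictionary = {}
    for _ in range(count):
        # Each element: 8-byte string id, 4-byte length, then the raw bytes.
        string_id, length = struct.unpack(">qi", read_exact(stream, 12))
        dictionary[string_id] = read_exact(stream, length).decode("utf-8")
    return dictionary


def build_element(string_id, text):
    """Helper that encodes one dictionary element for the sample below."""
    raw = text.encode("utf-8")
    return struct.pack(">qi", string_id, len(raw)) + raw


# Sample: a dictionary with two strings, followed by content for 3 entities.
sample = (
    b"\x00"
    + struct.pack(">q", 2)
    + build_element(0, "red")
    + build_element(1, "blue")
    + struct.pack(">3q", 1, 0, 1)
)
stream = io.BytesIO(sample)
dictionary = read_string_dictionary(stream)
ids = struct.unpack(">3q", read_exact(stream, 24))
values = [dictionary[string_id] for string_id in ids]
```

Each distinct string is stored once in the dictionary, and the per-entity content holds only the fixed-width IDs.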

Vertex Labels Layout

Table 15-14 Vertex Labels Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | type | Yes | Must be 11. |
| 8 | size | Yes | Size of the following data in bytes. |
| <dictionary layout> | dictionary | Yes | String dictionary used in the vertex labels. |
| <numVertices + 1> * 8 | string id begin array | Yes | <string ids> offset array for each vertex. |
| 8 | number of string ids | Yes | The number of string ids. |
| <number of string ids> * 8 | string ids | Yes | Array of string ids in the string dictionary. |
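
Once the three parts are decoded, the labels of a vertex can be looked up as sketched below. This assumes the string id begin array holds element offsets (not byte offsets) into the string ids array, so that the labels of vertex `v` are the ids at positions `begin[v]` to `begin[v + 1] - 1`. The function name and sample values are hypothetical.

```python
def labels_of(vertex, begin, string_ids, dictionary):
    """Return the label strings of one vertex."""
    start, end = begin[vertex], begin[vertex + 1]
    return [dictionary[string_id] for string_id in string_ids[start:end]]


# Decoded sample data for 3 vertices: vertex 0 has two labels,
# vertex 1 has none, vertex 2 has one.
dictionary = {0: "Person", 1: "Employee", 2: "Company"}
begin = [0, 2, 2, 3]        # numVertices + 1 entries
string_ids = [0, 1, 2]      # ids into the dictionary

labels_vertex_0 = labels_of(0, begin, string_ids, dictionary)
labels_vertex_1 = labels_of(1, begin, string_ids, dictionary)
labels_vertex_2 = labels_of(2, begin, string_ids, dictionary)
```

The extra entry at the end of the begin array marks the end of the last vertex's labels, which is why the array has `numVertices + 1` entries.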

Edge Label Layout

The edge label layout follows the string type layout.

Shared Pools Layout

Table 15-15 Shared Pools Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 1 | type | Yes | 1: enum, 2: prefixed |

Table 15-16 Type == Enum

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | num strings | Yes | |
| <number of strings> * <string table layout> | dictionary data | Yes | See Table 15-18. |

Table 15-17 Type == Prefix

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | num prefixes | Yes | |
| <number of prefixes> * <string table layout> | dictionary data | Yes | See Table 15-18. |
| 8 | num suffixes | Yes | |
| <number of suffixes> * <string table layout> | dictionary data | Yes | See Table 15-18. |

Table 15-18 String Table for Shared Pools

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | string id | Yes | String can be literal (in case of enum) or prefix/suffix (in case of prefix). |
| 4 | string length | Yes | |
| <string length> | string data | Yes | |

Property Names Layout

Table 15-19 Property Names Layout

| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | size | Yes | String can be literal (in case of enum) or prefix/suffix (in case of prefix). |
| <sum of size of vertex property names> | vertex property names | No | Follows the String Key Element Layout. See Table 15-8. |
| <sum of size of edge property names> | edge property names | No | Follows the String Key Element Layout. See Table 15-8. |