27.1.6 Binary File Formats
PGX Binary Format (PGB)
The PGX binary format (.pgb) is the proprietary binary format of the graph server (PGX). It allows fast and efficient file processing. Fundamentally, the file is a binary dump of the graph and property data. Bytes are written in network byte order (big endian).
Type Encoding
Table 27-3 Type Encoding
| Value | Type | Size in bytes |
|---|---|---|
| 0 | Boolean | 1 |
| 1 | Integer | 4 |
| 2 | Long | 8 |
| 3 | Float | 4 |
| 4 | Double | 8 |
| 7 | String | varies |
| 11 | Vertex labels | varies |
| 13 | Local date | 4 |
| 14 | Time | 4 |
| 15 | Timestamp | 8 |
| 16 | Time with time zone | 8 |
| 17 | Timestamp with time zone | 12 |
| 18 | Vector property | variable: <sizeof component-type> * <dimension> |
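As an illustration, the type encoding above can be held in a small lookup table. This is a minimal sketch, not part of the format definition: the names `PGB_TYPES` and `type_size` are hypothetical, and variable-size types are represented with `None`.

```python
# Type codes from Table 27-3, mapped to (name, fixed size in bytes).
# None marks a type whose size varies.
PGB_TYPES = {
    0: ("boolean", 1),
    1: ("integer", 4),
    2: ("long", 8),
    3: ("float", 4),
    4: ("double", 8),
    7: ("string", None),
    11: ("vertex labels", None),
    13: ("local date", 4),
    14: ("time", 4),
    15: ("timestamp", 8),
    16: ("time with time zone", 8),
    17: ("timestamp with time zone", 12),
    18: ("vector property", None),
}


def type_size(code, component_code=None, dimension=None):
    """Return the size in bytes of one value of the given type code.

    For vector properties (code 18) the size is
    <sizeof component-type> * <dimension>, so both extra arguments
    are required.
    """
    name, size = PGB_TYPES[code]
    if code == 18:
        # Valid component types are integer, long, float, double.
        if component_code not in (1, 2, 3, 4):
            raise ValueError("invalid vector component type")
        if dimension is None or dimension <= 0:
            raise ValueError("vector dimension must be greater than 0")
        return PGB_TYPES[component_code][1] * dimension
    if size is None:
        raise ValueError(f"type '{name}' has no fixed size")
    return size
```

For example, `type_size(18, component_code=4, dimension=3)` describes a vector of three doubles.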
File Layout
Table 27-4 File Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | magic word | Yes | 0x99191191 |
| 4 | vertex size | Yes | Allowed values are 4 and 8. |
| 4 | edge size | Yes | Allowed values are 4 and 8. |
| <vertex size> | number of vertices | Yes | |
| <edge size> | number of edges | Yes | |
| <edge size> * (<numVertices> + 1) | edge begin array | Yes | |
| <vertex size> * <numEdges> | destination vertex array | Yes | |
| 1 | component bitmap | Yes | |
| 4 | vertexKey type | No | Only present if component bitmap & 0x0001 == 0x0001. See Table 27-3 for type encoding. |
| <vertex key layout> | vertex keys | No | Only present if component bitmap & 0x0001 == 0x0001. |
| 4 | edgeKey type | No | Only present if component bitmap & 0x0008 == 0x0008. See Table 27-3 for type encoding. |
| <numEdges> * 8 | edge keys | No | Only present if component bitmap & 0x0008 == 0x0008. |
| 4 | number of vertex properties | Yes | |
| <num vertex properties> * <property layout> | property data | Yes | See Table 27-10. |
| 4 | number of edge properties | Yes | |
| <num edge properties> * <property layout> | property data | Yes | See Edge Property Layout. |
| <vertex labels layout> | vertex labels | No | Only present if component bitmap & 0x0002 == 0x0002. |
| <edge labels layout> | edge label | No | Only present if component bitmap & 0x0004 == 0x0004. |
| 4 | number of shared pools | Yes | |
| <shared pools size> | shared pools | No | |
| <property names size> | property names | No | Only present if component bitmap & 0x0010 == 0x0010. See Table 27-19. |
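The first fields of the file layout can be read as sketched below. This is an illustrative sketch under stated assumptions, not a reference reader: the names `read_uint`, `read_topology`, `neighbors`, and the `HAS_*` constants are hypothetical, all fields are treated as unsigned, and the edge begin array is assumed to index into the destination vertex array in compressed sparse row style (the edges of vertex `v` are the entries from `begin[v]` up to `begin[v + 1]`).

```python
import io
import struct

MAGIC = 0x99191191

# Component bitmap flags, taken from the "Only present if" conditions.
HAS_VERTEX_KEYS = 0x01
HAS_VERTEX_LABELS = 0x02
HAS_EDGE_LABELS = 0x04
HAS_EDGE_KEYS = 0x08
HAS_PROPERTY_NAMES = 0x10


def read_uint(stream, size):
    """Read one big-endian integer of `size` bytes (assumed unsigned)."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of PGB data")
    return int.from_bytes(data, "big")


def read_topology(stream):
    """Read the fields from the magic word up to the component bitmap."""
    if read_uint(stream, 4) != MAGIC:
        raise ValueError("not a PGB file: bad magic word")
    vertex_size = read_uint(stream, 4)
    edge_size = read_uint(stream, 4)
    if vertex_size not in (4, 8) or edge_size not in (4, 8):
        raise ValueError("vertex size and edge size must be 4 or 8")
    num_vertices = read_uint(stream, vertex_size)
    num_edges = read_uint(stream, edge_size)
    begin = [read_uint(stream, edge_size) for _ in range(num_vertices + 1)]
    dest = [read_uint(stream, vertex_size) for _ in range(num_edges)]
    bitmap = read_uint(stream, 1)
    return {
        "num_vertices": num_vertices,
        "num_edges": num_edges,
        "begin": begin,
        "dest": dest,
        "bitmap": bitmap,
    }


def neighbors(graph, v):
    """Destinations of the edges of vertex v (CSR assumption, see above)."""
    return graph["dest"][graph["begin"][v]:graph["begin"][v + 1]]
```

A caller would then test the bitmap, for example `graph["bitmap"] & HAS_VERTEX_KEYS`, before attempting to read the optional vertex key section.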
Vertex Key Layout
The layout of vertex keys depends on the vertexKey type. PGB supports integer, long and string vertex keys.
Table 27-5 Integer Vertex Keys
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| <numVertices> * 4 | key data | Yes | For each vertex, the corresponding integer key value. |
Table 27-6 Long Vertex Keys
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| <numVertices> * 8 | key data | Yes | For each vertex, the corresponding long key value. |
Table 27-7 String Vertex Keys
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | compression scheme | Yes | Reserved (must be 0). |
| 8 | property size | Yes | Size of each element in bytes in the following data. |
| <number of keys> * <string key element layout> | string key data | Yes | Content of the vertex keys (see Table 27-8). |
Table 27-8 String Key Element Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | string length | Yes | Length of the string in bytes. |
| <string length> | string key data | Yes | Content of the string as bytes, with no terminating zero character. |
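String vertex keys can be decoded by reading the two header fields and then one length-prefixed element per key. The sketch below is illustrative only: `read_string_keys` is a hypothetical name, the number of keys is passed in by the caller, and UTF-8 is an assumption because the tables do not name a character encoding. The property size field is returned as read, without interpretation.

```python
import io
import struct


def read_string_keys(stream, num_keys):
    """Read a string vertex key section (Tables 27-7 and 27-8)."""
    # 4-byte compression scheme followed by 8-byte property size.
    compression, property_size = struct.unpack(">IQ", stream.read(12))
    if compression != 0:
        raise ValueError("compression scheme is reserved and must be 0")
    keys = []
    for _ in range(num_keys):
        # Each element: 4-byte length, then that many bytes, no terminator.
        (length,) = struct.unpack(">I", stream.read(4))
        raw = stream.read(length)
        if len(raw) != length:
            raise EOFError("truncated string key")
        keys.append(raw.decode("utf-8"))  # assumption: UTF-8 encoded
    return property_size, keys
```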
Property Layout
The layout of a property depends on its type. The following tables show the layout for primitive types, and the special layouts for vector properties and string properties:
Table 27-9 Primitive Type Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | property type | Yes | See Table 27-3 for type encoding. |
| 8 | property size | Yes | Size of the property data in bytes. |
| <property size> | property data | Yes | Stored as <numVertices/numEdges> * <type size>. |
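A primitive property can be decoded in one pass once the number of vertices or edges is known. This is a minimal sketch under assumptions: `read_primitive_property` and `PRIMITIVE_FORMATS` are hypothetical names, only type codes 0 to 4 are handled, and integers are assumed to be signed.

```python
import io
import struct

# Type code -> (struct format character, size in bytes), per Table 27-3.
# Assumption: integer and long values are signed.
PRIMITIVE_FORMATS = {
    0: ("?", 1),  # boolean
    1: ("i", 4),  # integer
    2: ("q", 8),  # long
    3: ("f", 4),  # float
    4: ("d", 8),  # double
}


def read_primitive_property(stream, num_entities):
    """Read one property stored in the primitive type layout."""
    type_code, size = struct.unpack(">IQ", stream.read(12))
    if type_code not in PRIMITIVE_FORMATS:
        raise ValueError(f"type code {type_code} is not handled by this sketch")
    fmt, width = PRIMITIVE_FORMATS[type_code]
    # The data is stored as <numVertices/numEdges> * <type size> bytes.
    if size != num_entities * width:
        raise ValueError("property size does not match the entity count")
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("truncated property data")
    # ">" selects big endian (network byte order) with no padding.
    return type_code, list(struct.unpack(f">{num_entities}{fmt}", data))
```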
Table 27-10 Vector Property Layout
| Size in bytes | Description | Comment |
|---|---|---|
| 4 | vector type mark | Always equal to 18. |
| 8 | size of vector property data and extra fields | dataSize = <sizeof component-type> * <dimension> + 8 (the 8 extra bytes account for the two header fields that follow: component data type and dimension). |
| 4 | vector component data type | Valid types are integer, long, float, double. Encoded with the value specified in Table 27-3. |
| 4 | vector dimension | Number of components per vector value. Must be greater than 0 to be a valid vector property. |
| dataSize - 8 | data | Stored as an array of length `<numVertices/numEdges> * <dimension>`, in which the value of the j-th component of the vector for the i-th entity is at position `i * <dimension> + j`. |
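The indexing rule for vector data (component j of entity i at position i * dimension + j) can be sketched as follows. The names `read_vector_property` and `component` are hypothetical. The sketch assumes that the data array holds one vector per vertex or edge, reads exactly that many components, and does not validate the size field; integer components are assumed to be signed.

```python
import io
import struct

# Valid component types: integer, long, float, double (Table 27-3 codes).
COMPONENT_FORMATS = {1: ("i", 4), 2: ("q", 8), 3: ("f", 4), 4: ("d", 8)}


def read_vector_property(stream, num_entities):
    """Read a vector property; returns (dimension, flat component list)."""
    # type mark (4), data size (8), component type (4), dimension (4)
    mark, data_size, comp_type, dimension = struct.unpack(
        ">IQII", stream.read(20)
    )
    if mark != 18:
        raise ValueError("vector type mark must be 18")
    if comp_type not in COMPONENT_FORMATS:
        raise ValueError("invalid vector component type")
    if dimension == 0:
        raise ValueError("vector dimension must be greater than 0")
    fmt, width = COMPONENT_FORMATS[comp_type]
    count = num_entities * dimension  # assumption: one vector per entity
    raw = stream.read(count * width)
    if len(raw) != count * width:
        raise EOFError("truncated vector data")
    return dimension, list(struct.unpack(f">{count}{fmt}", raw))


def component(flat, dimension, i, j):
    """Value of the j-th component of the vector for the i-th entity."""
    return flat[i * dimension + j]
```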
Table 27-11 String Type Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | property type | Yes | Must be 7. |
| 8 | property size | Yes | Size of the following data in bytes. |
| 1 | reserved | Yes | Reserved (must be 0). |
| <dictionary layout> | dictionary | Yes | String dictionary used in the property. |
| <numVertices/numEdges> * 8 | property content | Yes | Content of the string property, stored as IDs that refer to the strings in the dictionary. |
Table 27-12 String Dictionary Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 1 | reserved | Yes | Reserved (must be 0). |
| 8 | number of strings | Yes | Number of strings in the following dictionary. |
| <number of strings> * <dictionary element layout> | dictionary data | Yes | See Table 27-13. |
Table 27-13 String Dictionary Element Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | string id | Yes | Unique ID of the string. |
| 4 | string length | Yes | Length of the string in bytes. |
| <string length> | string data | Yes | Content of the string as bytes, with no terminating zero character. |
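A string dictionary maps IDs to strings, and the property content then refers to those IDs. The sketch below decodes a dictionary into a Python `dict`; `read_dictionary` and `resolve` are hypothetical names, IDs are treated as unsigned, and UTF-8 is an assumption because the tables do not name a character encoding.

```python
import io
import struct


def read_dictionary(stream):
    """Read a string dictionary (Tables 27-12 and 27-13)."""
    # 1-byte reserved field followed by the 8-byte number of strings.
    reserved, count = struct.unpack(">BQ", stream.read(9))
    if reserved != 0:
        raise ValueError("reserved byte must be 0")
    dictionary = {}
    for _ in range(count):
        # Each element: 8-byte string id, 4-byte length, then the bytes.
        string_id, length = struct.unpack(">QI", stream.read(12))
        raw = stream.read(length)
        if len(raw) != length:
            raise EOFError("truncated dictionary entry")
        dictionary[string_id] = raw.decode("utf-8")  # assumption: UTF-8
    return dictionary


def resolve(dictionary, ids):
    """Translate a sequence of string ids into the strings they refer to."""
    return [dictionary[i] for i in ids]
```

With this in place, the property content of a string property is a list of 8-byte IDs that `resolve` turns back into strings.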
Vertex Labels Layout
Table 27-14 Vertex Labels Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 4 | type | Yes | Must be 11. |
| 8 | size | Yes | Size of the following data in bytes. |
| <dictionary layout> | dictionary | Yes | String dictionary used in the vertex labels. |
| <numVertices + 1> * 8 | string id begin array | Yes | <string ids> offset array for each vertex. |
| 8 | number of string ids | Yes | The number of string ids. |
| <number of string ids> * 8 | string ids | Yes | Array of string ids in the string dictionary. |
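Because a vertex can carry several labels, the labels of one vertex are found through the begin array. The sketch below reads the two trailing arrays and looks up the labels of a vertex. It is illustrative only: `read_label_arrays` and `labels_of` are hypothetical names, the dictionary is assumed to be already decoded into a Python `dict`, and the begin array is assumed to hold offsets such that the ids of vertex `v` run from `begin[v]` up to `begin[v + 1]`.

```python
import io
import struct


def read_label_arrays(stream, num_vertices):
    """Read the string id begin array and the string ids array."""
    n = num_vertices + 1
    begin = list(struct.unpack(f">{n}Q", stream.read(8 * n)))
    (count,) = struct.unpack(">Q", stream.read(8))
    string_ids = list(struct.unpack(f">{count}Q", stream.read(8 * count)))
    return begin, string_ids


def labels_of(vertex, begin, string_ids, dictionary):
    """Labels of one vertex, resolved through the string dictionary."""
    start, end = begin[vertex], begin[vertex + 1]
    return [dictionary[s] for s in string_ids[start:end]]
```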
Edge Label Layout
The edge label layout follows the string type layout (see Table 27-11).
Shared Pools Layout
Table 27-15 Shared Pools Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 1 | type | Yes | 1: enum, 2: prefixed |
Table 27-16 Type == Enum
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | num strings | Yes | |
| <number of strings> * <string table layout> | dictionary data | Yes | See Table 27-18. |
Table 27-17 Type == Prefix
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | num prefixes | Yes | |
| <number of prefixes> * <string table layout> | dictionary data | Yes | See Table 27-18. |
| 8 | num suffixes | Yes | |
| <number of suffixes> * <string table layout> | dictionary data | Yes | See Table 27-18. |
Table 27-18 String Table for Shared Pools
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | string id | Yes | String can be literal (in case of enum) or prefix/suffix (in case of prefix). |
| 4 | string length | Yes | |
| <string length> | string data | Yes | |
Property Names Layout
Table 27-19 Property Names Layout
| Size in bytes | Description | Required | Comment |
|---|---|---|---|
| 8 | size | Yes | String can be literal (in case of enum) or prefix/suffix (in case of prefix). |
| <sum of size of vertex property names> | vertex property names | No | Follows the String Key Element Layout. See Table 27-8. |
| <sum of size of edge property names> | edge property names | No | Follows the String Key Element Layout. See Table 27-8. |