5.9.3 Vertex File

Each line in a vertex file is a record that describes a vertex of the property graph. A record describes one key-value property of a vertex, so a vertex with multiple properties requires multiple records (lines).

A record contains fields separated by commas. Each record must contain five commas to delimit the first six fields, whether or not they have values. An optional seventh field can be added (delimited from the sixth field by a comma) to define a vertex label:

vertex_ID, key_name, value_type, value, value, value, vertex_label

The following table describes the fields composing a vertex file record.

Table 5-2 Vertex File Record Format

Field 1: vertex_ID

An integer that uniquely identifies the vertex

Field 2: key_name

The name of the key in the key-value pair

If the vertex has no properties, then enter a space (%20). This example describes vertex 1 with no properties:

1,%20,,,,

Field 3: value_type

An integer that represents the data type of the value in the key-value pair:

  • 1 String
  • 2 Integer
  • 3 Float
  • 4 Double
  • 5 Timestamp (date)
  • 6 Boolean
  • 7 Long integer
  • 8 Short integer
  • 9 Byte
  • 10 Char
  • 20 Spatial data, which can be geospatial coordinates, lines, polygons, or Well-Known Text (WKT) literals
  • 101 Serializable Java object

Field 4: value

The encoded, nonnull value of key_name when it is neither numeric nor date

Field 5: value

The encoded, nonnull value of key_name when it is numeric

Field 6: value

The encoded, nonnull value of key_name when it is a timestamp (date)

Use the Java SimpleDateFormat class to identify the format of the date. This example describes the date format of 2015-03-26T00:00:00.000-05:00:

SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
encode(sdf.format((java.util.Date) value));

Field 7: vertex_label

The optional encoded label of the vertex, which can be used to describe the type or category of the vertex.
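The record layout described in the table can be sketched in Java as follows. This is a minimal illustration, not the Oracle implementation: the class VertexRecordSketch, its methods, and the fixed -05:00 time zone are hypothetical names and choices made for this example, and the encode helper escapes only the percent sign, comma, and space (the complete rules are in Encoding Special Characters). Only a few of the value types are covered.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class VertexRecordSketch {
    // Assumption: a fixed -05:00 offset, chosen only to mirror the date example in the table.
    static final TimeZone ZONE = TimeZone.getTimeZone("GMT-05:00");

    // Hypothetical, simplified encoder: escapes only '%', ',' and ' '.
    // '%' is replaced first so that the escapes added afterwards are not re-escaped.
    static String encode(String s) {
        return s.replace("%", "%25").replace(",", "%2C").replace(" ", "%20");
    }

    // Builds one record: vertex_ID,key_name,value_type,value,value,value[,vertex_label]
    static String format(long vertexId, String key, Object value, String label) {
        int type;
        String text = "";    // field 4: neither numeric nor date
        String number = "";  // field 5: numeric
        String date = "";    // field 6: timestamp
        if (value instanceof String) {
            type = 1;
            text = encode((String) value);
        } else if (value instanceof Integer) {
            type = 2;
            number = value.toString();
        } else if (value instanceof Double) {
            type = 4;
            number = value.toString();
        } else if (value instanceof Long) {
            type = 7;
            number = value.toString();
        } else if (value instanceof Date) {
            type = 5;
            SimpleDateFormat sdf =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", Locale.ROOT);
            sdf.setTimeZone(ZONE);
            date = encode(sdf.format((Date) value));
        } else {
            throw new IllegalArgumentException("unsupported value type: " + value);
        }
        String record = vertexId + "," + encode(key) + "," + type + ","
            + text + "," + number + "," + date;
        // The seventh field is optional and is appended only when a label is given.
        return label == null ? record : record + "," + encode(label);
    }

    // A vertex with no properties: the key is a single encoded space, other fields are empty.
    static String withoutProperties(long vertexId) {
        return vertexId + ",%20,,,,";
    }
}
```

Every record produced this way contains at least five commas, and exactly one of the three value fields is populated, depending on the value type.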

Required Grouping of Vertices: A vertex can have multiple properties, and the vertex file includes a record (represented by a single line of text in the flat file) for each combination of a vertex ID and a property for that vertex. In the vertex file, all records for each vertex must be grouped together (that is, not have any intervening records for other vertices). You can accomplish this any way you want, but a convenient way is to sort the vertex file records in ascending (or descending) order by vertex ID. (Note, however, that a vertex file is not required to have all records sorted by vertex ID; this is merely one way to achieve the grouping requirement.)
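The grouping requirement can be checked before loading with a short routine such as the following sketch. The class VertexGroupingCheck and its isGrouped method are hypothetical names introduced for this illustration; the routine assumes that the vertex ID is the text before the first comma on each line.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class VertexGroupingCheck {
    // Returns true when all records for each vertex ID are contiguous.
    // The records do not have to be sorted by vertex ID.
    static boolean isGrouped(List<String> lines) {
        Set<String> finished = new HashSet<>(); // IDs whose group has already ended
        String current = null;                  // ID of the group being read
        for (String line : lines) {
            int comma = line.indexOf(',');
            String id = comma < 0 ? line : line.substring(0, comma);
            if (id.equals(current)) {
                continue; // still inside the same group
            }
            if (current != null) {
                finished.add(current);
            }
            if (finished.contains(id)) {
                return false; // this ID reappears after records for another vertex
            }
            current = id;
        }
        return true;
    }
}
```

For example, records with vertex IDs in the order 2, 2, 1 pass the check, while the order 1, 2, 1 fails because the records for vertex 1 are separated by a record for vertex 2.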

When building a vertex file in Oracle flat file format, it is important to verify that the vertex property name and value fields are correctly encoded (see especially Encoding Special Characters). To simplify the encoding, you can use the OraclePropertyGraphUtils.escape Java API.

You can use the OraclePropertyGraphUtils.outputVertexRecord(os, vid, key, value) utility method to serialize a vertex record directly in Oracle flat file format. With this method, you no longer need to worry about encoding of special characters. The method writes a new line of text in the given output stream describing the key/value property of the given vertex identified by vid.

Example 5-25 Using OraclePropertyGraphUtils.outputVertexRecord

This example uses OraclePropertyGraphUtils.outputVertexRecord to write two new lines for vertex 1.

import java.io.FileOutputStream;
import java.io.OutputStream;

// Write two property records for vertex 1, each carrying the label "person".
OutputStream os = new FileOutputStream("./example.opv");
long vid = 1;
String label = "person";
OraclePropertyGraphUtils.outputVertexRecord(os, vid, label, "name", "Robert Smith");
OraclePropertyGraphUtils.outputVertexRecord(os, vid, label, "birth year", 1961);
os.flush();
os.close();

The first line in the generated output file describes the property name with the value "Robert Smith", and the second line describes the property birth year with the value 1961.

% cat example.opv
1,name,1,Robert%20Smith,,,person
1,birth%20year,2,,1961,,person
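Reading such records back follows the same layout in reverse. The following sketch is an illustration only: the class VertexRecordParser and its methods are hypothetical, and the decoder assumes that every %XX escape stands for a single ASCII character.

```java
final class VertexRecordParser {
    // Replaces each %XX escape with the character whose code is the hex value XX.
    static String decode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                out.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Splits a record into its six or seven fields and decodes each one.
    static String[] parse(String line) {
        // A limit of -1 keeps trailing empty fields, so "1,%20,,,," yields six fields.
        String[] fields = line.split(",", -1);
        if (fields.length < 6 || fields.length > 7) {
            throw new IllegalArgumentException("expected 6 or 7 fields: " + line);
        }
        for (int i = 0; i < fields.length; i++) {
            fields[i] = decode(fields[i]);
        }
        return fields;
    }
}
```

Applied to the second line of example.opv, parse returns seven fields: the key is "birth year", the numeric value field holds "1961", and the label is "person".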