PGX 20.1.1
Documentation

Loading a Graph from Two Database Tables

This guide explains the steps to load (property) graph data from RDBMS when it is represented by two relational tables. That is, each row in the first table corresponds to each node in the graph and its properties while the rows in the second table correspond to edges and their properties. Naturally, each node/edge property corresponds to one column in the graph. Check this document for more information.

In this guide you will perform the following two steps.

  1. Create an example graph data using the two table formats with a RDBMS.

  2. Use the PGX Shell or a Java application, load the example graph data from the two tables.

Creating an Example Graph Data with Two Tables

Schema of "Nodes" and "Edges" Tables

For this example, Nodes is the name of our nodes table and contains two columns:

  • VID: is the primary key that represents the node identifier.

  • NP1, NP2, NP3: refer to the node properties.

And Edges is the name of our edges table and has the following columns:

  • EID: is the primary key that represents the edge identifier.
  • SVID: is a foreign key to the "VID" column in the "Nodes" table. It represents the source node.
  • DVID: is also a foreign key to the "VID" column in the "Nodes" table. It represents the destination node.
  • EP1, EP2, ...: are the properties of the edges.
  • EL, ...: is the edge label.

SQL Scripts for Creating Two Tables and Populating Example Data

The following SQL script will create an example of the nodes and edges tables. Note: The example shown below has three node properties (string, date and numeric) and three edge properties (string, date and numeric)

CREATE TABLE nodes
( VID NUMBER,
  NP1 VARCHAR2(200),
  NP2 DATE,
  NP3 NUMBER,
  CONSTRAINT node_pk PRIMARY KEY (VID)
);
CREATE TABLE edges
( EID NUMBER,
  SVID NUMBER,
  DVID NUMBER,
  EP1 VARCHAR2(200),
  EP2 DATE,
  EP3 NUMBER,
  EL VARCHAR2(25),
  CONSTRAINT edge_pk PRIMARY KEY (EID),
  CONSTRAINT fk_SVID FOREIGN KEY (SVID) REFERENCES nodes(VID),
  CONSTRAINT fk_DVID FOREIGN KEY (DVID) REFERENCES nodes(VID)
);

We now insert some nodes and edges into the tables we just created:

INSERT INTO nodes (VID,NP1,NP2,NP3) VALUES (1, 'Node1', '01-JAN-01', 111.11);
INSERT INTO nodes (VID,NP1,NP2,NP3) VALUES (2, 'Node2', '02-FEB-02', 222.22);
INSERT INTO nodes (VID,NP1,NP2,NP3) VALUES (3, 'Node3', '03-MAR-03', 333.33);
INSERT INTO nodes (VID,NP1,NP2,NP3) VALUES (4, 'Node4', '04-APR-04', 444.44);

INSERT INTO edges (EID,SVID,DVID,EP1,EP2,EP3) VALUES (1, 2, 1, 'edge1', '01-JAN-01', 111.11);
INSERT INTO edges (EID,SVID,DVID,EP1,EP2,EP3) VALUES (2, 4, 2, 'edge2', '02-JAN-02', 222.22);
INSERT INTO edges (EID,SVID,DVID,EP1,EP2,EP3) VALUES (3, 2, 3, 'edge3', '03-JAN-03', 333.33);
INSERT INTO edges (EID,SVID,DVID,EP1,EP2,EP3) VALUES (4, 3, 4, 'edge4', '04-JAN-04', 444.44);

For now, our database contains five nodes and five edges. The following figure shows the corresponding graph:

sample graph

Loading the Graph from the Two Tables

Creating Graph Configuration

To load this graph into PGX, write a graph config. You can write the config in JSON format into a file or, when using the Java API, by using a TwoTablesRdbmsGraphConfigBuilder object. Following, we demonstrate both ways.

Change parameters

Notice that you will have to modify jdbc_url, username and password according to your own database configuration.

{
  "jdbc_url": "jdbc:oracle:thin:@mydatabaseserver:1521/dbName",
  "format": "two_tables",
  "datastore": "rdbms",
  "username": "dbUser",
  "password": "dbPassword",
  "nodes_table_name": "nodes",
  "edges_table_name": "edges",
  "vertex_props": [{
    "name": "NP1",
    "type": "string"
  },{
    "name": "NP2",
    "type": "timestamp"
  },{
    "name": "NP3",
    "type": "double"
  }],
  "edge_props": [{
    "name": "EP1",
    "type": "string"
  },{
    "name": "EP2",
    "type": "timestamp"
  },{
    "name": "EP3",
    "type": "float"
  }]
}
TwoTablesRdbmsGraphConfig config = GraphConfigBuilder.forTwoTablesRdbms()
  .setJdbcUrl("jdbc:oracle:thin:@mydatabaseserver:1521/dbName")
  .setUsername("dbUser")
  .setPassword("dbPassword")
  .setNodesTableName("nodes")
  .setEdgesTableName("edges")
  .addVertexProperty("NP1", PropertyType.STRING)
  .addVertexProperty("NP2", PropertyType.DATE)
  .addVertexProperty("NP3", PropertyType.DOUBLE)
  .addEdgeProperty("EP1", PropertyType.STRING)
  .addEdgeProperty("EP2", PropertyType.DATE)
  .addEdgeProperty("EP3", PropertyType.FLOAT)
  .build();

Actual Graph Loading from the PGX Shell or a Java Application

To load the graph data from the two tables into memory, simply write the graph config to readGraphWithProperties(). You can either write the path to the JSON file:

var G = session.readGraphWithProperties("path/to/config.json")
PgxSession session = Pgx.createSession("my-session");
PgxGraph G = session.readGraphWithProperties("path/to/config.json");

or the config object directly if you built the config programmatically:

var G = session.readGraphWithProperties(config)
PgxSession session = Pgx.createSession("my-session");
PgxGraph G = session.readGraphWithProperties(config);