Analytical Functions for RDF Data

7.10 Analytical Functions for RDF Data

You can perform analytical functions on RDF data by using the SemNetworkAnalyst class in the oracle.spatial.rdf.client.jena package.

This support integrates the Network Data Model Graph logic with the underlying RDF data structures. Therefore, to use analytical functions on RDF data, you must be familiar with the Network Data Model Graph feature, which is documented in Oracle Spatial Topology and Network Data Model Developer's Guide.

The required NDM Java libraries, including sdonm.jar and sdoutl.jar, are under the directory $ORACLE_HOME/md/jlib. Note that xmlparserv2.jar (under $ORACLE_HOME/xdk/lib) must be included in the classpath definition.

Example 7-6 Performing Analytical functions on RDF Data

Example 7-6 uses the SemNetworkAnalyst class, which internally uses the NDM NetworkAnalyst API

Oracle oracle = new Oracle(jdbcUrl, user, password);
GraphOracleSem graph = new GraphOracleSem(oracle, modelName);
 
Node nodeA = Node.createURI("http://A");
Node nodeB = Node.createURI("http://B");
Node nodeC = Node.createURI("http://C");
Node nodeD = Node.createURI("http://D");
Node nodeE = Node.createURI("http://E");
Node nodeF = Node.createURI("http://F");
Node nodeG = Node.createURI("http://G");
Node nodeX = Node.createURI("http://X");
 
// An anonymous node
Node ano = Node.createAnon(new AnonId("m1"));
 
Node relL = Node.createURI("http://likes");
Node relD = Node.createURI("http://dislikes");
Node relK = Node.createURI("http://knows");
Node relC = Node.createURI("http://differs");
 
graph.add(new Triple(nodeA, relL, nodeB));
graph.add(new Triple(nodeA, relC, nodeD));
graph.add(new Triple(nodeB, relL, nodeC));
graph.add(new Triple(nodeA, relD, nodeC));
 
graph.add(new Triple(nodeB, relD, ano));
graph.add(new Triple(nodeC, relL, nodeD));
graph.add(new Triple(nodeC, relK, nodeE));
graph.add(new Triple(ano,   relL, nodeD));
graph.add(new Triple(ano,   relL, nodeF));
graph.add(new Triple(ano,   relD, nodeB));
 
// X only likes itself
graph.add(new Triple(nodeX, relL, nodeX));
 
graph.commitTransaction();
HashMap<Node, Double> costMap = new HashMap<Node, Double>();
costMap.put(relL, Double.valueOf((double)0.5));
costMap.put(relD, Double.valueOf((double)1.5));
costMap.put(relC, Double.valueOf((double)5.5));
 
graph.setDOP(4); // this allows the underlying LINK/NODE tables
                 // and indexes to be created in parallel.
 
SemNetworkAnalyst sna = SemNetworkAnalyst.getInstance(
    graph,   // network data source
    true,    // directed graph
    true,    // cleanup existing NODE and LINK table
    costMap
    );
 
psOut.println("From nodeA to nodeC");
Node[] nodeArray = sna.shortestPathDijkstra(nodeA, nodeC);
printNodeArray(nodeArray, psOut);
 
psOut.println("From nodeA to nodeD"); 
nodeArray = sna.shortestPathDijkstra( nodeA, nodeD);
printNodeArray(nodeArray, psOut);
 
psOut.println("From nodeA to nodeF");
nodeArray = sna.shortestPathAStar(nodeA, nodeF);
printNodeArray(nodeArray, psOut);
 
psOut.println("From ano to nodeC");
nodeArray = sna.shortestPathAStar(ano, nodeC);
printNodeArray(nodeArray, psOut);
 
psOut.println("From ano to nodeX");
nodeArray = sna.shortestPathAStar(ano, nodeX);
printNodeArray(nodeArray, psOut);
 
graph.close();
oracle.dispose();
...
...
   
// A helper function to print out a path
public static void printNodeArray(Node[] nodeArray, PrintStream psOut)
{
  if (nodeArray == null) {
    psOut.println("Node Array is null");
    return;
  }
  if (nodeArray.length == 0) {psOut.println("Node Array is empty"); }
  int iFlag = 0;
  psOut.println("printNodeArray: full path starts");
  for (int iHops = 0; iHops < nodeArray.length; iHops++) {
    psOut.println("printNodeArray: full path item " + iHops + " = "
        + ((iFlag == 0) ? "[n] ":"[e] ") + nodeArray[iHops]);
    iFlag = 1 - iFlag;
  }
}

In Example 7-6:

A GraphOracleSem object is constructed and a few triples are added to the GraphOracleSem object. These triples describe several individuals and their relationships including likes, dislikes, knows, and differs.
A cost mapping is constructed to assign a numeric cost value to different links/predicates (of the RDF graph). In this case, 0.5, 1.5, and 5.5 are assigned to predicates likes, dislikes, and differs, respectively. This cost mapping is optional. If the mapping is absent, then all predicates will be assigned the same cost 1. When cost mapping is specified, this mapping does not need to be complete; for predicates not included in the cost mapping, a default value of 1 is assigned.

The output of Example 7-6 is as follows. In this output, the shortest paths are listed for the given start and end nodes. Note that the return value of sna.shortestPathAStar(ano, nodeX) is null because there is no path between these two nodes.

From nodeA to nodeC
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://A           ## "n" denotes Node             
printNodeArray: full path item 1 = [e] http://likes       ## "e" denotes Edge (Link)
printNodeArray: full path item 2 = [n] http://B
printNodeArray: full path item 3 = [e] http://likes
printNodeArray: full path item 4 = [n] http://C
 
From nodeA to nodeD
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://A
printNodeArray: full path item 1 = [e] http://likes
printNodeArray: full path item 2 = [n] http://B
printNodeArray: full path item 3 = [e] http://likes
printNodeArray: full path item 4 = [n] http://C
printNodeArray: full path item 5 = [e] http://likes
printNodeArray: full path item 6 = [n] http://D
 
From nodeA to nodeF
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://A
printNodeArray: full path item 1 = [e] http://likes
printNodeArray: full path item 2 = [n] http://B
printNodeArray: full path item 3 = [e] http://dislikes
printNodeArray: full path item 4 = [n] m1
printNodeArray: full path item 5 = [e] http://likes
printNodeArray: full path item 6 = [n] http://F
 
From ano to nodeC
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] m1
printNodeArray: full path item 1 = [e] http://dislikes
printNodeArray: full path item 2 = [n] http://B
printNodeArray: full path item 3 = [e] http://likes
printNodeArray: full path item 4 = [n] http://C
 
From ano to nodeX
Node Array is null

The underlying RDF graph view (SEMM_<rdf_graph_name> or RDFM_<rdf_graph_name>) cannot be used directly by NDM functions, and so SemNetworkAnalyst creates necessary tables that contain the nodes and links that are derived from a given RDF graph. These tables are not updated automatically when the RDF graph changes; rather, you can set the cleanup parameter in SemNetworkAnalyst.getInstance to true, to remove old node and link tables and to rebuild updated tables.

Example 7-7 Implementing NDM nearestNeighbors Function on Top of RDF Data

Example 7-7 implements the NDM nearestNeighbors function on top of RDF data. This gets a NetworkAnalyst object from the SemNetworkAnalyst instance, gets the node ID, creates PointOnNet objects, and processes LogicalSubPath objects.

%cat TestNearestNeighbor.java 
 
import java.io.*;
import java.util.*;
import org.apache.jena.graph.*;
import org.apache.jena.update.*;
import oracle.spatial.rdf.client.jena.*;
import oracle.spatial.rdf.client.jena.SemNetworkAnalyst;
import oracle.spatial.network.lod.LODGoalNode;
import oracle.spatial.network.lod.LODNetworkConstraint;
import oracle.spatial.network.lod.NetworkAnalyst;
import oracle.spatial.network.lod.PointOnNet;
import oracle.spatial.network.lod.LogicalSubPath;
 
/**
 * This class implements a nearestNeighbors function on top of RDF data
 * using public APIs provided in SemNetworkAnalyst and Oracle Spatial NDM
 */
public class TestNearestNeighbor
{
  public static void main(String[] args) throws Exception
  {
    String szJdbcURL = args[0];
    String szUser    = args[1];
    String szPasswd  = args[2];
 
    PrintStream psOut = System.out;
 
    Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd);
    
    String szModelName = "test_nn";
    // First construct a TBox and load a few axioms
    ModelOracleSem model = ModelOracleSem.createOracleSemModel(oracle, szModelName);
    String insertString =  
      " PREFIX my:  <http://my.com/> " +
      " INSERT DATA "                             +
      " { my:A   my:likes my:B .                " +
      "   my:A   my:likes my:C .                " +
      "   my:A   my:knows my:D .                " +
      "   my:A   my:dislikes my:X .             " +
      "   my:A   my:dislikes my:Y .             " +
      "   my:C   my:likes my:E .                " +
      "   my:C   my:likes my:F .                " +
      "   my:C   my:dislikes my:M .             " +
      "   my:D   my:likes my:G .                " +
      "   my:D   my:likes my:H .                " +
      "   my:F   my:likes my:M .                " +
      " }   ";
    UpdateAction.parseExecute(insertString,  model);
 
    GraphOracleSem g = model.getGraph();
    g.commitTransaction();
    g.setDOP(4);
 
    HashMap<Node, Double> costMap = new HashMap<Node, Double>();
    costMap.put(Node.createURI("http://my.com/likes"),    Double.valueOf(1.0));
    costMap.put(Node.createURI("http://my.com/dislikes"), Double.valueOf(4.0));
    costMap.put(Node.createURI("http://my.com/knows"),    Double.valueOf(2.0));
 
    SemNetworkAnalyst sna = SemNetworkAnalyst.getInstance(
        g,     // source RDF graph
        true,  // directed graph
        true,  // cleanup old Node/Link tables
        costMap
        );
 
    Node nodeStart = Node.createURI("http://my.com/A");
    long origNodeID = sna.getNodeID(nodeStart);
 
    long[] lIDs = {origNodeID};
 
    // translate from the original ID
    long nodeID = (sna.mapNodeIDs(lIDs))[0]; 
 
    NetworkAnalyst networkAnalyst = sna.getEmbeddedNetworkAnalyst();
 
    LogicalSubPath[] lsps = networkAnalyst.nearestNeighbors(
      new PointOnNet(nodeID),      // startPoint
      6,                           // numberOfNeighbors
      1,                           // searchLinkLevel
      1,                           // targetLinkLevel
      (LODNetworkConstraint) null, // constraint
      (LODGoalNode) null           // goalNodeFilter
      );
 
    if (lsps != null) {
      for (int idx = 0; idx < lsps.length; idx++) {
        LogicalSubPath lsp = lsps[idx];
        Node[] nodePath = sna.processLogicalSubPath(lsp, nodeStart);
        psOut.println("Path " + idx);
        printNodeArray(nodePath, psOut);
      }
    }
 
    g.close();
    sna.close();
    oracle.dispose();
  }
 
 
  public static void printNodeArray(Node[] nodeArray, PrintStream psOut)
  {
    if (nodeArray == null) {
      psOut.println("Node Array is null");
      return;
    }
    if (nodeArray.length == 0) {
      psOut.println("Node Array is empty");
    }
    int iFlag = 0;
    psOut.println("printNodeArray: full path starts");
    for (int iHops = 0; iHops < nodeArray.length; iHops++) {
      psOut.println("printNodeArray: full path item " + iHops + " = "
          + ((iFlag == 0) ? "[n] ":"[e] ") + nodeArray[iHops]);
      iFlag = 1 - iFlag;
    }
  }
}

The output of Example 7-7 is as follows.

Path 0
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://my.com/A
printNodeArray: full path item 1 = [e] http://my.com/likes
printNodeArray: full path item 2 = [n] http://my.com/C
 
Path 1
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://my.com/A
printNodeArray: full path item 1 = [e] http://my.com/likes
printNodeArray: full path item 2 = [n] http://my.com/B
 
Path 2
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://my.com/A
printNodeArray: full path item 1 = [e] http://my.com/knows
printNodeArray: full path item 2 = [n] http://my.com/D
 
Path 3
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://my.com/A
printNodeArray: full path item 1 = [e] http://my.com/likes
printNodeArray: full path item 2 = [n] http://my.com/C
printNodeArray: full path item 3 = [e] http://my.com/likes
printNodeArray: full path item 4 = [n] http://my.com/E
 
Path 4
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://my.com/A
printNodeArray: full path item 1 = [e] http://my.com/likes
printNodeArray: full path item 2 = [n] http://my.com/C
printNodeArray: full path item 3 = [e] http://my.com/likes
printNodeArray: full path item 4 = [n] http://my.com/F
 
Path 5
printNodeArray: full path starts
printNodeArray: full path item 0 = [n] http://my.com/A
printNodeArray: full path item 1 = [e] http://my.com/knows
printNodeArray: full path item 2 = [n] http://my.com/D
printNodeArray: full path item 3 = [e] http://my.com/likes
printNodeArray: full path item 4 = [n] http://my.com/H

Generating Contextual Information about a Path in a Graph

Parent topic: RDF Graph Support for Apache Jena

7.10.1 Generating Contextual Information about a Path in a Graph

It is sometimes useful to see contextual information about a path in a graph, in addition to the path itself. The buildSurroundingSubGraph method in the SemNetworkAnalyst class can output a DOT file (graph description language file, extension .gv) into the specified Writer object. For each node in the path, up to ten direct neighbors are used to produce a surrounding subgraph for the path. The following example shows the usage of generating a DOT file with contextual information, specifically the output from the analytical functions used in Example 7-6.

nodeArray = sna.shortestPathDijkstra(nodeA, nodeD);
printNodeArray(nodeArray, psOut);
 
FileWriter dotWriter = new FileWriter("Shortest_Path_A_to_D.gv");
sna.buildSurroundingSubGraph(nodeArray, dotWriter);

The generated output DOT file from the preceding example is straightforward, as shown in the following example:

% cat Shortest_Path_A_to_D.gv
digraph { rankdir = LR; charset="utf-8"; 
 
"Rhttp://A" [ label="http://A" shape=rectangle,color=red,style = filled, ];
"Rhttp://B" [ label="http://B" shape=rectangle,color=red,style = filled, ];
"Rhttp://A" -> "Rhttp://B" [ label="http://likes"  color=red, style=bold, ];
"Rhttp://C" [ label="http://C" shape=rectangle,color=red,style = filled, ];
"Rhttp://A" -> "Rhttp://C" [ label="http://dislikes" ];
"Rhttp://D" [ label="http://D" shape=rectangle,color=red,style = filled, ];
"Rhttp://A" -> "Rhttp://D" [ label="http://differs" ];
"Rhttp://B" -> "Rhttp://C" [ label="http://likes"  color=red, style=bold, ];
"Rm1" [ label="m1" shape=ellipse,color=blue, ];
"Rhttp://B" -> "Rm1" [ label="http://dislikes" ];
"Rm1" -> "Rhttp://B" [ label="http://dislikes" ];
"Rhttp://C" -> "Rhttp://D" [ label="http://likes"  color=red, style=bold, ];
"Rhttp://E" [ label="http://E" shape=ellipse,color=blue, ];
"Rhttp://C" -> "Rhttp://E" [ label="http://knows" ];
"Rm1" -> "Rhttp://D" [ label="http://likes" ];
}

You can also use methods in the SemNetworkAnalyst and GraphOracleSem classes to produce more sophisticated visualization of the analytical function output.

You can convert the preceding DOT file into a variety of image formats. Figure 7-1 is an image representing the information in the preceding DOT file.

Figure 7-1 Visual Representation of Analytical Function Output

Description of "Figure 7-1 Visual Representation of Analytical Function Output"

Parent topic: Analytical Functions for RDF Data