7.4 SEM_MATCH and RDF Graph Support for Apache Jena Queries Compared

There are two ways to query RDF data stored in Oracle Database: SEM_MATCH-based SQL statements and SPARQL queries through the support for Apache Jena.

Queries written with either approach look similar, but they behave differently in important ways. To keep application behavior consistent, you must understand these differences and take care when handling results from SEM_MATCH queries and from SPARQL queries.

The following simple examples show the two approaches.

Query 1 (SEM_MATCH-based)

select s, p, o
    from table(sem_match('{?s ?p ?o}', sem_models('Test_Model'), ....))

Query 2 (SPARQL query through Support for Apache Jena)

select ?s ?p ?o
where {?s ?p ?o}

These two queries perform the same basic function, but they differ in important ways. Query 1 (SEM_MATCH-based):

  • Reads all triples out of Test_Model.

  • Does not differentiate among URIs, blank nodes, plain literals, and typed literals, and does not handle long literals.

  • Does not unescape certain characters (such as '\n').

Query 2 (a SPARQL query executed through the support for Apache Jena) also reads all triples out of Test_Model, assuming that it is executed against a ModelOracleSem object that refers to the same underlying Test_Model. However, Query 2:

  • Reads additional columns (not just the s, p, and o columns returned by the SEM_MATCH table function) to differentiate URIs, blank nodes, plain literals, typed literals, and long literals. This ensures that Jena Node objects are created properly.

  • Unescapes those characters that are escaped when stored in Oracle Database.
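
The unescaping difference can be pictured with a small, self-contained Java sketch. The class and method names are hypothetical and are not part of the support for Apache Jena API. The set of escape sequences handled here (\n, \r, \t, \" and \\) is an assumption made for illustration; this section names only '\n' as an example. The sketch shows the kind of post-processing an application might need if it consumes SEM_MATCH results directly and wants them to match what a SPARQL query through the support for Apache Jena returns.

```java
// Hypothetical helper for illustration only; not Oracle's or Apache Jena's code.
final class LiteralUnescapeSketch {

    // Replaces the two-character escapes \n, \r, \t, \" and \\ with the
    // single characters they stand for. Any other backslash sequence is
    // copied through unchanged. The escape set is an assumption.
    static String unescape(String stored) {
        StringBuilder out = new StringBuilder(stored.length());
        for (int i = 0; i < stored.length(); i++) {
            char c = stored.charAt(i);
            if (c == '\\' && i + 1 < stored.length()) {
                char next = stored.charAt(i + 1);
                switch (next) {
                    case 'n':  out.append('\n'); i++; continue;
                    case 'r':  out.append('\r'); i++; continue;
                    case 't':  out.append('\t'); i++; continue;
                    case '"':  out.append('"');  i++; continue;
                    case '\\': out.append('\\'); i++; continue;
                    default:   break; // unknown escape: keep the backslash
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A value as a SEM_MATCH query might return it: backslash followed by 'n'.
        String fromSemMatch = "line one\\nline two";
        // The same value with the escape resolved to a real newline character.
        System.out.println(unescape(fromSemMatch));
    }
}
```

Applying a step like this in only one code path is a common source of mismatched results, so it is worth deciding in one place whether the application works with escaped or unescaped values.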

Blank node handling is another difference between the two approaches:

  • In a SEM_MATCH-based query, blank nodes are always treated as constants.

  • In a SPARQL query executed through the support for Apache Jena, a blank node that is not wrapped inside < and > is treated as a variable, which matches standard SPARQL semantics. A blank node that is wrapped inside < and > is treated as a constant, and the support for Apache Jena adds the proper prefix to the blank node label, as required by the underlying data model.
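
The blank node rule for SPARQL queries can be sketched as follows. The label b1, the two query strings, and the classify helper are hypothetical and exist only to illustrate the rule; they are not part of the support for Apache Jena API, and the sketch does not parse SPARQL.

```java
// Hypothetical illustration of the blank node rule described above.
final class BlankNodeTokenSketch {

    enum Treatment { VARIABLE, CONSTANT, NOT_A_BLANK_NODE }

    // Blank node written bare: behaves like a variable and can match any subject.
    static final String AS_VARIABLE = "select ?p ?o where { _:b1 ?p ?o }";

    // Blank node wrapped in < and >: refers to one specific stored blank node.
    static final String AS_CONSTANT = "select ?p ?o where { <_:b1> ?p ?o }";

    // Classifies a single query token by its surface syntax only.
    static Treatment classify(String token) {
        if (token.length() > 4 && token.startsWith("<_:") && token.endsWith(">")) {
            return Treatment.CONSTANT;
        }
        if (token.length() > 2 && token.startsWith("_:")) {
            return Treatment.VARIABLE;
        }
        return Treatment.NOT_A_BLANK_NODE;
    }

    public static void main(String[] args) {
        System.out.println(classify("_:b1"));
        System.out.println(classify("<_:b1>"));
    }
}
```

In practice this means that a query intended to look up one particular blank node should use the wrapped form; the bare form matches every resource that fits the pattern.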

The maximum length for the name of an RDF graph created using the support for Apache Jena API is 22 characters.
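
An application can guard against this limit before it attempts to create an RDF graph. The following is a minimal sketch; the class and method names are hypothetical, and it assumes that the limit counts characters as reported by String.length().

```java
// Hypothetical pre-check; not part of the support for Apache Jena API.
final class GraphNameCheck {

    // Limit stated in this section for graphs created through the API.
    static final int MAX_GRAPH_NAME_LENGTH = 22;

    // Returns true if the name is non-empty and within the length limit.
    static boolean isValidGraphName(String name) {
        return name != null
                && !name.isEmpty()
                && name.length() <= MAX_GRAPH_NAME_LENGTH;
    }

    public static void main(String[] args) {
        // "Test_Model" is 10 characters long, so it passes the check.
        System.out.println(isValidGraphName("Test_Model"));
    }
}
```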