10.4 Combining Native RDF Data with Virtual RDB2RDF Data

You can combine native triple data with virtual RDB2RDF triple data (from an RDF view graph) in a single SEM_MATCH query by means of the SERVICE keyword.

The SERVICE keyword (explained in Graph Patterns: Support for SPARQL 1.1 Federated Query) is overloaded through the use of special SERVICE URLs that signify local (virtual) RDF data. The following prefixes are used to denote special SERVICE URLs:

  • Native RDF graphs - oram: <http://xmlns.oracle.com/models/>

  • Native RDF graph collections - oravm: <http://xmlns.oracle.com/virtual_models/>

  • RDB2RDF models - orardbm: <http://xmlns.oracle.com/rdb_models/>
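Because these are ordinary SPARQL prefixes, a prefixed SERVICE name such as oram:m2 stands for the prefix IRI followed by the local name. The following Python sketch only illustrates that expansion, assuming standard SPARQL prefix semantics; the helper name expand_service_url is hypothetical and is not part of any Oracle API.

```python
# Illustration only: expand the special SERVICE prefixes listed above.
# The prefix IRIs come from this section; the helper name is hypothetical.
SERVICE_PREFIXES = {
    "oram": "http://xmlns.oracle.com/models/",             # native RDF graphs
    "oravm": "http://xmlns.oracle.com/virtual_models/",    # native RDF graph collections
    "orardbm": "http://xmlns.oracle.com/rdb_models/",      # RDB2RDF models
}


def expand_service_url(prefixed_name: str) -> str:
    """Return the full IRI for a prefixed SERVICE name such as 'oram:m2'."""
    prefix, sep, local_name = prefixed_name.partition(":")
    if not sep or prefix not in SERVICE_PREFIXES:
        raise ValueError(f"not a special SERVICE prefix: {prefixed_name!r}")
    return SERVICE_PREFIXES[prefix] + local_name


if __name__ == "__main__":
    for name in ("oram:m2", "orardbm:empdb_model"):
        print(name, "->", expand_service_url(name))
```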

Example 10-5 Querying Multiple Data Sets

Example 10-5 queries multiple data sets. In this query, the first triple pattern { ?x rdf:type :Person } is matched against the RDF graph m1 as usual, { ?x :name ?name } is matched against the local RDF graph m2, and { ?x emp:JOB ?job } is matched against the local RDB2RDF model empdb_model.

SELECT * FROM TABLE (SEM_MATCH(
'PREFIX    : <http://people.org/> 
 PREFIX emp: <http://empdb/TESTUSER.EMP#> 
 SELECT ?x ?name ?job 
 WHERE {
   ?x rdf:type :Person .   
   OPTIONAL { SERVICE oram:m2 { ?x :name ?name } }   
   OPTIONAL { SERVICE orardbm:empdb_model { ?x emp:JOB ?job } } 
 }',
 SEM_MODELS('m1'), NULL, NULL, NULL, NULL, ' ', NULL, NULL, 'RDFUSER', 'NET1'));

Overloaded SERVICE can be used only when a single RDF graph is specified in the models argument of SEM_MATCH. Overloaded SERVICE queries do not allow multiple RDF graphs or a rulebase as input; for such combinations, use an RDF graph collection that contains multiple RDF graphs and/or inferred graphs instead. In addition, the index_status argument for SEM_MATCH checks only the inferred graph contained in the RDF graph collection passed as input in the models parameter. This means that the status of inferred graphs referenced in overloaded SERVICE calls is not checked.

Example 10-6 queries two data sets: the empdb_model from Example: Using an RDF View Graph with Direct Mapping and a native RDF graph named people.

Example 10-6 Querying Virtual RDB2RDF Data and Native RDF Data in a Schema-Private Network

-- Create native model people --
 EXECUTE SEM_APIS.CREATE_RDF_GRAPH('people', NULL, NULL, network_owner=>'rdfuser', network_name=>'net1');
 
BEGIN
  SEM_APIS.UPDATE_RDF_GRAPH('people',
   'PREFIX peop: <http://people.org/> 
    INSERT DATA {
       <http://empdb/TESTUSER.EMP/EMPNO=1> peop:age 35 .
       <http://empdb/TESTUSER.EMP/EMPNO=2> peop:age 39 .
       <http://empdb/TESTUSER.EMP/EMPNO=3> peop:age 30 .
       <http://empdb/TESTUSER.EMP/EMPNO=4> peop:age 42 .
    } ',
   network_owner=>'rdfuser', network_name=>'net1');
END;
/
COMMIT;
 
-- Querying multiple datasets --
SELECT emp, age
  FROM TABLE(SEM_MATCH(
    'PREFIX dept: <http://empdb/TESTUSER.DEPT#>
     PREFIX emp: <http://empdb/TESTUSER.EMP#>
     PREFIX peop: <http://people.org/>
     SELECT ?emp ?age WHERE {
       ?emp peop:age ?age
       SERVICE orardbm:empdb_model { ?emp emp:ref-DEPTNO ?dept . ?dept dept:LOC "Boston" }
    }',
    SEM_Models('people'),
    NULL,
    NULL,
    NULL, NULL, NULL, NULL, NULL, 'RDFUSER', 'NET1'));

The query produces the following output:

EMP                                                AGE
-------------------------------------------------- --------------------------------------------------
http://empdb/TESTUSER.EMP/EMPNO=1                   35
http://empdb/TESTUSER.EMP/EMPNO=3                   30

10.4.1 Nested Loop Pushdown with Overloaded Service

Using a nested-loop service can improve performance in some scenarios. Consider the following example query against multiple data sets in a schema-private network. The query finds the properties of all the departments that have people who are 35 years old.

-- Query example for a schema-private network.

SELECT emp, dept, p, o
  FROM TABLE(SEM_MATCH(
    'PREFIX dept: <http://empdb/TESTUSER.DEPT#>     
     PREFIX emp: <http://empdb/TESTUSER.EMP#>
     PREFIX peop: <http://people.org/>
     SELECT * WHERE{
       ?emp peop:age 35
       SERVICE orardbm:empdb_model{ ?emp emp:ref-DEPTNO ?dept . ?dept ?p ?o }
     }',
     SEM_Models('people'),
     NULL,
     NULL,
     NULL, NULL, NULL, NULL, NULL, 'RDFUSER', 'NET1'));

The preceding query produces the following output:

EMP                                DEPT                                P                                                 O
---------------------------------- ----------------------------------- ------------------------------------------------  --------------------------
http://empdb/TESTUSER.EMP/EMPNO=1   http://empdb/TESTUSER.DEPT/DEPTNO=1  http://empdb/TESTUSER.DEPT#DEPTNO                1
http://empdb/TESTUSER.EMP/EMPNO=1   http://empdb/TESTUSER.DEPT/DEPTNO=1  http://empdb/TESTUSER.DEPT#DNAME                 Sales
http://empdb/TESTUSER.EMP/EMPNO=1   http://empdb/TESTUSER.DEPT/DEPTNO=1  http://empdb/TESTUSER.DEPT#LOC                   Boston
http://empdb/TESTUSER.EMP/EMPNO=1   http://empdb/TESTUSER.DEPT/DEPTNO=1  http://www.w3.org/1999/02/22-rdf-syntax-ns#type  http://empdb/TESTUSER.DEPT

To get all the results that match the given graph pattern, the triple pattern { ?emp peop:age 35 } is first matched against the RDF graph people, then the triple patterns { ?emp emp:ref-DEPTNO ?dept . ?dept ?p ?o } are matched against the RDB2RDF model empdb_model, and finally the results are joined. Assume that there is only one 35-year-old person in the RDF graph people, but there are 100,000 triples with information about departments. A strategy that retrieves all the SERVICE results is not the most efficient here, and the query may perform poorly because a large number of results must be processed before being joined with the rest of the query.

A nested-loop service can improve performance in this case. If the hint OVERLOADED_NL=T is used, the results of the first part of the query are computed and the SERVICE pattern is executed procedurally in a nested loop once for each ?emp value from the root triple pattern. The ?emp subject variable in the SERVICE pattern is replaced with a constant from the root triple pattern in each execution. This effectively pushes the join condition down into the SERVICE clause.

The following example shows the use of the OVERLOADED_NL=T hint for the preceding query.

SELECT emp, dept, p, o
  FROM TABLE(SEM_MATCH(
    'PREFIX dept: <http://empdb/TESTUSER.DEPT#>     
     PREFIX emp: <http://empdb/TESTUSER.EMP#>
     PREFIX peop: <http://people.org/>
     SELECT * WHERE{
       ?emp peop:age 35
       SERVICE orardbm:empdb_model { ?emp emp:ref-DEPTNO ?dept . ?dept ?p ?o }
     }',
     SEM_Models('people'),
     NULL,
     NULL,
     NULL, NULL,' OVERLOADED_NL=T ', NULL, NULL, 'RDFUSER', 'NET1'));

The hint OVERLOADED_NL=T can be specified either among the SEM_MATCH options or among the inline comments for a given SERVICE graph.
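
The difference between the two evaluation strategies described in this section can be sketched with a toy in-memory model. The following Python code is an illustration only and does not show how the database implements the hint: plain lists stand in for the people graph and for empdb_model, the function names are hypothetical, and the rows for EMPNO=2, EMPNO=4, and DEPTNO=2 are invented sample data.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
Row = Tuple[str, str, str, str]  # (?emp, ?dept, ?p, ?o)

# Stand-in for the native RDF graph "people" (ages from Example 10-6).
PEOPLE: List[Triple] = [
    ("EMPNO=1", "peop:age", "35"),
    ("EMPNO=2", "peop:age", "39"),
    ("EMPNO=3", "peop:age", "30"),
    ("EMPNO=4", "peop:age", "42"),
]

# Stand-in for empdb_model. DEPTNO=2 and its employees are invented data.
EMPDB: List[Triple] = [
    ("EMPNO=1", "emp:ref-DEPTNO", "DEPTNO=1"),
    ("EMPNO=2", "emp:ref-DEPTNO", "DEPTNO=2"),
    ("EMPNO=3", "emp:ref-DEPTNO", "DEPTNO=1"),
    ("EMPNO=4", "emp:ref-DEPTNO", "DEPTNO=2"),
    ("DEPTNO=1", "dept:DNAME", "Sales"),
    ("DEPTNO=1", "dept:LOC", "Boston"),
    ("DEPTNO=2", "dept:DNAME", "Research"),
    ("DEPTNO=2", "dept:LOC", "Chicago"),
]


def root_pattern() -> List[str]:
    """Match { ?emp peop:age 35 } against PEOPLE."""
    return [s for s, p, o in PEOPLE if p == "peop:age" and o == "35"]


def service_pattern(emp: Optional[str] = None) -> List[Row]:
    """Match { ?emp emp:ref-DEPTNO ?dept . ?dept ?p ?o } against EMPDB.

    If emp is given, ?emp is replaced by that constant (the pushed-down join).
    """
    rows: List[Row] = []
    for s, p, dept in EMPDB:
        if p != "emp:ref-DEPTNO" or (emp is not None and s != emp):
            continue
        rows.extend((s, dept, p2, o2) for s2, p2, o2 in EMPDB if s2 == dept)
    return rows


def join_all() -> Tuple[List[Row], int]:
    """Retrieve every SERVICE row first, then join with the root pattern."""
    emps = set(root_pattern())
    service_rows = service_pattern()
    joined = [row for row in service_rows if row[0] in emps]
    return joined, len(service_rows)


def nested_loop() -> Tuple[List[Row], int]:
    """Run the SERVICE pattern once per ?emp value from the root pattern."""
    joined: List[Row] = []
    for emp in root_pattern():
        joined.extend(service_pattern(emp))
    return joined, len(joined)


if __name__ == "__main__":
    for strategy in (join_all, nested_loop):
        rows, fetched = strategy()
        print(strategy.__name__, "result rows:", len(rows), "service rows fetched:", fetched)
```

With this sample data both strategies return the same two rows for EMPNO=1, but the first retrieves all 8 rows produced by the SERVICE pattern before joining, whereas the nested loop retrieves only the 2 rows that survive the join. The gap grows with the size of the SERVICE data set, which is the situation the 100,000-triple scenario above describes.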