3.2 Using OWL Inferencing
You can use inference rules to perform native OWL inferencing.
This section creates a simple ontology, performs native inferencing, and illustrates some more advanced features.
- Creating a Simple OWL Ontology
- Performing Native OWL Inferencing
- Performing OWL and User-Defined Rules Inferencing
- Generating OWL Inferencing Proofs
- Validating OWL RDF Graphs and Inferred Graphs
- Using SEM_APIS.CREATE_INFERRED_GRAPH for RDFS Inference
- Enhancing Inference Performance
- Optimizing owl:sameAs Inference
- Performing Incremental Inference
- Using Parallel Inference
- Using Named Graph Based Inferencing (Global and Local)
- Performing Selective Inferencing (Advanced Information)
Parent topic: OWL Concepts
3.2.1 Creating a Simple OWL Ontology
Example 3-1 creates a simple OWL ontology, inserts one statement that two URIs refer to the same entity, and performs a query using the SEM_MATCH table function.
Example 3-1 Creating a Simple OWL Ontology
SQL> CREATE TABLE owltst(id number, triple sdo_rdf_triple_s);
Table created.

SQL> EXECUTE sem_apis.create_rdf_graph('owltst','owltst','triple',network_owner=>'RDFUSER',network_name=>'NET1');
PL/SQL procedure successfully completed.

SQL> INSERT INTO owltst VALUES (1, sdo_rdf_triple_s('owltst',
       'http://example.com/name/John',
       'http://www.w3.org/2002/07/owl#sameAs',
       'http://example.com/name/JohnQ','RDFUSER','NET1'));
1 row created.

SQL> commit;

SQL> -- Use SEM_MATCH to perform a simple query.
SQL> select s$rdfterm,p$rdfterm,o$rdfterm
       from table(SEM_MATCH('SELECT * WHERE {?s ?p ?o}',
                  SEM_Models('OWLTST'), null, null, null, null,
                  'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
Parent topic: Using OWL Inferencing
3.2.2 Performing Native OWL Inferencing
Example 3-2 calls the SEM_APIS.CREATE_INFERRED_GRAPH procedure. You do not need to create the rulebase and add rules to it, because the OWL rules are already built into the RDF Graph inferencing engine.
Example 3-2 Performing Native OWL Inferencing
SQL> -- Invoke the following command to run native OWL inferencing that
SQL> -- understands the vocabulary defined in the preceding section.
SQL>
SQL> EXECUTE sem_apis.create_inferred_graph('owltst_idx', sem_models('owltst'), sem_rulebases('OWLPRIME'), network_owner=>'RDFUSER', network_name=>'NET1');
PL/SQL procedure successfully completed.

SQL> -- The following view is generated to represent the inferred graph (rules index).
SQL> desc RDFUSER.NET1#semi_owltst_idx;

SQL> -- Run the preceding query with an additional rulebase parameter to list
SQL> -- the original graph plus the inferred triples.
SQL> SELECT s$rdfterm,p$rdfterm,o$rdfterm
       FROM table(SEM_MATCH('SELECT * WHERE {?s ?p ?o}',
                  SEM_MODELS('OWLTST'), SEM_RULEBASES('OWLPRIME'),
                  null, null, null,
                  'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
3.2.3 Performing OWL and User-Defined Rules Inferencing
Example 3-3 creates a user-defined rulebase, inserts a simplified uncleOf rule (stating that the brother of one's father is one's uncle), and calls the SEM_APIS.CREATE_INFERRED_GRAPH procedure.
Example 3-3 Performing OWL and User-Defined Rules Inferencing
SQL> -- First, insert the following assertions.
SQL> INSERT INTO owltst VALUES (1, sdo_rdf_triple_s('owltst',
       'http://example.com/name/John',
       'http://example.com/rel/fatherOf',
       'http://example.com/name/Mary', 'RDFUSER', 'NET1'));

SQL> INSERT INTO owltst VALUES (1, sdo_rdf_triple_s('owltst',
       'http://example.com/name/Jack',
       'http://example.com/rel/brotherOf',
       'http://example.com/name/John', 'RDFUSER', 'NET1'));

SQL> -- Create a user-defined rulebase.
SQL> EXECUTE sem_apis.create_rulebase('user_rulebase', network_owner=>'RDFUSER', network_name=>'NET1');

SQL> -- Insert a simple "uncle" rule.
SQL> INSERT INTO RDFUSER.NET1#SEMR_USER_RULEBASE VALUES ('uncle_rule',
       '(?x <http://example.com/rel/brotherOf> ?y)(?y <http://example.com/rel/fatherOf> ?z)',
       NULL,
       '(?x <http://example.com/rel/uncleOf> ?z)',
       null);

SQL> -- In the following statement, 'USER_RULES=T' is required so that
SQL> -- the rules in the user-defined rulebase are applied.
SQL> EXECUTE sem_apis.create_inferred_graph('owltst2_idx', sem_models('owltst'), sem_rulebases('OWLPRIME','USER_RULEBASE'), SEM_APIS.REACH_CLOSURE, null, 'USER_RULES=T', network_owner=>'RDFUSER', network_name=>'NET1');

SQL> -- In the result of the following query, :Jack :uncleOf :Mary is inferred.
SQL> SELECT s$rdfterm,p$rdfterm,o$rdfterm
       FROM table(SEM_MATCH('SELECT * WHERE {?s ?p ?o}',
                  SEM_MODELS('OWLTST'), SEM_RULEBASES('OWLPRIME','USER_RULEBASE'),
                  null, null, null,
                  'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
For performance, the inference engine by default executes each user rule without checking the syntax legality of inferred triples (for example, literal value as a subject, blank node as a predicate) until after the last round of inference. After completing the last inference round, the inference engine removes all syntactically illegal triples without throwing any errors for these triples. However, because triples with illegal syntax may exist during multiple rounds of inference, rules can use these triples as part of their antecedents. For example, consider the following user-defined rules:
- Rule 1: (?s :account ?y) (?s :country :Spain) --> (?y rdf:type :SpanishAccount)
- Rule 2: (?s :account ?y) (?y rdf:type :SpanishAccount) --> (?s :language "es_ES")
Rule 1 finds all Spanish users and designates their accounts as Spanish accounts. Rule 2 sets the language for all users with Spanish accounts to es_ES (Spanish). Consider the following data, displayed in Turtle format:
:Juan :account "123ABC4Z" ;
      :country :Spain .
:Alejandro :account "5678DEF9Y" ;
           :country :Spain .
Applying Rule 1 and Rule 2 produces the following inferred triples:
(:Juan :language "es_ES")
(:Alejandro :language "es_ES")
Note that there are no triples specifying which accounts are of type :SpanishAccount. The user-defined rules infer those triples during creation of the inferred graph, but the inference engine removes them after the last round of inference because they are syntactically illegal: the accounts are literal values, which cannot be used as subjects in an RDF triple.
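One way to observe this behavior is to query the inferred graph for the intermediate rdf:type triples. The following is a minimal sketch, not part of the original example set: it assumes a hypothetical RDF graph named ACCOUNTS that holds the Juan and Alejandro data, that Rule 1 and Rule 2 were inserted into USER_RULEBASE, and that an inferred graph was created over ACCOUNTS with OWLPRIME and USER_RULEBASE. The http://example.com/ URI is an assumed expansion of the :SpanishAccount prefixed name.

```sql
-- Assumption: ACCOUNTS is a hypothetical RDF graph; an inferred graph already
-- exists for it with the OWLPRIME and USER_RULEBASE rulebases (USER_RULES=T).
-- The query should return no rows, because the inferred rdf:type triples had
-- literal subjects and were removed after the last round of inference.
SELECT s$rdfterm
  FROM TABLE(SEM_MATCH(
         'SELECT ?s WHERE { ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.com/SpanishAccount> }',
         SEM_MODELS('ACCOUNTS'),
         SEM_RULEBASES('OWLPRIME','USER_RULEBASE'),
         null, null, null,
         'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
```

The :language triples, by contrast, remain queryable because their subjects are URIs.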
To force the checking of syntax legality of inferred triples, add the /*+ ENABLE_SYNTAX_CHECKING */ optimizer hint to the beginning of the rule's FILTER expression. Forcing syntax checking for a rule can result in a performance penalty and will throw an exception for any syntactically illegal triples. The following example, similar to Rule 1, forces syntax checking. (In addition, merely to illustrate the use of a filter expression, the example restricts accounts to those that do not end with the letter 'Z'.)
INSERT INTO RDFUSER.NET1#SEMR_USER_RULEBASE VALUES (
  'spanish_account_rule',
  '(?s <http://example.com/account> ?y)(?s <http://example.com/country> <http://example.com/Spain>)',
  '/*+ ENABLE_SYNTAX_CHECKING */ y not like ''%Z'' ',
  '(?y <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.com/SpanishAccount>)',
  NULL );
3.2.4 Generating OWL Inferencing Proofs
OWL inference can be complex, depending on the size of the ontology, the actual vocabulary (set of language constructs) used, and the interactions among those language constructs. To enable you to find out how a triple is derived, you can use proof generation during inference. (Proof generation does require additional CPU time and disk resources.)
To generate the information required for proof, specify PROOF=T in the call to the SEM_APIS.CREATE_INFERRED_GRAPH procedure, as shown in the following example:
EXECUTE sem_apis.create_inferred_graph('owltst_idx', sem_models('owltst'), -
  sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, 'SAM', 'PROOF=T', network_owner=>'RDFUSER', network_name=>'NET1');
Specifying PROOF=T causes a view to be created containing proof for each inferred triple. The view name is the inferred graph name prefixed by RDFUSER.NET1#SEMI_. Two relevant columns in this view are LINK_ID and EXPLAIN (the proof). The following example displays the LINK_ID value and proof of each generated triple (with LINK_ID values shortened for simplicity):
SELECT link_id || ' generated by ' || explain as triple_and_its_proof
  FROM RDFUSER.NET1#SEMI_OWLTST_IDX;

TRIPLE_AND_ITS_PROOF
--------------------------------------------------------------------
8_5_5_4 generated by 4_D_5_5 : SYMM_SAMH_SYMM
8_4_5_4 generated by 8_5_5_4 4_D_5_5 : SAM_SAMH
. . .
A proof consists of one or more triple (link) ID values and the name of the rule that is applied on those triples:
link-id1 [link-id2 ... link-idn] : rule-name
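Because the rule name always follows the colon, the proof text can be split into its two parts with ordinary string functions. The following sketch is an illustration only: it assumes the EXPLAIN column can be handled as character data and that LINK_ID values never contain a colon (true of the values shown in this section).

```sql
-- Split each proof into the premise LINK_IDs and the applied rule name.
-- Assumes the proof format "link-id1 [link-id2 ...] : rule-name".
SELECT link_id,
       TRIM(SUBSTR(explain, 1, INSTR(explain, ':') - 1)) AS premise_link_ids,
       TRIM(SUBSTR(explain, INSTR(explain, ':') + 1))    AS rule_name
  FROM RDFUSER.NET1#SEMI_OWLTST_IDX
 WHERE explain IS NOT NULL;
```

Grouping the result by rule_name is a quick way to see which inference components contribute most of the inferred triples.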
Example 3-4 Displaying Proof Information
To get the full subject, predicate, and object URIs for proofs, you can query the RDF graph view and the inferred graph view. Example 3-4 displays the LINK_ID value and associated triple contents using the RDF graph view SEMM_OWLTST and the inferred graph view SEMI_OWLTST_IDX.
SELECT to_char(x.triple.rdf_m_id, 'FMXXXXXXXXXXXXXXXX') ||'_'||
       to_char(x.triple.rdf_s_id, 'FMXXXXXXXXXXXXXXXX') ||'_'||
       to_char(x.triple.rdf_p_id, 'FMXXXXXXXXXXXXXXXX') ||'_'||
       to_char(x.triple.rdf_c_id, 'FMXXXXXXXXXXXXXXXX'),
       x.triple.get_triple()
  FROM (
    SELECT sdo_rdf_triple_s(
             t.canon_end_node_id,
             t.model_id,
             t.start_node_id,
             t.p_value_id,
             t.end_node_id) triple
      FROM (select * from rdfuser.net1#semm_owltst
            union all
            select * from rdfuser.net1#semi_owltst_idx
           ) t
     WHERE t.link_id IN ('4_D_5_5','8_5_5_4')
  ) x;

LINK_ID    X.TRIPLE.GET_TRIPLE()(SUBJECT, PROPERTY, OBJECT)
---------- --------------------------------------------------------------
4_D_5_5    SDO_RDF_TRIPLE('<http://example.com/name/John>', '<http://www.w3.org/2002/07/owl#sameAs>', '<http://example.com/name/JohnQ>')
8_5_5_4    SDO_RDF_TRIPLE('<http://example.com/name/JohnQ>', '<http://www.w3.org/2002/07/owl#sameAs>', '<http://example.com/name/John>')
In Example 3-4, the proof entry 8_5_5_4 generated by 4_D_5_5 : SYMM_SAMH_SYMM means that the triple with LINK_ID = 8_5_5_4 is inferred from the triple with LINK_ID = 4_D_5_5 using the symmetry of owl:sameAs.
If the inference status is INCOMPLETE and if the last inference was generated without proof information, you cannot invoke SEM_APIS.CREATE_INFERRED_GRAPH with PROOF=T. In this case, you must first drop the inferred graph and create it again specifying PROOF=T.
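The drop-and-recreate sequence can be sketched as follows, reusing the owltst names from the earlier examples. This is an illustrative sequence built from calls shown elsewhere in this section, not an additional required procedure.

```sql
BEGIN
  -- Drop the existing inferred graph that was built without proof information.
  sem_apis.drop_inferred_graph('owltst_idx',
    network_owner => 'RDFUSER', network_name => 'NET1');

  -- Create it again, this time requesting proof generation.
  sem_apis.create_inferred_graph('owltst_idx',
    sem_models('owltst'), sem_rulebases('OWLPRIME'),
    SEM_APIS.REACH_CLOSURE, 'SAM', 'PROOF=T',
    network_owner => 'RDFUSER', network_name => 'NET1');
END;
/
```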
3.2.5 Validating OWL RDF Graphs and Inferred Graphs
An OWL ontology may contain errors, such as unsatisfiable classes, instances belonging to unsatisfiable classes, and two individuals asserted to be the same and different at the same time. You can use the SEM_APIS.VALIDATE_RDF_GRAPH and SEM_APIS.VALIDATE_INFERRED_GRAPH functions to detect inconsistencies in the original RDF graph and in the inferred graph, respectively.
Example 3-5 Validating an Inferred Graph
Example 3-5 uses the SEM_APIS.VALIDATE_INFERRED_GRAPH function, which returns a null value if no errors are detected or a VARRAY of strings if any errors are detected.
SQL> -- Insert an offending triple.
SQL> insert into owltst values (1, sdo_rdf_triple_s('owltst',
'urn:C1', 'http://www.w3.org/2000/01/rdf-schema#subClassOf', 'http://www.w3.org/2002/07/owl#Nothing', 'RDFUSER', 'NET1'));
SQL> -- Drop inferred graph first.
SQL> exec sem_apis.drop_inferred_graph('owltst_idx', network_owner=>'RDFUSER', network_name=>'NET1');
PL/SQL procedure successfully completed.
SQL> -- Perform OWL inferencing.
SQL> exec sem_apis.create_inferred_graph('owltst_idx', sem_models('OWLTST'), sem_rulebases('OWLPRIME') , network_owner=>'RDFUSER', network_name=>'NET1');
PL/SQL procedure successfully completed.
SQL> set serveroutput on;
SQL> -- Now invoke validation API: sem_apis.validate_inferred_graph
SQL>
declare
  lva sem_longvarchararray;
  idx int;
begin
  lva := sem_apis.validate_inferred_graph(sem_models('OWLTST'), sem_rulebases('OWLPRIME'), network_owner=>'RDFUSER', network_name=>'NET1') ;
  if (lva is null) then
    dbms_output.put_line('No errors found.');
  else
    for idx in 1..lva.count loop
      dbms_output.put_line('Offending entry := ' || lva(idx)) ;
    end loop ;
  end if;
end ;
/
SQL> -- NOTE: The LINK_ID value and the numbers in the following
SQL> -- line are shortened for simplicity in this example. --
Offending entry := 1 10001 (4_2_4_8 2 4 8) Unsatisfiable class.
Each item in the validation report array includes the following information:
- Number of triples that cause this error (1 in Example 3-5)
- Error code (10001 in Example 3-5)
- One or more triples (shown in parentheses in the output; (4_2_4_8 2 4 8) in Example 3-5). These numbers are the LINK_ID value and the ID values of the subject, predicate, and object.
- Descriptive error message (Unsatisfiable class. in Example 3-5)
The output in Example 3-5 indicates that the error is caused by one triple that asserts that a class is a subclass of the empty class owl:Nothing.
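If you need the individual fields of a report entry, you can split the string with standard string functions. The following PL/SQL block is a sketch that parses the sample entry from Example 3-5; the variable names are illustrative, and the parsing assumes that the descriptive message itself contains no parentheses.

```sql
SET SERVEROUTPUT ON
DECLARE
  -- Sample report entry copied from the Example 3-5 output.
  v_entry VARCHAR2(4000) := '1 10001 (4_2_4_8 2 4 8) Unsatisfiable class.';
  v_open  PLS_INTEGER;
  v_close PLS_INTEGER;
BEGIN
  v_open  := INSTR(v_entry, '(');      -- first opening parenthesis
  v_close := INSTR(v_entry, ')', -1);  -- last closing parenthesis

  -- The entry starts with two numbers: the triple count and the error code.
  DBMS_OUTPUT.PUT_LINE('Triple count: ' ||
    REGEXP_SUBSTR(v_entry, '^([0-9]+) +([0-9]+) ', 1, 1, NULL, 1));
  DBMS_OUTPUT.PUT_LINE('Error code:   ' ||
    REGEXP_SUBSTR(v_entry, '^([0-9]+) +([0-9]+) ', 1, 1, NULL, 2));

  -- Everything between the parentheses is the list of offending triples.
  DBMS_OUTPUT.PUT_LINE('Triples:      ' ||
    SUBSTR(v_entry, v_open, v_close - v_open + 1));

  -- The remainder is the descriptive error message.
  DBMS_OUTPUT.PUT_LINE('Message:      ' ||
    TRIM(SUBSTR(v_entry, v_close + 1)));
END;
/
```

In practice the same expressions could be applied to each lva(idx) element inside the loop of Example 3-5.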
3.2.6 Using SEM_APIS.CREATE_INFERRED_GRAPH for RDFS Inference
In addition to accepting OWL vocabularies, the SEM_APIS.CREATE_INFERRED_GRAPH procedure accepts RDFS rulebases. The following example shows RDFS inference (all standard RDFS rules are defined in http://www.w3.org/TR/rdf-mt/):
EXECUTE sem_apis.create_inferred_graph('rdfstst_idx', sem_models('my_model'), sem_rulebases('RDFS'), network_owner=>'RDFUSER', network_name=>'NET1');
Because rules RDFS4A, RDFS4B, RDFS6, RDFS8, RDFS10, RDFS13 may not generate meaningful inference for your applications, you can deselect those components for faster inference. The following example deselects these rules.
EXECUTE sem_apis.create_inferred_graph('rdfstst_idx', sem_models('my_model'), sem_rulebases('RDFS'), SEM_APIS.REACH_CLOSURE, -
  'RDFS4A-, RDFS4B-, RDFS6-, RDFS8-, RDFS10-, RDFS13-', network_owner=>'RDFUSER', network_name=>'NET1');
3.2.7 Enhancing Inference Performance
This section describes suggestions for improving the performance of inference operations.
- Collect statistics before inferencing. After you load a large RDF/OWL data model, you should execute the SEM_PERF.GATHER_STATS procedure. See the Usage Notes for that procedure (in SEM_PERF Package Subprograms) for important usage information.
- Allocate sufficient temporary tablespace for inference operations. OWL inference support in Oracle relies heavily on table joins, and therefore uses significant temporary tablespace.
- Use the appropriate implementations of the SVFH and AVFH inference components.
  The default implementations of the SVFH and AVFH inference components work best when the number of restriction classes defined by owl:someValuesFrom and/or owl:allValuesFrom is low (as in the LUBM data sets). However, when the number of such classes is high (as in the Gene Ontology, http://www.geneontology.org/), using non-procedural implementations of SVFH and AVFH may significantly improve performance.
  To disable the procedural implementations and to select the non-procedural implementations of SVFH and AVFH, include 'PROCSVFH=F' and/or 'PROCAVFH=F' in the options to SEM_APIS.CREATE_INFERRED_GRAPH. Using the appropriate implementation for an ontology can provide significant performance benefits. For example, selecting the non-procedural implementation of SVFH for the NCI Thesaurus ontology (see http://www.cancer.gov/research/resources/terminology) produced a 960% performance improvement for the SVFH inference component (tested on a dual-core, 8GB RAM desktop system with 3 SATA disks tied together with Oracle ASM).
See also Optimizing owl:sameAs Inference.
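The first and third suggestions might be combined as in the following sketch. It assumes an RDF graph M and an inferred graph name M_IDX (the names used in later examples in this section), and it assumes that SEM_PERF.GATHER_STATS can be called with only the network parameters, leaving all other parameters at their defaults; check the procedure's reference entry before relying on that.

```sql
BEGIN
  -- Assumption: default values are acceptable for all other GATHER_STATS parameters.
  sem_perf.gather_stats(network_owner => 'RDFUSER', network_name => 'NET1');

  -- Select the non-procedural SVFH and AVFH implementations, which may help
  -- when the ontology defines many someValuesFrom/allValuesFrom restrictions.
  sem_apis.create_inferred_graph('M_IDX',
    sem_models('M'), sem_rulebases('OWLPRIME'),
    SEM_APIS.REACH_CLOSURE, null, 'PROCSVFH=F,PROCAVFH=F',
    network_owner => 'RDFUSER', network_name => 'NET1');
END;
/
```

Whether the non-procedural implementations are faster depends on the ontology, so it is worth timing both settings on representative data.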
3.2.8 Optimizing owl:sameAs Inference
You can optimize inference performance for large owl:sameAs cliques by specifying 'OPT_SAMEAS=T' in the options parameter when performing OWLPrime inference. (A clique is a graph in which every node is connected, bidirectionally, to every other node in the same graph.)
According to OWL semantics, the owl:sameAs construct is treated as an equivalence relation, so it is reflexive, symmetric, and transitive. As a result, during inference a full materialization of owl:sameAs-related inferences could significantly increase the size of the inferred graph. Consider the following example triple set:
:John owl:sameAs :John1 .
:John owl:sameAs :John2 .
:John2 :hasAge "32" .
Applying OWLPrime inference (with the SAM component specified) to this set would generate the following new triples:
:John1 owl:sameAs :John .
:John2 owl:sameAs :John .
:John1 owl:sameAs :John2 .
:John2 owl:sameAs :John1 .
:John owl:sameAs :John .
:John1 owl:sameAs :John1 .
:John2 owl:sameAs :John2 .
:John :hasAge "32" .
:John1 :hasAge "32" .
In the preceding example, :John, :John1, and :John2 are connected to each other with the owl:sameAs relationship; that is, they are members of an owl:sameAs clique. To provide optimized inference for large owl:sameAs cliques, you can consolidate owl:sameAs triples without sacrificing correctness by specifying 'OPT_SAMEAS=T' in the options parameter when performing OWLPrime inference. For example:
EXECUTE sem_apis.create_inferred_graph('M_IDX',sem_models('M'), -
  sem_rulebases('OWLPRIME'),null,null,'OPT_SAMEAS=T', network_owner=>'RDFUSER', network_name=>'NET1');
When you specify this option, for each owl:sameAs clique, one resource from the clique is chosen as a canonical representative and all of the inferences for that clique are consolidated around that resource. Using the preceding example, if :John1 is the clique representative, after consolidation the inferred graph would contain only the following triples:
:John1 owl:sameAs :John1 .
:John1 :hasAge "32" .
Some overhead is incurred with owl:sameAs consolidation. During inference, all asserted RDF graphs are copied into the inference partition, where they are consolidated together with the inferred triples. Additionally, for very large asserted graphs, consolidating and removing duplicate triples incurs a large runtime overhead, so the OPT_SAMEAS=T option is recommended only for ontologies that have a large number of owl:sameAs relationships and large clique sizes.
After the OPT_SAMEAS=T option has been used for an inferred graph, all subsequent uses of SEM_APIS.CREATE_INFERRED_GRAPH for that inferred graph must also use OPT_SAMEAS=T, or an error will be reported. To disable optimized sameAs handling, you must first drop the inferred graph.
Clique membership information is stored in a view named SEMC_inferred-graph-name, where inferred-graph-name is the name of the inferred graph. Each SEMC_inferred-graph-name view has the columns shown in Table 3-3.
Table 3-3 SEMC_inferred_graph_name View Columns
Column Name | Data Type | Description |
---|---|---|
MODEL_ID | NUMBER | ID number of the inferred model |
VALUE_ID | NUMBER | ID number of a resource that is a member of the owl:sameAs clique identified by CLIQUE_ID |
CLIQUE_ID | NUMBER | ID number of the clique representative for the VALUE_ID resource |
To save space, the SEMC_inferred-graph-name view does not contain reflexive rows like (CLIQUE_ID, CLIQUE_ID).
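To see clique membership with readable names instead of ID values, you can join the view with the values table. The following query is a sketch for the M_IDX inferred graph; it assumes the VALUE_ID and VALUE_NAME columns of RDFUSER.NET1#RDF_VALUE$, which are the columns used by the expanded queries later in this section.

```sql
-- List each non-representative clique member next to its representative.
SELECT rep.value_name AS clique_representative,
       mem.value_name AS clique_member
  FROM RDFUSER.NET1#SEMC_M_IDX c,
       RDFUSER.NET1#RDF_VALUE$ rep,
       RDFUSER.NET1#RDF_VALUE$ mem
 WHERE rep.value_id = c.clique_id
   AND mem.value_id = c.value_id
 ORDER BY rep.value_name, mem.value_name;
```

Because reflexive rows are not stored, the representative itself does not appear in the clique_member column.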
3.2.8.1 Querying owl:sameAs Consolidated Inference Graphs
At query time, if the inferred graph queried was created using the OPT_SAMEAS=T option, the results are returned from an owl:sameAs-consolidated inference partition. The query results are not expanded to include the full owl:sameAs closure.
In the following example query, the only result returned would be :John1, which is the canonical clique representative.
SELECT A
  FROM TABLE ( SEM_MATCH ('SELECT ?A WHERE {?A :hasAge "32"}', SEM_MODELS('M'),
                 SEM_RULEBASES('OWLPRIME'), null, null, null,
                 'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
With the preceding example, even though :John2 :hasAge "32" occurs in the RDF graph, it has been replaced during the inference consolidation phase where redundant triples are removed. However, you can expand the query results by performing a join with the RDFUSER.NET1#SEMC_inferred-graph-name view that contains the consolidated owl:sameAs information. For example, to get the expanded result set for the preceding SEM_MATCH query, you can use the following expanded query:
SELECT V.VALUE_NAME A_VAL
  FROM TABLE ( SEM_MATCH ('SELECT ?A WHERE {?A :hasAge "32"}', SEM_MODELS('M'),
                 SEM_RULEBASES('OWLPRIME'), null, null, null,
                 'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1')) Q,
       RDFUSER.NET1#RDF_VALUE$ V,
       RDFUSER.NET1#SEMC_M_IDX C
 WHERE V.VALUE_ID = C.VALUE_ID
   AND C.CLIQUE_ID = Q.A$RDFVID
UNION ALL
SELECT A A_VAL
  FROM TABLE ( SEM_MATCH ('SELECT ?A WHERE {?A :hasAge "32"}', SEM_MODELS('M'),
                 SEM_RULEBASES('OWLPRIME'), null, null, null,
                 'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
Or, you could rewrite the preceding expanded query using a left outer join, as follows:
SELECT V.VALUE_NAME A_VAL
  FROM TABLE ( SEM_MATCH ('SELECT ?A WHERE {?A :hasAge "32"}', SEM_MODELS('M'),
                 SEM_RULEBASES('OWLPRIME'), null, null, null,
                 'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1')) Q,
       RDFUSER.NET1#RDF_VALUE$ V,
       (SELECT value_id, clique_id FROM RDFUSER.NET1#SEMC_M_IDX
        UNION ALL
        SELECT DISTINCT clique_id, clique_id FROM RDFUSER.NET1#SEMC_M_IDX) C
 WHERE Q.A$RDFVID = c.clique_id (+)
   AND V.VALUE_ID = nvl(C.VALUE_ID, Q.A$RDFVID);
Parent topic: Optimizing owl:sameAs Inference
3.2.9 Performing Incremental Inference
Incremental inference can be used to update inferred graphs efficiently after triple additions. There are two ways to enable incremental inference for an inferred graph:
- Specify the options parameter value INC=T when creating the inferred graph. For example:
  EXECUTE sem_apis.create_inferred_graph ('M_IDX',sem_models('M'), sem_rulebases('OWLPRIME'),null,null, 'INC=T', network_owner=>'RDFUSER', network_name=>'NET1');
- Use the SEM_APIS.ENABLE_INC_INFERENCE procedure.
  If you use this procedure, the inferred graph must have a VALID status. Before calling the procedure, if you do not own the RDF graphs involved in the inferred graph, you must ensure that the respective RDF graph owners have used the SEM_APIS.ENABLE_CHANGE_TRACKING procedure to enable change tracking for those RDF graphs.
When incremental inference is enabled for an inferred graph, the parameter INC=T must be specified when invoking the SEM_APIS.CREATE_INFERRED_GRAPH procedure for that inferred graph.
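The second approach might look like the following sketch for RDF graph M and an existing, VALID inferred graph M_IDX. The argument forms shown for SEM_APIS.ENABLE_CHANGE_TRACKING (a SEM_MODELS list) and SEM_APIS.ENABLE_INC_INFERENCE (the inferred graph name) are assumptions based on how the other SEM_APIS calls in this section are written; confirm them in the SEM_APIS reference.

```sql
BEGIN
  -- Run by the owner of RDF graph M if that is a different user.
  -- Assumption: the procedure takes a SEM_MODELS list of RDF graph names.
  sem_apis.enable_change_tracking(sem_models('M'),
    network_owner => 'RDFUSER', network_name => 'NET1');

  -- M_IDX must already exist with VALID status.
  sem_apis.enable_inc_inference('M_IDX',
    network_owner => 'RDFUSER', network_name => 'NET1');

  -- From now on, every refresh of M_IDX must specify INC=T.
  sem_apis.create_inferred_graph('M_IDX',
    sem_models('M'), sem_rulebases('OWLPRIME'),
    null, null, 'INC=T',
    network_owner => 'RDFUSER', network_name => 'NET1');
END;
/
```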
Incremental inference for an inferred graph depends on triggers for the application tables of the RDF graphs involved in creating the inferred graph. This means that incremental inference works only when triples are inserted in the application tables underlying the inferred graph using conventional path loads, unless you specify the triples by using the delta_in parameter in the call to the SEM_APIS.CREATE_INFERRED_GRAPH procedure, as in the following example, in which the triples from RDF graph M_NEW will be added to the RDF graph M, and inferred graph M_IDX will be updated with the new inferences:
EXECUTE sem_apis.create_inferred_graph('M_IDX', sem_models('M'), -
  sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, null, null, -
  sem_models('M_NEW'), network_owner=>'RDFUSER', network_name=>'NET1');
If multiple RDF graphs are involved in the incremental inference call, then to specify the destination RDF graph to which the delta_in RDF graph or RDF graphs are to be added, specify DEST_MODEL=<rdf_graph_name> in the options parameter. For example, the following causes the RDF data in RDF graph M_NEW to be added to the RDF graph M2:
EXECUTE sem_apis.create_inferred_graph('M_IDX', sem_models('M1','M2','M3'), -
  sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, null, 'DEST_MODEL=M2', sem_models('M_NEW'), network_owner=>'RDFUSER', network_name=>'NET1');
Another way to bypass the conventional path loading requirement when using incremental inference is to set the UNDO_RETENTION parameter to cover the intervals between inferred graphs when you perform bulk loading. For example, if the last inferred graph was created 6 hours ago, the UNDO_RETENTION value should be set to greater than 6 hours; if it is less than that, then (given a heavy workload and limited undo space) it is not guaranteed that all relevant undo information will be preserved for incremental inference to apply. In such cases, the SEM_APIS.CREATE_INFERRED_GRAPH procedure falls back to regular (non-incremental) inference.
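For the 6-hour scenario just described, the setting could be adjusted as in the following sketch. UNDO_RETENTION is specified in seconds, so 6 hours corresponds to 21600; the 7-hour value used here is only an illustrative margin, and the statement requires the ALTER SYSTEM privilege.

```sql
-- 7 hours = 25200 seconds, which exceeds the 6-hour (21600-second) interval
-- since the last inferred graph was created. The margin is an arbitrary choice.
ALTER SYSTEM SET UNDO_RETENTION = 25200;

-- Confirm the current setting.
SELECT value FROM v$parameter WHERE name = 'undo_retention';
```

Raising the value does not by itself guarantee retention; the undo tablespace must also be large enough to hold the undo generated during that interval.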
To check if change tracking is enabled on an RDF graph, use the SEM_APIS.GET_CHANGE_TRACKING_INFO procedure. To get additional information about incremental inference for an inferred graph, use the SEM_APIS.GET_INC_INF_INFO procedure.
The following restrictions apply to incremental inference:
- It does not work with optimized owl:sameAs handling (OPT_SAMEAS), user-defined rules, VPD-enabled RDF graphs, or version-enabled RDF graphs.
- It supports only the addition of triples. With updates or deletions, the inferred graph will be completely rebuilt.
- It depends on triggers on application tables.
- Column types (RAW8 or NUMBER) used in incremental inference must be consistent. For instance, if RAW8=T is used to build the inferred graph initially, then for every subsequent SEM_APIS.CREATE_INFERRED_GRAPH call the same option must be used. To change the column type to NUMBER, you must drop and rebuild the inferred graph.
3.2.10 Using Parallel Inference
Parallel inference can improve inference performance by taking advantage of the capabilities of multi-core or multi-CPU architectures. To use parallel inference, specify the DOP (degree of parallelism) keyword and an appropriate value when using the SEM_APIS.CREATE_INFERRED_GRAPH procedure. For example:
EXECUTE sem_apis.create_inferred_graph('M_IDX',sem_models('M'), -
  sem_rulebases('OWLPRIME'), sem_apis.REACH_CLOSURE, null, 'DOP=4', -
  network_owner=>'RDFUSER', network_name=>'NET1');
Specifying the DOP keyword causes parallel execution to be enabled for an Oracle-chosen set of inference components.
The success of parallel inference depends heavily on a good hardware configuration of the system on which the database is running. The key is to have a "balanced" system that implements the best practices for database performance tuning and Oracle SQL parallel execution. For example, do not use a single 1 TB disk for an 800 GB database, because executing SQL statements in parallel on a single physical disk can even be slower than executing SQL statements in serial mode. Parallel inference requires ample memory; for each CPU core, you should have at least 4 GB of memory.
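Before choosing a DOP value, it can help to check how many CPU cores the database sees and how many parallel execution servers it may start. The following query is a general Oracle Database sketch, not something specific to RDF Graph; it requires access to the V$PARAMETER view.

```sql
-- cpu_count: number of CPUs available to the database instance.
-- parallel_max_servers: upper limit on parallel execution servers.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('cpu_count', 'parallel_max_servers');
```

Comparing cpu_count with the installed memory gives a rough check against the guideline of at least 4 GB of memory per CPU core.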
Parallel inference is best suited for large ontologies; however, inference performance can also improve for small ontologies.
There is some transient storage overhead associated with using parallel inference. Parallel inference builds a source table that includes all triples based on all the source RDF/OWL graphs and any existing inferred graph. This table might use an additional 10 to 30 percent of storage compared to the space required for storing the data and indexes of the source RDF graphs.
3.2.11 Using Named Graph Based Inferencing (Global and Local)
The default inferencing in Oracle Database takes all asserted triples from all the source RDF graphs provided and applies semantic rules on top of all the asserted triples until an inference closure is reached. Even if the given source RDF graphs contain one or more named graphs, it makes no difference, because all assertions, whether part of a named graph or not, are treated as if they come from a single graph. (For an introduction to named graph support in RDF Graph, see Named Graphs.)
This default inferencing can be thought of as completely "global" in that it does not consider named graphs at all.
However, if you use named graphs, you can override the default inferencing and have named graphs be considered by using either of the following features:
-
Named graph based global inference (NGGI), which treats all specified named graphs as a unified graph. NGGI lets you narrow the scope of triples to be considered, while enabling great flexibility; it is explained in Named Graph Based Global Inference (NGGI).
-
Named graph based local inference (NGLI), which treats each specified named graph as a separate entity. NGLI is explained in Named Graph Based Local Inference (NGLI).
For using NGGI and NGLI together, see a recommended usage flow in Using NGGI and NGLI Together.
You specify NGGI or NGLI through certain parameters and options to the SEM_APIS.CREATE_INFERRED_GRAPH procedure when you create an inferred graph.
- Named Graph Based Global Inference (NGGI)
- Named Graph Based Local Inference (NGLI)
- Using NGGI and NGLI Together
3.2.11.1 Named Graph Based Global Inference (NGGI)
Named graph based global inference (NGGI) enables you to narrow the scope of triples used for inferencing at the named graph level (as opposed to the RDF graph level). It also enables great flexibility in selecting the scope; for example, you can include triples from zero or more named graphs and/or from the default graph, and you can include all triples with a null graph name from specified RDF graphs.
For example, in a hospital application you may want to apply the inference rules only to the information contained in a set of named graphs describing patients of a particular hospital. If the patient-related named graphs contain only instance-related assertions (ABox), you can specify one or more additional schema-related RDF graphs (TBox), as in Example 3-6.
Example 3-6 Named Graph Based Global Inference
EXECUTE sem_apis.create_inferred_graph( -
  'patients_inf', -
  rdf_graphs_in     => sem_models('patients','hospital_ontology'), -
  rulebases_in      => sem_rulebases('owl2rl'), -
  passes            => SEM_APIS.REACH_CLOSURE, -
  inf_components_in => null, -
  options           => 'DOP=4,RAW8=T', -
  include_default_g => sem_models('hospital_ontology'), -
  include_named_g   => sem_graphs('<urn:hospital1_patient1>','<urn:hospital1_patient2>'), -
  inf_ng_name       => '<urn:inf_graph_for_hospital1>', -
  network_owner     => 'RDFUSER', -
  network_name      => 'NET1' );
In Example 3-6:
- Two RDF graphs are involved: patients contains a set of named graphs where each named graph holds triples relevant to a particular patient, and hospital_ontology contains schema information describing concepts and relationships that are defined for hospitals. These two RDF graphs together are the source graphs, and they set up an overall scope for the inference.
- The include_default_g parameter causes all triples with a NULL graph name in the specified RDF graphs to participate in NGGI. In this example, all triples with a NULL graph name in RDF graph hospital_ontology will be included in NGGI.
- The include_named_g parameter causes all triples from the specified named graphs (across all source RDF graphs) to participate in NGGI. In this example, triples from named graphs <urn:hospital1_patient1> and <urn:hospital1_patient2> will be included in NGGI.
- The inf_ng_name parameter assigns graph name <urn:inf_graph_for_hospital1> to all the new triples inferred by NGGI.
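Because the inferred triples carry the graph name given in inf_ng_name, one way to inspect them is a SEM_MATCH query with a GRAPH pattern on that name. The following is a sketch under that assumption, using the RDF graphs and rulebase from Example 3-6; it was not part of the original example.

```sql
-- Return the triples stored under the graph name assigned by inf_ng_name.
-- Assumption: the patients_inf inferred graph from Example 3-6 exists and is valid.
SELECT s$rdfterm, p$rdfterm, o$rdfterm
  FROM TABLE(SEM_MATCH(
         'SELECT ?s ?p ?o WHERE { GRAPH <urn:inf_graph_for_hospital1> { ?s ?p ?o } }',
         SEM_MODELS('patients','hospital_ontology'),
         SEM_RULEBASES('owl2rl'),
         null, null, null,
         'PLUS_RDFT=VC', null, null, 'RDFUSER', 'NET1'));
```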
Parent topic: Using Named Graph Based Inferencing (Global and Local)
3.2.11.2 Named Graph Based Local Inference (NGLI)
Named graph based local inference (NGLI) treats each named graph as a separate entity instead of viewing the graphs as a single unified graph. Inference logic is performed within the boundary of each entity. You can specify schema-related assertions (TBox) in a default graph, and that default graph will participate in the inference of each named graph. For example, inferred triples based on a graph with name G1 will be assigned the same graph name G1 in the inferred data partition.
Assertions from any two separate named graphs will never jointly produce any new assertions.
For example, assume the following:
- Graph G1 includes the following assertion: :John :hasBirthMother :Mary .
- Graph G2 includes the following assertion: :John :hasBirthMother :Bella .
- The default graph includes the assertion that :hasBirthMother is an owl:FunctionalProperty. (This assertion has a null graph name.)
In this example, named graph based local inference (NGLI) will not infer that :Mary is owl:sameAs :Bella because the two assertions are from two distinct graphs, G1 and G2. By contrast, a named graph based global inference (NGGI) that includes G1, G2, and the functional property definition would be able to infer that :Mary is owl:sameAs :Bella.
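The three assertions in this scenario could be loaded as in the following minimal sketch. Everything named here is hypothetical: the application table and RDF graph family, the http://example.com/ URIs standing in for the prefixed names such as :John, and the graph names <urn:G1> and <urn:G2>. The sketch also assumes that a named graph is specified by appending a colon and the graph name to the RDF graph name in the SDO_RDF_TRIPLE_S constructor.

```sql
-- Hypothetical application table and RDF graph for this scenario.
CREATE TABLE family(id NUMBER, triple SDO_RDF_TRIPLE_S);
EXECUTE sem_apis.create_rdf_graph('family','family','triple',network_owner=>'RDFUSER',network_name=>'NET1');

-- Instance assertion stored in named graph G1 (assumed name <urn:G1>).
INSERT INTO family VALUES (1, sdo_rdf_triple_s('family:<urn:G1>',
  '<http://example.com/John>', '<http://example.com/hasBirthMother>',
  '<http://example.com/Mary>', 'RDFUSER', 'NET1'));

-- Instance assertion stored in named graph G2 (assumed name <urn:G2>).
INSERT INTO family VALUES (2, sdo_rdf_triple_s('family:<urn:G2>',
  '<http://example.com/John>', '<http://example.com/hasBirthMother>',
  '<http://example.com/Bella>', 'RDFUSER', 'NET1'));

-- Schema assertion stored in the default graph (no graph name).
INSERT INTO family VALUES (3, sdo_rdf_triple_s('family',
  '<http://example.com/hasBirthMother>',
  '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>',
  '<http://www.w3.org/2002/07/owl#FunctionalProperty>', 'RDFUSER', 'NET1'));

COMMIT;
```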
NGLI currently does not work together with proof generation, user-defined rules, optimized owl:sameAs handling, or incremental inference.
Example 3-7 Named Graph Based Local Inference
Example 3-7 shows NGLI.
EXECUTE sem_apis.create_inferred_graph( 'patients_inf', rdf_graphs_in => sem_models('patients','hospital_ontology'), rulebases_in => sem_rulebases('owl2rl'), passes => SEM_APIS.REACH_CLOSURE, inf_components_in => null, options => 'LOCAL_NG_INF=T', network_owner=>'RDFUSER', network_name=>'NET1' );
In Example 3-7:
- The two RDF graphs patients and hospital_ontology together are the source graphs, and they set up an overall scope for the inference, similar to the case of global inference in Example 3-6. All triples with a null graph name are treated as part of the common schema (TBox). Inference is performed within the boundary of every single named graph combined with the common schema.
- The options parameter keyword-value pair LOCAL_NG_INF=T specifies that named graph based local inference (NGLI) is to be performed.
Note that, by design, NGLI does not apply to the default graph itself. However, you can easily apply named graph based global inference (NGGI) on the default graph and set the inf_ng_name parameter to null. In this way, the TBox inference is precomputed, improving the overall performance and storage consumption.
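One way to check that locally inferred triples stay within their originating named graph is to return the graph name alongside each triple. The following is a sketch under two assumptions: the inferred graph from Example 3-7 exists, and the query passes the same RDF graphs and rulebase that were used to create it.

```sql
-- Sketch only: list triples per named graph after NGLI.
-- The result mixes asserted and inferred triples; the GRAPH pattern only
-- shows which named graph each triple belongs to.
SELECT g$rdfterm, s$rdfterm, p$rdfterm, o$rdfterm
  FROM TABLE(SEM_MATCH(
         'SELECT ?g ?s ?p ?o WHERE { GRAPH ?g { ?s ?p ?o } }',
         SEM_MODELS('patients', 'hospital_ontology'),
         SEM_RULEBASES('owl2rl'),
         null, null, null,
         'PLUS_RDFT=VC',
         null, null,
         'RDFUSER', 'NET1'))
 ORDER BY g$rdfterm;
```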
NGLI does not allow the following:
- Inferring new relationships based on a mix of triples from multiple named graphs
- Inferring new relationships using only triples from the default graph
To get the inference that you would normally expect, you should keep schema assertions and instance assertions separate. Schema assertions (for example, :A rdfs:subClassOf :B and :p1 rdfs:subPropertyOf :p2) should be stored in the default graph as unnamed triples (with null graph names). By contrast, instance assertions (for example, :X :friendOf :Y) should be stored in one of the named graphs.
For a discussion and example of using NGLI to perform document-centric inference with semantically indexed documents, see Performing Document-Centric Inference.
3.2.11.3 Using NGGI and NGLI Together
The following is a recommended usage flow for using NGGI and NGLI together. It assumes that TBox and ABox are stored in two separate RDF graphs, that TBox contains schema definitions and all triples in the TBox have a null graph name, but that ABox consists of a set of named graphs describing instance-related data.
3.2.12 Performing Selective Inferencing (Advanced Information)
Selective inferencing is component-based inferencing, in which you limit the inferencing to specific OWL components that you are interested in. To perform selective inferencing, use the inf_components_in parameter to the SEM_APIS.CREATE_INFERRED_GRAPH procedure to specify a comma-delimited list of components. The final inferencing is determined by the union of rulebases specified and the components specified.
Example 3-8 Performing Selective Inferencing
Example 3-8 limits the inferencing to the class hierarchy from the subclass (SCOH) relationship and the property hierarchy from the subproperty (SPOH) relationship. This example creates an empty rulebase and then specifies the two components ('SCOH,SPOH') in the call to the SEM_APIS.CREATE_INFERRED_GRAPH procedure.
EXECUTE sem_apis.create_rulebase('my_rulebase', network_owner=>'RDFUSER', network_name=>'NET1');
EXECUTE sem_apis.create_inferred_graph('owltst_idx', sem_models('owltst'), sem_rulebases('my_rulebase'), SEM_APIS.REACH_CLOSURE, 'SCOH,SPOH', network_owner=>'RDFUSER', network_name=>'NET1');
The following component codes are available: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, MBRH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, SAM, CHAIN, HASKEY, ONEOF, INTERSECT, INTERSECTSCOH, MBRLST, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION, RDFP1, RDFP2, RDFP3, RDFP4, RDFP6, RDFP7, RDFP8AX, RDFP8BX, RDFP9, RDFP10, RDFP11, RDFP12A, RDFP12B, RDFP12C, RDFP13A, RDFP13B, RDFP13C, RDFP14A, RDFP14BX, RDFP15, RDFP16, RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, RDFS13.
The rules corresponding to components with a prefix of RDFP can be found in Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary, by H.J. Horst.
The syntax for deselecting a component is component_name followed by a minus (-) sign. For example, the following statement performs OWLPrime inference without calculating the subClassOf hierarchy:
EXECUTE sem_apis.create_inferred_graph('owltst_idx', sem_models('owltst'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, 'SCOH-', network_owner=>'RDFUSER', network_name=>'NET1');
By default, the OWLPrime rulebase implements the transitive semantics of owl:sameAs. OWLPrime does not include the following rules (semantics):
U owl:sameAs V . U p X . ==> V p X .
U owl:sameAs V . X p U . ==> X p V .
The reason for not including these rules is that they tend to generate many assertions. If you need to include these assertions, you can include the SAM component code in the call to the SEM_APIS.CREATE_INFERRED_GRAPH procedure.
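Following the same call pattern as the other examples in this section, adding the SAM component to an OWLPrime run might look like the sketch below. It assumes the owltst RDF graph and the RDFUSER/NET1 network from Example 3-1, and it assumes that no inferred graph named owltst_idx currently exists.

```sql
-- Sketch only: OWLPrime inference plus the two owl:sameAs propagation rules.
-- Assumption: an inferred graph named owltst_idx left over from an earlier
-- example has been dropped before this call is made.
BEGIN
  sem_apis.create_inferred_graph(
    'owltst_idx',
    sem_models('owltst'),
    sem_rulebases('OWLPRIME'),
    SEM_APIS.REACH_CLOSURE,
    'SAM',                         -- adds the SAM component to OWLPrime
    network_owner => 'RDFUSER',
    network_name  => 'NET1');
END;
/
```

Because the SAM rules copy assertions across every owl:sameAs pair, the inferred graph can be noticeably larger than with OWLPrime alone.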