15.35 SEM_APIS.CREATE_INFERRED_GRAPH
Format
SEM_APIS.CREATE_INFERRED_GRAPH(
  inferred_graph_name IN VARCHAR2,
  rdf_graphs_in IN SEM_MODELS,
  rulebases_in IN SEM_RULEBASES,
  passes IN NUMBER DEFAULT SEM_APIS.REACH_CLOSURE,
  inf_components_in IN VARCHAR2 DEFAULT NULL,
  options IN VARCHAR2 DEFAULT NULL,
  delta_in IN SEM_MODELS DEFAULT NULL,
  label_gen IN RDFSA_LABELGEN DEFAULT NULL,
  include_named_g IN SEM_GRAPHS DEFAULT NULL,
  include_default_g IN SEM_MODELS DEFAULT NULL,
  include_all_g IN SEM_MODELS DEFAULT NULL,
  inf_ng_name IN VARCHAR2 DEFAULT NULL,
  inf_ext_user_func_name IN VARCHAR2 DEFAULT NULL,
  ols_ladder_inf_lbl_sec IN VARCHAR2 DEFAULT NULL,
  network_owner IN VARCHAR2 DEFAULT NULL,
  network_name IN VARCHAR2 DEFAULT NULL);
Description
Creates an inferred graph (rules index) that can be used to perform OWL or RDFS inferencing and that can optionally apply user-defined rules.
Parameters
- inferred_graph_name
-
Name of the inferred graph to be created.
- rdf_graphs_in
-
One or more RDF graph names. Its data type is SEM_MODELS, which has the following definition:
TABLE OF VARCHAR2(25)
- rulebases_in
-
One or more rulebase names. Its data type is SEM_RULEBASES, which has the following definition:
TABLE OF VARCHAR2(25)
Rules and rulebases are explained in Inferencing: Rules and Rulebases.
- passes
-
The number of rounds that the inference engine should run. The default value is SEM_APIS.REACH_CLOSURE, which means the inference engine runs until closure is reached. If the number of rounds specified is less than the number of rounds actually needed to reach closure, the status of the inferred graph is set to INCOMPLETE.
- inf_components_in
-
A comma-delimited string of keywords representing inference components, for performing selective or component-based inferencing. If this parameter is null, the default set of inference components is used. See the Usage Notes for more information about inference components.
- options
-
A comma-delimited string of options to control the inference process by overriding the default inference behavior. To enable an option, specify option-name=T; to disable an option, specify option-name=F (the default). The available option-name values are COL_COMPRESS, DEST_MODEL, DISTANCE, DOP, ENTAIL_ANYWAY, HASH_PART, INC, LOCAL_NG_INF, OPT_SAMEAS, RAW8, PROOF, and USER_RULES. See the Usage Notes for explanations of each value.
- delta_in
-
If incremental inference is in effect, specifies one or more RDF graphs on which to perform incremental inference. Its data type is SEM_MODELS, which has the following definition:
TABLE OF VARCHAR2(25)
The triples in the first RDF graph in delta_in are copied to the first RDF graph in rdf_graphs_in, and the inferred graph (rules index) named in inferred_graph_name is updated; then the triples in the second RDF graph (if any) in delta_in are copied to the second RDF graph (if any) in rdf_graphs_in, and the inferred graph is updated again; and so on until all triples are copied and the inferred graph is updated. (The delta_in parameter has no effect if incremental inference is not enabled for the inferred graph.)
- label_gen
-
An instance of RDFSA_LABELGEN or a subtype of it, defining the logic for generating Oracle Label Security (OLS) labels for inferred triples. What you specify for this parameter depends on whether you use the default label generator or a custom label generator:
-
If you use the default label generator, specify one of the following constants: SEM_RDFSA.LABELGEN_RULE for Use Rule Label, SEM_RDFSA.LABELGEN_SUBJECT for Use Subject Label, SEM_RDFSA.LABELGEN_PREDICATE for Use Predicate Label, SEM_RDFSA.LABELGEN_OBJECT for Use Object Label, SEM_RDFSA.LABELGEN_DOMINATING for Use Dominating Label, SEM_RDFSA.LABELGEN_ANTECED for Use Antecedent Labels.
-
If you use a custom label generator, specify the custom label generator type.
- include_named_g
-
Causes all triples from the specified named graphs (across all source RDF graphs) to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_named_g => sem_graphs('<urn:G1>','<urn:G2>') implies that triples from named graphs G1 and G2 will be included in NGGI.
Its data type is SEM_GRAPHS, which has the following definition:
TABLE OF VARCHAR2(4000)
- include_default_g
-
Causes all triples with a null graph name in the specified RDF graphs to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_default_g => sem_models('m1') causes all triples with a null graph name from M1 to be included in NGGI.
- include_all_g
-
Causes all triples, regardless of their graph name values, in the specified RDF graphs to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_all_g => sem_models('m2') causes all triples in M2 to be included in NGGI.
- inf_ng_name
-
Assigns the specified graph name to all the new triples inferred by the named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)).
- inf_ext_user_func_name
-
The name of a user-defined inference function, or a comma-delimited list of names of user-defined functions. For information about creating user-defined inference functions, including format requirements and options for certain parameters, see API Support for User-Defined Inferencing. (For information about user-defined inferencing, including examples, see User-Defined Inferencing and Querying.)
- ols_ladder_inf_lbl_sec
- network_owner
-
Owner of the RDF network. (See Table 1-2.)
- network_name
-
Name of the RDF network. (See Table 1-2.)
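The named graph parameters described above are easiest to read when passed by name. The following is a minimal sketch only: the inferred graph name, the RDF graph names, and the named graph IRIs are hypothetical, and it assumes that the OWLPRIME rulebase and the source RDF graphs already exist in the network.

```sql
-- Sketch only: all object names and IRIs below are hypothetical.
BEGIN
  sem_apis.create_inferred_graph(
    inferred_graph_name => 'patients_inf',
    rdf_graphs_in       => sem_models('patients', 'hospital_ontology'),
    rulebases_in        => sem_rulebases('OWLPRIME'),
    passes              => SEM_APIS.REACH_CLOSURE,
    -- Only triples in these two named graphs take part in NGGI.
    include_named_g     => sem_graphs('<urn:G1>', '<urn:G2>'),
    -- Newly inferred triples are assigned this graph name.
    inf_ng_name         => '<urn:inferred_g1>');
END;
/
```

Named notation also lets you skip the parameters whose defaults you accept, which is why inf_components_in, options, and delta_in do not appear in the call.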
Usage Notes
For the inf_components_in parameter, you can specify any combination of the following keywords: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, MBRH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, SAM, CHAIN, HASKEY, ONEOF, INTERSECT, INTERSECTSCOH, MBRLST, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION, RDFP1, RDFP2, RDFP3, RDFP4, RDFP6, RDFP7, RDFP8AX, RDFP8BX, RDFP9, RDFP10, RDFP11, RDFP12A, RDFP12B, RDFP12C, RDFP13A, RDFP13B, RDFP13C, RDFP14A, RDFP14BX, RDFP15, RDFP16, RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, RDFS13. For an explanation of the meaning of these keywords, see Table 15-3, where the keywords are listed in alphabetical order.
The default set of inference components for the OWLPrime vocabulary includes the following: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, SAMH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, RDFP14A, RDFP14BX, RDFP15, RDFP16. However, note the following:
-
Component SAM is not in this default OWLPrime list, because it tends to generate many new triples for some ontologies.
-
Effective with Release 11.2, the native OWL inference engine supports the following new inference components: CHAIN, HASKEY, INTERSECT, INTERSECTSCOH, MBRLST, ONEOF, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION. However, for backward compatibility, the OWLPrime rulebase and any existing rulebases do not include these new components by default; to use them, you must specify them explicitly. They are included in Table 15-3.
The following example creates an OWLPrime inferred graph for two OWL ontologies named LUBM and UNIV. Because of the additional inference components specified, this inferred graph will include the new semantics introduced in those inference components.
EXECUTE sem_apis.create_inferred_graph('lubm1000_idx', sem_models('lubm','univ'), sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, 'INTERSECT,INTERSECTSCOH,SVFH,THINGH,THINGSAM,UNION');
Table 15-3 Inferencing Keywords for inf_components_in Parameter
Keyword | Explanation |
---|---|
CHAIN | Captures the property chain semantics defined in OWL 2. Only chains of length 2 are supported. By default, this is included in the |
COMPH | Performs inference based on owl:complementOf assertions and the interaction of owl:complementOf with other language constructs. |
DIF | Generates owl:differentFrom assertions based on the symmetry of owl:differentFrom. |
DISJ | Infers owl:differentFrom relationships at instance level using owl:disjointWith assertions. |
DISJH | Performs inference based on owl:disjointWith assertions and their interactions with other language constructs. |
DOM | Performs inference based on RDFS2. |
DOMH | Performs inference based on rdfs:domain assertions and their interactions with other language constructs. |
EQCH | Performs inferences that are relevant to owl:equivalentClass. |
EQPH | Performs inferences that are relevant to owl:equivalentProperty. |
FP | Performs instance-level inference using instances of owl:FunctionalProperty. |
FPH | Performs inference using instances of owl:FunctionalProperty. |
HASKEY | Covers the semantics behind "keys" defined in OWL 2. In OWL 2, a collection of properties can be treated as a key to a class expression. For efficiency, the size of the collection must not exceed 3. (New as of Release 11.2.) |
IFP | Performs instance-level inference using instances of owl:InverseFunctionalProperty. |
IFPH | Performs inference using instances of owl:InverseFunctionalProperty. |
INTERSECT | Handles the core semantics of owl:intersectionOf. For example, if class C is the intersection of classes C1, C2, and C3, then C is a subclass of C1, C2, and C3. In addition, common instances of all C1, C2, and C3 are also instances of C. (New as of Release 11.2.) |
INTERSECTSCOH | Handles the fact that an intersection is the maximal common subset. For example, if class C is the intersection of classes C1, C2, and C3, then any common subclass of all C1, C2, and C3 is a subclass of C. (New as of Release 11.2.) |
INV | Performs instance-level inference using owl:inverseOf assertions. |
INVH | Performs inference based on owl:inverseOf assertions and their interactions with other language constructs. |
MBRLST | Captures the semantics that for any resource, every item in the list given as the value of the |
ONEOF | Generates classification assertions based on the definition of the enumeration classes. In OWL, class extensions can be enumerated explicitly with the |
PROPDISJH | Captures the interaction between |
RANH | Performs inference based on |
RDFP* | (The rules corresponding to components with a prefix of RDFP can be found in Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary, by H.J. Horst.) |
RDFS2, ... RDFS13 | RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, and RDFS13 are described in Section 7.3 of RDF Semantics ( |
SAM | Performs inference about individuals based on existing assertions for those individuals and owl:sameAs. |
SAMH | Infers owl:sameAs assertions using the transitivity and symmetry of owl:sameAs. |
SCO | Performs inference based on RDFS9. |
SCOH | Generates the subClassOf hierarchy based on existing rdfs:subClassOf assertions. For example, C1 rdfs:subClassOf C2 and C2 rdfs:subClassOf C3 together infer C1 rdfs:subClassOf C3 based on transitivity. SCOH is also an alias of RDFS11. |
SKOSAXIOMS | Captures most of the axioms defined in the SKOS detailed specification. By default, this is included in the |
SNOMED | Performs inference based on the semantics of the OWL 2 EL profile, which captures the expressiveness of SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms), which is one of the most expressive and complex medical terminologies. (New as of Release 11.2.) |
SPIH | Performs inference based on interactions between rdfs:subPropertyOf and owl:inverseOf assertions. |
SPO | Performs inference based on RDFS7. |
SPOH | Generates the rdfs:subPropertyOf hierarchy based on the transitivity of rdfs:subPropertyOf. It is an alias of RDFS5. |
SVFH | Handles the following semantics that involve the interaction between owl:someValuesFrom and rdfs:subClassOf. Consider two existential restriction classes C1 and C2 that both use the same restriction property. Assume further that the owl:someValuesFrom constraint class for C1 is a subclass of that for C2. Then C1 can be inferred as a subclass of C2. (New as of Release 11.2.) |
SYMM | Performs instance-level inference using instances of owl:SymmetricProperty. |
SYMMH | Performs inference for properties of type owl:SymmetricProperty. |
THINGH | Handles the semantics that any defined OWL class is a subclass of owl:Thing. The consequence of this rule is that instances of all defined OWL classes will become instances of owl:Thing. The size of the inferred graph will very likely be bigger with this component selected. (New as of Release 11.2.) |
THINGSAM | Handles the semantics that instances of owl:Thing are equal to (owl:sameAs) themselves. This component is provided for the convenience of some applications. Note that an application does not have to select this inference component to determine that an individual is equal to itself; this kind of information can easily be built into the application logic. (New as of Release 11.2.) |
TRANS | Calculates transitive closure for instances of owl:TransitiveProperty. |
UNION | Captures the core semantics of the |
To deselect a component, use the component name followed by a minus (-) sign. For example, SCOH- deselects inference of the subClassOf hierarchy.
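As an illustration of the deselection syntax, the sketch below passes SCOH- in inf_components_in. The object names are hypothetical, and the call assumes that the rest of the rulebase's default components stay in effect when only a deselection is given; verify that behavior in your release before relying on it.

```sql
-- Sketch only: 'owltst_idx' and 'owltst' are hypothetical names.
BEGIN
  sem_apis.create_inferred_graph(
    inferred_graph_name => 'owltst_idx',
    rdf_graphs_in       => sem_models('owltst'),
    rulebases_in        => sem_rulebases('OWLPRIME'),
    passes              => SEM_APIS.REACH_CLOSURE,
    -- Trailing minus sign deselects the subClassOf hierarchy component.
    inf_components_in   => 'SCOH-');
END;
/
```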
For the options parameter, you can enable the following options to override the default inferencing behavior:
-
COL_COMPRESS=T creates the temporary, intermediate working tables with columnar compression. This option can reduce the space required for such tables, and can improve the performance of the CREATE_INFERRED_GRAPH operation with large data sets.
By default, COL_COMPRESS=T uses the "compress for query level low" setting; however, you can add CPQH=T to change to the "compress for query level high" setting.
Note:
You can specify COL_COMPRESS=T only on systems that support Hybrid Columnar Compression (HCC). For information about HCC, see Oracle Database Concepts.
-
DEST_MODEL=<rdf_graph_name> specifies, for incremental inference, the destination graph to which the delta_in RDF graphs are to be added. The specified destination graph must be one of the graphs specified in the rdf_graphs_in parameter.
-
DISTANCE=T generates ancillary distance information that is useful for semantic operators.
-
DOP=n specifies the degree of parallelism for parallel inference, which can improve inference performance. For information about parallel inference, see Using Parallel Inference.
-
ENTAIL_ANYWAY=T forces OWL inferencing to proceed and reuse existing inferred data (inferred graph) when the inferred graph has a valid status. By default, SEM_APIS.CREATE_INFERRED_GRAPH quits immediately if there is already a valid inferred graph for the combination of RDF graphs and rulebases.
-
HASH_PART=n creates the specified number of hash partitions for internal working tables. (The number must be a power of 2: 2, 4, 8, 16, 32, and so on.) You may want to specify a value if there are many distinct predicates in the RDF graph. In Oracle internal testing on benchmark ontologies, HASH_PART=32 worked well.
-
INC=T enables incremental inference for the inferred graph. For information about incremental inference, see Performing Incremental Inference.
-
LOCAL_NG_INF=T causes named graph based local inference (NGLI) to be used instead of named graph based global inference (NGGI). For information about NGLI, see Named Graph Based Local Inference (NGLI).
-
MODEL_PARTITIONS=n overrides the default number of subpartitions in a composite partitioned RDF network and creates the specified number (n) of subpartitions in the final inferred graph partition in RDF_LINK$.
-
OPT_SAMEAS=T uses consolidated owl:sameAs inference for the inferred graph. If you specify this option, you cannot specify PROOF=T. For information about optimizing owl:sameAs inference, see Optimizing owl:sameAs Inference.
-
RAW8=T uses RAW8 data types for the auxiliary inference tables. This option can improve inference performance by up to 30% in some cases.
-
PROOF=T generates proof for inferred triples. Do not specify this option unless you need to; it slows inference performance because it causes more data to be generated. If you specify this option, you cannot specify OPT_SAMEAS=T.
-
USER_RULES=T causes any user-defined rules to be applied. If you specify this option, you cannot specify PROOF=T or DISTANCE=T, and you must accept the default value for the passes parameter.
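Several options can be combined in one comma-delimited string. The sketch below is illustrative only: the object names are hypothetical, DOP=4 is an arbitrary value that should be sized to the available CPU resources, and HASH_PART=32 simply reuses the value reported above from Oracle internal testing.

```sql
-- Sketch only: 'owltst_idx' and 'owltst' are hypothetical names.
BEGIN
  sem_apis.create_inferred_graph(
    inferred_graph_name => 'owltst_idx',
    rdf_graphs_in       => sem_models('owltst'),
    rulebases_in        => sem_rulebases('OWLPRIME'),
    -- RAW8 auxiliary tables, parallel degree 4, 32 hash partitions.
    -- PROOF=T is left out because it slows inference.
    options             => 'RAW8=T,DOP=4,HASH_PART=32');
END;
/
```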
For the delta_in parameter, inference performance is best if the delta is small compared to the overall size of the RDF graphs in rdf_graphs_in. In a typical scenario, the best results might be achieved when the delta contains fewer than 10,000 triples; however, some tests have shown significant inference performance improvements with deltas as large as 100,000 triples.
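An incremental workflow built on INC=T and delta_in might look like the two calls below. This is a hedged sketch, not a prescribed procedure: the names m, m_new, and m_inf are hypothetical, and it assumes that m_new holds only the newly arrived triples.

```sql
-- Sketch only: 'm', 'm_new', and 'm_inf' are hypothetical names.

-- Step 1: create the inferred graph with incremental inference enabled.
BEGIN
  sem_apis.create_inferred_graph(
    inferred_graph_name => 'm_inf',
    rdf_graphs_in       => sem_models('m'),
    rulebases_in        => sem_rulebases('OWLPRIME'),
    options             => 'INC=T');
END;
/

-- Step 2: later, apply a small delta. The triples in m_new are copied
-- into m, and the inferred graph m_inf is updated.
BEGIN
  sem_apis.create_inferred_graph(
    inferred_graph_name => 'm_inf',
    rdf_graphs_in       => sem_models('m'),
    rulebases_in        => sem_rulebases('OWLPRIME'),
    delta_in            => sem_models('m_new'));
END;
/
```

The graphs in delta_in pair up by position with those in rdf_graphs_in, so a call with several source graphs needs the delta graphs listed in the same order.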
For the label_gen parameter, if you want to use the default OLS label generator, specify the appropriate SEM_RDFSA package constant value from Table 15-4.
Table 15-4 SEM_RDFSA Package Constants for label_gen Parameter
Constant | Description |
---|---|
SEM_RDFSA.LABELGEN_SUBJECT | Label generator that applies the label associated with the inferred triple's subject as the triple's label. |
SEM_RDFSA.LABELGEN_PREDICATE | Label generator that applies the label associated with the inferred triple's predicate as the triple's label. |
SEM_RDFSA.LABELGEN_OBJECT | Label generator that applies the label associated with the inferred triple's object as the triple's label. |
SEM_RDFSA.LABELGEN_RULE | Label generator that applies the label associated with the rule that directly produced the inferred triple as the triple's label. If you specify this option, you must also specify |
SEM_RDFSA.LABELGEN_DOMINATING | Label generator that computes a dominating label of all the available labels for the triple's components (subject, predicate, object, and rule), and applies it as the label for the inferred triple. |
Fine-Grained Access Control (OLS) Considerations
When fine-grained access control is enabled for the entire network using OLS, only a user with FULL access privileges to the associated policy may create an inferred graph. When OLS is enabled, full access privileges to the OLS policy are granted using the SA_USER_ADMIN.SET_USER_PRIVS procedure.
Inferred triples accessed through generated labels might not be the same as conceptual triples inferred directly from the user-accessible triples and rules. Labels generated using a subset of triple components may be weaker than intended. For example, one of the antecedents for the inferred triple may have a higher label than any of the components of the triple. When the label is generated based on just the triple components, end users with no access to one of the antecedents may still have access to the inferred triple. Conversely, even when the antecedents are used for custom label generation, the generated label may be stronger than intended. The inference process is not exhaustive, and information pertaining to any alternate ways of inferring the same triple is not available. So, the label generated using a given set of antecedents may be too strong, because a user with access to all the triples in the alternate path could infer the triple with lower access.
Even when generating a label that dominates all its components and antecedents, the label may not be precise. This is the case when the labels considered for the dominating relationship have non-overlapping group information. For example, consider two labels L:C:NY and L:C:NH, where L is a level, C is a compartment, and NY and NH are two groups. A simple label that dominates these two labels is L:C:NY,NH, and a true supremum for the two labels is L:C:US, where US is the parent group for both NY and NH. Unfortunately, neither of these two dominating labels is precise for the triple inferred from the triples with the first two labels. If L:C:NY,NH is used for the inferred triple, a user with membership in either of these groups has access to the inferred triple, whereas the same user does not have access to one of its antecedents. On the other hand, if L:C:US is used for the inferred triple, a user with membership in both groups but not in the US group will not be able to access the inferred triple, whereas that user could infer the triple by directly accessing its components and antecedents.
Because of these unique challenges with inferred triples, extra caution must be taken when choosing or implementing the label generator.
See also the OLS example in the Examples section.
For information about RDF network types and options, see RDF Networks.
Note:
If the SEM_APIS.CREATE_INFERRED_GRAPH procedure with OWL2RL reasoning takes a long time to execute, run the procedure with the options shown for the OWL2RL rulebase example in the Examples section.
Examples
The following example creates an inferred graph named OWLTST_IDX using the OWLPrime rulebase, and it causes proof to be generated for inferred triples.
EXECUTE sem_apis.create_inferred_graph('owltst_idx', sem_models('owltst'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, null, 'PROOF=T');
The following example assumes an OLS environment. It creates a rulebase with a rule, and it creates an inferred graph.
-- Create a rulebase with a rule. --
exec sem_apis.create_rulebase('contracts_rb');

insert into rdfr_contracts_rb values (
  'projectLedBy',
  '(?x :drivenBy ?y) (?y :hasVP ?z)',
  NULL,
  '(?x :isLedBy ?z)',
  SDO_RDF_Aliases(SDO_RDF_Alias('','http://www.myorg.com/pred/')));

-- Assign sensitivity label for the predicate to be inferred. --
-- The predicate label may be set globally, or it can be assigned to --
-- one or all of the RDF graphs used to infer the data, e.g., CONTRACTS. --
begin
  sem_rdfsa.set_predicate_label(
    model_name   => 'rdf$global',
    predicate    => 'http://www.myorg.com/pred/isLedBy',
    label_string => 'TS:US_SPCL');
end;
/

-- Create index with a specific label generator. --
begin
  sem_apis.create_inferred_graph(
    inferred_graph_name => 'contracts_inf',
    rdf_graphs_in       => sem_models('contracts'),
    rulebases_in        => sem_Rulebases('contracts_rb'),
    options             => 'USER_RULES=T',
    label_gen           => sem_rdfsa.LABELGEN_PREDICATE);
end;
/

-- Check for any label exceptions and update them accordingly. --
update rdfi_contracts_inf set ctxt1 = 1100 where ctxt1 = -1;

-- The new inferred graph is now ready for use in SEM_MATCH queries. --
The following example shows the steps to overcome long execution time when creating inferred graphs with OWL2RL rulebase.
ALTER SESSION SET "_OPTIMIZER_GENERATE_TRANSITIVE_PRED"=FALSE;
EXECUTE SEM_APIS.CREATE_INFERRED_GRAPH('m1_inf', SEM_MODELS('m1'), SEM_RULEBASES('OWL2RL'), NULL, NULL, 'RAW8=T,DOP=8,HINTS=[rule:SCM-CLS,use_hash(m1),rule:SCM-OP-DP,use_hash(m1)],PROCSVF=F,PROCAVF=F,PROCSCMHV=F,PROCSVFH=F,PROCAVFH=F,PROCDOM=F,PROCRAN=F');