15.32 SEM_APIS.CREATE_ENTAILMENT
Format
SEM_APIS.CREATE_ENTAILMENT(
     index_name_in          IN VARCHAR2,
     models_in              IN SEM_MODELS,
     rulebases_in           IN SEM_RULEBASES,
     passes                 IN NUMBER DEFAULT SEM_APIS.REACH_CLOSURE,
     inf_components_in      IN VARCHAR2 DEFAULT NULL,
     options                IN VARCHAR2 DEFAULT NULL,
     delta_in               IN SEM_MODELS DEFAULT NULL,
     label_gen              IN RDFSA_LABELGEN DEFAULT NULL,
     include_named_g        IN SEM_GRAPHS DEFAULT NULL,
     include_default_g      IN SEM_MODELS DEFAULT NULL,
     include_all_g          IN SEM_MODELS DEFAULT NULL,
     inf_ng_name            IN VARCHAR2 DEFAULT NULL,
     inf_ext_user_func_name IN VARCHAR2 DEFAULT NULL,
     ols_ladder_inf_lbl_sec IN VARCHAR2 DEFAULT NULL,
     network_owner          IN VARCHAR2 DEFAULT NULL,
     network_name           IN VARCHAR2 DEFAULT NULL);
Note:
This subprogram will be deprecated in a future release. It is recommended that you use the SEM_APIS.CREATE_INFERRED_GRAPH subprogram instead.
Description
Creates an entailment (rules index) that can be used to perform OWL or RDFS inferencing, and optionally use user-defined rules.
Parameters
- index_name_in
-
Name of the entailment to be created.
- models_in
-
One or more model names. Its data type is SEM_MODELS, which has the following definition:
TABLE OF VARCHAR2(25)
- rulebases_in
-
One or more rulebase names. Its data type is SEM_RULEBASES, which has the following definition:
TABLE OF VARCHAR2(25)
Rules and rulebases are explained in Inferencing: Rules and Rulebases.
- passes
-
The number of rounds that the inference engine should run. The default value is SEM_APIS.REACH_CLOSURE, which means the inference engine runs until a closure is reached. If the number of rounds specified is less than the number of rounds actually needed to reach a closure, the status of the entailment is set to INCOMPLETE.
- inf_components_in
-
A comma-delimited string of keywords representing inference components, for performing selective or component-based inferencing. If this parameter is null, the default set of inference components is used. See the Usage Notes for more information about inference components.
- options
-
A comma-delimited string of options to control the inference process by overriding the default inference behavior. To enable an option, specify option-name=T; to disable an option, you can specify option-name=F (the default). The available option-name values are COL_COMPRESS, DEST_MODEL, DISTANCE, DOP, ENTAIL_ANYWAY, HASH_PART, INC, LOCAL_NG_INF, OPT_SAMEAS, RAW8, PROOF, and USER_RULES. See the Usage Notes for explanations of each value.
- delta_in
-
If incremental inference is in effect, specifies one or more models on which to perform incremental inference. Its data type is SEM_MODELS, which has the following definition:
TABLE OF VARCHAR2(25)
The triples in the first model in delta_in are copied to the first model in models_in, and the entailment (rules index) named in index_name_in is updated; then the triples in the second model (if any) in delta_in are copied to the second model (if any) in models_in, and the entailment is updated again; and so on until all triples are copied and the entailment is updated. (The delta_in parameter has no effect if incremental inference is not enabled for the entailment.)
- label_gen
-
An instance of RDFSA_LABELGEN or a subtype of it, defining the logic for generating Oracle Label Security (OLS) labels for inferred triples. What you specify for this parameter depends on whether you use the default label generator or a custom label generator:
-
If you use the default label generator, specify one of the following constants: SEM_RDFSA.LABELGEN_RULE for Use Rule Label, SEM_RDFSA.LABELGEN_SUBJECT for Use Subject Label, SEM_RDFSA.LABELGEN_PREDICATE for Use Predicate Label, SEM_RDFSA.LABELGEN_OBJECT for Use Object Label, SEM_RDFSA.LABELGEN_DOMINATING for Use Dominating Label, or SEM_RDFSA.LABELGEN_ANTECED for Use Antecedent Labels.
-
If you use a custom label generator, specify the custom label generator type.
- include_named_g
-
Causes all triples from the specified named graphs (across all source models) to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_named_g => sem_graphs('<urn:G1>','<urn:G2>') implies that triples from named graphs G1 and G2 will be included in NGGI. Its data type is SEM_GRAPHS, which has the following definition:
TABLE OF VARCHAR2(4000)
- include_default_g
-
Causes all triples with a null graph name in the specified models to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_default_g => sem_models('m1') causes all triples with a null graph name from model M1 to be included in NGGI.
- include_all_g
-
Causes all triples, regardless of their graph name values, in the specified models to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_all_g => sem_models('m2') causes all triples in model M2 to be included in NGGI.
- inf_ng_name
-
Assigns the specified graph name to all the new triples inferred by the named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)).
- inf_ext_user_func_name
-
The name of a user-defined inference function, or a comma-delimited list of names of user-defined functions. For information about creating user-defined inference functions, including format requirements and options for certain parameters, see API Support for User-Defined Inferencing. (For information about user-defined inferencing, including examples, see User-Defined Inferencing and Querying.)
- ols_ladder_inf_lbl_sec
- network_owner
-
Owner of the semantic network. (See Table 1-2.)
- network_name
-
Name of the semantic network. (See Table 1-2.)
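As a rough sketch of how the named graph parameters might be combined, the following call uses named notation to restrict NGGI to two named graphs and to assign a graph name to the inferred triples. The entailment name patients_inf, the model name patients, and the graph name <urn:inf> are hypothetical; the graph names <urn:G1> and <urn:G2> come from the include_named_g description. Whether your rulebase and release support this combination should be confirmed against the NGGI documentation.

```sql
-- Hypothetical names: patients_inf (entailment), patients (model), <urn:inf> (graph).
BEGIN
  sem_apis.create_entailment(
    index_name_in   => 'patients_inf',
    models_in       => sem_models('patients'),
    rulebases_in    => sem_rulebases('OWLPRIME'),
    passes          => SEM_APIS.REACH_CLOSURE,
    -- Only triples from these two named graphs take part in NGGI.
    include_named_g => sem_graphs('<urn:G1>','<urn:G2>'),
    -- Inferred triples are assigned this graph name.
    inf_ng_name     => '<urn:inf>');
END;
/
```

Named notation lets the call skip the intervening optional parameters (inf_components_in, options, delta_in, label_gen), which keep their NULL defaults.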
Usage Notes
For the inf_components_in parameter, you can specify any combination of the following keywords: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, MBRH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, SAM, CHAIN, HASKEY, ONEOF, INTERSECT, INTERSECTSCOH, MBRLST, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION, RDFP1, RDFP2, RDFP3, RDFP4, RDFP6, RDFP7, RDFP8AX, RDFP8BX, RDFP9, RDFP10, RDFP11, RDFP12A, RDFP12B, RDFP12C, RDFP13A, RDFP13B, RDFP13C, RDFP14A, RDFP14BX, RDFP15, RDFP16, RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, RDFS13. For an explanation of the meaning of these keywords, see Table 15-1, where the keywords are listed in alphabetical order.
The default set of inference components for the OWLPrime vocabulary includes the following: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, SAMH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, RDFP14A, RDFP14BX, RDFP15, RDFP16. However, note the following:
-
Component SAM is not in this default OWLPrime list, because it tends to generate many new triples for some ontologies.
-
Effective with Release 11.2, the native OWL inference engine supports the following new inference components: CHAIN, HASKEY, INTERSECT, INTERSECTSCOH, MBRLST, ONEOF, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION. However, for backward compatibility, the OWLPrime rulebase and any existing rulebases do not include these new components by default; instead, to use these new inference components, you must specify them explicitly. They are included in Table 15-1.
The following example creates an OWLPrime entailment for two OWL ontologies named LUBM and UNIV. Because of the additional inference components specified, this entailment will include the new semantics introduced in those inference components.
EXECUTE sem_apis.create_entailment('lubm1000_idx', sem_models('lubm','univ'), sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, 'INTERSECT,INTERSECTSCOH,SVFH,THINGH,THINGSAM,UNION');
Table 15-1 Inferencing Keywords for inf_components_in Parameter
Keyword | Explanation |
---|---|
CHAIN | Captures the property chain semantics defined in OWL 2. Only chains of length 2 are supported. By default, this is included in the |
COMPH | Performs inference based on owl:complementOf assertions and the interaction of owl:complementOf with other language constructs. |
DIF | Generates owl:differentFrom assertions based on the symmetricity of owl:differentFrom. |
DISJ | Infers owl:differentFrom relationships at instance level using owl:disjointWith assertions. |
DISJH | Performs inference based on owl:disjointWith assertions and their interactions with other language constructs. |
DOM | Performs inference based on RDFS2. |
DOMH | Performs inference based on rdfs:domain assertions and their interactions with other language constructs. |
EQCH | Performs inferences that are relevant to owl:equivalentClass. |
EQPH | Performs inferences that are relevant to owl:equivalentProperty. |
FP | Performs instance-level inference using instances of owl:FunctionalProperty. |
FPH | Performs inference using instances of owl:FunctionalProperty. |
HASKEY | Covers the semantics behind "keys" defined in OWL 2. In OWL 2, a collection of properties can be treated as a key to a class expression. For efficiency, the size of the collection must not exceed 3. (New as of Release 11.2.) |
IFP | Performs instance-level inference using instances of owl:InverseFunctionalProperty. |
IFPH | Performs inference using instances of owl:InverseFunctionalProperty. |
INTERSECT | Handles the core semantics of owl:intersectionOf. For example, if class C is the intersection of classes C1, C2, and C3, then C is a subclass of C1, C2, and C3. In addition, common instances of all C1, C2, and C3 are also instances of C. (New as of Release 11.2.) |
INTERSECTSCOH | Handles the fact that an intersection is the maximal common subset. For example, if class C is the intersection of classes C1, C2, and C3, then any common subclass of all C1, C2, and C3 is a subclass of C. (New as of Release 11.2.) |
INV | Performs instance-level inference using owl:inverseOf assertions. |
INVH | Performs inference based on owl:inverseOf assertions and their interactions with other language constructs. |
MBRLST | Captures the semantics that for any resource, every item in the list given as the value of the |
ONEOF | Generates classification assertions based on the definition of the enumeration classes. In OWL, class extensions can be enumerated explicitly with the |
PROPDISJH | Captures the interaction between |
RANH | Performs inference based on |
RDFP* | (The rules corresponding to components with a prefix of RDFP can be found in Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary, by H.J. Horst.) |
RDFS2, ... RDFS13 | RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, and RDFS13 are described in Section 7.3 of RDF Semantics ( |
SAM | Performs inference about individuals based on existing assertions for those individuals and owl:sameAs. |
SAMH | Infers owl:sameAs assertions using transitivity and symmetricity of owl:sameAs. |
SCO | Performs inference based on RDFS9. |
SCOH | Generates the subClassOf hierarchy based on existing rdfs:subClassOf assertions. Basically, C1 rdfs:subClassOf C2 and C2 rdfs:subClassOf C3 will infer C1 rdfs:subClassOf C3 based on transitivity. SCOH is also an alias of RDFS11. |
SKOSAXIOMS | Captures most of the axioms defined in the SKOS detailed specification. By default, this is included in the |
SNOMED | Performs inference based on the semantics of the OWL 2 EL profile, which captures the expressiveness of SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms), which is one of the most expressive and complex medical terminologies. (New as of Release 11.2.) |
SPIH | Performs inference based on interactions between rdfs:subPropertyOf and owl:inverseOf assertions. |
SPO | Performs inference based on RDFS7. |
SPOH | Generates the rdfs:subPropertyOf hierarchy based on transitivity of rdfs:subPropertyOf. It is an alias of RDFS5. |
SVFH | Handles the following semantics that involve the interaction between owl:someValuesFrom and rdfs:subClassOf. Consider two existential restriction classes C1 and C2 that both use the same restriction property. Assume further that the owl:someValuesFrom constraint class for C1 is a subclass of that for C2. Then C1 can be inferred as a subclass of C2. (New as of Release 11.2.) |
SYMM | Performs instance-level inference using instances of owl:SymmetricProperty. |
SYMMH | Performs inference for properties of type owl:SymmetricProperty. |
THINGH | Handles the semantics that any defined OWL class is a subclass of owl:Thing. The consequence of this rule is that instances of all defined OWL classes will become instances of owl:Thing. The size of the inferred graph will very likely be bigger with this component selected. (New as of Release 11.2.) |
THINGSAM | Handles the semantics that instances of owl:Thing are equal to (owl:sameAs) themselves. This component is provided for the convenience of some applications. Note that an application does not have to select this inference component to figure out an individual is equal to itself; this kind of information can easily be built in the application logic. (New as of Release 11.2.) |
TRANS | Calculates transitive closure for instances of owl:TransitiveProperty. |
UNION | Captures the core semantics of the |
To deselect a component, use the component name followed by a minus (-) sign. For example, SCOH- deselects inference of the subClassOf hierarchy.
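For illustration only, a call that deselects the subClassOf hierarchy component might look like the following. It reuses the owltst model from the Examples section; the entailment name owltst_noscoh_idx is hypothetical. This section does not state how a lone deselection interacts with the default component set, so treat the string as a sketch to verify on your release.

```sql
-- Hypothetical entailment name; 'SCOH-' is the deselection form described above.
EXECUTE sem_apis.create_entailment('owltst_noscoh_idx', sem_models('owltst'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, 'SCOH-');
```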
For the options parameter, you can enable the following options to override the default inferencing behavior:
-
COL_COMPRESS=T creates the temporary, intermediate working tables with columnar compression. This option can reduce the space required for such tables, and can improve the performance of the CREATE_ENTAILMENT operation with large data sets.
By default, COL_COMPRESS=T uses the "compress for query level low" setting; however, you can add CPQH=T to change to the "compress for query level high" setting.
Note:
You can specify COL_COMPRESS=T only on systems that support Hybrid Columnar Compression (HCC). For information about HCC, see Oracle Database Concepts.
-
DEST_MODEL=<model_name> specifies, for incremental inference, the destination model to which the delta_in model or models are to be added. The specified destination model must be one of the models specified in the models_in parameter.
-
DISTANCE=T generates ancillary distance information that is useful for semantic operators.
-
DOP=n specifies the degree of parallelism for parallel inference, which can improve inference performance. For information about parallel inference, see Using Parallel Inference.
-
ENTAIL_ANYWAY=T forces OWL inferencing to proceed and reuse existing inferred data (entailment) when the entailment has a valid status. By default, SEM_APIS.CREATE_ENTAILMENT quits immediately if there is already a valid entailment for the combination of models and rulebases.
-
HASH_PART=n creates the specified number of hash partitions for internal working tables. (The number must be a power of 2: 2, 4, 8, 16, 32, and so on.) You may want to specify a value if there are many distinct predicates in the semantic data model. In Oracle internal testing on benchmark ontologies, HASH_PART=32 worked well.
-
INC=T enables incremental inference for the entailment. For information about incremental inference, see Performing Incremental Inference.
-
LOCAL_NG_INF=T causes named graph based local inference (NGLI) to be used instead of named graph based global inference (NGGI). For information about NGLI, see Named Graph Based Local Inference (NGLI).
-
MODEL_PARTITIONS=n overrides the default number of subpartitions in a composite partitioned semantic network and creates the specified number (n) of subpartitions in the final entailment partition in RDF_LINK$.
-
OPT_SAMEAS=T uses consolidated owl:sameAs entailment for the entailment. If you specify this option, you cannot specify PROOF=T. For information about optimizing owl:sameAs inference, see Optimizing owl:sameAs Inference.
-
RAW8=T uses RAW8 data types for the auxiliary inference tables. This option can improve entailment performance by up to 30% in some cases.
-
PROOF=T generates proof for inferred triples. Do not specify this option unless you need to; it slows inference performance because it causes more data to be generated. If you specify this option, you cannot specify OPT_SAMEAS=T.
-
USER_RULES=T causes any user-defined rules to be applied. If you specify this option, you cannot specify PROOF=T or DISTANCE=T, and you must accept the default value for the passes parameter.
For the delta_in parameter, inference performance is best if the delta (the set of triples in the delta_in models) is small compared to the overall size of the models in models_in. In a typical scenario, the best results might be achieved when the delta contains fewer than 10,000 triples; however, some tests have shown significant inference performance improvements with deltas as large as 100,000 triples.
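A two-step incremental inference workflow might be sketched as follows. The names m, m_new, and m_idx are hypothetical, and this sketch assumes that INC=T needs to be specified only when the entailment is first created; see Performing Incremental Inference for the authoritative procedure.

```sql
-- Step 1: create the entailment with incremental inference enabled.
EXECUTE sem_apis.create_entailment('m_idx', sem_models('m'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, NULL, 'INC=T');

-- Step 2: after new triples are loaded into the (hypothetical) staging model m_new,
-- pass it in delta_in so its triples are copied into m and the entailment is updated.
EXECUTE sem_apis.create_entailment('m_idx', sem_models('m'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, NULL, NULL, delta_in => sem_models('m_new'));
```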
For the label_gen parameter, if you want to use the default OLS label generator, specify the appropriate SEM_RDFSA package constant value from Table 15-2.
Table 15-2 SEM_RDFSA Package Constants for label_gen Parameter
Constant | Description |
---|---|
SEM_RDFSA.LABELGEN_SUBJECT | Label generator that applies the label associated with the inferred triple's subject as the triple's label. |
SEM_RDFSA.LABELGEN_PREDICATE | Label generator that applies the label associated with the inferred triple's predicate as the triple's label. |
SEM_RDFSA.LABELGEN_OBJECT | Label generator that applies the label associated with the inferred triple's object as the triple's label. |
SEM_RDFSA.LABELGEN_RULE | Label generator that applies the label associated with the rule that directly produced the inferred triple as the triple's label. If you specify this option, you must also specify |
SEM_RDFSA.LABELGEN_DOMINATING | Label generator that computes a dominating label of all the available labels for the triple's components (subject, predicate, object, and rule), and applies it as the label for the inferred triple. |
Fine-Grained Access Control (OLS) Considerations
When fine-grained access control is enabled for the entire network using OLS, only a user with FULL access privileges to the associated policy may create an entailment. When OLS is enabled, full access privileges to the OLS policy are granted using the SA_USER_ADMIN.SET_USER_PRIVS procedure.
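For example, a security administrator might grant the privilege along the following lines. The policy name defense and the user name rdf_admin are hypothetical placeholders; substitute the OLS policy that protects your network and the user who will create the entailment.

```sql
-- Hypothetical policy and user names.
BEGIN
  sa_user_admin.set_user_privs(
    policy_name => 'defense',
    user_name   => 'rdf_admin',
    privileges  => 'FULL');
END;
/
```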
Inferred triples accessed through generated labels might not be the same as conceptual triples inferred directly from the user-accessible triples and rules. The labels generated using a subset of triple components may be weaker than intended. For example, one of the antecedents for the inferred triple may have a higher label than any of the components of the triple. When the label is generated based on just the triple components, end users with no access to one of the antecedents may still have access to the inferred triple. Even when the antecedents are used for custom label generation, the generated label may be stronger than intended. The inference process is not exhaustive, and information pertaining to any alternate ways of inferring the same triple is not available. So, the label generated using a given set of antecedents may be too strong, because a user with access to all the triples in the alternate path could infer the triple with lower access.
Even when generating a label that dominates all its components and antecedents, the label may not be precise. This is the case when the labels considered for the dominating relationship have non-overlapping group information. For example, consider two labels L:C:NY and L:C:NH, where L is a level, C is a compartment, and NY and NH are two groups. A simple label that dominates these two labels is L:C:NY,NH, and a true supremum for the two labels is L:C:US, where US is the parent group for both NY and NH. Unfortunately, neither of these two dominating labels is precise for the triple inferred from the triples with the first two labels. If L:C:NY,NH is used for the inferred triple, a user with membership in either of these groups has access to the inferred triple, whereas the same user does not have access to one of its antecedents. On the other hand, if L:C:US is used for the inferred triple, a user with membership in both groups but not in the US group will not be able to access the inferred triple, whereas that user could infer the triple by directly accessing its components and antecedents.
Because of these unique challenges with inferred triples, extra caution must be taken when choosing or implementing the label generator.
See also the OLS example in the Examples section.
For information about semantic network types and options, see RDF Networks.
Note:
If the SEM_APIS.CREATE_ENTAILMENT procedure with OWL2RL reasoning takes a long time to execute, run the procedure with the options shown for the OWL2RL rulebase example in the Examples section.
Examples
The following example creates an entailment named OWLTST_IDX using the OWLPrime rulebase, and it causes proof to be generated for inferred triples.
EXECUTE sem_apis.create_entailment('owltst_idx', sem_models('owltst'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, null, 'PROOF=T');
The following example assumes an OLS environment. It creates a rulebase with a rule, and it creates an entailment.
-- Create a rulebase with a rule. --
exec sem_apis.create_rulebase('contracts_rb');

insert into rdfr_contracts_rb values (
  'projectLedBy', '(?x :drivenBy ?y) (?y :hasVP ?z)', NULL,
  '(?x :isLedBy ?z)',
  SDO_RDF_Aliases(SDO_RDF_Alias('','http://www.myorg.com/pred/')));

-- Assign sensitivity label for the predicate to be inferred. --
-- The predicate label may be set globally or it can be assigned to --
-- one of the models used to infer the data, for example CONTRACTS. --
begin
  sem_rdfsa.set_predicate_label(
    model_name   => 'rdf$global',
    predicate    => 'http://www.myorg.com/pred/isLedBy',
    label_string => 'TS:US_SPCL');
end;
/

-- Create index with a specific label generator. --
begin
  sem_apis.create_entailment(
    index_name_in => 'contracts_inf',
    models_in     => SDO_RDF_Models('contracts'),
    rulebases_in  => SDO_RDF_Rulebases('contracts_rb'),
    options       => 'USER_RULES=T',
    label_gen     => sem_rdfsa.LABELGEN_PREDICATE);
end;
/

-- Check for any label exceptions and update them accordingly. --
update rdfi_contracts_inf set ctxt1 = 1100 where ctxt1 = -1;

-- The new entailment is now ready for use in SEM_MATCH queries. --
The following example shows the steps to overcome long execution time when creating entailments with the OWL2RL rulebase.
ALTER SESSION SET "_OPTIMIZER_GENERATE_TRANSITIVE_PRED"=FALSE;
EXECUTE SEM_APIS.CREATE_ENTAILMENT ('m1_inf',SEM_MODELS('m1'),SEM_RULEBASES('OWL2RL'),NULL,NULL, 'RAW8=T,DOP=8,HINTS=[rule:SCM-CLS,use_hash(m1),rule:SCM-OP-DP,use_hash(m1)],PROCSVF=F,PROCAVF=F,PROCSCMHV=F,PROCSVFH=F,PROCAVFH=F,PROCDOM=F,PROCRAN=F' );