15.32 SEM_APIS.CREATE_ENTAILMENT

Format

SEM_APIS.CREATE_ENTAILMENT(
     index_name_in      IN VARCHAR2, 
     models_in          IN SEM_MODELS, 
     rulebases_in       IN SEM_RULEBASES, 
     passes             IN NUMBER DEFAULT SEM_APIS.REACH_CLOSURE, 
     inf_components_in  IN VARCHAR2 DEFAULT NULL, 
     options            IN VARCHAR2 DEFAULT NULL, 
     delta_in           IN SEM_MODELS DEFAULT NULL, 
     label_gen          IN RDFSA_LABELGEN DEFAULT NULL,      
     include_named_g    IN SEM_GRAPHS DEFAULT NULL,      
     include_default_g  IN SEM_MODELS DEFAULT NULL,      
     include_all_g      IN SEM_MODELS DEFAULT NULL,      
     inf_ng_name        IN VARCHAR2 DEFAULT NULL,      
     inf_ext_user_func_name IN VARCHAR2 DEFAULT NULL,
     ols_ladder_inf_lbl_sec IN VARCHAR2 DEFAULT NULL,
     network_owner    IN VARCHAR2 DEFAULT NULL,
     network_name     IN VARCHAR2 DEFAULT NULL);

Note:

This subprogram will be deprecated in a future release. It is recommended that you use the SEM_APIS.CREATE_INFERRED_GRAPH subprogram instead.

Description

Creates an entailment (rules index) that can be used to perform OWL or RDFS inferencing, and optionally use user-defined rules.

Parameters

index_name_in

Name of the entailment to be created.

models_in

One or more model names. Its data type is SEM_MODELS, which has the following definition: TABLE OF VARCHAR2(25)

rulebases_in

One or more rulebase names. Its data type is SEM_RULEBASES, which has the following definition: TABLE OF VARCHAR2(25). Rules and rulebases are explained in Inferencing: Rules and Rulebases.

passes

The number of rounds that the inference engine should run. The default value is SEM_APIS.REACH_CLOSURE, which means the inference engine will run till a closure is reached. If the number of rounds specified is less than the number of actual rounds needed to reach a closure, the status of the entailment will then be set to INCOMPLETE.

inf_components_in

A comma-delimited string of keywords representing inference components, for performing selective or component-based inferencing. If this parameter is null, the default set of inference components is used. See the Usage Notes for more information about inference components.

options

A comma-delimited string of options to control the inference process by overriding the default inference behavior. To enable an option, specify option-name=T; to disable an option, you can specify option-name=F (the default). The available option-name values are COL_COMPRESS, DEST_MODEL, DISTANCE,DOP, ENTAIL_ANYWAY, HASH_PART, INC, LOCAL_NG_INF, OPT_SAMEAS, RAW8, PROOF, and USER_RULES. See the Usage Notes for explanations of each value.

delta_in

If incremental inference is in effect, specifies one or more models on which to perform incremental inference. Its data type is SEM_MODELS, which has the following definition: TABLE OF VARCHAR2(25)

The triples in the first model in delta_in are copied to the first model in models_in, and the entailment (rules index) in rules_index_in is updated; then the triples in the second model (if any) in delta_in are copied to the second model (if any) in models_in, and the entailment in rules_index_in is updated; and so on until all triples are copied and the entailment is updated. (The delta_in parameter has no effect if incremental inference is not enabled for the entailment.)

label_gen

An instance of RDFSA_LABELGEN or a subtype of it, defining the logic for generating Oracle Label Security (OLS) labels for inferred triples. What you specify for this parameter depends on whether you use the default label generator or a custom label generator:

  • If you use the default label generator, specify one of the following constants: SEM_RDFSA.LABELGEN_RULE for Use Rule Label, SEM_RDFSA.LABELGEN_SUBJECT for Use Subject Label, SEM_RDFSA.LABELGEN_PREDICATE for Use Predicate Label, SEM_RDFSA.LABELGEN_OBJECT for Use Object Label, SEM_RDFSA.LABELGEN_DOMINATING for Use Dominating Label, SEM_RDFSA.LABELGEN_ANTECED for Use Antecedent Labels.

  • If you use a custom label generator, specify the custom label generator type.

include_named_g

Causes all triples from the specified named graphs (across all source models) to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_named_g => sem_graphs('<urn:G1>','<urn:G2>') implies that triples from named graphs G1 and G2 will be included in NGGI.

Its data type is SEM_GRAPHS, which has the following definition: TABLE OF VARCHAR2(4000).

include_default_g

Causes all triples with a null graph name in the specified models to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_default_g => sem_models('m1') causes all triples with a null graph name from model M1 to be included in NGGI.

include_all_g

Causes all triples, regardless of their graph name values, in the specified models to participate in named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)). For example, include_all_g => sem_models('m2')causes all triples in model M2 to be included in NGGI.

inf_ng_name

Assigns the specified graph name to all the new triples inferred by the named graph based global inference (NGGI, explained in Named Graph Based Global Inference (NGGI)).

inf_ext_user_func_name

The name of a user-defined inference function, or a comma-delimited list of names of user-defined functions. For information about creating user-defined inference functions, including format requirements and options for certain parameters, see API Support for User-Defined Inferencing. (For information about user-defined inferencing, including examples, see User-Defined Inferencing and Querying.)

ols_ladder_inf_lbl_sec

network_owner

Owner of the semantic network. (See Table 1-2.)

network_name

Name of the semantic network. (See Table 1-2.)

Usage Notes

For the inf_components_in parameter, you can specify any combination of the following keywords: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, MBRH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, SAM, CHAIN, HASKEY, ONEOF, INTERSECT, INTERSECTSCOH, MBRLST, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION, RDFP1, RDFP2, RDFP3, RDFP4, RDFP6, RDFP7, RDFP8AX, RDFP8BX, RDFP9, RDFP10, RDFP11, RDFP12A, RDFP12B, RDFP12C, RDFP13A, RDFP13B, RDFP13C, RDFP14A, RDFP14BX, RDFP15, RDFP16, RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, RDFS13. For an explanation of the meaning of these keywords, see Table 15-1, where the keywords are listed in alphabetical order.

The default set of inference components for the OWLPrime vocabulary includes the following: SCOH, COMPH, DISJH, SYMMH, INVH, SPIH, SPOH, DOMH, RANH, EQCH, EQPH, FPH, IFPH, SAMH, DOM, RAN, SCO, DISJ, COMP, INV, SPO, FP, IFP, SYMM, TRANS, DIF, RDFP14A, RDFP14BX, RDFP15, RDFP16. However, note the following:

  • Component SAM is not in this default OWLPrime list, because it tends to generate many new triples for some ontologies.

  • Effective with Release 11.2, the native OWL inference engine supports the following new inference components: CHAIN, HASKEY, INTERSECT, INTERSECTSCOH, MBRLST, ONEOF, PROPDISJH, SKOSAXIOMS, SNOMED, SVFH, THINGH, THINGSAM, UNION. However, for backward compatibility, the OWLPrime rulebase and any existing rulebases do not include these new components by default; instead, to use these new inference components, you must specify them explicitly, and they are included in Table 15-1 The following example creates an OWLPrime entailment for two OWL ontologies named LUBM and UNIV. Because of the additional inference components specified, this entailment will include the new semantics introduced in those inference components.

    EXECUTE sem_apis.create_entailment('lubm1000_idx',sem_models('lubm','univ'),
        sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, 
        'INTERSECT,INTERSECTSCOH,SVFH,THINGH,THINGSAM,UNION');

Table 15-1 Inferencing Keywords for inf_components_in Parameter

Keyword Explanation

CHAIN

Captures the property chain semantics defined in OWL 2. Only chains of length 2 are supported. By default, this is included in the SKOSCORE rulebase. Subproperty chaining is an OWL 2 feature, and for backward compatibility this component is not by default included in the OWLPrime rulebase. (For information about property chain handling, see Property Chain Handling.) (New as of Release 11.2.)

COMPH

Performs inference based on owl:complementOf assertions and the interaction of owl:complementOf with other language constructs.

DIF

Generates owl:differentFrom assertions based on the symmetricity of owl:differentFrom.

DISJ

Infers owl:differentFrom relationships at instance level using owl:disjointWith assertions.

DISJH

Performs inference based on owl:disjointWith assertions and their interactions with other language constructs.

DOM

Performs inference based on RDFS2.

DOMH

Performs inference based on rdfs:domain assertions and their interactions with other language constructs.

EQCH

Performs inference that are relevant to owl:equivalentClass.

EQPH

Performs inference that are relevant to owl:equivalentProperty.

FP

Performs instance-level inference using instances of owl:FunctionalProperty.

FPH

Performs inference using instances of owl:FunctionalProperty.

HASKEY

Covers the semantics behind "keys" defined in OWL 2. In OWL 2, a collection of properties can be treated as a key to a class expression. For efficiency, the size of the collection must not exceed 3. (New as of Release 11.2.)

IFP

Performs instance-level inference using instances of owl:InverseFunctionalProperty.

IFPH

Performs inference using instances of owl:InverseFunctionalProperty.

INTERSECT

Handles the core semantics of owl:intersectionOf. For example, if class C is the intersection of classes C1, C2 and C3, then C is a subclass of C1, C2, and C3. In addition, common instances of all C1, C2, and C3 are also instances of C. (New as of Release 11.2.)

INTERSECTSCOH

Handles the fact that an intersection is the maximal common subset. For example, if class C is the intersection of classes C1, C2, and C3, then any common subclass of all C1, C2, and C3 is a subclass of C. (New as of Release 11.2.)

INV

Performs instance-level inference using owl:inverseOf assertions.

INVH

Performs inference based on owl:inverseOf assertions and their interactions with other language constructs.

MBRLST

Captures the semantics that for any resource, every item in the list given as the value of the skos:memberList property is also a value of the skos:member property. (See S36 in the SKOS detailed specification.) By default, this is included in the SKOSCORE rulebase. (New as of Release 11.2.)

ONEOF

Generates classification assertions based on the definition of the enumeration classes. In OWL, class extensions can be enumerated explicitly with the owl:oneOf constructor. (New as of Release 11.2.)

PROPDISJH

Captures the interaction between owl:propertyDisjointWith and rdfs:subPropertyOf. By default, this is included in SKOSCORE rulebase. propertyDisjointWith is an OWL 2 feature, and for backward compatibility this component is not by default included in the OWLPrime rulebase. (New as of Release 11.2.)

RANH

Performs inference based on rdfs:range assertions and their interactions with other language constructs.

RDFP*

(The rules corresponding to components with a prefix of RDFP can be found in Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary, by H.J. Horst.)

RDFS2, ... RDFS13

RDFS2, RDFS3, RDFS4a, RDFS4b, RDFS5, RDFS6, RDFS7, RDFS8, RDFS9, RDFS10, RDFS11, RDFS12, and RDFS13 are described in Section 7.3 of RDF Semantics (http://www.w3.org/TR/rdf-mt/). Note that many of the RDFS components are not relevant for OWL inference.

SAM

Performs inference about individuals based on existing assertions for those individuals and owl:sameAs.

SAMH

Infers owl:sameAs assertions using transitivity and symmetricity of owl:sameAs.

SCO

Performs inference based on RDFS9.

SCOH

Generates the subClassOf hierarchy based on existing rdfs:subClassOf assertions. Basically, C1 rdfs:subClassOf C2 and C2 rdfs:subClassOf C3 will infer C1 rdfs:subClassOf C3 based on transitivity. SCOH is also an alias of RDFS11.

SKOSAXIOMS

Captures most of the axioms defined in the SKOS detailed specification. By default, this is included in the SKOSCORE rulebase. (New as of Release 11.2.)

SNOMED

Performs inference based on the semantics of the OWL 2 EL profile, which captures the expressiveness of SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms), which is one of the most expressive and complex medical terminologies. (New as of Release 11.2.)

SPIH

Performs inference based on interactions between rdfs:subPropertyOf and owl:inverseOf assertions.

SPO

Performs inference based on RDFS7.

SPOH

Generates rdfs:subPropertyOf hierarchy based on transitivity of rdfs:subPropertyOf. It is an alias of RDFS5.

SVFH

Handles the following semantics that involves the interaction between owl:someValuesFrom and rdfs:subClassOf. Consider two existential restriction classes C1 and C2 that both use the same restriction property. Assume further that the owl:someValuesFrom constraint class for C1 is a subclass of that for C2. Then C1 can be inferred as a subclass of C2. (New as of Release 11.2.)

SYMM

Performs instance-level inference using instances of owl:SymmetricProperty.

SYMH

Performs inference for properties of type owl:SymmetricProperty.

THINGH

Handles the semantics that any defined OWL class is a subclass of owl:Thing. The consequence of this rule is that instances of all defined OWL classes will become instances of owl:Thing. The size of the inferred graph will very likely be bigger with this component selected. (New as of Release 11.2.)

THINGSAM

Handles the semantics that instances of owl:Thing are equal to (owl:sameAs) themselves. This component is provided for the convenience of some applications. Note that an application does not have to select this inference component to figure out an individual is equal to itself; this kind of information can easily be built in the application logic. (New as of Release 11.2.)

TRANS

Calculates transitive closure for instances of owl:TransitiveProperty.

UNION

Captures the core semantics of the owl:unionOf construct. Basically, the union class is a superclass of all member classes. For backward compatibility, this component is not by default included in the OWLPrime rulebase. (New as of Release 11.2.)

To deselect a component, use the component name followed by a minus (-) sign. For example, SCOH- deselects inference of the subClassOf hierarchy.

For the options parameter, you can enable the following options to override the default inferencing behavior:

  • COL_COMPRESS=T creates temporary, intermediate working tables. This option can reduce the space required for such tables, and can improve the performance of the CREATE_ENTAILMENT operation with large data sets.

    By default COL_COMPRESS=T uses the "compress for query level low" setting; however, you can add CPQH=T to change to the "compress for query level high" setting.

    Note:

    You can specify COL_COMPRESS=T only on systems that support Hybrid Columnar Compression (HCC). For information about HCC, see Oracle Database Concepts.

  • DEST_MODEL=<model_name> specifies, for incremental inference, the destination model to which the delta_in model or models are to be added. The specified destination model must be one of the models specified in the models_in parameter.

  • DISTANCE=T generates ancillary distance information that is useful for semantic operators.

  • DOP=n specifies the degree of parallelism for parallel inference, which can improve inference performance. For information about parallel inference, see Using Parallel Inference.

  • ENTAIL_ANYWAY=T forces OWL inferencing to proceed and reuse existing inferred data (entailment) when the entailment has a valid status. By default, SEM_APIS.CREATE_ENTAILMENT quits immediately if there is already a valid entailment for the combination of models and rulebases.

  • HASH_PART=n creates the specified number of hash partitions for internal working tables. (The number must be a power of 2: 2, 4, 8, 16, 32, and so on.) You may want to specify a value if there are many distinct predicates in the semantic data model. In Oracle internal testing on benchmark ontologies, HASH_PART=32 worked well.

  • INC=T enables incremental inference for the entailment. For information about incremental inference, see Performing Incremental Inference.

  • LOCAL_NG_INF=T causes named graph based local inference (NGLI) to be used instead of named graph based global inference (NGGI). For information about NGLI, see Named Graph Based Local Inference (NGLI).

  • MODEL_PARTITIONS=n overrides the default number of subpartitions in a composite partitioned semantic network and creates the specified number (n) of subpartitions in the final entailment partition in RDF_LINK$.

  • OPT_SAMEAS=T uses consolidated owl:sameAs entailment for the entailment. If you specify this option, you cannot specify PROOF=T. For information about optimizing owl:sameAs inference, see Optimizing owl:sameAs Inference.

  • RAW8=T uses RAW8 data types for the auxiliary inference tables. This option can improve entailment performance by up to 30% in some cases.

  • PROOF=T generates proof for inferred triples. Do not specify this option unless you need to; it slows inference performance because it causes more data to be generated. If you specify this option, you cannot specify OPT_SAMEAS=T.

  • USER_RULES=T causes any user-defined rules to be applied. If you specify this option, you cannot specify PROOF=T or DISTANCE=T, and you must accept the default value for the passes parameter.

For the delta_in parameter, inference performance is best if the value is small compared to the overall size of those models. In a typical scenario, the best results might be achieved when the delta contains fewer than 10,000 triples; however, some tests have shown significant inference performance improvements with deltas as large as 100,000 triples.

For the label_gen parameter, if you want to use the default OLS label generator, specify the appropriate SEM_RDFSA package constant value fromTable 15-2.

Table 15-2 SEM_RDFSA Package Constants for label_gen Parameter

Constant Description

SEM_RDFSA.LABELGEN_SUBJECT

Label generator that applies the label associated with the inferred triple's subject as the triple's label.

SEM_RDFSA.LABELGEN_PREDICATE

Label generator that applies the label associated with the inferred triple's subject as the triple's label.

SEM_RDFSA.LABELGEN_OBJECT

Label generator that applies the label associated with the inferred triple's subject as the triple's label.

SEM_RDFSA.LABELGEN_RULE

Label generator that applies the label associated with the rule that directly produced the inferred triple as the triple's label. If you specify this option, you must also specify PROOF=T in the options parameter.

SEM_RDFSA.LABELGEN_DOMINATING

Label generator that computes a dominating label of all the available labels for the triple's components (subject, predicate, object, and rule), and applies it as the label for the inferred triple.

Fine-Grained Access Control (OLS) Considerations

When fine-grained access control is enabled for the entire network using OLS, only a user with FULL access privileges to the associated policy may create an entailment. When OLS is enabled, full access privileges to the OLS policy are granted using the SA_USER_ADMIN.SET_USER_PRIVS procedure.

Inferred triples accessed through generated labels might not be same as conceptual triples inferred directly from the user accessible triples and rules. The labels generated using a subset of triple components may be weaker than intended. For example, one of the antecedents for the inferred triple may have a higher label than any of the components of the triple. When the label is generated based on just the triple components, end users with no access to one of the antecedents may still have access to the inferred triple. Even when the antecedents are used for custom label generation, the generated label may be stronger than intended. The inference process is not exhaustive, and information pertaining to any alternate ways of inferring the same triple is not available. So, the label generated using a given set of antecedents may be too strong, because the user with access to all the triples in the alternate path could infer the triple with lower access.

Even when generating a label that dominates all its components and antecedents, the label may not be precise. This is the case when labels considered for dominating relationship have non-overlapping group information. For example, consider two labels L:C:NY and L:C:NH where L is a level, C is a component and NY and NH are two groups. A simple label that dominates these two labels is L:C:NY,NH, and a true supremum for the two labels is L:C:US, where US is parent group for both NY and NH. Unfortunately, neither of these two dominating labels is precise for the triple inferred from the triples with first two labels. If L:C:NY,NH is used for the inferred triple, a user with membership in either of these groups has access to the inferred triple, whereas the same user does not have access to one of its antecedents. On the other hand, if L:C:US is used for the inferred triple, a user with membership in both the groups and not in the US group will not be able to access the inferred triple, whereas that user could infer the triple by directly accessing its components and antecedents.

Because of these unique challenges with inferred triples, extra caution must be taken when choosing or implementing the label generator.

See also the OLS example in the Examples section.

For information about semantic network types and options, see RDF Networks.

Note:

If the SEM_APIS.CREATE_ENTAILMENT procedure with OWL2RL reasoning takes a long time to execute , then the create entailment procedure needs to be executed with options as shown for the OWL2RL rulebase example in the Examples section.

Examples

The following example creates an entailment named OWLTST_IDX using the OWLPrime rulebase, and it causes proof to be generated for inferred triples.

EXECUTE sem_apis.create_entailment('owltst_idx', sem_models('owltst'), sem_rulebases('OWLPRIME'), SEM_APIS.REACH_CLOSURE, null, 'PROOF=T');

The following example assumes an OLS environment. It creates a rulebase with a rule, and it creates an entailment.

-- Create an entailment with a rule. -- 
exec sdo_rdf_inference.create_entailment('contracts_rb');
 
insert into rdfr_contracts_rb values (
  'projectLedBy', '(?x :drivenBy ?y) (?y :hasVP ?z)',  NULL,
  '(?x :isLedBy ?z)',
  SDO_RDF_Aliases(SDO_RDF_Alias('','http://www.myorg.com/pred/')));
 
-- Assign sensitivity label for the predicate to be inferred. -- 
-- Yhe predicate label may be set globally or it can be assign to --
-- the one or the models used to infer the data – e.g: CONTRACTS. 
begin
  sem_rdfsa.set_predicate_label(
         model_name   => 'rdf$global',
         predicate    => 'http://www.myorg.com/pred/isLedBy',
         label_string => 'TS:US_SPCL');
end;
/
 
-- Create index with a specific label generator. -- 
begin
  sem_apis.create_entailment(
         index_name_in  => 'contracts_inf',
         models_in      => SDO_RDF_Models('contracts'),
         rulebases_in   => SDO_RDF_Rulebases('contracts_rb'),
         options        => 'USER_RULES=T',
         label_gen      => sem_rdfsa.LABELGEN_PREDICATE);
end;
/
 
-- Check for any label exceptions and update them accordingly. -- 
update rdfi_contracts_inf set ctxt1 = 1100 where ctxt1 = -1;
 
-- The new entailment is now ready for use in SEM_MATCH queries. --

The following example shows the steps to overcome long execution time when creating entailments with OWL2RL rulebase.

ALTER SESSION SET "_OPTIMIZER_GENERATE_TRANSITIVE_PRED"=FALSE;
EXECUTE SEM_APIS.CREATE_ENTAILMENT
  ('m1_inf',SEM_MODELS('m1'),SEM_RULEBASES('OWL2RL'),NULL,NULL,
   'RAW8=T,DOP=8,HINTS=[rule:SCM-CLS,use_hash(m1),rule:SCM-OP-DP,use_hash(m1)],PROCSVF=F,PROCAVF=F,PROCSCMHV=F,PROCSVFH=F,PROCAVFH=F,PROCDOM=F,PROCRAN=F'
  );