1 Oracle Semantic Technologies Overview

This chapter describes the support for semantic technologies, specifically Resource Description Framework (RDF) and a subset of the Web Ontology Language (OWL). It assumes that you are familiar with the major concepts associated with RDF and OWL, such as {subject, predicate, object} triples, URIs, blank nodes, plain and typed literals, and ontologies. This chapter does not explain these concepts in detail, but focuses instead on how the concepts are implemented in Oracle.

The PL/SQL subprograms for working with semantic data are in the SEM_APIS package, which is documented in Chapter 3.

The RDF and OWL support are features of Oracle Spatial, which must be installed for these features to be used. However, the use of RDF and OWL is not restricted to spatial data.

This chapter contains the following major sections:

For information about OWL concepts and the Oracle Database support for OWL capabilities, see Chapter 2.

Required Script:

Before performing any operations described in this guide, you must run the $ORACLE_HOME/md/admin/catsem10i.sql or $ORACLE_HOME/md/admin/catsem11i.sql script, as explained in Section 1.11.

Name Changes:

Effective with Oracle Database Release 11.1, the names of many software objects (PL/SQL packages, functions and procedures, system tables and views, and so on) have been changed. In most cases, the change involves replacing the string RDF with SEM. However, existing applications using valid names from a previous release will continue to work. For more information, see Section 1.14.

Spatial and Partitioning:

Oracle Spatial must be installed before you can use any of the RDF and OWL capabilities. Partitioning must also be enabled, as explained in Section 1.12.

1.1 Introduction to Oracle Semantic Technologies

Oracle Database enables you to store semantic data and ontologies, to query semantic data and to perform ontology-assisted query of enterprise relational data, and to use supplied or user-defined inferencing to expand the power of querying on semantic data. Figure 1-1 shows how these capabilities interact.

Figure 1-1 Oracle Semantic Capabilities

Description of Figure 1-1 follows
Description of "Figure 1-1 Oracle Semantic Capabilities"

As shown in Figure 1-1, the database contains semantic data and ontologies (RDF/OWL models), as well as traditional relational data. To load semantic data, bulk loading is the most efficient approach, although you can load data incrementally using transactional INSERT statements.

Note:

If you want to use existing semantic data from a release before Oracle Database 11.1, the data must be upgraded as described in Section 1.11.

You can query semantic data and ontologies, and you can also perform ontology-assisted queries of semantic and traditional relational data to find semantic relationships. To perform ontology-assisted queries, use the SEM_RELATED operator, which is described in Section 2.3.

You can expand the power of queries on semantic data by using inferencing, which uses rules in rulebases. Inferencing enables you to make logical deductions based on the data and the rules. For information about using rules and rulebases for inferencing, see Section 1.3.6.

1.2 Semantic Data Modeling

In addition to its formal semantics, semantic data has a simple data structure that is effectively modeled using a directed graph. The metadata statements are represented as triples: nodes are used to represent two parts of the triple, and the third part is represented by a directed link that describes the relationship between the nodes. The triples are stored in a semantic data network. In addition, information is maintained about specific semantic data models created by database users. A user-created model has a model name, and refers to triples stored in a specified table column.

Statements are expressed in triples: {subject or resource, predicate or property, object or value}. In this manual, {subject, property, object} is used to describe a triple, and the terms statement and triple may sometimes be used interchangeably. Each triple is a complete and unique fact about a specific domain, and can be represented by a link in a directed graph.

1.3 Semantic Data in the Database

There is one universe for all semantic data stored in the database. All triples are parsed and stored in the system as entries in tables under the MDSYS schema. A triple {subject, property, object} is treated as one database object. As a result, a single document containing multiple triples results in multiple database objects.

All the subjects and objects of triples are mapped to nodes in a semantic data network, and properties are mapped to network links that have their start node and end node as subject and object, respectively. The possible node types are blank nodes, URIs, plain literals, and typed literals.

The following requirements apply to the specifications of URIs and the storage of semantic data in the database:

  • A subject must be a URI or a blank node.

  • A property must be a URI.

  • An object can be any type, such as a URI, a blank node, or a literal. (However, null values and null strings are not supported.)

1.3.1 Metadata for Models

The MDSYS.SEM_MODEL$ view contains information about all models defined in the database. When you create a model using the SEM_APIS.CREATE_SEM_MODEL procedure, you specify a name for the model, as well as a table and column to hold references to the semantic data, and the system automatically generates a model ID.

Oracle maintains the MDSYS.SEM_MODEL$ view automatically when you create and drop models. Users should never modify this view directly. For example, do not use SQL INSERT, UPDATE, or DELETE statements with this view.

The MDSYS.SEM_MODEL$ view contains the columns shown in Table 1-1.

Table 1-1 MDSYS.SEM_MODEL$ View Columns

Column Name Data Type Description

OWNER

VARCHAR2(30)

Schema of the owner of the model.

MODEL_ID

NUMBER

Unique model ID number, automatically generated.

MODEL_NAME

VARCHAR2(25)

Name of the model.

TABLE_NAME

VARCHAR2(30)

Name of the table to hold references to semantic data for the model.

COLUMN_NAME

VARCHAR2(30)

Name of the column of type SDO_RDF_TRIPLE_S in the table to hold references to semantic data for the model.

MODEL_TABLESPACE_NAME

VARCHAR2(30)

Name of the tablespace to be used for storing the triples for this model.


When you create a model, a view for the triples associated with the model is also created under the MDSYS schema. This view has a name in the format RDFM_model-name, and it is visible only to the owner of the model and to users with suitable privileges. Each MDSYS.SEMM_model-name view contains a row for each triple (stored as a link in a network), and it has the columns shown in Table 1-2.

Table 1-2 MDSYS.SEMM_model-name View Columns

Column Name Data Type Description

P_VALUE_ID

NUMBER

The VALUE_ID for the text value of the predicate of the triple. Part of the primary key.

START_NODE_ID

NUMBER

The VALUE_ID for the text value of the subject of the triple. Also part of the primary key.

CANON_END_NODE_ID

NUMBER

The VALUE_ID for the text value of the canonical form of the object of the triple. Also part of the primary key.

END_NODE_ID

NUMBER

The VALUE_ID for the text value of the object of the triple

MODEL_ID

NUMBER

The ID for the RDF graph to which the triple belongs. It logically partitions the table by RDF graphs.

COST

NUMBER

(Reserved for future use)

CTXT1

NUMBER

(Reserved for future use)

CTXT2

VARCHAR2(4000)

(Reserved for future use)

DISTANCE

NUMBER

(Reserved for future use)

EXPLAIN

VARCHAR2(4000)

(Reserved for future use)

PATH

VARCHAR2(4000)

(Reserved for future use)

LINK_ID

VARCHAR2(71)

Unique triple identifier value. (It is currently a computed column, and its definition may change in a future release.)


Note:

In Table 1-2, for columns P_VALUE_ID, START_NODE_ID, END_NODE_ID, and CANON_END_NODE_ID, the actual ID values are computed from the corresponding lexical values. However, a lexical value may not always map to the same ID value.

1.3.2 Statements

The MDSYS.RDF_VALUE$ table contains information about the subjects, properties, and objects used to represent RDF statements. It uniquely stores the text values (URIs or literals) for these three pieces of information, using a separate row for each part of each triple.

Oracle maintains the MDSYS.RDF_VALUE$ table automatically. Users should never modify this view directly. For example, do not use SQL INSERT, UPDATE, or DELETE statements with this view.

The RDF_VALUE$ table contains the columns shown in Table 1-3.

Table 1-3 MDSYS.RDF_VALUE$ Table Columns

Column Name Data Type Description

VALUE_ID

NUMBER

Unique value ID number, automatically generated.

VALUE_TYPE

VARCHAR2(10)

The type of text information stored in the VALUE_NAME column. Possible values: UR for URI, BN for blank node, PL for plain literal, PL@ for plain literal with a language tag, PLL for plain long literal, PLL@ for plain long literal with a language tag, TL for typed literal, or TLL for typed long literal. A long literal is a literal with more than 4000 bytes.

VNAME_PREFIX

VARCHAR2(4000)

If the length of the lexical value is 4000 bytes or less, this column stores a prefix of a portion of the lexical value. The SEM_APIS.VALUE_NAME_PREFIX function can be used for prefix computation. For example, the prefix for the portion of the lexical value <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> without the angle brackets is http://www.w3.org/1999/02/22-rdf-syntax-ns#.

VNAME_SUFFIX

VARCHAR2(512)

If the length of the lexical value is 4000 bytes or less, this column stores a suffix of a portion of the lexical value. The SEM_APIS.VALUE_NAME_SUFFIX function can be used for suffix computation. For the lexical value mentioned in the description of the VNAME_PREFIX column, the suffix is type.

LITERAL_TYPE

VARCHAR2(4000)

For typed literals, the type information; otherwise, null. For example, for a row representing a creation date of 1999-08-16, the VALUE_TYPE column can contain TL, and the LITERAL_TYPE column can contain http://www.w3.org/2001/XMLSchema#date.

LANGUAGE_TYPE

VARCHAR2(80)

Language tag (for example, fr for French) for a literal with a language tag (that is, if VALUE_TYPE is PL@ or PLL@). Otherwise, this column has a null value.

CANON_ID

NUMBER

The ID for the canonical lexical value for the current lexical value. (The use of this column may change in a future release.)

COLLISION_EXT

VARCHAR2(64)

Used for collision handling for the lexical value. (The use of this column may change in a future release.)

CANON_COLLISION_EXT

VARCHAR2(64)

Used for collision handling for the canonical lexical value. (The use of this column may change in a future release.)

LONG_VALUE

CLOB

The character string if the length of the lexical value is greater than 4000 bytes. Otherwise, this column has a null value.

VALUE_NAME

VARCHAR2(4000)

This is a computed column. If length of the lexical value is 4000 bytes or less, the value of this column is the concatenation of the values of VNAME_PREFIX column and the VNAME_SUFFIX column.


1.3.2.1 Triple Uniqueness and Data Types for Literals

Duplicate triples are not stored in the database. To check if a triple is a duplicate of an existing triple, the subject, property, and object of the incoming triple are checked against triple values in the specified model. If the incoming subject, property, and object are all URIs, an exact match of their values determines a duplicate. However, if the object of incoming triple is a literal, an exact match of the subject and property, and a value (canonical) match of the object, determine a duplicate. For example, the following two triples are duplicates:

<eg:a> <eg:b> "123"^^http://www.w3.org/2001/XMLSchema#int
<eg:a> <eg:b> "123"^^http://www.w3.org/2001/XMLSchema#unsignedByte

The second triple is treated as a duplicate of the first, because "123"^^http://www.w3.org/2001/XMLSchema#int has an equivalent value (is canonically equivalent) to "123"^^http://www.w3.org/2001/XMLSchema#unsignedByte. Two entities are canonically equivalent if they can be reduced to the same value.

To use a non-RDF example, A*(B-C), A*B-C*A, (B-C)*A, and -A*C+A*B all convert into the same canonical form.

Value-based matching of lexical forms is supported for the following data types:

  • STRING: plain literal, xsd:string and some of its XML Schema subtypes

  • NUMERIC: xsd:decimal and its XML Schema subtypes, xsd:float, and xsd:double. (Support is not provided for float/double INF, -INF, and NaN values.)

  • DATETIME: xsd:datetime, with support for time zone. (Without time zone there are still multiple representations for a single value, for example, "2004-02-18T15:12:54" and "2004-02-18T15:12:54.0000".)

  • DATE: xsd:date, with or without time zone

  • OTHER: Everything else. (No attempt is made to match different representations).

Canonicalization is performed when the time zone is present for literals of type xsd:time and xsd:dateTime.

The following namespace definition is used: xmlns:xsd=”http://www.w3.org/2001/XMLSchema”

The first occurrence of a literal in the RDF_VALUE$ table is taken as the canonical form and given the VALUE_TYPE value of CPL, CPL@, CTL, CPLL, CPLL@, or CTLL as appropriate; that is, a C for canonical is prefixed to the actual value type. If a literal with the same canonical form (but a different lexical representation) as a previously inserted literal is inserted into the RDF_VALUE$ table, the VALUE_TYPE value assigned to the new insert is PL, PL@, TL, PLL, PLL@, or TLL as appropriate.

Canonically equivalent text values having different lexical representations are thus stored in the RDF_VALUE$ table; however, canonically equivalent triples are not stored in the database.

1.3.3 Subjects and Objects

RDF subjects and objects are mapped to nodes in a semantic data network. Subject nodes are the start nodes of links, and object nodes are the end nodes of links. Non-literal nodes (that is, URIs and blank nodes) can be used as both subject and object nodes. Literals can be used only as object nodes.

1.3.4 Blank Nodes

Blank nodes can be used as subject and object nodes in the semantic network. Blank node identifiers are different from URIs in that they are scoped within a semantic model. Thus, although multiple occurrences of the same blank node identifier within a single semantic model necessarily refer to the same resource, occurrences of the same blank node identifier in two different semantic models do not refer to the same resource.

In an Oracle semantic network, this behavior is modeled by requiring that blank nodes are always reused (that is, are used to represent the same resource if the same blank node identifier is used) within a semantic model, and never reused between two different models. Thus, when inserting triples involving blank nodes into a model, you must use the SDO_RDF_TRIPLE_S constructor that supports reuse of blank nodes.

1.3.5 Properties

Properties are mapped to links that have their start node and end node as subjects and objects, respectively. Therefore, a link represents a complete triple.

When a triple is inserted into a model, the subject, property, and object text values are checked to see if they already exist in the database. If they already exist (due to previous statements in other models), no new entries are made; if they do not exist, three new rows are inserted into the RDF_VALUE$ table (described in Section 1.3.2).

1.3.6 Inferencing: Rules and Rulebases

Inferencing is the ability to make logical deductions based on rules. Inferencing enables you to construct queries that perform semantic matching based on meaningful relationships among pieces of data, as opposed to just syntactic matching based on string or other values. Inferencing involves the use of rules, either supplied by Oracle or user-defined, placed in rulebases.

Figure 1-2 shows triple sets being inferred from model data and the application of rules in one or more rulebases. In this illustration, the database can have any number of semantic models, rulebases, and inferred triple sets, and an inferred triple set can be derived using rules in one or more rulebases.

A rule is an object that can be applied to draw inferences from semantic data. A rule is identified by a name and consists of:

  • An IF side pattern for the antecedents

  • An optional filter condition that further restricts the subgraphs matched by the IF side pattern

  • A THEN side pattern for the consequents

For example, the rule that a chairperson of a conference is also a reviewer of the conference could be represented as follows:

('chairpersonRule', -- rule name
 '(?r :ChairPersonOf ?c)', -- IF side pattern
 NULL,  -- filter condition
 '(?r :ReviewerOf ?c)', -- THEN side pattern
 SEM_ALIASES (SEM_ALIAS('', 'http://some.org/test/'))
)

In this case, the rule does not have a filter condition, so that component of the representation is NULL. Note that a THEN side pattern with more than one triple can be used to infer multiple triples for each IF side match.

A rulebase is an object that contains rules. The following Oracle-supplied rulebases are provided:

  • RDFS

  • RDF (a subset of RDFS)

  • OWLSIF (empty)

  • RDFS++ (empty)

  • OWLPRIME (empty)

The RDFS and RDF rulebases are created when you call the SEM_APIS.CREATE_SEM_NETWORK procedure to add RDF support to the database. The RDFS rulebase implements the RDFS entailment rules, as described in the World Wide Web Consortium (W3C) RDF Semantics document at http://www.w3.org/TR/rdf-mt/. The RDF rulebase represents the RDF entailment rules, which are a subset of the RDFS entailment rules. You can see the contents of these rulebases by examining the MDSYS.SEMR_RDFS and MDSYS.SEMR_RDF views.

You can also create user-defined rulebases using the SEM_APIS.CREATE_RULEBASE procedure. User-defined rulebases enable you to provide additional specialized inferencing capabilities.

For each rulebase, a system table is created to hold rules in the rulebase, along with a system view with a name in the format MDSYS.SEMR_rulebase-name (for example, MDSYS.SEMR_FAMILY_RB for a rulebase named FAMILY_RB). You must use this view to insert, delete, and modify rules in the rulebase. Each MDSYS.SEMR_rulebase-name view has the columns shown in Table 1-4.

Table 1-4 MDSYS.SEMR_rulebase-name View Columns

Column Name Data Type Description

RULE_NAME

VARCHAR2(30)

Name of the rule

ANTECEDENTS

VARCHAR2(4000)

IF side pattern for the antecedents

FILTER

VARCHAR2(4000)

Filter condition that further restricts the subgraphs matched by the IF side pattern. Null indicates no filter condition is to be applied.

CONSEQUENTS

VARCHAR2(4000)

THEN side pattern for the consequents

ALIASES

SEM_ALIASES

One or more namespaces to be used. (The SEM_ALIASES data type is described in Section 1.6.)


Information about all rulebases is maintained in the MDSYS.SEM_RULEBASE_INFO view, which has the columns shown in Table 1-5 and one row for each rulebase.

Table 1-5 MDSYS.SEM_RULEBASE_INFO View Columns

Column Name Data Type Description

OWNER

VARCHAR2(30)

Owner of the rulebase

RULEBASE_NAME

VARCHAR2(25)

Name of the rulebase

RULEBASE_VIEW_NAME

VARCHAR2(30)

Name of the view that you must use for any SQL statements that insert, delete, or modify rules in the rulebase

STATUS

VARCHAR2(30)

Contains VALID if the rulebase is valid, INPROGRESS if the rulebase is being created, or FAILED if a system failure occurred during the creation of the rulebase.


Example 1-1 creates a rulebase named family_rb, and then inserts a rule named grandparent_rule into the family_rb rulebase. This rule says that if a person is the parent of a child who is the parent of a child, that person is a grandparent of (that is, has the grandParentOf relationship with respect to) his or her child's child. It also specifies a namespace to be used. (This example is an excerpt from Example 1-14 in Section 1.10.2.)

Example 1-1 Inserting a Rule into a Rulebase

EXECUTE SEM_APIS.CREATE_RULEBASE('family_rb');

INSERT INTO mdsys.semr_family_rb VALUES(
  'grandparent_rule',
  '(?x :parentOf ?y) (?y :parentOf ?z)',
  NULL,
  '(?x :grandParentOf ?z)', 
  SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')));

You can specify one or more rulebases when calling the SEM_MATCH table function (described in Section 1.6), to control the behavior of queries against semantic data. Example 1-2 refers to the family_rb rulebase and to the grandParentOf relationship created in Example 1-1, to find all grandfathers (grandparents who are male) and their grandchildren. (This example is an excerpt from Example 1-14 in Section 1.10.2.)

Example 1-2 Using Rulebases for Inferencing

-- Select all grandfathers and their grandchildren from the family model.
-- Use inferencing from both the RDFS and family_rb rulebases.
SELECT x, y
  FROM TABLE(SEM_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));

For information about support for native OWL inferencing, see Section 2.2.

1.3.7 Rules Indexes

A rules index is an object containing precomputed triples that can be inferred from applying a specified set of rulebases to a specified set of models. If a SEM_MATCH query refers to any rulebases, a rules index must exist for each rulebase-model combination in the query.

To create a rules index, use the SEM_APIS.CREATE_RULES_INDEX procedure. To drop (delete) a rules index, use the SEM_APIS.DROP_RULES_INDEX procedure.

When you create a rules index, a view for the triples associated with the rules index is also created under the MDSYS schema. This view has a name in the format SEMI_rules-index-name, and it is visible only to the owner of the rules index and to users with suitable privileges. Each MDSYS.SEMI_rules-index-name view contains a row for each triple (stored as a link in a network), and it has the same columns as the SEMM_model-name view, which is described in Table 1-2 in Section 1.3.1.

Information about all rules indexes is maintained in the MDSYS.SEM_RULES_INDEX_INFO view, which has the columns shown in Table 1-6 and one row for each rules index.

Table 1-6 MDSYS.SEM_RULES_INDEX_INFO View Columns

Column Name Data Type Description

OWNER

VARCHAR2(30)

Owner of the rules index

INDEX_NAME

VARCHAR2(25)

Name of the rules index

INDEX_VIEW_NAME

VARCHAR2(30)

Name of the view that you must use for any SQL statements that insert, delete, or modify rules in the rules index

STATUS

VARCHAR2(30)

Contains VALID if the rules index is valid, INVALID if the rules index is not valid, INCOMPLETE if the rules index is incomplete (similar to INVALID but requiring less time to re-create), INPROGRESS if the rules index is being created, or FAILED if a system failure occurred during the creation of the rules index.

MODEL_COUNT

NUMBER

Number of models included in the rules index

RULEBASE_COUNT

NUMBER

Number of rulebases included in the rules index


Information about all database objects, such as models and rulebases, related to rules indexes is maintained in the MDSYS.SEM_RULES_INDEX_DATASETS view. This view has the columns shown in Table 1-7 and one row for each unique combination of values of all the columns.

Table 1-7 MDSYS.SEM_RULES_INDEX_DATASETS View Columns

Column Name Data Type Description

INDEX_NAME

VARCHAR2(25)

Name of the rules index

DATA_TYPE

VARCHAR2(8)

Type of data included in the rules index. Examples: MODEL and RULEBASE

DATA_NAME

VARCHAR2(25)

Name of the object of the type in the DATA_TYPE column


Example 1-3 creates a rules index named family_rb_rix_family, using the family model and the RDFS and family_rb rulebases. (This example is an excerpt from Example 1-14 in Section 1.10.2.)

Example 1-3 Creating a Rules Index

BEGIN
  SEM_APIS.CREATE_RULES_INDEX(
    'rdfs_rix_family',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'));
END;
/

1.3.8 Virtual Models

A virtual model is a logical graph that can be used in a SEM_MATCH query. A virtual model is the result of a UNION or UNION ALL operation on one or more models and optionally the corresponding rules index.

Using a virtual model can simplify management of access privileges for semantic data. For example, assume that you have created three semantic models and one rules index based on the three models and the OWLPRIME rulebase. Without a virtual model, you must individually grant and revoke access privileges for each model and the rules index. However, if you create a virtual model that contains the three models and the rules index, you will only need to grant and revoke access privileges for the single virtual model.

Using a virtual model can also facilitate rapid updates to semantic models. For example, assume that virtual model VM1 contains model M1 and rules index R1 (that is, VM1 = M1 UNION ALL R1), and assume that semantic model M1_UPD is a copy of M1 that has been updated with additional triples and that R1_UPD is a rules index created for M1_UPD. Now, to have user queries over VM1 go to the updated model and rules index, you can redefine virtual model VM1 (that is, VM1 = M1_UPD UNION ALL R1_UPD).

To create a virtual model, use the SEM_APIS.CREATE_VIRTUAL_MODEL procedure. To drop (delete) a virtual model, use the SEM_APIS.DROP_VIRTUAL_MODEL procedure. A virtual model is dropped automatically if any of its component models, rulebases, or rules index are dropped.

To query a virtual model, specify the virtual model name in the models parameter of the SEM_MATCH table function, as shown in Example 1-4.

Example 1-4 Querying a Virtual Model

SELECT COUNT(protein)
  FROM TABLE (SEM_MATCH (
    '(?protein rdf:type :Protein) 
     (?protein :citation ?citation) 
     (?citation :author "Bairoch A.")',
    SEM_MODELS('UNIPROT_VM'), 
    NULL, 
    SEM_ALIASES(SEM_ALIAS('', 'http://purl.uniprot.org/core/')),
    NULL, 
    NULL, 
    'ALLOW_DUP=T'));

For information about the SEM_MATCH table function, see Section 1.6, which includes information using certain attributes when querying a virtual model.

When you create a virtual model, an entry is created for it in the MDSYS.SEM_MODEL$ view, which is described in Table 1-1 in Section 1.3.1. However, the values in several of the columns are different for virtual models as opposed to semantic models, as explained in Table 1-8.

Table 1-8 MDSYS.SEM_MODEL$ View Column Explanations for Virtual Models

Column Name Data Type Description

OWNER

VARCHAR2(30)

Schema of the owner of the virtual model.

MODEL_ID

NUMBER

Unique model ID number, automatically generated. Will be a negative number, to indicate that this is a virtual model.

MODEL_NAME

VARCHAR2(25)

Name of the virtual model.

TABLE_NAME

VARCHAR2(30)

Null for a virtual model.

COLUMN_NAME

VARCHAR2(30)

Null for a virtual model.

MODEL_TABLESPACE_NAME

VARCHAR2(30)

Null for a virtual model.


Information about all virtual models is maintained in the MDSYS.SEM_VMODEL_INFO view, which has the columns shown in Table 1-6 and one row for each virtual model.

Table 1-9 MDSYS.SEM_VMODEL_INFO View Columns

Column Name Data Type Description

OWNER

VARCHAR2(30)

Owner of the virtual model

VIRTUAL_MODEL_NAME

VARCHAR2(25)

Name of the virtual model

UNIQUE_VIEW_NAME

VARCHAR2(30)

Name of the view that contains unique triples in the virtual model, or null if the view was not created

DUPLICATE_VIEW_NAME

VARCHAR2(30)

Name of the view that contains duplicate triples (if any) in the virtual model

STATUS

VARCHAR2(30)

Contains VALID if the associated rules index is valid, INVALID if the rules index is not valid, INCOMPLETE if the rules index is incomplete (similar to INVALID but requiring less time to re-create), INPROGRESS if the rules index is being created, FAILED if a system failure occurred during the creation of the rules index, or NORIDX if no rules index is associated with the virtual model.

MODEL_COUNT

NUMBER

Number of models in the virtual model

RULEBASE_COUNT

NUMBER

Number of rulebases used for the virtual model

RULES_INDEX_COUNT

NUMBER

Number of rules indexes in the virtual model


Information about all objects (models, rulebases, and rules index) related to virtual models is maintained in the MDSYS.SEM_VMODEL_DATASETS view. This view has the columns shown in Table 1-7 and one row for each unique combination of values of all the columns.

Table 1-10 MDSYS.SEM_VMODEL_DATASETS View Columns

Column Name Data Type Description

VIRTUAL_MODEL_NAME

VARCHAR2(25)

Name of the virtual model

DATA_TYPE

VARCHAR2(8)

Type of object included in the virtual model. Examples: MODEL for a semantic model, RULEBASE for a rulebase, or RULEIDX for a rules index

DATA_NAME

VARCHAR2(25)

Name of the object of the type in the DATA_TYPE column


1.3.9 Semantic Data Security Considerations

The following database security considerations apply to the use of semantic data:

  • When a model or rules index is created, the owner gets the SELECT privilege with the GRANT option on the associated view. Users that have the SELECT privilege on these views can perform SEM_MATCH queries against the associated model or rules index.

  • When a rulebase is created, the owner gets the SELECT, INSERT, UPDATE, and DELETE privileges on the rulebase, with the GRANT option. Users that have the SELECT privilege on a rulebase can create a rules index that includes the rulebase. The INSERT, UPDATE, and DELETE privileges control which users can modify the rulebase and how they can modify it.

  • To perform data manipulation language (DML) operations on a model, a user must have DML privileges for the corresponding base table.

  • The creator of the base table corresponding to a model can grant privileges to other users.

  • To perform data manipulation language (DML) operations on a rulebase, a user must have the appropriate privileges on the corresponding database view.

  • The creator of a model can grant SELECT privileges on the corresponding database view to other users.

  • A user can query only those models for which that user has SELECT privileges to the corresponding database views.

  • Only the creator of a model or a rulebase can drop it.

1.4 Semantic Metadata Tables and Views

Oracle Database maintains several tables and views in the MDSYS schema to hold metadata related to semantic data. (Some of these tables and views are created by the SEM_APIS.CREATE_SEM_NETWORK procedure, as explained in Section 1.9, and some are created only as needed.) Table 1-11 lists the tables and views in alphabetical order. (In addition, several tables and views are created for Oracle internal use, and these are accessible only by users with DBA privileges.)

Table 1-11 Semantic Metadata Tables and Views

Name Contains Information About Described In

RDF_VALUE$

Subjects, properties, and objects used to represent statements

Section 1.3.2

SEM_MODEL$

All models defined in the database

Section 1.3.1

SEMM_model-name

Triples in the specified model

Section 1.3.1

SEM_RULEBASE_INFO

Rulebases

Section 1.3.6

SEM_RULES_INDEX_DATASETS

Database objects used in rules indexes

Section 1.3.7

SEM_RULES_INDEX_INFO

Rules indexes

Section 1.3.7

SEM_VMODEL_INFO

Virtual models

Section 1.3.8

SEM_VMODEL_DATASETS

Database objects used in virtual models

Section 1.3.8

SEMI_rules-index-name

Triples in the specified rules index

Section 1.3.7

SEMR_rulebase-name

Rules in the specified rulebase

Section 1.3.6

SEMU_virtual-model-name

Unique triples in the virtual model

Section 1.3.8

SEMV_virtual-model-name

Triples in the virtual model

Section 1.3.8


1.5 Semantic Data Types, Constructors, and Methods

The SDO_RDF_TRIPLE object type represents semantic data in triple format, and the SDO_RDF_TRIPLE_S object type (the _S for storage) stores persistent semantic data in the database. The SDO_RDF_TRIPLE_S type has references to the data, because the actual semantic data is stored only in the central RDF schema. This type has methods to retrieve the entire triple or part of the triple.

Note:

Blank nodes are always reused within an RDF model and cannot be reused across models

The SDO_RDF_TRIPLE type is used to display triples, whereas the SDO_RDF_TRIPLE_S type is used to store the triples in database tables.

The SDO_RDF_TRIPLE object type has the following attributes:

SDO_RDF_TRIPLE (
  subject VARCHAR2(4000), 
  property VARCHAR2(4000), 
  object VARCHAR2(10000))

The SDO_RDF_TRIPLE_S object type has the following attributes:

SDO_RDF_TRIPLE_S (
  RDF_C_ID  NUMBER,  -- Canonical object value ID
  SEM_M_ID NUMBER,  -- Model ID 
  RDF_S_ID  NUMBER,  -- Subject value ID
  RDF_P_ID NUMBER, -- Property value ID
  RDF_O_ID  NUMBER)  -- Object value ID

The SDO_RDF_TRIPLE_S type has the following methods that retrieve a triple or a part (subject, property, or object) of a triple:

GET_TRIPLE() RETURNS SDO_RDF_TRIPLE
GET_SUBJECT() RETURNS VARCHAR2
GET_PROPERTY() RETURNS VARCHAR2
GET_OBJECT() RETURNS CLOB

Example 1-5 shows the SDO_RDF_TRIPLE_S methods.

Example 1-5 SDO_RDF_TRIPLE_S Methods

SELECT a.triple.GET_TRIPLE() AS triple
  FROM articles_rdf_data a WHERE a.id = 1;
 
TRIPLE(SUBJECT, PROPERTY, OBJECT)
--------------------------------------------------------------------------------
SDO_RDF_TRIPLE('<http://nature.example.com/Article1>', '<http://purl.org/dc/elem
ents/1.1/title>', '<All about XYZ>')
 
SELECT a.triple.GET_SUBJECT() AS subject
  FROM articles_rdf_data a WHERE a.id = 1;
 
SUBJECT                                                                         
--------------------------------------------------------------------------------
<http://nature.example.com/Article1>                                           
 
SELECT a.triple.GET_PROPERTY() AS property
  FROM articles_rdf_data a WHERE a.id = 1;
 
PROPERTY                                                                        
--------------------------------------------------------------------------------
<http://purl.org/dc/elements/1.1/title>                                         
 
SELECT a.triple.GET_OBJECT() AS object
  FROM articles_rdf_data a WHERE a.id = 1;
 
OBJECT                                                                          
--------------------------------------------------------------------------------
<All about XYZ>

1.5.1 Constructors for Inserting Triples Without Any Blank Nodes

The following constructor formats are available for inserting triples into a model table. The only difference is that in the second format the data type for the object is CLOB, to accommodate very long literals.

SDO_RDF_TRIPLE_S (
  model_name VARCHAR2, -- Model name
  subject    VARCHAR2, -- Subject
  property   VARCHAR2, -- Property
  object     VARCHAR2) -- Object
  RETURN     SELF;

SDO_RDF_TRIPLE_S (
  model_name VARCHAR2, -- Model name
  subject    VARCHAR2, -- Subject
  property   VARCHAR2, -- Property
  object     CLOB) -- Object
  RETURN SELF;

GET_OBJ_VALUE() RETURN VARCHAR2;

Example 1-6 uses the first constructor format to insert a triple.

Example 1-6 SDO_RDF_TRIPLE_S Constructor to Insert a Triple

INSERT INTO articles_rdf_data VALUES (2,
  SDO_RDF_TRIPLE_S ('articles','<http://nature.example.com/Article1>',
    '<http://purl.org/dc/elements/1.1/creator>',
    '"Jane Smith"'));

1.5.2 Constructors for Inserting Triples With or Without Any Blank Nodes

The following constructor formats are available for inserting triples referring to blank nodes into a model table. The only difference is that in the second format the data type for the fourth attribute is CLOB, to accommodate very long literals.

SDO_RDF_TRIPLE_S (
  model_name VARCHAR2, -- Model name
  sub_or_bn  VARCHAR2, -- Subject or blank node
  property   VARCHAR2, -- Property
  obj_or_bn  VARCHAR2, -- Object or blank node
  bn_m_id    NUMBER) -- ID of the model from which to reuse the blank node
  RETURN SELF;

SDO_RDF_TRIPLE_S (
  model_name VARCHAR2, -- Model name
  sub_or_bn  VARCHAR2, -- Subject or blank node
  property   VARCHAR2, -- Property
  object     CLOB, -- Object
  bn_m_id    NUMBER) -- ID of the model from which to reuse the blank node
  RETURN SELF;

If the value of bn_m_id is positive, it must be the same as the model ID of the target model.

Example 1-7 uses the first constructor format to insert a triple that reuses a blank node for the subject.

Example 1-7 SDO_RDF_TRIPLE_S Constructor to Reusing a Blank Node

INSERT INTO nsu_data VALUES (SDO_RDF_TRIPLE_S(
  'nsu', 
  '_:BNSEQN1001A', 
  '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>', 
  '<http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq>', 
  4));

1.6 Using the SEM_MATCH Table Function to Query Semantic Data

To query semantic data, use the SEM_MATCH table function. This function has the following attributes:

SEM_MATCH(
  query        VARCHAR2,
  models       SEM_MODELS,
  rulebases    SEM_RULEBASES,
  aliases      SEM_ALIASES,
  filter       VARCHAR2,
  index_status VARCHAR2,
  options      VARCHAR2
 ) RETURN ANYDATASET;

The query attribute is required. The other attributes are optional (that is, each can be a null value).

The query attribute is a string literal (or concatenation of string literals) with one or more triple patterns, usually containing variables. (The query attribute cannot be a bind variable or an expression involving a bind variable.) A triple pattern is a triple of atoms enclosed in parentheses. Each atom can be a variable (for example, ?x), a qualified name (for example, rdf:type) that is expanded based on the default namespaces and the value of the aliases attribute, or a full URI (for example, <http://www.example.org/family/Male>). In addition, the third atom can be a numeric literal (for example, 3.14), a plain literal (for example, "Herman"), a language-tagged plain literal (for example, "Herman"@en), or a typed literal (for example, "123"^^xsd:int).

For example, the following query attribute specifies three triple patterns to find grandfathers (that is, grandparents who are also male) and the height of each of their grandchildren:

'(?x :grandParentOf ?y) (?x rdf:type :Male) (?y :height ?h)'

The models attribute identifies the model or models to use. Its data type is SEM_MODELS, which has the following definition: TABLE OF VARCHAR2(25). If you are querying a virtual model, specify only the name of the virtual model and no other models. (Virtual models are explained in Section 1.3.8.)

The rulebases attribute identifies one or more rulebases whose rules are to be applied to the query. Its data type is SDO_RDF_RULEBASES, which has the following definition: TABLE OF VARCHAR2(25). If you are querying a virtual model, this attribute must be null.

The aliases attribute identifies one or more namespaces, in addition to the default namespaces, to be used for expansion of qualified names in the query pattern. Its data type is SEM_ALIASES, which has the following definition: TABLE OF SEM_ALIAS, where each SEM_ALIAS element identifies a namespace ID and namespace value. The SEM_ALIAS data type has the following definition: (namespace_id VARCHAR2(30), namespace_val VARCHAR2(4000))

The following default namespaces (namespace_id and namespace_val attributes) are used by the SEM_MATCH table function:

('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#')
('rdfs', 'http://www.w3.org/2000/01/rdf-schema#')
('xsd', 'http://www.w3.org/2001/XMLSchema#')

You can override any of these defaults by specifying the namespace_id value and a different namespace_val value in the aliases attribute.

The filter attribute identifies any additional selection criteria. If this attribute is not null, it should be a string in the form of a WHERE clause without the WHERE keyword. For example: '(h >= 6)' to limit the result to cases where the height of the grandfather's grandchild is 6 or greater (using the example of triple patterns earlier in this section).

The index_status attribute lets you query semantic data even when the relevant rules index does not have a valid status. (If you are querying a virtual model, this attribute refers to the rules index associated with the virtual model.) If this attribute is null, the query returns an error if the rules index does not have a valid status. If this attribute is not null, it must be the string INCOMPLETE or INVALID. For an explanation of query behavior with different index_status values, see Section 1.6.1.

The options attribute identifies options that can affect the results of queries. Options are expressed as keyword-value pairs. The following options are supported:

  • ALLOW_DUP=T generates an underlying SQL statement that performs a "union all" instead of a union of the semantic models and inferred data (if applicable). This option may introduce more rows (duplicate triples) in the result set, and you may need to adjust the application logic accordingly. If you do not specify this option, duplicate triples are automatically removed across all the models and inferred data to maintain the set semantics of merged RDF graphs; however, removing duplicate triples increases query processing time. In general, specifying 'ALLOW_DUP=T' improves performance significantly when multiple semantic models are involved in a SEM_MATCH query.

    If you are querying a virtual model, specifying ALLOW_DUP=T causes the SEMV_vm_name view to be queried; otherwise, the SEMU_vm_name view is queried.

  • HINT0={<hint-string>} (pronounced and written "hint" and the number zero) specifies one or more keywords with hints to influence the execution plan and results of queries. Conceptually, a graph pattern with n triple patterns and referring to m distinct variables results in an (n+m)-way join: n-way self-join of the target RDF model or models and optionally the corresponding rules index, and then m joins with RDF_VALUE$ for looking up the values for the m variables. A hint specification affects the join order and join type used for the query execution.

    The hint specification, <hint-string>, uses keywords, some of which have parameters consisting of a sequence or set of aliases, or references, for individual triple patterns and variables used in the query. Aliases for triple patterns are of the form ti where i refers to the 0-based ordinal numbers of triple patterns in the query. For example, the alias for the first triple pattern in a query is t0, the alias for the second one is t1, and so on. Aliases for the variables used in a query are simply the names of those variables. Thus, ?x will be used in the hint specification as the alias for a variable ?x used in the graph pattern.

    Hints used for influencing query execution plans include LEADING(<sequence of aliases>), USE_NL(<set of aliases>), USE_HASH(<set of aliases>), and INDEX(<alias> <index_name>). Hints used for influencing the results of queries include GET_CANON_VALUE(<set of aliases for variables>), which ensures that the values returned for the referenced variables will be their canonical lexical forms. These hints have the same format and basic meaning as hints in SQL statements, which are explained in Oracle Database SQL Language Reference.

    Example 1-9 shows the HINT0 option used in a SEM_MATCH query.

The SEM_MATCH table function returns an object of type ANYDATASET, with elements that depend on the input variables. In the following explanations, var represents the name of a variable used in the query:

  • For each variable var that may be a literal (that is, for each variable that appears only in the object position in the query pattern), the result elements have the following attributes: var, var$RDFVID, var$_PREFIX, var$_SUFFIX, var$RDFVTYP, var$RDFCLOB, var$RDFLTYP, var$RDFVID, var$_PREFIX, var$_SUFFIX, and var$RDFLANG.

  • For each variable var that cannot take a literal value, the result elements have the following attributes: var and var$RDFVTYP.

In such cases, var has the lexical value bound to the variable, var$RDFVID has the VALUE_ID of the value bound to the variable, var$_PREFIX and var$_SUFFIX are the prefix and suffix of the value bound to the variable, var$RDFVTYP indicates the type of value bound to the variable (URI, LIT [literal], or BLN [blank node]), var$RDFCLOB has the lexical value bound to the variable if the value is a long literal, var$RDFLTYP indicates the type of literal bound if a literal is bound, and var$RDFLANG has the language tag of the bound literal if a literal with language tag is bound. var$RDFCLOB is of type CLOB, while all other attributes are of type VARCHAR2.

For a literal value or a blank node, its prefix is the value itself and its suffix is null. For a URI value, its prefix is the left portion of the value up to and including the rightmost occurrence of any of the three characters / (slash), # (pound), or : (colon), and its and suffix is the remaining portion of the value to the right. For example, the prefix and suffix for the URI value http://www.example.org/family/grandParentOf are http://www.example.org/family/ and grandParentOf, respectively.

Example 1-8 selects all grandfathers (grandparents who are male) and their grandchildren from the family model, using inferencing from both the RDFS and family_rb rulebases. (This example is an excerpt from Example 1-14 in Section 1.10.2.)

Example 1-8 SEM_MATCH Table Function

SELECT x, y
  FROM TABLE(SEM_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));

Example 1-9 is functionally the same as Example 1-8, but it adds the HINT0 option.

Example 1-9 HINT0 Option with SEM_MATCH Table Function

SELECT x, y
  FROM TABLE(SEM_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null,
    null,
    'HINT0={LEADING(t0 t1) USE_NL(?x ?y) GET_CANON_VALUE(?x ?y)}'));

Example 1-10 uses the Pathway/Genome BioPax ontology to get all chemical compound types that belong to both Proteins and Complexes:

Example 1-10 SEM_MATCH Table Function

SELECT t.r 
  FROM TABLE (SEM_MATCH ( 
      '(?r rdfs:subClassOf Proteins)
       (?r rdfs:subClassOf Complexes)',   
      SEM_Models ('BioPax'), 
      SEM_Rulebases ('rdfs'), NULL)) t;

As shown in Example 1-10, the search pattern for the SEM_MATCH table function is specified using SPARQL-like syntax where the variable starts with the question-mark character (?). In this example, the variable ?r must match to the same term, and thus is must be a subclass of both Proteins and Complexes.

To use the SEM_RELATED operator to query an OWL ontology, see Section 2.3.

When you are querying multiple models or querying one or more models and the corresponding rules index, consider using virtual models (explained in Section 1.3.8) because of the potential performance benefits.

1.6.1 Performing Queries with Incomplete or Invalid Rules Indexes

You can query semantic data even when the relevant rules index does not have a valid status if you specify the string value INCOMPLETE or INVALID for the index_status attribute of the SEM_MATCH table function. (The rules index status is stored in the STATUS column of the MDSYS.SEM_RULES_INDEX_INFO view, which is described in Section 1.3.7. The SEM_MATCH table function is described in Section 1.6.)

The index_status attribute value affects the query behavior as follows:

  • If the rules index has a valid status, the query behavior is not affected by the value of the index_status attribute.

  • If you provide no value or specify a null value for index_status, the query returns an error if the rules index does not have a valid status.

  • If you specify the string INCOMPLETE for the index_status attribute, the query is performed if the status of the rules index is incomplete or valid.

  • If you specify the string INVALID for the index_status attribute, the query is performed regardless of the actual status of the rules index (invalid, incomplete, or valid).

However, the following considerations apply if the status of the rules index is incomplete or invalid:

  • If the status is incomplete, the content of a rules index may be approximate, because some triples that are inferable (due to the recent insertions into the underlying models) may not actually be present in the rules index, and therefore results returned by the query may be inaccurate.

  • If the status is invalid, the content of the rules index may be approximate, because some triples that are no longer inferable (due to recent modifications to the underlying models or rulebases, or both) may still be present in the rules index, and this may affect the accuracy of the result returned by the query. In addition to possible presence of triples that are no longer inferable, some inferable rows may not actually be present in the rules index.

1.6.2 Graph Patterns: Support for Curly Brace Syntax and OPTIONAL Keyword

The SEM_MATCH table function accepts the syntax for the graph pattern in which a sequence of triple patterns is enclosed within curly braces. The period is usually required as a separator unless followed by the OPTIONAL keyword. With this syntax, you can also use the OPTIONAL construct to retrieve results even in the case of a partial match.

Example 1-11 is functionally the same as Example 1-8, but it uses the syntax with curly braces and a period to express a graph pattern in the SEM_MATCH table function.

Example 1-11 Curly Brace Syntax

SELECT x, y
  FROM TABLE(SEM_MATCH(
    '{?x :grandParentOf ?y . ?x rdf:type :Male}',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));

Example 1-12 uses the OPTIONAL construct to modify Example 1-11, so that also returns, for each grandfather, the names of the games that he plays or null if he does not play any games.

Example 1-12 Curly Brace Syntax and OPTIONAL Construct

SELECT x, y, game
  FROM TABLE(SEM_MATCH(
    '{?x :grandParentOf ?y . ?x rdf:type :Male . 
      OPTIONAL{?x :plays ?game} 
     }',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null,
    null,
    'HINT0={LEADING(t0 t1) USE_NL(?x ?y)}'));

If you use the syntax with curly braces to express a graph pattern:

  • The query always returns canonical lexical forms for the matching values for the variables.

  • Any hints specified using HINT0={<hint-string>} (explained in Section 1.6), should be constructed only on the basis of the portion of the graph pattern outside the OPTIONAL construct. For example, the only valid aliases for use in a hint specification for the query in Example 1-12 are t0, t1, ?x, and ?y.

1.7 Loading and Exporting Semantic Data

To load semantic data into a model, use one or more of the following options:

  • Bulk load using a SQL*Loader direct-path load to get data from an N-Triple format into a staging table and then use a PL/SQL procedure to load or append data into the semantic data store, as explained in Section 1.7.1.

    This is the fastest option for loading large amounts of data; however, it cannot handle triples containing object values with more than 4000 bytes.

  • Batch load using a Java client interface to load or append data from an N-Triple format file into the semantic data store (see Section 1.7.2).

    This option is slower than bulk loading, but it handles triples containing object values with more than 4000 bytes.

  • Load into tables using SQL INSERT statements that call the SDO_RDF_TRIPLE_S constructor, as explained in Section 1.7.3.

To export semantic data, use the Java API, as described in Section 1.7.4.

1.7.1 Bulk Loading Semantic Data Using a Staging Table

You can load semantic data (and optionally associated non-semantic data) in bulk using a staging table. The data must first be parsed to check for syntax correctness and then loaded into the staging table. Then, you can call the SEM_APIS.BULK_LOAD_FROM_STAGING_TABLE procedure (described in Chapter 3).

The following example shows the format for the staging table, including all required columns and the required names for these columns:

CREATE TABLE stage_table (
                     RDF$STC_sub varchar2(4000) not null,
                     RDF$STC_pred varchar2(4000) not null,
                     RDF$STC_obj varchar2(4000) not null
);

If you also want to load non-semantic data, specify additional columns for the non-semantic data in the CREATE TABLE statement. The non-semantic column names must be different from the names of the required columns. The following example creates the staging table with two additional columns (SOURCE and ID) for non-semantic attributes.

CREATE TABLE stage_table_with_extra_cols (
                     source VARCHAR2(4000),
                     id NUMBER,
                     RDF$STC_sub varchar2(4000) not null,
                     RDF$STC_pred varchar2(4000) not null,
                     RDF$STC_obj varchar2(4000) not null
);

Note:

For either form of the CREATE TABLE statement, you may want to add the COMPRESS clause to use table compression, which will reduce the disk space requirements and may improve bulk-load performance.

You must grant the following privileges to the MDSYS user: SELECT privilege on the staging table, and INSERT privilege on the application table.

You can use the SQL*Loader utility to parse and load semantic data into a staging table. If you installed the demo files from the Oracle Database Examples media (see Oracle Database Examples Installation Guide), a sample control file is available at $ORACLE_HOME/md/demo/network/rdf_demos/bulkload.ctl. You can modify and use this file.

Objects longer than 4000 bytes cannot be loaded. If you use the sample SQL*Loader control file, triples (rows) containing such long values will be automatically rejected and stored in a SQL*Loader "bad" file.

However, triples containing object values longer than 4000 bytes can be loaded using the following approach:

  1. Use the SEM_APIS.BULK_LOAD_FROM_STAGING_TABLE procedure to load all rows that can be stored in the staging table.

  2. Load the remaining rows (that is, the rejected rows when using SQL*Loader with the sample control file) from an N-Triple format file, as described in Section 1.7.2.

1.7.1.1 Recording Event Traces during Bulk Loading

If a table named RDF$ET_TAB exists in the invoker's schema and if the MDSYS user has been granted the INSERT and UPDATE privileges on this table, event traces for some of the tasks performed during executions of the SEM_APIS.BULK_LOAD_FROM_STAGING_TABLE procedure will be added to the table. You may find the content of this table useful if you ever need to report any problems in bulk load. The RDF$ET_TAB table must be created as follows:

CREATE TABLE RDF$ET_TAB (
  proc_sid VARCHAR2(30), 
  proc_sig VARCHAR2(200),
  event_name varchar2(200),
  start_time timestamp,
  end_time timestamp,
  start_comment varchar2(1000) DEFAULT NULL,
  end_comment varchar2(1000) DEFAULT NULL
);
GRANT INSERT, UPDATE on RDF$ET_TAB to MDSYS;

1.7.2 Batch Loading Semantic Data Using the Java API

You can perform a batch load operation using the Java class oracle.spatial.rdf.client.BatchLoader, which is packaged in <ORACLE_HOME>/md/jlib/sdordf.jar. Before performing a batch load operation, ensure that the following are true:

  • The semantic data is in N-Triple format. (Several tools are available for converting RDF/XML to N-Triple format; see the Oracle Technology Network or perform a Web search for information about RDF/XML to N-Triple conversion.)

  • Oracle Database Release 11, with Oracle Spatial, is installed, and partitioning is enabled.

  • A semantic technologies network, an application table, and its corresponding semantic model have been created in the database.

  • The CLASSPATH definition includes ojdbc5.jar.

  • You are using JDK version 1.5 or later. (You can use the Java version packaged under <ORACLE_HOME>/jdk/bin.)

To run the oracle.spatial.rdf.client.BatchLoader class, use a command (on a single command line) in the following general form (replacing the sample example database connection information with your own connection information).

  • Linux systems:

    java -Ddb.user=scott -Ddb.password=password -Ddb.host=127.0.0.1 -Ddb.port=1522 -Ddb.sid=orcl -classpath ${ORACLE_HOME}/md/jlib/sdordf.jar:${ORACLE_HOME}/jdbc/lib/ojdbc5.jar oracle.spatial.rdf.client.BatchLoader <N-TripleFile> <tablename> <tablespaceName> <modelName>
    
  • Windows systems:

    java -Ddb.user=scott -Ddb.password=password -Ddb.host=127.0.0.1 -Ddb.port=1522 -Ddb.sid=orcl -classpath %ORACLE_HOME%\md\jlib\sdordf.jar;%ORACLE_HOME%\jdbc\lib\ojdbc5.jar oracle.spatial.rdf.client.BatchLoader <N-TripleFile> <tablename> <tablespaceName> <modelName>
    

By default, BatchLoader assumes there are at least two columns, a column named ID of type NUMBER and a column named TRIPLE of type SDO_RDF_TRIPLE_S, in the user's application table. However, you can override the default names by using the JVM properties -DidColumn=<idColumnName> and -DtripleColumn=<tripleColumnName>. Note that the ID column is not required; and to prevent BatchLoader from generating a sequence-like identifier in the ID column for each triple inserted, specify the JVM property -DjustTriple=true.

If the application table is not empty and if you want the batch loading to be done in append mode, specify an additional JVM property: -Dappend=true. Moreover, in append mode you might want to choose a different starting value for ID column in user's application table, and to accomplish this you can add the JVM property -DstartID=<startingIntegerValue> to the command line. By default, the ID column starts at 1 and is increased sequentially as new triples are inserted into the application table.

To skip the first n triples in <N-TripleFile>, add the JVM property -Dskip=<numberOfTriplesSkipped> to the command line.

To load an N-Triple file with a character set different from the default, specify the JVM property -Dcharset=<charsetName>. For example, -Dcharset="UTF-8" will recognize UTF-8 encoding. However, for UTF-8 characters to be stored properly in the N-Triple file, the Oracle database must be configured to use a corresponding universal character set, such as AL32UTF8.

The BatchLoader class supports loading an N-Triple file in compressed format. If the <N-TripleFile> has a file extension of .zip or .jar, the file will be uncompressed and loaded at the same time.

1.7.2.1 When to Choose Batch Loading

Batch loading is faster than loading semantic data using INSERT statements (described in Section 1.7.3). However, bulk loading (described in Section 1.7.1) is much faster than batch loading for large amounts of data. Batch loading is typically a good option when the following conditions are true:

  • The data to be loaded is less than a few million triples.

  • The data contains a significant amount long literals (longer than 4000 bytes).

1.7.3 Loading Semantic Data Using INSERT Statements

To load semantic data using INSERT statements, the data should be encoded using < > (angle brackets) for URIs, _: (underscore colon) for blank nodes, and " " (quotation marks) for literals. Spaces are not allowed in URIs or blank nodes. Use the SDO_RDF_TRIPLE_S constructor to insert the data, as described in Section 1.5.1.

Note:

If URIs are not encoded with < > and literals with " ", statements will still be processed. However, the statements will take longer to load, since they will have to be further processed to determine their VALUE_TYPE values.

The following example includes statements with URIs, a blank node (the model_id for nsu is 4), a literal, a literal with a language tag, and a typed literal:

INSERT INTO nsu_data VALUES (SDO_RDF_TRIPLE_S('nsu', '<http://nature.example.com/nsu/rss.rdf>',
  '<http://purl.org/rss/1.0/title>', '"Nature''s Science Update"'));
INSERT INTO nsu_data VALUES (SDO_RDF_TRIPLE_S('nsu', '_:BNSEQN1001A',
  '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>', 
  '<http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq>', 4));
INSERT INTO nsu_data VALUES (SDO_RDF_TRIPLE_S('nsu',
  '<http://nature.example.com/cgi-taf/dynapage.taf?file=/nature/journal/v428/n6978/index.html>',
  '<http://purl.org/dc/elements/1.1/language>', '"English”@en-GB'));
INSERT INTO nature VALUES (SDO_RDF_TRIPLE_S('nsu', '<http://dx.doi.org/10.1038/428004b>',
  '<http://purl.org/dc/elements/1.1/date>', '"2004-03-04"^^xsd:date'));

To convert semantic XML data to INSERT statements, you can edit the sample rss2insert.xsl XSLT file to convert all the features in the semantic data XML file. The blank node constructor is used to insert statements with blank nodes. After editing the XSLT, download the Xalan XSLT processor (http://xml.apache.org/xalan) and follow the installation instructions. To convert a semantic data XML file to INSERT statements using your edited version of the rss2insert.xsl file, use a command in the following format:

java org.apache.xalan.xslt.Process –in input.rdf -xsl rss2insert.xsl –out output.nt

1.7.4 Exporting Semantic Data

To output semantic data to a file in N-Triple format, use the NTripleConverter Java class. The NDM2NTriple(String, int) method exports all the triples stored for the specified model.

For information about using the NTriple converter class, see the README.txt file in the sdordf_converter.zip file, which you can download from the Oracle Technology Network.

1.8 Creating and Managing Semantic Network Indexes

Semantic network indexes are nonunique B-tree indexes that you can add, alter, and drop for use with models and rules indexes in a semantic network. You can use such indexes to tune the performance of SEM_MATCH queries on the models and rules indexes in the network. You can create and manage semantic network indexes using the following subprograms:

All of these subprograms have an index_code parameter, which can contain any sequence of the following letters (without repetition): P, C, S, M, O. These letters used in the index_code correspond to the following columns in the SEMM_* and SEMI_* views: P_VALUE_ID, CANON_END_NODE_ID, START_NODE_ID, MODEL_ID, and END_NODE_ID.

The SEM_APIS.ADD_SEM_INDEX procedure creates a semantic network index that results in creation of a nonunique B-tree index in UNUSABLE status for each of the existing models and rules indexes. The name of the index is RDF_LNK_<index_code>_IDX and the index is owned by MDSYS. This operation is allowed only if the invoker has DBA role. The following example shows creation of the PSCM index with the following key: <P_VALUE_ID, START_NODE_ID, CANON_END_NODE_ID, MODEL_ID>.

EXECUTE SEM_APIS.ADD_SEM_INDEX('PSCM');

After you create a semantic network index, each of the corresponding nonunique B-tree indexes is in the UNUSABLE status, because making it usable can cause significant time and resources to be used, and because subsequent index maintenance operations might involve performance costs that you do not want to incur. You can make a semantic network index usable or unusable for specific models or rules indexes that you own by calling the SEM_APIS.ALTER_SEM_INDEX_ON_MODEL and SEM_APIS.ALTER_SEM_INDEX_ON_RULES_INDEX procedures and specifying 'REBUILD' or 'UNUSABLE' as the command parameter. Thus, you can experiment by making different semantic network indexes usable and unusable, and checking for any differences in performance. For example, the following statement makes the PSCM index usable for the FAMILY model:

EXECUTE SEM_APIS.ALTER_SEM_INDEX_ON_MODEL('FAMILY','PSCM','REBUILD');

Also note the following:

  • Independent of any semantic network indexes that you create, when a semantic network is created, one of the indexes that is automatically created is an index that you can manage by referring to the index_code as 'PSCF' when you call the subprograms mentioned in this section.

  • When you create a new model or a new rules index, a new nonunique B-tree index is created for each of the semantic network indexes, and each such B-tree index is in the USABLE status.

  • Including the MODEL_ID column in a semantic network index key (by including 'M' in the index_code value) may improve query performance. This is particularly relevant when virtual models are used.

1.9 Quick Start for Using Semantic Data

To work with semantic data in an Oracle database, follow these general steps:

  1. Create a tablespace for the system tables. You must be connected as a user with appropriate privileges to create the tablespace. The following example creates a tablespace named RDF_TBLSPACE:

    CREATE TABLESPACE rdf_tblspace
     DATAFILE '/oradata/orcl/rdf_tblspace.dat' SIZE 1024M REUSE
     AUTOEXTEND ON NEXT 256M MAXSIZE UNLIMITED
     SEGMENT SPACE MANAGEMENT AUTO;
    
  2. Create a semantic data network.

    Creating a semantic data network adds semantic data support to an Oracle database. You must create a semantic data network as a user with DBA privileges, specifying a valid tablespace with adequate space. Create the network only once for an Oracle database.

    The following example creates a semantic data network using a tablespace named RDF_TBLSPACE (which must already exist):

    EXECUTE SEM_APIS.CREATE_SEM_NETWORK('rdf_tblspace');
    
  3. Connect as the database user under whose schema you will store your semantic data; do not perform the following steps while connected as SYS, SYSTEM, or MDSYS.

  4. Create a table to store references to the semantic data. (You do not need to be connected as a user with DBA privileges for this step and the remaining steps.)

    This table must contain a column of type SDO_RDF_TRIPLE_S, which will contain references to all data associated with a single model.

    The following example creates a table named ARTICLES_RDF_DATA:

    CREATE TABLE articles_rdf_data (id NUMBER, triple SDO_RDF_TRIPLE_S);
    
  5. Create a model.

    When you create a model, you specify the model name, the table to hold references to semantic data for the model, and the column of type SDO_RDF_TRIPLE_S in that table.

    The following command creates a model named ARTICLES, which will use the table created in the preceding step.

    EXECUTE SEM_APIS.CREATE_SEM_MODEL('articles', 'articles_rdf_data', 'triple');
    
  6. Where possible, create Oracle database indexes on conditions that will be specified in the WHERE clause of SELECT statements, to provide better performance for direct queries against the application table's SDO_RDF_TRIPLE_S column. (These indexes are not relevant if the SEM_MATCH table function is being used.) The following example creates such indexes:

    -- Create indexes on the subjects, properties, and objects
    -- in the ARTICLES_RDF_DATA table.
    CREATE INDEX articles_sub_idx ON articles_rdf_data (triple.GET_SUBJECT());
    CREATE INDEX articles_prop_idx ON articles_rdf_data (triple.GET_PROPERTY());
    CREATE INDEX articles_obj_idx ON articles_rdf_data (TO_CHAR(triple.GET_OBJECT()));
    

After you create the model, you can insert triples into the table, as shown in the examples in Section 1.10.

1.10 Semantic Data Examples

This section contains the following PL/SQL examples:

1.10.1 Example: Journal Article Information

This section presents a simplified PL/SQL example of model for statements about journal articles. Example 1-13 contains descriptive comments, refer to concepts that are explained in this chapter, and uses functions and procedures documented in Chapter 3.

Example 1-13 Using a Model for Journal Article Information

-- Basic steps:
-- After you have connected as a privileged user and called 
-- SEM_APIS.CREATE_RDF_NETWORK to add RDF support,
-- connect as a regular database user and do the following.
-- 1. For each desired model, create a table to hold its data.
-- 2. For each model, create a model (SEM_APIS.CREATE_RDF_MODEL).
-- 3. For each table to hold semantic data, insert data into the table.
-- 4. Use various subprograms and consructors.
 
-- Create the table to hold data for the model.
CREATE TABLE articles_rdf_data (id NUMBER, triple SDO_RDF_TRIPLE_S);
 
-- Create the model.
EXECUTE SEM_APIS.CREATE_RDF_MODEL('articles', 'articles_rdf_data', 'triple');
 
-- Information to be stored about some fictitious articles:
-- Article1, titled "All about XYZ" and written by Jane Smith, refers 
--   to Article2 and Article3.
-- Article2, titled "A review of ABC" and written by Joe Bloggs, 
--   refers to Article3.
-- Seven SQL statements to store the information. In each statement:
-- Each article is referred to by its complete URI The URIs in
--   this example are fictitious.
-- Each property is referred to by the URL for its definition, as 
--   created by the Dublin Core Metadata Initiative.
 
-- Insert rows into the table.
 
-- Article1 has the title "All about XYZ".
INSERT INTO articles_rdf_data VALUES (1,
  SDO_RDF_TRIPLE_S ('articles','http://nature.example.com/Article1',
    'http://purl.org/dc/elements/1.1/title','All about XYZ'));
 
-- Article1 was created (written) by Jane Smith.
INSERT INTO articles_rdf_data VALUES (2,
  SDO_RDF_TRIPLE_S ('articles','http://nature.example.com/Article1',
    'http://purl.org/dc/elements/1.1/creator',
    'Jane Smith'));
 
-- Article1 references (refers to) Article2.
INSERT INTO articles_rdf_data VALUES (3,
  SDO_RDF_TRIPLE_S ('articles',
    'http://nature.example.com/Article1',
    'http://purl.org/dc/terms/references',
    'http://nature.example.com/Article2'));
 
-- Article1 references (refers to) Article3.
INSERT INTO articles_rdf_data VALUES (4,
  SDO_RDF_TRIPLE_S ('articles',
    'http://nature.example.com/Article1',
    'http://purl.org/dc/terms/references',
    'http://nature.example.com/Article3'));
 
-- Article2 has the title "A review of ABC".
INSERT INTO articles_rdf_data VALUES (5,
  SDO_RDF_TRIPLE_S ('articles',
    'http://nature.example.com/Article2',
    'http://purl.org/dc/elements/1.1/title',
    'A review of ABC'));
 
-- Article2 was created (written) by Joe Bloggs.
INSERT INTO articles_rdf_data VALUES (6,
  SDO_RDF_TRIPLE_S ('articles',
    'http://nature.example.com/Article2',
    'http://purl.org/dc/elements/1.1/creator',
    'Joe Bloggs'));
 
-- Article2 references (refers to) Article3.
INSERT INTO articles_rdf_data VALUES (7,
  SDO_RDF_TRIPLE_S ('articles',
    'http://nature.example.com/Article2',
    'http://purl.org/dc/terms/references',
    'http://nature.example.com/Article3'));
 
COMMIT;
 
-- Query semantic data.
 
SELECT SEM_APIS.GET_MODEL_ID('articles') AS model_id FROM DUAL;

SELECT SEM_APIS.GET_TRIPLE_ID(
  'articles',
  'http://nature.example.com/Article2',
  'http://purl.org/dc/terms/references',
  'http://nature.example.com/Article3') AS RDF_triple_id FROM DUAL;
 
SELECT SEM_APIS.IS_TRIPLE(
  'articles',
  'http://nature.example.com/Article2',
  'http://purl.org/dc/terms/references',
  'http://nature.example.com/Article3') AS is_triple FROM DUAL;
 
-- Use SDO_RDF_TRIPLE_S member functions in queries.
 
SELECT a.triple.GET_TRIPLE() AS triple 
  FROM articles_rdf_data a WHERE a.id = 1;
SELECT a.triple.GET_SUBJECT() AS subject 
  FROM articles_rdf_data a WHERE a.id = 1;
SELECT a.triple.GET_PROPERTY() AS property 
  FROM articles_rdf_data a WHERE a.id = 1;
SELECT a.triple.GET_OBJECT() AS object 
  FROM articles_rdf_data a WHERE a.id = 1;

1.10.2 Example: Family Information

This section presents a simplified PL/SQL example of a model for statements about family tree (genealogy) information. Example 1-13 contains descriptive comments, refer to concepts that are explained in this chapter, and uses functions and procedures documented in Chapter 3.

The family relationships in this example reflect the family tree shown in Figure 1-3. This figure also shows some of the information directly stated in the example: Cathy is the sister of Jack, Jack and Tom are male, and Cindy is female.

Figure 1-3 Family Tree for RDF Example

Description of Figure 1-3 follows
Description of "Figure 1-3 Family Tree for RDF Example"

Example 1-14 Using a Model for Family Information

-- Basic steps:
-- After you have connected as a privileged user and called 
-- SEM_APIS.CREATE_RDF_NETWORK to enable RDF support,
-- connect as a regular database user and do the following.
-- 1. For each desired model, create a table to hold its data.
-- 2. For each model, create a model (SEM_APIS.CREATE_RDF_MODEL).
-- 3. For each table to hold semantic data, insert data into the table.
-- 4. Use various subprograms and constructors.
 
-- Create the table to hold data for the model.
CREATE TABLE family_rdf_data (id NUMBER, triple SDO_RDF_TRIPLE_S);
 
-- Create the model.
execute SEM_APIS.create_rdf_model('family', 'family_rdf_data', 'triple');
 
-- Insert rows into the table. These express the following information:
-----------------
-- John and Janice have two children, Suzie and Matt.
-- Matt married Martha, and they have two children:
--   Tom (male, height 5.75) and Cindy (female, height 06.00).
-- Suzie married Sammy, and they have two children:
--   Cathy (height 5.8) and Jack (male, height 6).
 
-- Person is a class that has two subslasses: Male and Female.
-- parentOf is a property that has two subproperties: fatherOf and motherOf.
-- siblingOf is a property that has two subproperties: brotherOf and sisterOf.
-- The domain of the fatherOf and brotherOf properties is Male.
-- The domain of the motherOf and sisterOf properties is Female.
------------------------
 
-- John is the father of Suzie.
INSERT INTO family_rdf_data VALUES (1, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/John', 
'http://www.example.org/family/fatherOf', 
'http://www.example.org/family/Suzie'));
 
-- John is the father of Matt.
INSERT INTO family_rdf_data VALUES (2, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/John', 
'http://www.example.org/family/fatherOf', 
'http://www.example.org/family/Matt'));
 
-- Janice is the mother of Suzie.
INSERT INTO family_rdf_data VALUES (3, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Janice', 
'http://www.example.org/family/motherOf', 
'http://www.example.org/family/Suzie'));
 
-- Janice is the mother of Matt.
INSERT INTO family_rdf_data VALUES (4, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Janice', 
'http://www.example.org/family/motherOf', 
'http://www.example.org/family/Matt'));
 
-- Sammy is the father of Cathy.
INSERT INTO family_rdf_data VALUES (5, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Sammy', 
'http://www.example.org/family/fatherOf', 
'http://www.example.org/family/Cathy'));
 
-- Sammy is the father of Jack.
INSERT INTO family_rdf_data VALUES (6, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Sammy', 
'http://www.example.org/family/fatherOf', 
'http://www.example.org/family/Jack'));
 
-- Suzie is the mother of Cathy.
INSERT INTO family_rdf_data VALUES (7, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Suzie', 
'http://www.example.org/family/motherOf', 
'http://www.example.org/family/Cathy'));
 
-- Suzie is the mother of Jack.
INSERT INTO family_rdf_data VALUES (8, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Suzie', 
'http://www.example.org/family/motherOf', 
'http://www.example.org/family/Jack'));
 
-- Matt is the father of Tom.
INSERT INTO family_rdf_data VALUES (9, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Matt', 
'http://www.example.org/family/fatherOf', 
'http://www.example.org/family/Tom'));
 
-- Matt is the father of Cindy
INSERT INTO family_rdf_data VALUES (10, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Matt', 
'http://www.example.org/family/fatherOf', 
'http://www.example.org/family/Cindy'));
 
-- Martha is the mother of Tom.
INSERT INTO family_rdf_data VALUES (11, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Martha', 
'http://www.example.org/family/motherOf', 
'http://www.example.org/family/Tom'));
 
-- Martha is the mother of Cindy. 
INSERT INTO family_rdf_data VALUES (12, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Martha', 
'http://www.example.org/family/motherOf', 
'http://www.example.org/family/Cindy'));
 
-- Cathy is the sister of Jack.
INSERT INTO family_rdf_data VALUES (13, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Cathy', 
'http://www.example.org/family/sisterOf', 
'http://www.example.org/family/Jack'));
 
-- Jack is male.
INSERT INTO family_rdf_data VALUES (14, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Jack', 
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
'http://www.example.org/family/Male'));
 
-- Tom is male.
INSERT INTO family_rdf_data VALUES (15, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Tom', 
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
'http://www.example.org/family/Male'));
 
-- Cindy is female.
INSERT INTO family_rdf_data VALUES (16, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Cindy', 
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
'http://www.example.org/family/Female'));
 
-- Person is a class.
INSERT INTO family_rdf_data VALUES (17, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Person', 
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
'http://www.w3.org/2000/01/rdf-schema#Class'));
 
-- Male is a subclass of Person.
INSERT INTO family_rdf_data VALUES (18, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Male', 
'http://www.w3.org/2000/01/rdf-schema#subClassOf',
'http://www.example.org/family/Person'));
 
-- Female is a subclass of Person. 
INSERT INTO family_rdf_data VALUES (19, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Female', 
'http://www.w3.org/2000/01/rdf-schema#subClassOf',
'http://www.example.org/family/Person'));
 
-- siblingOf is a property.
INSERT INTO family_rdf_data VALUES (20, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/siblingOf', 
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
'http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'));
 
-- parentOf is a property.
INSERT INTO family_rdf_data VALUES (21, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/parentOf', 
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
'http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'));
 
-- brotherOf is a subproperty of siblingOf.
INSERT INTO family_rdf_data VALUES (22, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/brotherOf', 
'http://www.w3.org/2000/01/rdf-schema#subPropertyOf',
'http://www.example.org/family/siblingOf'));
 
-- sisterOf is a subproperty of siblingOf.
INSERT INTO family_rdf_data VALUES (23, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/sisterOf', 
'http://www.w3.org/2000/01/rdf-schema#subPropertyOf',
'http://www.example.org/family/siblingOf'));
 
-- A brother is male.
INSERT INTO family_rdf_data VALUES (24, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/brotherOf', 
'http://www.w3.org/2000/01/rdf-schema#domain',
'http://www.example.org/family/Male'));
 
-- A sister is female.
INSERT INTO family_rdf_data VALUES (25, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/sisterOf', 
'http://www.w3.org/2000/01/rdf-schema#domain',
'http://www.example.org/family/Female'));
 
-- fatherOf is a subproperty of parentOf.
INSERT INTO family_rdf_data VALUES (26, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/fatherOf', 
'http://www.w3.org/2000/01/rdf-schema#subPropertyOf',
'http://www.example.org/family/parentOf'));
 
-- motherOf is a subproperty of parentOf.
INSERT INTO family_rdf_data VALUES (27, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/motherOf', 
'http://www.w3.org/2000/01/rdf-schema#subPropertyOf',
'http://www.example.org/family/parentOf'));
 
-- A father is male.
INSERT INTO family_rdf_data VALUES (28, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/fatherOf', 
'http://www.w3.org/2000/01/rdf-schema#domain',
'http://www.example.org/family/Male'));
 
-- A mother is female.
INSERT INTO family_rdf_data VALUES (29, 
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/motherOf', 
'http://www.w3.org/2000/01/rdf-schema#domain',
'http://www.example.org/family/Female'));
 
-- Use SET ESCAPE OFF to prevent the caret (^) from being
-- interpreted as an escape character. Two carets (^^) are
-- used to represent typed literals.
SET ESCAPE OFF;
 
-- Cathy's height is 5.8 (decimal).
INSERT INTO family_rdf_data VALUES (30,
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Cathy', 
'http://www.example.org/family/height',
'"5.8"^^xsd:decimal'));
 
-- Jack's height is 6 (integer).
INSERT INTO family_rdf_data VALUES (31,
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Jack', 
'http://www.example.org/family/height',
'"6"^^xsd:integer'));
 
-- Tom's height is 05.75 (decimal).
INSERT INTO family_rdf_data VALUES (32,
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Tom', 
'http://www.example.org/family/height',
'"05.75"^^xsd:decimal'));
 
-- Cindy's height is 06.00 (decimal).
INSERT INTO family_rdf_data VALUES (33,
SDO_RDF_TRIPLE_S('family', 
'http://www.example.org/family/Cindy', 
'http://www.example.org/family/height',
'"06.00"^^xsd:decimal'));
 
COMMIT;
 
-- RDFS inferencing in the family model
BEGIN
  SEM_APIS.CREATE_RULES_INDEX(
    'rdfs_rix_family',
    SEM_Models('family'),
    SEM_Rulebases('RDFS'));
END;
/
 
-- Select all males from the family model, without inferencing.
SELECT m
  FROM TABLE(SEM_MATCH(
    '(?m rdf:type :Male)',
    SEM_Models('family'),
    null,
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));
 
-- Select all males from the family model, with RDFS inferencing.
SELECT m
  FROM TABLE(SEM_MATCH(
    '(?m rdf:type :Male)',
    SEM_Models('family'),
    SDO_RDF_Rulebases('RDFS'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));
 
-- General inferencing in the family model
 
EXECUTE SEM_APIS.CREATE_RULEBASE('family_rb');
 
INSERT INTO mdsys.semr_family_rb VALUES(
  'grandparent_rule',
  '(?x :parentOf ?y) (?y :parentOf ?z)',
  NULL,
  '(?x :grandParentOf ?z)', 
  SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')));
 
COMMIT;
 
-- Because a new rulebase has been created, and it will be used in the
-- rules index, drop the preceding rules index and then re-create it.
EXECUTE SEM_APIS.DROP_RULES_INDEX ('rdfs_rix_family');
 
-- Re-create the rules index.
BEGIN
  SEM_APIS.CREATE_RULES_INDEX(
    'rdfs_rix_family',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'));
END;
/
 
-- Select all grandfathers and their grandchildren from the family model, 
-- without inferencing. (With no inferencing, no results are returned.)
SELECT x grandfather, y grandchild
  FROM TABLE(SEM_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SEM_Models('family'),
    null, 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));
 
-- Select all grandfathers and their grandchildren from the family model.
-- Use inferencing from both the RDFS and family_rb rulebases.
SELECT x grandfather, y grandchild
  FROM TABLE(SEM_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')),
    null));
 
-- Set up to find grandfathers of tall (>= 6) grandchildren
-- from the family model, with RDFS inferencing and
-- inferencing using the "family_rb" rulebase.
 
UPDATE mdsys.semr_family_rb SET
  antecedents = '(?x :parentOf ?y) (?y :parentOf ?z) (?z :height ?h)',
  filter = '(h >= 6)',
  aliases = SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/'))
WHERE rule_name = 'GRANDPARENT_RULE';
 
-- Because the rulebase has been updated, drop the preceding rules index, 
-- and then re-create it.
EXECUTE SEM_APIS.DROP_RULES_INDEX ('rdfs_rix_family');
 
-- Re-create the rules index.
BEGIN
  SEM_APIS.CREATE_RULES_INDEX(
    'rdfs_rix_family',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'));
END;
/
 
-- Find the rules index that was just created (that is, the
-- one based on the specified model and rulebases).
SELECT SEM_APIS.LOOKUP_RULES_INDEX(SEM_MODELS('family'),
  SEM_RULEBASES('RDFS','family_rb')) AS lookup_rules_index FROM DUAL;
 
-- Select grandfathers of tall (>= 6) grandchildren, and their
-- tall grandchildren.
SELECT x grandfather, y grandchild
  FROM TABLE(SEM_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SEM_Models('family'),
    SEM_RuleBases('RDFS','family_rb'), 
    SEM_ALIASES(SEM_ALIAS('','http://www.example.org/family/')), 
    null));

1.11 Required Procedure for Semantic Technologies Support

Before you can use any types, synonyms, or PL/SQL packages related to Oracle semantic technologies support, you must connect to the database as a user with DBA privileges, start SQL*Plus, and execute one of the following scripts, depending on your needs:

  • Upgrade with data migration: $ORACLE_HOME/md/admin/catsem10i.sql

  • New installation: $ORACLE_HOME/md/admin/catsem11i.sql

Note:

If you are upgrading from Oracle Database Release 10.2 and have any Release 10.2 RDF data, and if you do not want the RDF data to be migrated, you must do the following before running the catsem10i.sql script:
  • Drop the RDF network.

  • Drop all columns of types SDO_RDF_TRIPLE_S and SDO_RDF_TRIPLE in user tables.

The catsem10i.sql script is comprehensive and handles both installation and (if necessary) any RDF data migration from Release 10.2 to Release 11 format. This script is required if you have an existing Release 10.2 RDF network; however, it may take a long time to run if the existing Release 10.2 RDF network contains large amount of RDF data.

The catsem11i.sql script is sufficient if you have a new Release 11.1 installation or if you are upgrading from Release 10.2 but do not have any existing RDF network and any columns of types SDO_RDF_TRIPLE_S or SDO_RDF_TRIPLE in user tables.

If a Release 10.2 RDF network exists or if dependencies exist on the Release 10.2 RDF types SDO_RDF_TRIPLE_S or SDO_RDF_TRIPLE, the catsem11i.sql script will exit with an error. In that case, you can continue using Release 10.2 RDF support and run the catsem10i.sql script later, when you are ready to upgrade.

1.12 Partitioning Requirement

Before you can use any of the RDF and OWL capabilities, the Partitioning option must be enabled.

For licensing information about the Partitioning option, see Oracle Database Licensing Information. For usage information about partitioning, see Oracle Database VLDB and Partitioning Guide.

1.13 Downgrading to the Previous Oracle Database Release

If you need to downgrade to Oracle Database Release 10.2, and if you used Oracle Database Release 11 RDF or OWL features and want to preserve existing semantic data and rulebases, you must execute a statement to prepare for the downgrade, perform the downgrade, and execute another statement to restore the semantic data.

However, the following considerations apply:

  • Entailed graph data will not be preserved, because the same information can be regenerated using the Oracle Database Release 10.2 RDF inference API.

  • No rulebases or rules indexes related to OWL are preserved, because Oracle Database Release 10.2 did not support the OWL vocabulary.

Perform the following steps:

  1. Before the database downgrade, connect to the Release 11 database as the SYS user with SYSDBA privileges (SYS AS SYSDBA, and enter the SYS account password when prompted).

  2. Start SQL*Plus and enter the following statement:

    EXECUTE SDO_SEM_DOWNGRADE.PREPARE_DOWNGRADE_FROM_11;
    

    When this statement executes successfully, all existing semantic data and rulebases are saved. You will restore the semantic data after the database downgrade.

  3. Perform the database downgrade.

  4. Download the following file from the Semantic Technologies page of the Oracle Technology Network site: sdosemdgu.plb

  5. If (and only if) your Oracle Database Release 10.2 release number is 10.2.0.1, click the Software link, and download and install the RDF-specific patch. (This patch is needed because Release 10.2.0.1 did not have the batch loading feature, which is used to restore the semantic data.)

  6. Connect to the Release 10.2 database as the SYS user with SYSDBA privileges.

  7. Start SQL*Plus and enter a statement in the following statement:

    EXECUTE SDO_SEM_DOWNGRADE_UTL.PREPARE_DOWNGRADE_TO_102('<tablespace-name>');
    

    Where <tablespace-name> is the name of the tablespace in which the RDF network will be created.

    When this statement executes successfully, all semantic data that had been saved before the downgrade is restored and ready to use.

1.14 Software Naming Changes for Semantic Technologies

Because the support for semantic data has been expanded beyond the original focus on RDF, the names of many software objects (PL/SQL packages, functions and procedures, system tables and views, and so on) have been changed as of Oracle Database Release 11.1. In most cases, the change is to replace the string RDF with SEM. although in some cases it may be to replace SDO_RDF with SEM.

All valid code that used the pre-Release 11.1 names will continue to work; your existing applications will not be broken. However, it is suggested that you change old applications to use new object names, and you should use the new names for any new applications. This manual will document only the new names.

Table 1-12 lists the old and new names for some objects related to support for semantic technologies, in alphabetical order by old name.

Table 1-12 Semantic Technology Software Objects: Old and New Names

Old Name New Name

RDF_ALIAS data type

SEM_ALIAS

RDF_MODEL$ view

SEM_MODEL$

RDF_RULEBASE_INFO view

SEM_RULEBASE_INFO

RDF_RULES_INDEX_DATASETS view

SEM_RULES_INDEX_DATASETS

RDF_RULES_INDEX_INFO view

SEM_RULES_INDEX_INFO

RDFI_rules-index-name view

SEMI_rules-index-name

RDFM_model-name view

SEMM_model-name

RDFR_rulebase-name view

SEMR_rulebase-name

SDO_RDF package

SEM_APIS

SDO_RDF_INFERENCE package

SEM_APIS

SDO_RDF_MATCH table function

SEM_MATCH

SDO_RDF_MODELS data type

SEM_MODELS

SDO_RDF_RULEBASES data type

SEM_RULEBASES