1.13 Using Data Type Indexes

Data type indexes are indexes on the values of typed literals stored in an RDF network.

These indexes may significantly improve the performance of SEM_MATCH queries involving certain types of FILTER expressions. For example, a data type index on xsd:dateTime literals may speed up evaluation of the filter (?x < "1929-11-16T13:45:00Z"^^xsd:dateTime). Indexes can be created for several data types, which are listed in Table 1-30.

Table 1-30 Data Types for Data Type Indexing

| Data Type URI | Oracle Type | Index Type |
|---|---|---|
| http://www.w3.org/2001/XMLSchema#decimal | NUMBER | Non-unique B-tree (creates a single index for all xsd numeric types, including xsd:float, xsd:double, and xsd:decimal and all of its subtypes) |
| http://www.w3.org/2001/XMLSchema#string | VARCHAR2 | Non-unique B-tree (creates a single index for xsd:string typed literals and plain literals) |
| http://www.w3.org/2001/XMLSchema#time | TIMESTAMP WITH TIME ZONE | Non-unique B-tree |
| http://www.w3.org/2001/XMLSchema#date | TIMESTAMP WITH TIME ZONE | Non-unique B-tree |
| http://www.w3.org/2001/XMLSchema#dateTime | TIMESTAMP WITH TIME ZONE | Non-unique B-tree |
| http://xmlns.oracle.com/rdf/text | (Not applicable) | CTXSYS.CONTEXT |
| http://xmlns.oracle.com/rdf/geo/WKTLiteral | SDO_GEOMETRY | SPATIAL_INDEX |
| http://www.opengis.net/geosparql#wktLiteral | SDO_GEOMETRY | SPATIAL_INDEX |
| http://www.opengis.net/geosparql#gmlLiteral | SDO_GEOMETRY | SPATIAL_INDEX |
| http://xmlns.oracle.com/rdf/like | VARCHAR2 | Non-unique B-tree |

The suitability of data type indexes depends on your query workload. Data type indexes on xsd data types can be used for filters that compare a variable with a constant value, and are particularly useful when queries have an unselective graph pattern with a very selective filter condition. Appropriate data type indexes are required for queries with spatial or text filters.

While data type indexes improve query performance, overhead from incremental index maintenance can degrade the performance of DML and bulk load operations on the RDF network. For bulk load operations, it is often faster to drop the data type indexes, perform the bulk load, and then re-create the data type indexes. Because creating a text index on a large amount of text data is time-consuming, NOLOGGING is enabled by default when a text index is created. To enable logging for the text index, specify 'LOGGING=T' in the options parameter of the SEM_APIS.ADD_DATATYPE_INDEX procedure.
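The drop, load, and re-create sequence described above might look like the following sketch. It assumes the RDFUSER network owner and NET1 network name from the example later in this section, and it assumes the named parameters network_owner, network_name, and options as documented for the SEM_APIS procedures; check the SEM_APIS reference for your release before relying on it.

```sql
-- Sketch only: RDFUSER and NET1 are assumed from this section's example.
-- Step 1: drop the xsd:dateTime data type index before the bulk load.
BEGIN
  SEM_APIS.DROP_DATATYPE_INDEX(
    'http://www.w3.org/2001/XMLSchema#dateTime',
    network_owner => 'RDFUSER',
    network_name  => 'NET1');
END;
/

-- Step 2: perform the bulk load here (not shown).

-- Step 3: re-create the index after the load completes.
BEGIN
  SEM_APIS.ADD_DATATYPE_INDEX(
    'http://www.w3.org/2001/XMLSchema#dateTime',
    network_owner => 'RDFUSER',
    network_name  => 'NET1');
END;
/

-- Optional: create the text index with logging enabled instead of the
-- default NOLOGGING behavior.
BEGIN
  SEM_APIS.ADD_DATATYPE_INDEX(
    'http://xmlns.oracle.com/rdf/text',
    options       => 'LOGGING=T',
    network_owner => 'RDFUSER',
    network_name  => 'NET1');
END;
/
```

Whether this sequence is actually faster than loading with the index in place depends on the size of the load relative to the existing data, so it is worth timing both approaches on a representative data set.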

You can add, alter, and drop data type indexes using the SEM_APIS.ADD_DATATYPE_INDEX, SEM_APIS.ALTER_DATATYPE_INDEX, and SEM_APIS.DROP_DATATYPE_INDEX procedures, which are described in SEM_APIS Package Subprograms.
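As one illustration of the alter operation, the following sketch rebuilds an existing xsd:dateTime data type index. The 'REBUILD' command value, the positional parameter order, and the RDFUSER and NET1 names are assumptions taken from the SEM_APIS reference and from the example in this section; verify them against the documentation for your release.

```sql
-- Sketch only: rebuild the xsd:dateTime data type index.
-- The second argument is the command; 'REBUILD' is assumed here.
BEGIN
  SEM_APIS.ALTER_DATATYPE_INDEX(
    'http://www.w3.org/2001/XMLSchema#dateTime',
    'REBUILD',
    network_owner => 'RDFUSER',
    network_name  => 'NET1');
END;
/
```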

Information about existing data type indexes is maintained in the SEM_DTYPE_INDEX_INFO view, which has the columns shown in Table 1-31 and one row for each data type index.

Table 1-31 SEM_DTYPE_INDEX_INFO View Columns

| Column Name | Data Type | Description |
|---|---|---|
| DATATYPE | VARCHAR2(51) | Data type URI |
| INDEX_NAME | VARCHAR2(30) | Name of the index |
| STATUS | VARCHAR2(8) | Status of the index: USABLE or UNUSABLE |
| TABLESPACE_NAME | VARCHAR2(30) | Tablespace for the index |
| FUNCIDX_STATUS | VARCHAR2(8) | Status of the function-based index: NULL, ENABLED, or DISABLED |
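A quick way to check which data type indexes exist, and whether each is usable, is to query the view directly. The following is a minimal sketch that uses only the columns listed in Table 1-31; the unqualified view name is an assumption, and in a schema-private network the view may need to be qualified with the network owner and network name.

```sql
-- Sketch only: list existing data type indexes and their status.
-- The view name may need a schema or network qualifier in your environment.
SELECT datatype, index_name, status, tablespace_name, funcidx_status
  FROM sem_dtype_index_info
 ORDER BY datatype;
```

A row with STATUS of UNUSABLE indicates an index that the query optimizer cannot use until it is rebuilt.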

You can use the HINT0 hint to ensure that data type indexes are used during query evaluation, as shown in Example 1-116, which finds all grandfathers who were born before November 16, 1929.

Example 1-116 Using HINT0 to Ensure Use of Data Type Index

SELECT x, y
  FROM TABLE(SEM_MATCH(
    'PREFIX : <http://www.example.org/family/>
     SELECT ?x ?y
     WHERE {?x :grandParentOf ?y . ?x rdf:type :Male . ?x :birthDate ?bd
       FILTER (?bd <= "1929-11-15T23:59:59Z"^^xsd:dateTime) }',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'),
    null, null, null,        -- aliases, filter, index_status
    -- HINT0 starts the join with ?bd and names the xsd:dateTime data type index
    'HINT0={ LEADING(?bd) INDEX(?bd rdf_v$dateTime_idx) }
             FAST_DATE_FILTER=T',
    null, null,              -- graphs, named_graphs
    'RDFUSER', 'NET1' ));    -- network owner, network name