Using Data Type Indexes

1.13 Using Data Type Indexes

Data type indexes are indexes on the values of typed literals stored in an RDF network.

These indexes may significantly improve the performance of SEM_MATCH queries involving certain types of FILTER expressions. For example, a data type index on xsd:dateTime literals may speed up evaluation of the filter (?x < "1929-11-16T13:45:00Z"^^xsd:dateTime). Indexes can be created for several data types, which are listed in Table 1-30.

Table 1-30 Data Types for Data Type Indexing

Data Type URI	Oracle Type	Index Type
http://www.w3.org/2001/XMLSchema#decimal	NUMBER	Non-unique B-tree (creates a single index for all xsd numeric types, including `xsd:float`, `xsd:double`, and `xsd:decimal` and all of its subtypes)
http://www.w3.org/2001/XMLSchema#string	VARCHAR2	Non-unique B-tree (creates a single index for `xsd:string` typed literals and plain literals)
http://www.w3.org/2001/XMLSchema#time	TIMESTAMP WITH TIMEZONE	Non-unique B-tree
http://www.w3.org/2001/XMLSchema#date	TIMESTAMP WITH TIMEZONE	Non-unique B-tree
http://www.w3.org/2001/XMLSchema#dateTime	TIMESTAMP WITH TIMEZONE	Non-unique B-tree
http://xmlns.oracle.com/rdf/text	(Not applicable)	CTXSYS.CONTEXT
http://xmlns.oracle.com/rdf/geo/WKTLiteral	SDO_GEOMETRY	SPATIAL_INDEX
http://www.opengis.net/geosparql#wktLiteral	SDO_GEOMETRY	SPATIAL_INDEX
http://www.opengis.net/geosparql#gmlLiteral	SDO_GEOMETRY	SPATIAL_INDEX
http://xmlns.oracle.com/rdf/like	VARCHAR2	Non-unique B-tree

The suitability of data type indexes depends on your query workload. Data type indexes on xsd data types can be used for filters that compare a variable with a constant value, and are particularly useful when queries have an unselective graph pattern with a very selective filter condition. Appropriate data type indexes are required for queries with spatial or text filters.

While data type indexes improve query performance, overhead from incremental index maintenance can degrade the performance of DML and bulk load operations on the RDF network. For bulk load operations, it may often be faster to drop data type indexes, perform the bulk load, and then re-create the data type indexes. As it is time consuming to create a text index on large amounts of text data, nologging is enabled by default when the text index is created. The logging can be enabled by specifying ‘LOGGING=T’ in the options field of add_datatype_index API for the text index.

You can add, alter, and drop data type indexes using the following procedures, which are described in SEM_APIS Package Subprograms:

Information about existing data type indexes is maintained in the SEM_DTYPE_INDEX_INFO view, which has the columns shown in Table 1-31 and one row for each data type index.

Table 1-31 SEM_DTYPE_INDEX_INFO View Columns

Column Name	Data Type	Description
DATATYPE	VARCHAR2(51)	Data type URI
INDEX_NAME	VARCHAR2(30)	Name of the index
STATUS	VARCHAR2(8)	Status of the index: `USABLE` or `UNUSABLE`
TABLESPACE_NAME	VARCHAR2(30)	Tablespace for the index
FUNCIDX_STATUS	VARCHAR2(8)	Status of the function-based index: `NULL`, `ENABLED`, or `DISABLED`

You can use the HINT0 hint to ensure that data type indexes are used during query evaluation, as shown in Example 1-116, which finds all grandfathers who were born before November 16, 1929.

Example 1-116 Using HINT0 to Ensure Use of Data Type Index

SELECT x, y
  FROM TABLE(SEM_MATCH(
    'PREFIX : <http://www.example.org/family/>
     SELECT ?x ?y 
     WHERE {?x :grandParentOf ?y . ?x rdf:type :Male . ?x :birthDate ?bd
       FILTER (?bd <= "1929-11-15T23:59:59Z"^^xsd:dateTime) }',
    SEM_Models('family'),
    SEM_Rulebases('RDFS','family_rb'), 
    
    null, null, null, 
    'HINT0={ LEADING(?bd) INDEX(?bd rdf_v$dateTime_idx) } 
             FAST_DATE_FILTER=T', 
    null, null,
    'RDFUSER', 'NET1' ));

Parent topic: RDF Graph Overview