1.12 Using RDF Network Indexes

RDF network indexes are nonunique B-tree indexes that you can add, alter, and drop for use with RDF graphs and inferred graphs in an RDF network.

You can use such indexes to tune the performance of SEM_MATCH queries on the RDF graphs and inferred graphs in the network. As with any indexes, RDF network indexes enable index-based access that suits your query workload. This can lead to substantial performance benefits, such as in the following example scenarios:

  • If your graph pattern is '{<John> ?p <Mary>}', you may want to have a usable 'CSPGM' or 'SCPGM' index on the target RDF graphs and, if an inferred graph is used in the query, on the corresponding inferred graph.

  • If your graph pattern is '{?x <talksTo> ?y . ?z ?p ?y}', you may want to have a usable RDF network index on the relevant RDF graphs and inferred graph, with C as the leading key (for example, 'CPSGM').
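To make the second scenario concrete, the following is a minimal sketch of a SEM_MATCH query that uses that graph pattern. The ex: prefix and its IRI are hypothetical, and the FAMILY graph, RDFUSER owner, and NET1 network names are borrowed from the examples later in this section. The arguments are passed positionally and assume the usual SEM_MATCH parameter order.

```sql
-- Hypothetical IRIs under http://example.org/; adjust to your data.
-- ?y is the object (C position) in both triple patterns, which is why
-- an index with C as the leading key may help the join.
SELECT x, y, z, p
  FROM TABLE(SEM_MATCH(
    'PREFIX ex: <http://example.org/>
     SELECT ?x ?y ?z ?p
     WHERE { ?x ex:talksTo ?y . ?z ?p ?y }',
    SEM_Models('FAMILY'),   -- RDF graphs to query
    NULL,                   -- rulebases (no inferred graph used here)
    NULL,                   -- aliases
    NULL,                   -- filter
    NULL,                   -- index_status
    NULL,                   -- options
    NULL,                   -- graphs
    NULL,                   -- named_graphs
    'RDFUSER',              -- network_owner
    'NET1'));               -- network_name
```

Comparing the execution time of such a query before and after a C-leading index is made usable is one way to judge whether the index is worth keeping.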

However, using RDF network indexes can affect overall performance by increasing the time required for DML, load, and inference operations.

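You can create and manage RDF network indexes using the following subprograms:

  • SEM_APIS.ADD_NETWORK_INDEX

  • SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH

  • SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH

  • SEM_APIS.DROP_NETWORK_INDEX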

All of these subprograms have an index_code parameter, which can contain any sequence of the following letters (without repetition): P, C, S, G, M. These letters correspond, in that order, to the following columns in the SEMM_* and SEMI_* views: P_VALUE_ID, CANON_END_NODE_ID, START_NODE_ID, G_ID, and MODEL_ID.
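As an illustration of the letter-to-column mapping, the following query is a small sketch that expands an index_code value into its ordered key columns. It is only a helper for reading index codes and is not part of SEM_APIS; the sample value 'CPSGM' is arbitrary.

```sql
-- Expand an index_code into its ordered key columns.
-- 'CPSGM' is just a sample value; substitute any valid code.
SELECT LISTAGG(
         DECODE(SUBSTR(code, LEVEL, 1),
                'P', 'P_VALUE_ID',
                'C', 'CANON_END_NODE_ID',
                'S', 'START_NODE_ID',
                'G', 'G_ID',
                'M', 'MODEL_ID'),
         ', ') WITHIN GROUP (ORDER BY LEVEL) AS key_columns
  FROM (SELECT 'CPSGM' AS code FROM dual)
CONNECT BY LEVEL <= LENGTH(code);
```

For 'CPSGM', the result should list CANON_END_NODE_ID first, followed by P_VALUE_ID, START_NODE_ID, G_ID, and MODEL_ID.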

The SEM_APIS.ADD_NETWORK_INDEX procedure creates an RDF network index, which in turn creates a nonunique B-tree index in UNUSABLE status for each existing RDF graph and inferred graph. The index is named RDF_LNK_<index_code>_IDX and is owned by the network owner. This operation is allowed only if the invoker has the DBA role or is the network owner. The following example creates the PSCGM index, which has the following key: <P_VALUE_ID, START_NODE_ID, CANON_END_NODE_ID, G_ID, MODEL_ID>.

EXECUTE SEM_APIS.ADD_NETWORK_INDEX('PSCGM', network_owner=>'RDFUSER', network_name=>'NET1');

After you create an RDF network index, each of the corresponding nonunique B-tree indexes is in the UNUSABLE status, because making it usable can consume significant time and resources, and because subsequent index maintenance operations might involve performance costs that you do not want to incur. You can make an RDF network index usable or unusable for specific RDF graphs or inferred graphs that you own by calling the SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH and SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH procedures and specifying 'REBUILD' or 'UNUSABLE' as the command parameter. Thus, you can experiment by making different RDF network indexes usable and unusable, and checking for any differences in performance. For example, the following statement makes the PSCGM index usable for the FAMILY RDF graph:

EXECUTE SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH('FAMILY','PSCGM','REBUILD', network_owner=>'RDFUSER', network_name=>'NET1');
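The experiment described above might look like the following sketch, which tries out a second index with C as the leading key. The 'CPSGM' code is only an example choice, FAMILY_INF is a hypothetical inferred graph name, and the last call assumes that SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH takes the same leading arguments as the RDF graph procedure.

```sql
-- Create a second network index with C as the leading key (it starts UNUSABLE).
EXECUTE SEM_APIS.ADD_NETWORK_INDEX('CPSGM', network_owner=>'RDFUSER', network_name=>'NET1');

-- Build it for the FAMILY RDF graph, then run the query workload and compare timings.
EXECUTE SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH('FAMILY','CPSGM','REBUILD', network_owner=>'RDFUSER', network_name=>'NET1');

-- If the index does not help, mark it unusable again to avoid the maintenance cost.
EXECUTE SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH('FAMILY','CPSGM','UNUSABLE', network_owner=>'RDFUSER', network_name=>'NET1');

-- Assumption: same leading arguments as above; FAMILY_INF is a hypothetical name.
EXECUTE SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH('FAMILY_INF','CPSGM','REBUILD', network_owner=>'RDFUSER', network_name=>'NET1');
```

Because a rebuild can take significant time on a large graph, it is usually worth testing one index at a time rather than rebuilding several at once.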

Also note the following:

  • Independent of any RDF network indexes that you create, one of the indexes created automatically when an RDF network is created has the index_code 'PSCGM'. You can manage this index by specifying 'PSCGM' as the index_code when you call the subprograms mentioned in this section.

  • When you create a new RDF graph or a new inferred graph, a new nonunique B-tree index is created for each of the RDF network indexes, and each such B-tree index is in the USABLE status.

  • Including the MODEL_ID column in an RDF network index key (by including 'M' in the index_code value) may improve query performance. This is particularly relevant when RDF graph collections are used.

1.12.1 SEM_NETWORK_INDEX_INFO View

Information about all network indexes on RDF graphs and inferred graphs is maintained in the SEM_NETWORK_INDEX_INFO view, which contains one row for each network index. Table 1-29 shows a partial list of the columns in this view.

Table 1-29 SEM_NETWORK_INDEX_INFO View Columns (Partial List)

Column Name    Data Type                     Description
-----------    ---------                     -----------
NAME           VARCHAR2(30)                  Name of the RDF graph or inferred graph
TYPE           VARCHAR2(10)                  Type of object on which the index is built: MODEL, ENTAILMENT, or NETWORK
ID             NUMBER                        ID number for the RDF graph or inferred graph, or zero (0) for an index on the network
INDEX_CODE     VARCHAR2(25)                  Code for the index (for example, PSCGM)
INDEX_NAME     VARCHAR2(30)                  Name of the index (for example, RDF_LNK_PSCGM_IDX)
LAST_REFRESH   TIMESTAMP(6) WITH TIME ZONE   Timestamp for the last time this content was refreshed

In addition to the columns listed in Table 1-29, the SEM_NETWORK_INDEX_INFO view contains columns from the ALL_INDEXES and ALL_IND_PARTITIONS views (both described in Oracle Database Reference), including:

  • From the ALL_INDEXES view: UNIQUENESS, COMPRESSION, PREFIX_LENGTH

  • From the ALL_IND_PARTITIONS view: STATUS, TABLESPACE_NAME, BLEVEL, LEAF_BLOCKS, NUM_ROWS, DISTINCT_KEYS, AVG_LEAF_BLOCKS_PER_KEY, AVG_DATA_BLOCKS_PER_KEY, CLUSTERING_FACTOR, SAMPLE_SIZE, LAST_ANALYZED

Note that the information in the SEM_NETWORK_INDEX_INFO view may sometimes be stale. You can refresh this information by using the SEM_APIS.REFRESH_NETWORK_INDEX_INFO procedure.
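As a rough sketch of how the view might be used, the following statements refresh the index information and then list the status of each PSCGM index. Two assumptions apply: that SEM_APIS.REFRESH_NETWORK_INDEX_INFO accepts the same network_owner and network_name parameters as the other calls in this section, and that the view is visible under its unqualified name (depending on how the network was created, the name may need to be qualified).

```sql
-- Refresh the view contents first (named parameters are assumed to match
-- the other SEM_APIS calls in this section).
EXECUTE SEM_APIS.REFRESH_NETWORK_INDEX_INFO(network_owner=>'RDFUSER', network_name=>'NET1');

-- List each PSCGM index with its status and last refresh time.
SELECT name, type, id, index_name, status, last_refresh
  FROM sem_network_index_info
 WHERE index_code = 'PSCGM'
 ORDER BY type, name;
```

The STATUS column, which comes from ALL_IND_PARTITIONS, is the one to check after a 'REBUILD' or 'UNUSABLE' command.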