1.12 Using RDF Network Indexes
RDF network indexes are nonunique B-tree indexes that you can add, alter, and drop for use with RDF graphs and inferred graphs in an RDF network.
You can use such indexes to tune the performance of SEM_MATCH queries on the RDF graphs and inferred graphs in the network. As with any indexes, RDF network indexes enable index-based access that suits your query workload. This can lead to substantial performance benefits, such as in the following example scenarios:
- If your graph pattern is '{<John> ?p <Mary>}', you may want to have a usable 'CSPGM' or 'SCPGM' index for the target RDF graphs and on the corresponding inferred graph, if used in the query.
- If your graph pattern is '{?x <talksTo> ?y . ?z ?p ?y}', you may want to have a usable RDF network index on the relevant RDF graphs and inferred graph, with C as the leading key (for example, 'CPSGM').
However, using RDF network indexes can affect overall performance by increasing the time required for DML, load, and inference operations.
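As a rough illustration of the first scenario above, a SEM_MATCH query with that graph pattern might look like the following sketch. The prefix IRI is hypothetical, the FAMILY graph and the RDFUSER/NET1 network are borrowed from the examples in this section, and the argument list follows the commonly documented SEM_MATCH calling form, which you should verify against your release.

```sql
-- Sketch only: the prefix IRI and the data are hypothetical.
-- Subject and object are bound and the predicate is the variable,
-- so an index led by C and S (CSPGM or SCPGM) suits this lookup.
SELECT p
  FROM TABLE(SEM_MATCH(
    'PREFIX : <http://www.example.org/family/>
     SELECT ?p
     WHERE { :John ?p :Mary }',
    SEM_Models('FAMILY'),   -- RDF graph to query
    null,                   -- no rulebases, so no inferred graph is involved
    null, null, null,       -- aliases, filter, index_status
    ' ',                    -- options
    null, null,             -- graphs, named_graphs
    'RDFUSER', 'NET1'));    -- network_owner, network_name
```

Comparing the execution time of such a query before and after making a candidate index usable is one simple way to judge whether the index is worth its maintenance cost.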
You can create and manage RDF network indexes using the following subprograms:
- SEM_APIS.ADD_NETWORK_INDEX
- SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH
- SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH
- SEM_APIS.DROP_NETWORK_INDEX
All of these subprograms have an index_code parameter, which can contain any sequence of the following letters (without repetition): P, C, S, G, M. These letters correspond, respectively, to the following columns in the SEMM_* and SEMI_* views: P_VALUE_ID, CANON_END_NODE_ID, START_NODE_ID, G_ID, and MODEL_ID.
The SEM_APIS.ADD_NETWORK_INDEX procedure creates an RDF network index that results in the creation of a nonunique B-tree index in UNUSABLE status for each of the existing RDF graphs and inferred graphs. The name of the index is RDF_LNK_<index_code>_IDX and the index is owned by the network owner. This operation is allowed only if the invoker has the DBA role or is the network owner. The following example shows creation of the PSCGM index with the following key: <P_VALUE_ID, START_NODE_ID, CANON_END_NODE_ID, G_ID, MODEL_ID>.
EXECUTE SEM_APIS.ADD_NETWORK_INDEX('PSCGM', network_owner=>'RDFUSER', network_name=>'NET1');
After you create an RDF network index, each of the corresponding nonunique B-tree indexes is in the UNUSABLE status, because making it usable can consume significant time and resources, and because subsequent index maintenance operations might involve performance costs that you do not want to incur. You can make an RDF network index usable or unusable for specific RDF graphs or inferred graphs that you own by calling the SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH and SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH procedures and specifying 'REBUILD' or 'UNUSABLE' as the command parameter. Thus, you can experiment by making different RDF network indexes usable and unusable, and checking for any differences in performance. For example, the following statement makes the PSCGM index usable for the FAMILY RDF graph:
EXECUTE SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH('FAMILY', 'PSCGM', 'REBUILD', network_owner=>'RDFUSER', network_name=>'NET1');
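The experiment described above could be sketched as the following SQL*Plus sequence. The CPSGM index code comes from the scenarios earlier in this section. FAMILY_INF is a hypothetical inferred graph name, and the positional arguments of SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH and SEM_APIS.DROP_NETWORK_INDEX are assumed to mirror the calls shown in this section.

```sql
-- Sketch only: FAMILY_INF is a hypothetical inferred graph name.

-- 1. Create the network index; the B-tree indexes on existing graphs start out UNUSABLE.
EXECUTE SEM_APIS.ADD_NETWORK_INDEX('CPSGM', network_owner=>'RDFUSER', network_name=>'NET1');

-- 2. Make the index usable for one RDF graph and one inferred graph.
EXECUTE SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH('FAMILY', 'CPSGM', 'REBUILD', network_owner=>'RDFUSER', network_name=>'NET1');
EXECUTE SEM_APIS.ALTER_INDEX_ON_INFERRED_GRAPH('FAMILY_INF', 'CPSGM', 'REBUILD', network_owner=>'RDFUSER', network_name=>'NET1');

-- 3. Run and time the SEM_MATCH workload here.

-- 4. If the index does not help, mark it unusable again for that graph ...
EXECUTE SEM_APIS.ALTER_INDEX_ON_RDF_GRAPH('FAMILY', 'CPSGM', 'UNUSABLE', network_owner=>'RDFUSER', network_name=>'NET1');

-- 5. ... or drop the network index entirely.
EXECUTE SEM_APIS.DROP_NETWORK_INDEX('CPSGM', network_owner=>'RDFUSER', network_name=>'NET1');
```

Marking an index UNUSABLE for one graph keeps the network index definition in place, so it can be rebuilt later, whereas dropping it removes the index for every graph in the network.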
Also note the following:
- Independent of any RDF network indexes that you create, when an RDF network is created, one of the indexes that is automatically created is an index that you can manage by referring to the index_code as 'PSCGM' when you call the subprograms mentioned in this section.
- When you create a new RDF graph or a new inferred graph, a new nonunique B-tree index is created for each of the RDF network indexes, and each such B-tree index is in the USABLE status.
- Including the MODEL_ID column in an RDF network index key (by including 'M' in the index_code value) may improve query performance. This is particularly relevant when RDF graph collections are used.
1.12.1 SEM_NETWORK_INDEX_INFO View
Information about all network indexes on RDF graphs and inferred graphs is maintained in the SEM_NETWORK_INDEX_INFO view, which has one row for each network index and includes the columns shown in Table 1-29 (a partial list).
Table 1-29 SEM_NETWORK_INDEX_INFO View Columns (Partial List)
Column Name | Data Type | Description |
---|---|---|
NAME | VARCHAR2(30) | Name of the RDF graph or inferred graph |
TYPE | VARCHAR2(10) | Type of object on which the index is built |
ID | NUMBER | ID number for the RDF graph or inferred graph, or zero (0) for an index on the network |
INDEX_CODE | VARCHAR2(25) | Code for the index |
INDEX_NAME | VARCHAR2(30) | Name of the index |
LAST_REFRESH | TIMESTAMP(6) WITH TIME ZONE | Timestamp for the last time this content was refreshed |
In addition to the columns listed in Table 1-29, the SEM_NETWORK_INDEX_INFO view contains columns from the ALL_INDEXES and ALL_IND_PARTITIONS views (both described in Oracle Database Reference), including:
- From the ALL_INDEXES view: UNIQUENESS, COMPRESSION, PREFIX_LENGTH
- From the ALL_IND_PARTITIONS view: STATUS, TABLESPACE_NAME, BLEVEL, LEAF_BLOCKS, NUM_ROWS, DISTINCT_KEYS, AVG_LEAF_BLOCKS_PER_KEY, AVG_DATA_BLOCKS_PER_KEY, CLUSTERING_FACTOR, SAMPLE_SIZE, LAST_ANALYZED
Note that the information in the SEM_NETWORK_INDEX_INFO view may sometimes be stale. You can refresh this information by using the SEM_APIS.REFRESH_NETWORK_INDEX_INFO procedure.
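A minimal sketch of refreshing the view and then checking index status follows. It assumes that SEM_APIS.REFRESH_NETWORK_INDEX_INFO accepts the same network_owner and network_name parameters as the other subprograms in this section, and that the view can be referenced by the unqualified name used here; in your environment the view name may need to be qualified with the network owner or network name.

```sql
-- Sketch only: the parameter names and the unqualified view name are assumptions.
EXECUTE SEM_APIS.REFRESH_NETWORK_INDEX_INFO(network_owner=>'RDFUSER', network_name=>'NET1');

-- List where the PSCGM index exists and whether each B-tree index is usable.
SELECT name, type, index_code, index_name, status, last_refresh
  FROM sem_network_index_info
 WHERE index_code = 'PSCGM'
 ORDER BY type, name;
```

The LAST_REFRESH column indicates how current each row is, which helps you decide whether another refresh is needed before relying on the STATUS value.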