5.7 Managing DML Operations for a CONTEXT Index
DML operations refer to when documents are inserted, updated, or deleted from the base table.
This section describes how you can view, synchronize, and optimize the Oracle Text CONTEXT
index for DML operations. This section contains the following topics:
5.7.1 Viewing Pending DML Operations
When you insert, update, or delete documents in the base table, their ROWIDs are held in a DML queue until you synchronize the index.
You can view the DML queue by querying index tables, as follows:
-
When the index is created using the default
fast_dml
option and when theCOMPATIBLE
database parameter is set to a value lower than20.0
, theCTXSYS.DR$PENDING
table keeps track of pending DMLs. You can query pending insert and update operations with theCTX_PENDING
andCTX_USER_PENDING
views. -
When the index is created using the default
fast_dml
option and when theCOMPATIBLE
database parameter is set to20.0
or higher, theDR$INDEX_NAME$C
table stores information on ROWIDs that are waiting for synchronization into the index.
The indexes that are set to MAINTENANCE AUTO
(automatic maintenance) or SYNC EVERY
are automatically synchronized, however you can periodically examine the pending DML tables to determine whether a synchronization call is failing. For example, if your query results appear incorrect or outdated, then you can check if the documents have been synchronized and accordingly run a manual SYNC_INDEX
call if required.
For example, to view pending DML operations on your indexes, enter the following command:
SELECT COUNT(*) FROM myschema.dr$myindex$c;
The output appears as follows:
COUNT(*)
----------
1
To retrieve ROWIDs of all unsynchronized changes from the $C
table, enter the following command:
SELECT dml_rid FROM myschema.dr$myindex$c;
The output appears as follows:
DML_RID ------------------ AAAVP9AAMAAAAKGAAD
You can run CTX_DDL.SYNC_INDEX
to synchronize your index, and then check if the $C
table has been cleared:
EXEC CTX_DDL.SYNC_INDEX('myschema.myindex');
SELECT COUNT(*) FROM myschema.dr$myindex$c;
COUNT(*)
----------
0
5.7.2 Synchronizing the Index
When you synchronize the index, you process all pending updates and inserts to the base table. You can do this in PL/SQL with the CTX_DDL.SYNC_INDEX
procedure. You can also control the duration and locking behavior for index synchronization with the CTX_DDL.SYNC_INDEX
procedure.
Synchronizing the Index with SYNC_INDEX
The following example synchronizes the index with 2 megabytes of memory:
begin
ctx_ddl.sync_index('myindex', '2M');
end;
Maxtime Parameter for SYNC_INDEX
The SYNC_INDEX
procedure includes a maxtime
parameter that, like OPTIMIZE_INDEX,
indicates a suggested time limit in minutes for the operation. The SYNC_INDEX
procedure processes as many documents in the queue as possible within the given time limit.
-
The time limit is approximate. The actual time may be less than, or greater than, what you specify.
-
The
ALTER
INDEX
...sync
command has no changes because it is deprecated. -
The
maxtime
parameter is ignored whenSYNC_INDEX
is invoked without an index name. -
The
maxtime
parameter cannot be communicated for automatic synchronizations (for example,sync
on
commit
orsync
every
).
Locking Parameter for SYNC_INDEX
The locking parameter of SYNC_INDEX
enables you to configure how the synchronization works when another synchronization is already running on the index.
-
The locking parameter is ignored when
SYNC_INDEX
is invoked without an index name. -
The locking parameter cannot be communicated for automatic synchronizations (that is,
sync
on
commit
orsync
every
). -
When the locking mode is
LOCK_WAIT,
the mode waits forever and ignores the maxtime setting if it cannot get a lock.
The options are as follows:
Option | Description |
---|---|
|
If another |
|
If another |
|
If another |
Note:
Starting with Oracle Database 12c Release 2 (12.2.0.1), you automatically merge rows from STAGE_ITAB
back to the $I
table by using SYNC_INDEX.
This merging of rows happens when the number of rows in STAGE_ITAB ($G)
exceeds the STAGE_ITAB_MAX_ROWS
parameter (10K by default). Therefore, you do not have to run merge optimization explicitly or schedule an auto optimize job.
See Also:
Oracle Text Reference to learn more about the CTX_DDL.SYNC_INDEX
statement syntax
5.7.3 Optimizing the Index
The CONTEXT
index is an inverted index where each word contains the list of documents that contain that word. For example, after a single initial indexing operation, the word DOG might have the following entry:
DOG DOC1 DOC3 DOC5
Frequent index synchronization ultimately causes fragmentation of your CONTEXT
index. Index fragmentation can adversely affect query response time. Therefore, to reduce fragmentation and index size and to ensure optimal query performance, allow time to optimize your CONTEXT
index.
To schedule an auto optimize job, you must explicitly set STAGE_ITAB_MAX_ROWS
to 0
to disable the automatic merging that now happens with SYNC_INDEX.
To optimize an index, Oracle recommends that you use CTX_DDL.OPTIMIZE_INDEX.
To understand index optimization, you must understand the structure of the index and what happens when it is synchronized. This section contains the following topics:
See Also:
Oracle Text Reference for the CTX_DDL.OPTIMIZE_INDEX
statement syntax and examples
5.7.3.1 Index Fragmentation
When you add new documents to the base table, the index is synchronized by adding new rows. For example, if you add the DOC 7 document with the word dog and synchronize the index, you now have:
DOG DOC1 DOC3 DOC5 DOG DOC7
Subsequent inserts, updates, or deletes also create new rows, as follows:
DOG DOC1 DOC3 DOC5 DOG DOC7 DOG DOC9 DOG DOC11
Index fragmentation occurs when you add new documents and synchronize the index. In particular, background inserts, updates, or deletes, which synchronize the index frequently, generally produce more fragmentation than batch mode synchronization.
When you perform batch processing less frequently, you reduce fragmentation because you produce longer document lists with a reduced number of rows in the index.
You can reduce index fragmentation by optimizing the index in either FULL
or FAST
mode with CTX_DDL.OPTIMIZE_INDEX.
5.7.3.2 Document Invalidation and Garbage Collection
When you remove documents from the base table, Oracle Text marks the document as removed but does not immediately alter the index.
Because the old information takes up space and can cause extra overhead at query time, you must remove the old information from the index by optimizing it in FULL
mode. This process is called garbage collection. Optimizing in FULL
mode for garbage collection is necessary when you perform frequent updates or deletes to the base table.
5.7.3.3 Single Token Optimization
In addition to optimizing the entire index, you can optimize single tokens. You can use token mode to optimize index tokens that are frequently searched, without spending time on optimizing tokens that are rarely referenced.
For example, you can specify that only the token DOG be optimized in the index, if you know that this token is updated and queried frequently.
An optimized token can improve query response time for the token.
To optimize an index in token mode, use CTX_DDL.OPTIMIZE_INDEX.
5.7.3.4 Viewing Index Fragmentation and Garbage Data
With the CTX_REPORT.INDEX_STATS
procedure, you can create a statistical report on your index. The report includes information on optimal row fragmentation, a list of most fragmented tokens, and the amount of garbage data in your index. Although this report might take a long time to run for large indexes, it can help you decide whether to optimize your index.
See Also:
Oracle Text Reference to learn more about using the CTX_REPORT.INDEX_STATS
procedure