Oracle Text Application Developer's Guide
Release 9.0.1

Part Number A90122-01
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback

Go to previous page Go to beginning of chapter Go to next page

Indexing, 6 of 6


Managing DML Operations for a CONTEXT Index

DML operations to the base table refer to when documents are inserted, updated or deleted from the base table. This section describes how you can monitor, synchronize, and optimize the Oracle Text CONTEXT index when DML operations occur.


Note:

CTXCAT indexes are transactional and thus updated immediately when there is an update to the base table. Manual synchronization as described in this section is not necessary for a CTXCAT index. 


Viewing Pending DML

When documents in the base table are inserted, updated, or deleted, their ROWIDs are held in a DML queue until you synchronize the index. You can view this queue with the CTX_USER_PENDING view.

For example, to view pending DML on all your indexes, issue the following statement:

SELECT pnd_index_name, pnd_rowid, to_char(pnd_timestamp, 'dd-mon-yyyy 
hh24:mi:ss') timestamp FROM ctx_user_pending;

This statement gives output in the form:

PND_INDEX_NAME                 PND_ROWID          TIMESTAMP
------------------------------ ------------------ --------------------
MYINDEX                        AAADXnAABAAAS3SAAC 06-oct-1999 15:56:50


See Also:

Oracle Text Reference to learn more about this view. 

Synchronizing the Index

Synchronizing the index involves processing all pending updates, inserts, and deletes to the base table. You can do this in PL/SQL with the CTX_DDL.SYNC_INDEX procedure.

The following example synchronizes the index with 2 megabytes of memory:

begin

ctx_ddl.sync_index('myindex', '2M');
end;

Setting Background DML

You can set CTX_DDL.SYNC_INDEX to run automatically at regular intervals using the DBMS_JOB.SUBMIT procedure. Oracle Text includes a SQL script you can use to do this. The location of this script is:

$ORACLE_HOME/ctx/sample/script/drjobdml.sql

To use this script, you must be the index owner and you must have execute privileges on the CTX_DDL package. You must also set the job_queue_processes parameter in your Oracle initialization file.

For example, to set the index synchronization to run every 360 minutes on myindex, you can issue the following in SQL*Plus:

SQL> @drjobdml myindex 360

See Also:

Oracle Text Reference to learn more about the CTX_DDL.SYNC_INDEX command syntax. 

Index Optimization

Frequent index synchronization can fragment your CONTEXT index. Index fragmentation can adversely affect query response time. You can optimize your CONTEXT index to reduce fragmentation and index size and so improve query performance.

To understand index optimization, you must understand the structure of the index and what happens when it is synchronized.

CONTEXT Index Structure

The CONTEXT index is an inverted index where each word contains the list of documents that contain that word. For example, after a single initial indexing operation, the word DOG might have an entry as follows:

DOG DOC1 DOC3 DOC5

Index Fragmentation

When new documents are added to the base table, the index is synchronized by adding new rows. Thus if you add a new document (DOC 7) with the word dog to the base table and synchronize the index, you now have:

DOG DOC1 DOC3 DOC5
DOG DOC7

Subsequent DML will also create new rows:

DOG DOC1 DOC3 DOC5
DOG DOC7
DOG DOC9
DOG DOC11

Adding new documents and synchronizing the index causes index fragmentation. In particular, background DML which synchronizes the index frequently generally produces more fragmentation than synchronizing in batch.

Less frequent batch processing results in longer document lists, reducing the number of rows in the index and hence reducing fragmentation.

You can reduce index fragmentation by optimizing the index in either FULL or FAST mode with CTX_DDL.OPTIMIZE_INDEX.

Document Invalidation and Garbage Collection

When documents are removed from the base table, Oracle Text marks the document as removed but does not immediately alter the index.

Because the old information takes up space and can cause extra overhead at query time, you must remove the old information from the index by optimizing it in FULL mode. This is called garbage collection.

Optimizing in FULL mode for garbage collection is necessary when you have frequent updates or deletes to the base table.

Single Token Optimization

In addition to optimizing the entire index, you can optimize single tokens. You can use token mode to optimize index tokens that are frequently searched, without spending time on optimizing tokens that are rarely referenced.

For example, you can specify that only the token DOG be optimized in the index, if you know that this token is updated and queried frequently.

An optimized token can improve query response time for the token.

To optimize an index in token mode, you can use CTX_DDL.OPTIMIZE_INDEX.

Examples: Optimizing the Index

To optimize an index, Oracle recommends that you use CTX_DDL.OPTIMIZE_INDEX.

See Also:

Oracle Text Reference for the CTX_DDL.OPTIMIZE_INDEX command syntax and examples. 


Go to previous page Go to beginning of chapter Go to next page
Oracle
Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback