Skip Headers

Oracle® Text Reference
10g Release 1 (10.1)

Part Number B10730-01
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Feedback

Go to previous page
Previous
Go to next page
Next
View PDF

1 SQL Statements and Operators

This chapter describes the SQL statements and Oracle Text operators you use for creating and managing Text indexes and performing Text queries.

The following statements are described in this chapter:


ALTER INDEX


Note:

This section describes the ALTER INDEX statement as it pertains to managing a Text domain index.

For a complete description of the ALTER INDEX statement, see Oracle Database SQL Reference.


Purpose

Use ALTER INDEX to perform the following maintenance tasks for a CONTEXT, CTXCAT, or CTXRULE index:


All Indextypes

You can use ALTER INDEX to perform the following task on all Oracle Text index types:


CONTEXT and CTXRULE Indextypes

You can use ALTER INDEX to perform the following task on CONTEXT and CTXRULE indextypes:

ALTER INDEX RENAME Syntax

Use the following syntax to rename an index or index partition:

ALTER INDEX [schema.]index_name RENAME TO new_index_name;  

ALTER INDEX [schema.]index_name RENAME PARTITION part_name TO new_part_name;
[schema.]index_name

Specify the name of the index to rename.

new_index_name

Specify the new name for schema.index. The new_index_name parameter can be no more than 25 bytes. If you specify a name longer than 25 bytes, Oracle Text returns an error and the renamed index is no longer valid.


Note:

When new_index_name is more than 25 bytes and less than 30 bytes, Oracle Text renames the index, even though the system returns an error. To drop the index and associated tables, you must DROP new_index_name with the DROP INDEX statement and then re-create and drop index_name.

part_name

Specify the name of the index partition to rename.

new_part_name

Specify the new name for partition.

ALTER INDEX REBUILD Syntax

The following syntax is used to rebuild the index, rebuild an index partition, resume a failed operation, perform batch DML, replace index metadata, add stopwords to index, add sections and stop sections to index, or optimize the index:

ALTER INDEX [schema.]index REBUILD [PARTITION partname] [ONLINE] [PARAMETERS (paramstring)][PARALLEL N] ;
PARTITION partname

Rebuilds the index partition partname. Only one index partition can be built at a time.

When you rebuild a partition you can specify only SYNC, OPTIMIZE FULL/FAST, RESUME, or REPLACE in paramstring. These operations work only on the partname you specify. You cannot specify RESUME when you rebuild partitions or a partitioned index.

With the REPLACE operation, you can only specify MEMORY and STORAGE for each index partition.


Adding Partitions

To add a partition to the base table, use the ALTER TABLE SQL statement. When you add a partition to an indexed table, Oracle Text automatically creates the metadata for the new index partition. The new index partition has the same name as the new table partition. You can change the index partition name with ALTER INDEX RENAME. To populate the new index partition, you must rebuild it with ALTER INDEX REBUILD.


Splitting or Merging Partitions

Splitting or merging a table partition with ALTER TABLE renders the index partition(s) invalid. You must rebuild them with ALTER INDEX REBUILD.

[ONLINE]

Optionally specify the ONLINE parameter for nonblocking operation, which enables the index to be queried during an ALTER INDEX synchronize or optimize operation.

ONLINE enables you to continue to perform updates, inserts, and deletes on a base table; it does not enable you to query the base table.

You cannot use PARALLEL with ONLINE. ONLINE is only supported for CONTEXT indexes.


Note:

You can specify replace or resume when rebuilding and index ONLINE, but you cannot specify replace or resume when rebuilding and index partition ONLINE.

PARALLEL n

Optionally specify with n the parallel degree for parallel indexing. This parameter is supported only when you use SYNC, REPLACE, and RESUME in paramstring. The actual degree of parallelism might be smaller depending on your resources.

Parallel indexing can speed up indexing when you have large amounts of data to index and when your operating system supports multiple CPUs.

You cannot use PARALLEL with ONLINE.

PARAMETERS (paramstring)

Optionally specify paramstring. If you do not specify paramstring, Oracle Text rebuilds the index with existing preference settings.

The syntax for paramstring is as follows:

paramstring = 
'REPLACE 
     [DATASTORE datastore_pref] 
     [FILTER filter_pref] 
     [LEXER lexer_pref] 
     [WORDLIST wordlist_pref] 
     [STORAGE storage_pref] 
     [STOPLIST stoplist] 
     [SECTION GROUP section_group]
     [MEMORY memsize]
     [INDEX SET index_set]
 
     [METADATA preference new_preference]
     [[METADATA] SYNC (MANUAL | EVERY "interval-string" | ON COMMIT)]
     [[METADATA] TRANSACTIONAL|NONTRANSACTIONAL

| RESUME [memory memsize]
| OPTIMIZE [token index_token | fast | full [maxtime (time | unlimited)]
| SYNC [memory memsize]
| ADD STOPWORD word [language language]
| ADD ZONE SECTION section_name tag tag
| ADD FIELD SECTION section_name tag tag [(VISIBLE | INVISIBLE)]
| ADD ATTR SECTION section_name tag tag@attr
| ADD STOP SECTION tag'
REPLACE [optional_preference_list]

Rebuilds an index. You can optionally specify preferences, your own or system-defined.

You can only replace preferences that are supported for that index type. For instance, you cannot replace index set for a CONTEXT or CTXRULE index. Similarly, for the CTXCAT index type, you can replace only lexer, wordlist, storage index set, and memory preferences.

If you are rebuilding a partitioned index with REPLACE, you can only specify STORAGE and MEMORY.


See Also:

Chapter 2, " Oracle Text Indexing Elements" for more information about creating and setting preferences, including information about system-defined preferences.

REPLACE METADATA preference new_preference

Replaces the existing preference class settings, including SYNC parameters, of the index with the settings from new_preference. Only index preferences and attributes are replaced. The index is not rebuilt.

This command is useful for when you want to replace a preference and its attribute settings after the index is built, without reindexing all data. Reindexing data can result in significant time and computing resources.

This command is also useful for changing the type of SYNC, which can be automatic, manual, or on-commit.

ALTER INDEX REBUILD PARAMETER ('REPLACE METADATA') does not work for a local partitioned index at the index (global) level; you cannot, for example, use this syntax to change a global preference, such as filter or lexer type, without rebuilding the index. Use CTX_DDL.REPLACE_INDEX_METADATA instead.


When is the METADATA keyword ok to use?

This command is intended only for when the change in index metadata would not lead to an inconsistent index, which can lead to incorrect query results.

For example, you can use this command in the following instances:

These changes are safe and would not lead to an inconsistent index that might adversely affect your query results


Caution:

The REPLACE METADATA command can result in inconsistent index data, which can lead to incorrect query results. As such, Oracle does not recommend using this command, unless you carefully consider the effect it will have on the consistency of your index data and subsequent queries.


There can be many instances when changing metadata can result in inconsistent index data. For example, Oracle does not advise you to use the METADATA keyword after doing the following:

In these unsafe cases, Oracle recommends rebuilding the index.

[METADATA] REPLACE SYNC (MANUAL | EVERY "interval-string" | ON COMMIT)

Specify SYNC for automatic synchronization of the CONTEXT index when there is DML to the base table. You can specify one of the following SYNC methods:

SYNC type Description
MANUAL No automatic synchronization. This is the default. You must manually synchronize the index with CTX_DDL.SYNC_INDEX.

Use MANUAL to disable ON COMMIT and EVERY synchronization.

EVERY interval-string Automatically synchronize the index at a regular interval specified by the value of interval-string. interval-string takes the same syntax as that for scheduler jobs. Automatic synchronization using EVERY requires that the index creator have CREATE JOB privileges.

Make sure that interval-string is set to a long enough period that any previous sync jobs will have completed; otherwise, the sync job may hang. interval-string must be enclosed in double quotes.

See Enabling Automatic Index Synchronization for an example of automatic sync syntax.

ON COMMIT Synchronize the index immediately after a commit. The commit does not return until the sync is complete. (Since the synchronization is performed as a separate transaction, there may be a period, usually small, when the data is committed but index changes are not.)

The operation uses the memory specified with the memory parameter.

Note that the sync operation has its own transaction context. If this operation fails, the data transaction still commits. Index synchronization errors are logged in the CTX_USER_INDEX_ERRORS view. See Viewing Index Errors under CREATE INDEX.

See Enabling Automatic Index Synchronization for an example of ON COMMIT syntax.


Each partition of a locally partitioned index can have its own type of sync (ON COMMIT, EVERY, or MANUAL). The type of sync specified in master parameter strings applies to all index partitions unless a partition specifies its own type.

With automatic (EVERY) synchronization, users can specify memory size and parallel synchronization. That syntax is:

... EVERY interval_string MEMORY mem_size PARALLEL paradegree ...

ON COMMIT synchronizations can only be executed serially and at the same memory size as at index creation.


Note:

This command rebuilds the index. When you want to change the SYNC setting without rebuilding the index, use the REBUILD METADATA SYNC (MANUAL | ON COMMIT) operation.

[METADATA] TRANSACTIONAL | NONTRANSACTIONAL

This parameter enables you to turn the TRANSACTIONAL property on or off. For more on TRANSACTIONAL, see "TRANSACTIONAL" in this book.

Using this parameter only succeeds if there are no rows in the DML pending queue. Therefore, you may need to sync the index before issuing this command.

To turn on TRANSACTIONAL index property:

ALTER INDEX myidx REBUILD PARAMETERS('replace metadata transactional');

or

ALTER INDEX myidx REBUILD PARAMETERS('replace  transactional');

To turn off TRANSACTIONAL index property:

ALTER INDEX myidx REBUILD PARAMETERS('replace metadata nontransactional');

or

ALTER INDEX myidx REBUILD PARAMETERS('replace  nontransactional');
RESUME [MEMORY memsize]

Resumes a failed index operation. You can optionally specify the amount of memory to use with memsize.


Note:

This ALTER INDEX operation applies only to CONTEXT and CTXRULE indexes. It does not apply to CTXCAT indexes.


OPTIMIZE [token index_token | fast | full [maxtime (time | unlimited)]

Note:

This ALTER INDEX operation will not be supported in future releases.

To optimize your index, use CTX_DDL.OPTIMIZE_INDEX.


Optimizes the index. Specify token, fast, or full optimization. You typically optimize after you synchronize the index.

When you optimize in token mode, Oracle Text optimizes only index_token. Use this method of optimization to quickly optimize index information for specific words.

When you optimize in fast mode, Oracle Text works on the entire index, compacting fragmented rows. However, in fast mode, old data is not removed.

When you optimize in full mode, you can optimize the whole index or a portion. This method compacts rows and removes old data (deleted rows).


Note:

Optimizing in full mode runs even when there are no deleted document rows. This is useful when you need to optimize time-limited batches with the maxtime parameter.

You use the maxtime parameter to specify in minutes the time Oracle Text is to spend on the optimization operation. Oracle Text starts the optimization where it left off and optimizes until complete or until the time limit has been reached, whichever comes first. Specifying a time limit is useful for automating index optimization, where you set Oracle Text to optimize the index for a specified time on a regular basis.

When you specify maxtime unlimited, the entire index is optimized. This is the default. When you specify 0 for maxtime, Oracle Text performs minimal optimization.

You can log the progress of optimization by writing periodic progress updates to the CTX_OUTPUT log. An event for CTX_OUTPUT.ADD_EVENT, called CTX_OUTPUT.EVENT_OPT_PRINT_TOKEN, prints each token as it is being optimized.


Note:

This ALTER INDEX operation applies only to CONTEXT and CTXRULE indexes. It does not apply to CTXCAT indexes.


SYNC [MEMORY memsize]

Note:

This ALTER INDEX operation will not be supported in future releases.

To synchronize your index, use CTX_DDL.SYNC_INDEX.


Synchronizes the index. You can optionally specify the amount of runtime memory to use with memsize. You synchronize the index when you have DML operations on your base table.


Note:

This ALTER INDEX operation applies only to CONTEXT and CTXRULE indexes. It does not apply to CTXCAT indexes.


Memory Considerations

The memory parameter memsize specifies the amount of memory Oracle Text uses for the ALTER INDEX operation before flushing the index to disk. Specifying a large amount of memory improves indexing performance because there is less I/O and improves query performance and maintenance because there is less fragmentation.

Specifying smaller amounts of memory increases disk I/O and index fragmentation, but might be useful if you want to track indexing progress or when run-time memory is scarce.

ADD STOPWORD word [language language]

Dynamically adds a stopword word to the index.

Index entries for word that existed before this operation are not deleted. However, subsequent queries on word are treated as though it has always been a stopword.

When your stoplist is a multi-language stoplist, you must specify language.

The index is not rebuilt by this statement.

ADD ZONE SECTION section_name tag tag

Dynamically adds the zone section section_name identified by tag to the existing index.

The added section section_name applies only to documents indexed after this operation. For the change to take effect, you must manually re-index any existing documents that contain the tag.

The index is not rebuilt by this statement.


Note:

This ALTER INDEX operation applies only to CONTEXT and CTXRULE indexes. It does not apply to ctxcat indexes.

ADD FIELD SECTION section_name tag tag [(VISIBLE | INVISIBLE)]

Dynamically adds the field section section_name identified by tag to the existing index.

Optionally specify VISIBLE to make the field sections visible. The default is INVISIBLE.


See Also:

CTX_DDL.ADD_FIELD_SECTION for more information on visible and invisible field sections.

The added section section_name applies only to documents indexed after this operation. For the change to affect previously indexed documents, you must explicitly re-index the documents that contain the tag.

The index is not rebuilt by this statement.


Note:

This ALTER INDEX operation applies only to CONTEXT CTXRULE indexes. It does not apply to CTXCAT indexes.


See Also:

"Add Section Constraints" in this section.

ADD ATTR SECTION section_name tag tag@attr

Dynamically adds an attribute section section_name to the existing index. You must specify the XML tag and attribute in the form tag@attr. You can add attribute sections only to XML section groups.

The added section section_name applies only to documents indexed after this operation. Thus for the change to take effect, you must manually re-index any existing documents that contain the tag.

The index is not rebuilt by this statement.


Note:

This ALTER INDEX operation applies only to CONTEXT CTXRULE indexes. It does not apply to CTXCAT indexes.


See Also:

"Add Section Constraints" in this section.

ADD STOP SECTION tag

Dynamically adds the stop section identified by tag to the existing index. As stop sections apply only to automatic sectioning of XML documents, the index must use the AUTO_SECTION_GROUP section group. The tag you specify must be case sensitive and unique within the automatic section group or else ALTER INDEX raises an error.

The added stop section tag applies only to documents indexed after this operation. For the change to affect previously indexed documents, you must explicitly re-index the documents that contain the tag.

The text within a stop section is always searchable.

The number of stop sections you can add is unlimited.

The index is not rebuilt by this statement.


Note:

This ALTER INDEX operation applies only to CONTEXT indexes. It does not apply to CTXCAT indexes.


Add Section Constraints

Before altering the index section information, Oracle Text checks the new section against the existing sections to ensure that all validity constraints are met. These constraints are the same for adding a section to a section group with the CTX_DDL PL/SQL package and are as follows:

ALTER INDEX Examples


Resuming Failed Index

The following statement resumes the indexing operation on newsindex with 2 megabytes of memory:

ALTER INDEX newsindex REBUILD PARAMETERS('resume memory 2M');

Rebuilding an Index

The following statement rebuilds the index, replacing the stoplist preference with new_stop.

ALTER INDEX newsindex REBUILD PARAMETERS('replace stoplist new_stop');

Rebuilding a Partitioned Index

The following example creates a partitioned text table, populates it, and creates a partitioned index. It then adds a new partition to the table and then rebuilds the index with ALTER INDEX:

PROMPT create partitioned table and populate it

create table part_tab (a int, b varchar2(40)) partition by range(a)
(partition p_tab1 values less than (10),
 partition p_tab2 values less than (20),
 partition p_tab3 values less than (30));

insert into part_tab values (1,'Actinidia deliciosa');
insert into part_tab values (8,'Distictis buccinatoria');
insert into part_tab values (12,'Actinidia quinata');
insert into part_tab values (18,'Distictis Rivers');
insert into part_tab values (21,'pandorea jasminoides Lady Di');
insert into part_tab values (28,'pandorea rosea');

commit;

PROMPT create partitioned index
create index part_idx on part_tab(b) indextype is ctxsys.context
local (partition p_idx1, partition p_idx2, partition p_idx3);
PROMPT add a partition and populate it
alter table part_tab add partition p_tab4 values less than (40);
insert into part_tab values (32, 'passiflora citrina');
insert into part_tab values (33, 'passiflora alatocaerulea');
commit;

The following statement rebuilds the index in the newly populated partition. In general, the index partition name for a newly added partition is the same as the table partition name, unless it is already been used. In this case, Oracle Text generates a new name.

alter index part_idx rebuild partition p_tab4;

The following statement queries the table for the two hits in the newly added partition:

select * from part_tab where contains(b,'passiflora') >0;

The following statement queries the newly added partition directly:

select * from part_tab partition (p_tab4) where contains(b,'passiflora') >0;

Replacing Index Metadata: Changing Single-lexer to Multi-lexer

The following example demonstrates how an application can migrate from single-language documents (English) to multi-language documents (English and Spanish) by replacing the index metadata for the lexer.

REM create a simple table, which stores only english (American) text

create table simple (text varchar2(80));
insert into simple values ('the quick brown fox');
commit;

REM we'll create a simple lexer to lex this english text

begin
  ctx_ddl.create_preference('us_lexer','basic_lexer');
end;
/

REM create a text index on the simple table
create index simple_idx on simple(text)
indextype is ctxsys.context parameters ('lexer us_lexer');

REM we can query easily
select * from simple where contains(text, 'fox')>0;

REM now suppose we want to start accepting spanish documents.
REM first we have to extend the table with a language column
alter table simple add (lang varchar2(10) default 'us');

REM now let's create a spanish lexer, 
begin
  ctx_ddl.create_preference('e_lexer','basic_lexer');
  ctx_ddl.set_attribute('e_lexer','base_letter','yes');
end;
/
REM Then we create a multi-lexer incorporating our english and spanish lexers. 
REM Note that the DEFAULT lexer is the exact same lexer that we have already
REM indexed all the documents with.
begin
  ctx_ddl.create_preference('m_lexer','multi_lexer');
  ctx_ddl.add_sub_lexer('m_lexer','default','us_lexer');
  ctx_ddl.add_sub_lexer('m_lexer','spanish','e_lexer');
end;
/
REM now let's replace our metadata
alter index simple_idx rebuild 
parameters ('replace metadata language column lang lexer m_lexer');

REM we're ready for some spanish data.  Note that we could have inserted
REM this BEFORE the alter index, as long as we didn't SYNC.
insert into simple values ('el zorro marrón rápido', 'e');
commit;
exec ctx_ddl.sync_index('simple_idx');
REM now we can query the spanish data with base lettering:
select * from simple where contains(text, 'rapido')>0;

Optimizing the Index

Optimizing your index with ALTER INDEX will not be supported in future releases. To optimize your index, use CTX_DDL.OPTIMIZE_INDEX.


Synchronizing the Index

Synchronizing the index with ALTER INDEX will not be supported in future releases. To synchronize your index, use CTX_DDL.SYNC_INDEX.


Adding a Zone Section

To add to the index the zone section author identified by the tag <author>, issue the following statement:

ALTER INDEX myindex REBUILD PARAMETERS('add zone section author tag author');

Adding a Stop Section

To add a stop section identified by tag <fluff> to the index that uses the AUTO_SECTION_GROUP, issue the following statement:

ALTER INDEX myindex REBUILD PARAMETERS('add stop section fluff');

Adding an Attribute Section

Assume that the following text appears in an XML document:

<book title="Tale of Two Cities">It was the best of times.</book>

You want to create a separate section for the title attribute and you want to name the new attribute section booktitle. To do so, issue the following statement:

ALTER INDEX myindex REBUILD PARAMETERS('add attr section booktitle tag title@book');

Related Topics

CTX_DDL.SYNC_INDEX in Chapter 7, " CTX_DDL Package"

CTX_DDL.OPTIMIZE_INDEX in Chapter 7, " CTX_DDL Package"

CREATE INDEX


ALTER TABLE: Supported Partitioning Statements


Note:

This section describes the ALTER TABLE statement as it pertains to adding and modifying a partitioned text table with a context domain index.

For a complete description of the ALTER TABLE statement, see Oracle Database SQL Reference.


Purpose

You can use ALTER TABLE to add, modify, split, merge, exchange, or drop a partitioned text table with a context domain index. The following sections describe some of the ALTER TABLE operations you can issue.

Modify Partition Syntax


Unusable Local Indexes
ALTER TABLE  [schema.]table MODIFY PARTITION partition UNUSABLE LOCAL INDEXES

Marks the index partition corresponding to the given table partition UNUSABLE. You might mark an index partition unusable before you rebuild the index partition as described in Rebuild Unusable Local Indexes.

If the index partition is not marked unusable, the rebuild command returns without actually rebuilding the local index partition.


Rebuild Unusable Local Indexes
ALTER TABLE  [schema.]table MODIFY PARTITION partition REBUILD UNUSABLE LOCAL INDEXES

Rebuilds the index partition corresponding to the specified table partition that has an UNUSABLE status.


Note:

If the index partition status is already VALID before you issue this command, this command does NOT rebuild the index partition. Do not depend on this command to rebuild the index partition unless the index partition status is UNUSABLE.

Add Partition Syntax

ALTER TABLE [schema.]table ADD PARTITION [partition] 
VALUES LESS THAN (value_list) [partition_description]

Adds a new partition to the high end of a range partitioned table.

To add a partition to the beginning or to the middle of the table, use ALTER TABLE SPLIT PARTITION.

The newly added table partition is always empty, and the context domain index (if any) status for this partition is always VALID. After doing DML, if you want to synchronize or optimize this newly added index partition, you must look up the index partition name, and issue the ALTER INDEX REBUILD PARTITION command. For this newly added partition, index partition name is usually the same as the table partition name, but if the table partition name is already used by another index partition, the system assigns a name in the form of SYS_Pn.

By querying the USER_IND_PARTITIONS view and comparing the HIGH_VALUE field, you can determine the index partition name for the newly added partition.

Merge Partition Syntax

ALTER TABLE [schema.]table 
MERGE PARTITIONS partition1, partition2 
[INTO PARTITION [new_partition] [partition_description]]
[UPDATE GLOBAL INDEXES]

Applies only to a range partition. This command merges the contents of two adjacent partitions into a new partition and then drops the original two partitions. If the resulting partition is non-empty, the corresponding local domain index partition is marked UNUSABLE. Users can use ALTER TABLE MODIFY PARTITION to rebuild the partition index.

For a global index, if you perform the merge operation without an UPDATE GLOBAL INDEXES clause, the resulting index (if not NULL) will be invalid and must be rebuilt. If you specify the UPDATE GLOBAL INDEXES clause after the operation, the index will be valid, but you will still need to synchronize the index with CTX_DDL.SYNC_INDEX for the update to take place, if the sync type is manual.

The naming convention for the resulting index partition is the same as in ALTER TABLE ADD PARTITION.

Split Partition Syntax

ALTER TABLE [schema.]table
SPLIT PARTITION partition_name_old 
AT (value_list)
[into (partition_description, partition_description)]
[prallel_clause]
[UPDATE GLOBAL INDEXES]

Applies only to range partition. This command divides a table partition into two partitions, thus adding a new partition to the table. The local corresponding index partitions will be marked UNUSABLE if the corresponding table partitions are non-empty. You can use ALTER TABLE MODIFY PARTITION to rebuild the partition indexes.

For a global index, if you perform the split operation without an UPDATE GLOBAL INDEXES clause, the resulting index (if not NULL) will be invalid and must be rebuilt. If you specify the UPDATE GLOBAL INDEXES clause after the operation, the index will be valid, but you will still need to synchronize the index with CTX_DDL.SYNC_INDEX for the update to take place, if the sync type is manual.

The naming convention for the two resulting index partition is the same as in ALTER TABLE ADD PARTITION.

Exchange Partition Syntax

ALTER TABLE [schema.]table EXCHANGE PARTITION partition WITH TABLE table
[INCLUDING|EXCLUDING INDEXES}
[WITH|WITHOUT VALIDATION]
[EXCEPTIONS INTO [schema.]table]
[UPDATE GLOBAL INDEXES]

Converts a partition to a non-partitioned table, and converts a table to a partition of a partitioned table by exchanging their data segments. Rowids are preserved.

If EXCLUDING INDEXES is specified, all the context indexes corresponding to the partition and all the indexes on the exchanged table are marked as UNUSABLE. To rebuild the new index partition this case, you can issue ALTER TABLE MODIFY PARTITION.

If INCLUDING INDEXES is specified, then for every local domain index on the partitioned table, there must be a non-partitioned domain index on the non-partitioned table. The local index partitions are exchanged with the corresponding regular indexes.

For a global index, if you perform the exchange operation without an UPDATE GLOBAL INDEXES clause, the resulting index (if not NULL) will be invalid and must be rebuilt. If you specify the UPDATE GLOBAL INDEXES clause after the operation, the index will be valid, but you will still need to synchronize the index with CTX_DDL.SYNC_INDEX for the update to take place, if the sync type is manual.


Field Sections

Field section queries might not work the same if the non-partitioned index and local index use different section id's for the same field section.


Storage

Storage is not changed. So if the index on the non-partitioned table $I table was in tablespace XYZ, then after the exchange partition it will still be in tablespace XYZ, but now it is the $I table for an index partition.

Storage preferences are not switched, so if you switch and then rebuild the index the table may be created in a different location.


Restrictions

Both indexes must be equivalent. They must use the same objects, same settings for each object. Note: we only check that they are using the same object. But they should use the same exact everything.

No index object can be partitioned, that is, when the user has used the storage object to partition the $I, $N tables.

If either index or index partition does not meet all these restrictions an error is raised and both the index and index partition will be INVALID. The user needs to manually rebuild both index and index partition using ALTER INDEX REBUILD.

Truncate Partition Syntax

ALTER TABLE [schema.]table TRUNCATE PARTITION [DROP|REUSE STORAGE] [UPDATE GLOBAL INDEXES]

Removes all rows from a partition in a table. Corresponding CONTEXT index partitions are also removed.

For a global index, if you perform the truncate operation without an UPDATE GLOBAL INDEXES clause, the resulting index (if not NULL) will be invalid and must be rebuilt. If you specify the UPDATE GLOBAL INDEXES clause after the operation, the index will be valid.

ALTER TABLE Examples


Global Index on Partitioned Table Examples

The following example creates a range partitioned table with three partitions. Each partition is populated with two rows. A global context index is then created. To demonstrate the UPDATE GLOBAL INDEXES clause, the partitions are split and merged with an index synchronization.

create table tdrexglb_part(a int, b varchar2(40)) partition by range(a)
(partition p1 values less than (10),
 partition p2 values less than (20),
 partition p3 values less than (30));

insert into tdrexglb_part values (1,'row1');
insert into tdrexglb_part values (8,'row2');
insert into tdrexglb_part values (11,'row11');
insert into tdrexglb_part values (18,'row18');
insert into tdrexglb_part values (21,'row21');
insert into tdrexglb_part values (28,'row28');

commit;
create index tdrexglb_parti on tdrexglb_part(b) indextype is ctxsys.context;

create table tdrexglb(a int, b varchar2(40));

insert into tdrexglb values(20,'newrow20');
commit;


PROMPT make sure query works
select * from tdrexglb_part where contains(b,'row18') >0;

PROMPT split partition
alter table tdrexglb_part split partition p2 at (15) into
(partition p21, partition p22) update global indexes;

PROMPT before sync
select * from tdrexglb_part where contains(b,'row11') >0;
select * from tdrexglb_part where contains(b,'row18') >0;

exec ctx_ddl.sync_index('tdrexglb_parti')

PROMPT after sync
select * from tdrexglb_part where contains(b,'row11') >0;
select * from tdrexglb_part where contains(b,'row18') >0;

PROMPT merge partition
alter table tdrexglb_part merge partitions p22, p3 
into partition pnew3 update global indexes;

PROMPT before sync
select * from tdrexglb_part where contains(b,'row18') >0;
select * from tdrexglb_part where contains(b,'row28') >0;
exec ctx_ddl.sync_index('tdrexglb_parti');

PROMPT after sync
select * from tdrexglb_part where contains(b,'row18') >0;
select * from tdrexglb_part where contains(b,'row28') >0;

PROMPT drop partition 
alter table tdrexglb_part drop partition p1 update global indexes;

PROMPT before sync
select * from tdrexglb_part where contains(b,'row1') >0;
exec ctx_ddl.sync_index('tdrexglb_parti');

PROMPT after sync
select * from tdrexglb_part where contains(b,'row1') >0;

PROMPT exchange partition
alter table tdrexglb_part exchange partition pnew3 with table
tdrexglb update global indexes;

PROMPT before sync
select * from tdrexglb_part where contains(b,'newrow20') >0;
select * from tdrexglb_part where contains(b,'row28') >0;

exec ctx_ddl.sync_index('tdrexglb_parti');
PROMPT after sync
select * from tdrexglb_part where contains(b,'newrow20') >0;
select * from tdrexglb_part where contains(b,'row28') >0;

PROMPT move table partition
alter table tdrexglb_part move partition p21 update global indexes;
PROMPT before sync
select * from tdrexglb_part where contains(b,'row11') >0;

exec ctx_ddl.sync_index('tdrexglb_parti');
PROMPT after sync
select * from tdrexglb_part where contains(b,'row11') >0;

PROMPT truncate table partition
alter table tdrexglb_part truncate partition p21 update global indexes;

update global indexes;



CATSEARCH

Use the CATSEARCH operator to search CTXCAT indexes. Use this operator in the WHERE clause of a SELECT statement.

The grammar of this operator is called CTXCAT. You can also use the CONTEXT grammar if your search criteria requires special functionality, such as thesaurus, fuzzy matching, proximity searching or stemming. To utilize the CONTEXT grammar, use the Query Template Specification in the text_query parameter as described in this section.

About Performance

You use the CATSEARCH operator with a CTXCAT index mainly to improve mixed query performance. You specify your text query condition with text_query and your structured condition with structured_query.

Internally, Oracle Text uses a combined b-tree index on text and structured columns to quickly produce results satisfying the query.

Limitation

If the optimizer chooses to use the functional query invocation, your query will fail. The optimizer might choose functional invocation when your structured clause is highly selective.

Syntax

CATSEARCH(
[schema.]column,
text_query       VARCHAR2,
structured_query VARCHAR2,
RETURN NUMBER;
[schema.]column

Specify the text column to be searched on. This column must have a CTXCAT index associated with it.

text_query

Specify one of the following to define your search in column.


CATSEARCH query operations

The CATSEARCH operator supports only the following query operations:

These operators have the following syntax:

Operation Syntax Description of Operation
Logical AND a b c Returns rows that contain a, b and c.
Logical OR a | b | c Returns rows that contain a, b, or c.
Logical NOT a - b Returns rows that contain a and not b.
hyphen with no space a-b Hyphen treated as a regular character.

For example, if the hyphen is defined as skipjoin, words such as web-site are treated as the single query term website.

Likewise, if the hyphen is defined as a printjoin, words such as web-site are treated as web-site in the CTXCAT query language.

" " "a b c" Returns rows that contain the phrase "a b c".

For example, entering "Sony CD Player" means return all rows that contain this sequence of words.

( ) (A B) | C Parentheses group operations. This query is equivalent to the CONTAINS query (A &B) | C.
wildcard

(right and double truncated)

term*

a*b

The wildcard character matches zero or more characters.

For example, do* matches dog, and gl*s matches glass.

Left truncation not supported.

Note: Oracle recommends that you create a prefix index if your application uses wildcard searching. You set prefix indexing with the BASIC_WORDLIST preference.


The following limitations apply to these operators:

For example, these expressions are supported:

catsearch(text, 'dog', 'foo > 15')
catsearch(text, 'dog', 'bar = ''SMITH''')
catsearch(text, 'dog', 'foo between 1 and 15')
catsearch(text, 'dog', 'foo = 1 and abc = 123')

And these expression are not supported:

catsearch(text, 'dog', 'upper(bar) = ''A''')
catsearch(text, 'dog', 'bar LIKE ''A%''')
catsearch(text, 'dog', 'foo = abc')
catsearch(text, 'dog', 'foo = 1 or abc = 3')

Query Template Specification

You specify a marked-up string that specifies a query template. You can specify one of the following templates:

structured_query

Specify the structured conditions and the ORDER BY clause. There must exist an index for any column you specify. For example, if you specify 'category_id=1 order by bid_close', you must have an index for 'category_id, bid_close' as specified with CTX_DDL.ADD_INDEX.

With structured_query, you can use standard SQL syntax with only the following operators:

Examples

  1. Create the Table

The following statement creates the table to be indexed.

CREATE TABLE auction (category_id number primary key, title varchar2(20), 
bid_close date);

The following table inserts the values into the table:

INSERT INTO auction values(1, 'Sony CD Player', '20-FEB-2000');
INSERT INTO auction values(2, 'Sony CD Player', '24-FEB-2000');
INSERT INTO auction values(3, 'Pioneer DVD Player', '25-FEB-2000');
INSERT INTO auction values(4, 'Sony CD Player', '25-FEB-2000');
INSERT INTO auction values(5, 'Bose Speaker', '22-FEB-2000');
INSERT INTO auction values(6, 'Tascam CD Burner', '25-FEB-2000');
INSERT INTO auction values(7, 'Nikon digital camera', '22-FEB-2000');
INSERT INTO auction values(8, 'Canon digital camera', '26-FEB-2000');

  1. Create the CTXCAT Index

The following statements create the CTXCAT index:

begin
ctx_ddl.create_index_set('auction_iset');
ctx_ddl.add_index('auction_iset','bid_close'); 
end;
/
CREATE INDEX auction_titlex ON auction(title) INDEXTYPE IS CTXSYS.CTXCAT PARAMETERS ('index set auction_iset');

  1. Query the Table

A typical query with CATSEARCH might include a structured clause as follows to find all rows that contain the word camera ordered by bid_close:

SELECT * FROM auction WHERE CATSEARCH(title, 'camera', 'order by bid_close desc')> 0;

CATEGORY_ID TITLE                BID_CLOSE
----------- -------------------- ---------
          8 Canon digital camera 26-FEB-00
          7 Nikon digital camera 22-FEB-00

The following query finds all rows that contain the phrase Sony CD Player and that have a bid close date of February 20, 2000:

SELECT * FROM auction WHERE CATSEARCH(title, '"Sony CD Player"', 'bid_close=''20-FEB-00''')> 0;

CATEGORY_ID TITLE                BID_CLOSE
----------- -------------------- ---------
          1 Sony CD Player       20-FEB-00

The following query finds all rows with the terms Sony and CD and Player:

SELECT * FROM auction WHERE CATSEARCH(title, 'Sony CD Player', 'order by bid_close desc')> 0;
CATEGORY_ID TITLE                BID_CLOSE
----------- -------------------- ---------
          4 Sony CD Player       25-FEB-00
          2 Sony CD Player       24-FEB-00
          1 Sony CD Player       20-FEB-00

The following query finds all rows with the term CD and not Player:

SELECT * FROM auction WHERE CATSEARCH(title, 'CD - Player', 'order by bid_close desc')> 0;

CATEGORY_ID TITLE                BID_CLOSE
----------- -------------------- ---------
          6 Tascam CD Burner     25-FEB-00

The following query finds all rows with the terms CD or DVD or Speaker:

SELECT * FROM auction WHERE CATSEARCH(title, 'CD | DVD | Speaker', 'order by bid_close desc')> 0;

CATEGORY_ID TITLE                BID_CLOSE
----------- -------------------- ---------
          3 Pioneer DVD Player   25-FEB-00
          4 Sony CD Player       25-FEB-00
          6 Tascam CD Burner     25-FEB-00
          2 Sony CD Player       24-FEB-00
          5 Bose Speaker         22-FEB-00
          1 Sony CD Player       20-FEB-00

The following query finds all rows that are about audio equipment:

SELECT * FROM auction WHERE CATSEARCH(title, 'ABOUT(audio equipment)', NULL)> 0;

CONTEXT Query Grammar Examples

The following examples show how to specify the CONTEXT grammar in CATSEARCH queries using the template feature.

PROMPT
PROMPT fuzzy: query = ?test
PROMPT should match all fuzzy variations of test (for example, text)
select pk||' ==> '||text from test 
where catsearch(text,
'<query> 
  <textquery grammar="context">
     ?test
  </textquery>
  <score datatype="integer"/>
</query>','')>0
order by pk; 

PROMPT
PROMPT fuzzy: query = !sail
PROMPT should match all soundex variations of bot (for example, sell)
select pk||' ==> '||text from test 
where catsearch(text,
'<query> 
  <textquery grammar="context">
     !sail
  </textquery>
  <score datatype="integer"/>
</query>','')>0
order by pk; 

PROMPT
PROMPT theme (ABOUT) query
PROMPT query: about(California)
select pk||' ==> '||text from test 
where catsearch(text,
'<query> 
  <textquery grammar="context">
     about(California)
  </textquery>
  <score datatype="integer"/>
</query>','')>0
order by pk; 

The following example shows a field section search against a CTXCAT index using CONTEXT grammar by means of a query template in a CATSEARCH query.

-- Create and populate table
create table BOOKS (ID number, INFO varchar2(200), PUBDATE DATE);
 
insert into BOOKS values(1, '<author>NOAM CHOMSKY</author><subject>CIVIL
   RIGHTS</subject><language>ENGLISH</language><publisher>MIT
  PRESS</publisher>', '01-NOV-2003');
 
insert into BOOKS values(2, '<author>NICANOR PARRA</author><subject>POEMS 
  AND ANTIPOEMS</subject><language>SPANISH</language>
  <publisher>VASQUEZ</publisher>', '01-JAN-2001');
 
insert into BOOKS values(1, '<author>LUC SANTE</author><subject>XML
  DATABASE</subject><language>FRENCH</language><publisher>FREE
  PRESS</publisher>', '15-MAY-2002');
 
commit;
 
-- Create index set and section group
exec ctx_ddl.create_index_set('BOOK_INDEX_SET');
exec ctx_ddl.add_index('BOOKSET','PUBDATE');
 
exec ctx_ddl.create_section_group('BOOK_SECTION_GROUP',
      'BASIC_SECTION_GROUP');
exec ctx_ddl.add_field_section('BOOK_SECTION_GROUP','AUTHOR','AUTHOR');
exec ctx_ddl.add_field_section('BOOK_SECTION_GROUP','SUBJECT','SUBJECT');
exec ctx_ddl.add_field_section('BOOK_SECTION_GROUP','LANGUAGE','LANGUAGE');
exec ctx_ddl.add_field_section('BOOK_SECTION_GROUP','PUBLISHER','PUBLISHER'); 
 
 
-- Create index
create index books_index on books(info) indextype is ctxsys.ctxcat
  parameters('index set book_index_set section group book_section_group');
 
-- Use the index
-- Note that: even though CTXCAT index can be created with field sections, it
-- cannot be accessed using CTXCAT grammar (default for CATSEARCH).
-- We need to use query template with CONTEXT grammar to access field 
-- sections with CATSEARCH
 
select  id, info from books
where catsearch(info,
'<query>
      <textquery grammar="context">
              NOAM within author and english within language
      </textquery>
 </query>',
'order by pubdate')>0; 

Related Topics

Syntax for CTXCAT Indextype in this chapter.

Oracle Text Application Developer's Guide


CONTAINS

Use the CONTAINS operator in the WHERE clause of a SELECT statement to specify the query expression for a Text query.

CONTAINS returns a relevance score for every row selected. You obtain this score with the SCORE operator.

The grammar for this operator is called CONTEXT. You can also use CTXCAT grammar if your application works better with simpler syntax. To do so, use the Query Template Specification in the text_query parameter as described in this section.

Syntax

CONTAINS(
         [schema.]column,
         text_query    VARCHAR2
         [,label       NUMBER])
RETURN NUMBER;
[schema.]column

Specify the text column to be searched on. This column must have a Text index associated with it.

text_query

Specify one of the following:


Query Rewrite Template

Use this template to automatically write different versions of a query before you submit the query to Oracle Text. This is useful when you need to maximize the recall of a user query. For example, you can program your application to expand a single phrase query of 'cat dog' into the following queries:

{cat} {dog}
{cat} ; {dog}
{cat} AND {dog}
{cat} ACCUM {dog}

These queries are submitted as one query and results are returned with no duplication. In this example, the query returns documents that contain the phrase cat dog as well as documents in which cat is near dog, and documents that have cat and dog.

This is done with the following template:

<query>
   <textquery lang="ENGLISH" grammar="CONTEXT"> cat dog
     <progression>
       <seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
       <seq><<rewrite>transform((TOKENS, "{", "}", " ; "))</rewrite>/seq>
       <seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
       <seq><rewrite>transform((TOKENS, "{", "}", "ACCUM"))</rewrite></seq>
     </progression>
   </textquery>
  <score datatype="INTEGER" algorithm="COUNT"/>
</query>

The operator TRANSFORM is used to specify the rewrite rules and has the following syntax (note that it uses double parentheses):

TRANSFORM((terms, prefix, suffix, connector))
Parameter Description

terms

Specify the type of terms to be prodcued from the original query. You can specify either TOKENS or THEMES

prefix

Specify the literal string to be prepended to all the terms

suffix

Specify the literal string to be appended to all the terms.

connector

Specify the literal string to connect all the terms after applying prefix and suffix.



Query Relaxation Template

Use this template to progressively relax your query. Progressive relaxation is when you increase recall by progressively issuing less restrictive versions of a query, so that your application can return an appropriate number of hits to the user.

For example, the query of black pen can be progressively relaxed to:

black pen
black NEAR pen
black AND pen
black ACCUM pen

This is done with the following template

<query>
   <textquery lang="ENGLISH" grammar="CONTEXT">
     black pen
     <progression>
       <seq>black pen</seq>
       <seq>black NEAR pen</seq>
       <seq>black AND pen</seq>
       <seq>black ACCUM pen</seq>
     </progression>
   </textquery>
   <score datatype="INTEGER" algorithm="COUNT"/>
</query>

Alternate Grammar Template

Use this template to specify an alternate grammar, such as CONTEXT or CATSEARCH. Specifying an alternate grammar enables you to issue queries using different syntax and operators.

For example, with CATSEARCH, you can issue ABOUT queries using the CONTEXT grammar. Likewise with CONTAINS, you can issue logical queries using the simplified CATSEARCH syntax.

The phrase 'dog cat mouse' is interpreted as a phrase in CONTAINS. However, with CATSEARCH this is equivalent to a AND query of 'dog AND cat AND mouse'. To specify that CONTAINS use the alternate grammar, we can issue the following template:

<query> 
  <textquery grammar="CTXCAT">dog cat mouse</textquery>
  <score datatype="integer"/>
</query>

Alternate Language Template

Use this template to specify an alternate language.

<query><textquery lang="french">bon soir</textquery></query>

Alternate Scoring Template

Use this template to specify an alternate scoring algorithm. The following example specifies that the query use the CONTEXT grammar and return integer scores using the COUNT algorithm. This algorithm return score as number of query occurrences in document.

<query>        
 <textquery grammar="CONTEXT" lang="english"> mustang  </textquery>     
 <score datatype="INTEGER" algorithm="COUNT"/>     
</query>

Template Attribute Values

The following table gives the possible values for template attributes:

Tag Attribute Description Possible Values Meaning
grammar= Specify the grammar of the query. CONTEXT

CTXCAT


datatype= Specify the type of number returned as score. INTEGER

FLOAT

Returns score as integer between 0 and 100.

Returns score as its high precision floating point number between 0 and 100.

algorithm= Specify the scoring algorithm to use. DEFAULT

COUNT

Default.

Returns scores as the number of occurrences in document.

lang= Specify the language name. ENGLISH

FRENCH

GERMAN

DUTCH




Template Grammar Definition

The query template interface is an XML document. Its grammar is defined with the following XML DTD:

<!ELEMENT query (textquery, score?)> 
<!ELEMENT textquery (#PCDATA|progression)*> 
<!ELEMENT progression (seq)+> 
<!ELEMENT seq (#PCDATA|rewrite)*> 
<!ELEMENT rewrite (#PCDATA)> 
<!ELEMENT score EMPTY> 
<!ATTLIST textquery grammar (context | ctxcat) #IMPLIED>
<!ATTLIST textquery language CDATA #IMPLIED>
<!ATTLIST score datatype (integer | float) "integer">
<!ATTLIST score algorithm (default | count) "default">

All tags and attributes values are case-sensitive.


See Also:

Chapter 3, "CONTAINS Query Operators" for more information about the operators you can use in query expressions.

label

Optionally specify the label that identifies the score generated by the CONTAINS operator.

Returns

For each row selected, CONTAINS returns a number between 0 and 100 that indicates how relevant the document row is to the query. The number 0 means that Oracle Text found no matches in the row.


Note:

You must use the SCORE operator with a label to obtain this number.

Example

The following example searches for all documents in the in the text column that contain the word oracle. The score for each row is selected with the SCORE operator using a label of 1:

SELECT SCORE(1), title from newsindex 
    WHERE CONTAINS(text, 'oracle', 1) > 0;

The CONTAINS operator must be followed by an expression such as > 0, which specifies that the score value calculated must be greater than zero for the row to be selected.

When the SCORE operator is called (for example, in a SELECT clause), the CONTAINS clause must reference the score label value as in the following example:

SELECT SCORE(1), title from newsindex 
     WHERE CONTAINS(text, 'oracle', 1) > 0 ORDER BY SCORE(1) DESC;

The following example specifies that the query be parsed using the CATSEARCH grammar:

SELECT id FROM test WHERE CONTAINS (text,
 '<query>
   <textquery lang="ENGLISH" grammar="CATSEARCH">
      cheap pokemon
   </textquery>
   <score datatype="INTEGER"/>
  </query>' ) > 0;

Grammar Template Example

The following example shows how to use the CTXCAT grammar in a CONTAINS query. The example creates a CTXCAT and a CONTEXT index on the same table, and compares the query results:

PROMPT create context and ctxcat indexes both with theme indexing on
PROMPT
create index tdrbqcq101x on test(text) indextype is ctxsys.context
parameters ('lexer theme_lexer');

create index tdrbqcq101cx on test(text) indextype is ctxsys.ctxcat
parameters ('lexer theme_lexer');

PROMPT *****  San Diego             ***********
PROMPT *****  CONTEXT grammar       ***********
PROMPT ** should be interpreted as phrase query **
select pk||' ==> '||text from test 
where contains(text,'San Diego')>0
order by pk;

PROMPT *****  San Diego      ***********
PROMPT *****  CTXCAT grammar ***********
PROMPT ** should be interpreted as AND query  ***
select pk||' ==> '||text from test 
where contains(text,
'<query> 
  <textquery grammar="CTXCAT">San Diego</textquery>
  <score datatype="integer"/>
</query>')>0
order by pk;

PROMPT *****  Hitlist from CTXCAT index ***********
select pk||' ==> '||text from test 
where catsearch(text,'San Diego','')>0
order by pk;

Query Relaxation Template Example

The following query template defines a query relaxation sequence. The query of black pen is issued in sequence as black pen then black NEAR pen then black AND pen then black ACCUM pen. Query hits are returned in this sequence with no duplication as long as the application needs results.

select id from docs where CONTAINS (text, '
<query>
   <textquery lang="ENGLISH" grammar="CONTEXT">
     black pen
     <progression>
       <seq>black pen</seq>
       <seq>black NEAR pen</seq>
       <seq>black AND pen<seq/>
       <seq>black ACCUM pen<seq/>
     </progression>
   </textquery>
   <score datatype="INTEGER" algorithm="COUNT"/>
</query>')>0;

Query relaxation is most effective when your application needs the top n hits to a query, which you can obtain with the FIRST_ROWS hint or in a PL/SQL cursor.


Query Rewrite Example

The following template defines a query rewrite sequence. The query of kukui nut is rewritten as follows:

{kukui} {nut}

{kukui} ; {nut}

{kukui} AND {nut}

{kukui} ACCUM {nut}

select id from docs where CONTAINS (text, '
 <query>
   <textquery lang="ENGLISH" grammar="CONTEXT"> kukui nut
     <progression>
       <seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
       <seq><<rewrite>transform((TOKENS, "{", "}", " ; "))</rewrite>/seq>
       <seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite><seq/>
       <seq><rewrite>transform((TOKENS, "{", "}", "ACCUM"))</rewrite><seq/>
     </progression>
   </textquery>
  <score datatype="INTEGER" algorithm="COUNT"/>
</query>')>0;

Notes


Querying Multi-Language Tables

With the multi-lexer preference, you can create indexes from multi-language tables.

At query time, the multi-lexer examines the session's language setting and uses the sub-lexer preference for that language to parse the query. If the language setting is not mapped, then the default lexer is used.

When the language setting is mapped, the query is parsed and run as usual. The index contains tokens from multiple languages, so such a query can return documents in several languages.

To limit your query to returning document of a given language, use a structured clause on the language column.


Query Performance Limitation with a Partitioned Index

Oracle Text supports the CONTEXT indexing and querying of a partitioned text table.

However, for optimal performance when querying a partitioned table with an ORDER BY SCORE clause, query the partition. If you query the entire table and use an ORDER BY SCORE clause, the query might not perform optimally unless you include a range predicate that can limit the query to a single partition.

For example, the following statement queries the partition p_tab4 partition directly:

select * from part_tab partition (p_tab4) where contains(b,'oracle') > 0 ORDER BY SCORE DESC;

Related Topics

Syntax for CONTEXT Indextype in this chapter

Chapter 3, "CONTAINS Query Operators"

Oracle Text Application Developer's Guide

SCORE


CREATE INDEX


Note:

This section describes the CREATE INDEX statement as it pertains to creating a Text domain index.

For a complete description of the CREATE INDEX statement, see Oracle Database SQL Reference.


Purpose

Use CREATE INDEX to create an Oracle Text index. An Oracle Text index is an Oracle Database domain index of type CONTEXT, CTXCAT, CTXRULE or CTXXPATH.

You must create an appropriate Oracle Text index to issue CONTAINS, CATSEARCH, or MATCHES queries.

You can create the following types of Oracle Text indexes:


CONTEXT

This is an index on a text column. You query this index with the CONTAINS operator in the WHERE clause of a SELECT statement. This index requires manual synchronization after DML. See Syntax for CONTEXT Indextype.


CTXCAT

This is a combined index on a text column and one or more other columns.You query this index with the CATSEARCH operator in the WHERE clause of a SELECT statement. This type of index is optimized for mixed queries. This index is transactional, automatically updating itself with DML to the base table. See Syntax for CTXCAT Indextype.


CTXRULE

This is an index on a column containing a set of queries. You query this index with the MATCHES operator in the WHERE clause of a SELECT statement. See Syntax for CTXRULE Indextype.


CTXXPATH

Create this index when you need to speed up existsNode() queries on an XMLType column. See Syntax for CTXXPATH Indextype.


Required Privileges

You do not need the CTXAPP role to create an Oracle Text index. If you have Oracle Database grants to create a b-tree index on the text column, you have sufficient permission to create a text index. The issuing owner, table owner, and index owner can all be different users, which is consistent with Oracle standards for creating regular B-tree indexes.

Syntax for CONTEXT Indextype

Use this indextype to create an index on a text column. You query this index with the CONTAINS operator in the WHERE clause of a SELECT statement. This index requires manual synchronization after DML.

CREATE INDEX [schema.]index ON [schema.]table(column) INDEXTYPE IS
ctxsys.context [ONLINE]
[LOCAL [(PARTITION [partition] [PARAMETERS('paramstring')]
[, PARTITION [partition] [PARAMETERS('paramstring')]])]
[PARAMETERS(paramstring)] [PARALLEL n] [UNUSABLE]];
[schema.]index

Specify the name of the Text index to create.

[schema.]table(column)

Specify the name of the table and column to index.

Your table can optionally contain a primary key if you prefer to identify your rows as such when you use procedures in CTX_DOC. When your table has no primary key, document services identifies your documents by ROWID.

The column you specify must be one of the following types: CHAR, VARCHAR, VARCHAR2, BLOB, CLOB, BFILE, XMLType, or URIType.

The table you specify can be a partitioned table. If you do not specify the LOCAL clause, a global index is created.

DATE, NUMBER, and nested table columns cannot be indexed. Object columns also cannot be indexed, but their attributes can be, provided they are atomic data types.

Attempting to create a index on a Virtual Private Database (VPD) protected table will fail unless one of the following is true:

Indexes on multiple columns are not supported with the CONTEXT index type. You must specify only one column in the column list.


Note:

With the CTXCAT indextype, you can create indexes on text and structured columns. See Syntax for CTXCAT Indextype in this chapter.

ONLINE

Creates the index while enabling inserts/updates/deletes (DML) on the base table.

During indexing, Oracle Text enqueues DML requests in a pending queue. At the end of the index creation, Oracle Text locks the base table. During this time DML is blocked.


Limitations

The following limitations apply to using ONLINE:

LOCAL [(PARTITION [partition] [PARAMETERS('paramstring')]

Specify LOCAL to create a local partitioned context index on a partitioned table. The partitioned table must be partitioned by range. Hash, composite and list partitions are not supported.

You can specify the list of index partition names with partition. If you do not specify a partition name, the system assigns one. The order of the index partition list must correspond to the table partition by order.

The PARAMETERS clause associated with each partition specifies the parameters string specific to that partition. You can only specify sync (manual|every |on commit), memory and storage for each index partition.

You can query the views CTX_INDEX_PARTITIONS or CTX_USER_INDEX_PARTITIONS to find out index partition information, such as index partition name, and index partition status.

You cannot use the ONLINE parameter with this operation.


Query Performance Limitation with Partitioned Index

For optimal performance when querying a partitioned index with an ORDER BY SCORE clause, query the partition. If you query the entire table and use an ORDER BY SCORE clause, the query might not perform optimally unless you include a range predicate that can limit the query to the fewest number of partitions, which is optimally a single partition.

PARALLEL n

Optionally specify with n the parallel degree for parallel indexing. The actual degree of parallelism might be smaller depending on your resources.

You can use this parameter on non-partitioned tables. Creating a non-partitioned index in parallel does not turn on parallel query processing.

Parallel indexing is supported for creating a local partitioned index.


Performance

Parallel indexing can speed up indexing when you have large amounts of data to index and when your operating system supports multiple CPUs.


Note:

Using PARALLEL to create a local partitioned index enables parallel queries. (Creating a non-partitioned index in parallel does not turn on parallel query processing.)

Parallel querying degrades query throughput especially on heavily loaded systems. Because of this, Oracle recommends that you disable parallel querying after creating a local index. To do so, use ALTER INDEX NOPARALLEL.

For more information on parallel querying, see the Performance Tuning chapter in Oracle Text Application Developer's Guide



Limitations

The following limitations apply to using PARALLEL:

UNUSABLE

Create an unusable index. This creates index metadata only and exits immediately.

You might create an unusable index when you need to create a local partitioned index in parallel.

PARAMETERS(paramstring)

Optionally specify indexing parameters in paramstring. You can specify preferences owned by another user using the user.preference notation.

The syntax for paramstring is as follows:

paramstring = 
'[DATASTORE datastore_pref] 
 [FILTER filter_pref] 
 [CHARSET COLUMN charset_column_name]  
 [FORMAT COLUMN format_column_name]

 [LEXER lexer_pref]
 [LANGUAGE COLUMN language_column_name] 

 [WORDLIST wordlist_pref] 
 [STORAGE storage_pref] 
 [STOPLIST stoplist] 
 [SECTION GROUP section_group]
 [MEMORY memsize]
 [POPULATE | NOPOPULATE]
 [[METADATA] SYNC (MANUAL | EVERY "interval-string" | ON COMMIT)]
 [TRANSACTIONAL]'

You create datastore, filter, lexer, wordlist, and storage preferences with CTX_DDL.CREATE_PREFERENCE and then specify them in the paramstring.


Note:

When you specify no paramstring, Oracle Text uses the system defaults.

For more information about these defaults, see "Default Index Parameters" in Chapter 2.


DATASTORE datastore_pref

Specify the name of your datastore preference. Use the datastore preference to specify where your text is stored.See Datastore Types in Chapter 2, " Oracle Text Indexing Elements".

FILTER filter_pref

Specify the name of your filter preference. Use the filter preference to specify how to filter formatted documents to plain text or HTML. See Filter Types in Chapter 2, " Oracle Text Indexing Elements".

CHARSET COLUMN charset_column_name

Specify the name of the character set column. This column must be in the same table as the text column, and it must be of type CHAR, VARCHAR, or VARCHAR2. Use this column to specify the document character set for conversion to the database character set. The value is case insensitive. You must specify a Globalization Support character set string such as JA16EUC.

When the document is plain text or HTML, the INSO_FILTER and CHARSET filter use this column to convert the document character set to the database character set for indexing.

For all rows containing the keywords 'AUTO' or 'AUTOMATIC', Oracle Text will apply statistical techniques to determine the character set of the documents and modify document indexing appropriately.

You use this column when you have plain text or HTML documents with different character sets or in a character set different from the database character set.


Note:

Documents are not marked for re-indexing when only the charset column changes. The indexed column must be updated to flag the re-index.

FORMAT COLUMN format_column_name

Specify the name of the format column. The format column must be in the same table as the text column and it must be CHAR, VARCHAR, or VARCHAR2 type.

The INSO_FILTER uses the format column when filtering documents. Use this column with heterogeneous document sets to optionally bypass INSO filtering for plain text or HTML documents.

In the format column, you can specify one of the following

TEXT indicates that the document is either plain text or HTML. When TEXT is specified the document is not filtered, but might be character set converted.

BINARY indicates that the document is a format supported by the INSO_FILTER object other than plain text or HTML, such as PDF. BINARY is the default if the format column entry cannot be mapped.

IGNORE indicates that the row is to be ignored during indexing. Use this value when you need to bypass rows that contain data incompatible with text indexing such as image data.


Note:

Documents are not marked for re-indexing when only the format column changes. The indexed column must be updated to flag the re-index.

LEXER lexer_pref

Specify the name of your lexer or multi-lexer preference. Use the lexer preference to identify the language of your text and how text is tokenized for indexing. See Lexer Types in Chapter 2, " Oracle Text Indexing Elements".

LANGUAGE COLUMN language_column_name

Specify the name of the language column when using a multi-lexer preference. See MULTI_LEXER in Chapter 2, " Oracle Text Indexing Elements".

This column must exist in the base table. It cannot be the same column as the indexed column. Only the first 30 bytes of the language column is examined for language identification.

For all rows containing the keywords 'AUTO' or 'AUTOMATIC', Oracle Text will apply statistical techniques to determine the language of the documents and modify document indexing appropriately.


Note:

Documents are not marked for re-indexing when only the language column changes. The indexed column must be updated to flag the re-index.

WORDLIST wordlist_pref

Specify the name of your wordlist preference. Use the wordlist preference to enable features such as fuzzy, stemming, and prefix indexing for better wildcard searching. See Wordlist Type in Chapter 2, " Oracle Text Indexing Elements".

STORAGE storage_pref

Specify the name of your storage preference for the Text index. Use the storage preference to specify how the index tables are stored. See Storage Types in Chapter 2, " Oracle Text Indexing Elements".

STOPLIST stoplist

Specify the name of your stoplist. Use stoplist to identify words that are not to be indexed. See CTX_DDL.CREATE_STOPLIST in Chapter 7, " CTX_DDL Package".

SECTION GROUP section_group

Specify the name of your section group. Use section groups to create searchable sections in structured documents. See CTX_DDL.CREATE_SECTION_GROUP in Chapter 7, " CTX_DDL Package".

MEMORY memsize

Specify the amount of run-time memory to use for indexing. The syntax for memsize is as follows:

memsize = number[K|M|G]

where K stands for kilobytes., M stands for megabytes, and G stands for gigabytes.

The value you specify for memsize must be between 1M and the value of MAX_INDEX_MEMORY in the CTX_PARAMETERS view. To specify a memory size larger than the MAX_INDEX_MEMORY, you must reset this parameter with CTX_ADM.SET_PARAMETER to be larger than or equal to memsize.

The default is the value specified for DEFAULT_INDEX_MEMORY in CTX_PARAMETERS.

The memsize parameter specifies the amount of memory Oracle Text uses for indexing before flushing the index to disk. Specifying a large amount memory improves indexing performance because there are fewer I/O operations and improves query performance and maintenance since there is less fragmentation.

Specifying smaller amounts of memory increases disk I/O and index fragmentation, but might be useful when run-time memory is scarce.

POPULATE | NOPOPULATE

Specify nopopulate to create an empty index. The default is populate.


Note:

This is the only option whose default value cannot be set with CTX_ADM.SET_PARAMETER.

This option is not valid with CTXXPATH indexes.


Empty indexes are populated by updates or inserts to the base table. You might create an empty index when you need to create your index incrementally or to selectively index documents in the base table. You might also create an empty index when you require only theme and Gist output from a document set.

[METADATA] SYNC (MANUAL | EVERY "interval-string" | ON COMMIT)

Specify SYNC for automatic synchronization of the CONTEXT index when there are inserts, updates or deletes to the base table. You can specify one of the following SYNC methods:

SYNC type Description
MANUAL No automatic synchronization. This is the default. You must manually synchronize the index with CTX_DDL.SYNC_INDEX.
EVERY "interval-string" Automatically synchronize the index at a regular interval specified by the value of interval-string. interval-string takes the same syntax as that for scheduler jobs. Automatic synchronization using EVERY requires that the index creator have CREATE JOB privileges.

Make sure that interval-string is set to a long enough period that any previous sync jobs will have completed; otherwise, the sync job may hang. interval-string must be enclosed in double quotes, and any single quote within interval-string must be escaped with another single quote.

See Enabling Automatic Index Synchronization for an example of automatic sync syntax.

ON COMMIT Synchronize the index immediately after a commit. The commit does not return until the sync is complete. (Since the synchronization is performed as a separate transaction, there may be a period, usually small, when the data is committed but index changes are not.)

The operation uses the memory specified with the memory parameter.

Note that the sync operation has its own transaction context. If this operation fails, the data transaction still commits. Index synchronization errors are logged in the CTX_USER_INDEX_ERRORS view. See Viewing Index Errors under CREATE INDEX.

See Enabling Automatic Index Synchronization for an example of ON COMMIT syntax.


Each partition of a locally partitioned index can have its own type of sync (ON COMMIT, EVERY, or MANUAL). The type of sync specified in master parameter strings applies to all index partitions unless a partition specifies its own type.

With automatic (EVERY) synchronization, users can specify memory size and parallel synchronization. That syntax is:

... EVERY interval_string MEMORY mem_size PARALLEL paradegree ...

ON COMMIT synchronizations can only be executed serially and at the same memory size as at index creation.

See the Oracle Database Administrator's Guide for information on job scheduling.

TRANSACTIONAL

Specify that documents can be searched immediately after they are inserted or updated. If a text index is created with TRANSACTIONAL enabled, then, in addition to processing the synchronized rowids already in the index, the CONTAINS operator will process unsynchronized rowids as well. (That is, Oracle Text does in-memory indexing of unsynchronized rowids and processes the query against the in-memory index.)

TRANSACTIONAL is an index-level parameter and does not apply at the partition level.

You must still synchronize your text indexes from time to time (with CTX_DDL.SYNC_INDEX) to bring pending rowids into the index. Query performance degrades as the number of unsynchronized rowids increases. For that reason, Oracle recommends setting up your index to use automatic synchronization with the EVERY parameter. (See [METADATA] SYNC (MANUAL | EVERY "interval-string" | ON COMMIT).)

Transactional querying for indexes that have been created with the TRANSACTIONAL parameter can be turned on and off (for the duration of a user session) with the PL/SQL variable CTX_QUERY.disable_transactional_query. This is useful, for example, if you find that querying is slow due to the presence of too many pending rowids. Here is an example of setting this session variable:

exec ctx_query.disable_transactional_query := TRUE;

If the index uses INSO_FILTER, queries involving unsynchronized rowids will require filtering of unsynchronized documents.

CREATE INDEX: CONTEXT Index Examples

The following sections give examples of creating a CONTEXT index.


Creating CONTEXT Index Using Default Preferences

The following example creates a CONTEXT index called myindex on the docs column in mytable. Default preferences are used.

CREATE INDEX myindex ON mytable(docs) INDEXTYPE IS ctxsys.context;

See Also:

For more information about default settings, see "Default Index Parameters" in Chapter 2.

Also refer to Oracle Text Application Developer's Guide.



Creating CONTEXT Index with Custom Preferences

The following example creates a CONTEXT index called myindex on the docs column in mytable. The index is created with a custom lexer preference called my_lexer and a custom stoplist called my_stop.

This example also assumes that the preference and stoplist were previously created with CTX_DDL.CREATE_PREFERENCE for my_lexer, and CTX_DDL.CREATE_STOPLIST for my_stop. Default preferences are used for the unspecified preferences.

CREATE INDEX myindex ON mytable(docs) INDEXTYPE IS ctxsys.context 
  PARAMETERS('LEXER my_lexer STOPLIST my_stop');

Any user can use any preference. To specify preferences that exist in another user's schema, add the user name to the preference name. The following example assumes that the preferences my_lexer and my_stop exist in the schema that belongs to user kenny:

CREATE INDEX myindex ON mytable(docs) INDEXTYPE IS ctxsys.context 
  PARAMETERS('LEXER kenny.my_lexer STOPLIST kenny.my_stop');

Enabling Automatic Index Synchronization

You can create your index and specify that the index be synchronized at regular intervals for inserts, updates and deletes to the base table. To do so, create the index with the SYNC (EVERY "interval-string") parameter.

To use job scheduling, you must log in as a user who has DBA privileges and then grant CREATE JOB privileges.

The following example creates an index and schedules three synchronization jobs for three index partitions. The first partition uses ON COMMIT synchronization. The other two partitions are synchronized by jobs that are scheduled to be executed every Monday at 3 P.M.

CONNECT system/manager
GRANT CREATE JOB TO dr_test

CREATE INDEX tdrmauto02x ON tdrmauto02(text)
   INDEXTYPE IS CTXSYS.CONTEXT local
   (PARTITION tdrm02x_i1 PARAMETERS('
   MEMORY 20m SYNC(ON COMMIT)'),
   PARTITION tdrm02x_i2,
   PARTITION tdrm02x_i3)  PARAMETERS('
   SYNC (EVERY "NEXT_DAY(TRUNC(SYSDATE), ''MONDAY'') + 15/24")
  ');

See the Oracle Database Administrator's Guide for information on job scheduling syntax.


Creating CONTEXT Index with Multi-Lexer Preference

The multi-lexer decides which lexer to use for each row based on a language column. This is a character column in the table which stores the language of the document in the text column. For example, you create the table globaldoc to hold documents of different languages:

CREATE TABLE globaldoc (
   doc_id NUMBER PRIMARY KEY,
   lang VARCHAR2(10),
   text CLOB
);

Assume that global_lexer is a multi-lexer preference you created. To index the global_doc table, you specify the multi-lexer preference and the name of the language column as follows:

CREATE INDEX globalx ON globaldoc(text) INDEXTYPE IS ctxsys.context PARAMETERS ('LEXER global_lexer LANGUAGE COLUMN lang');

See Also:

For more information about creating multi-lexer preferences, see MULTI_LEXER in Chapter 2.


Creating a Local Partitioned Index

The following example creates a text table partitioned into three, populates it, and then creates a partitioned index.

PROMPT create partitioned table and populate it

CREATE TABLE part_tab (a int, b varchar2(40)) PARTITION BY RANGE(a)
(partition p_tab1 values less than (10),
 partition p_tab2 values less than (20),
 partition p_tab3 values less than (30));
PROMPT create partitioned index
CREATE INDEX part_idx on part_tab(b) INDEXTYPE IS CTXSYS.CONTEXT
LOCAL (partition p_idx1, partition p_idx2, partition p_idx3);

Parallel Indexing

Parallel indexing can improve index performance when you have multiple CPUs.

To create an index in parallel, use the PARALLEL clause with a parallel degree. This example uses a parallel degree of 3:

CREATE INDEX myindex ON mytab(pk) INDEXTYPE IS ctxsys.context PARALLEL 3;

Creating a Local Partitioned Index in Parallel

Creating a local partitioned index in parallel can improve performance when you have multiple CPUs. With partitioned tables, you can divide the work. You can create a local partitioned index in parallel in two ways:


Parallelism with CREATE INDEX

You can achieve local index parallelism by using the PARALLEL and LOCAL clauses in CREATE INDEX.In this case, the maximum parallel degree is limited to the number of partitions you have.

The following example creates a table with three partitions, populates them, and then creates the local indexes in parallel with a degree of 2:

create table part_tab3(id number primary key, text varchar2(100)) 
partition by range(id) 
(partition p1 values less than (1000), 
 partition p2 values less than (2000), 
 partition p3 values less than (3000)); 

begin 
  for i in 0..2999 
  loop 
      insert into part_tab3 values (i,'oracle'); 
  end loop; 
end; 
/ 

create index part_tab3x on part_tab3(text) 
indextype is ctxsys.context local (partition part_tabx1, 
                                   partition part_tabx2, 
                                   partition part_tabx3) 
parallel 2;                                                              

Parallelism with DBMS_PCLUTIL.BUILD_PART_INDEX

You can achieve local index parallelism by first creating an unusable CONTEXT index, then running the DBMS_PCLUTIL.BUILD_PART_INDEX utility. This method can result in a higher degree of parallelism, especially when you have more CPUs than partitions.

In this example, the base table has three partitions. We create a local partitioned unusable index first, then run DBMS_PCLUTIL.BUILD_PART_INDEX, which builds the 3 partitions in parallel (inter-partition parallelism). Also inside each partition, index creation proceeds in parallel (intra-partition parallelism) with a parallel degree of 2. Therefore the total parallel degree is 6 (3 times 2).

create table part_tab3(id number primary key, text varchar2(100)) 
partition by range(id) 
(partition p1 values less than (1000), 
 partition p2 values less than (2000), 
 partition p3 values less than (3000)); 

begin 
  for i in 0..2999 
  loop 
      insert into part_tab3 values (i,'oracle'); 
  end loop; 
end; 
/ 

create index part_tab3x on part_tab3(text) 
indextype is ctxsys.context local (partition part_tabx1, 
                                   partition part_tabx2, 
                                   partition part_tabx3) 
unusable; 

exec dbms_pclxutil.build_part_index(jobs_per_batch=>3,
  procs_per_job=>2,
  tab_name=>'PART_TAB3',
  idx_name=>'PART_TAB3X',
  force_opt=>TRUE); 

Viewing Index Errors

After a CREATE INDEX or ALTER INDEX operation, you can view index errors with Oracle Text views. To view errors on your indexes, query the CTX_USER_INDEX_ERRORS view. To view errors on all indexes as CTXSYS, query the CTX_INDEX_ERRORS view.

For example, to view the most recent errors on your indexes, you can issue:

SELECT err_timestamp, err_text FROM ctx_user_index_errors ORDER BY err_timestamp DESC;

Deleting Index Errors

To clear the index error view, you can issue:

DELETE FROM ctx_user_index_errors;

Syntax for CTXCAT Indextype

The CTXCAT index is a combined index on a text column and one or more other columns.You query this index with the CATSEARCH operator in the WHERE clause of a SELECT statement. This type of index is optimized for mixed queries. This index is transactional, automatically updating itself with DML to the base table.

CREATE INDEX [schema.]index on [schema.]table(column) INDEXTYPE IS ctxsys.ctxcat 
[PARAMETERS
('[index set index_set]
[lexer lexer_pref]
[storage storage_pref] 
[stoplist stoplist] 
[section group sectiongroup_pref
[wordlist wordlist_pref] 
[memory memsize]');
[schema.]table(column)

Specify the name of the table and column to index.

The column you specify when you create a CTXCAT index must be of type CHAR or VARCHAR2. No other types are supported for CTXCAT.

Attempting to create a index on a Virtual Private Database (VPD) protected table will fail unless one of the following is true:

Supported Preferences

index set index_set

Specify the index set preference to create the CTXCAT index. Index set preferences name the columns that make up your sub-indexes. Any column named in an index set column list cannot have a NULL value in any row of the base table or else you get an error.

You must always ensure that your columns have non-NULL values before and after indexing.

See "Creating a CTXCAT Index".


Index Performance and Size Considerations

Although a CTXCAT index offers query performance benefits, creating the index has its costs. The time Oracle Text takes to create a CTXCAT index depends on its total size, and the total size of a CTXCAT index is directly related to

Having many component indexes in your index set also degrades DML performance since more indexes must be updated.

Because of these added costs in creating a CTXCAT index, carefully consider the query performance benefit each component index gives your application before adding it to your index set.


See Also:

Oracle Text Application Developer's Guide for more information about creating CTXCAT indexes and its benefits.

Other Preferences

When you create an index of type CTXCAT, you can use the following supported index preferences in the parameters string:

Table 1-1 Supported CTXCAT Index Preferences

Preference Class Supported Types
Datastore This preference class is not supported for CTXCAT.
Filter This preference class is not supported for CTXCAT.
Lexer BASIC_LEXER (index_themes attribute not supported)

CHINESE_LEXER

CHINESE_VGRAM_LEXER

JAPANESE_LEXER

JAPANESE_VGRAM_LEXER

KOREAN_LEXER

KOREAN_LEXER

Wordlist BASIC_WORDLIST
Storage BASIC_STORAGE
Stoplist Supports single language stoplists only (BASIC_STOPLIST type.)
Section Group This preference class is not supported for CTXCAT.

Unsupported Preferences and Parameters

When you create a CTXCAT index, you cannot specify datastore, filter and section group preferences. You also cannot specify language, format, and charset columns as with a CONTEXT index.

Creating a CTXCAT Index

This section gives a brief example for creating a CTXCAT index. For a more complete example, see the Oracle Text Application Developer's Guide.

Consider a table called AUCTION with the following schema:

create table auction(
item_id number,
title varchar2(100),
category_id number,
price number,
bid_close date);

Assume that queries on the table involve a mandatory text query clause and optional structured conditions on price. Results must be sorted based on bid_close. This means that we need an index to support good response time for the structured and sorting criteria.

You can create a catalog index to support the different types of structured queries a user might enter. For structured queries, a CTXCAT index improves query performance over a context index.

To create the indexes, first create the index set preference then add the required indexes to it:

begin
ctx_ddl.create_index_set('auction_iset');
ctx_ddl.add_index('auction_iset','bid_close');
ctx_ddl.add_index('auction_iset','price, bid_close');
end;

Create the CTXCAT index with CREATE INDEX as follows:

create index auction_titlex on AUCTION(title) indextype is CTXSYS.CTXCAT parameters ('index set auction_iset');

Querying a CTXCAT Index

To query the title column for the word pokemon, you can issue regular and mixed queries as follows:

select * from AUCTION where CATSEARCH(title, 'pokemon',NULL)> 0;
select * from AUCTION where CATSEARCH(title, 'pokemon', 'price < 50 order by bid_close desc')> 0;

See Also::

Oracle Text Application Developer's Guide for a complete CTXCAT example.

Syntax for CTXRULE Indextype

This is an index on a column containing a set of queries. You query this index with the MATCHES operator in the WHERE clause of a SELECT statement.

CREATE INDEX [schema.]index on [schema.]table(rule_col) INDEXTYPE IS 
ctxsys.ctxrule 
[PARAMETERS ('[lexer lexer_pref] [storage storage_pref]
[section group section_pref] [wordlist wordlist_pref]
[classifier classifier_pref]'); 
[PARALLEL n];
[schema.]table(column)

Specify the name of the table and rule column to index. The rules can be query compatible strings, query template strings, or binary support vector machine rules.

The column you specify when you create a CTXRULE index must be VARCHAR2, CLOB or BLOB. No other types are supported for CTXRULE.

Attempting to create an index on a Virtual Private Database (VPD) protected table will fail unless one of the following is true:

lexer_pref

Specify the lexer preference to be used for processing the queries and the documents to be classified with the MATCHES function. Currently, the BASIC_LEXER, CHINESE_LEXER, JAPANESE_LEXER, and KOREAN_LEXER lexer types are supported.

For processing queries, this lexer supports the following operators: ABOUT, STEM, AND, NEAR, NOT, OR, and WITHIN.

The thesaural operators (BT*, NT*, PT, RT, SYN, TR, TRSYS, TT, and so on) are supported. However, these operators are expanded using a snapshot of the thesaurus at index time, not when the MATCHES function is issued. This means that if you change your thesaurus after you index, you must re-index your query set.

storage_pref

Specify the storage preference for the index on the queries.Use the storage preference to specify how the index tables are stored. See Storage Types in Chapter 2, " Oracle Text Indexing Elements".

section group

Specify the section group. This parameter does not affect the queries. It applies to sections in the documents to be classified. The following section groups are supported for the CTXRULE indextype:

See Section Group Types in Chapter 2, " Oracle Text Indexing Elements".

CTXRULE does not support special sections.

wordlist_pref

Specify the wordlist preferences. This is used to enable stemming operations on query terms. See Wordlist Type in Chapter 2, " Oracle Text Indexing Elements".

classifier_pref

Specify the classifier preference. See Classifier Types in Chapter 2, " Oracle Text Indexing Elements". You must use the same preference name you specify with CTX_CLS.TRAIN.

Example for Creating a CTXRULE Index

See the Oracle Text Application Developer's Guide for a complete example of using the CTXRULE indextype in a document routing application.

Syntax for CTXXPATH Indextype

Create this index when you need to speed up existsNode() queries on an XMLType column.

CREATE INDEX [schema.]index on [schema.]table(XMLType column) INDEXTYPE IS 
ctxsys.CTXXPATH 
[PARAMETERS ('[storage storage_pref]
              [memory memsize]')];
[schema.]table(column)

Specify the name of the table and column to index.

The column you specify when you create a CTXXPATH index must be XMLType. No other types are supported for CTXXPATH.

storage_pref

Specify the storage preference for the index on the queries.Use the storage preference to specify how the index tables are stored. See Storage Types in Chapter 2, " Oracle Text Indexing Elements".

memory memsize

Specify the amount of run-time memory to use for indexing. The syntax for memsize is as follows:

memsize = number[M|G|K]

where M stands for megabytes, G stands for gigabytes, and K stands for kilobytes.

The value you specify for memsize must be between 1M and the value of MAX_INDEX_MEMORY in the CTX_PARAMETERS view. To specify a memory size larger than the MAX_INDEX_MEMORY, you must reset this parameter with CTX_ADM.SET_PARAMETER to be larger than or equal to memsize.

The default is the value specified for DEFAULT_INDEX_MEMORY in CTX_PARAMETERS.

CTXXPATH Examples

Index creation on an XMLType column:

CREATE INDEX xml_index ON xml_tab(col_xml) indextype is ctxsys.CTXXPATH;

or

CREATE INDEX xml_index ON xml_tab(col_xml) indextype is ctxsys.CTXXPATH
 PARAMETERS('storage my_storage memory 40M');

Querying the table with existsNode:

select xml_id from xml_tab x where x.col_xml.existsnode('/book/chapter[@title="XML"]') > 0;


See Also:

Oracle XML DB Developer's Guide for information on using the CTXXPATH indextype.

Related Topics

CTX_DDL.CREATE_PREFERENCE in Chapter 7, " CTX_DDL Package".

CTX_DDL.CREATE_STOPLIST in Chapter 7, " CTX_DDL Package".

CTX_DDL.CREATE_SECTION_GROUP in Chapter 7, " CTX_DDL Package".

ALTER INDEX

CATSEARCH


DROP INDEX


Note:

This section describes the DROP INDEX statement as it pertains to dropping a Text domain index.

For a complete description of the DROP INDEX statement, see Oracle Database SQL Reference.


Purpose

Use DROP INDEX to drop a specified Text index.

Syntax

DROP INDEX [schema.]index [force];
[force]

Optionally force the index to be dropped. Use force option when Oracle Text cannot determine the state of the index, such as when an indexing operation crashes.

Oracle recommends against using this option by default. Use it a a last resort when a regular call to DROP INDEX fails.

Examples

The following example drops an index named doc_index in the current user's database schema.

DROP INDEX doc_index;

Related Topics

ALTER INDEX

CREATE INDEX


MATCHES

Use this operator to find all rows in a query table that match a given document. The document must be a plain text, HTML, or XML document.

This operator requires a CTXRULE index on your set of queries.

MATCHES returns a number in the range of 0 to 100. Zero means no match. When this number is greater than zero, there are one or more matches. You can use the label parameter and MATCH_SCORE to obtain this number.

Limitation

If the optimizer chooses to use the functional query invocation with a MATCHES query, your query will fail.

Syntax

MATCHES(
[schema.]column,
document VARCHAR2 or CLOB
[,label INTEGER])
RETURN NUMBER;
column

Specify the column containing the indexed query set.

document

Specify the document to be classified. The document can be plain-text, HTML, or XML. Binary formats are not supported.

label

Optionally specify the label that identifies the score generated by the MATCHES operator. You use this label with MATCH_SCORE.

Matches Example

The following example creates a table querytable, and populates it with classification names and associated rules. It then creates a CTXRULE index.

The example issues the MATCHES query with a document string to be classified. The SELECT statement returns all rows (queries) that are satisfied by the document:

create table querytable (classification varchar2(64), text varchar2(4000));
insert into querytable values ('common names', 'smith OR jones OR brown');
insert into querytable values ('countries', 'United States OR Great Britain OR France');
insert into querytable values ('Oracle DB', 'oracle NEAR database');

create index query_rule on querytable(text) indextype is ctxsys.ctxrule;

SELECT classification FROM querytable WHERE MATCHES(text, 'Smith is a common name in the United States') > 0;


CLASSIFICATION
----------------------------------------------------------------
common names
countries

Simple Classification Examples

The MATCHES operator is used in simple and supervised classification. For more extended examples, see the "Building Classification Applications" chapter in the Oracle Text Application Developer's Guide

Related Topics

Syntax for CTXRULE Indextype in this chapter.

CTX_CLS.TRAIN

Oracle Text Application Developer's Guide


MATCH_SCORE

Use the MATCH_SCORE operator in a SELECT statement to return scores produced by a MATCHES query.

This operator returns a score in the range 0 to 100. You can use the matching score to apply a category specific threshold to a particular category.

Syntax

MATCH_SCORE(label NUMBER)
label

Specify a number to identify the score produced by the query. You use this number to identify the MATCHES clause which returns this score.

Example

To get the matching score, use

select cat_id, match_score(1) from training_result where matches(profile, text,1)>0;


SCORE

Use the SCORE operator in a SELECT statement to return the score values produced by a CONTAINS query. The SCORE operator can be used in a SELECT, ORDER BY, or GROUP BY clause.

Syntax

SCORE(label NUMBER)
label

Specify a number to identify the score produced by the query. You use this number to identify the CONTAINS clause which returns this score.

Example


Single CONTAINS

When the SCORE operator is called (for example, in a SELECT clause), the CONTAINS clause must reference the score label value as in the following example:

SELECT SCORE(1), title from newsindex 
           WHERE CONTAINS(text, 'oracle', 1) > 0 ORDER BY SCORE(1) DESC;

Multiple CONTAINS

Assume that a news database stores and indexes the title and body of news articles separately. The following query returns all the documents that include the words Oracle in their title and java in their body. The articles are sorted by the scores for the first CONTAINS (Oracle) and then by the scores for the second CONTAINS (java).

SELECT title, body, SCORE(10), SCORE(20)
FROM news
WHERE CONTAINS (news.title, 'Oracle', 10) > 0 OR
CONTAINS (news.body, 'java', 20) > 0 
ORDER BY SCORE(10), SCORE(20);

Related Topics

CONTAINS

Appendix F, " Scoring Algorithm "