|Oracle8i interMedia Text Reference
Release 2 (8.1.6)
Part Number A77063-01
SQL Commands, 5 of 6
For a complete description of the CREATE INDEX command, see Oracle8i SQL Reference.
Use CREATE INDEX to create an interMedia Text index. An interMedia Text index is an Oracle domain index of type context created using the extensible indexing framework.
You must create an iMT index to issue CONTAINS queries.
CREATE INDEX [schema.]index on [schema.]table(column) INDEXTYPE IS ctxsys.context [PARAMETERS(paramstring)] [PARALLEL N];
Specify the name of the Text index to create.
Specify the name of the table and column to index. The table must have a primary key constraint. This is needed mainly for identifying the documents for document services. Composite primary keys are supported, up to 16 columns.
The column you specify must be one of the following types: CHAR, VARCHAR, VARCHAR2, BLOB, CLOB, or BFILE.
Indexing the deprecated column types LONG and LONG RAW is supported for the process of migrating Oracle7 systems to Oracle8i.
DATE, NUMBER, and nested table columns cannot be indexed. Object columns also cannot be indexed, but their attributes can be, provided they are atomic data types.
Composite indexes are not supported; you must specify only one column in the column list.
Optionally specify with N the parallel degree for parallel indexing. You must have a partitioned text table to use parallel indexing with iMT. Each parallel indexer works on its own partition, so N must not exceed the number of partitions.
The initialization parameters JOB_QUEUE_PROCESSES and JOB_QUEUE_INTERVAL must also be set in the
init.ora file to use parallel indexing. The parallel degree N you specify must be less than JOB_QUEUE_PROCESSES.
A complete example of how to create an index in parallel is given in the Examples section.
Optionally specify indexing parameters in paramstring. You can specify preferences owned by another user using the user.preference notation.
The syntax for
paramstring is as follows:
paramstring = '[datastore datastore_pref] [filter filter_pref] [charset column charset_column_name] [format column format_column_name] [lexer lexer_pref] [language column language_column_name] [wordlist wordlist_pref] [storage storage_pref] [stoplist stoplist] [section group section_group] [memory memsize] [populate | nopopulate]'
You create datastore, filter, lexer, wordlist, and storage preferences with CTX_DDL.CREATE_PREFERENCE.
Specify the name of the character set column. The charset column must be in the same table as the text column and it must be CHAR, VARCHAR, or VARCHAR2 type. Use this column to specify the document character set for conversion to the database character set. The value is case-insensitive, but must be an NLS character set string such as JA16EUC.
When the document is plain-text or HTML, the INSO_FILTER object and CHARSET filter object use this column to convert the document character set to the database character set for indexing.
You use this column when you have plain-text or HTML documents with different character sets or in a character set different from the database character set.
Specify the name of the format column. The format column must be in the same table as the text column and it must be CHAR, VARCHAR, or VARCHAR2 type.
The INSO_FILTER uses the format column when filtering documents. Use this column with heterogeneous document sets to optionally bypass INSO filtering for plain-text or HTML documents.
In the format column, you can specify either TEXT or BINARY. TEXT indicates that the document is either plain-text or HTML. When TEXT is specified the document is not filtered, but might be character-set converted.
BINARY indicates that the document is a format supported by the INSO_FILTER object other than plain-text or HTML, such as PDF. BINARY is the default if the format column entry cannot be mapped.
This column must exist in the base table. It cannot be the same column as the indexed column.Only the first 30 bytes of the language column is examined for language identification.
Documents are not marked for re-indexing when only the language column changes. The indexed column must be updated to flag the re-index.
Specify the amount of run-time memory to use for indexing. The syntax for memsize is as follows:
where M stands for megabytes, G stands for gigabytes, and K stands for kilobytes.
The value for memsize must be between 1M and the value specified for max_index_memory in the CTX_PARAMETERS view. The default is the value specified for default_index_memory in CTX_PARAMETERS.
The memsize parameter specifies the amount of memory Oracle uses for indexing before flushing the index to disk. Specifying a large amount memory improves indexing performance since there is less I/O and improves query performance and maintenance since there is less fragmentation.
Specifying smaller amounts of memory increases disk I/O and index fragmentation, but might be useful if you want to track indexing progress or when run-time memory is scarce.
Specify nopopulate to create an empty index. The default is populate.
Empty indexes are populated by updates or inserts to the base table. You might create an empty index when you need to create your index incrementally or to selectively index documents in the base table. You might also create an empty index when you require only theme and Gist output from a document set.
The following example creates a Text index called
myindex on the
docs column in
mytable. Default preferences are used.
The following example creates a Text index called
myindex on the
docs column in
mytable. The index is created with a custom lexer preference called
my_lexer and a custom stoplist called
This example also assumes that these preferences were previously created with CTX_DDLCREATE_PREFERENCE for
my_lexer, and CTX_DDL.CREATE_STOPLIST for
my_stop. Default preferences are used for the unspecified preferences.
create index myindex on mytable(docs) indextype is ctxsys.context parameters('lexer my_lexer stoplist my_stop');
Any user can use any preference. To specify preferences that exist in another user's schema, add the user name to the preference name. The following example assumes that the preferences
my_stop exist in the schema that belongs to user
create index myindex on mytable(docs) indextype is ctxsys.context parameters('lexer kenny.my_lexer stoplist kenny.my_stop');
The multi-lexer decides which lexer to use for each row based on a language column. This is a character column in the table which stores the language of the document in the text column. For example, you create the table
globaldoc to hold documents of different languages:
global_lexer is a multi-lexer preference you created. To index the
global_doc table, you specify the multi-lexer preference and the name of the language column as follows:
create index globalx on globaldoc(text) indextype is ctxsys.context parameters ('lexer global_lexer language column lang');
This example shows how to set up parallel indexing. Do the following:
The JOB_QUEUE_PROCESSES parameter must be greater than or equal to your parallel degree.
This example starts the indexing job within four seconds of issuing the CREATE INDEX command. If you are not worried about the immediacy of the indexing operation, you can set a longer JOB_QUEUE_INTERVAL in seconds.
pkcolumn as the hash key:
The issuing user does not need the CTXAPP role to create an index. If the user has Oracle grants to create a b-tree index on the column, then this user has sufficient permission to create a Text index. The issuing owner, table owner, and index owner can all be different users, which is the standard behavior for regular b-tree indexes.
For example to view the most recent errors on your indexes, you can issue:
To clear the view of errors, you can issue: