Changes in This Release for Oracle Text Application Developer's Guide

This preface describes changes in Oracle Text for this release.

Changes in Oracle Text for Oracle Database Release 18c, Version 18.1

The changes in Oracle Text for Oracle Database release 18c, version 18.1 are described in this topic.

New Features

This section describes the primary new features for Oracle Text introduced in Oracle Database release 18c, version 18.1.

Faceted Navigation Support

Oracle Text provides faceted navigation support. You can now build applications with faceted navigation support by using SDATA sections and Result Set Interface queries.
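
As a rough illustration of what faceted navigation computes, the following Python sketch counts how many matching documents fall under each facet value and then narrows the result list to one value. It is a conceptual model only and does not use the Oracle Text Result Set Interface. The document fields (category, brand) and the function names are hypothetical.

```python
from collections import Counter

def facet_counts(hits, facet):
    """Count how many matching documents carry each value of one facet."""
    return Counter(doc[facet] for doc in hits if facet in doc)

def refine(hits, facet, value):
    """Narrow the result list to documents with the chosen facet value."""
    return [doc for doc in hits if doc.get(facet) == value]

# Hypothetical query hits; in a real application these values would come
# from structured data stored alongside the indexed text.
hits = [
    {"id": 1, "category": "laptop", "brand": "A"},
    {"id": 2, "category": "laptop", "brand": "B"},
    {"id": 3, "category": "tablet", "brand": "A"},
]

counts = facet_counts(hits, "category")
laptops = refine(hits, "category", "laptop")
```

An application would typically display the counts next to each facet value and rerun the refinement when the user selects one.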

Support for Efficient Wildcard Search

Wildcard indexing supports fast and efficient wildcard search for all wildcard expressions. A new wordlist preference, WILDCARD_INDEX, replaces the current options (SUBSTRING_INDEX, PREFIX_INDEX, and REVERSE_INDEX). When you enable WILDCARD_INDEX, a K-gram (fixed-length substring particles) index indexes all substrings within each token. A new wordlist preference, WILDCARD_INDEX_K, controls the length of grams used (that is, the length of each substring indexed).

Wildcard indexing is supported only for languages that use single-byte characters.
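
The general K-gram technique can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not the Oracle Text implementation: every token is broken into its substrings of length k, each substring maps to the tokens that contain it, and a wildcard pattern is answered by intersecting those lists and then verifying the candidates. The function names and the use of % as the wildcard character are choices made for this sketch.

```python
import re
from collections import defaultdict

def kgrams(token, k):
    """Return the set of all length-k substrings of token."""
    return {token[i:i + k] for i in range(len(token) - k + 1)}

def build_index(tokens, k):
    """Map each k-gram to the set of tokens that contain it."""
    index = defaultdict(set)
    for token in tokens:
        for gram in kgrams(token, k):
            index[gram].add(token)
    return index

def wildcard_search(pattern, tokens, index, k):
    """Match a pattern in which % stands for any run of characters."""
    parts = pattern.split("%")
    # Collect the k-grams of every literal fragment of the pattern.
    grams = set()
    for fragment in parts:
        grams |= kgrams(fragment, k)
    if grams:
        # Only tokens containing every k-gram can possibly match.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # No fragment is at least k characters long: fall back to a scan.
        candidates = set(tokens)
    # Verify candidates, because sharing k-grams does not guarantee a match.
    regex = re.compile(".*".join(re.escape(p) for p in parts))
    return sorted(t for t in candidates if regex.fullmatch(t))

tokens = ["category", "concatenate", "dog", "cat", "education"]
index = build_index(tokens, 3)
```

The sketch also shows the trade-off behind the gram length: a larger k produces shorter candidate lists, but pattern fragments shorter than k cannot use the index at all.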

Automatic Background Index Maintenance

Query performance deteriorates when the $G table becomes too fragmented. To avoid this, Oracle Text now runs an automatic background optimize merge for every index or partition.

If the index synchronization has been running for some time, the index tables might get fragmented. To avoid this issue, Oracle Text now runs automatic background jobs to optimize the various index tables.

Support for Concurrent Data Manipulation Language Operations

Synchronization is performed as part of the same transaction for indexes created with the SYNC (ON COMMIT) option. If there is a fatal index synchronization error, the entire data transaction is rolled back. Non-fatal (individual row) synchronization errors are logged in the CTX_USER_INDEX_ERRORS view but the transaction still completes.

New Options to Optimize the Index

The CTX_DDL.OPTIMIZE_INDEX procedure has two new parameters, maxtokens and section_type.

Support Indexing of JSON Key Names Longer Than 64 Characters

All Oracle Text index types, except CTXCAT and CTXRULE indexes, store tokens in a table column of type VARCHAR2 (255 BYTE) now. SDATA sections continue to store tokens in a table column of type VARCHAR2 (249 BYTE).

Note:

You must rebuild any JSON search indexes and Oracle Text indexes created prior to Oracle Database 18c if they index JSON data that contains object fields with names longer than 64 bytes. See Oracle Database Upgrade Guide for more information.

See Also:

Oracle Text Reference for more information about token limitations

Deprecated Features

Features that are deprecated in Oracle Database Release 18c may be desupported in a future release. The deprecated features for Oracle Database Release 18c are described in Oracle Database Upgrade Guide.

Changes in Oracle Text 12c Release 2 (12.2.0.1)

The changes in Oracle Text for Oracle Database 12c Release 2 (12.2.0.1) are described in this topic.

New Features

This section describes the primary new features for Oracle Text introduced in Oracle Database 12c Release 2 (12.2.0.1).

SDATA Section Improvements

Oracle Text provides enhancements to the SDATA section operations. A new kind of SDATA section is added.

See Also:

SDATA Section

Keep Updated Documents in the Index

With the ASYNCHRONOUS_UPDATE option, Oracle Text can keep the existing index entries for updated documents so that queries can still find the original content.

DML Improvements

Oracle Text supports a new storage preference, SMALL_R_ROW, for indexed lookups.

Oracle Text no longer locks base table rows. A new $U table for each index or partition keeps track of all concurrent updates, and the new $U_TABLE_CLAUSE storage clause specifies the storage for this $U table.

Reverse Token Index for Left-Truncated Queries

Oracle Text provides the new REVERSE_INDEX attribute, which improves query performance for left-truncated queries. This attribute is part of the wordlist preference and can be set to TRUE or FALSE. The default is FALSE, so the feature is disabled unless you turn it on. You can set this attribute with the CTX_DDL.SET_ATTRIBUTE procedure. You can also add it with ALTER INDEX REBUILD, just like any other wordlist preference.

See Also:

Oracle Text Reference for more information about the BASIC_WORDLIST attributes table and the REVERSE_INDEX attribute
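
The idea behind a reverse token index can be sketched in Python: if every token is also stored reversed, a left-truncated pattern such as %ing becomes a prefix lookup on the reversed form, which a sorted structure can answer with a short range scan. This is a conceptual model only and makes no claim about how Oracle Text stores reversed tokens. The function names are hypothetical.

```python
import bisect

def build_reverse_index(tokens):
    """Return a sorted list of (reversed token, original token) pairs."""
    return sorted((t[::-1], t) for t in tokens)

def left_truncated_search(suffix, reverse_index):
    """Find tokens that end with suffix by scanning a prefix range."""
    key = suffix[::-1]
    # Locate the first entry whose reversed token is >= the reversed suffix.
    start = bisect.bisect_left(reverse_index, (key,))
    matches = []
    for reversed_token, original in reverse_index[start:]:
        if not reversed_token.startswith(key):
            break  # entries are sorted, so the matching range has ended
        matches.append(original)
    return sorted(matches)

reverse_index = build_reverse_index(
    ["running", "jumping", "runner", "sing", "ring"]
)
```

Without the reversed copy, the same query would have to examine every token, because a sorted list of forward tokens offers no shortcut for matching on the end of a word.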

Partition-Specific Near Real-Time Indexes

Oracle Text supports the partition-specific STAGE_ITAB option, which provides a two-level index mechanism to prevent the main index from fragmenting because of frequent inserts, updates, or deletes. Set this option at a partition level if, for example, partitions contain mostly static data, whereas other partitions contain rapidly changing data.

The STAGE_ITAB_PARALLEL storage option controls the level of parallelism used to merge the data from the $G staging table back into the $I table.

To prevent the near real-time $G index table from becoming too large to fit into memory, specify a maximum size for the table.

Sentiment Analysis and Collocates

Oracle Text supports sentiment analysis and collocates. Sentiment analysis lets you identify positive and negative trends associated with search terms. Collocates let you identify other keywords that are related to, or used frequently with, a specified keyword.
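
To make the notion of collocates concrete, the following Python sketch counts the words that appear within a small window around a keyword across a set of documents. It is a simple co-occurrence count under stated assumptions, not the method that Oracle Text uses. The function name, the window size, and the sample sentences are hypothetical.

```python
from collections import Counter

def collocates(documents, keyword, window=2, top=3):
    """Return the words found most often within window positions of keyword."""
    counts = Counter()
    for text in documents:
        words = text.lower().split()
        for i, word in enumerate(words):
            if word == keyword:
                start = max(0, i - window)
                # Neighbors on both sides, excluding the keyword itself.
                neighbors = words[start:i] + words[i + 1:i + 1 + window]
                counts.update(neighbors)
    return counts.most_common(top)

documents = [
    "the battery life is great",
    "battery life could be better",
    "poor battery charger",
]
top_collocate = collocates(documents, "battery", window=1, top=1)
```

A production-quality approach would normally remove stop words and weight the counts, for example by how rare each neighbor is in the whole collection.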

NEAR2 Operator and NDATA Operator Enhancements

Oracle Text provides a new operator, NEAR2, that is an enhanced version of the existing NEAR operator. The NEAR2 operator aims to combine the semantics of the PHRASE, NEAR, and AND operators.

The NDATA operator now provides more control over the similarity scoring of character and phonetic matches. It also provides more control over the overall ranked results returned by the operation.

Join Character Support for Japanese VGRAM Lexer and WORLD Lexer

Oracle Text provides join character support for the Japanese VGRAM lexer and WORLD lexer.

New Document Formats

Oracle Text provides new text filters to support new document formats.

Extract Synonyms of Words in Documents

Oracle Text provides new options in the CTX_DOC package to enable thesaurus support. You can use the CTX_DOC.TOKENS and CTX_DOC.POLICY_TOKENS procedures to extract synonyms of index tokens.

Read-only MDATA Sections

Oracle Text supports read-only MDATA sections. When a read-only section is queried, an extra cursor is not opened for each MDATA operator. Because you cannot add or remove MDATA values in a read-only MDATA section, there is no overhead for tracking updated MDATA values, and queries run faster.

See Also:

MDATA Section

Index Name Length and Long Identifier Support for Oracle Text Objects

Oracle Text index names can be as long as database object names. The maximum length is 128 bytes when the database compatibility level is 12.2 or later, and 30 bytes for earlier releases. Oracle Text also supports long identifiers for Oracle Text objects, for which the maximum size was increased to 128 bytes.

Increased Default Value and Upper Limit of the MAX_INDEX_MEMORY Parameter

Oracle Text provides an increase in the default value and the upper limit of the MAX_INDEX_MEMORY parameter that can be allocated for indexing purposes. The size was increased to 256 GB.

JSON Improvements

You can use a simpler alternative syntax to create a search index on JSON.

See Also:

Oracle Database JSON Developer's Guide for more information about creating a search index for JSON