Oracle® Text Reference
11g Release 2 (11.2)
Part Number E16593-02
The following describes new features of the Oracle Database 11g Release 2 (11.2) edition of Oracle Text and provides pointers to additional information. Information about new features from previous releases is also retained to help you migrate to the current release.
The following sections describe the new features in Oracle Text:
Entity extraction and identification
You can search for terms that are unknown to you without specifying a particular search text. This process of identifying names, places, dates, and other objects when they are mentioned in a document and tagging each occurrence with its type and subtype correctly is called entity extraction.
The value of entity extraction is that it enables you to identify instances of a particular pre-specified class of entities in textual documents, thus enabling you to produce a structured view of a document that can later be used for text/data mining and more comprehensive intelligence analysis.
Someone accustomed to the spelling rules of one culture can have difficulty applying those same rules to a name originating from a different culture. Name matching provides a way to match proper names that might differ in spelling due to orthographic variation. It also enables you to search for somewhat inaccurate data, such as might occur when a record's first name and surname are not properly segmented.
Result set interface
A page of search results typically consists of many disparate elements, such as metadata of the first few documents, per-word hit counts, or total hit counts. In past releases of Oracle Text, generating these results required several queries and calls, such as a query on the base table, a call to CTX_QUERY.COUNT_HITS, and so on. Each call required time to reparse the query and look up index metadata. In this release, instead of accessing the database repeatedly to construct pieces of the search results, you can use the result set interface, which produces the various kinds of data needed for a page of search results all at once, improving performance by sharing overhead. The result set interface can also return data views that are difficult to express in SQL, such as top N by category queries.
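The following is a minimal sketch of a single result set call that returns a total hit count, the first ten hits, and per-category counts together. The index name docs_idx and the SDATA section category are hypothetical; the sketch assumes a CONTEXT index that already defines that SDATA section.

```sql
-- Sketch only: docs_idx and the SDATA section "category" are assumed names.
DECLARE
  rs CLOB;
BEGIN
  -- The result set is returned in a temporary CLOB as an XML document.
  DBMS_LOB.CREATETEMPORARY(rs, TRUE, DBMS_LOB.SESSION);

  CTX_QUERY.RESULT_SET(
    'docs_idx',                       -- index to query
    'oracle',                         -- text query
    '<ctx_result_set_descriptor>
       <count/>
       <hitlist start_hit_num="1" end_hit_num="10" order="score desc">
         <score/>
         <rowid/>
       </hitlist>
       <group sdata="category">
         <count/>
       </group>
     </ctx_result_set_descriptor>',   -- what to return
    rs);

  -- Print the beginning of the XML result for inspection.
  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(rs, 4000, 1));
  DBMS_LOB.FREETEMPORARY(rs);
END;
/
```

The descriptor requests everything in one pass, so the query is parsed and the index metadata is looked up only once.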
On Windows systems, the executable file that you specify for the command attribute must now exist in the %ORACLE_HOME%/ctx/bin directory instead of
Zero downtime for applications with new incremental indexing and online index creation.
See Also: "Creating a CONTEXT Index Incrementally with POPULATE_PENDING" in Oracle Text Application Developer's Guide
New features for re-creating an index online and finer control for maintenance processes.
See Also: "Re-Creating an Index" in Oracle Text Application Developer's Guide
New Oracle Text Manager in Oracle Enterprise Manager with which you can:
Monitor health of Oracle Text indexes.
Modify index settings.
Generate index-level statistics about disk space, fragmentation, garbage, frequency of words, and more.
Synchronize, optimize, and rebuild indexes.
Diagnose problems, and resume failed operations.
See Also: "Text Manager in Oracle Enterprise Manager" in Oracle Text Application Developer's Guide
New support for composite domain index for CONTEXT indextype for improved mixed-query performance.
See Also: "Composite Domain Index (CDI) in Oracle Text" in Oracle Text Application Developer's Guide
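As an illustration of a mixed query served by a composite domain index, the following sketch assumes a hypothetical table docs with columns id, text, category, and price. It is one possible setup, not a recommended configuration.

```sql
-- Sketch only: table docs and its columns are assumed for illustration.
-- FILTER BY and ORDER BY store the named columns inside the text index.
CREATE INDEX docs_cdi_idx ON docs(text)
  INDEXTYPE IS CTXSYS.CONTEXT
  FILTER BY category, price
  ORDER BY price;

-- A mixed query: a text predicate plus a structured predicate and a sort.
-- The optimizer may evaluate all three through the composite domain index.
SELECT id
  FROM docs
 WHERE CONTAINS(text, 'oracle', 1) > 0
   AND category = 'news'
 ORDER BY price;
```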
Improved query performance and scalability.
See Also: "Parallelizing Queries Across Oracle RAC Nodes" in Oracle Text Application Developer's Guide
SDATA section type and SDATA operator that enable range searches on metadata.
See Also: "SDATA Section" in Oracle Text Application Developer's Guide
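A range search on metadata might look like the following sketch. The section group docs_sg, the table docs, and the price tag are hypothetical, and the sketch assumes that each document carries its price inside a <price> tag.

```sql
-- Sketch only: docs_sg, docs, and the price tag are assumed names.
BEGIN
  CTX_DDL.CREATE_SECTION_GROUP('docs_sg', 'BASIC_SECTION_GROUP');
  -- Map the <price> tag to a NUMBER-typed SDATA section named price.
  CTX_DDL.ADD_SDATA_SECTION('docs_sg', 'price', 'price', 'NUMBER');
END;
/

CREATE INDEX docs_sdata_idx ON docs(text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('SECTION GROUP docs_sg');

-- Combine a text term with a range restriction on the SDATA section.
SELECT id
  FROM docs
 WHERE CONTAINS(text,
         'oracle AND SDATA(price >= 10) AND SDATA(price <= 50)') > 0;
```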
New user-defined scoring feature.
AUTO_LEXER lexer type that performs language identification, word segmentation, document analysis, part-of-speech tagging, and stemming. The AUTO_LEXER type also enables customization of these components.
New values for the INDEX_STEMS attribute of the BASIC_LEXER type to enable better query performance for stem ($) queries.
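Both lexer features are configured through lexer preferences. The sketch below uses hypothetical preference names and sets INDEX_STEMS to ENGLISH, a long-standing value chosen only for illustration; it does not show the values that are new in this release.

```sql
-- Sketch only: preference names and the table docs are assumed.
BEGIN
  -- A lexer preference based on the AUTO_LEXER type, using its defaults.
  CTX_DDL.CREATE_PREFERENCE('docs_auto_lexer', 'AUTO_LEXER');

  -- A BASIC_LEXER preference that indexes stems at indexing time,
  -- so that stem ($) queries do not have to expand terms at query time.
  CTX_DDL.CREATE_PREFERENCE('docs_basic_lexer', 'BASIC_LEXER');
  CTX_DDL.SET_ATTRIBUTE('docs_basic_lexer', 'INDEX_STEMS', 'ENGLISH');
END;
/

-- An index uses exactly one lexer preference, named in PARAMETERS.
CREATE INDEX docs_stem_idx ON docs(text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('LEXER docs_basic_lexer');
```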
NOPOPULATE option for INDEX to support incremental indexing.
See Also: "POPULATE | NOPOPULATE"
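One way to build an index incrementally is to create it empty with NOPOPULATE, queue the existing rows with POPULATE_PENDING, and then synchronize in bounded steps. The table and index names below are hypothetical, and the 10-minute limit is an arbitrary choice.

```sql
-- Sketch only: docs and docs_idx are assumed names.
-- Create the index structures without indexing any existing rows.
CREATE INDEX docs_idx ON docs(text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('NOPOPULATE');

BEGIN
  -- Place all existing rows on the pending queue.
  CTX_DDL.POPULATE_PENDING('docs_idx');

  -- Index as many pending rows as fit in roughly 10 minutes.
  -- Repeat this call (for example, from a scheduled job) until
  -- the pending queue is empty.
  CTX_DDL.SYNC_INDEX('docs_idx', maxtime => 10);
END;
/
```

Because each synchronization pass is time-limited, the work can be spread across quiet periods while the application stays available.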
The limit for the number of partitions in Oracle Text is now the same as the maximum for Oracle Database.
See Also: "Partitioned Tables and Indexes" in Oracle Text Application Developer's Guide
New usage tracking feature.
See Also: "Database Feature Usage Tracking in Oracle Enterprise Manager" in Oracle Text Application Developer's Guide