
Oracle® Text Application Developer's Guide
10g Release 1 (10.1)

Part Number B10729-01

List of Figures

List of Tables

Title and Copyright Information

Send Us Your Comments


Related Documentation
Documentation Accessibility

1 Oracle Text Application Development

1.1 What is Oracle Text?
1.2 Designing Your Application
1.3 Text Queries on Document Collections
1.3.1 Flowchart of Text Query Application
1.4 Queries on Catalog Information
1.4.1 Flowchart for Catalog Query Application
1.5 Document Classification
1.6 XML Searching
1.6.1 Using Oracle Text
1.6.2 Using the Oracle XML DB Framework
1.6.3 Combining Oracle Text features with Oracle XML DB
    Using the Text-on-XML Method
    Using the XML-on-Text Method

2 Getting Started with Oracle Text

2.1 Overview of Getting Started with Oracle Text
2.2 Creating an Oracle Text User
2.3 Query Application Quick Tour
2.3.1 Building Web Applications with the Oracle Text Wizard
    Oracle JDeveloper
    Oracle Text Wizard Addins
    Oracle Text Wizard Instructions
2.4 Catalog Application Quick Tour
2.5 Classification Application Quick Tour
2.5.1 Steps for Creating a Classification Application

3 Indexing

3.1 About Oracle Text Indexes
3.1.1 Type of Index
3.1.2 Structure of the Oracle Text CONTEXT Index
    Merged Word and Theme Index
3.1.3 The Oracle Text Indexing Process
    Datastore Object
    Filter Object
    Sectioner Object
    Lexer Object
    Indexing Engine
3.1.4 Partitioned Tables and Indexes
    Querying Partitioned Tables
3.1.5 Creating an Index Online
3.1.6 Parallel Indexing
3.1.7 Indexing and Views
3.2 Considerations For Indexing
3.2.1 Location of Text
    Supported Column Types
    Storing Text in the Text Table
    Storing File Path Names
    Storing URLs
    Storing Associated Document Information
    Format and Character Set Columns
    Supported Document Formats
    Summary of DATASTORE Types
3.2.2 Document Formats and Filtering
    No Filtering for HTML
    Filtering Mixed-Format Columns
    Custom Filtering
3.2.3 Bypassing Rows for Indexing
3.2.4 Document Character Set
    Mixed Character Set Columns
3.2.5 Document Language
    Language Features Outside BASIC_LEXER
    Indexing Multi-language Columns
3.2.6 Indexing Special Characters
    Printjoins Character
    Skipjoins Character
    Other Characters
3.2.7 Case-Sensitive Indexing and Querying
3.2.8 Language Specific Features
    Indexing Themes
    Base-Letter Conversion for Characters with Diacritical Marks
    Alternate Spelling
    Composite Words
    Korean, Japanese, and Chinese Indexing
3.2.9 Fuzzy Matching and Stemming
3.2.10 Better Wildcard Query Performance
3.2.11 Document Section Searching
3.2.12 Stopwords and Stopthemes
    Multi-Language Stoplists
3.2.13 Index Performance
3.2.14 Query Performance and Storage of LOB Columns
3.3 Index Creation
3.3.1 Procedure for Creating a CONTEXT Index
3.3.2 Creating Preferences
    Datastore Examples
    NULL_FILTER Example: Indexing HTML Documents
    PROCEDURE_FILTER Example
    BASIC_LEXER Example: Setting Printjoins Characters
    MULTI_LEXER Example: Indexing a Multi-Language Table
    BASIC_WORDLIST Example: Enabling Substring and Prefix Indexing
3.3.3 Creating Section Groups for Section Searching
    Example: Creating HTML Sections
3.3.4 Using Stopwords and Stoplists
    Multi-Language Stoplists
    Stopthemes and Stopclasses
    PL/SQL Procedures for Managing Stoplists
3.3.5 Creating an Index
3.3.6 Creating a CONTEXT Index
    CONTEXT Index and DML
    Default CONTEXT Index Example
    Custom CONTEXT Index Example: Indexing HTML Documents
3.3.7 Creating a CTXCAT Index
    CTXCAT Index and DML
    About CTXCAT Sub-Indexes and Their Costs
    Creating CTXCAT Sub-indexes
    Creating CTXCAT Index
3.3.8 Creating a CTXRULE Index
    Create a Table of Queries
    Create the CTXRULE Index
    Classifying a Document
3.4 Index Maintenance
3.4.1 Viewing Index Errors
3.4.2 Dropping an Index
3.4.3 Resuming Failed Index
    Example: Resuming a Failed Index
3.4.4 Rebuilding an Index
    Example: Rebuilding an Index
3.4.5 Dropping a Preference
    Example
3.5 Managing DML Operations for a CONTEXT Index
3.5.1 Viewing Pending DML
3.5.2 Synchronizing the Index
    Setting Background DML
3.5.3 Index Optimization
    CONTEXT Index Structure
    Index Fragmentation
    Document Invalidation and Garbage Collection
    Single Token Optimization
    Viewing Index Fragmentation and Garbage Data
    Examples: Optimizing the Index

4 Querying

4.1 Overview of Queries
4.1.1 Querying with CONTAINS
    CONTAINS SQL Example
    CONTAINS PL/SQL Example
    Structured Query with CONTAINS
4.1.2 Querying with CATSEARCH
    CATSEARCH SQL Query
    CATSEARCH Example
4.1.3 Querying with MATCHES
    MATCHES SQL Query
    MATCHES PL/SQL Example
4.1.4 Word and Phrase Queries
    CONTAINS Phrase Queries
    CATSEARCH Phrase Queries
4.1.5 Querying Stopwords
4.1.6 ABOUT Queries and Themes
    Querying Stopthemes
4.1.7 Query Expressions
    CONTAINS Operators
    CATSEARCH Operator
    MATCHES Operator
4.1.8 Case-Sensitive Searching
    Word Queries
    ABOUT Queries
4.1.9 Query Feedback
4.1.10 Query Explain Plan
4.1.11 Using a Thesaurus in Queries
4.1.12 Document Section Searching
4.1.13 Using Query Templating
4.1.14 Query Rewrite
4.1.15 Query Relaxation
4.1.16 Query Language
4.1.17 Alternative Scoring
4.1.18 Alternative Grammar
4.1.19 Query Analysis
4.1.20 Other Query Features
4.2 The CONTEXT Grammar
4.2.1 ABOUT Query
4.2.2 Logical Operators
4.2.3 Section Searching
4.2.4 Proximity Queries with NEAR and NEAR_ACCUM Operators
4.2.5 Fuzzy, Stem, Soundex, Wildcard and Thesaurus Expansion Operators
4.2.6 Using CTXCAT Grammar
4.2.7 Stored Query Expressions
    Defining a Stored Query Expression
    SQE Example
4.2.8 Calling PL/SQL Functions in CONTAINS
4.2.9 Optimizing for Response Time
    Other Factors that Influence Query Response Time
4.2.10 Counting Hits
    SQL Count Hits Example
    Counting Hits with a Structured Predicate
    PL/SQL Count Hits Example
4.3 The CTXCAT Grammar
4.3.1 Using CONTEXT Grammar with CATSEARCH

5 Document Presentation

5.1 Highlighting Query Terms
5.1.1 Text highlighting
5.1.2 Theme Highlighting
5.1.3 CTX_DOC Highlighting Procedures
    Highlight Procedure
    Markup Procedure
    Filter Procedure
    CTX_DOC.POLICY_FILTER Procedure
5.2 Obtaining Lists of Themes, Gists, and Theme Summaries
5.2.1 Lists of Themes
    In-Memory Themes
    Result Table Themes
5.2.2 Gist and Theme Summary
    In-Memory Gist
    Result Table Gists
    Theme Summary
5.3 Document Presentation and Highlighting
5.3.1 Highlighting Example
5.3.2 Document List of Themes Example
5.3.3 Gist Example

6 Document Classification

6.1 Overview
6.1.1 Classification Applications
6.2 Classification Solutions
6.3 Rule-Based Classification
6.3.1 Rule-based Classification Example
6.3.2 CTXRULE Parameters and Limitations
6.4 Supervised Classification
6.4.1 Decision Tree Supervised Classification
    Decision Tree Supervised Classification Example
6.4.2 SVM-Based Supervised Classification
    SVM-Based Supervised Classification Example
6.5 Unsupervised Classification (Clustering)
6.5.1 Clustering Example

7 Performance Tuning

7.1 Optimizing Queries with Statistics
7.1.1 Collecting Statistics Example
7.1.2 Re-Collecting Statistics
7.1.3 Deleting Statistics
7.2 Optimizing Queries for Response Time
7.2.1 Other Factors that Influence Query Response Time
7.2.2 Improved Response Time with FIRST_ROWS(n) for ORDER BY Queries
    About the FIRST_ROWS Hint
7.2.3 Improved Response Time using Local Partitioned CONTEXT Index
    Range Search on Partition Key Column
    ORDER BY Partition Key Column
7.2.4 Improved Response Time with Local Partitioned Index for Order by Score
7.3 Optimizing Queries for Throughput
7.3.1 CHOOSE and ALL_ROWS Modes
7.3.2 FIRST_ROWS Mode
7.4 Tracing
7.5 Parallel Queries
7.6 Tuning Queries with Blocking Operations
7.7 Frequently Asked Questions About Query Performance
7.7.1 What is Query Performance?
7.7.2 What is the fastest type of text query?
7.7.3 Should I collect statistics on my tables?
7.7.4 How does the size of my data affect queries?
7.7.5 How does the format of my data affect queries?
7.7.6 What is a functional versus an indexed lookup?
7.7.7 What tables are involved in queries?
7.7.8 Does sorting the results slow a text-only query?
7.7.9 How do I make an ORDER BY score query faster?
7.7.10 Which Memory Settings Affect Querying?
7.7.11 Does out of line LOB storage of wide base table columns improve performance?
7.7.12 How can I make a CONTAINS query on more than one column faster?
7.7.13 Is it OK to have many expansions in a query?
7.7.14 How can local partition indexes help?
7.7.15 Should I query in parallel?
7.7.16 Should I index themes?
7.7.17 When should I use a CTXCAT index?
7.7.18 When is a CTXCAT index NOT suitable?
7.7.19 What optimizer hints are available, and what do they do?
7.8 Frequently Asked Questions About Indexing Performance
7.8.1 How long should indexing take?
7.8.2 Which index memory settings should I use?
7.8.3 How much disk overhead will indexing require?
7.8.4 How does the format of my data affect indexing?
7.8.5 Can parallel indexing improve performance?
7.8.6 How can I improve index performance when creating a local partitioned index?
7.8.7 How can I tell how much indexing has completed?
7.9 Frequently Asked Questions About Updating the Index
7.9.1 How often should I index new or updated records?
7.9.2 How can I tell when my indexes are getting fragmented?
7.9.3 Does memory allocation affect index synchronization?

8 Document Section Searching

8.1 About Document Section Searching
8.1.1 Enabling Section Searching
    Create a Section Group
    Define Your Sections
    Index your Documents
    Section Searching with WITHIN Operator
    Path Searching with INPATH and HASPATH Operators
8.1.2 Section Types
    Zone Section
    Field Section
    Stop Section
    MDATA Section
    Attribute Section
    Special Sections
8.2 HTML Section Searching
8.2.1 Creating HTML Sections
8.2.2 Searching HTML Meta Tags
    Example: Creating Sections for <META> Tags
8.3 XML Section Searching
8.3.1 Automatic Sectioning
8.3.2 Attribute Searching
    Creating Attribute Sections
    Searching Attributes with the INPATH Operator
8.3.3 Creating Document Type Sensitive Sections
8.3.4 Path Section Searching
    Creating Index with PATH_SECTION_GROUP
    Top-Level Tag Searching
    Any-Level Tag Searching
    Direct Parentage Searching
    Tag Value Testing
    Attribute Searching
    Attribute Value Testing
    Path Testing
    Section Equality Testing with HASPATH

9 Working With a Thesaurus

9.1 Overview of Thesauri
9.1.1 Thesaurus Creation and Maintenance
    CTX_THES Package
    Thesaurus Operators
    ctxload Utility
9.1.2 Case-sensitive Thesauri
9.1.3 Case-insensitive Thesauri
9.1.4 Default Thesaurus
9.1.5 Supplied Thesaurus
    Supplied Thesaurus Structure and Content
    Supplied Thesaurus Location
9.2 Defining Thesaural Terms
9.2.1 Defining Synonyms
9.2.2 Defining Hierarchical Relations
9.3 Using a Thesaurus in a Query Application
9.3.1 Loading a Custom Thesaurus and Issuing Thesaural Queries
    Advantage
    Limitations
9.3.2 Augmenting Knowledge Base with Custom Thesaurus
    Advantage
    Limitations
    Linking New Terms to Existing Terms
    Loading a Thesaurus with ctxload
    Compiling a Loaded Thesaurus
9.4 About the Supplied Knowledge Base
9.4.1 Adding a Language-Specific Knowledge Base
    Limitations

10 Administration

10.1 Oracle Text Users and Roles
10.1.1 CTXSYS User
10.1.2 CTXAPP Role
10.1.3 Granting Roles and Privileges to Users
10.2 DML Queue
10.3 The CTX_OUTPUT Package
10.4 The CTX_REPORT Package
10.5 Servers
10.6 Administration Tool

11 Migrating Applications from Earlier Releases

11.1 Security Improvements in Oracle Text
11.1.1 CTXSYS No Longer Has DBA Permissions
11.1.2 Migrating CTXSYS-Owned Procedures
11.1.3 Effective User During Indexing
11.1.4 Procedures Do Not Need to Be Owned by CTXSYS
11.1.5 Synching and Optimizing of Other Users' Indexes
11.1.6 CTX Packages and Invoker's Rights
11.1.7 CREATE TABLE Permissions
11.2 Migrating Back to Previous Releases

A CONTEXT Query Application

A.1 Web Query Application Overview
A.2 The PSP Web Application
A.2.1 Web Application Prerequisites
A.2.2 Building the Web Application
A.2.3 PSP Sample Code
A.2.3.1 loader.ctl
A.2.3.2 loader.dat
A.2.3.3 search_htmlservices.sql
A.2.3.4 search_html.psp
A.3 The JSP Web Application
A.3.1 Web Application Prerequisites
A.3.2 JSP Sample Code
A.3.2.1 search_html.jsp

B CATSEARCH Query Application

B.1 CATSEARCH Web Query Application Overview
B.2 The JSP Web Application
B.2.1 Building the JSP Web Application
B.2.2 JSP Sample Code
B.2.2.1 loader.ctl
B.2.2.2 loader.dat
B.2.2.3 catalogSearch.jsp