Oracle8i interMedia Text Reference
Release 2 (8.1.6)

Part Number A77063-01


CTX_DOC Package, 3 of 8


GIST

Use the CTX_DOC.GIST procedure to generate a Gist and theme summaries for a document. You can generate Gists and theme summaries at either the paragraph level or the sentence level.

Syntax 1: In-Memory Storage

CTX_DOC.GIST(
              index_name    IN VARCHAR2, 
              textkey       IN VARCHAR2, 
              restab        IN OUT CLOB, 
              query_id      IN NUMBER DEFAULT 0,
              glevel        IN VARCHAR2 DEFAULT 'P',
              pov           IN VARCHAR2 DEFAULT NULL,
              numParagraphs IN NUMBER DEFAULT 16,
              maxPercent    IN NUMBER DEFAULT 10);

Syntax 2: Result Table Storage

CTX_DOC.GIST(
              index_name    IN VARCHAR2, 
              textkey       IN VARCHAR2, 
              restab        IN VARCHAR2, 
              query_id      IN NUMBER DEFAULT 0,
              glevel        IN VARCHAR2 DEFAULT 'P',
              pov           IN VARCHAR2 DEFAULT NULL,
              numParagraphs IN NUMBER DEFAULT 16,
              maxPercent    IN NUMBER DEFAULT 10);

index_name

Specify the name of the index associated with the text column containing the document identified by textkey.

textkey

Specify the unique identifier (usually the primary key) for the document.

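The textkey parameter can be one of the following: a single-column primary key value, an encoded composite (multiple-column) primary key created with CTX_DOC.PKENCODE, or the rowid of the row containing the document.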

You toggle between primary key and rowid identification using CTX_DOC.SET_KEY_TYPE.
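
As an illustration of the toggle, the following sketch looks up a document's rowid, switches to rowid identification for one GIST call, and then switches back. It assumes that CTX_DOC.SET_KEY_TYPE accepts the values 'ROWID' and 'PRIMARY_KEY'. The table news and its key column id are hypothetical names standing in for the base table of newsindex; CTX_GIST is the result table created in the examples in this section.

```plsql
declare
  rid varchar2(18);
begin
  -- "news" and "id" are hypothetical: substitute the base table of your index
  select rowidtochar(rowid) into rid from news where id = 34;

  -- identify documents by rowid for the calls that follow
  ctx_doc.set_key_type('ROWID');
  ctx_doc.gist('newsindex', rid, 'CTX_GIST', 1, glevel => 'P', pov => 'GENERIC');

  -- switch back to primary key identification
  ctx_doc.set_key_type('PRIMARY_KEY');
end;
```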

restab

You can specify that this procedure store the Gist and theme summaries in either a table or an in-memory CLOB.

To store results in a table, specify the name of the table.

See Also:

For more information about the structure of the Gist result table, see "Gist Table" in Appendix B.

To store results in memory, specify the name of the CLOB locator. If restab is NULL, a temporary CLOB is allocated and returned. You must de-allocate the locator after using it.

If restab is not NULL, the CLOB is truncated before the operation.

query_id

Specify an identifier for the row(s) inserted into restab.

glevel

Specify the type of Gist/theme summary to produce. The possible values are P for paragraph-level and S for sentence-level.

The default is P.

pov

Specify whether a Gist or a single theme summary is generated. The type of Gist/theme summary generated (sentence-level or paragraph-level) depends on the value specified for glevel.

To generate a Gist for the entire document, specify a value of 'GENERIC' for pov. To generate a theme summary for a single theme in a document, specify the theme as the value for pov.

When you use result table storage and do not specify a value for pov, this procedure returns the generic Gist plus up to fifty theme summaries for the document.

When you use in-memory storage to a CLOB, specify a pov. If you do not specify one, this procedure generates only a generic Gist for the document.


Note:

The pov parameter is case sensitive. To return a Gist for a document, specify 'GENERIC' in all uppercase. To return a theme summary, specify the theme exactly as it is generated for the document.

Only the themes generated by CTX_DOC.THEMES for a document can be used as input for pov.
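
Because pov must match a generated theme exactly, it can help to list a document's themes first. The following SQL*Plus sketch is illustrative only: it assumes the result-table form of CTX_DOC.THEMES (index name, textkey, table name, query ID) and a theme table with query_id, theme, and weight columns. Check the CTX_DOC.THEMES reference for the exact signature and table structure before relying on it.

```plsql
-- assumed theme table layout; CTX_THEMES is an illustrative name
create table CTX_THEMES (query_id number,
                         theme    varchar2(2000),
                         weight   number);

begin
  -- store the themes for document 34 under query ID 1
  ctx_doc.themes('newsindex', '34', 'CTX_THEMES', 1);
end;
/

-- copy a theme string exactly as shown (including case) into pov
select theme
from   CTX_THEMES
where  query_id = 1
order  by weight desc;
```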


numParagraphs

Specify the maximum number of document paragraphs (or sentences) selected for each document Gist/theme summary. The default is 16.


Note:

The numParagraphs limit applies only when it yields a smaller Gist/theme summary than the maxPercent limit does.

This means that the system always returns the smaller of the two possible Gist/theme summary sizes.


maxPercent

Specify the maximum number of document paragraphs (or sentences) selected for each document Gist/theme summary, expressed as a percentage of the total paragraphs (or sentences) in the document. The default is 10.


Note:

The maxPercent limit applies only when it yields a smaller Gist/theme summary than the numParagraphs limit does.

This means that the system always returns the smaller of the two possible Gist/theme summary sizes.
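
The interaction between the two limits can be worked through with simple arithmetic. The following sketch does not call CTX_DOC.GIST; it only mirrors the rule stated in the notes, using a hypothetical document of 200 paragraphs and the default parameter values. How the system rounds a fractional percentage is not stated here, so the FLOOR call is an assumption (it has no effect on these numbers).

```plsql
declare
  total_paragraphs number := 200;  -- hypothetical document length
  num_paragraphs   number := 16;   -- numParagraphs default
  max_percent      number := 10;   -- maxPercent default
  by_percent       number;
begin
  -- limit implied by maxPercent; rounding down is an assumption
  by_percent := floor(total_paragraphs * max_percent / 100);

  dbms_output.put_line('limit from numParagraphs: ' || num_paragraphs);
  dbms_output.put_line('limit from maxPercent: ' || by_percent);

  -- the smaller of the two limits is the one that applies
  dbms_output.put_line('effective limit: ' || least(num_paragraphs, by_percent));
end;
```

With 200 paragraphs, maxPercent allows 20 paragraphs, so the numParagraphs limit of 16 applies. For a 100-paragraph document, maxPercent allows only 10, so that limit applies instead.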


Examples

In-Memory Gist

The following example generates a non-default size generic Gist of at most ten paragraphs. The result is stored in memory in a CLOB locator. The code then de-allocates the returned CLOB locator after using it.

declare
  gklob clob;
  amt   number := 40;
  line  varchar2(80);

begin
  -- gklob is NULL when passed in, so ctx_doc.gist allocates a temporary
  -- CLOB and places the results there. Pass the variable itself, not a
  -- quoted name: a string literal would select result table storage.
  ctx_doc.gist('newsindex', '34', gklob, 1, glevel => 'P', pov => 'GENERIC',
               numParagraphs => 10);

  -- read the first 40 characters of the Gist
  dbms_lob.read(gklob, amt, 1, line);
  dbms_output.put_line('FIRST 40 CHARS ARE:'||line);

  -- the temporary CLOB must be de-allocated after use
  dbms_lob.freetemporary(gklob);
end;

Result Table Gists

The following example creates a Gist table called CTX_GIST:

create table CTX_GIST (query_id  number,
                       pov       varchar2(80), 
                       gist      CLOB);

Gists

The following example returns a default-sized, paragraph-level Gist for document 34, as well as theme summaries for all the themes in the document:

begin
ctx_doc.gist('newsindex','34','CTX_GIST',1,glevel => 'P');
end;

The following example generates a non-default-size generic Gist of at most ten paragraphs:

begin
ctx_doc.gist('newsindex','34','CTX_GIST',1,glevel => 'P',pov => 'GENERIC', 
numParagraphs => 10);
end;

The following example generates a Gist whose number of paragraphs is at most ten percent of the total paragraphs in the document:

begin 
ctx_doc.gist('newsindex','34','CTX_GIST',1, glevel =>'P',pov => 'GENERIC', 
maxPercent => 10);
end;

Theme Summary

The following example returns a paragraph-level theme summary for the theme insects for document 34. The theme summary is returned at the default size.

begin
ctx_doc.gist('newsindex','34','CTX_GIST',1,glevel =>'P', pov => 'insects');
end;
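
After any of the result table examples has run, the stored Gists and theme summaries can be inspected with an ordinary query. The following sketch assumes the CTX_GIST table created above and the query ID of 1 used in these examples. The 80-character preview length and the gist_start alias are arbitrary choices for illustration.

```plsql
-- show which Gist or theme summary each row holds, with the
-- first 80 characters of its text
select pov,
       dbms_lob.substr(gist, 80, 1) as gist_start
from   CTX_GIST
where  query_id = 1;
```

Because every example here uses query ID 1, rows from successive calls share that ID. Using a different query_id for each call makes the results easier to tell apart.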

Notes

When you use result table storage and specify no pov, this procedure generates up to 50 theme summaries for a document. As a result, CTX_DOC.GIST creates a maximum of 51 results for each document: one Gist for the entire document and one theme summary for each theme.

When you use in-memory storage, CTX_DOC.GIST creates only one Gist.

When textkey is a composite textkey, you must encode the composite textkey string using CTX_DOC.PKENCODE.
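
A minimal sketch of a call with a composite textkey follows. The index name authorindex and the two key values are hypothetical: they stand for an index on a table whose primary key has two columns, such as an author name and a document number. CTX_DOC.PKENCODE is assumed to take the key values in primary key column order and return the encoded string.

```plsql
begin
  -- 'authorindex', 'smith' and 14 are hypothetical; the encoded string
  -- returned by pkencode is passed directly as the textkey argument
  ctx_doc.gist('authorindex', ctx_doc.pkencode('smith', 14), 'CTX_GIST', 1,
               glevel => 'P', pov => 'GENERIC');
end;
```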


Copyright © 1996-2000, Oracle Corporation.

All Rights Reserved.
