8.3 Obtaining Lists of Themes, Gists, and Theme Summaries
The following table describes lists of themes, gists, and theme summaries.
Table 8-1 Lists of Themes, Gists, and Theme Summaries
| Output Type | Description |
|---|---|
| List of themes | A list of the main concepts of a document. Each theme is a single word, a single phrase, or a hierarchical list of parent themes. |
| Gist | Text in a document that best represents what the document is about as a whole. |
| Theme summary | Text in a document that best represents a given theme in the document. |
To obtain lists of themes, gists, and theme summaries, use procedures in the CTX_DOC package to:
- Identify documents by ROWID in addition to primary key
- Store results in-memory for improved performance
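As a rough illustration of identifying a document by ROWID, the sketch below switches the session to ROWID identification with CTX_DOC.SET_KEY_TYPE before requesting themes. The base table name news and its key column id are assumptions made for this sketch and do not come from this guide; newsindex is the index name used in the examples in this section.

```sql
declare
  the_themes ctx_doc.theme_tab;
  doc_rowid  varchar2(18);
begin
  -- Assumption: NEWS is the base table of NEWSINDEX and ID is its key column.
  select rowidtochar(rowid) into doc_rowid from news where id = 34;

  -- Interpret textkey arguments as ROWIDs for this session.
  ctx_doc.set_key_type('ROWID');
  ctx_doc.themes('newsindex', doc_rowid, the_themes);
  dbms_output.put_line('Themes found: ' || the_themes.count);

  -- Switch back to primary-key identification for later calls.
  ctx_doc.set_key_type('PRIMARY_KEY');
end;
```

The key type setting applies to the calling session only, so the sketch sets it back before finishing. If your session normally uses a different key type, restore that one instead.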
8.3.1 Lists of Themes
A list of themes is a list of the main concepts in a document. Use the CTX_DOC.THEMES procedure to generate lists of themes.
See Also:
Oracle Text Reference to learn more about the command syntax for CTX_DOC.THEMES
The following in-memory theme example generates the top 10 themes for document 1 and stores them in an in-memory table called the_themes. The example then loops through the table to display the document themes.
declare
the_themes ctx_doc.theme_tab;
begin
ctx_doc.themes('myindex','1',the_themes, num_themes=>10);
for i in 1..the_themes.count loop
dbms_output.put_line(the_themes(i).theme||':'||the_themes(i).weight);
end loop;
end;
The following example creates a theme result table:
create table ctx_themes (query_id number,
theme varchar2(2000),
weight number);
In this example, you obtain a list of themes where each element in the list is a single theme:
begin
ctx_doc.themes('newsindex','34','CTX_THEMES',1,full_themes => FALSE);
end;
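Once the procedure has run, the themes are ordinary rows in the result table. A minimal query to inspect them might look like the following; it assumes the ctx_themes table created above and the query_id value 1 passed as the fourth argument of the call.

```sql
-- List the themes stored for query_id 1, strongest first.
select theme, weight
from ctx_themes
where query_id = 1
order by weight desc;
```

Because every result-table example in this section uses query_id 1, rows from separate runs accumulate under the same identifier unless you delete them or use a different query_id for each call.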
In this example, you obtain a list of themes where each element in the list is a hierarchical list of parent themes:
begin
ctx_doc.themes('newsindex','34','CTX_THEMES',1,full_themes => TRUE);
end;
8.3.2 Gist and Theme Summary
A gist is the text in a document that best represents what the document is about as a whole. A theme summary is the text in a document that best represents a single theme in the document.
Use the CTX_DOC.GIST procedure to generate gists and theme summaries. You can specify the size of the gist or theme summary when you call the procedure.
See Also:
Oracle Text Reference to learn about the command syntax for CTX_DOC.GIST
In-Memory Gist Example
The following example generates a nondefault size generic gist of at most 10 paragraphs. The result is stored in memory in a CLOB locator. The code then de-allocates the returned CLOB locator after using it.
declare
gklob clob;
amt number := 40;
line varchar2(80);
begin
ctx_doc.gist('newsindex','34',gklob,glevel => 'P',pov => 'GENERIC', numParagraphs => 10);
-- gklob is NULL when passed in, so ctx_doc.gist allocates a temporary
-- CLOB for us and places the results there.
dbms_lob.read(gklob, amt, 1, line);
dbms_output.put_line('FIRST 40 CHARS ARE:'||line);
-- have to de-allocate the temp lob
dbms_lob.freetemporary(gklob);
end;
Result Table Gists Example
To create a gist table, enter the following:
create table ctx_gist (query_id number,
pov varchar2(80),
gist CLOB);
The following example returns a default-sized paragraph gist for document 34:
begin
ctx_doc.gist('newsindex','34','CTX_GIST',1,'PARAGRAPH', pov =>'GENERIC');
end;
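The gist is then available as a CLOB in the result table. As a small sketch, assuming the ctx_gist table created above and query_id 1, the following query previews the stored text with DBMS_LOB.SUBSTR:

```sql
-- Show the point of view and the first 200 characters of each gist
-- stored for query_id 1.
select pov, dbms_lob.substr(gist, 200, 1) as gist_preview
from ctx_gist
where query_id = 1;
```

The 200-character preview length is arbitrary; select the gist column directly to retrieve the full text.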
The following example generates a nondefault size gist of at most 10 paragraphs:
begin
ctx_doc.gist('newsindex','34','CTX_GIST',1,'PARAGRAPH', pov =>'GENERIC', numParagraphs => 10);
end;
The following example generates a gist whose number of paragraphs is at most 10 percent of the total paragraphs in the document:
begin
ctx_doc.gist('newsindex','34','CTX_GIST',1, 'PARAGRAPH', pov =>'GENERIC', maxPercent => 10);
end;
Theme Summary Example
The following example returns a theme summary on the theme of insects for the document with textkey 34. The default gist size is returned.
begin
ctx_doc.gist('newsindex','34','CTX_GIST',1, 'PARAGRAPH', pov => 'insects');
end;
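A theme summary can presumably also be returned in memory rather than in a result table. The sketch below is written under the assumption that the in-memory form of CTX_DOC.GIST takes a CLOB variable as its third argument and no query_id; check Oracle Text Reference for the exact signature before relying on it. The variable name summary is introduced here for illustration only.

```sql
declare
  summary clob;
begin
  -- Assumption: in-memory form is (index, textkey, CLOB, glevel, pov, ...).
  ctx_doc.gist('newsindex', '34', summary, glevel => 'P', pov => 'insects');

  if summary is not null then
    -- Print the first 200 characters of the theme summary.
    dbms_output.put_line(dbms_lob.substr(summary, 200, 1));
    -- Release the temporary CLOB allocated by the call.
    dbms_lob.freetemporary(summary);
  end if;
end;
```

The null check guards against the case where no CLOB was allocated, so that the cleanup call is only made on a valid locator.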