Oracle8i Application Developer's Guide - XML Release 3 (8.1.7) Part Number A86030-01 |
|
Using interMedia Text to Search and Retrieve Data from XML Documents, 19 of 22
See the section, "Querying with the CONTAINS Operator" for information about how to use the CONTAINS operator in query statements.
You can query within attribute sections when you index with either XML_SECTION_GROUP or AUTO_SECTION_GROUP as your section group type.
Assume you have an XML document as follows:
<book title="Tale of Two Cities">It was the best of times.</book>
You can define the section title@book as the attribute section title. You can do so with the CTX_DLL.Add_Attr_Section procedure or dynamically after indexing with ALTER INDEX.
If you use the XML_SECTION_GROUP, you can name attribute sections anything with CTX_DDL.Add_Attr_Section.
To search on Tale within the attribute section title, issue the following query:
WHERE CONTAINS (...,'Tale WITHIN title', ...)
Consider an XML file that defines the BOOK tag with a TITLE attribute as follows:
<BOOK TITLE="Tale of Two Cities"> It was the best of times. </BOOK> <Author="Charles Dickens"> Born in England in the town, Stratford_Upon_Avon </Author>
Recall the CTX_DDL.Add_Attr_Section syntax is:
CTX_DDL.Add_Attr_Section ( group_name, section_name, tag );
To define the title attribute as an attribute section, create an XML_SECTION_GROUP and define the attribute section as follows:
ctx_ddl_create_section_group('myxmlgroup', 'XML_SECTION_GROUP'); ctx_ddl.add_attr_section('myxmlgroup', 'booktitle', 'book@title'); ctx_ddl.add_attr_section('myxmlgroup', 'authors', 'author'); end;
When you define the TITLE attribute section as such and index the document set, you can query the XML attribute text as follows:
... WHERE CONTAINS (...,'Cities WITHIN booktitle', ....)...
When you define the AUTHOR attribute section as such and index the document set, you can query the XML attribute text as follows:
... WHERE 'England WITHIN authors'
This example does the following:
drop table res_xml; CREATE TABLE res_xml ( pk NUMBER PRIMARY KEY , text CLOB ) ; insert into res_xml values(111, 'ENTITY chap8 "Chapter 8, <q>Keeping it Tidy: the XML Rule Book </q>"> this is the document section'); commit; --- --- script to create index on res_xml --- --- cleanup, in case we have run this before DROP INDEX res_index ; EXEC CTX_DDL.DROP_SECTION_GROUP ( 'res_sections' ) ; --- create a section group BEGIN CTX_DDL.CREATE_SECTION_GROUP ( 'res_sections', 'XML_SECTION_GROUP' ) ; CTX_DDL.ADD_FIELD_SECTION ( 'res_sections', 'chap8', '<q>') ; END ; / begin ctx_ddl.create_preference ( preference_name => 'my_basic_lexer', object_name => 'basic_lexer' ); ctx_ddl.set_attribute ( preference_name => 'my_basic_lexer', attribute_name => 'index_text', attribute_value => 'true' ); ctx_ddl.set_attribute ( preference_name => 'my_basic_lexer', attribute_name => 'index_themes', attribute_value => 'false'); end; / CREATE INDEX res_index ON res_xml(text) INDEXTYPE IS ctxsys.context PARAMETERS ( 'lexer my_basic_lexer SECTION GROUP res_sections' ) ;
Test the above index with a test query, such as:
SELECT pk FROM res_xml WHERE CONTAINS( text, 'keeping WITHIN chap8' )>0 ;
drop table explain_ex; create table explain_ex ( id number primary key, text varchar(2000) ); insert into explain_ex ( id, text ) values ( 1, 'thinks thinking thought go going goes gone went' || chr(10) || 'oracle orackle oricle dog cat bird' || chr(10) || 'President Clinton' ); insert into explain_ex ( id, text ) values ( 2, 'Last summer I went to New England' || chr(10) || 'I hiked a lot.' || chr(10) || 'I camped a bit.' ); commit;
begin Ctx_Ddl.Drop_Preference ('my_lexer'); end; / begin Ctx_Ddl.Create_Preference ( 'my_lexer', 'basic_lexer'); Ctx_Ddl.Set_Attribute ( 'my_lexer', 'index_text', 'true' ); /* Experiment with 'index_themes' = 'false' */ Ctx_Ddl.Set_Attribute ( 'my_lexer', 'index_themes', 'true' ); end; / begin Ctx_Ddl.Drop_Stoplist ( 'my_stoplist' ); end; / begin Ctx_Ddl.Create_Stoplist ( 'my_stoplist' ); Ctx_Ddl.Add_Stopword ( stoplist_name => 'my_stoplist', stopword => 'because' ); Ctx_Ddl.Add_Stopword ( 'my_stoplist', 'and' ); Ctx_Ddl.Add_Stopword ( 'my_stoplist', 'in' ); Ctx_Ddl.Add_Stopword ( 'my_stoplist', 'to' ); Ctx_Ddl.Add_Stopword ( 'my_stoplist', 'a' ); Ctx_Ddl.Add_Stopword ( 'my_stoplist', 'I' ); end; / drop index explain_ex_text; select err_text from ctx_user_index_errors; create index explain_ex_text on explain_ex ( text ) indextype is ctxsys.context parameters ( 'lexer my_lexer stoplist ctxsys.empty_stoplist' ); select err_text from ctx_user_index_errors; begin Ctx_Ddl.Drop_Preference ('my_lexer'); Ctx_Ddl.Drop_Stoplist ( 'my_stoplist' ); end; /
Set Define Off select text from explain_ex WHERE CONTAINS ( text, '( $( think & go ) , ?oracle ) & ( dog , ( cat & bird ) ) & about(mammal during Bill Clinton)' ) > 0; select text from explain_ex WHERE CONTAINS ( text, 'about ( camping and hiking in new england )' ) > 0;
|
Copyright © 1996-2000, Oracle Corporation. All Rights Reserved. |
|