Oracle Text Application Developer's Guide
Release 9.0.1

Part Number A90122-01
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback

Go to previous page Go to beginning of chapter Go to next page

Document Section Searching, 4 of 4


XML Section Searching

Like HTML documents, XML documents have tagged text which you can use to define blocks of text for section searching. The contents of a section can be searched on with the WITHIN or INPATH operators.

For XML searching, you can do the following:

Automatic Sectioning

You can set up your indexing operation to automatically create sections from XML documents using the section group AUTO_SECTION_GROUP. The system creates zone sections for XML tags. Attribute sections are created for the tags that have attributes and these sections named in the form tag@attribute.

For example, the following command creates the index myindex on a column containing the XML files using the AUTO_SECTION_GROUP:

CREATE INDEX myindex ON xmldocs(xmlfile) INDEXTYPE IS ctxsys.context PARAMETERS 
('datastore ctxsys.default_datastore filter ctxsys.null_filter section group 
ctxsys.auto_section_group');

Attribute Searching

You can search XML attribute text in one of two ways:

Creating Attribute Sections

Consider an XML file that defines the BOOK tag with a TITLE attribute as follows:

<BOOK TITLE="Tale of Two Cities"> 
  It was the best of times. 
</BOOK> 

To define the title attribute as an attribute section, create an XML_SECTION_GROUP and define the attribute section as follows:

begin
ctx_ddl.create_section_group('myxmlgroup', 'XML_SECTION_GROUP');
ctx_ddl.add_attr_section('myxmlgroup', 'booktitle', 'book@title');
end;

To index:

CREATE INDEX myindex ON xmldocs(xmlfile) INDEXTYPE IS ctxsys.context PARAMETERS 
('datastore ctxsys.default_datastore filter ctxsys.null_filter section group 
myxmlgroup');

You can query the XML attribute section booktitle as follows:

'Cities within booktitle'

Searching Attributes with the INPATH Operator

You can search attribute text with the INPATH operator. To do so, you must index your XML document set with the PATH_SECTION_GROUP.

See Also:

"Path Section Searching" in this chapter. 

Creating Document Type Sensitive Sections

You have an XML document set that contains the <book> tag declared for different document types. You want to create a distinct book section for each document type.

Assume that mydocname1 is declared as an XML document type (root element) as follows:

<!DOCTYPE mydocname1 ... [...

Within mydocname1, the element <book> is declared. For this tag, you can create a section named mybooksec1 that is sensitive to the tag's document type as follows:

begin

ctx_ddl.create_section_group('myxmlgroup', 'XML_SECTION_GROUP');
ctx_ddl.add_zone_section('myxmlgroup', 'mybooksec1', 'mydocname1(book)');
end;

Assume that mydocname2 is declared as another XML document type (root element) as follows:

<!DOCTYPE mydocname2 ... [...

Within mydocname2, the element <book> is declared. For this tag, you can create a section named mybooksec2 that is sensitive to the tag's document type as follows:

begin

ctx_ddl.create_section_group('myxmlgroup', 'XML_SECTION_GROUP');
ctx_ddl.add_zone_section('myxmlgroup', 'mybooksec2', 'mydocname2(book)');
end;

To query within the section mybooksec1, use WITHIN as follows:

'oracle within mybooksec1'

Path Section Searching

XML documents can have parent-child tag structures such as the following:

<A> <B> <C> dog </C> </B </A>

In this example, tag C is a child of tag B which is a child of tag A.

With Oracle Text, you can do path searching with PATH_SECTION_GROUP. This section group allows you to specify direct parentage in queries, such as to find all documents that contain the term dog in element C which is a child of element B and so on.

With PATH_SECTION_GROUP, you can also perform attribute value searching and attribute equality testing.

The new operators associated with this feature are

Creating Index with PATH_SECTION_GROUP

To enable path section searching, index your XML document set with PATH_SECTION_GROUP.

Create the preference:

begin
ctx_ddl.create_section_group('xmlpathgroup', 'PATH_SECTION_GROUP');
end;

Create the index:

CREATE INDEX myindex ON xmldocs(xmlfile) INDEXTYPE IS ctxsys.context PARAMETERS 
('datastore ctxsys.default_datastore filter ctxsys.null_filter section group 
xmlpathgroup');

When you create the index, you can use the INPATH and HASPATH operators.

Top-Level Tag Searching

To find all documents that contain the term dog in the top-level tag <A>:

dog INPATH (/A)

or

dog INPATH(A)

Any-Level Tag Searching

To find all documents that contain the term dog in the <A> tag at any level:

dog INPATH(//A)

This query finds the following documents:

<A>dog</A>

and

<A><B><C>dog</C></B></A>

Direct Parentage Searching

To find all documents that contain the term dog in a B element that is a direct child of a top-level A element:

dog INPATH(A/B)

This query finds the following XML document:

<A><B>My dog is friendly.</B><A>

but does not find:

<C><B>My dog is friendly.</B></C>

Tag Value Testing

You can test the value of tags. For example, the query:

dog INPATH(A[B="dog"])

Finds the following document:

<A><B>dog</B></A>

But does not find:

<A><B>My dog is friendly.</B></A>

Attribute Searching

You can search the content of attributes. For example, the query:

dog INPATH(//A/@B)

Finds the document

<C><A  B="snoop dog"> </A> </C>

Attribute Value Testing

You can test the value of attributes. For example, the query

California INPATH (//A[@B = "home address"])

Finds the document:

<A B="home address">San Francisco, California, USA</A>

But does not find:

<A B="work address">San Francisco, California, USA</A>

Path Testing

You can test if a path exists with the HASPATH operator. For example, the query:

HASPATH(A/B/C)

finds and returns a score of 100 for the document

<A><B><C>dog</C></B></A>

without the query having to reference dog at all.

Section Equality Testing with HASPATH

You can use the HASPATH operator to do section quality tests. For example, consider the following query:

dog INPATH A

finds

<A>dog</A>

but it also finds

<A>dog park</A>

To limit the query to the term dog and nothing else, you can use a section equality test with the HASPATH operator. For example,

HASPATH(A="dog")

finds and returns a score of 100 only for the first document, and not the second.

See Also:

Oracle Text Reference to learn more about using the INPATH and HASPATH operators. 



Go to previous page Go to beginning of chapter Go to next page
Oracle
Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback