BDB XML provides a robust and flexible indexing mechanism that can greatly improve the performance of your BDB XML queries. Designing your indexing strategy is one of the most important aspects of designing a BDB XML-based application.
To make the most effective use of BDB XML indices, design them around your most frequently occurring XQuery queries. BDB XML indices can be updated or deleted in place, so if your application's queries change over time, you can modify your indices to meet its shifting requirements.
The time it takes to re-index a container is proportional to the container's size. Re-indexing a container can be an extremely expensive and time-consuming operation. If you have large containers in use in a production setting, you should not expect container re-indexing to be a routine operation.
You can define indices for both document content and for metadata. You can also define default indices that are used for portions of your documents for which no other index is defined.
When you declare an index, you must identify its type and its syntax. You do this by providing the API with a string that identifies the type and syntax for the index. See Syntax Types for information on specifying the index syntax.
Finally, by default BDB XML automatically indexes your containers, regardless of whether you have added indices yourself. You can turn this feature off if it gets in your way. See Automatic Indexes for more information.
The index type is defined by the following four types of information:
Uniqueness indicates whether the indexed value must be unique within the container. For example, you can index an attribute and declare that index to be unique. This means the value indexed for the attribute must be unique within the container.
By default, indexed values are not unique; you must explicitly declare uniqueness for your indexing strategy in order for it to be enforced.
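
To make the effect of declaring uniqueness concrete, here is a toy in-memory model in Python. It is not BDB XML code: the ToyIndex class and its add method are hypothetical names invented for illustration, and how the real library reports a violation is not shown here.

```python
class ToyIndex:
    """Toy stand-in for an index; NOT part of the BDB XML API (hypothetical)."""

    def __init__(self, unique=False):
        self.unique = unique   # uniqueness is off unless explicitly declared
        self.entries = {}      # indexed value -> names of documents holding it

    def add(self, value, doc_name):
        """Record that doc_name contains value, enforcing uniqueness if declared."""
        if self.unique and value in self.entries:
            raise ValueError(f"value {value!r} is already indexed in this container")
        self.entries.setdefault(value, []).append(doc_name)


# Default (non-unique) index: the same value may appear in several documents.
plain = ToyIndex()
plain.add("wholesale", "vendor1.xml")
plain.add("wholesale", "vendor2.xml")

# Unique index: a second document carrying the same value is rejected.
strict = ToyIndex(unique=True)
strict.add("wholesale", "vendor1.xml")
try:
    strict.add("wholesale", "vendor2.xml")
    rejected = False
except ValueError:
    rejected = True
```

In this model the only difference between the two indices is the unique flag, which mirrors the point above: nothing is enforced unless uniqueness is declared explicitly.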
If you think of an XML document as a tree of nodes, then there are two types of path elements in the tree. One type is just a node, such as an element or attribute within the document. The other type is any location in a path where two nodes meet. The path type, then, identifies the path element type that you want indexed. Path type node indicates that you want to index a single node in the path. Path type edge indicates that you want to index the portion of the path where two nodes meet.
Of the two, the BDB XML query processor prefers edge-type indices because they are more specific than a node-type index. This means that the query processor will use an edge-type index over a node-type index if both indices provide similar information.
Consider the following document:
    <vendor type="wholesale">
        <name>TriCounty Produce</name>
        <address>309 S. Main Street</address>
        <city>Middle Town</city>
        <state>MN</state>
        <zipcode>55432</zipcode>
        <phonenumber>763 555 5761</phonenumber>
        <salesrep>
            <name>Mort Dufresne</name>
            <phonenumber>763 555 5765</phonenumber>
        </salesrep>
    </vendor>
Suppose you want to declare an index for the name node in the preceding document. In that case:
Path Type | Description |
---|---|
node | There are two locations in the document where the name node appears. |
edge | There are two edge nodes in the document that involve the name node: one where vendor and name meet, and one where salesrep and name meet. Indices that use this path type are more specific because queries that cross these edge boundaries only have to examine one index entry for the document instead of two. |
Given this, use:
- node path types to improve queries where there can be no overlap in the node name. That is, if the query is based on an element or attribute that appears in only one context within the document, then use node path types. In the preceding sample document, you would want to use node-type indices with the address, city, state, zipcode, and salesrep elements because they appear in only one context within the document.
- edge path types to improve query performance when a node name is used in multiple contexts within the document. In the preceding document, use edge path types for the name and phonenumber elements because they appear in two contexts within the document.
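
The guidance above can be checked mechanically against the sample document. The following Python sketch uses only the standard library to record the parent elements under which each element name appears, then applies the rule of thumb from the text. The function names element_contexts and suggest_path_type are hypothetical helpers, not part of BDB XML, and the rule is a heuristic drawn from this section rather than the query processor's actual behavior.

```python
import xml.etree.ElementTree as ET

VENDOR_XML = """<vendor type="wholesale">
    <name>TriCounty Produce</name>
    <address>309 S. Main Street</address>
    <city>Middle Town</city>
    <state>MN</state>
    <zipcode>55432</zipcode>
    <phonenumber>763 555 5761</phonenumber>
    <salesrep>
        <name>Mort Dufresne</name>
        <phonenumber>763 555 5765</phonenumber>
    </salesrep>
</vendor>"""


def element_contexts(xml_text):
    """Map each child element name to the set of parent names it appears under."""
    root = ET.fromstring(xml_text)
    contexts = {}
    for parent in root.iter():      # visits every element, root included
        for child in parent:        # direct children only
            contexts.setdefault(child.tag, set()).add(parent.tag)
    return contexts


def suggest_path_type(contexts, name):
    """Heuristic from the text: several contexts -> edge, otherwise node."""
    return "edge" if len(contexts.get(name, set())) > 1 else "node"


contexts = element_contexts(VENDOR_XML)
for name in sorted(contexts):
    print(name, sorted(contexts[name]), suggest_path_type(contexts, name))
```

For the sample document, name and phonenumber each appear under both vendor and salesrep, so the sketch suggests edge for them and node for the remaining elements.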
BDB XML can index three types of nodes: element, attribute, or metadata.
The metadata node type is, of course, used for indices set on a document's metadata content.
Element and attribute nodes are only found in document content. In the following document:
<vendor type="wholesale"> <name>TriCounty Produce</name> </vendor>
vendor and name are element nodes, while type is an attribute node.
Use the element node type to improve queries that test the value of an element node. Use the attribute node type to improve any query that examines an attribute or attribute value.
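
As a quick way to see which names in a document are candidates for each node type, the following Python sketch lists element and attribute names using the standard library. The classify_nodes function is a hypothetical helper for illustration. Metadata is not covered because it is not part of the document content that the parser sees.

```python
import xml.etree.ElementTree as ET

SMALL_VENDOR_XML = '<vendor type="wholesale"> <name>TriCounty Produce</name> </vendor>'


def classify_nodes(xml_text):
    """Return (element names, attribute names) in document order."""
    root = ET.fromstring(xml_text)
    elements, attributes = [], []
    for elem in root.iter():
        elements.append(elem.tag)              # candidate for the element node type
        attributes.extend(elem.attrib.keys())  # candidates for the attribute node type
    return elements, attributes


elements, attributes = classify_nodes(SMALL_VENDOR_XML)
print("element nodes:", elements)
print("attribute nodes:", attributes)
```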
The key type identifies what sort of test the index supports. You can use one of three key types:
Key Type | Description |
---|---|
equality | Improves the performance of tests that look for nodes with a specific value. |
presence | Improves the performance of tests that look for the existence of a node, regardless of its value. |
substring | Improves the performance of tests that look for a node whose value contains a given substring. This key type is best used when your queries use the XQuery contains() function. |
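
Because an index is declared by passing a descriptive string to the API, it can help to assemble that string from the four pieces of type information described above plus a syntax. The Python sketch below does this with a hypothetical helper, index_strategy. It assumes a dash-separated layout of the form [unique]-{path type}-{node type}-{key type}-{syntax}, and it uses string and none as example syntax names. Neither the layout nor those syntax names are defined in this section, so check both against Syntax Types before relying on them.

```python
PATH_TYPES = {"node", "edge"}
NODE_TYPES = {"element", "attribute", "metadata"}
KEY_TYPES = {"equality", "presence", "substring"}


def index_strategy(path_type, node_type, key_type, syntax, unique=False):
    """Compose an index description string (hypothetical helper, assumed layout)."""
    if path_type not in PATH_TYPES:
        raise ValueError(f"unknown path type: {path_type!r}")
    if node_type not in NODE_TYPES:
        raise ValueError(f"unknown node type: {node_type!r}")
    if key_type not in KEY_TYPES:
        raise ValueError(f"unknown key type: {key_type!r}")
    parts = [path_type, node_type, key_type, syntax]
    if unique:
        parts.insert(0, "unique")   # uniqueness must be requested explicitly
    return "-".join(parts)


# Equality tests on the value of an element that appears in one context.
city_index = index_strategy("node", "element", "equality", "string")

# Presence tests on an element that appears in several contexts.
name_index = index_strategy("edge", "element", "presence", "none")

# A unique attribute value within the container.
type_index = index_strategy("node", "attribute", "equality", "string", unique=True)

print(city_index, name_index, type_index)
```

Validating the three enumerated parts up front catches misspelled type names before the string reaches the library. The syntax argument is passed through unchecked because its allowed values are described elsewhere.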