Verifying Indices using Query Plans

Query Plans
Using the dbxml Shell to Examine Query Plans

When designing your indexing strategy, you should create indices to improve the performance of your most frequently occurring queries. Without indices, BDB XML must walk every document in the container in order to satisfy the query. For containers that contain large numbers of documents, or very large documents, or both, this can be a time-consuming process.

However, when you set the appropriate index(es) for your container, the same query that otherwise takes minutes to complete can now complete in a time potentially measured in milliseconds. So setting the appropriate indices for your container is a key ingredient to improving your application's performance.

That said, how do you know that a given query is actually using a given index? More to the point, how do you find out without first loading the container with so much data that the query runs noticeably faster with the index than without it?

The answer is to examine BDB XML's query plan for the query and see whether it intends to use an index. The easiest way to examine a query plan is with the dbxml command line utility.

Query Plans

The query plan is literally BDB XML's plan for how it will satisfy a query. When you use XmlManager.prepare(), one of the things you are doing is generating the query plan ahead of time, so that BDB XML does not have to re-create it every time you run the query.

Printed out, the query plan looks like an XML document that describes the steps the query processor will take to fulfill a specific query.

For example, suppose your container holds documents that look like the following:

<a>
    <docId id="aaUivth" />
    <b>
        <c>node1</c>
        <d>node2</d>
    </b>
</a>

Also, suppose you will frequently want to retrieve a document based on the value of the id attribute on the docId element. That is, you will frequently perform queries that look like this:

collection("myContainer.dbxml")/a/docId[@id='bar']

In this case, if you print out the query plan (we describe how to do this below), you will see something like this:

<XQuery>
  <QueryPlanToAST>
    <NodePredicateFilterQP uri="" name="#tmp5">
      <StepQP axis="child" name="docId" nodeType="element">
        <StepQP axis="child" name="a" nodeType="element">
          <SequentialScanQP container="myContainer.dbxml" 
            nodeType="document"/>
        </StepQP>
      </StepQP>
      <ValueFilterQP comparison="eq" general="true">
        <StepQP axis="attribute" name="id" nodeType="attribute">
          <VariableQP name="#tmp5"/>
        </StepQP>
        <Sequence>
          <AnyAtomicTypeConstructor value="bar" 
          typeuri="http://www.w3.org/2001/XMLSchema" typename="string"/>
        </Sequence>
      </ValueFilterQP>
    </NodePredicateFilterQP>
  </QueryPlanToAST>
</XQuery> 

While a complete description of the query plan is outside the scope of this manual, notice that no element in this query plan carries an index attribute. When an index is used, this attribute can appear on different elements, depending on the nature of the query and the actual index that the query wants to use. For example, queries that use indices which examine the value of a node might specify a ValueQP element:

<ValueQP container="myContainer.dbxml" 
index="node-attribute-equality-string" operation="eq" child="id" 
value="bar"/>

Other indices that simply test for the presence of a node would specify the index on a PresenceQP element:

<PresenceQP container="parts.dbxml"
index="node-element-presence-none" operation="eq"
child="parent-part"/> 
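Checking a plan for an index attribute by eye works for short plans; for longer ones the same check can be scripted. The following is a minimal Python sketch and is not part of BDB XML. It assumes the plan text has been copied into a string, and the helper name find_index_usage is hypothetical. It walks every element in the plan and collects any index attribute it finds.

```python
import xml.etree.ElementTree as ET


def find_index_usage(plan_xml):
    """Return (element name, index value) pairs for every plan element
    that carries an index attribute."""
    root = ET.fromstring(plan_xml)
    # iter() visits the root and all of its descendants in document order.
    return [(elem.tag, elem.get("index"))
            for elem in root.iter()
            if elem.get("index") is not None]


# Plan fragment taken from the ValueQP example above.
indexed_plan = """
<ValueQP container="myContainer.dbxml"
    index="node-attribute-equality-string" operation="eq" child="id"
    value="bar"/>
"""

# Plan fragment with no index attribute anywhere.
scan_plan = """
<StepQP axis="child" name="a" nodeType="element">
  <SequentialScanQP container="myContainer.dbxml" nodeType="document"/>
</StepQP>
"""

print(find_index_usage(indexed_plan))
print(find_index_usage(scan_plan))
```

An empty result means that no element in the plan names an index, which is the situation described for the first query plan in this section.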

Using the dbxml Shell to Examine Query Plans

dbxml is a command line utility that allows you to interact with your BDB XML containers. You can perform a great many operations on your containers and documents using this utility, but of interest to the current discussion is its ability to add indices to and delete indices from your containers, to query for documents, and to examine query plans.

The dbxml shell is described in the Introduction to Berkeley DB XML guide.

Note that while you can create containers and load XML documents into those containers using dbxml, we assume here that you have already performed these activities using some other mechanism.

To examine query plans using dbxml, first start the shell and open your container (this assumes the container already exists and contains documents):

> dbxml
dbxml> openContainer myContainer.dbxml 

Next, examine your query plan using the qPlan command. Note that we assume your container only has the standard, default index that all containers have when they are first created.

dbxml> qPlan 'collection("myContainer.dbxml")/a/docId[@id="aaUivth"]'
<XQuery>
  <QueryPlanToAST>
    <NodePredicateFilterQP uri="" name="#tmp5">
      <StepQP axis="child" name="docId" nodeType="element">
        <StepQP axis="child" name="a" nodeType="element">
            <SequentialScanQP container="myContainer.dbxml" 
                nodeType="document"/>
        </StepQP>
      </StepQP>
      <ValueFilterQP comparison="eq" general="true">
        <StepQP axis="attribute" name="id" nodeType="attribute">
          <VariableQP name="#tmp5"/>
        </StepQP>
        <Sequence>
          <AnyAtomicTypeConstructor value="aaUivth" 
             typeuri="http://www.w3.org/2001/XMLSchema" 
             typename="string"/>
        </Sequence>
      </ValueFilterQP>
    </NodePredicateFilterQP>
  </QueryPlanToAST>
</XQuery> 

Notice that this query plan does not make use of an index. No index is identified anywhere in the query plan, and it calls for only a sequential scan. Now add the index that you want to test.

dbxml> addindex "" id "node-attribute-equality-string"
Adding index type: node-attribute-equality-string to node: {}:id

Now try the query plan again. Notice that there is now a ValueQP element that specifies our newly added index using an index attribute.

dbxml> qplan 'collection("myContainer.dbxml")/a/docId[@id="aaUivth"]'
<XQuery>
  <QueryPlanToAST>
    <ParentOfAttributeJoinQP>
      <ValueQP container="myContainer.dbxml" 
          index="node-attribute-equality-string" operation="eq" 
           child="id" value="aaUivth"/>
      <StepQP axis="child" name="docId" nodeType="element">
        <StepQP axis="child" name="a" nodeType="element">
            <SequentialScanQP container="myContainer.dbxml" 
                nodeType="document"/>
        </StepQP>
      </StepQP>
    </ParentOfAttributeJoinQP>
  </QueryPlanToAST>
</XQuery> 
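The before-and-after comparison can also be scripted. As a rough sketch in Python, the code below reports which index names appear in the second plan but not in the first. It assumes that both plans were copied from the shell output into strings (abbreviated here). The function names are hypothetical and are not part of dbxml.

```python
import xml.etree.ElementTree as ET


def index_names(plan_xml):
    """Collect the value of every index attribute found in a query plan."""
    root = ET.fromstring(plan_xml)
    return {elem.get("index") for elem in root.iter() if "index" in elem.attrib}


def newly_used_indexes(before_xml, after_xml):
    """Return index names present in the second plan but not in the first."""
    return sorted(index_names(after_xml) - index_names(before_xml))


# Abbreviated copy of the plan printed before the index was added.
BEFORE = """<XQuery>
  <QueryPlanToAST>
    <NodePredicateFilterQP uri="" name="#tmp5">
      <StepQP axis="child" name="docId" nodeType="element">
        <StepQP axis="child" name="a" nodeType="element">
          <SequentialScanQP container="myContainer.dbxml" nodeType="document"/>
        </StepQP>
      </StepQP>
    </NodePredicateFilterQP>
  </QueryPlanToAST>
</XQuery>"""

# Copy of the plan printed after the index was added.
AFTER = """<XQuery>
  <QueryPlanToAST>
    <ParentOfAttributeJoinQP>
      <ValueQP container="myContainer.dbxml"
          index="node-attribute-equality-string" operation="eq"
          child="id" value="aaUivth"/>
      <StepQP axis="child" name="docId" nodeType="element">
        <StepQP axis="child" name="a" nodeType="element">
          <SequentialScanQP container="myContainer.dbxml" nodeType="document"/>
        </StepQP>
      </StepQP>
    </ParentOfAttributeJoinQP>
  </QueryPlanToAST>
</XQuery>"""

print(newly_used_indexes(BEFORE, AFTER))
```

If the list is empty after you add an index, the query plan did not change to use it, and the index specification is worth rechecking against the query.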

You are done testing your index. To exit dbxml, use the quit command:

dbxml> quit