XPath is an industry standard developed by the World Wide Web Consortium (W3C).
It is the method used to navigate through an XML document. XPath is a set of syntax rules for addressing the individual pieces of an XML document. You might not know it, but you have already used XPath; RTF templates use XPath to navigate through the XML data at runtime.
This section contains a brief introduction to XPath principles. For more information, see the W3C Web site: http://www.w3.org/TR/xpath
XPath follows the Document Object Model (DOM), which interprets an XML document as a tree of nodes. A node can be one of seven types:
root
element
attribute
text
namespace
processing instruction
comment
Many of these node types are shown in the following sample XML, which contains a catalog of CDs:
<?xml version="1.0" encoding="UTF-8"?>
<!-- My CD Listing -->
<CATALOG>
  <CD cattype="Folk">
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD cattype="Rock">
    <TITLE>Hide Your Heart</TITLE>
    <ARTIST>Bonnie Tylor</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
</CATALOG>
The root element in this example is CATALOG. CD is an element, and it has an attribute, cattype. The sample contains the comment My CD Listing. Text is contained within the XML document elements; for example, the first TITLE element contains the text Empire Burlesque.
Locate information in an XML document using location-path expressions.
A node is the most common search element that you encounter. Nodes in the example CATALOG XML include CD, TITLE, and ARTIST. Use a path expression to locate nodes within an XML document. For example, the following path returns all CD elements:
//CATALOG/CD
where
the double slash (//) indicates that all elements in the XML document that match the search criteria are to be returned, regardless of the level within the document.
the slash (/) separates the child nodes. All elements matching the pattern are returned.
To retrieve the individual TITLE elements, use the following expression:
/CATALOG/CD/TITLE
This expression selects the following TITLE elements:
<TITLE>Empire Burlesque</TITLE>
<TITLE>Hide Your Heart</TITLE>
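The two path expressions above can be tried outside a template. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module, which implements only a subset of XPath and evaluates paths relative to the element it is called on (here the CATALOG element) rather than from the document root. The SAMPLE string is a trimmed copy of the catalog and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST><YEAR>1985</YEAR></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><ARTIST>Bonnie Tylor</ARTIST><YEAR>1988</YEAR></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)  # the CATALOG element

# Counterpart of //CATALOG/CD, written relative to the CATALOG element.
cds = catalog.findall("CD")

# Counterpart of /CATALOG/CD/TITLE: one TITLE element per CD.
titles = [title.text for title in catalog.findall("CD/TITLE")]

print(len(cds), titles)
```

The result is a list of the matching nodes themselves; the enclosing CATALOG and CD elements are not part of it.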
Further limit the search by using square brackets. The brackets locate elements with certain child nodes or specified values. For example, the following expression locates all CDs recorded by Bob Dylan:
/CATALOG/CD[ARTIST="Bob Dylan"]
Or, if not every CD element had a PRICE element, you could use the following expression to return only those CD elements that include a PRICE element:
/CATALOG/CD[PRICE]
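As a rough illustration of both bracket forms, the sketch below again assumes Python's standard-library xml.etree.ElementTree, with paths written relative to the CATALOG element. The SAMPLE string is a modified copy of the catalog in which the second CD has no PRICE element; that change is an assumption made only so that the [PRICE] test has something to filter out.

```python
import xml.etree.ElementTree as ET

# Modified sample (assumption): the second CD deliberately lacks PRICE.
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST><PRICE>10.90</PRICE></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><ARTIST>Bonnie Tylor</ARTIST></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# Counterpart of /CATALOG/CD[ARTIST="Bob Dylan"]: test a child element's text.
by_dylan = [cd.findtext("TITLE") for cd in catalog.findall("CD[ARTIST='Bob Dylan']")]

# Counterpart of /CATALOG/CD[PRICE]: keep only CDs that have a PRICE child.
priced = [cd.findtext("TITLE") for cd in catalog.findall("CD[PRICE]")]

print(by_dylan, priced)
```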
Use the bracket notation to leverage the attribute value in the search. Use the @ symbol to indicate an attribute. For example, the following expression locates all Rock CDs (all CDs with the cattype attribute value Rock):
//CD[@cattype="Rock"]
This returns the following data from the sample XML document:
<CD cattype="Rock">
  <TITLE>Hide Your Heart</TITLE>
  <ARTIST>Bonnie Tylor</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <PRICE>9.90</PRICE>
  <YEAR>1988</YEAR>
</CD>
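The attribute test can be sketched the same way. This example assumes Python's standard-library xml.etree.ElementTree, where .//CD plays the role of //CD because paths are evaluated relative to the element they are called on; SAMPLE is a trimmed copy of the catalog.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><ARTIST>Bonnie Tylor</ARTIST></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# Counterpart of //CD[@cattype="Rock"]: match on an attribute value.
rock_cds = catalog.findall(".//CD[@cattype='Rock']")

# Read the attribute and a child element from each match.
rock_summary = [(cd.get("cattype"), cd.findtext("TITLE")) for cd in rock_cds]
print(rock_summary)
```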
You can also use brackets to specify the item number to retrieve. For example, the first CD element is read from the XML document using the following XPath expression:
/CATALOG/CD[1]
The sample returns the first CD element:
<CD cattype="Folk">
  <TITLE>Empire Burlesque</TITLE>
  <ARTIST>Bob Dylan</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <PRICE>10.90</PRICE>
  <YEAR>1985</YEAR>
</CD>
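Note that XPath positions start at 1, not 0. A minimal sketch of the position test, assuming Python's standard-library xml.etree.ElementTree and a trimmed copy of the catalog:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# Counterpart of /CATALOG/CD[1]: the first CD child (positions are 1-based).
first_cd = catalog.find("CD[1]")
# CD[2] selects the second CD in document order.
second_cd = catalog.find("CD[2]")

print(first_cd.findtext("TITLE"), second_cd.findtext("TITLE"))
```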
XPath also supports wildcards to retrieve every element contained within the specified node. For example, to retrieve all the CDs from the sample XML, use the following expression:
/CATALOG/*
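The wildcard matches child elements whatever their names are. The sketch below assumes Python's standard-library xml.etree.ElementTree, with paths relative to the CATALOG element, and a trimmed copy of the catalog.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><ARTIST>Bonnie Tylor</ARTIST></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# Counterpart of /CATALOG/*: every child element of CATALOG.
child_tags = [child.tag for child in catalog.findall("*")]

# The wildcard works at any step, for example every child of every CD.
grandchild_tags = [node.tag for node in catalog.findall("CD/*")]

print(child_tags, grandchild_tags)
```

In this sample every child of CATALOG is a CD, so the wildcard and the explicit CD step select the same elements.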
You can combine expressions for more complex searches. The following expression retrieves all Folk and Rock CDs, and thus all the CD elements from the sample:
//CD[@cattype="Folk"]|//CD[@cattype="Rock"]
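The pipe (|) is the union operator: it combines the node sets selected by two expressions, which has the effect of a logical OR between the two searches. Within brackets, XPath also recognizes the logical operators or and and, as well as the comparison operators <=, <, >, >=, =, and !=. Note that equality is tested with a single equal sign (=). For example, you can find all CDs released in 1985 or later using the following expression: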
/CATALOG/CD[YEAR >=1985]
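Not every XPath implementation supports every operator. Python's standard-library xml.etree.ElementTree, for instance, handles neither the pipe nor numeric comparisons inside brackets, so the sketch below emulates both in plain Python. It is an approximation of what the two expressions above select, not an XPath evaluation; SAMPLE is a trimmed copy of the catalog.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><YEAR>1985</YEAR></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><YEAR>1988</YEAR></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# Emulates //CD[@cattype="Folk"]|//CD[@cattype="Rock"]: run both searches,
# then keep each matching CD once, in document order.
folk = catalog.findall(".//CD[@cattype='Folk']")
rock = catalog.findall(".//CD[@cattype='Rock']")
union_titles = [cd.findtext("TITLE") for cd in catalog.iter("CD")
                if cd in folk or cd in rock]

# Emulates /CATALOG/CD[YEAR >=1985]: compare YEAR as a number.
since_1985 = [cd.findtext("TITLE") for cd in catalog.findall("CD")
              if int(cd.findtext("YEAR")) >= 1985]

# A stricter comparison (YEAR > 1985) leaves only the later CD.
after_1985 = [cd.findtext("TITLE") for cd in catalog.findall("CD")
              if int(cd.findtext("YEAR")) > 1985]

print(union_titles, since_1985, after_1985)
```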
The first character in an XPath expression determines the point at which it should start in the XML tree.
Statements beginning with a forward slash (/) are considered absolute. No slash indicates a relative reference. An example of a relative reference is:
CD/*
This expression begins the search at the current reference point (the context node). If the expression occurs within a group of statements, the search therefore starts from the reference point left by the previous statement.
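The effect of the reference point can be seen by evaluating the same kind of relative path from two different starting nodes. This sketch assumes Python's standard-library xml.etree.ElementTree, in which the element a path is called on serves as the current reference point; SAMPLE is a trimmed copy of the catalog.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><ARTIST>Bonnie Tylor</ARTIST></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# Relative path CD/* with CATALOG as the reference point:
# the children of every CD in the catalog.
from_catalog = [node.tag for node in catalog.findall("CD/*")]

# Relative path * with the second CD as the reference point:
# only that CD's own children.
second_cd = catalog.findall("CD")[1]
from_second_cd = [node.text for node in second_cd.findall("*")]

print(from_catalog, from_second_cd)
```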
As noted earlier, double forward slashes (//) retrieve every matching element regardless of its location in the document. Because this requires searching the entire document, use double forward slashes only when necessary; preferring explicit paths improves performance.
To select current and parent elements, XPath recognizes the dot notation commonly used to navigate directories.
Use a single period (.) to select the current node and use double periods (..) to return the parent of the current node. For example, to retrieve all child nodes of the parent of the current node, use:
../*
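The parent step can also end a path. For example, the following expression selects the parent of the CD elements, that is, the CATALOG element that contains all the CDs from the sample XML: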
/CATALOG/CD/..
You could also access all the CD titles released in 1988 using the following:
/CATALOG/CD/TITLE[../YEAR=1988]
The two periods (..) are used to navigate up the tree of elements to find the YEAR element at the same level as the TITLE, where it is then tested for a match against "1988". You could also use // in this case, but if the element YEAR is used elsewhere in the XML document, then you might get erroneous results.
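The parent step can be sketched as follows, assuming Python's standard-library xml.etree.ElementTree. That module accepts .. as a path step but not inside brackets, so the 1988 search is rewritten here as CD[YEAR='1988']/TITLE, which tests YEAR on the CD and then steps down to TITLE; this is a substitute form that selects the same titles in this sample, not the expression shown above. SAMPLE is a trimmed copy of the catalog.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample catalog (assumption: only the fields used here).
SAMPLE = """<CATALOG>
  <CD cattype="Folk"><TITLE>Empire Burlesque</TITLE><YEAR>1985</YEAR></CD>
  <CD cattype="Rock"><TITLE>Hide Your Heart</TITLE><YEAR>1988</YEAR></CD>
</CATALOG>"""

catalog = ET.fromstring(SAMPLE)

# CD/TITLE/.. steps down to each TITLE and back up to its parent CD.
parents_of_titles = [node.tag for node in catalog.findall("CD/TITLE/..")]

# CD/.. steps back up from the CD elements to their shared parent, CATALOG.
parents_of_cds = [node.tag for node in catalog.findall("CD/..")]

# Substitute for /CATALOG/CD/TITLE[../YEAR=1988] (see the note above).
titles_1988 = [title.text for title in catalog.findall("CD[YEAR='1988']/TITLE")]

print(parents_of_titles, parents_of_cds, titles_1988)
```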
XPath is an extremely powerful standard. Combined with RTF templates, it enables you to use conditional formatting and filtering in the template.