The Java EE 5 Tutorial

Reading XML Streams

As described earlier in this chapter, the way you read XML streams with a StAX processor, and what you get back, vary significantly depending on whether you are using the StAX cursor API or the event iterator API. The following two sections describe how to read XML streams with each of these APIs.

Using XMLStreamReader

The XMLStreamReader interface in the StAX cursor API lets you read XML streams or documents in a forward direction only, one item in the infoset at a time. The following methods are available for pulling data from the stream or skipping unwanted events:

Instances of XMLStreamReader have at any one time a single current event on which its methods operate. When you create an instance of XMLStreamReader on a stream, the initial current event is the START_DOCUMENT state. The XMLStreamReader.next method can then be used to step to the next event in the stream.

Reading Properties, Attributes, and Namespaces

The XMLStreamReader.next method loads the properties of the next event in the stream. You can then access those properties by calling the XMLStreamReader.getLocalName and XMLStreamReader.getText methods.

When the XMLStreamReader cursor is over a StartElement event, it reads the name and any attributes for the event, including the namespace. All attributes for an event can be accessed using an index value, and can also be looked up by namespace URI and local name. Note, however, that only the namespaces declared on the current StartEvent are available; previously declared namespaces are not maintained, and redeclared namespaces are not removed.

XMLStreamReader Methods

XMLStreamReader provides the following methods for retrieving information about namespaces and attributes:

int getAttributeCount();
String getAttributeNamespace(int index);
String getAttributeLocalName(int index);
String getAttributePrefix(int index);
String getAttributeType(int index);
String getAttributeValue(int index);
String getAttributeValue(String namespaceUri, String localName);
boolean isAttributeSpecified(int index);

Namespaces can also be accessed using three additional methods:

int getNamespaceCount();
String getNamespacePrefix(int index);
String getNamespaceURI(int index);

Instantiating an XMLStreamReader

This example, taken from the StAX specification, shows how to instantiate an input factory, create a reader, and iterate over the elements of an XML stream:

XMLInputFactory f = XMLInputFactory.newInstance();
XMLStreamReader r = f.createXMLStreamReader( ... );
while(r.hasNext()) {
    r.next();
}

Using XMLEventReader

The XMLEventReader API in the StAX event iterator API provides the means to map events in an XML stream to allocated event objects that can be freely reused, and the API itself can be extended to handle custom events.

XMLEventReader provides four methods for iteratively parsing XML streams:

For example, the following code snippet illustrates the XMLEventReader method declarations:

package javax.xml.stream;
import java.util.Iterator;
public interface XMLEventReader extends Iterator {
    public Object next();
    public XMLEvent nextEvent() throws XMLStreamException;
    public boolean hasNext();
    public XMLEvent peek() throws XMLStreamException;
    ...
}

To read all events on a stream and then print them, you could use the following:

while(stream.hasNext()) {
    XMLEvent event = stream.nextEvent();
    System.out.print(event);
}

Reading Attributes

You can access attributes from their associated javax.xml.stream.StartElement, as follows:

public interface StartElement extends XMLEvent {
    public Attribute getAttributeByName(QName name);
    public Iterator getAttributes();
}

You can use the getAttributes method on the StartElement interface to use an Iterator over all the attributes declared on that StartElement.

Reading Namespaces

Similar to reading attributes, namespaces are read using an Iterator created by calling the getNamespaces method on the StartElement interface. Only the namespace for the current StartElement is returned, and an application can get the current namespace context by using StartElement.getNamespaceContext.