Sun Java Streaming XML Parser Release Notes

Implementation Version: 1.0 EA

Sun Java Streaming XML Parser (SJSXP) is an implementation of JSR 173. JSR 173 introduces the Streaming API for XML (StAX), a Java-based API for pull-parsing XML. This SJSXP release supports all the features of JSR 173 with different configurations. This is an Early Access release; we would like you to try it and send your feedback to users@jwsdp.dev.java.net.

This project was code-named 'Zephyr', so you will find references to 'Zephyr' in places such as the factory implementation class names. 'Zephyr' started from the Xerces2 code base. The lower layers of Xerces2 (the Scanner and related classes) have been redesigned to behave in a pull fashion. There are two main reasons for this redesign. First, it is easier to build a push layer on top of a pull layer than the other way round. Second, it is more efficient. Besides the changes in the lower layers, StAX-related functionality has been added and many performance improvements have been made. SJSXP is a non-validating parser that complies with W3C XML 1.0 and Namespaces in XML 1.0.
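The point about layering push on top of pull can be illustrated with a small sketch. It uses only the standard JSR 173 cursor API; the Handler interface and the PushOverPull, drive, and trace names are hypothetical and exist only for this illustration. They are not part of SJSXP.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PushOverPull {

    /** Hypothetical callback interface for this sketch; not part of JSR 173. */
    public interface Handler {
        void startElement(String localName);
        void characters(String text);
        void endElement(String localName);
    }

    /** Pulls events from the reader one by one and pushes each to the handler. */
    public static void drive(XMLStreamReader reader, Handler handler)
            throws XMLStreamException {
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    handler.startElement(reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    handler.characters(reader.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    handler.endElement(reader.getLocalName());
                    break;
                default:
                    break; // other event types are ignored in this sketch
            }
        }
    }

    /** Parses the given XML and records the callbacks in the order they fire. */
    public static List<String> trace(String xml) {
        final List<String> log = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                drive(reader, new Handler() {
                    public void startElement(String n) { log.add("start:" + n); }
                    public void characters(String t) { log.add("text:" + t); }
                    public void endElement(String n) { log.add("end:" + n); }
                });
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace("<book><title>StAX</title></book>"));
    }
}
```

The loop in drive is the whole push layer: the application-facing callbacks are driven by a caller that owns the parse loop. Going the other way, from push to pull, would typically require buffering events or a second thread.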




JSR 173 implementation (sjsxp.jar)

Once you have installed JWSDP, you can find sjsxp.jar and jsr173_api.jar under the <JWSDP-INSTALL-DIR>/sjsxp/lib directory. jsr173_api.jar contains the StAX (JSR 173) APIs, and sjsxp.jar is the implementation of JSR 173.

Samples:

There are three samples distributed with this installation: CursorParse.java, CursorWriter.java, and StreamFP.java. You can find them under the <JWSDP-INSTALL-DIR>/sjsxp/samples directory.



Go to the <JWSDP-INSTALL-DIR>/sjsxp/samples directory.

For example, to compile the 'CursorParse' sample, type:

javac -classpath ../lib/jsr173_api.jar CursorParse.java

To run the 'CursorParse' sample, place both jsr173_api.jar and sjsxp.jar in the classpath and type:

java -cp .:../lib/jsr173_api.jar:../lib/sjsxp.jar CursorParse -x 1 ./data/BookCatalogue.xml

Each sample prints usage instructions that explain how to run it. To see them, type:

java -cp .:../lib/jsr173_api.jar:../lib/sjsxp.jar <sample-class-name>

OR

You can download the build.xml here; it has targets to compile and run all the samples. Save this build.xml in the <JWSDP-INSTALL-DIR>/sjsxp/samples directory and type

 'ant all'   

This will compile and run all the samples.

SJSXP Features and Properties:

Reporting CData event:


By default, javax.xml.stream.XMLStreamReader doesn't report CDATA events. If an application needs to receive that event, it should configure the XMLInputFactory by setting the following implementation-specific property, "report-cdata-event":

 XMLInputFactory factory = XMLInputFactory.newInstance();
 factory.setProperty("report-cdata-event", Boolean.TRUE);
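A slightly fuller sketch of the same idea follows. It assumes the property name exactly as given in these notes and guards the call with isPropertySupported, because another JSR 173 implementation found on the classpath may not recognize the name. The CDataReport class and its method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CDataReport {

    // Property name as given in these notes; treated as implementation specific.
    static final String REPORT_CDATA = "report-cdata-event";

    /** Creates a factory and turns on CDATA reporting when the property is known. */
    public static XMLInputFactory newFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        if (factory.isPropertySupported(REPORT_CDATA)) {
            factory.setProperty(REPORT_CDATA, Boolean.TRUE);
        }
        return factory;
    }

    /** Lists every text event as "CDATA:..." or "CHARACTERS:...". */
    public static List<String> textEvents(String xml) {
        List<String> events = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    newFactory().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int type = reader.next();
                    if (type == XMLStreamConstants.CDATA) {
                        events.add("CDATA:" + reader.getText());
                    } else if (type == XMLStreamConstants.CHARACTERS) {
                        events.add("CHARACTERS:" + reader.getText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Whether the section shows up as CDATA or CHARACTERS depends on the
        // implementation in use and on whether it honored the property.
        System.out.println(textEvents("<r><![CDATA[a < b]]></r>"));
    }
}
```

Handling both CDATA and CHARACTERS in the loop keeps the application correct whichever way the parser reports the section.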



If you would like to see any new feature, please let us know.

JSR 173 Factories Implementation:


Most applications need not know the factory implementation class names; simply placing sjsxp.jar in the classpath is enough. This is because sjsxp.jar supplies the factory implementation class names for the properties javax.xml.stream.XMLInputFactory, javax.xml.stream.XMLOutputFactory, and javax.xml.stream.XMLEventFactory under the META-INF/services directory, which is the third step of the lookup performed when an application asks for a factory instance. See the javadoc of XMLInputFactory.newInstance() for details on the lookup mechanism.

However, there can be scenarios in which an application needs to know the factory implementation class name and set the property explicitly; for example, when there are multiple JSR 173 implementations in the classpath and the application wants to choose one based on, say, performance, bug fixes, or compliance with the specification.

If the application sets the system property, which is the first step in the lookup mechanism, obtaining a factory instance is fast compared to the other options. The properties map to the following SJSXP classes:


 javax.xml.stream.XMLInputFactory --> com.sun.xml.stream.ZephyrParserFactory
 javax.xml.stream.XMLOutputFactory --> com.sun.xml.stream.ZephyrWriterFactory
 javax.xml.stream.XMLEventFactory --> com.sun.xml.stream.events.ZephyrEventFactory
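
As a minimal sketch of selecting the implementation explicitly, the code below sets the system property for the input factory before calling newInstance(). It assumes sjsxp.jar may or may not be on the classpath, so it falls back to the default lookup if the class cannot be loaded. The FactorySelection class and preferSjsxp method are hypothetical names.

```java
import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLInputFactory;

public class FactorySelection {

    static final String KEY = "javax.xml.stream.XMLInputFactory";
    // Class name taken from the mapping listed in these notes.
    static final String SJSXP_INPUT_FACTORY = "com.sun.xml.stream.ZephyrParserFactory";

    /**
     * Tries to obtain the SJSXP input factory through the system property.
     * If the class is not available, restores the previous property value
     * and lets the normal lookup choose an implementation.
     */
    public static XMLInputFactory preferSjsxp() {
        String previous = System.getProperty(KEY);
        System.setProperty(KEY, SJSXP_INPUT_FACTORY);
        try {
            return XMLInputFactory.newInstance();
        } catch (FactoryConfigurationError e) {
            // sjsxp.jar is probably not on the classpath.
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
            return XMLInputFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        XMLInputFactory factory = preferSjsxp();
        System.out.println("Using " + factory.getClass().getName());
    }
}
```

The same property can also be supplied on the command line with -Djavax.xml.stream.XMLInputFactory=com.sun.xml.stream.ZephyrParserFactory, which avoids changing application code.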

Issues

Filters


When an application creates a filtered reader by invoking createFilteredReader on an XMLInputFactory object, the filtered reader points to the first event accepted by the StreamFilter or EventFilter implementation that the application provides. If none of the events is accepted, getEventType returns -1. The application can use hasNext or next to traverse the XML document and read the filtered events.
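
The following sketch shows one defensive way to consume a filtered reader given the behavior described above. It assumes only the standard createFilteredReader and StreamFilter API. Because the specification leaves the initial position and the end-of-document behavior open, the code checks the event type at every step instead of trusting that each returned event passed the filter. ElementFilterDemo and elementNames are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ElementFilterDemo {

    /** Returns the local names of all elements, in document order. */
    public static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader raw = factory.createXMLStreamReader(new StringReader(xml));
            XMLStreamReader filtered = factory.createFilteredReader(raw, new StreamFilter() {
                public boolean accept(XMLStreamReader reader) {
                    return reader.isStartElement(); // keep start tags only
                }
            });
            try {
                // The reader may already sit on the first accepted event.
                if (filtered.getEventType() == XMLStreamConstants.START_ELEMENT) {
                    names.add(filtered.getLocalName());
                }
                while (filtered.hasNext()) {
                    filtered.next();
                    // Re-check the type: some implementations surface
                    // END_DOCUMENT even though the filter rejects it.
                    if (filtered.isStartElement()) {
                        names.add(filtered.getLocalName());
                    }
                }
            } finally {
                filtered.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<catalog><book/><book/></catalog>"));
    }
}
```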

Let us know your opinion on what the right behavior should be. It would be good if the correct behavior were clarified in the next version of the StAX specification.


IS_COALESCE & IS_REPLACE_ENTITY_REFERENCE:


The JSR 173 specification is not clear about what the parser behavior should be when javax.xml.stream.XMLStreamConstants.IS_COALESCE is set to 'true' and javax.xml.stream.XMLStreamConstants.IS_REPLACE_ENTITY_REFERENCE is set to 'false'. In this case, SJSXP gives 'javax.xml.stream.XMLStreamConstants.IS_COALESCE' precedence and ignores the value of javax.xml.stream.XMLStreamConstants.IS_REPLACE_ENTITY_REFERENCE.
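
To observe this combination on a given implementation, a sketch along the following lines may help. It assumes that the two properties named above correspond to the constants XMLInputFactory.IS_COALESCING and XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES in the javax.xml.stream API; that mapping is an assumption, not something these notes state. CoalesceDemo and textEvents are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CoalesceDemo {

    /**
     * Parses with coalescing on and entity replacement off, and returns the
     * text of each CHARACTERS or CDATA event. An unexpanded entity reference
     * is recorded as "&name;".
     */
    public static List<String> textEvents(String xml) {
        List<String> events = new ArrayList<>();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumed to be the properties discussed in the text above.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.CHARACTERS:
                        case XMLStreamConstants.CDATA:
                            events.add(reader.getText());
                            break;
                        case XMLStreamConstants.ENTITY_REFERENCE:
                            events.add("&" + reader.getLocalName() + ";");
                            break;
                        default:
                            break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE r [<!ENTITY e \"X\">]><r>a&e;<![CDATA[b]]></r>";
        // If coalescing takes precedence, the entity is expanded and merged
        // into the surrounding text; otherwise an "&e;" entry appears.
        System.out.println(textEvents(xml));
    }
}
```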

Let us know your opinion on what the right behavior should be. It would be good if the correct behavior were clarified in the next version of the StAX specification.

Found a Bug?


If you find a bug, please send mail to users@jwsdp.dev.java.net, if possible with a standalone test case that shows the problem.

Questions?


Any questions related to SJSXP should be sent to users@jwsdp.dev.java.net.