Implementation Version: 1.0 EA
Sun Java Streaming XML Parser (SJSXP) is an implementation
of JSR 173. JSR 173
introduces the Streaming API for XML (StAX), a Java-based API
for pull-parsing XML. This SJSXP release supports all the features of JSR
173 with different configurations. This is an Early Access release. We
would like you to try it and send your feedback to users@jwsdp.dev.java.net.
This project was code-named 'Zephyr', so you will find references
to 'Zephyr', for example in the factory implementation class names. 'Zephyr'
started from the Xerces2 code base. The lower layers of Xerces2 (the Scanner and related
classes) have been redesigned to behave in a pull fashion. There are
two main reasons for this redesign. First, it is easier to build a push layer
on top of a pull layer than the other way round. Second, it is more efficient.
Besides the changes in the lower layers, StAX-related functionality has been
added and many performance improvements have been made. SJSXP is a
non-validating parser that is compliant with W3C XML 1.0 and Namespaces in XML 1.0.
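To illustrate the first reason, here is a minimal sketch of a push-style layer built on top of the pull API. The PushHandler interface and the pump and trace methods are hypothetical names invented for this example; they are not part of SJSXP or JSR 173, and only the javax.xml.stream calls are real. Coalescing is turned on so that adjacent character data arrives in a single callback.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PushOnPullSketch {

    /** Hypothetical callback interface, invented for this sketch. */
    interface PushHandler {
        void startElement(String localName);
        void characters(String text);
        void endElement(String localName);
    }

    /** Pulls events from the reader and pushes them to the handler. */
    static void pump(XMLStreamReader reader, PushHandler handler)
            throws XMLStreamException {
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    handler.startElement(reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                    handler.characters(reader.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    handler.endElement(reader.getLocalName());
                    break;
                default:
                    break; // other event types are ignored in this sketch
            }
        }
    }

    /** Runs the pump over an XML string and records every callback. */
    static List<String> trace(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(xml));
        final List<String> calls = new ArrayList<String>();
        try {
            pump(reader, new PushHandler() {
                public void startElement(String n) { calls.add("start:" + n); }
                public void characters(String t) { calls.add("text:" + t); }
                public void endElement(String n) { calls.add("end:" + n); }
            });
        } finally {
            reader.close();
        }
        return calls;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(trace("<a><b>hi</b></a>"));
    }
}
```

The application keeps control of the loop in pump, which is what makes the push layer a thin wrapper; going the other way (pull on top of push) would typically need buffering or a second thread.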
Once you have installed JWSDP, you can find sjsxp.jar and jsr173_api.jar under the <JWSDP-INSTALL-DIR>/sjsxp/lib directory. jsr173_api.jar contains the StAX (JSR 173) APIs, and sjsxp.jar is the implementation of JSR 173.
Three samples are distributed with this installation: CursorParse.java, CursorWriter.java, and StreamFP.java. You can find them under the <JWSDP-INSTALL-DIR>/sjsxp/samples directory.
CursorParse.java
shows how to instantiate an XMLInputFactory
and use an XMLStreamReader to parse an XML file.
CursorWriter.java
shows how to use the StAX writing APIs
to write an XML file programmatically.
StreamFP.java
shows how to use the StAX stream filter
APIs. The filter accepts only StartElement and EndElement events and filters
out all other events.
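For readers who want a feel for the cursor APIs before opening the samples, the following is a minimal sketch that writes a small document with XMLStreamWriter and then reads it back with XMLStreamReader. It is not the code of CursorWriter.java or CursorParse.java; the class name, method names, and the catalogue document are assumptions made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

class CursorRoundTripSketch {

    /** Writes a small, made-up catalogue document with the cursor writer. */
    static String writeCatalogue() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("catalogue");
        writer.writeStartElement("book");
        writer.writeAttribute("id", "b1");
        writer.writeCharacters("Example Title");
        writer.writeEndElement(); // closes book
        writer.writeEndElement(); // closes catalogue
        writer.writeEndDocument();
        writer.flush();
        writer.close();
        return out.toString();
    }

    /** Parses the XML with the cursor reader and lists start-element names. */
    static List<String> startElements(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<String>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = writeCatalogue();
        System.out.println(xml);
        System.out.println(startElements(xml));
    }
}
```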
Go to the <JWSDP-INSTALL-DIR>/sjsxp/samples
directory.
For example, to compile the 'CursorParse' sample, type:
javac -classpath ../lib/jsr173_api.jar CursorParse.java
To run the 'CursorParse' sample, place both jsr173_api.jar and sjsxp.jar on the classpath (on Windows, use ';' instead of ':' as the classpath separator) and type:
java -cp .:../lib/jsr173_api.jar:../lib/sjsxp.jar CursorParse -x 1 ./data/BookCatalogue.xml
Each sample prints usage instructions that explain how to run it. To see them, type:
java -cp .:../lib/jsr173_api.jar:../lib/sjsxp.jar <sample-class-name>
OR
You can download the build.xml here, which has targets to compile and run all the samples. Save this build.xml in the <JWSDP-INSTALL-DIR>/sjsxp/samples directory and type:
ant all
This compiles and runs all the samples.
By default, javax.xml.stream.XMLStreamReader does not report CDATA events. If an application
needs to receive these events, it should set the following implementation-specific
property, "report-cdata-event", on the XMLInputFactory:
XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty("report-cdata-event", Boolean.TRUE);
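Because this property is implementation-specific, another StAX implementation may reject it with an IllegalArgumentException. A defensive sketch, assuming you want the same code to run against any JSR 173 implementation, is to ask the factory first. The class and method names here are hypothetical; the property name is the one given above.

```java
import javax.xml.stream.XMLInputFactory;

class CdataPropertySketch {

    /** Property name as given in the text above. */
    static final String REPORT_CDATA = "report-cdata-event";

    /**
     * Turns on CDATA event reporting if the factory recognizes the property.
     * Returns true if the property was set, false if it is not supported.
     */
    static boolean enableCdataReporting(XMLInputFactory factory) {
        if (!factory.isPropertySupported(REPORT_CDATA)) {
            return false; // not SJSXP, or a version that uses another name
        }
        factory.setProperty(REPORT_CDATA, Boolean.TRUE);
        return true;
    }

    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        System.out.println("CDATA reporting enabled: "
                + enableCdataReporting(factory));
    }
}
```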
If there is a new feature you would like to see, please let us know.
Most applications do not need to know the factory implementation
class names. Adding sjsxp.jar to the classpath is enough, because sjsxp.jar
supplies the factory implementation class names for the properties javax.xml.stream.XMLInputFactory,
javax.xml.stream.XMLOutputFactory, and javax.xml.stream.XMLEventFactory under
the META-INF/services directory, which is the third step of the lookup performed when an application
asks for a factory instance. See the javadoc of XMLInputFactory.newInstance()
for details on the lookup mechanism.
However, there may be scenarios in which an application needs to know the
factory implementation class name and set the property explicitly, for
example when multiple JSR 173 implementations are on the classpath and the application
wants to choose one based on, say, performance, bug fixes, or compliance with the spec.
Because the system property is the first step in the lookup mechanism, setting it
also makes obtaining a factory instance faster than the other options.
The SJSXP class name for each property is:
javax.xml.stream.XMLInputFactory --> com.sun.xml.stream.ZephyrParserFactory
javax.xml.stream.XMLOutputFactory --> com.sun.xml.stream.ZephyrWriterFactory
javax.xml.stream.XMLEventFactory --> com.sun.xml.stream.events.ZephyrEventFactory
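As a sketch of how an application might select SJSXP explicitly, the example below sets the system property for XMLInputFactory before asking for a factory. The class and helper names are hypothetical. The check with Class.forName is an assumption added so that the example still runs when sjsxp.jar is not on the classpath; in that case the property is left alone and the default lookup applies.

```java
import javax.xml.stream.XMLInputFactory;

class FactorySelectionSketch {

    /** Lookup property and SJSXP class name, as listed above. */
    static final String PROPERTY = "javax.xml.stream.XMLInputFactory";
    static final String ZEPHYR_FACTORY = "com.sun.xml.stream.ZephyrParserFactory";

    /** Returns true if the named class can be loaded without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false,
                    FactorySelectionSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }

    /** Prefers SJSXP when it is available, otherwise uses the default lookup. */
    static XMLInputFactory selectFactory() {
        if (isOnClasspath(ZEPHYR_FACTORY)) {
            // The system property is consulted first by newInstance().
            System.setProperty(PROPERTY, ZEPHYR_FACTORY);
        }
        return XMLInputFactory.newInstance();
    }

    public static void main(String[] args) {
        XMLInputFactory factory = selectFactory();
        System.out.println("Using factory: " + factory.getClass().getName());
    }
}
```

The same property can also be passed on the command line with -Djavax.xml.stream.XMLInputFactory=com.sun.xml.stream.ZephyrParserFactory, which avoids changing application code.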
When an application creates a filtered reader by invoking createFilteredReader
on an XMLInputFactory object, the filtered reader points to the first event
that is accepted by the StreamFilter or EventFilter implementation provided
by the application. If no event is accepted, getEventType
returns -1. The application can then use hasNext and next to traverse the
XML document and read the filtered events.
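The following is a minimal sketch of a filtered reader that keeps only start and end elements, in the spirit of the StreamFP sample but not copied from it; the class and method names are assumptions. Because the reader may already point at an accepted event after createFilteredReader returns, the loop inspects the current event before calling next. It also re-checks each event type, since implementations may differ in what they return at the end of the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FilteredReaderSketch {

    /** Returns a description of every start/end element seen through the filter. */
    static List<String> filteredEvents(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader raw =
                factory.createXMLStreamReader(new StringReader(xml));

        // Accept only start and end elements; everything else is filtered out.
        StreamFilter elementsOnly = new StreamFilter() {
            public boolean accept(XMLStreamReader r) {
                return r.isStartElement() || r.isEndElement();
            }
        };
        XMLStreamReader filtered =
                factory.createFilteredReader(raw, elementsOnly);

        List<String> events = new ArrayList<String>();
        int type = filtered.getEventType(); // may already be an accepted event
        while (true) {
            if (type == XMLStreamConstants.START_ELEMENT) {
                events.add("START " + filtered.getLocalName());
            } else if (type == XMLStreamConstants.END_ELEMENT) {
                events.add("END " + filtered.getLocalName());
            }
            if (!filtered.hasNext()) {
                break;
            }
            type = filtered.next();
        }
        filtered.close();
        return events;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(filteredEvents("<a><b>text</b></a>"));
    }
}
```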
Let us know your opinion and what you think the right behavior should be.
It would be good if the correct behavior were clarified in the next version
of the StAX spec.
The JSR 173 spec is not clear about what the parser behavior should be when javax.xml.stream.XMLStreamConstants.IS_COALESCE
is set to 'true' and javax.xml.stream.XMLStreamConstants.IS_REPLACE_ENTITY_REFERENCE
is set to 'false'. In this case, SJSXP gives
javax.xml.stream.XMLStreamConstants.IS_COALESCE
precedence
and ignores the value of javax.xml.stream.XMLStreamConstants.IS_REPLACE_ENTITY_REFERENCE.
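To see how a given implementation handles this combination, a small probe such as the sketch below can help. Note the assumption: it uses the constant names from the final javax.xml.stream API, XMLInputFactory.IS_COALESCING and XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, which may differ from the names quoted above for this Early Access release. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CoalescingSketch {

    /** Returns one string per text event reported for the given settings. */
    static List<String> textChunks(String xml, boolean coalesce,
                                   boolean replaceEntities)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING,
                Boolean.valueOf(coalesce));
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES,
                Boolean.valueOf(replaceEntities));
        XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(xml));
        List<String> chunks = new ArrayList<String>();
        try {
            while (reader.hasNext()) {
                int type = reader.next();
                if (type == XMLStreamConstants.CHARACTERS
                        || type == XMLStreamConstants.CDATA
                        || type == XMLStreamConstants.SPACE) {
                    chunks.add(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return chunks;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<a>one<![CDATA[two]]>three</a>";
        // Compare how many text events each configuration reports.
        System.out.println(textChunks(xml, false, false));
        System.out.println(textChunks(xml, true, false));
    }
}
```

Running the probe on a document that declares and uses an internal entity shows whether the parser reports ENTITY_REFERENCE events or expanded text when coalescing is on; the result depends on the implementation.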
Let us know your opinion and what you think the right behavior should be.
It would be good if the correct behavior were clarified in the next version
of the StAX spec.
If you find a bug, please send mail to users@jwsdp.dev.java.net,
if possible with a standalone test case that shows the problem.
Any questions related to SJSXP should also be sent to users@jwsdp.dev.java.net.