BEA Systems, Inc.

com.beasys.commerce.util.dom
Class SAXDocumentBuilder

java.lang.Object
  |
  +--com.beasys.commerce.util.dom.SAXDocumentBuilder

public class SAXDocumentBuilder
extends java.lang.Object
implements org.xml.sax.DocumentHandler, org.xml.sax.DTDHandler

A SAX DocumentHandler that can generate a DOM Document from any SAX-compliant parser.

This will generate DOM implementation objects from the com.beasys.commerce.util.dom package, based on the com.beasys.commerce.util.dom.DocumentImpl object.

Like all SAX handlers, this was not designed to be reusable, although it might accidentally be (via the reset() method). It's also not intentionally thread-safe/reentrant; once you put this on a parser and call parse(), don't do anything with it until that parse finishes.

Since SAX only deals with elements, attributes, processing instructions, and text, that's all that will be in the DOM Document generated. According to SAX, the parser must fully expand entities, so there should be no EntityReferences. Since entities are expanded, there's no need in SAX for separating Text from CDATASections. Comments are not handed to SAX document handlers, so those are lost. A DocumentType for the Document will be generated if and when a Notation, Entity, or Element is found; however, I haven't seen a SAX parser that passes Entities to the DTDHandler, so those might be lost. Also, since the DOCTYPE isn't passed to the DocumentHandler, I can only assume that the DTD type of the DocType will be the tag name of the first element parsed.
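
The limits described above can be observed with any SAX 1 parser. The following is a minimal sketch, not part of this package: it assumes the JDK's JAXP implementation is available to supply an org.xml.sax.Parser, and the class name SaxEventLimitsSketch is hypothetical. It collects every characters() callback so you can see that the comment is dropped, the CDATA section arrives as plain text, and the entity is already expanded.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical demo class; it does not use SAXDocumentBuilder itself.
final class SaxEventLimitsSketch {

    /** Concatenates every characters() callback delivered for the given XML. */
    static String collectText(String xml) {
        final StringBuilder text = new StringBuilder();
        try {
            // Assumption: JAXP is used here only as a convenient source of a SAX 1 Parser.
            Parser p = SAXParserFactory.newInstance().newSAXParser().getParser();
            p.setDocumentHandler(new HandlerBase() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
            p.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // The comment never reaches the handler; CDATA and &amp; arrive as text.
        System.out.println(collectText("<a>x<!-- dropped --><![CDATA[<y>]]>&amp;z</a>"));
    }
}
```

With the parser bundled in the JDK, the collected text should be x<y>&z, which is why a Document built from these events contains only Text nodes where the source had comments, CDATA sections, and entity references.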

Using this to construct a DOM Document tree would look like:

     // get a SAX parser from somewhere
     Parser p = ...;

     // instantiate a document builder
     SAXDocumentBuilder builder = new SAXDocumentBuilder();

     // install it on the parser
     builder.installOn(p);

     // parse the document
     p.parse(...);

     // get the Document tree
     Document doc = builder.getDocument();
 

See Also:
DocumentImpl, Document

Field Summary
protected  DocumentImpl doc
          Our document object.
protected  org.xml.sax.Locator docLocator
          The document-event locator object supplied by the parser.
protected  java.util.Stack docStack
          Our document stack.
protected  DocumentTypeImpl docType
          Our document type object for our document.
protected  boolean maintainWS
          Do we maintain ignorable whitespace in the document?
protected  boolean strictParsing
          Do we enforce strict parsing?
 
Constructor Summary
SAXDocumentBuilder()
          Constructor that maintains whitespace and enforces strict parsing.
SAXDocumentBuilder(boolean maintainWS, boolean strictParsing)
          Constructor that sets whether to maintain ignorable whitespace and whether to enforce strict parsing.
 
Method Summary
protected  void appendToTOS(org.w3c.dom.Node node)
          Add the given node to the TOS (the top of the document stack).
protected  void assignAttributeList(org.w3c.dom.Element e, org.xml.sax.AttributeList list)
          Assign the attributes in a SAX AttributeList to a DOM Element.
 void characters(char[] ch, int start, int length)
          The parser found character data.
protected  org.xml.sax.SAXException createSAXException(java.lang.String message, java.lang.Exception ex)
          Generate a SAXException.
 void endDocument()
          The end of a document parse.
 void endElement(java.lang.String name)
          The end of an element.
 org.w3c.dom.Document getDocument()
          Get the current Document object.
protected  DocumentImpl getDocumentImpl()
          Get the document as a DocumentImpl.
 org.xml.sax.Locator getDocumentLocator()
          Get the document-event locator we're currently using.
protected  DocumentTypeImpl getDocumentType()
          Get the DocumentTypeImpl of the current Document.
protected  org.w3c.dom.Node getTOS()
          Get the TOS as a Node.
 void ignorableWhitespace(char[] ch, int start, int length)
          The parser found ignorable whitespace characters.
 void installOn(org.xml.sax.Parser p)
          Install this builder on a SAX Parser.
 boolean maintainsWhitespace()
          Tell if this maintains whitespace.
 void notationDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
          The parser found a notation declaration.
 void processingInstruction(java.lang.String target, java.lang.String data)
          The parser found a processing instruction.
 void reset()
          Reset the handler to a state where it can be used again.
protected  void resetDocumentStack()
          Reset the document stack.
 void setDocumentLocator(org.xml.sax.Locator locator)
          The parser will call this to set the document-event locator for the parse.
 void setMaintainsWhitespace(boolean maintainsWS)
          Set if this should maintain whitespace.
 void setStrictParsing(boolean strictParsing)
          Set if this should use strict parsing.
 void startDocument()
          The start of a document parse.
 void startElement(java.lang.String name, org.xml.sax.AttributeList attrs)
          The start of an element.
 void unparsedEntityDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId, java.lang.String notationName)
          The parser found an unparsed entity declaration.
 boolean usesStrictParsing()
          Tell if this enforces strict parsing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

docLocator

protected org.xml.sax.Locator docLocator
The document-event locator object supplied by the parser.

docStack

protected java.util.Stack docStack
Our document stack.

doc

protected DocumentImpl doc
Our document object.

docType

protected DocumentTypeImpl docType
Our document type object for our document.

maintainWS

protected boolean maintainWS
Do we maintain ignorable whitespace in the document?

strictParsing

protected boolean strictParsing
Do we enforce strict parsing?

This implies that:

  1. All tags must be closed correctly (startElement tag == endElement tag).
  2. All elements found in the stack must be of the correct type.
  3. All document events must occur in the correct order.
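
As an illustration of rule 1 only, a tag-matching check can be sketched with a stack of open tag names. This is an assumption about how such a check might look, not the actual implementation; the class StrictTagCheckSketch and its method signatures are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.SAXException;

// Hypothetical sketch of a "start tag == end tag" check.
final class StrictTagCheckSketch {
    private final Deque<String> openTags = new ArrayDeque<>();

    /** Remembers the tag so the matching endElement can be verified. */
    void startElement(String name) {
        openTags.push(name);
    }

    /** Fails if the closing tag does not match the most recently opened one. */
    void endElement(String name) throws SAXException {
        if (openTags.isEmpty()) {
            throw new SAXException("endElement(" + name + ") with no open element");
        }
        String expected = openTags.pop();
        if (!expected.equals(name)) {
            throw new SAXException("expected </" + expected + "> but found </" + name + ">");
        }
    }
}
```

A conforming parser should never produce mismatched events, so a check like this mainly guards against handlers being driven by hand-written or filtered event streams.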

Constructor Detail

SAXDocumentBuilder

public SAXDocumentBuilder(boolean maintainWS,
                          boolean strictParsing)
Constructor that sets whether to maintain ignorable whitespace and whether to enforce strict parsing.
Parameters:
maintainWS - true to maintain ignorable whitespace.
strictParsing - true to do strict parsing.

SAXDocumentBuilder

public SAXDocumentBuilder()
Constructor that maintains whitespace and enforces strict parsing.

Method Detail

maintainsWhitespace

public boolean maintainsWhitespace()
Tell if this maintains whitespace.

setMaintainsWhitespace

public void setMaintainsWhitespace(boolean maintainsWS)
Set if this should maintain whitespace.

usesStrictParsing

public boolean usesStrictParsing()
Tell if this enforces strict parsing.

setStrictParsing

public void setStrictParsing(boolean strictParsing)
Set if this should use strict parsing.

reset

public void reset()
Reset the handler to a state where it can be used again.

This resets the document stack and clears the last generated document.

See Also:
resetDocumentStack()

installOn

public void installOn(org.xml.sax.Parser p)
Install this builder on a SAX Parser.

This is a convenience method that sets the parser's document handler and DTD handler to this.
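
The behavior described amounts to two registrations on the parser. The following is a minimal sketch under that assumption; InstallOnSketch is a hypothetical stand-in built on org.xml.sax.HandlerBase, and the two boolean flags exist only so the effect can be observed.

```java
import org.xml.sax.HandlerBase;
import org.xml.sax.Parser;

// Hypothetical stand-in for a handler that installs itself on a SAX 1 Parser.
final class InstallOnSketch extends HandlerBase {
    boolean sawDocument;   // set once a DocumentHandler event arrives
    boolean sawNotation;   // set once a DTDHandler event arrives

    /** Registers this object for both document events and DTD events. */
    void installOn(Parser p) {
        p.setDocumentHandler(this);
        p.setDTDHandler(this);
    }

    @Override
    public void startDocument() {
        sawDocument = true;
    }

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        sawNotation = true;
    }
}
```

Registering the DTD handler as well is what lets notation and unparsed entity declarations reach the builder; setting only the document handler would silently lose them.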


resetDocumentStack

protected void resetDocumentStack()
Reset the document stack.

This will also lazily initialize the document stack.


getTOS

protected org.w3c.dom.Node getTOS()
                           throws org.xml.sax.SAXException
Get the TOS as a Node.
Throws:
org.xml.sax.SAXException - thrown if there isn't a TOS

appendToTOS

protected void appendToTOS(org.w3c.dom.Node node)
                    throws org.xml.sax.SAXException
Add the given node to the TOS.

createSAXException

protected org.xml.sax.SAXException createSAXException(java.lang.String message,
                                                      java.lang.Exception ex)
Generate a SAXException.

If the locator was set, then this will generate a SAXParseException.
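
The locator-dependent choice can be sketched as follows. This is an illustration only: the real method reads the docLocator field, whereas this hypothetical SaxExceptionSketch.create takes the locator as a parameter so it can stand alone.

```java
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch; the locator is passed in rather than read from a field.
final class SaxExceptionSketch {

    /** Returns a SAXParseException when position information is available. */
    static SAXException create(String message, Exception cause, Locator locator) {
        if (locator != null) {
            // Carries the line and column reported by the parser's locator.
            return new SAXParseException(message, locator, cause);
        }
        return new SAXException(message, cause);
    }
}
```

Callers can then catch SAXException uniformly and test for SAXParseException when they want to report a line and column.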


assignAttributeList

protected void assignAttributeList(org.w3c.dom.Element e,
                                   org.xml.sax.AttributeList list)
                            throws org.xml.sax.SAXException
Assign the attributes in a SAX AttributeList to a DOM Element.
Parameters:
e - the DOM Element.
list - the SAX AttributeList.
Throws:
org.xml.sax.SAXException - thrown if a DOMException occurs in the assigning.
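
A minimal sketch of this kind of copy, assuming the attributes are set by name and value and that any DOMException is wrapped in a SAXException. AttributeCopySketch is a hypothetical name, and the code works against the standard org.w3c.dom interfaces rather than this package's implementation classes.

```java
import org.w3c.dom.DOMException;
import org.w3c.dom.Element;
import org.xml.sax.AttributeList;
import org.xml.sax.SAXException;

// Hypothetical sketch of copying SAX 1 attributes onto a DOM Element.
final class AttributeCopySketch {

    /** Copies each name/value pair; a DOM failure is reported as a SAXException. */
    static void assign(Element e, AttributeList list) throws SAXException {
        try {
            for (int i = 0; i < list.getLength(); i++) {
                e.setAttribute(list.getName(i), list.getValue(i));
            }
        } catch (DOMException ex) {
            throw new SAXException("could not assign attributes to <" + e.getTagName() + ">", ex);
        }
    }
}
```

The attribute type reported by getType(i) is not carried over in this sketch, since Element.setAttribute only takes a name and a value.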

getDocumentLocator

public org.xml.sax.Locator getDocumentLocator()
Get the document-event locator we're currently using.

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
The parser will call this to set the document-event locator for the parse.
Specified by:
setDocumentLocator in interface org.xml.sax.DocumentHandler

getDocument

public org.w3c.dom.Document getDocument()
Get the current Document object.

getDocumentImpl

protected DocumentImpl getDocumentImpl()
                                throws org.xml.sax.SAXException
Get the document as a DocumentImpl.
Throws:
org.xml.sax.SAXException - thrown if there isn't a document yet.

getDocumentType

protected DocumentTypeImpl getDocumentType()
                                    throws org.xml.sax.SAXException
Get the DocumentTypeImpl of the current Document.

If the document type hasn't been created yet, it will be here.

Returns:
the DocumentTypeImpl created.
Throws:
org.xml.sax.SAXException - thrown if it cannot be added to the document.
See Also:
getDocumentImpl()

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
The start of a document parse.

This should reset everything, generate a Document for us to use, and push that Document onto the stack.

Specified by:
startDocument in interface org.xml.sax.DocumentHandler
See Also:
reset()

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
The end of a document parse.

This should just clear the document stack.

Specified by:
endDocument in interface org.xml.sax.DocumentHandler

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
The parser found character data.

This should generate a Text and add it to the current TOS.

Specified by:
characters in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
getDocumentImpl(), appendToTOS(org.w3c.dom.Node)

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
The parser found ignorable whitespace characters.
Specified by:
ignorableWhitespace in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
characters(char[], int, int)

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
                           throws org.xml.sax.SAXException
The parser found a processing instruction.

This should generate a ProcessingInstruction and add it to the current TOS.

Specified by:
processingInstruction in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
getDocumentImpl(), appendToTOS(org.w3c.dom.Node)

startElement

public void startElement(java.lang.String name,
                         org.xml.sax.AttributeList attrs)
                  throws org.xml.sax.SAXException
The start of an element.

This should generate an Element, add it to the current TOS, then push it onto the stack. If this is the first element in the document (i.e. docStack.size() == 1), then also set the document's docType's DTD name to the tagName.

Specified by:
startElement in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
getDocumentImpl(), appendToTOS(org.w3c.dom.Node), getDocumentType()

endElement

public void endElement(java.lang.String name)
                throws org.xml.sax.SAXException
The end of an element.

This should pop the TOS.

Specified by:
endElement in interface org.xml.sax.DocumentHandler
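
The push and pop discipline described for startDocument, startElement, characters, and endElement can be sketched as below. This is a simplified stand-in, not the BEA implementation: StackDomSketch is a hypothetical class, it creates nodes through the JDK's JAXP DocumentBuilder instead of DocumentImpl, and it leaves out attributes, processing instructions, the document type, and strict checks.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;
import org.xml.sax.SAXException;

// Hypothetical, simplified stack-based DOM builder driven by SAX 1 events.
final class StackDomSketch extends HandlerBase {
    private final Deque<Node> stack = new ArrayDeque<>();
    private Document doc;

    @Override
    public void startDocument() throws SAXException {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new SAXException("could not create an empty Document", e);
        }
        stack.clear();
        stack.push(doc);                 // the Document sits at the bottom of the stack
    }

    @Override
    public void startElement(String name, AttributeList attrs) {
        Element e = doc.createElement(name);
        stack.peek().appendChild(e);     // attach to the current TOS
        stack.push(e);                   // the new element becomes the TOS
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        stack.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    @Override
    public void endElement(String name) {
        stack.pop();                     // the parent becomes the TOS again
    }

    @Override
    public void endDocument() {
        stack.clear();
    }

    /** Parses the XML with a JAXP-supplied SAX 1 Parser and returns the tree. */
    static Document build(String xml) {
        StackDomSketch handler = new StackDomSketch();
        try {
            Parser p = SAXParserFactory.newInstance().newSAXParser().getParser();
            p.setDocumentHandler(handler);
            p.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.doc;
    }
}
```

For example, StackDomSketch.build("<a><b>hi</b><c/></a>") should return a Document whose root element a has two element children. Because the Document is the only entry on the stack when the root element starts, a stack size of 1 is a natural test for "first element in the document".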

notationDecl

public void notationDecl(java.lang.String name,
                         java.lang.String publicId,
                         java.lang.String systemId)
                  throws org.xml.sax.SAXException
The parser found a notation declaration.

This should create a Notation and add it to the document type.

Specified by:
notationDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
getDocumentType()

unparsedEntityDecl

public void unparsedEntityDecl(java.lang.String name,
                               java.lang.String publicId,
                               java.lang.String systemId,
                               java.lang.String notationName)
                        throws org.xml.sax.SAXException
The parser found an unparsed entity declaration.

This should create an Entity and add it to the document's docType. However, I'm not sure where the children of the entity will come from.

Specified by:
unparsedEntityDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
getDocumentType()

Copyright © 2000 BEA Systems, Inc. All Rights Reserved