BEA Systems, Inc.

com.beasys.commerce.util.dom
Class SAXDocumentBuilder

java.lang.Object
  |
  +--com.beasys.commerce.util.dom.SAXDocumentBuilder

public class SAXDocumentBuilder
extends java.lang.Object
implements org.xml.sax.DocumentHandler, org.xml.sax.DTDHandler

A SAX DocumentHandler that can generate a DOM Document for any SAX-compliant parser.

This will generate DOM implementation objects from the com.beasys.commerce.util.dom package, based off of the com.beasys.commerce.util.dom.DocumentImpl object.

Like all SAX handlers, this was not designed to be reusable, although it might accidentally be (via the reset() method). It is also not intentionally thread-safe or reentrant; once you install this on a parser and call parse(), don't do anything with it until that parse finishes.

Since SAX only deals with elements, attributes, processing instructions, and text, that's all that will be in the generated DOM Document. According to SAX, the parser must fully expand entities, so there should be no EntityReferences. Since entities are expanded, SAX has no need to separate Text from CDATASections. Comments are not handed to SAX document handlers, so those are lost. A DocumentType for the Document will be generated if and when a Notation, Entity, or Element is found; however, I haven't seen a SAX parser that passes Entities to the DTDHandler, so those might be lost. Also, since the DOCTYPE isn't passed to the DocumentHandler, I can only assume that the name of the DocumentType will be the tag name of the first element parsed.
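
As a rough illustration of the event stream described above, the following sketch logs what a SAX 1 handler actually receives. It assumes the JDK's bundled JAXP parser and the deprecated org.xml.sax.HandlerBase convenience class; SaxEventLog is a hypothetical name and is not part of this package.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

// Hypothetical demo class; it is not part of com.beasys.commerce.util.dom.
@SuppressWarnings("deprecation")
class SaxEventLog extends HandlerBase {
    final StringBuilder events = new StringBuilder();
    final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String name, AttributeList attrs) {
        events.append("start:").append(name).append(' ');
    }

    @Override
    public void endElement(String name) {
        events.append("end:").append(name).append(' ');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver text in several chunks, so accumulate it.
        text.append(ch, start, length);
    }

    // Parse a string with the JDK's default SAX parser and return the log.
    static SaxEventLog parse(String xml) {
        try {
            SaxEventLog log = new SaxEventLog();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), log);
            return log;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Parsing a document that mixes a comment, an entity reference, and a CDATA section this way should show only element and character events: there is no comment callback at all, and the entity and CDATA content arrive as ordinary character data.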

To construct a DOM Document tree with this class, use it like this:

     // get a SAX parser from somewhere
     Parser p = ...;

     // instantiate a document builder
     SAXDocumentBuilder builder = new SAXDocumentBuilder();

     // install it on the parser
     builder.installOn(p);

     // parse the document
     p.parse(...);

     // get the Document tree
     Document doc = builder.getDocument();
 

See Also:
DocumentImpl, Document

Constructor Summary
SAXDocumentBuilder()
          Constructor that maintains whitespace and enforces strict parsing.
SAXDocumentBuilder(boolean maintainWS, boolean strictParsing)
          Constructor that sets whitespace handling and strict parsing explicitly.
 
Method Summary
 void characters(char[] ch, int start, int length)
          The parser found character data.
 void endDocument()
          The end of a document parse.
 void endElement(java.lang.String name)
          The end of an element.
 org.w3c.dom.Document getDocument()
          Get the current Document object.
 org.xml.sax.Locator getDocumentLocator()
          Get the document-event locator we're currently using.
 void ignorableWhitespace(char[] ch, int start, int length)
          The parser found ignorable whitespace characters.
 void installOn(org.xml.sax.Parser p)
          Install this builder on a SAX Parser.
 boolean maintainsWhitespace()
          Tell if this maintains whitespace.
 void notationDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
          The parser found a notation declaration.
 void processingInstruction(java.lang.String target, java.lang.String data)
          The parser found a processing instruction.
 void reset()
          Reset the handler to a state where it can be used again.
 void setDocumentLocator(org.xml.sax.Locator locator)
          The parser will call this to set the document-event locator for the parse.
 void setMaintainsWhitespace(boolean maintainsWS)
          Set if this should maintain whitespace.
 void setStrictParsing(boolean strictParsing)
          Set if this should use strict parsing.
 void startDocument()
          The start of a document parse.
 void startElement(java.lang.String name, org.xml.sax.AttributeList attrs)
          The start of an element.
 void unparsedEntityDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId, java.lang.String notationName)
          The parser found an unparsed entity declaration.
 boolean usesStrictParsing()
          Tell if this enforces strict parsing.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SAXDocumentBuilder

public SAXDocumentBuilder(boolean maintainWS,
                          boolean strictParsing)
Constructor that sets whitespace handling and strict parsing explicitly.
Parameters:
maintainWS - true to maintain ignorable whitespace.
strictParsing - true to do strict parsing.

SAXDocumentBuilder

public SAXDocumentBuilder()
Constructor that maintains whitespace and enforces strict parsing.
Method Detail

maintainsWhitespace

public boolean maintainsWhitespace()
Tell if this maintains whitespace.

setMaintainsWhitespace

public void setMaintainsWhitespace(boolean maintainsWS)
Set if this should maintain whitespace.

usesStrictParsing

public boolean usesStrictParsing()
Tell if this enforces strict parsing.

setStrictParsing

public void setStrictParsing(boolean strictParsing)
Set if this should use strict parsing.

reset

public void reset()
Reset the handler to a state where it can be used again.

This resets the document stack and clears the last generated document.

See Also:
#resetDocumentStack

installOn

public void installOn(org.xml.sax.Parser p)
Install this builder on a SAX Parser.

This is a convenience method that sets the parser's document handler and DTD handler to this.
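
A minimal sketch of what such a convenience method might look like, assuming it simply registers the same object for both handler roles. InstallableHandler is a hypothetical stand-in for the builder, and obtaining the SAX 1 Parser through JAXP's SAXParser.getParser() is an assumption about the environment, not something this class requires.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical stand-in for the builder; only the installOn idea is shown.
@SuppressWarnings("deprecation")
class InstallableHandler extends HandlerBase {
    int elementCount;

    // Assumed behaviour: register this object as both document and DTD handler.
    void installOn(Parser p) {
        p.setDocumentHandler(this);
        p.setDTDHandler(this);
    }

    @Override
    public void startElement(String name, AttributeList attrs) {
        elementCount++;
    }

    // Install a fresh handler on a JAXP-provided SAX 1 parser and count elements.
    static int countElements(String xml) {
        try {
            Parser p = SAXParserFactory.newInstance().newSAXParser().getParser();
            InstallableHandler h = new InstallableHandler();
            h.installOn(p);
            p.parse(new InputSource(new StringReader(xml)));
            return h.elementCount;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```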


getDocumentLocator

public org.xml.sax.Locator getDocumentLocator()
Get the document-event locator we're currently using.

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
The parser will call this to set the document-event locator for the parse.
Specified by:
setDocumentLocator in interface org.xml.sax.DocumentHandler

getDocument

public org.w3c.dom.Document getDocument()
Get the current Document object.

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
The start of a document parse.

This should reset everything, generate a Document for us to use, and push that Document onto the stack.

Specified by:
startDocument in interface org.xml.sax.DocumentHandler
See Also:
reset()

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
The end of a document parse.

This should just clear the document stack.

Specified by:
endDocument in interface org.xml.sax.DocumentHandler

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
The parser found character data.

This should generate a Text node and add it to the node currently at the top of the document stack (the TOS).

Specified by:
characters in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
#getDocumentImpl, #appendToTOS

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
The parser found ignorable whitespace characters.
Specified by:
ignorableWhitespace in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
characters(char[], int, int)
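
The relationship between ignorableWhitespace and the maintain-whitespace flag is not spelled out here. One plausible reading, sketched below purely as an assumption, is that maintained whitespace is forwarded to characters() and unmaintained whitespace is dropped. WhitespacePolicy is a hypothetical class, not the BEA implementation.

```java
import org.xml.sax.HandlerBase;

// Hypothetical sketch of a maintain-whitespace switch; not the BEA source.
@SuppressWarnings("deprecation")
class WhitespacePolicy extends HandlerBase {
    private final boolean maintainWS;
    final StringBuilder kept = new StringBuilder();

    WhitespacePolicy(boolean maintainWS) {
        this.maintainWS = maintainWS;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        kept.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Assumption: maintained whitespace is treated like ordinary text,
        // otherwise it is silently discarded.
        if (maintainWS) {
            characters(ch, start, length);
        }
    }
}
```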

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
                           throws org.xml.sax.SAXException
The parser found a processing instruction.

This should generate a ProcessingInstruction and add it to the current TOS.

Specified by:
processingInstruction in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
#getDocumentImpl, #appendToTOS

startElement

public void startElement(java.lang.String name,
                         org.xml.sax.AttributeList attrs)
                  throws org.xml.sax.SAXException
The start of an element.

This should generate an Element, add it to the current TOS, then push it onto the stack. If this is the first element in the document (i.e. docStack.size() == 1), then also set the DTD name of the document's DocumentType to the tag name.

Specified by:
startElement in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
#getDocumentImpl, #appendToTOS, #getDocumentType

endElement

public void endElement(java.lang.String name)
                throws org.xml.sax.SAXException
The end of an element.

This should pop the TOS.

Specified by:
endElement in interface org.xml.sax.DocumentHandler
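
The stack discipline described for startDocument, characters, processingInstruction, startElement, and endElement can be sketched with JDK classes alone. This is an illustration under assumptions, not the BEA implementation: StackDomBuilder is a hypothetical name, it creates nodes through the JDK's default DOM implementation rather than DocumentImpl, and it leaves out DocumentType handling and strict-parsing checks.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch of a stack-based SAX-to-DOM builder.
@SuppressWarnings("deprecation")
class StackDomBuilder extends HandlerBase {
    private final Deque<Node> stack = new ArrayDeque<>();
    private Document doc;

    Document getDocument() {
        return doc;
    }

    @Override
    public void startDocument() throws SAXException {
        // Reset, create a fresh Document, and push it as the first TOS.
        stack.clear();
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new SAXException(e);
        }
        stack.push(doc);
    }

    @Override
    public void endDocument() {
        stack.clear();
    }

    @Override
    public void startElement(String name, AttributeList attrs) {
        Element e = doc.createElement(name);
        for (int i = 0; i < attrs.getLength(); i++) {
            e.setAttribute(attrs.getName(i), attrs.getValue(i));
        }
        stack.peek().appendChild(e); // add to the current TOS
        stack.push(e);               // the new element becomes the TOS
    }

    @Override
    public void endElement(String name) {
        stack.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        stack.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    @Override
    public void processingInstruction(String target, String data) {
        stack.peek().appendChild(doc.createProcessingInstruction(target, data));
    }

    // Parse a string with the JDK's default SAX parser and return the tree.
    static Document build(String xml) {
        try {
            StackDomBuilder b = new StackDomBuilder();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), b);
            return b.getDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because every node is appended to whatever is on top of the stack, the nesting of the resulting tree mirrors the nesting of the start and end events without any explicit parent bookkeeping.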

notationDecl

public void notationDecl(java.lang.String name,
                         java.lang.String publicId,
                         java.lang.String systemId)
                  throws org.xml.sax.SAXException
The parser found a notation declaration.

This should create a Notation and add it to the document type.

Specified by:
notationDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
#getDocumentType

unparsedEntityDecl

public void unparsedEntityDecl(java.lang.String name,
                               java.lang.String publicId,
                               java.lang.String systemId,
                               java.lang.String notationName)
                        throws org.xml.sax.SAXException
The parser found an unparsed entity declaration.

This should create an Entity and add it to the document's docType. However, I'm not sure where the children of the entity will come from.

Specified by:
unparsedEntityDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException - thrown if something goes wrong.
See Also:
#getDocumentType
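
To see which DTD declarations a SAX 1 parser reports through DTDHandler, a small logging handler can help. The sketch below assumes the JDK's bundled JAXP parser; DtdEventLog is a hypothetical class, and it only records names instead of building Notation or Entity nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

// Hypothetical logger for DTDHandler callbacks; not part of this package.
@SuppressWarnings("deprecation")
class DtdEventLog extends HandlerBase {
    final List<String> notations = new ArrayList<>();
    final List<String> unparsedEntities = new ArrayList<>();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        notations.add(name);
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId,
                                   String systemId, String notationName) {
        // Record the entity together with the notation it refers to.
        unparsedEntities.add(name + " -> " + notationName);
    }

    // Parse a string with the JDK's default SAX parser and return the log.
    static DtdEventLog parse(String xml) {
        try {
            DtdEventLog log = new DtdEventLog();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), log);
            return log;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only notation and unparsed entity declarations have callbacks in DTDHandler, which is consistent with the remark in the class description that parsed entities may never reach the builder.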

Copyright © 2000 BEA Systems, Inc. All Rights Reserved