Oracle9i Application Developer's Guide - XML
Release 1 (9.0.1)

Part Number A88894-01
E
XDK for C: Specifications and Cheat Sheets

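This appendix contains the following sections:

  • XML Parser for C Specifications

  • XML Parser for C Revision History

  • XML Parser for C: Parser Functions

  • XML Parser for C: DOM API Functions

  • XML Parser for C: Namespace API Functions

  • XML Parser for C: XSLT API Functions

  • XML Parser for C: SAX API Functions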

XML Parser for C Specifications

Oracle provides a set of XML parsers for Java, C, C++, and PL/SQL. Each parser is a standalone XML component that parses an XML document (or a standalone DTD) so that an application can process it. Library and command-line versions are provided; they support the following standards and features:

Validating and Non-Validating Mode Support

The XML Parser for C can parse XML in validating or non-validating modes.

Validation involves checking whether or not the attribute names and element tags are legal, whether nested elements belong where they are, and so on.

Example Code

See Chapter 24, "Using XML Parser for C" for example code and suggestions on how to use the XML Parser for C.

Online Documentation

Documentation for Oracle XML Parser for C is located in the $ORACLE_HOME/xdk/c/parser/doc directory.

Release Specific Notes

The readme.html file in the root directory of the archive contains release specific information including bug fixes, API additions, and so on.

The Oracle XML parser for C is written in C. It will check if an XML document is well-formed, and optionally validate it against a DTD. The parser will construct an object tree which can be accessed via a DOM interface or operate serially via a SAX interface.

Standards Conformance

XML Parser for C conforms to the following standards:

Supported Character Set Encodings

XML Parser for C supports documents in the following encodings, in addition to the ones specified in Appendix A, "Character Sets", of Oracle9i Globalization and National Language Support Guide:

Default:

The default encoding is UTF-8. If your documents use only single-byte character sets (such as US-ASCII or any of the ISO-8859 character sets), set the default encoding explicitly: parsing single-byte encodings is up to 25% faster than parsing multibyte encodings such as UTF-8.

XML Parser for C Revision History

Table E-1 lists the XML Parser for C revision history.


Table E-1 XML Parser for C: Revision History
Revision  Description 

XML Parser 2.0.4.0.0 (C) 

This is the first production V2 release. The changes in this release were mainly bug fixes.

For the XML parser, the following bugs were fixed:

  • 1352943 XMLPARSE() SOMETIMES CHOKES ON FILENAMES

  • 1302311 PROBLEM WITH PARAMETER ENTITY PROCESSING

  • 1323674 INCONSISTENT ERROR HANDLING IN THE C XML PARSER

  • 1328871 LPXPRINTBUFFER UNCONDITIONALLY PREPENDS XML COMMENT TO OUTPUT

  • 1349962 USING FREED MEMORY LOCATION CAUSES TLPXVNSA31.DIF

oraxmldom.h was renamed to oradom.h.

For the XSLT processor, the following bugs were fixed:

  • 1225546 USELESS ERROR MESSAGE NEEDS DETAIL

  • 1267616 TLPXST14.DIF: REPLACE DBL_MAX WITH SBIG_ORAMAXVAL IN LPXXP.C:LPXXPSUBSTRING()

  • 1289228 ERROR CONTEXT REQUIRED FOR DEBUGGING: FILE NAME, LINE#, FUNCTION, ETC

  • 1289214 XSL:CHOOSE DOESN'T WORK

  • 1298028 XPATH CONSTRUCT NOT(POSITION()=LAST()) NOT WORKING

  • 1298193 XPATH FUNCTIONS DON'T PROVIDE IMPLICIT TYPE CONVERSION OF PARAMS

  • 1323665 C XML PARSER CANNOT SET BASE DIRECTORY OR URI FOR STYLESHEET PARSING

  • 1325452 SEVERE MEMORY CONSUMPTION / LEAK IN XSLPROCESS

  • 1333693 CHAINED TRANSFORMS WITH C XSL PROCESSOR DON'T WORK: LPX-00002

 

XML Parser 2.0.3.0.0 (C) 

SAX memory usage: Much smaller, and flat for any input size and multiple parses (memory leaks plugged).

XSLT memory usage: Improved.

Validation warnings: Validity Constraint (VC) errors have been changed to warnings and do not terminate parsing. For compatibility with the old behavior (halt on warnings as well as errors), a new flag XML_FLAG_STOP_ON_WARNING (or '-W' to the xml program) has been added.

Performance improvements: Switch to finite automata VC structure validation yields 10% performance gain.

HTTP support: HTTP URIs are now supported; look for FTP in the next release. For other access methods, the user may define their own callbacks with the new xmlaccess() API.  

Oracle XML Parser 2.0.2.0.0 (C) 

XSLT improvements: Various bugs fixed in the XSLT processor; error messages are improved; xsl:number, xsl:sort, xsl:namespace-alias, xsl:decimal-format, forwards-compatible processing with xsl:version, and literal result element as stylesheet are now available; the following XSLT-specific additions to the core XPath library are now available: current(), format-number(), generate-id(), and system-property().

Bug fixes: Some problems with validation and matching of start and end tags with SAX were fixed (1227096). Also, a bug with parameter entity processing in external entities was fixed (1225219).  

Oracle XML Parser 2.0.1.0.0 (C) 

Performance improvements: Major performance improvement over the previous release: about two and a half times faster for UTF-8 parsing and about four times faster for ASCII parsing. Comparison timing against the previous version for parsing (DOM) and validating various standalone files (SPARC Ultra 1 CPU time):

File size  Old UTF-8  New UTF-8  Speedup  Old ASCII  New ASCII  Speedup
42K        180ms      70ms       2.6      120ms      40ms       3.0
134K       510ms      210ms      2.4      450ms      100ms      4.5
247K       980ms      400ms      2.5      690ms      180ms      3.8
1M         2860ms     1130ms     2.5      1820ms     380ms      4.8
2.7M       10550ms    4100ms     2.6      7450ms     1930ms     3.9
10.5M      42250ms    16400ms    2.6      29900ms    7800ms     3.8

 

Conformance improvements: Stricter conformance to the XML 1.0 spec yields higher scores on standard test suites (James Clark, OASIS, and others).

Lists, not arrays: Internal parser data structures are now uniformly lists; arrays have been dropped. Therefore, access is now better suited to a firstChild/nextSibling style loop instead of numChildNodes/getChildNode.

DTD parsing: A new API call, xmlparsedtd(), has been added. It parses an external DTD directly, without needing an enclosing document, and is used mainly by the Class Generator.

 

Error reporting: Error messages are improved and more specific, with nearly twice as many as before. Error location is now described by a stack of line number/entity pairs, showing the final location of the error and intermediate inclusions (e.g. line X of file, line Y of entity).

NOTE: You must use the new error message file (lpxus.msb) provided with this release; the error message file provided with earlier releases is incompatible. See below.

XSL improvements: Various bugs fixed in the XSLT processor; xsl:call-template is now fully supported.

Oracle XML Parser 2.0.0.0.0 (C) 

The Oracle XML v2 parser is a beta release and is written in C. The main difference from the Oracle XML v1 parser is the ability to format the XML document according to a stylesheet via an integrated XSLT processor. The XML parser will check if an XML document is well-formed, and optionally validate it against a DTD. The parser will construct an object tree which can be accessed via a DOM interface or operate serially via a SAX interface.

Supported operating systems are Solaris 2.6, Linux 2.2, HP-UX 11.0, and NT 4 / Service Pack 3 (and above). Be sure to read the licensing agreement before using this product.  

XML Parser for C: Parser Functions

Table E-2 lists the XML Parser for C Parser functions, a brief description, and syntax.

Table E-2 XML Parser for C: Parser Functions
Function  Brief Description  Syntax and Comments 

xmlinit  

Initialize XML parser 

xmlctx *xmlinit (uword *err, const oratext *encoding, void (*msghdlr)(void *msgctx, const oratext *msg, ub4 errcode), void *msgctx, const xmlsaxcb *saxcb, void *saxcbctx, const xmlmemcb *memcb, void *memcbctx, const oratext *lang); 

xmlclean 

Clean up memory used during parse 

void xmlclean(xmlctx *ctx);

Use xmlclean() when parsing multiple files to free the memory used by earlier parses before the next call to xmlparse() or xmlparsebuf().

xmlparse 

Parse a file 

uword xmlparse(xmlctx *ctx, const oratext *filename, const oratext *encoding, ub4 flags);

To override the parser's default behavior, OR together the desired flag bits. The following flag bits may be set:

  • XML_FLAG_VALIDATE turns validation on.

  • XML_FLAG_DISCARD_WHITESPACE discards whitespace where it appears to be insignificant.

The default behavior is to not validate the input. The default behavior for whitespace processing is to be fully conformant to the XML 1.0 spec: all whitespace is reported back to the application, and the parser indicates which whitespace is ignorable.

xmlparsebuf 

Parse a buffer 

uword xmlparsebuf(xmlctx *ctx, const oratext *buffer, size_t len, const oratext *encoding, ub4 flags); 

xmlterm 

Shut down XML parser 

uword xmlterm(xmlctx *ctx); 

createDocument 

Create a new document 

xmlnode *createDocument(xmlctx *ctx);

An XML document is always rooted in a node of type DOCUMENT_NODE. This function creates that root node and sets it in the context.

isStandalone 

Return document's standalone flag 

boolean isStandalone(xmlctx *ctx);

Returns the boolean value of the document's standalone flag, as specified in the XML declaration (<?xml ...?>).

XML Parser for C: DOM API Functions

Table E-3 lists the XML Parser for C DOM API functions.

Table E-3 XML Parser for C: DOM API Functions  
Function  Brief Description 

appendChild 

Append child node to current node 

appendData 

Append character data to end of node's current data  

cloneNode  

Create a new node identical to the current one  

createAttribute 

Create a new attribute for an element node  

createCDATASection 

Create a CDATA_SECTION node 

createComment 

Create a COMMENT node  

createDocumentFragment 

Create a DOCUMENT_FRAGMENT node 

createElement 

Create an ELEMENT node 

createEntityReference 

Create an ENTITY_REFERENCE node 

createProcessingInstruction 

Create a PROCESSING_INSTRUCTION (PI) node 

createTextNode 

Create a TEXT node 

deleteData 

Remove substring from a node's character data 

getAttrName 

Return an attribute's name 

getAttrSpecified 

Return value of attribute's specified flag [DOM getSpecified] 

getAttrValue 

Return the value of an attribute  

getAttribute 

Return the value of an attribute 

getAttributeIndex 

Return an element's attribute given its index  

getAttributeNode 

Get an element's attribute node given its name [DOM getName]  

getAttributes 

Return array of element's attributes  

getCharData 

Return character data for a TEXT node [DOM getData]  

getCharLength 

Return length of TEXT node's character data [DOM getLength]  

getChildNode 

Return indexed node from array of nodes [DOM item]  

getChildNodes 

Return array of node's children  

getContentModel 

Returns the content model for an element from the DTD [DOM extension]  

getDocument 

Return top-level DOCUMENT node [DOM extension]  

getDocumentElement 

Return highest-level (root) ELEMENT node  

getDocType  

Returns current DTD  

getDocTypeEntities 

Returns array of DTD's general entities  

getDocTypeName  

Returns name of DTD  

getDocTypeNotations  

Returns array of DTD's notations  

getElementsByTagName  

Returns list of elements with matching name  

getEntityNotation  

Returns an entity's NDATA [DOM getNotation]  

getEntityPubID 

Returns an entity's public ID [DOM getPublicId] 

getEntitySysID 

Returns an entity's system ID [DOM getSystemId]  

getFirstChild 

Returns the first child of a node 

getImplementation 

Returns DOM-implementation structure (if defined)  

getLastChild 

Returns the last child of a node 

getModifier 

Returns a content model node's '?', '*', or '+' modifier [DOM extension] 

getNextSibling 

Returns a node's next sibling 

getNamedItem 

Returns the named node from a list of nodes 

getNodeMapLength 

Returns number of entries in a NodeMap [DOM getLength] 

getNodeName 

Returns a node's name 

getNodeType 

Returns a node's type code (enumeration) 

getNodeValue 

Returns a node's "value", its character data 

getNotationPubID 

Returns a notation's public ID [DOM getPublicId] 

getNotationSysID 

Returns a notation's system ID [DOM getSystemId]  

getOwnerDocument 

Returns the DOCUMENT node containing the given node  

getPIData 

Returns a processing instruction's data [DOM getData]  

getPITarget 

Returns a processing instruction's target [DOM getTarget]  

getParentNode 

Returns a node's parent node  

getPreviousSibling 

Returns a node's "previous" sibling  

getTagName 

Returns a node's "tagname", same as name for now  

hasAttributes 

Determines if element node has attributes [DOM extension]  

hasChildNodes  

Determines if node has children 

hasFeature 

Determines if DOM implementation supports a specific feature  

insertBefore 

Inserts a new child node before the given reference node  

insertData 

Inserts new character data into a node's existing data  

isStandalone 

Determines if document is standalone [DOM extension]  

nodeValid 

Validates a node against the current DTD [DOM extension]  

normalize 

Normalize a node by merging adjacent TEXT nodes  

numAttributes 

Returns number of element node's attributes [DOM extension]  

numChildNodes 

Returns number of node's children [DOM extension]  

removeAttribute 

Removes an element's attribute given its name  

removeAttributeNode 

Removes an element's attribute given its pointer  

removeChild 

Removes a node from its parent's list of children  

removeNamedItem 

Removes a node from a list of nodes given its name  

replaceChild 

Replaces one node with another  

replaceData 

Replaces a substring of a node's character data with another string  

setAttribute 

Sets (adds or replaces) a new attribute for an element node given the attribute's name and value  

setAttributeNode 

Sets (adds or replaces) a new attribute for an element node given a pointer to the new attribute  

setNamedItem 

Sets (adds or replaces) a new node in a parent's list of children  

setNodeValue 

Sets a node's "value" (character data)  

setPIData 

Sets a processing instruction's data [DOM setData]  

XML Parser for C: Namespace API Functions

Table E-4 lists the XML Parser for C, Namespace functions.

Table E-4 XML Parser for C: Namespace API Functions
Function  Brief Description 

getAttrLocal(xmlattr *attr)  

Returns attribute local name 

getAttrNamespace(xmlattr *attr)  

Returns attribute namespace (URI) 

getAttrPrefix(xmlattr *attr)  

Returns attribute prefix 

getAttrQualifiedName(xmlattr *attr) 

Returns attribute fully qualified name 

getNodeLocal(xmlnode *node) 

Returns node local name 

getNodeNamespace(xmlnode *node) 

Returns node namespace (URI) 

getNodePrefix(xmlnode *node) 

Returns node prefix 

getNodeQualifiedName(xmlnode *node) 

Returns node qualified name 

XML Parser for C: XSLT API Functions

Table E-5 lists the XML Parser for C, XSLT functions.

Table E-5 XML Parser for C: XSLT API Functions
Function  Brief Description 

xslprocess()

xslprocess(xmlctx *docctx, xmlctx *xslctx, xmlctx *resctx, xmlnode **result) 

Processes XSL Stylesheet with XML document source and returns success or an error code. 

XML Parser for C: SAX API Functions

Table E-6 lists the XML Parser for C, SAX API functions.

Table E-6 XML Parser for C: SAX API Functions
SAX Function  Brief Description 

characters(void *ctx, const oratext *ch, size_t len) 

Receive notification of character data inside an element.  

endDocument(void *ctx)  

Receive notification of the end of the document.  

endElement(void *ctx, const oratext *name)  

Receive notification of the end of an element.  

ignorableWhitespace(void *ctx, const oratext *ch, size_t len) 

Receive notification of ignorable whitespace in element content.  

notationDecl(void *ctx, const oratext *name, const oratext *publicId, const oratext *systemId) 

Receive notification of a notation declaration.  

processingInstruction(void *ctx, const oratext *target, const oratext *data) 

Receive notification of a processing instruction.  

startDocument(void *ctx) 

Receive notification of the beginning of the document.  

startElement(void *ctx, const oratext *name, const struct xmlattrs *attrs) 

Receive notification of the start of an element.  

unparsedEntityDecl(void *ctx, const oratext *name, const oratext *publicId, const oratext *systemId, const oratext *notationName) 

Receive notification of an unparsed entity declaration.  

Non-SAX Callback Functions  

 

nsStartElement(void *ctx, const oratext *qname, const oratext *local, const oratext *namespace, const struct xmlattrs *attrs) 

Receive notification of the start of a namespace for an element.  


Oracle
Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.