9 Package XML APIs for C

This C implementation of the XML processor (or parser) follows the W3C XML specification (rev REC-xml-19980210) and implements the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application.

This chapter contains the following section:


XML Interface

Table 9-1 summarizes the methods available through the XML interface.

Table 9-1 Summary of XML Methods

Function Summary
XmlAccess() Set access method callbacks for URL.
XmlCreate() Create an XML Developer's Toolkit xmlctx.
XmlCreateDTD() Create DTD.
XmlCreateDocument() Create Document (node).
XmlDestroy() Destroy an xmlctx.
XmlFreeDocument() Free a document (releases all resources).
XmlGetEncoding() Returns data encoding in use by XML context.
XmlHasFeature() Determine if DOM feature is implemented.
XmlIsSimple() Returns single-byte (simple) character set flag.
XmlIsUnicode() Returns Unicode (UTF-16) character set flag.
XmlLoadDom() Load (parse) an XML document and produce a DOM.
XmlLoadSax() Load (parse) an XML document and produce SAX events.
XmlLoadSaxVA() Load (parse) an XML document and produce SAX events [varargs].
XmlSaveDom() Saves (serializes, formats) an XML document.
XmlVersion() Returns version string for XDK.


XmlAccess()

Sets the open/read/close callbacks used to load data for a specific URL access method. Overrides the built-in data loading functions for HTTP, FTP, and so on, or provides functions to handle new types, such as UNKNOWN.

Syntax

xmlerr XmlAccess(
   xmlctx *xctx, 
   xmlurlacc access, 
   void *userctx,
   XML_ACCESS_OPEN_F(
      (*openf),
      ctx,
      uri,
      parts,
      length,
      uh),
   XML_ACCESS_READ_F(
      (*readf),
      ctx,
      uh,
      data,
      nraw,
      eoi),
   XML_ACCESS_CLOSE_F(
      (*closef), 
      ctx,
      uh));
Parameter In/Out Description
xctx IN XML context
access IN URL access method
userctx IN user-defined context passed to callbacks
openf IN open-access callback function
readf IN read-access callback function
closef IN close-access callback function

Returns

(xmlerr) numeric error code, XMLERR_OK [0] on success
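
The open/read/close division of labor can be illustrated with a small self-contained sketch. It does not use the XDK headers: the callback types, `demo_memsrc`, and `demo_fetch` are hypothetical stand-ins with simplified signatures (the real ones come from XML_ACCESS_OPEN_F, XML_ACCESS_READ_F and XML_ACCESS_CLOSE_F), and the "loader" simply copies what the callbacks deliver. It shows only how a user context travels to each callback and how the read callback reports the byte count and end of input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified callback types. The real signatures are produced
   by XML_ACCESS_OPEN_F, XML_ACCESS_READ_F and XML_ACCESS_CLOSE_F. */
typedef int (*demo_open_f)(void *ctx, const char *uri, void **handle);
typedef int (*demo_read_f)(void *ctx, void *handle, char *dest, size_t cap,
                           size_t *nraw, int *eoi);
typedef int (*demo_close_f)(void *ctx, void *handle);

/* User context: an in-memory "document" served for any URI. */
typedef struct {
    const char *data;
    size_t      len;
    size_t      pos;
} demo_memsrc;

int demo_mem_open(void *ctx, const char *uri, void **handle)
{
    demo_memsrc *src = ctx;
    (void)uri;                      /* a real handler would inspect the URI */
    src->pos = 0;
    *handle = src;                  /* handle passed back on read and close */
    return 0;
}

int demo_mem_read(void *ctx, void *handle, char *dest, size_t cap,
                  size_t *nraw, int *eoi)
{
    demo_memsrc *src = handle;
    size_t left = src->len - src->pos;
    size_t n = left < cap ? left : cap;
    (void)ctx;
    memcpy(dest, src->data + src->pos, n);
    src->pos += n;
    *nraw = n;                      /* bytes delivered by this call */
    *eoi = (src->pos == src->len);  /* nonzero once all input is consumed */
    return 0;
}

int demo_mem_close(void *ctx, void *handle)
{
    (void)ctx;
    (void)handle;
    return 0;
}

/* Stand-in for the loader: open, read until end of input, close.
   Returns the number of bytes copied into out. */
size_t demo_fetch(void *userctx, demo_open_f openf, demo_read_f readf,
                  demo_close_f closef, const char *uri,
                  char *out, size_t outcap)
{
    void  *handle = NULL;
    char   chunk[8];
    size_t total = 0, n = 0;
    int    eoi = 0;

    assert(openf != NULL && readf != NULL && closef != NULL);
    if (openf(userctx, uri, &handle) != 0)
        return 0;
    while (!eoi) {
        if (readf(userctx, handle, chunk, sizeof chunk, &n, &eoi) != 0)
            break;
        if (total + n > outcap)
            break;                  /* caller's buffer is too small */
        memcpy(out + total, chunk, n);
        total += n;
    }
    closef(userctx, handle);
    return total;
}
```

In this sketch the three callbacks share state only through the user context and the handle returned by the open callback, which mirrors the role of userctx and uh in the XmlAccess() signature.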


XmlCreate()

Create an XML Developer's Toolkit xmlctx. Properties common to all xmlctx's (both XDK and XMLType) are:

  • ("data_encoding", name of data encoding) The encoding in which XML data will be presented through DOM and SAX. If not specified, the default is UTF-8 (or UTF-E on EBCDIC platforms). Note that single-byte encodings such as EBCDIC or ISO-8859 are substantially faster than multibyte encodings like UTF-8; Unicode (UTF-16) uses more memory but has better performance than multibyte.

  • ("data_lid", data encoding lid) The data encoding specified as an NLS lx_langid; the matching NLS global area must also be specified.

  • ("default_input_encoding", name of default input encoding) If the encoding of an input document cannot be automatically determined through BOM, XMLDecl, protocol header, and so on, then this encoding will be assumed.

  • ("default_input_lid", default input encoding lid) The default input encoding specified as an NLS lx_langid; the matching NLS global area must also be specified.

  • ("error_language", error language or language.encoding) The language (and optional encoding) in which error messages are created. The default is American with UTF-8 encoding. To specify just the language, give the name of the language and nothing else ("American"); to also specify the encoding, add a dot and the Oracle name of the encoding ("American.WE8ISO8859P1").

  • ("error_handler", function pointer, see XML_ERRMSG_F) Default behavior on errors is to output the formatted message to stderr. If an error handler is provided, the formatted message will be passed to it instead of being printed.

  • ("error_context", user-defined context for error handler) This is a context pointer to be passed to the error handler function. Its meaning is user-defined; it is just specified here and passed along when an error occurs.

  • ("input_encoding", name of forced input encoding) The forced input encoding for input documents. Used to override a document's XMLDecl, and so on, and always interpret it in the given encoding. Use of this feature is strongly discouraged. It should not be necessary in normal use, as BOMs, XMLDecls, and so on, when present, should be correct.

  • ("input_lid", INLID, POINTER) The forced input encoding.

  • ("lpu_context", lpu context) The LPU context used for URL data loading and access-method hooking. If one is not provided, one is created automatically.

  • ("lml_context", LMLCTX, POINTER) The LML context used for low-level memory allocation. If one is not provided, one is created. External users must instead set memory_alloc, memory_free, and so on.

  • ("memory_alloc", low-level memory allocation function) Low-level memory allocation function, if malloc is not to be used. If provided, the matching free function must also be given. See XML_ALLOC_F.

  • ("memory_free", low-level memory freeing function) Low-level memory freeing function, if free is not to be used. Matches the alloc function.

  • ("memory_context", user-defined memory context) User-defined memory context which is passed to the alloc and free functions. Its definition and use is entirely up to the user; it is just set here and passed to the callbacks.

  • ("nls_global_area", NLS global area, lx_glo) If any encodings are specified as NLS lids, the matching NLS global area must also be specified.

The XDK has properties of its own that apply only to an XDK-type xmlctx (the previous properties are general and apply to all xmlctx's):

  • ("input_buffer_size", size in characters of input buffer) This is the basic I/O buffer size. Default is 256K, minimum is 4K and maximum is 4MB. Depending on the encoding, 1, 2 or 3 of these buffers may be needed. Note size is in characters, not bytes. If the buffer holds Unicode data, it will be twice as large.

  • ("memory_block_size", size in bytes of memory allocation unit) This is the size of chunk the high-level memory package will request from the low-level allocator; i.e., the basic unit of memory allocation. Default is 64K, minimum is 16K and maximum is 256K.
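
The role of "default_input_encoding" as a fallback can be sketched as follows. This is only an illustration of BOM detection followed by a default, not the XDK's actual detection logic (which, per the description above, also consults the XMLDecl and protocol headers). The function name `demo_sniff_encoding` is hypothetical, and UTF-32 BOMs are deliberately ignored to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the encoding implied by a leading byte order mark, or the
   caller-supplied fallback when no BOM is present. Hypothetical helper,
   not part of the XDK. */
const char *demo_sniff_encoding(const unsigned char *data, size_t len,
                                const char *fallback)
{
    assert(data != NULL || len == 0);
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    return fallback;    /* document is not self-describing at the BOM level */
}
```

A forced "input_encoding" would bypass a check like this entirely, which is one reason the text above discourages its use.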

Syntax

xmlctx *XmlCreate(
   xmlerr *err, 
   oratext *name,
   list);
Parameter In/Out Description
err OUT returned error code
name IN name of context, for debugging
list IN NULL-terminated list of variable arguments

Returns

(xmlctx *) created xmlctx [or NULL on error with err set]
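
The NULL-terminated (property, value) list can be pictured with a small stand-alone parser. Everything here is hypothetical (`demo_props`, `demo_props_parse`, the error codes, and the choice of `unsigned long` for sizes); only the property names, defaults, and limits are taken from the descriptions above. Rejecting out-of-range sizes is an assumption: the text does not say whether the XDK rejects or clamps them.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_PROP_OK = 0, DEMO_PROP_ERR_UNKNOWN = 1, DEMO_PROP_ERR_RANGE = 2 };

typedef struct {
    const char   *data_encoding;      /* default "UTF-8" */
    unsigned long input_buffer_size;  /* in characters, default 256K */
    unsigned long memory_block_size;  /* in bytes, default 64K */
} demo_props;

/* Walks (name, value) pairs until a NULL name is seen. The type of each
   value depends on the property name, so both sides must agree on it. */
int demo_props_parse(demo_props *out, ...)
{
    va_list     ap;
    const char *name;
    int         rc = DEMO_PROP_OK;

    assert(out != NULL);
    out->data_encoding     = "UTF-8";
    out->input_buffer_size = 256UL * 1024;
    out->memory_block_size = 64UL * 1024;

    va_start(ap, out);
    while (rc == DEMO_PROP_OK &&
           (name = va_arg(ap, const char *)) != NULL) {
        if (strcmp(name, "data_encoding") == 0) {
            out->data_encoding = va_arg(ap, const char *);
        } else if (strcmp(name, "input_buffer_size") == 0) {
            unsigned long v = va_arg(ap, unsigned long);
            if (v < 4UL * 1024 || v > 4UL * 1024 * 1024)
                rc = DEMO_PROP_ERR_RANGE;     /* 4K minimum, 4MB maximum */
            else
                out->input_buffer_size = v;
        } else if (strcmp(name, "memory_block_size") == 0) {
            unsigned long v = va_arg(ap, unsigned long);
            if (v < 16UL * 1024 || v > 256UL * 1024)
                rc = DEMO_PROP_ERR_RANGE;     /* 16K minimum, 256K maximum */
            else
                out->memory_block_size = v;
        } else {
            rc = DEMO_PROP_ERR_UNKNOWN;       /* value type unknown: stop */
        }
    }
    va_end(ap);
    return rc;
}
```

A call in this style would read `demo_props_parse(&p, "data_encoding", "UTF-16", "memory_block_size", 32768UL, NULL)`. Because the compiler cannot check variadic arguments, a missing terminating NULL or a value of the wrong type is undefined behavior, which is worth keeping in mind when building property lists for the real functions.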


XmlCreateDTD()

Create DTD.

Syntax

xmldocnode* XmlCreateDTD(
   xmlctx *xctx,
   oratext *qname,
   oratext *pubid,
   oratext *sysid,
   xmlerr *err);
Parameter In/Out Description
xctx IN XML context
qname IN qualified name
pubid IN external subset public identifier
sysid IN external subset system identifier
err OUT returned error code

Returns

(xmldtdnode *) new DTD node


XmlCreateDocument()

Creates the initial top-level DOCUMENT node and its supporting infrastructure. If a qualified name is provided, an element with that name is created and set as the document's root element.

Syntax

xmldocnode* XmlCreateDocument(
   xmlctx *xctx,
   oratext *uri,
   oratext *qname, 
   xmldtdnode *dtd,
   xmlerr *err);
Parameter In/Out Description
xctx IN XML context
uri IN namespace URI of root element to create, or NULL
qname IN qualified name of root element, or NULL if none
dtd IN associated DTD node
err OUT returned error code

Returns

(xmldocnode *) new Document object.


XmlDestroy()

Destroys an xmlctx.

Syntax

void XmlDestroy(
   xmlctx *xctx);
Parameter In/Out Description
xctx IN XML context


See Also:

XmlCreate()


XmlFreeDocument()

Destroys a document created by XmlCreateDocument or through one of the Load functions. Releases all resources associated with the document, which is then invalid.

Syntax

void XmlFreeDocument(
   xmlctx *xctx,
   xmldocnode *doc);
Parameter In/Out Description
xctx IN XML context
doc IN document to free


XmlGetEncoding()

Returns data encoding in use by XML context. Ordinarily, the data encoding is chosen by the user, so this function is not needed. However, if the data encoding is not specified, and allowed to default, this function can be used to return the name of that default encoding.

Syntax

oratext *XmlGetEncoding(
   xmlctx *xctx);
Parameter In/Out Description
xctx IN XML context

Returns

(oratext *) name of data encoding


XmlHasFeature()

Determine if a DOM feature is implemented. Returns TRUE if the feature is implemented in the specified version, FALSE otherwise.

In level 1, the legal values for package are 'HTML' and 'XML' (case-insensitive), and the version is the string "1.0". If the version is not specified, supporting any version of the feature will cause the method to return TRUE.

  • DOM 1.0 features are "XML" and "HTML".

  • DOM 2.0 features are "Core", "XML", "HTML", "Views", "StyleSheets", "CSS", "CSS2", "Events", "UIEvents", "MouseEvents", "MutationEvents", "HTMLEvents", "Range", "Traversal"

Syntax

boolean XmlHasFeature(
   xmlctx *xctx,
   oratext *feature,
   oratext *version);
Parameter In/Out Description
xctx IN XML context
feature IN package name of the feature to test
version IN version number of the package name to test

Returns

(boolean) feature is implemented?
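
The matching rule described above (case-insensitive feature name, optional version) can be sketched as a table lookup. The table contents are purely illustrative and do not describe which features the XDK actually implements; `demo_has_feature` and its helper are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *feature;
    const char *version;
} demo_feature;

/* Illustrative table only; NOT the XDK's real feature list. */
static const demo_feature demo_features[] = {
    { "XML",  "1.0" },
    { "XML",  "2.0" },
    { "Core", "2.0" }
};

/* Case-insensitive comparison of two feature names. */
static bool demo_names_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* A NULL version matches any version of the named feature. */
bool demo_has_feature(const char *feature, const char *version)
{
    size_t i;

    assert(feature != NULL);
    for (i = 0; i < sizeof demo_features / sizeof demo_features[0]; i++) {
        if (!demo_names_equal(feature, demo_features[i].feature))
            continue;
        if (version == NULL || strcmp(version, demo_features[i].version) == 0)
            return true;
    }
    return false;
}
```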


XmlIsSimple()

Returns a flag saying whether the context's data encoding is "simple", that is, uses a single byte for each character, as in ASCII or EBCDIC.

Syntax

boolean XmlIsSimple(
   xmlctx *xctx);
Parameter In/Out Description
xctx IN XML context

Returns

(boolean) TRUE if data encoding is "simple", FALSE otherwise


XmlIsUnicode()

Returns a flag saying whether the context's data encoding is Unicode (UTF-16), with two bytes for each character.

Syntax

boolean XmlIsUnicode(
   xmlctx *xctx);
Parameter In/Out Description
xctx IN XML context

Returns

(boolean) TRUE if data encoding is Unicode, FALSE otherwise


XmlLoadDom()

Loads (parses) an XML document from an input source and creates a DOM. The root document node is returned on success, or NULL on failure (with err set).

The function takes two fixed arguments, the xmlctx and an error return code, then zero or more (property, value) pairs, then NULL.

SOURCE Input source is set by one of the following mutually exclusive properties (choose one):

  • ("uri", document URI) [compiler encoding]

  • ("file", document filesystem path) [compiler encoding]

  • ("buffer", address of buffer, "buffer_length", # bytes in buffer)

  • ("stream", address of stream object, "stream_context", pointer to stream object's context)

  • ("stdio", FILE* stream)

PROPERTIES Additional properties:

  • ("dtd", DTD node) DTD for document

  • ("base_uri", document base URI) for documents loaded from sources other than a URI, sets the effective base URI. the document's base URI is needed in order to resolve relative URIs (include, import, and so on).

  • ("input_encoding", encoding name) forced input encoding [name]

  • ("default_input_encoding", encoding_name) default input encoding to assume if document is not self-describing (no BOM, protocol header, XMLDecl, and so on)

  • ("schema_location", string) schemaLocation of schema for this document. used to figure optimal layout when loading documents into a database

  • ("validate", boolean) when TRUE, turns on DTD validation; by default, only well-formedness is checked. note that schema validation is a separate matter.

  • ("discard_whitespace", boolean) when TRUE, formatting whitespace between elements (newlines and indentation) in input documents is discarded. by default, ALL input characters are preserved.

  • ("dtd_only", boolean) when TRUE, parses an external DTD, not a complete XML document.

  • ("stop_on_warning", boolean) when TRUE, warnings are treated the same as errors and cause parsing, validation, and so on, to stop immediately. by default, warnings are issued but processing continues.

  • ("warn_duplicate_entity", boolean) when TRUE, entities which are declared more than once will cause warnings to be issued. the default is to accept the first declaration and silently ignore the rest.

  • ("no_expand_char_ref", boolean) when TRUE, causes character references to be left unexpanded in the DOM data. ordinarily, character references are replaced by the character they represent. however, when a document is saved, those character references do not reappear. the way to ensure they remain through load and save is to not expand them.

  • ("no_check_chars", boolean) when TRUE, omits the test of XML [2] Char production: all input characters will be accepted as valid
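
For reference, the test that "no_check_chars" turns off corresponds to production [2] of the XML 1.0 specification. A minimal sketch of that check on decoded code points follows; the function names are hypothetical, and the XDK's own implementation is not shown in this chapter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* XML 1.0 production [2]:
   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
          | [#x10000-#x10FFFF] */
bool demo_is_xml_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Returns the index of the first code point that fails the Char test,
   or n if every code point is acceptable. */
size_t demo_first_invalid(const uint32_t *cps, size_t n)
{
    size_t i;

    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!demo_is_xml_char(cps[i]))
            return i;
    }
    return n;
}
```

Skipping this check saves a comparison per character but means control characters, surrogates, and other excluded code points pass through unreported.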

Syntax

xmldocnode *XmlLoadDom(
   xmlctx *xctx, 
   xmlerr *err, 
   list);
Parameter In/Out Description
xctx IN XML context
err OUT returned error code
list IN NULL-terminated list of variable arguments

Returns

(xmldocnode *) document node on success [NULL on failure with err set]


See Also:

XmlSaveDom()


XmlLoadSax()

Loads (parses) an XML document from an input source and generates a set of SAX events (as user callbacks). The input sources and the basic set of properties are the same as for XmlLoadDom.

Syntax

xmlerr XmlLoadSax(
   xmlctx *xctx,
   xmlsaxcb *saxcb,
   void *saxctx, 
   list);
Parameter In/Out Description
xctx IN XML context
saxcb IN SAX callback structure
saxctx IN context passed to SAX callbacks
list IN NULL-terminated list of variable arguments

Returns

(xmlerr) numeric error code, XMLERR_OK [0] on success


XmlLoadSaxVA()

Loads (parses) an XML document from an input source and generates a set of SAX events (as user callbacks). The input sources and the basic set of properties are the same as for XmlLoadDom. This variant takes the property list as a va_list rather than as variable arguments.

Syntax

xmlerr XmlLoadSaxVA(
   xmlctx *xctx, 
   xmlsaxcb *saxcb, 
   void *saxctx, 
   va_list va);
Parameter In/Out Description
xctx IN XML context
saxcb IN SAX callback structure
saxctx IN context passed to SAX callbacks
va IN NULL-terminated list of variable arguments

Returns

(xmlerr) numeric error code, XMLERR_OK [0] on success
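
A va_list entry point is typically useful when the caller is itself a variadic function, since C offers no way to pass `...` through to another `...` function. The sketch below shows that forwarding pattern with hypothetical stand-ins (`demo_pairs_va` plays the role of a va_list worker, `app_load_sax` that of an application wrapper); it does not call the XDK, and all values are assumed to be strings for simplicity.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* va_list worker: walks (name, value) string pairs until a NULL name,
   records the "file" value if present, and returns the pair count. */
int demo_pairs_va(const char **file_out, va_list va)
{
    const char *name;
    int         pairs = 0;

    assert(file_out != NULL);
    *file_out = NULL;
    while ((name = va_arg(va, const char *)) != NULL) {
        const char *value = va_arg(va, const char *);
        if (strcmp(name, "file") == 0)
            *file_out = value;
        pairs++;
    }
    return pairs;
}

/* Application wrapper: does its own bookkeeping, then forwards the
   caller's property list to the va_list worker. */
int app_load_sax(int *calls, const char **file_out, ...)
{
    va_list va;
    int     pairs;

    assert(calls != NULL);
    (*calls)++;                     /* wrapper-specific work */
    va_start(va, file_out);
    pairs = demo_pairs_va(file_out, va);
    va_end(va);
    return pairs;
}
```

Under the same assumption, a wrapper around the real API would call va_start, hand the va_list to XmlLoadSaxVA(), and call va_end afterwards.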


XmlSaveDom()

Serializes document or subtree to the given destination and returns the number of bytes written; if no destination is provided, just returns formatted size but does not output.

If an output encoding is specified, the document will be re-encoded on output; otherwise, it will be in its existing encoding.

The top level is indented step*level spaces, the next level step*(level+1) spaces, and so on.

When saving to a buffer, if the buffer overflows, 0 is returned and err is set to XMLERR_SAVE_OVERFLOW.

DESTINATION Output destination is set by one of the following mutually exclusive properties (choose one):

  • ("uri", document URI) POST, PUT? [compiler encoding]

  • ("file", document filesystem path) [compiler encoding]

  • ("buffer", address of buffer, "buffer_length", # bytes in buffer)

  • ("stream", address of stream object, "stream_context", pointer to stream object's context)

PROPERTIES Additional properties:

  • ("output_encoding", encoding name) name of final encoding for document. unless specified, saved document will be in same encoding as xmlctx.

  • ("indent_step", unsigned) spaces to indent each level of output. default is 4, 0 means no indentation.

  • ("indent_level", unsigned) initial indentation level. default is 0, which means no indentation, flush left.

  • ("xmldecl", boolean) include an XMLDecl in the output document. ordinarily an XMLDecl is output for a complete document (root node is DOC).

  • ("bom", boolean) include a BOM in the output document. usually the BOM is only needed for certain encodings (UTF-16), and optional for others (UTF-8). causes optional BOMs to be output.

  • ("prune", boolean) prunes the output like the unix 'find' command; does not descend to children, just prints the one node given.

Syntax

ubig_ora XmlSaveDom(
   xmlctx *xctx,
   xmlerr *err,
   xmlnode *root,
   list);
Parameter In/Out Description
xctx IN XML context
err OUT error code on failure
root IN root node or subtree to save
list IN NULL-terminated list of variable arguments

Returns

(ubig_ora) number of bytes written to destination


See Also:

XmlLoadDom()


XmlVersion()

Returns the version string for the XDK.

Syntax

oratext *XmlVersion();

Returns

(oratext *) version string