Declaring the Character Set in Use

If your XML, XSD, or DTD document uses a character set other than the default, UTF-8, you must declare it with the encoding attribute in the XML declaration:

	<?xml version="1.0" encoding="US-ASCII"?>

Supported character sets include, but are not limited to, ASCII, UTF-8, UTF-16 (big-endian or little-endian), UCS-4 (big-endian or little-endian), the EBCDIC code pages IBM037 and IBM1140, ISO-8859-1, and Windows-1252. The XML parser can parse input XML files in any of these encodings.
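To make the relationship between the declared encoding and the actual bytes of the file concrete, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree parser as a stand-in, not the parser this document describes, and the helper name build_document is hypothetical. The point it illustrates is general: the bytes of the file must really be in the encoding that the declaration names.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_document(text: str, encoding: str) -> bytes:
    """Return a small XML document whose bytes match its declared encoding.

    Hypothetical helper for illustration only.
    """
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    body = f"<note>{escape(text)}</note>"
    # Encode the whole document with the same character set that is declared.
    return (declaration + body).encode(encoding)


# "é" is a single byte (0xE9) in ISO-8859-1, so a parser needs the
# declaration to interpret it correctly.
raw = build_document("café", "ISO-8859-1")
root = ET.fromstring(raw)
print(root.text)
```

If the declaration were omitted, a parser would assume the UTF-8 default and the lone 0xE9 byte would not be valid input.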

The following encodings can be used in the XML declaration:

  • US-ASCII

  • UTF-8

  • ISO-10646-UCS-4

  • ebcdic-cp-us

  • ibm1140

  • ISO-8859-1

  • windows-1252

In the XML declaration, the encoding attribute must appear after the version attribute. For example: <?xml version="1.0" encoding="US-ASCII"?>
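The ordering rule can be checked with any conforming XML parser. The following sketch again assumes Python's standard xml.etree.ElementTree parser rather than the parser described here, and is_well_formed is a hypothetical helper name. It shows a declaration with the attributes in the required order being accepted and one with the order reversed being rejected.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: bytes) -> bool:
    """Return True if the stand-in parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# version first, then encoding: the required order.
valid = b'<?xml version="1.0" encoding="US-ASCII"?><note/>'

# encoding before version: not a well-formed XML declaration.
reversed_order = b'<?xml encoding="US-ASCII" version="1.0"?><note/>'

print(is_well_formed(valid))
print(is_well_formed(reversed_order))
```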

The output can be in one of the following XML encodings:

  • UTF-8

  • UTF-16

  • Local Code Page
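As a rough illustration of what the three output choices mean in practice, the sketch below serializes the same element in UTF-8, in UTF-16, and in the local code page. It uses Python's xml.etree.ElementTree (Python 3.8 or later for the xml_declaration argument), not the product's own serializer. Treating "Local Code Page" as the value returned by locale.getpreferredencoding is an assumption made for this example.

```python
import locale
import xml.etree.ElementTree as ET

root = ET.Element("note")
root.text = "café"

# UTF-8 output, with an explicit XML declaration.
utf8_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)

# UTF-16 output; the serializer adds the declaration itself because the
# encoding is not the UTF-8 default.
utf16_bytes = ET.tostring(root, encoding="utf-16")

# Assumption: the "local code page" is whatever the platform reports as
# its preferred encoding (for example cp1252 on many Windows systems).
local_encoding = locale.getpreferredencoding(False)
local_bytes = ET.tostring(root, encoding=local_encoding, xml_declaration=True)

# The UTF-8 and UTF-16 outputs parse back to the same text.
print(ET.fromstring(utf8_bytes).text)
print(ET.fromstring(utf16_bytes).text)
print(local_encoding, len(local_bytes))
```

Characters that the local code page cannot represent are written as numeric character references by this serializer, so the output stays valid XML even when the code page is narrow.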