The Java EE 5 Tutorial

Further Information about Character Encoding

The character set and encoding names recognized by Internet authorities are listed in the IANA character set registry at http://www.iana.org/assignments/character-sets.

The Java programming language represents characters internally using the Unicode character set, which provides support for most languages. For storage and transmission over networks, however, many other character encodings are used. The Java 2 platform therefore also supports character conversion to and from other character encodings. Every Java runtime must support the Unicode transformations UTF-8, UTF-16, UTF-16BE, and UTF-16LE as well as the US-ASCII and ISO-8859-1 character encodings, but most implementations support many more. For a complete list of the encodings supported by the Java 2 platform, see http://java.sun.com/javase/6/docs/technotes/guides/intl/encoding.doc.html.
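
The conversion described above can be illustrated with a minimal sketch. It assumes Java SE 6 or later and uses only java.nio.charset.Charset and java.lang.String from the standard library. The class name EncodingSketch, the helper methods encode and decode, and the sample text are hypothetical names chosen for illustration; they are not part of the tutorial. The sketch checks that some of the encodings named above are supported, converts a short Unicode string to bytes in each one, and converts the bytes back.

```java
import java.nio.charset.Charset;

// Hypothetical illustration class; not part of the tutorial's examples.
class EncodingSketch {

    // Some of the encodings that the text above names as always available.
    static final String[] NAMES = {"UTF-8", "UTF-16BE", "UTF-16LE", "ISO-8859-1"};

    // Convert Unicode text to bytes in the named encoding.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Convert bytes in the named encoding back to Unicode text.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9 (e with acute accent), written as an
        // escape so the source file's own encoding does not matter.
        String text = "caf\u00e9";

        for (String name : NAMES) {
            byte[] bytes = encode(text, name);
            System.out.println(name
                + ": supported=" + Charset.isSupported(name)
                + ", bytes=" + bytes.length
                + ", roundTrip=" + text.equals(decode(bytes, name)));
        }

        // The number of additional encodings varies by implementation.
        System.out.println("Charsets available on this runtime: "
            + Charset.availableCharsets().size());
    }
}
```

The same four-character string occupies a different number of bytes in each encoding: 5 in UTF-8, 8 in UTF-16BE and UTF-16LE, and 4 in ISO-8859-1. This is why the sender and receiver of text must agree on the encoding in use.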