International Language Environments Guide

Code Conversions

The en_US.UTF-8 locale supports various code conversions among the major codesets of several countries through iconv(1) and iconv(3).


Note -

In the Solaris 8 environment, the geniconvtbl utility enables user-defined code conversions. Conversions created with geniconvtbl can be used with both iconv(1) and iconv(3). For more detail on this utility, refer to the geniconvtbl(1) and geniconvtbl(4) man pages.


The available fromcode and tocode names that can be applied to iconv(1) and iconv_open(3) are shown in the following table. For more details on iconv code conversion, see the iconv(1), iconv_open(3), iconv(3), and iconv_close(3) man pages. For more information on available code conversions, see iconv_en_US.UTF-8(5).

Also see Appendix A, iconv Code Conversions.
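
As a rough illustration of what a fromcode/tocode pair means, the following Python sketch re-encodes a byte string from one codeset to another. It relies on Python's built-in codecs rather than on iconv(3), so the codec names used here are Python's and may not match the names accepted by iconv_open(3). The helper name convert is hypothetical.

```python
def convert(data: bytes, fromcode: str, tocode: str) -> bytes:
    """Re-encode data from fromcode to tocode (hypothetical helper).

    Conceptually similar to `iconv -f fromcode -t tocode`, but implemented
    with Python's codecs, not with iconv.
    """
    # Decode the input bytes to Unicode text, then encode into the target codeset.
    return data.decode(fromcode).encode(tocode)


# "é" is the single byte 0xE9 in ISO 8859-1 and the two bytes 0xC3 0xA9 in UTF-8.
latin1_bytes = b"caf\xe9"
utf8_bytes = convert(latin1_bytes, "iso-8859-1", "utf-8")
print(utf8_bytes)  # b'caf\xc3\xa9'
```

Passing a character that the target codeset cannot represent raises UnicodeEncodeError in this sketch. How iconv handles that case is described in its own man pages.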


Note -

UCS-2, UCS-4, and UTF-16 are all fixed-width Unicode/ISO/IEC 10646 representation forms that recognize the Byte Order Mark (BOM) character defined in the Unicode 3.0 and ISO/IEC 10646-1:1999 standards. Other forms, such as UCS-2BE, UCS-4BE, and UTF-16BE, are fixed-width Unicode/ISO/IEC 10646 representation forms that do not recognize the BOM character and assume Big Endian byte ordering. The forms UCS-2LE, UCS-4LE, and UTF-16LE also do not recognize the BOM character, but assume Little Endian byte ordering.
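
The difference between BOM-aware and fixed-byte-order forms can be sketched in Python. This example uses Python's utf-16, utf-16-be, and utf-16-le codecs only as an analogy; it does not call iconv, and the behavior of the Solaris converters should be confirmed against their man pages.

```python
import codecs

text = "A"

# Big-endian UTF-16 bytes for "A", with and without a leading BOM (FE FF).
with_bom = codecs.BOM_UTF16_BE + text.encode("utf-16-be")  # FE FF 00 41
without_bom = text.encode("utf-16-be")                     # 00 41

# "utf-16" inspects the BOM to pick the byte order, then drops it.
print(with_bom.decode("utf-16"))  # A

# "utf-16-be" assumes big-endian order and does not treat FE FF as a BOM,
# so U+FEFF survives as an ordinary character in the result.
print(repr(with_bom.decode("utf-16-be")))  # '\ufeffA'

# Decoding with the wrong fixed byte order yields a different character.
print(hex(ord(without_bom.decode("utf-16-le"))))  # 0x4100
```

In this sketch, choosing a BE or LE form for data that begins with a BOM leaves a stray U+FEFF at the start of the converted text.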



Note -

For associated scripts/languages of ISO 8859-* and KOI8-*, see http://czyborra.com/charsets/iso8859.html.