International Language Environments Guide

Unicode Locale: en_US.UTF-8 Support

The Unicode/UTF-8 locales support Unicode 4.0. The en_US.UTF-8 locale provides multiscript processing support by using UTF-8 as its codeset. This locale handles input and output of text in multiple scripts, and was the first locale with this capability in the Oracle Solaris operating system. The other UTF-8 locales offer similar capabilities, so the discussion of en_US.UTF-8 that follows applies equally to them.
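As a minimal sketch of how an application might select this locale and confirm its codeset, the following Python fragment switches LC_CTYPE to a named locale and reports the codeset. The helper name codeset_for is hypothetical and is not part of Oracle Solaris. The sketch assumes a POSIX system where Python's locale.nl_langinfo is available; whether en_US.UTF-8 is installed depends on the system.

```python
import locale


def codeset_for(locale_name):
    """Return the codeset reported for locale_name, or None if it is unavailable.

    Hypothetical helper for illustration only.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        # The requested locale is not installed on this system.
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the previous locale


if __name__ == "__main__":
    print(codeset_for("en_US.UTF-8"))
```

On a system where the locale is installed, the reported codeset is expected to be UTF-8. A result of None only means that the locale is not available there.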

Note –

UTF-8 is a file-system safe Universal Character Set Transformation Format of Unicode and ISO/IEC 10646-1. It was formulated by the X/Open-Uniforum Joint Internationalization Working Group (XoJIG) in 1992 and approved by ISO and IEC in 1996 as Amendment 2 to ISO/IEC 10646-1:1993. The Unicode Consortium, the International Organization for Standardization (ISO), and the International Electrotechnical Commission (IEC) have adopted this format as a part of Unicode 4.0 and ISO/IEC 10646-1.
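The "file-system safe" property can be illustrated with a short sketch. In UTF-8, ASCII characters keep their single-byte values, and every byte of a multibyte sequence has its high bit set, so bytes such as NUL (0x00) and the path separator "/" (0x2F) never appear inside the encoding of a non-ASCII character. The function name is_file_system_safe and the sample string are illustrative assumptions, not part of any Oracle Solaris interface.

```python
def is_file_system_safe(text):
    """Check the UTF-8 byte-level properties described above for each character."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # ASCII characters must encode to the same single byte.
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 0x80 for b in encoded):
            # A multibyte sequence must not contain any ASCII-range byte.
            return False
    return True


# Latin, Thai, Hebrew, and Han characters in one string.
sample = "caf\u00e9 \u0e01\u0e34 \u05d0 \u4e2d"
print(is_file_system_safe(sample))                          # True
print([len(ch.encode("utf-8")) for ch in "A\u00e9\u4e2d"])  # [1, 2, 3]
```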

Unicode locales in the Oracle Solaris environment support the processing of every code point value that is defined in Unicode 4.0 and ISO/IEC 10646-1 and 10646-2. Supported scripts include pan-European and Asian scripts and also complex text layout scripts for the Arabic, Hebrew, Indic, and Thai languages.
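To suggest why the Arabic, Hebrew, Indic, and Thai scripts need complex text layout, the following sketch looks up two Unicode character properties: the bidirectional class and the general category. It uses Python's standard unicodedata module, which tracks a newer Unicode version than 4.0, and the character selection and the helper name layout_properties are illustrative assumptions. Right-to-left letters and nonspacing combining marks are the kinds of characters that a simple left-to-right, one-glyph-per-character renderer cannot handle.

```python
import unicodedata

# One representative character per script (illustrative choice).
samples = {
    "Latin capital A": "\u0041",
    "Hebrew alef": "\u05d0",
    "Arabic alef": "\u0627",
    "Thai sara i": "\u0e34",
    "Devanagari virama": "\u094d",
}


def layout_properties(ch):
    """Return (bidirectional class, general category) for one character."""
    return unicodedata.bidirectional(ch), unicodedata.category(ch)


for name, ch in samples.items():
    print(name, layout_properties(ch))
```

In this sample, the Hebrew and Arabic letters report the right-to-left classes R and AL, while the Thai vowel sign and the Devanagari virama report NSM (nonspacing mark) with category Mn.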

Note –

Some Unicode locales, notably the Asian locales, include more Kanji or Hanzi glyphs.

Due to limited font resources, the current Oracle Solaris Unicode locales include character glyphs from the following character sets.

If you try to view characters for which the en_US.UTF-8 locale does not have corresponding glyphs, the locale displays a no-glyph glyph instead, as shown in the following illustration:

[Illustration: the no-glyph glyph that is displayed in place of a character with no available glyph]

The locale is selectable at installation time and may be designated as the system default locale.

The same level of en_US.UTF-8 locale support is provided for both 64-bit and 32-bit Oracle Solaris systems.

Note –

Motif and CDE desktop applications and libraries support the en_US.UTF-8 locale. However, XView™ and OLIT libraries do not support the en_US.UTF-8 locale.