International Language Environments Guide

Unicode Locale: en_US.UTF-8 Support Overview

The Unicode/UTF-8 locales support Unicode 3.1. The en_US.UTF-8 locale provides multiscript processing support by using UTF-8 as its codeset. It handles input and output text in multiple scripts, and it was the first locale with this capability in the Solaris operating environment. Other UTF-8 locales have similar capabilities, so the discussion of en_US.UTF-8 that follows applies equally to them.


Note –

UTF-8 is a file-system safe Universal Character Set Transformation Format of Unicode and ISO/IEC 10646-1. It was formulated by the X/Open-Uniforum Joint Internationalization Working Group (XoJIG) in 1992 and approved by ISO and IEC in 1996 as Amendment 2 to ISO/IEC 10646-1:1993. The Unicode Consortium, the International Organization for Standardization, and the International Electrotechnical Commission have adopted this standard as part of Unicode 2.0 and ISO/IEC 10646-1.
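The "file-system safe" property can be checked directly. In UTF-8, every byte of a multibyte sequence has its high bit set, so none of those bytes can be mistaken for NUL (0x00) or the path separator "/" (0x2F). The following is a minimal sketch in Python, not part of the Solaris software; the sample string and the helper name utf8_bytes_per_char are chosen only for illustration.

```python
# Illustrative sketch: the sample text and helper name are assumptions,
# not taken from the Solaris documentation.

def utf8_bytes_per_char(text):
    """Return a list of (character, UTF-8 encoded bytes) pairs."""
    return [(ch, ch.encode("utf-8")) for ch in text]

# One character each from Latin, Greek, Hebrew, Devanagari, Thai, and CJK,
# plus U+20000, a supplementary-plane character added in Unicode 3.1.
sample = "A\u03a9\u05d0\u0939\u0e01\u4e2d\U00020000"

pairs = utf8_bytes_per_char(sample)
for ch, encoded in pairs:
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(f"U+{ord(ch):04X} -> {hex_bytes}")

# File-system safety: every byte of a multibyte sequence is >= 0x80,
# so it never collides with NUL (0x00) or '/' (0x2F).
multibyte = [b for _, enc in pairs if len(enc) > 1 for b in enc]
fs_safe = all(b >= 0x80 for b in multibyte)
print("file-system safe:", fs_safe)
```

The encoded length grows from one byte for ASCII to four bytes for the supplementary-plane character, while ASCII text itself is left unchanged.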


Unicode locales in Solaris support the processing of every code point value that is defined in Unicode 3.1 and ISO/IEC 10646-1 and 10646-2. Supported scripts include not only pan-European scripts and Asian scripts but also complex text layout scripts such as Arabic, Hebrew, Hindi, and Thai. Due to limited font resources, the Solaris 9 software includes only character glyphs from the following character sets:

If you try to view characters for which the en_US.UTF-8 locale does not have corresponding glyphs, the locale displays a “no-glyph” glyph instead, as shown in the following illustration:

[Illustration: the “no-glyph” glyph]

The locale is selectable at installation time and may be designated as the system default locale.

The same level of en_US.UTF-8 locale support is provided for both 64-bit and 32-bit Solaris systems.
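An application can also select the locale for itself at startup rather than rely on the system default. The following is a minimal sketch in Python, assuming a UNIX-like system on which en_US.UTF-8 may or may not be installed; the helper name try_locale is hypothetical. It attempts to select the locale and, on success, reports the codeset in effect.

```python
import locale

def try_locale(name="en_US.UTF-8"):
    """Try to select the named locale for all categories (LC_ALL).

    Returns the locale string reported by setlocale() on success,
    or None if the locale is not available on this system.
    """
    try:
        return locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        return None

selected = try_locale()
if selected is None:
    print("en_US.UTF-8 is not available; keeping the current locale")
else:
    # nl_langinfo(CODESET) reports the codeset of the active locale
    # (available on UNIX-like systems only).
    print("locale:", selected, "codeset:", locale.nl_langinfo(locale.CODESET))
```

Returning None instead of raising lets the caller fall back to the current locale when en_US.UTF-8 is not installed.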


Note –

Motif and CDE desktop applications and libraries support the en_US.UTF-8 locale. However, XView™ and OLIT libraries do not support the en_US.UTF-8 locale.