To avoid conversion issues, Oracle Solaris locales use the UTF-8 encoding form of the Unicode character set, described in UTF-8 Overview. Every supported language has a UTF-8 locale, which is the preferred and supported form.
For historical, technical, and legal reasons, non-UTF-8 locales are also available in Oracle Solaris – the C locale, legacy single-byte (8-bit) ISO locales for EMEA languages, and traditional locales for APAC languages.
Single-byte character sets were popular in the past because they use just one byte (8 bits) to represent each character. Because such a set can hold at most 256 characters, different languages must use different character sets. This causes many problems: a file created in one character set is often unreadable in another, multilanguage documents are difficult to represent, and many languages have more characters than a single byte can represent. For these languages, such as Chinese, various traditional multibyte character sets were created.
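The following Python sketch illustrates the problem described above. It is not taken from Oracle Solaris; the helper name decode_in_charsets and the sample characters are assumptions chosen for illustration. It decodes the same single byte under three ISO 8859 character sets, then shows that UTF-8 can hold all of the resulting characters, plus a Chinese character, in one string.

```python
import unicodedata


def decode_in_charsets(raw, charsets):
    """Decode the same bytes under each named character set.

    Hypothetical helper for illustration only; not an Oracle Solaris API.
    """
    return {name: raw.decode(name) for name in charsets}


# A single byte (0xE9) written by a program that uses ISO 8859-1.
raw = "é".encode("iso8859_1")

# The same byte read back under Western European, Cyrillic, and Greek sets.
results = decode_in_charsets(raw, ["iso8859_1", "iso8859_5", "iso8859_7"])
for name, text in results.items():
    # Print the code point and character name so the output stays ASCII.
    print(f"{name}: U+{ord(text):04X} {unicodedata.name(text)}")

# UTF-8 represents all of these characters, and Chinese, in one document.
mixed = "é щ ι 中"
utf8_bytes = mixed.encode("utf-8")
print(len(mixed), "characters ->", len(utf8_bytes), "bytes in UTF-8")

# A single-byte set cannot represent the Chinese character at all.
try:
    "中".encode("iso8859_1")
    representable = True
except UnicodeEncodeError:
    representable = False
print("Representable in ISO 8859-1:", representable)
```

The byte value never changes; only the character set used to interpret it does, which is why a file written in one legacy locale can appear garbled in another.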
The non-UTF-8 locales, also called legacy or traditional locales, have limited support in Oracle Solaris 11.4. These limited-support locales are not installed by default. Localization that exists for a UTF-8 locale might not be available in the corresponding non-UTF-8 locale.
Locale facets also need to be set correctly. For more information, see Locale Facets.