Common Desktop Environment: Internationalization Programmer's Guide

Code Set Strategy

The common open software environment's code set support is based on International Organization for Standardization (ISO) and industry-standard code sets that satisfy the data processing needs of users.

Each locale in the system defines which code set it uses and how the characters within the code set are manipulated. Because multiple locales can be installed on the system, multiple code sets can be used by different users on the system. While the system can be configured with locales using different code sets, all system utilities assume that the system is running under a single code set.
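As a minimal sketch of how a program might find out which code set a locale uses, the following C fragment combines the standard setlocale() call with the X/Open nl_langinfo(CODESET) query. The helper name locale_codeset is hypothetical, and the spelling of the returned name (for example "ISO8859-1") varies between platforms.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: selects the named locale for character handling
 * and returns the name of its code set, or NULL if the locale is not
 * installed. Passing "" selects the locale from the user's environment.
 */
const char *locale_codeset(const char *locale_name)
{
    /* A NULL name would only query the current locale, not select one. */
    assert(locale_name != NULL);

    if (setlocale(LC_CTYPE, locale_name) == NULL) {
        return NULL;            /* requested locale is unavailable */
    }

    /* The returned string may be overwritten by later calls. */
    return nl_langinfo(CODESET);
}
```

A program would typically call locale_codeset("") once at startup and treat the result as informational only, leaving character handling to the library routines.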

Most commands have no knowledge of the underlying code set being used by the locale. The knowledge of code sets is hidden by the code-set-independent library subroutines (Internationalization libraries), which pass information to the code-set-dependent subroutines.
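The following sketch illustrates what code-set-independent code can look like in practice, assuming only standard C library calls. The function name count_characters is hypothetical. It counts characters rather than bytes by letting mblen() decide how many bytes each character occupies in the current locale, so the function itself contains no knowledge of any particular code set.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns the number of characters in s as
 * interpreted by the current locale's code set, or -1 if s contains
 * a byte sequence that is not valid in that code set.
 */
long count_characters(const char *s)
{
    size_t remaining;
    long count = 0;

    assert(s != NULL);
    remaining = strlen(s);

    (void)mblen(NULL, 0);       /* reset any shift state */

    while (remaining > 0) {
        int n = mblen(s, remaining);
        if (n <= 0) {
            return -1;          /* invalid or incomplete character */
        }
        s += n;
        remaining -= (size_t)n;
        count++;
    }
    return count;
}
```

The caller is expected to have selected a locale first, for example with setlocale(LC_CTYPE, ""). In the default "C" locale each byte of an ASCII string counts as one character.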

Because many programs rely on ASCII, all supported code sets include the 7-bit ASCII code set as a proper subset. Because the 7-bit ASCII code set is common to all of them, its characters are sometimes referred to as the portable character set.

The 7-bit ASCII code set is based on the ISO 646 definition and contains the control characters, punctuation characters, digits (0-9), and the English alphabet in uppercase and lowercase.
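Because the 7-bit ASCII characters occupy the values 0 through 127, a program can test whether a string stays within that range by checking that no byte has its high bit set. The sketch below shows one way to do this. The function name is_seven_bit_ascii is hypothetical, and the check says nothing about how bytes outside the range should be interpreted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if every byte of s lies in the
 * 7-bit ASCII range (0x00 through 0x7F), and 0 otherwise.
 */
int is_seven_bit_ascii(const char *s)
{
    const unsigned char *p;

    assert(s != NULL);

    /* Use unsigned char so that high-bit bytes do not compare negative. */
    for (p = (const unsigned char *)s; *p != '\0'; p++) {
        if (*p > 0x7F) {
            return 0;
        }
    }
    return 1;
}
```

For example, is_seven_bit_ascii("Hello, world") returns 1, while a string containing a byte such as 0xE9 returns 0 and must be interpreted according to the code set of the current locale.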