The Traditional Chinese, Simplified Chinese, Japanese, and Korean writing systems need far more ideographs than a single byte can represent. These languages are therefore often called double-byte or multibyte languages, depending on the platform architecture. The Solaris operating environment supports multibyte encodings, which represent each character in one, two, or more bytes.
You do not need to develop separate software versions for multibyte locales in the Solaris operating environment. However, some issues are unique to multibyte locale development. Most importantly, one character is not necessarily one byte in a multibyte locale.
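The gap between character count and byte count can be seen in a minimal sketch. It uses Python's built-in codecs (`euc_jp`, `euc_kr`, `gb2312`, `big5`) as stand-ins for the corresponding Asian locale encodings. The codec names are Python's, not Solaris locale names, and the helper names `char_and_byte_counts` and `truncate_to_bytes` are hypothetical. A Solaris application written in C would more typically rely on the standard multibyte functions such as `mblen` and `mbstowcs`.

```python
import codecs


def char_and_byte_counts(text, encoding):
    """Return (number of characters, number of encoded bytes) for text."""
    return len(text), len(text.encode(encoding))


def truncate_to_bytes(text, encoding, max_bytes):
    """Cut text to at most max_bytes encoded bytes without splitting a character."""
    data = text.encode(encoding)[:max_bytes]
    decoder = codecs.getincrementaldecoder(encoding)()
    # With final=False the decoder holds back a trailing partial character
    # instead of raising an error, so only whole characters are returned.
    return decoder.decode(data, final=False)


if __name__ == "__main__":
    # Sample strings chosen for illustration only.
    samples = {
        "euc_jp": "日本語",
        "euc_kr": "한국어",
        "gb2312": "简体中文",
        "big5": "繁體中文",
    }
    for encoding, text in samples.items():
        print(encoding, char_and_byte_counts(text, encoding))
    # Cutting a 6-byte string at byte 3 would split the second character,
    # so only the first character survives.
    print(truncate_to_bytes("日本語", "euc_jp", 3))
```

Code that sizes buffers, truncates strings, or moves a cursor by counting bytes is the usual source of bugs here. Counting characters and cutting only on character boundaries, as the second helper does, avoids them.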
All localized versions of Solaris software are supersets of the U.S. English version and contain the same utilities and features. The difference between the U.S. English and a localized version is the addition of locale-specific data and tools facilitating input, display, and printing of local-language characters. All Asian versions of Solaris software include a locale database, user interface, and other locale-specific features. For example, Figure 4-1 shows how the locale database fits into the Japanese Solaris architecture.
Specific features were also added to the Traditional Chinese, Simplified Chinese, Japanese, and Korean localized versions of the Solaris operating environment to address the following issues:
The thousands of characters used in everyday communication
Ideographs with multiple meanings depending on context or pronunciation
Entering thousands of characters is a central problem for any multibyte language, because designing a keyboard with enough keys is simply not feasible. Instead, localized Solaris operating environments use input methods. Input methods (IMs) are system applications that convert keyboard input into system-supported characters. Figure 4-2 shows how an input method works.
Generally, the Motif text widget manages the input method. However, to customize the input method or have direct control, call the X11 XIM (X Input Method) APIs.
An application cannot assume a one-to-one mapping between a keystroke and a character. A single character may require more than one keystroke, and a single key-input event may trigger the input of more than one character.
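The lack of a one-to-one mapping can be illustrated with a toy model. This is a simplified sketch, not the XIM API: the class `ToyInputMethod`, the helper `committed_per_event`, and the two-entry romaji table are all hypothetical, and a real input method uses far larger dictionaries and richer state.

```python
# Hypothetical romaji-to-kana table, for illustration only.
ROMAJI_TO_KANA = {"ka": "か", "na": "な"}


class ToyInputMethod:
    """Toy model: several keystrokes build one character, Enter commits all."""

    def __init__(self):
        self.pending = ""  # Latin letters typed but not yet converted
        self.preedit = ""  # converted characters not yet committed

    def key_event(self, key):
        """Process one key event and return the text it commits (may be empty)."""
        if key == "\n":
            # One key event commits every character composed so far.
            committed = self.preedit
            self.preedit = ""
            self.pending = ""
            return committed
        self.pending += key
        if self.pending in ROMAJI_TO_KANA:
            # Two keystrokes have produced a single character.
            self.preedit += ROMAJI_TO_KANA[self.pending]
            self.pending = ""
        return ""


def committed_per_event(keys):
    """Return the text committed by each key event in keys, in order."""
    input_method = ToyInputMethod()
    return [input_method.key_event(key) for key in keys]


if __name__ == "__main__":
    # Four letter keys commit nothing; the final Enter commits two characters.
    print(committed_per_event("kana\n"))
```

The point for application code is to consume the committed strings that the input method delivers, rather than to interpret individual key events as characters.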
In Chinese, Japanese, and Korean, more than one ideograph can correspond to an input string. To avoid confusion, the Solaris operating environment uses a Conversion Manager to display the possible dictionary choices in a window, as shown in Figure 4-3. The pre-edit, status, and lookup choice areas are highlighted for the sample Simplified Chinese input method.
Pre-edit area--displays characters as entered
Status area--displays whether conversion is activated and the state or mode of the input method
Lookup choice area--displays ideographic choices for the corresponding phonetic representation
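The relationship between the three areas can be sketched as a small state object. This is an illustrative model under stated assumptions, not the Conversion Manager's implementation: the class `ConversionState`, its methods, and the one-entry pinyin dictionary are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical pinyin dictionary: one phonetic string, several ideographs.
PINYIN_CANDIDATES = {"ma": ["妈", "麻", "马", "骂", "吗"]}


@dataclass
class ConversionState:
    conversion_on: bool = True                      # shown in the status area
    preedit: str = ""                               # shown in the pre-edit area
    candidates: list = field(default_factory=list)  # shown in the lookup choice area

    def status(self):
        """Text for the status area."""
        return "conversion on" if self.conversion_on else "conversion off"

    def type_text(self, text):
        """Handle typed text and return whatever is committed immediately."""
        if not self.conversion_on:
            return text  # conversion off: input passes through unchanged
        self.preedit += text
        # Copy the list so later changes never touch the dictionary itself.
        self.candidates = list(PINYIN_CANDIDATES.get(self.preedit, []))
        return ""

    def choose(self, index):
        """Commit the candidate at index and clear the pre-edit and lookup areas."""
        chosen = self.candidates[index]
        self.preedit = ""
        self.candidates = []
        return chosen


if __name__ == "__main__":
    state = ConversionState()
    state.type_text("ma")
    print(state.status(), state.preedit, state.candidates)
    print(state.choose(2))
```

Nothing is committed to the application until the user picks a candidate, which is why the pre-edit text and the lookup choices are kept as separate pieces of state.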
Input methods and associated dictionaries are often referred to as language engines.
For more information on how to input Asian characters, see Section 3.1 in Unicode Support in the Solaris Operating Environment.
An input-method server (IM server) acts as the interface between input methods and applications as shown in Figure 4-4.
The IM server can support multiple language engines and provides user control over language-engine preferences, such as:
Method of displaying the status string when the portion of the string under consideration for conversion loses input focus.
Number of rows and columns in the input conversion candidate pop-up window.
Whether the input conversion candidate selection window is displayed.
Many X toolkit-based applications automatically use the IM server for Asian text input. If you use any of Sun's toolkits (Motif, XView™, or OLIT), the input/output conversion process is transparent to the application.
The Font Editor is used to edit bitmap fonts. For example, because the repertoire of Han characters is so large, a user may need a character that the operating system does not support. Using the Font Editor, you can create new characters and modify existing ones.
To start the Font Editor, type fontedit at the system prompt.
The User-Defined Character Tool is used to create new characters as well as to specify font size for new characters. This utility can support both bitmap and Type 1 fonts.
To start the User-Defined Character Tool, type sdtudctool at the system prompt.