This section provides definitions for terms and concepts used for internationalized text input and a brief overview of the intended use of the mechanisms provided by Xlib.
A large number of languages in the world use alphabets consisting of a small set of symbols (letters) to form words. To enter text into a computer in an alphabetic language, a user usually has a keyboard on which there are key symbols corresponding to the alphabet. Sometimes, a few characters of an alphabetic language are missing on the keyboard. Many computer users who speak a Latin-alphabet-based language only have an English-based keyboard. They need to press a combination of keystrokes to enter a character that does not exist directly on the keyboard. A number of algorithms have been developed for entering such characters, known as European input methods, the compose input method, or the dead-keys input method.
Japanese is an example of a language with a phonetic symbol set, where each symbol represents a specific sound. There are two phonetic symbol sets in Japanese: Katakana and Hiragana. In general, Katakana is used for words that are of foreign origin, and Hiragana for writing native Japanese words. Collectively, the two systems are called Kana. Hiragana consists of 83 characters; Katakana, 86 characters.
Korean also has a phonetic symbol set, called Hangul. Each of the 24 basic phonetic symbols (14 consonants and 10 vowels) represent a specific sound. A syllable is composed of two or three parts: the initial consonants, the vowels, and the optional last consonants. With Hangul, syllables can be treated as the basic units on which text processing is done. For example, a delete operation may work on a phonetic symbol or a syllable. Korean code sets include several thousands of these syllables. A user types the phonetic symbols that make up the syllables of the words to be entered. The display may change as each phonetic symbol is entered. For example, when the second phonetic symbol of a syllable is entered, the first phonetic symbol may change its shape and size. Likewise, when the third phonetic symbol is entered, the first two phonetic symbols may change their shape and size.
Not all languages rely solely on alphabetic or phonetic systems. Some languages, including Japanese and Korean, employ an ideographic writing system. In an ideographic system, rather than taking a small set of symbols and combining them in different ways to create words, each word consists of one unique symbol (or, occasionally, several symbols). The number of symbols may be very large: approximately 50,000 have been identified in Hanzi, the Chinese ideographic system.
There are two major aspects of ideographic systems for their computer usage. First, the standard computer character sets in Japan, China, and Korea include roughly 8,000 characters, while sets in Taiwan have between 15,000 and 30,000 characters, which make it necessary to use more than one byte to represent a character. Second, it is obviously impractical to have a keyboard that includes all of a given language's ideographic symbols. Therefore a mechanism is required for entering characters so that a keyboard with a reasonable number of keys can be used. Those input methods are usually based on phonetics, but there are also methods based on the graphical properties of characters.
In Japan, both Kana and Kanji are used. In Korea, Hangul and sometimes Hanja are used. Now, consider entering ideographs in Japan, Korea, China, and Taiwan.
In Japan, either Kana or English characters are entered and a region is selected (sometimes automatically) for conversion to Kanji. Several Kanji characters can have the same phonetic representation. If that is the case, with the string entered, a menu of characters is presented and the user must choose the appropriate option. If no choice is necessary or a preference has been established, the input method does the substitution directly. When Latin characters are converted to Kana or Kanji, it is called a Romaji conversion.
In Korea, it is usually acceptable to keep Korean text in Hangul form, but some people may choose to write Hanja-originated words in Hanja rather than in Hangul. To change Hangul to Hanja, a region is selected for conversion and the user follows the same basic method as described for Japanese.
Probably because there are well-accepted phonetic writing systems for Japanese and Korean, computer input methods in these countries for entering ideographs are fairly standard. Keyboard keys have both English characters and phonetic symbols engraved on them, and the user can switch between the two sets.
The situation is different for Chinese. While there is a phonetic system called Pinyin promoted by authorities, there is no consensus for entering Chinese text. Some vendors use a phonetic decomposition (Pinyin or another), others use ideographic decomposition of Chinese words, with various implementations and keyboard layouts. There are about 16 known methods, none of which is a clear standard.
Also, there are actually two ideographic sets used: Traditional Chinese (the original written Chinese) and Simplified Chinese. Several years ago, the People's Republic of China launched a campaign to simplify some ideographic characters and eliminate redundancies altogether. Under the plan, characters would be streamlined every five years. Characters have been revised several times now, resulting in the smaller, simpler set that makes up Simplified Chinese.
As shown in the previous section, there are many different input methods used today, each varying with language, culture, and history. A common feature of many input methods is that the user can type multiple keystrokes to compose a single character (or set of characters). The process of composing characters from keystrokes is called preediting. It may require complex algorithms and large dictionaries involving substantial computer resources.
Input methods may require one or more areas in which to show the feedback of the actual keystrokes, to show ambiguities to the user, to list dictionaries, and so on. The following are the input method areas of concern.
Intended to be a logical extension of the light-emitting diodes (LEDs) that exist on the physical keyboard. It is a window that is intended to present the internal state of the input method that is critical to the user. The status area may consist of text data and bitmaps or some combination.
Intended to display the intermediate text for those languages that are composing prior to the client handling the data.
Used for pop-up menus and customizing dialog boxes that may be required for an input method. There may be multiple auxiliary areas for any input method. Auxiliary areas are managed by the input method independent of the client. Auxiliary areas are assumed to be a separate dialog that is maintained by the input method.
Data is displayed directly in the application window. Application data is moved to allow preedit data to be displayed at the point of insertion.
Data is displayed in a preedit window that is placed over the point of insertion.
Preedit window is displayed inside the application window but not at the point of insertion. Often, this type of window is placed at the bottom of the application window.
Preedit window is the child of RootWindow.
It would require a lot of computing resources if portable applications had to include input methods for all the languages in the world. To avoid this, a goal of the Xlib design is to allow an application to communicate with an input method placed in a separate process. Such a process is called an input server. The server to which the application should connect is dependent on the environment when the application is started up: what the user language is and the actual encoding to be used for it. The input method connection is said to be locale-dependent. It is also user-dependent; for a given language, the user can choose, to some extent, the user-interface style of input method (if there are several choices).
Using an input server implies communications overhead, but applications can be migrated without relinking. Input methods can be implemented either as a token communicating to an input server or as a local library.
The abstraction used by a client to communicate with an input method is an opaque data structure represented by the XIM data type. This data structure is returned by the XOpenIM() function, which opens an input method on a given display. Subsequent operations on this data structure encapsulate all communication between client and input method. There is no need for an X client to use any networking library or natural language package to use an input method.
A single input server can be used for one or more languages, supporting one or more encoding schemes. But the strings returned from an input method are always encoded in the (single) locale associated with the XIM object.
Xlib provides the ability to manage a multithreaded state for text input. A client may be using multiple windows, each window with multiple text entry areas, with the user possibly switching among them at any time. The abstraction for representing the state of a particular input thread is called an input context. The Xlib representation of an input context is an XIC. See Figure 5-1 for an illustration.
An input context is the abstraction retaining the state, properties, and semantics of communication between a client and an input method. An input context is a combination of an input method, a locale specifying the encoding of the character strings to be returned, a client window, internal state information, and various layout or appearance characteristics. The input context concept somewhat matches for input the graphics context abstraction defined for graphics output.
One input context belongs to exactly one input method. Different input contexts can be associated with the same input method, possibly with the same client window. An XIC is created with the XCreateIC() function, providing an XIM argument, affiliating the input context to the input method for its lifetime. When an input method is closed with the XCloseIM() function, no affiliated input contexts should be used again (and should preferably be deleted before closing the input method).
Considering the example of a client window with multiple text entry areas, the application programmer can choose to implement the following:
As many input contexts are created as text-entry areas. The client can get the input accumulated on each context each time it looks up that context.
A single context is created for a top-level window in the application. If such a window contains several text-entry areas, each time the user moves to another text-entry area, the client has to indicate changes in the context.
Application designers can choose a range of single or multiple input contexts, according to the needs of their applications.
To obtain characters from an input method, a client must call the XmbLookupString() function or XwcLookupString() function with an input context created from that input method. Both a locale and display are bound to an input method when they are opened, and an input context inherits this locale and display. Any strings returned by the XmbLookupString() or XwcLookupString() function are encoded in that locale.
For each text-entry area in which the XmbLookupString() or XwcLookupString() function is used, there is an associated input context.
When the application focus moves to a text-entry area, the application must set the input context focus to the input context associated with that area. The input context focus is set by calling the XSetICFocus() function with the appropriate input context.
Also, when the application focus moves out of a text-entry area, the application should unset the focus for the associated input context by calling the XUnsetICFocus() function. As an optimization, if the XSetICFocus() function is called successively on two different input contexts, setting the focus on the second automatically unsets the focus on the first.
To set and unset the input context focus correctly, it is necessary to track application-level focus changes. Such focus changes do not necessarily correspond to X server focus changes.
If a single input context is used to do input for multiple text-entry areas, it is also necessary to set the focus window of the input context whenever the focus window changes.
In most input method architectures (OnTheSpot being the notable exception), the input method performs the display of its own data. To provide better visual locality, it is often desirable to have the input method areas embedded within a client. To do this, the client may need to allocate space for an input method. Xlib provides support that allows the client to provide the size and position of input method areas. The input method areas that are supported for geometry management are the status area and the preedit area.
The fundamental concept on which geometry management for input method windows is based is the proper division of responsibilities between the client (or toolkit) and the input method. The division of responsibilities is the following:
The client is responsible for the geometry of the input method window.
The input method is responsible for the contents of the input method window. It is also responsible for creating the input method window per the geometry constraints given to it by the client.
An input method can suggest a size to the client, but it cannot suggest a placement. The input method can only suggest a size: it does not determine the size, and it must accept the size it is given.
Before a client provides geometry management for an input method, it must determine if geometry management is needed. The input method indicates the need for geometry management by setting the XIMPreeditArea() or XIMStatusArea() function in its XIMStyles value returned by the XGetIMValues() function. When a client decides to provide geometry management for an input method, it indicates that decision by setting the XNInputStyle value in the XIC.
After a client has established with the input method that it will do geometry management, the client must negotiate the geometry with the input method. The geometry is negotiated by the following steps:
The client suggests an area to the input method by setting the XNAreaNeeded value for that area. If the client has no constraints for the input method, it either does not suggest an area or sets the width and height to 0 (zero). Otherwise, it sets one of the values.
The client gets the XIC XNAreaNeeded value. The input method returns its suggested size in this value. The input method should pay attention to any constraints suggested by the client.
The client sets the XIC XNArea value to inform the input method of the geometry of the input method's window. The client should try to honor the geometry requested by the input method. The input method must accept this geometry.
Clients performing geometry management must be aware that setting other IC values may affect the geometry desired by an input method. For example, the XNFontSet and XNLineSpacing values may change the geometry desired by the input method. It is the responsibility of the client to renegotiate the geometry of the input method window when it is needed.
In addition, a geometry management callback is provided by which an input method can initiate a geometry change.
A filtering mechanism is provided to allow input methods to capture X events transparently to clients. It is expected that toolkits (or clients) using the XmbLookupString() or XwcLookupString() function call this filter at some point in the event processing mechanism to make sure that events needed by an input method can be filtered by that input method. If there is no filter, a client can receive and discard events that are necessary for the proper functioning of an input method. The following provides a few examples of such events:
Expose events that are on a preedit window in local mode.
Events can be used by an input method to communicate with an input server. Such input server protocol-related events have to be intercepted if the user does not want to disturb client code.
Key events can be sent to a filter before they are bound to translations such as Xt provides.
Clients are expected to get the XIC XNFilterEvents value and add to the event mask for the client window with that event mask. This mask can be 0.