Input Method Architecture (Common Desktop Environment: Internationalization Programmer's Guide)

Common Desktop Environment: Internationalization Programmer's Guide

Input Method Architecture

As shown in the previous section, there are many different input methods used today, each varying with language, culture, and history. A common feature of many input methods is that the user can type multiple keystrokes to compose a single character (or set of characters). The process of composing characters from keystrokes is called preediting. It may require complex algorithms and large dictionaries involving substantial computer resources.

Input methods may require one or more areas in which to show the feedback of the actual keystrokes, to show ambiguities to the user, to list dictionaries, and so on. The following are the input method areas of concern.

Status area: Intended to be a logical extension of the light-emitting diodes (LEDs) that exist on the physical keyboard. It is a window that is intended to present the internal state of the input method that is critical to the user. The status area may consist of text data and bitmaps or some combination.
Preedit area: Intended to display the intermediate text for those languages that are composing prior to the client handling the data.
Auxiliary area: Used for pop-up menus and customizing dialog boxes that may be required for an input method. There may be multiple auxiliary areas for any input method. Auxiliary areas are managed by the input method independent of the client. Auxiliary areas are assumed to be a separate dialog that is maintained by the input method.

There are various user interaction styles used for preediting. The following are the preediting styles supported by Xlib.

OnTheSpot: Data is displayed directly in the application window. Application data is moved to allow preedit data to be displayed at the point of insertion.
OverTheSpot: Data is displayed in a preedit window that is placed over the point of insertion.
OffTheSpot: Preedit window is displayed inside the application window but not at the point of insertion. Often, this type of window is placed at the bottom of the application window.
Root window: Preedit window is the child of RootWindow.

It would require a lot of computing resources if portable applications had to include input methods for all the languages in the world. To avoid this, a goal of the Xlib design is to allow an application to communicate with an input method placed in a separate process. Such a process is called an input server. The server to which the application should connect is dependent on the environment when the application is started up: what the user language is and the actual encoding to be used for it. The input method connection is said to be locale-dependent. It is also user-dependent; for a given language, the user can choose, to some extent, the user-interface style of input method (if there are several choices).

Using an input server implies communications overhead, but applications can be migrated without relinking. Input methods can be implemented either as a token communicating to an input server or as a local library.

The abstraction used by a client to communicate with an input method is an opaque data structure represented by the XIM data type. This data structure is returned by the XOpenIM() function, which opens an input method on a given display. Subsequent operations on this data structure encapsulate all communication between client and input method. There is no need for an X client to use any networking library or natural language package to use an input method.

A single input server can be used for one or more languages, supporting one or more encoding schemes. But the strings returned from an input method are always encoded in the (single) locale associated with the XIM object.