Wubi is a popular input method in China. The encoding rule used in the Wubi input method is based on the radical or stroke shape of Chinese characters.
One of the main advantages of Wubi and other shape-based input methods is a very low repetition rate. This low repetition rate, which PinYin-based input systems do not offer, means that a Wubi key sequence represents only one or two Chinese characters. Because a single Wubi code seldom represents more than one character, you can enter text more quickly.
Wubi is built on the GB18030-2000 character set standard, a graphemic encoding system. Almost all Chinese, Kanji, and Hanja characters can be encoded with the GB18030-2000 standard.
GB18030-2000 character set support
Easy character set switching
New radical mechanism for Simplified and Traditional Chinese
Three-level progressive identification code
Phrase input and professional word galleries
Fault tolerance code
The GB18030-2000 character set is a national encoding standard issued by the Chinese government in 2000. The standard encodes each character in one, two, or four bytes. GB18030-2000 includes 6,763 standard Simplified Chinese characters, 13,053 Traditional Chinese (Big5) characters, 3,000 characters used in Hong Kong, and 21,003 GBK characters. The Wubi input method supports the GB18030-2000 character set, which makes it easy to work with the smaller character sets contained in GB18030-2000. See Easy Character Set Switching.
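The one-, two-, and four-byte encoding lengths can be observed with a short sketch. It assumes Python's built-in gb18030 codec, which may track a later revision of the standard than GB18030-2000; the sample characters and the helper name gb18030_length are illustrative choices, not part of the Wubi input method.

```python
# Sample characters chosen for illustration only (not taken from the Wubi documentation).
samples = {
    "A": "ASCII letter",
    "滋": "common Simplified Chinese character",
    "\u3400": "CJK Extension A character",
}


def gb18030_length(ch: str) -> int:
    """Return how many bytes Python's gb18030 codec uses for one character."""
    return len(ch.encode("gb18030"))


for ch, description in samples.items():
    print(f"U+{ord(ch):04X} ({description}): {gb18030_length(ch)} byte(s)")
```

The three samples fall into the one-byte, two-byte, and four-byte ranges respectively.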
For example, if you type the letters gigg and page through the candidates to the end, you find a GB18030 character, as shown in the following figures.
GB2312, which contains 6,763 characters
GBK, which contains 21,003 characters
GB18030-2000, which contains 27,533 characters
To use the GB2312 character set, press Control-Shift-1.
To use the GBK character set, press Control-Shift-2.
To use the GB18030-2000 character set, press Control-Shift-3.
Because GB18030-2000 is a relatively new standard, support in Wubi for the GB2312 and GBK character sets ensures backward compatibility with earlier standards. You might prefer to work in the GB2312 or GBK character set because of improved performance and lower repetition rates.
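The nesting of the three character sets can be illustrated with a minimal sketch that reports the smallest set containing a given character. It relies on Python's gb2312, gbk, and gb18030 codecs as stand-ins for the character sets; the function name smallest_character_set is hypothetical, and the sketch does not reflect how the Wubi engine itself filters candidates.

```python
def smallest_character_set(ch: str) -> str:
    """Return the smallest of GB2312, GBK, GB18030 that can encode the character.

    Assumption: Python's codecs are used as approximations of the character sets.
    """
    for name, codec in (("GB2312", "gb2312"), ("GBK", "gbk"), ("GB18030", "gb18030")):
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue  # not in this set, try the next larger one
        return name
    raise ValueError(f"{ch!r} cannot be encoded in any of the three sets")


for sample in ("滋", "學", "\u3400"):
    print(f"U+{ord(sample):04X}: {smallest_character_set(sample)}")
```

A character reported as GB2312 remains available in all three modes, while one reported as GB18030 would be expected to appear only when the largest set is selected.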
The new radical, or root, mechanism is a patented technology invented by Professor Wang Yongmin, the inventor of Wubi. Professor Wang developed the mechanism from version 86, the old radical system. The mechanism has evolved into a new encoding system compatible with both Simplified and Traditional Chinese. Users of Wubi version 86 can work with three times more characters, using the same encoding and typing rules, without additional training.
One of the main features of Wubi is the last-stroke grapheme identification codes that distinguish between characters of a similar shape. The identification codes are assigned according to the shape of the last radical of the character. The purpose of identification codes is to help users master the Wubi input method at three different levels.
In level A, for beginning users, all three graphemic types with fewer than four codes have identification codes.
In level B, for intermediate users, only the left-right shaped Chinese characters have identification codes.
In level C, for advanced users, identification codes are not used.
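The three levels can be summarized as a small decision function. This is a sketch under stated assumptions: the structure labels left_right, top_bottom, and mixed are hypothetical names for the three graphemic types, and the sketch assumes that the "fewer than four codes" condition also applies in level B, which the text above does not state explicitly.

```python
from enum import Enum


class Level(Enum):
    A = "beginning"
    B = "intermediate"
    C = "advanced"


# Hypothetical labels for the three graphemic types.
STRUCTURES = ("left_right", "top_bottom", "mixed")


def needs_identification_code(level: Level, structure: str, code_length: int) -> bool:
    """Decide whether a character gets an identification code at the given level."""
    if structure not in STRUCTURES:
        raise ValueError(f"unknown structure: {structure}")
    if level is Level.C:
        return False  # advanced users: identification codes are not used
    if code_length >= 4:
        return False  # assumption: only codes shorter than four keys qualify
    if level is Level.A:
        return True  # beginning users: all three graphemic types
    return structure == "left_right"  # intermediate users: left-right shapes only


print(needs_identification_code(Level.B, "left_right", 2))
print(needs_identification_code(Level.B, "top_bottom", 2))
```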
Wubi supports phrase input. Entire phrases, not only individual characters, can be assigned Wubi codes. Beyond the 90,000 basic phrases, there are 11 professional word galleries, similar to glossaries, which cover the following industries:
Traffic and transportation
Computer and household electronics
Economy and finance
Medicine and health
Mining and metallurgy
Foreign trade and travel
Military affairs and national defense
Law and aesthetics
Galleries also exist for place names and for idioms.
You can select word galleries, which contain between 3,000 and 20,000 entries, in the Preferences dialog box.
For example, when you choose the Medicine and Health phrase gallery and type the word mino, medical phrases are listed for selection.
The Solaris Wubi input method supports encoding hint features. As you type, the character encoding appears in the Select Repetition Code Window. This feature can help you master the encoding methods and codes of Chinese characters. In addition, you can use the uppercase or lowercase Z key as a wildcard at any time. Z is the only key not mapped to a character in Wubi. To help you learn to use Wubi, you can press the Z key to query the system for input codes.
For example, you can type azzd to search for all characters or phrases with a Wubi code that begins with the letter A and ends with the letter D.
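The wildcard query can be sketched as a pattern match over a code table. The table below is a toy with placeholder entries, not real Wubi codes, and the sketch assumes that each Z stands for exactly one unknown key; the function names are hypothetical.

```python
import re

# Toy code table with placeholder entries; these are NOT real Wubi codes.
TOY_CODE_TABLE = {
    "abcd": "entry 1",
    "axyd": "entry 2",
    "abce": "entry 3",
    "ad": "entry 4",
    "bcdd": "entry 5",
}


def wildcard_to_regex(query: str) -> re.Pattern:
    """Translate a query into a regex, treating z or Z as one unknown key (a-y)."""
    parts = ["[a-y]" if ch == "z" else re.escape(ch) for ch in query.lower()]
    return re.compile("".join(parts))


def search(query: str, table: dict) -> list:
    """Return the codes in the table that match the wildcard query, sorted."""
    pattern = wildcard_to_regex(query)
    return sorted(code for code in table if pattern.fullmatch(code))


print(search("azzd", TOY_CODE_TABLE))  # ['abcd', 'axyd']
```

Because Z is not mapped to a character, using it as the wildcard never collides with a real key in a code.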
According to the preferences you set, the fault tolerance code feature can increase the probability that the system will provide the correct character even when you make a typing mistake.
The word-phrase feature is another productivity aid. The system provides a list of characters that are most likely to follow the character just selected. Instead of typing a code, you can choose the correct character from this list. This feature is also set in the Preferences dialog box.
For example, when you type the letters iuxx, the Chinese character 滋 is automatically committed to the application. After the character appears in the application window, a new candidate window opens and lists the phrases that begin with this character.
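The association step can be sketched as a prefix filter over a phrase list. The phrase list here is a small illustrative sample, not the contents of the actual phrase galleries, and the function name associated_phrases is hypothetical; the real input method may rank or limit candidates differently.

```python
# Small illustrative phrase list; the real galleries are far larger.
TOY_PHRASES = ["滋味", "滋润", "滋养", "味道", "养分"]


def associated_phrases(committed: str, phrases: list) -> list:
    """Return phrases that begin with the committed text and extend beyond it."""
    return [
        phrase
        for phrase in phrases
        if phrase.startswith(committed) and len(phrase) > len(committed)
    ]


# After 滋 is committed, these phrases would be offered as candidates.
print(associated_phrases("滋", TOY_PHRASES))
```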
Character sets: GB2312, GBK, or GB18030
Professional word galleries
Identification code mode
Display the Wubi code for a candidate
Display the candidates after each keystroke
Association of characters with phrases
Fault tolerance code
Display characters and phrases with the same code
Display the key prompt in the preedit area