Simplified Chinese Solaris User's Guide

Chapter 4 Entering Simplified Chinese Text

About This Chapter

This chapter describes the Simplified Chinese Solaris input modes for typing Simplified Chinese characters with Simplified Chinese Solaris software. You can type any of the following kinds of characters:

You can type all of these characters in the input areas of the following application windows:

For information about creating your own input method, see Chapter 5, Code Table Input Method Interface.

Input Window Areas

Three separate areas of an application subwindow are involved in entering characters. These areas are typically displayed, named, and used as follows:

Preedit Area

The highlighted (for example, reverse video and underlined) preedit area displays characters as they are typed or converted. It holds formations of text before they are converted to Simplified Chinese characters or symbols and put in the text block being assembled for the application.

Status Area

The status area shows which input conversion mode is in effect. In the above example, it is located in the lower left corner of the window margin.

Lookup Choice Area

The lookup choice area displays multiple Simplified Chinese or special character choices available for conversion of the character(s)/radical(s) in the preedit area. In the above example, it is a pop-up.

Auxiliary Window

The auxiliary window provides tools and utilities to manage the input methods or to make the input simpler.

Input Method Utilities

Solaris 9 provides graphical interface tools and utilities to manage input methods, set the properties of input methods, and facilitate the input of special characters.

The following tools are supported:

Selecting the Utility Menu
  1. Click the utility button Graphic to display the utilities menu.

Graphic

Select one of the input method tools from the menu.

Input Method Selection Tool

The input method selection tool allows you to select a list of input methods. You can also set the default input method and the sequence of the input methods.

  1. Select the input method selection item on the utility menu.

The input method selection panel appears as below:

Graphic

After selecting an input method, click "OK" or "Apply" to activate the setting. The first input method selected becomes the default input method.

Press "CTRL+Space" in the application window to activate Chinese input; the default input method is selected. Press "F2" to switch to the first input method selected, "F3" to switch to the second one, and so on.
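The function key assignment described above can be sketched in a few lines of Python. This is only an illustration: the function name method_for_function_key and the sample list of selected methods are hypothetical and are not part of the Solaris input method software.

```python
def method_for_function_key(key, selected_methods):
    """Map "F2" to the first selected method, "F3" to the second, and so on."""
    if not (key.startswith("F") and key[1:].isdigit()):
        raise ValueError(f"not a function key: {key!r}")
    index = int(key[1:]) - 2  # F2 -> 0, F3 -> 1, ...
    if not 0 <= index < len(selected_methods):
        raise ValueError(f"no input method is bound to {key}")
    return selected_methods[index]


# Hypothetical selection order chosen in the input method selection panel.
selected = ["New QuanPin", "New ShuangPin", "GBK Code"]

default_method = selected[0]  # the first selected method is the default
print(method_for_function_key("F2", selected))
print(method_for_function_key("F4", selected))
```

Changing the order of the list in the selection panel changes which function key reaches each input method.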

Setting Input Method Options

The properties of Simplified Chinese input methods can be set from the input method options setting screen, which appears as follows:

Graphic

With this options setting tool, you can set the options of input methods. After setting the options in this panel, click "OK" or "Apply" to activate the settings.

For input methods that are based on a code table, you can set four options:

  1. "Display candidates key by key":

    If TRUE: when a valid key is entered for this input method, the IM searches the dictionary table and displays the candidates in the Lookup window.

    If FALSE: when a valid key is entered for this input method, the IM does not search the dictionary table, but displays the key in the preedit area. You must press the "SPACE" key for the IM to search the dictionary table and display the candidates.

  2. "Display external codes":

    If TRUE: in every Lookup window, the external code of each candidate is displayed after the candidate.

    If FALSE: the external code is not displayed after the candidate.

    This option provides a way to study the input method and view the external code of a Chinese character in that input method.

  3. "Automatically commit if only one candidate":

    If TRUE: if there is only one candidate for the external code, the IM automatically commits it.

    If FALSE: the IM displays it in the Lookup window.

  4. "Display keymap character for every external code":

    If TRUE: when a valid key is entered, the corresponding keymap character of the key is displayed in the preedit area.

    If FALSE: only the key is displayed, without the keymap character.
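The interaction of the first three options can be pictured with a small Python model. This is a sketch under stated assumptions, not the actual IM implementation: the Options class, the press function, and the two-entry code table are all invented for illustration, and the fourth option (keymap characters) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Options:
    key_by_key: bool = True            # "Display candidates key by key"
    show_external_codes: bool = False  # "Display external codes"
    auto_commit_single: bool = False   # "Automatically commit if only one candidate"


# Made-up code table for illustration; it is not a real input method.
TABLE = {"a": ["工", "戈"], "aa": ["式"]}


def press(preedit, key, opts, table=TABLE):
    """Return (new_preedit, lookup_window, committed_text) after one key."""
    if key == " ":
        code = preedit                 # SPACE forces a dictionary search
    else:
        code = preedit + key
        if not opts.key_by_key:
            return code, [], ""        # show the key only; wait for SPACE
    candidates = table.get(code, [])
    if opts.auto_commit_single and len(candidates) == 1:
        return "", [], candidates[0]   # commit without showing the Lookup window
    if opts.show_external_codes:
        candidates = [f"{c} {code}" for c in candidates]
    return code, list(candidates), ""


print(press("", "a", Options()))
print(press("", "a", Options(key_by_key=False)))
print(press("a", "a", Options(auto_commit_single=True)))
```

With key_by_key set to False, the same search happens only when press is called with the SPACE key.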

Lookup table

You can use the lookup table tools to search for a Chinese character and input it. Three kinds of lookup table are provided:

Virtual Keyboards

Virtual keyboard tools can be used as lookup utilities to simplify the input of some special symbols.

Several kinds of virtual keyboard are available for the Simplified Chinese environment. They are shown below:

PC Keyboard

The PC Virtual Keyboard appears as below:

Graphic

Greek Keyboard

The Greek virtual keyboard appears as below:

Graphic

Russian Keyboard:

The Russian Virtual Keyboard appears as below:

Graphic

ZhuYin Keyboard:

The ZhuYin Virtual Keyboard appears as below:

Graphic

Chinese Punctuation Characters Keyboard:

The Chinese Punctuation Characters Keyboard appears as below:

Graphic

Number Symbol Lookup Keyboard:

The Number Symbol virtual keyboard appears as below:

Graphic

Mathematical Symbol Lookup Keyboard:

The Mathematical Symbol virtual keyboard appears as below:

Graphic

Special Symbols Lookup Keyboard:

The Special Symbol virtual keyboard appears as below:

Graphic

Table Symbol Lookup Keyboard:

The Table Symbol virtual keyboard appears as below:

Graphic

User Defined Characters (UDC)

The UDC editor tool allows you to draw and save new characters. After ascribing the character to an input method, it can be displayed in an application.

  1. Select the user defined character item on the utility menu to activate the UDC tool.

Graphic
Note -

Chapter 7, Fonts provides more information about user defined characters.


Input Method Help

Help pages are displayed in a default browser such as Netscape or HotJava.

  1. Select the input method help item on the utility menu to open the help pages in a browser.

Input Methods and Conversion Modes for Entering Text

The following input methods and conversion modes are available for entering ASCII/English, Simplified Chinese and other text:

In zh/zh_CN/zh_CN.EUC locales:

In zh.GBK/zh_CN.GBK locales:

In zh_CN.GB18030/zh.UTF-8/zh_CN.UTF-8 locales:

Press Control-spacebar to toggle on or off the Simplified Chinese input conversion. The function keys listed above (for example: F2, F3) turn on the corresponding input methods.

Typing ASCII Text

Each tool first starts with all Simplified Chinese input modes off and the window's status area blank. This mode is for typing ASCII text:

Graphic

Simplified Chinese input conversion mode is toggled on and off by pressing Control-spacebar. After Simplified Chinese input conversion has been turned on once and input conversion is then turned off, the status area is no longer blank, but instead shows that conversion is off.

Switching Between English Status and Chinese Status:

    Type "CTRL+SPACE".

    An auxiliary window appears, as shown below:

Graphic

Select Input Method

    In the Chinese status window, press a function key. For example, press F2 to switch to the first input method, F3 to switch to the second input method, and so on. Or click the input method selection button on the auxiliary window.

    The input method selection menu appears as below:

Graphic

Toggling Input Methods

This procedure allows you to toggle between six input methods. Text is entered in the Chinese status window.

  1. Type "CTRL+ESC" to switch input methods.

Switching Between Half_width Character Mode and Full_width Character Mode

This method is entered in the Chinese status window.

  1. Type "SHIFT+SPACE" to switch between Half_width Character Mode and Full_width Character Mode. Or click the Half_width/Full_width button on the auxiliary window to toggle between modes.

The Graphic icon indicates the input method system is in Half_width Character Mode.

The Graphic icon indicates the input method system is in Full_width Character Mode.

In Full_width mode, the full-width form of each typed character is committed to the system. For example, if you type 'a' in Full_width mode, the full-width character of 'a' is committed to the application, as in the figure below:

Graphic
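The relationship between a half-width ASCII character and its full-width form can be shown with a short Python sketch. It assumes the committed characters are the Unicode full-width forms (U+FF01 through U+FF5E), which sit at a fixed offset of 0xFEE0 from printable ASCII; mapping the space to the ideographic space U+3000 is also an assumption, and the function name to_fullwidth is hypothetical.

```python
def to_fullwidth(text):
    """Convert printable ASCII characters to their full-width forms."""
    result = []
    for ch in text:
        if ch == " ":
            result.append("\u3000")               # ideographic space (assumed)
        elif "!" <= ch <= "~":
            result.append(chr(ord(ch) + 0xFEE0))  # U+FF01..U+FF5E
        else:
            result.append(ch)                     # leave other characters alone
    return "".join(result)


print(to_fullwidth("a"))
print(to_fullwidth("A1!"))
```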

Switching Between Chinese Punctuation Mode and English Punctuation Mode:

This method is entered in the Chinese status window.

    Type "CTRL+." to switch between Chinese Punctuation Mode and English Punctuation Mode. Or click the Chinese/English Punctuation Button on the auxiliary window to toggle between modes.

The Graphic icon indicates the input method system is in Chinese Punctuation Mode.

The Graphic icon indicates the input method system is in English Punctuation Mode.

When a punctuation key is pressed in Chinese Punctuation mode, the corresponding Chinese punctuation character is committed to the application. For example, when "$" is typed in Chinese Punctuation mode, the Chinese currency symbol is committed to the application, as in the figure below:

Graphic

The punctuation keys include the following: , . / <> :;'"\$!^&_-

The correspondence between English and Chinese punctuation is mapped below:

Graphic

Language Input Methods

Solaris 9 supports the following input methods for the Simplified Chinese environment:

  1. New QuanPin and New ShuangPin Input Methods

  2. GB2312 Code

  3. GBK Code

  4. GB18030 Code

  5. ShuangPin

  6. QuanPin

  7. English_Chinese

New QuanPin and New ShuangPin Input Methods

This section describes the features in the New QuanPin and New ShuangPin input methods, and how to use some of the features in the zh_CN.EUC and zh_CN.GBK locales.

PinYin is a popular input method in PRC, and there are various PinYin-based input methods. Two of them, New QuanPin and New ShuangPin, contain the following features:

These features are described in detail in the following sections.

Defining Phrases for Later Use

The following example shows how to define the phrase "ke lin dun" and store it for later use.

  1. Type the phrase kelindun without spaces.

    The New QuanPin and New ShuangPin input methods will insert spaces for you automatically.

    Graphic
  2. Type the number representing the first character you want to select.

    The following example shows the second character selected.

    Graphic
  3. Select Chinese for the second and third parts of the phrase.

    Graphic

    The new phrase is defined and added to the user dictionary file. The next time you type ke lin dun, you will see the phrase you defined in the candidate area.

    Graphic
Selecting Frequently-Used Candidates

In these input methods, candidates that have been selected are moved to the start of the list to facilitate repeated use.

  1. Type sh yi.

    Notice the order of the five available candidates.

  2. Select the fifth candidate.

    Graphic
  3. Type sh yi again.

    Graphic

    Notice that the fifth candidate has moved to the first position because you previously selected it. Frequently-used candidates are promoted for faster selection.
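The reordering in this example can be modeled as a simple move-to-front list. The Python sketch below is an assumption about the behavior, not the actual algorithm: the input methods may weigh selection counts rather than only the most recent choice, and the candidate labels c1 through c5 are placeholders.

```python
def select_candidate(candidates, position):
    """Pick the candidate at a 1-based position and promote it to the front."""
    chosen = candidates[position - 1]
    reordered = [chosen] + [c for c in candidates if c != chosen]
    return chosen, reordered


candidates = ["c1", "c2", "c3", "c4", "c5"]  # placeholder candidate labels
chosen, candidates = select_candidate(candidates, 5)
print(chosen)
print(candidates)
```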

Other Features

Typing Long PinYin Strings

The New QuanPin input method accepts PinYin strings up to 222 characters long. The following illustrations use the string below:


>>meiguozhongtongkelindunzhengzaitaolunhaiwanjushiwenti<<
Graphic

The result is the following Chinese string:

Graphic
Note -

The New ShuangPin input method supports up to 30-character strings.


Typing ShengMu

You can also type ShengMu only. Candidates are supplied for ShengMu, as shown in the following illustration:

Graphic

GBK Support

The zh_CN.GBK locale supports GBK by default, as shown in the following illustration:

Graphic

The second Chinese character in the following illustration is defined only in the GBK standard.

Graphic

Single GBK candidates are placed at the end of the list of candidates. Press Return to scroll to the GBK area. For easier selection next time, you can define the GBK candidate as a phrase (for more information, see "Defining Phrases for Later Use"). Once a phrase is defined, you can insert it easily.

Both New QuanPin and New ShuangPin support GBK Hanzi by default in the zh.GBK locale. However, because several Hanzi have the same ShengMu (the first part of Pinyin), New QuanPin and New ShuangPin do not display GBK candidates if you provide only the ShengMu.

For example, typing the string rong will display GBK candidates because it is a complete Pinyin string. However, typing r alone will not display any GBK candidates because it is only a ShengMu.
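To check whether a given character is defined only in GBK, you can compare the two code sets directly. The sketch below uses the "gb2312" and "gbk" codecs from the Python standard library and assumes they match the Solaris code sets; the helper names are hypothetical.

```python
def in_gb2312(ch):
    """Return True if the character can be encoded in GB2312."""
    try:
        ch.encode("gb2312")
        return True
    except UnicodeEncodeError:
        return False


def is_gbk_only(ch):
    """Return True if the character is in GBK but not in GB2312."""
    try:
        ch.encode("gbk")
    except UnicodeEncodeError:
        return False
    return not in_gb2312(ch)


common = bytes.fromhex("b0a1").decode("gbk")    # code shared with GB2312
extended = bytes.fromhex("8140").decode("gbk")  # first code of the GBK extension area
print(is_gbk_only(common), is_gbk_only(extended))
```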

Keyboard Definition

Edit Keys

The following table shows the definitions of the edit keys.


Note -

The preedit line is a normal X text field.


Table 4-1 Edit Key Definitions

Key         Definition
[a-z]       PinYin character.
Home        Moves to the start of the preedit line.
End         Moves to the end of the preedit line.
Left        Moves the caret in the preedit line to the left. If the character to the left is Hanzi, the original PinYin is recovered.
Right       Moves the caret in the preedit line to the right.
Delete      Deletes the PinYin character following the caret on the preedit line.
Backspace   Deletes the PinYin character preceding the caret on the preedit line.

Page Scroll Keys

The candidates of a Pinyin string belong to the following groups:

Some Pinyin strings may have more candidates than can be displayed in the same window. In that case, use the keys described in the following table to scroll through the candidates.

Table 4-2 Page Scroll Key Definitions

Key         Definition
- =         Scrolls to previous/next candidate(s)
[ ]         Scrolls to previous/next candidate(s)
, .         Scrolls to previous/next candidate(s)
Return      Quickly scrolls through all candidates

Select Keys

New QuanPin and New ShuangPin use the numeric selection keys.

Separators

In accord with the national Pinyin standard, the separator (') is supported to avoid ambiguous interpretations of Pinyin strings. For example, the Pinyin string [jiang] can be interpreted as [jiang] or [ji][ang]; both are valid. In New QuanPin, however, [jiang] is interpreted only as [jiang]. You must use the separator and enter [ji'ang] for it to be interpreted as [ji] and [ang]. New ShuangPin does not require the use of separators.
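The effect of the separator can be illustrated with a greedy longest-match segmenter in Python. This is a sketch only: the source does not say how New QuanPin splits strings internally, so longest-match is an assumed technique, and the syllable set below is a tiny hypothetical sample rather than the full Pinyin inventory.

```python
# Tiny hypothetical syllable set; a real dictionary holds every valid syllable.
SYLLABLES = {"ke", "lin", "dun", "ji", "ang", "jiang"}
MAX_LEN = max(len(s) for s in SYLLABLES)


def segment(pinyin):
    """Split a Pinyin string into syllables, honoring the ' separator."""
    result = []
    for chunk in pinyin.split("'"):
        pos = 0
        while pos < len(chunk):
            # Try the longest possible syllable first.
            for size in range(min(MAX_LEN, len(chunk) - pos), 0, -1):
                if chunk[pos:pos + size] in SYLLABLES:
                    result.append(chunk[pos:pos + size])
                    pos += size
                    break
            else:
                raise ValueError(f"cannot segment {chunk[pos:]!r}")
    return result


print(segment("jiang"))
print(segment("ji'ang"))
print(segment("kelindun"))
```

Because the longest syllable wins, jiang stays whole unless the separator forces a split.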

Dictionary Files

New QuanPin and New ShuangPin share two dictionary files: PyCiku.dat and UdCiku.dat. In the zh_CN.EUC and zh_CN.GBK locales, the default path names are /usr/lib/im/locale/zh_CN/data/PyCiku.dat and /usr/lib/im/locale/zh_CN/data/UdCiku.dat.

Users cannot normally write to these files. However, since users can affect the way New QuanPin and New ShuangPin work through features such as frequency adjustment and user-defined phrases, it is necessary to update the dictionary files frequently.

A user's dictionary is normally located in ~/.Xlocale/PyCiku.dat or ~/.Xlocale/UdCiku.dat (~ indicates the home directory of the user who starts the htt command). When New QuanPin and New ShuangPin are started, they locate and read the dictionary files in the user's home directory. If a dictionary file is not found, the system default path is used (that is, /usr/lib/im/locale/zh_CN/...).
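The lookup order described above (user copy first, then the system default) can be expressed as a short Python sketch. The function name resolve_dictionary is hypothetical, and the code only mirrors the search order stated in this section; it does not read or interpret the dictionary files.

```python
from pathlib import Path

# System default directory named in this section.
SYSTEM_DIR = Path("/usr/lib/im/locale/zh_CN/data")


def resolve_dictionary(name, home, system_dir=SYSTEM_DIR):
    """Prefer the user's copy under ~/.Xlocale; otherwise use the system file."""
    user_copy = Path(home) / ".Xlocale" / name
    if user_copy.is_file():
        return user_copy
    return Path(system_dir) / name


print(resolve_dictionary("PyCiku.dat", Path.home()))
print(resolve_dictionary("UdCiku.dat", Path.home()))
```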

New ShuangPin Features

ShuangPin is an abbreviated form of QuanPin. It is faster but more difficult to use than QuanPin. New ShuangPin supports all of the features, keyboard definitions, and dictionary files of New QuanPin.

There are various ShuangPin keyboard mapping designs in PRC. The most popular three are ZiRanMa, Chinese Star, and Intelligent_ABC. The New ShuangPin input method supports all three of these keyboard mappings.

New ShuangPin Keyboard Mapping

The following tables contain keyboard mappings for the ZiRanMa, Chinese Star, and Intelligent_ABC keyboards.

Table 4-3 ZiRanMa Keyboard Mapping

Key 

Definition  

ch 

sh 

zh 

ou 

iao 

uang, iang 

en 

eng 

ang 

an 

ao 

ai 

ian 

in 

o, uo 

un 

iu 

uan, er 

iong, ong 

ue 

v, ui 

ua, ia 

ie 

uai, ing 

ei 

Table 4-4 CStar2.97 Keyboard Mapping

Key 

Definition  

ch 

sh 

zh 

ia, ua 

uan 

ao 

an 

ang 

iang, uang 

ian 

iao 

in 

ie 

iu 

o, uo 

ou 

er, ing 

en 

ai 

eng 

v, ui 

ei 

uai, ue 

iong, ong 

un 

Table 4-5 Intelligent ABC Keyboard Mapping

Key 

Definition  

ch 

sh 

zh 

ou 

in, uai 

ua, ia 

en 

eng 

ang 

an 

ao 

ai 

ue, ui 

un 

o, uo 

uan 

ei 

iu, er 

ong, iong 

uang, iang 

ian 

ie 

ing 

iao 

GBK Code Input Method

This method uses the GBK code defined by the Chinese Internal Code Specification. It includes all of the Chinese characters and symbols in GB2312-80, and other CJK Chinese characters in GB 13000-1. Each Chinese character or symbol is identified by a four-digit hexadecimal internal code defined in the Chinese Internal Code Specification.

Typing GBK Code Text

This section contains instructions on how to use the GBK codes to type Chinese characters and symbols.

  1. In a new Terminal, turn Chinese input conversion on by pressing Control-spacebar.

  2. Press F4 to turn on the GBK code input method.

    The status area shows that GBK code input mode is on.

    Graphic
  3. Press the first three of the four keys that represent the character to display (in this example, b0a, from the code b0a1).

    The keys remain visible in the preedit area.

    Graphic
  4. Type the fourth key.

    The character automatically replaces the preedit area.

    Graphic
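The steps above can be simulated in Python to see which character a four-digit code produces. This sketch assumes the "gbk" codec in the Python standard library matches the code set used by the input method; the function name type_gbk_code is hypothetical.

```python
def type_gbk_code(keys):
    """Feed hex digit keys one at a time; return (preedit, committed_text)."""
    preedit, committed = "", ""
    for key in keys.lower():
        if key not in "0123456789abcdef":
            raise ValueError(f"not a hexadecimal digit: {key!r}")
        preedit += key
        if len(preedit) == 4:
            # Four digits form one two-byte GBK code.
            committed += bytes.fromhex(preedit).decode("gbk")
            preedit = ""
    return preedit, committed


print(type_gbk_code("b0a"))   # three keys: still in the preedit area
print(type_gbk_code("b0a1"))  # fourth key: the character is committed
```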

GB2312 Code Input Method

This method uses the GB2312 code. It includes the Chinese characters and symbols in GB2312-80. Each Chinese character or symbol is identified by a four-digit hexadecimal internal code.

Typing GB2312 Code Text

This section contains instructions on how to use the GB2312 codes to type Chinese characters and symbols.

  1. In a new Terminal, turn Chinese input conversion on by pressing Control-spacebar.

  2. Click the Input method selection button on the auxiliary window and select GB2312 input method.

    The status area shows that GB2312 code input mode is on.

    Graphic
  3. Press the first three of the four keys that represent the character to display (in this example, b0a, from the code b0a1).

    The keys remain visible in the preedit area.

    Graphic
  4. Type the fourth key.

    The character automatically replaces the preedit area.

    Graphic

GB18030 Code Input Method

This method uses the GB18030 code defined by the Chinese Internal Code Specification. It includes all of the Chinese characters and symbols in GB2312-80, and other CJK Chinese characters in GB 18030. Each Chinese character or symbol is identified by a four-digit or eight-digit hexadecimal internal code defined in the Chinese Internal Code Specification.

Typing GB18030 Code Text

This section contains instructions on how to use the GB18030 codes to type Chinese characters and symbols.

  1. In a new Terminal, turn Chinese input conversion on by pressing Control-spacebar.

  2. Click the Input method selection button on the auxiliary window and select GB18030 input method.

    The status area shows that GB18030 code input mode is on.

    Graphic
  3. For example, to input the GB18030 character with code 0xb0a1, press the first three of the four keys that represent the character (in this example, b0a).

    The keys remain visible in the preedit area.

    Graphic
  4. Type the fourth key.

    The character automatically replaces the preedit area.

    Graphic
  5. For example, to input the GB18030 character with code 0x82358538, press the first seven of the eight keys that represent the character (in this example, 8235853).

    The keys remain visible in the preedit area.

    Graphic
  6. Type the last key. The character is automatically committed to the window.

    Graphic
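Whether a GB18030 code needs four or eight digits can be read from its second byte: in GB18030, a second byte in the range 0x30 to 0x39 marks a four-byte code. The Python sketch below shows one way an input method could use that rule. How the Solaris input method actually decides is not documented here, so treat this as an assumption; the function names are hypothetical, and the "gb18030" codec is the one from the Python standard library.

```python
def expected_digits(prefix):
    """Return 4 or 8 once the second byte is known, else None."""
    if len(prefix) < 4:
        return None
    second_byte = int(prefix[2:4], 16)
    # A second byte of 0x30-0x39 marks a four-byte (eight-digit) code.
    return 8 if 0x30 <= second_byte <= 0x39 else 4


def char_from_gb18030(code):
    """Decode a complete four- or eight-digit hexadecimal GB18030 code."""
    if len(code) != expected_digits(code):
        raise ValueError(f"incomplete or overlong code: {code!r}")
    return bytes.fromhex(code).decode("gb18030")


print(expected_digits("b0a1"), expected_digits("8235"))
print(char_from_gb18030("b0a1"))
print(char_from_gb18030("82358538"))
```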

QuanPin Input Method

The QuanPin input method requires up to six keystrokes to type each Chinese character. QuanPin maps Pinyin phonetics to single lowercase Roman letters. You can use the QuanPin input method to type individual Chinese characters in both the zh_CN.EUC and zh_CN.GBK locales.

A lookup area showing the characters that match the QuanPin input is displayed with each keystroke. If more than one page of choices is available, type a period (.) to page forward through the lookup choices, or a comma (,) to page backward. You can select the character you want by typing the label corresponding to the character in the lookup area.

Typing QuanPin Text

This section describes how to create QuanPin text.

The following figure shows how to use this input method to type the character representing the Full Pinyin word fang. The word requires four keystrokes. Type them and select the text as follows:

  1. Type the four keystrokes fang.

    Graphic
  2. Type 1 to select the corresponding GBK Chinese character in the lookup choice list.

    Your choice is substituted for the Full Pinyin string in the preedit area.

    Graphic

English_Chinese Input Method

The English_Chinese input method requires up to fifteen keystrokes to type each Chinese word. English_Chinese maps the English word to a Chinese phrase. You can use the English_Chinese input method to type a Chinese phrase in both zh_CN.EUC and zh_CN.GBK locales.

A lookup area showing the phrases that match the English_Chinese input is displayed with each keystroke. If more than one page of choices is available, type a period (.) to page forward through the lookup choices, or a comma (,) to page backward. You can select the phrase you want by typing the label corresponding to the phrase in the lookup area.

Typing English_Chinese Text

This section describes how to create English_Chinese text.

The following figure shows how to use this input method to type the Chinese phrase representing the English word "world". The word requires five keystrokes. Type them and select the text as follows:

  1. Type the five keystrokes world.

    Graphic
  2. Type 3 to select the corresponding Chinese phrase in the lookup choice list.

    Your choice is substituted for the English string in the preedit area.

    Graphic
  1. The wildcard characters "*" and "?" can be used to search the dictionary: "*" stands for one or several letters, and "?" represents exactly one letter. For example, to search for all English words that end with lution, type "*lution". The lookup choice window appears as below:

    Graphic

    To search for all three-letter English words that begin with "c", type "c??". The lookup choice window appears as below:

    Graphic
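The wildcard rules can be tried out with a small Python sketch that translates a pattern into a regular expression. It follows the description above, where "*" matches one or more letters and "?" matches exactly one; the word list is a hypothetical stand-in for the English_Chinese dictionary, and the function name wildcard_search is not part of the product.

```python
import re

# Hypothetical mini word list standing in for the English_Chinese dictionary.
WORDS = ["cat", "cup", "car", "cool", "solution", "evolution", "lution", "world"]


def wildcard_search(pattern, words=WORDS):
    """Return the words matching a pattern that uses "*" and "?"."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[a-z]+")   # one or more letters
        elif ch == "?":
            parts.append("[a-z]")    # exactly one letter
        else:
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts))
    return [w for w in words if regex.fullmatch(w)]


print(wildcard_search("*lution"))
print(wildcard_search("c??"))
```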