The following sections describe the supported Asian locales.
The following table provides a summary of the supported Asian locales.
Table 4–1 Summary of Asian Locales
The input method auxiliary window provides a friendly and extensible input method management tool for all Chinese customers. The new input method auxiliary window supports the following functions and utilities:
Input method switching
Chinese full-width/half-width character mode switching
Chinese/English punctuation mode switching
Input method properties setting
Input method selection
Lookup tables for GB2312/GBK/GB18030/CNS11643/Big5/HKSCS/Unicode character sets
Virtual keyboard
For more detailed information, please see the Simplified Chinese User's Guide and the Traditional Chinese User's Guide.
The input method auxiliary window supports all UTF-8 locales and the following Chinese locales:
zh/zh_CN.EUC
zh.GBK/zh_CN.GBK
zh.UTF-8/zh_CN.UTF-8
zh_TW/zh_TW.EUC
zh_TW.BIG5
zh_TW.UTF-8
zh_HK.BIG5HK
zh_HK.UTF-8
zh_CN.GB18030
Two kinds of input methods are supported:
Methods based on a code table such as CangJie
Methods developed by a vendor (such as NewPinYin or NeiMa)
The interface model for auxiliary window support is shown in the following figure.
According to the Thai IT Standard, there are three input levels for the Thai character sequence checking method:
Passthrough level, no input check.
Basic input check level.
Strict input check level.
In the Solaris 9 release, the default input check level is still passthrough level. This means no sequence check, which is the same level as in previous Solaris releases. You can use the F2 Function key to switch between the three levels:
passthrough -> basic -> strict -> passthrough
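The F2 cycle above can be pictured as a small state machine. The following Python sketch is illustrative only: the names `LEVELS` and `next_level` are hypothetical and are not part of any Solaris interface.

```python
# Hypothetical model of the F2 key cycling through the Thai input check levels.
LEVELS = ["passthrough", "basic", "strict"]


def next_level(current: str) -> str:
    """Return the level that follows `current`, wrapping back to passthrough."""
    index = LEVELS.index(current)
    return LEVELS[(index + 1) % len(LEVELS)]


level = "passthrough"      # default level in the Solaris 9 release
for _ in range(3):         # three simulated F2 presses return to the start
    level = next_level(level)
```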
A Thai input method auxiliary window supports the following new functions and utilities:
Switching between the three input levels (passthrough/basic/strict)
Thai virtual keyboard
Click the input level button on the auxiliary bar to select a specific Thai input level and input check level. Click the keyboard button to display the Thai virtual keyboard. Use the Thai virtual keyboard to input Thai characters.
Simplified Chinese in the Solaris 9 environment provides four locales: zh, zh.GBK, zh_CN.GB18030, and zh.UTF-8. In the zh locale, the EUC scheme is used to encode GB2312–80. The zh.GBK locale supports the GBK codeset, which is a superset of GB2312–80.
The new GB18030–2000 codeset is now supported in the zh_CN.GB18030 locale.
Simplified Chinese is used mostly in the People's Republic of China (PRC) and in Singapore.
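The superset relationship between these codesets can be checked with a short Python sketch. It assumes that Python's `gb2312`, `gbk`, and `gb18030` codecs are reasonable stand-ins for the codesets behind the zh, zh.GBK, and zh_CN.GB18030 locales. The helper `can_encode` is a hypothetical name used only for illustration.

```python
# Python codec names stand in for the Solaris locale codesets (an assumption).
common = "中文"      # characters that exist in GB2312-80
gbk_only = "體"      # traditional form: absent from GB2312-80, present in GBK

# Characters in GB2312-80 keep the same two-byte values in GBK and GB18030.
same_bytes = (
    common.encode("gb2312")
    == common.encode("gbk")
    == common.encode("gb18030")
)


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Which codesets can represent the character that GB2312-80 lacks?
coverage = {
    codec: can_encode(gbk_only, codec)
    for codec in ("gb2312", "gbk", "gb18030")
}
```

Text saved in the smaller codeset stays byte-for-byte valid in the larger ones, but not the reverse.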
The following input methods are supported for the zh locale:
New QuanPin
New ShuangPin
QuanPin
ShuangPin
GB2312 NeiMa
English-Chinese
Optional codetable input methods
Input method auxiliary window support for Simplified Chinese
The following input methods are supported for the zh_CN.GB18030 locale:
New QuanPin
New ShuangPin
QuanPin
ShuangPin
GB18030–2000 NeiMa
English-Chinese
Optional codetable input methods
Input method auxiliary window support for Simplified Chinese
The following input methods are supported for both the zh.GBK and the zh.UTF-8 locales:
New QuanPin
New ShuangPin
QuanPin
ShuangPin
GBK NeiMa
English-Chinese
Optional codetable input methods
Input method auxiliary window support for Simplified Chinese
The auxiliary window for Chinese input methods provides a friendly and extensible input method user interface for all Chinese locales. See Input Method Auxiliary Window Support for Simplified and Traditional Chinese.
For more detailed information about auxiliary windows for Chinese input methods, please see Simplified Chinese User's Guide and Traditional Chinese User's Guide.
The following table shows the TrueType fonts for the zh locale.
Table 4–2 TrueType Fonts for the zh_CN.EUC Locale
Full Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
Fangsong | R | TrueType | Hanyi | GB2312.1980 |
Hei | R | TrueType | Monotype | GB2312.1980 |
Kai | R | TrueType | Monotype | GB2312.1980 |
Song | R | TrueType | Monotype | GB2312.1980 |
The following table shows the bitmap fonts for the zh locale.
Table 4–3 Bitmap Fonts for the zh_CN.EUC Locale
Full Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Song | B | PCF (14,16) | GB2312.1980 |
Song | R | PCF (12,14,16,20,24) | GB2312.1980 |
The following table shows the TrueType fonts for the zh_CN.GBK locale.
Table 4–4 TrueType Fonts for the zh_CN.GBK Locale
Full Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
Fangsong | R | TrueType | Zhongyi | GBK |
Hei | R | TrueType | Zhongyi | GBK |
Kai | R | TrueType | Zhongyi | GBK |
Song | R | TrueType | Zhongyi | GBK |
The following table shows the bitmap fonts for the zh_CN.GBK locale.
Table 4–5 Bitmap Fonts for the zh_CN.GBK Locale
Full Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Song | R | PCF (12,14,16,20,24) | GBK |
The following table shows the TrueType fonts for the zh_CN.GB18030 locale.
Table 4–6 TrueType Fonts for the zh_CN.GB18030 Locale
Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
FangSong | R | TrueType | FangZheng | GB18030–2000 |
Song | R | TrueType | FangZheng | GB18030–2000 |
Hei | R | TrueType | FangZheng | GB18030–2000 |
Kai | R | TrueType | FangZheng | GB18030–2000 |
The following table shows bitmap fonts for the zh_CN.GB18030 locale.
Table 4–7 Bitmap Fonts for the zh_CN.GB18030 Locale
Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Song | R | PCF(12,14,16,20,24) | GB18030–2000 |
The following table shows the supported codeset conversions for Simplified Chinese.
Table 4–8 Codeset Conversions for Simplified Chinese
Code | Symbol | Target Code | Symbol |
---|---|---|---|
GB2312-80 | zh_CN.euc | ISO 2022-7 | zh_CN.iso2022-7 |
GB2312-80 | zh_CN.euc | ISO 2022-CN | zh_CN.iso2022-CN |
GB2312-80 | zh_CN.euc | UTF-8 | UTF-8 |
GB18030 | zh_CN.gb18030 | UTF-8 | UTF-8 |
HZ-GB-2312 | HZ-GB-2312 | GB2312–80 | zh_CN.euc |
HZ-GB-2312 | HZ-GB-2312 | GBK | zh_CN.gbk |
HZ-GB-2312 | HZ-GB-2312 | UTF-8 | UTF-8 |
ISO2022-7 | zh_CN.iso2022-7 | GB2312-80 | zh_CN.euc |
ISO2022-CN | zh_CN.iso2022-CN | GB2312-80 | zh_CN.euc |
ISO2022-CN | zh_CN.iso2022-CN | UTF-8 | UTF-8 |
ISO2022-CN | zh_CN.iso2022-CN | zh.GBK | zh_CN.gbk |
UTF-8 | UTF-8 | GB2312-80 | zh_CN.euc |
UTF-8 | UTF-8 | GB18030 | zh_CN.gb18030 |
UTF-8 | UTF-8 | ISO2022-CN | zh_CN.iso2022-CN |
UTF-8 | UTF-8 | zh.GBK | zh_CN.gbk |
zh.GBK | zh_CN.gbk | ISO2022-CN | zh_CN.iso2022-CN |
zh.GBK | zh_CN.gbk | UTF-8 | UTF-8 |
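A few of the conversions in the preceding table can be mirrored in Python for illustration. This is a sketch under stated assumptions, not the Solaris iconv implementation: the codec names `gb2312`, `gbk`, `gb18030`, and `hz` are Python's own names rather than the Solaris iconv symbols, and `convert` is a hypothetical helper.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Decode `data` from the `source` codeset and re-encode it in `target`."""
    return data.decode(source).encode(target)


euc_bytes = "简体中文".encode("gb2312")                     # GB2312-80 in EUC form

utf8_bytes = convert(euc_bytes, "gb2312", "utf-8")          # GB2312-80 -> UTF-8
gbk_bytes = convert(utf8_bytes, "utf-8", "gbk")             # UTF-8 -> GBK
gb18030_bytes = convert(utf8_bytes, "utf-8", "gb18030")     # UTF-8 -> GB18030

hz_bytes = b"~{VPND~}"                                      # 7-bit HZ-GB-2312 text
from_hz = convert(hz_bytes, "hz", "utf-8")                  # HZ-GB-2312 -> UTF-8
```

Every conversion goes through a decode step followed by an encode step, so a character missing from the target codeset raises `UnicodeEncodeError` instead of being silently dropped.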
Traditional Chinese in the Solaris 9 product provides five locales:
zh_TW.EUC where the EUC scheme is used to encode the CNS11643.1992 codeset
zh_TW.BIG5 where the locale supports Big5
zh_TW.UTF-8 where the locale supports Unicode 3.1
zh_HK.BIG5HK where the locale supports Big5-HKSCS
zh_HK.UTF-8 where the locale supports Unicode 3.1
Traditional Chinese is used mostly in Taiwan and Hong Kong, China. The following input methods are supported in the zh_TW.EUC, zh_TW.BIG5, and zh_TW.UTF-8 locales:
New ChuYin
ChuYin
TsangChieh
Array
BoShiaMy
DaYi
JianYi
Cantonese
EUC NeiMa
Big5 NeiMa
English-Chinese
Optional codetable input methods (such as PinYin)
Input method auxiliary window support for Traditional Chinese
The following input methods are supported in the zh_HK.BIG5HK and zh_HK.UTF-8 locales:
ChuYin
TsangChieh
Array
BoShiaMy
DaYi
JianYi
Cantonese
BIG5+HKSCS NeiMa
English-Chinese
Optional codetable input methods (such as PinYin)
Input method auxiliary window support for Traditional Chinese
New ChuYin
The following table shows the Traditional Chinese TrueType Fonts for the zh_TW locales.
Table 4–9 Traditional Chinese TrueType Fonts for the zh_TW Locales
Full Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
Hei | R | TrueType | Hanyi | CNS11643.1992 |
Kai | R | TrueType | Hanyi | CNS11643.1992 |
Ming | R | TrueType | Hanyi | CNS11643.1992 |
The following table shows the Traditional Chinese bitmap fonts for the zh_TW locales.
Table 4–10 Traditional Chinese Bitmap Fonts for the zh_TW Locales
Full Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Ming | R | PCF (12,14,16,20,24) | CNS11643.1992 |
The following table shows the TrueType fonts for the zh_HK.BIG5HK locale.
Table 4–11 TrueType Fonts for the zh_HK.BIG5HK Locale
Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
Ming | R | TrueType | FangZheng | Big5–HKSCS |
Hei | R | TrueType | FangZheng | Big5–HKSCS |
Kai | R | TrueType | FangZheng | Big5–HKSCS |
The following table shows the bitmap fonts for the zh_HK.BIG5HK locale.
Table 4–12 Bitmap Fonts for the zh_HK.BIG5HK Locale
Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Ming | R | PCF(12,14,16,20,24) | Big5–HKSCS |
The following table shows the supported codeset conversions for Traditional Chinese.
Table 4–13 Codeset Conversions for Traditional Chinese
Code | Symbol | Target Code | Symbol |
---|---|---|---|
BIG5 | zh_TW-big5 | CNS 11643 | zh_TW-euc |
BIG5 | zh_TW-big5 | ISO2022-CN | zh_TW-iso2022-CN-EXT |
BIG5 | zh_TW-big5 | UTF-8 | UTF-8 |
BIG5+HKSCS | zh_HK.big5hk | UTF-8 | UTF-8 |
CNS 11643 | zh_TW-euc | BIG5 | zh_TW-big5 |
CNS 11643 | zh_TW-euc | UTF-8 | UTF-8 |
CNS 11643 | zh_TW-euc | ISO2022-7 | zh_TW-iso2022-7 |
CNS 11643 | zh_TW-euc | ISO2022-CN-EXT | zh_TW-iso2022-CN-EXT |
CNS 11643 | zh_TW-euc | UTF-8 | UTF-8 |
ISO2022-7 | zh_TW-iso2022-7 | CNS 11643 | zh_TW-euc |
ISO2022-7 | zh_TW-iso2022-7 | UTF-8 | UTF-8 |
ISO2022-CN | zh_TW-iso2022-CN-EXT | BIG5 | zh_TW-big5 |
ISO2022-CN-EXT | zh_TW-iso2022-CN-EXT | CNS 11643 | zh_TW-euc |
UTF-8 | UTF-8 | BIG5 | zh_TW-big5 |
UTF-8 | UTF-8 | BIG5+HKSCS | zh_HK.big5hk |
UTF-8 | UTF-8 | CNS 11643 | zh_TW-euc |
UTF-8 | UTF-8 | ISO 2022-7 | zh_TW-iso2022-7 |
This section describes Japanese locale-specific information.
Four Japanese locales, which support different character encodings, are available in the Solaris 9 environment. The ja and ja_JP.eucJP locales are based on the Japanese EUC. The ja_JP.eucJP locale conforms to the UI-OSF Japanese Environment Implementation Agreement Version 1.1 and the ja locale conforms to the traditional specification from earlier Solaris releases. The ja_JP.PCK locale is based on PC-Kanji code (known as Shift_JIS) and the ja_JP.UTF-8 is based on UTF-8.
See the eucJP(5) man page for a map between Japanese EUC and the character set. See the PCK(5) man page for the map between PC-Kanji code and the character set.
The supported Japanese character sets are:
JIS X 0201–1976
JIS X 0208–1990
JIS X 0212–1990
JIS X 0213–2000 (only characters defined in Unicode 3.1)
JIS X 0212–1990 is not supported in the ja_JP.PCK locale. JIS X 0213–2000 is supported in the ja_JP.UTF-8 locale only. Not all characters defined in the JIS X 0213–2000 are available. Only those characters defined in the Unicode 3.1 character set are available.
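The way these character sets map onto each encoding can be inspected with a short Python sketch. It assumes that Python's `euc_jp` and `shift_jis` codecs approximate the Japanese EUC and PCK codesets described here. The helper `encode_or_none` and the sample characters are chosen only for illustration.

```python
def encode_or_none(char: str, codec: str):
    """Return the encoded bytes, or None if `codec` cannot represent `char`."""
    try:
        return char.encode(codec)
    except UnicodeEncodeError:
        return None


# One sample character per character set.
samples = {
    "JIS X 0201 kana": "ｱ",   # U+FF71, halfwidth katakana
    "JIS X 0208": "漢",
    "JIS X 0212": "丂",       # U+4E02, defined in JIS X 0212 but not JIS X 0208
}

# Byte layout of each sample in the three encodings.
layout = {
    name: {codec: encode_or_none(char, codec)
           for codec in ("euc_jp", "shift_jis", "utf-8")}
    for name, char in samples.items()
}
```

In this sketch the EUC form marks halfwidth kana with a leading 0x8E byte and JIS X 0212 characters with a leading 0x8F byte, while the Shift_JIS codec has no slot for the JIS X 0212 sample.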
Vendor-defined characters (VDC) and user-defined characters (UDC) are also supported. VDCs occupy unused (reserved) code points of JIS X 0208–1990 or JIS X 0212–1990. UDCs occupy the same code points as VDCs, except those code points allocated for VDCs.
Three Japanese font formats are supported: bitmap, TrueType and Type1. The Japanese Type1 font includes only JIS X 0212 for printing. The Type1 font is also used by UDC.
Japanese bitmap fonts are described in the following table.
Table 4–14 Japanese Bitmap Fonts
Full Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
sun gothic | R, B | PCF(12,14,16,20,24) | | JIS X 0208–1983, JIS X 0201–1976 |
sun minchou | R | PCF(12,14,16,20,24) | | JIS X 0208–1983, JIS X 0201–1976 |
ricoh hg gothic b | R | PCF(10,12,14,16,18,20,24) | RICOH | JIS X 0208–1983, JIS X 0201–1976 |
ricoh hg mincho l | R | PCF(10,12,14,16,18,20,24) | RICOH | JIS X 0208–1983, JIS X 0201–1976 |
ricoh gothic | R | PCF(10,12,14,16,18,20,24) | RICOH | JIS X 0212–1990, JIS X 0213–2000 |
ricoh mincho | R | PCF(10,12,14,16,18,20,24) | RICOH | JIS X 0212–1990, JIS X 0213–2000 |
ricoh heiseimin | R | PCF(12,14,16,18,20,24) | RICOH | JIS X 0212–1990 |
Japanese TrueType fonts are described in the following table.
Table 4–15 Japanese TrueType Fonts
Full Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
ricoh hg gothic b | Fixed | TrueType | RICOH | JIS X 0208–1983, JIS X 0201–1976 |
ricoh hg mincho l | Fixed | TrueType | RICOH | JIS X 0208–1983, JIS X 0201–1976 |
ricoh gothic | Fixed, Proportional | TrueType | RICOH | JIS X 0201–1976, JIS X 0208–1983, JIS X 0213–2000 |
ricoh mincho | Fixed, Proportional | TrueType | RICOH | JIS X 0201–1976, JIS X 0208–1983, JIS X 0213–2000 |
ricoh heiseimin | Fixed | TrueType | RICOH | JIS X 0212–1990 |
ATOK12 is the default Japanese input system in the Solaris 9 environment. It is available for all Japanese locales and all UTF-8 locales when the Japanese locale is installed. The Wnn6 Japanese input system is also available for all Japanese locales. You can switch input systems from the Workspace menu. For Japanese Solaris 1.x BCP support, the kkcv Japanese input system is available.
The following example describes how to enter Japanese text using ATOK12.
Turn conversion mode on by pressing Control + spacebar.
Type Kana character text (for example kanjihenkan).
Convert to kanji character by pressing the spacebar.
To display other kanji characters, press the space bar to display the conversion candidate table. Type the number you want to select.
To commit the entire text to kanji character text, press return.
Press the down arrow key to commit only selected characters.
Turn conversion mode off by pressing Control + spacebar.
Using Japanese locales on a character-based terminal (TTY) requires that you use terminal settings to make line editing work correctly.
If your terminal is a CDE Terminal emulator (dtterm), use stty(1) with argument -defeucw in any Japanese locale (ja, ja_JP.PCK, or ja_JP.UTF-8). An example in locale ja is:
% setenv LANG ja
% stty defeucw
If your terminal is not a CDE Terminal emulator, but the codeset of your terminal is the same as that of the current locale, use stty(1) with argument -defeucw.
If your terminal's codeset doesn't match that of the current locale, use setterm(1) to enable code conversion. For example, if you are in locale ja but your terminal requires PCK (Shift_JIS code), specify:
% setenv LANG ja
% setterm -x PCK
See the setterm(3CURSES) man page for details.
Several Japanese codeset conversions are supported with iconv(1) and iconv(3). See the iconv_ja(5) man page for details.
The user-defined character utility sdtudctool handles both outline (Type1) and bitmap (PCF) fonts. Some utilities are also available to migrate the UDC fonts that were created by old utilities in prior releases, such as fontedit, type3creator, and fontmanager.
The following components are only available in the Japanese full locale environment with the Language CD:
Translated message, help, and man pages
Wnn6 Japanese input system
Japanese Solaris 1.x BCP support
Mincho (min*) typeface bitmap fonts
JIS X 0212 Type1 fonts for printing
Japanese-specific dumb printer and jpostprint support
Legacy Japanese utilities such as kanji(1)
In December 1995, the Korean government announced a standard Korean codeset, KS X 1005–1, which is based on ISO 10646-1/Unicode 2.0.
The ISO-10646 character set uses two universal character sets:
UCS-2. Universal Character Set (two-byte form)
UCS-4. Universal Character Set (four-byte form).
The ISO-10646 character set cannot be used directly on IBM PC-based operating systems. For example, the kernel and many other modules of the Solaris operating environment interpret certain byte values as control instructions, such as a null character (0x00) in any string. The ISO-10646 character set can be encoded with any bit combinations in the first or subsequent bytes. The ISO-10646 characters cannot be freely transmitted through the Solaris system with these limitations.
In order to establish a migration path, the ISO-10646 character set defines the UCS Transformation Format (UTF), which recodes the ISO-10646 characters without using C0 controls (0x00..0x1F), C1 controls (0x80..0x9F), space (0x20), and DEL (0x7F).
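The null-byte problem and the way a transformation format avoids it can be seen in a short Python sketch. It uses UTF-8, the transformation format behind the Solaris UTF-8 locales, and assumes that Python's `utf-16-be` codec matches the two-byte UCS-2 form for these sample characters.

```python
text = "A한"   # one ASCII letter and one Hangul syllable

ucs2 = text.encode("utf-16-be")   # two bytes per character here, as in UCS-2
utf8 = text.encode("utf-8")

# The two-byte form of an ASCII letter contains a 0x00 byte, which
# byte-oriented code would treat as a string terminator.
ucs2_has_nul = 0x00 in ucs2
utf8_has_nul = 0x00 in utf8

# In UTF-8, every byte of a multibyte character is 0x80 or above, so it can
# never be mistaken for NUL, another ASCII control, space, or DEL.
hangul_bytes = "한".encode("utf-8")
all_high = all(byte >= 0x80 for byte in hangul_bytes)
```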
The ko.UTF-8 locale is a Solaris locale that supports KS X 1005–1, the Korean standard codeset. This locale supports all characters in the previous KS X 1005 and all 11,172 Korean characters. Korean UTF-8 supports the Korean language-related ISO-10646 characters and fonts. Because ISO-10646 covers all characters in the world, all of the various input methods and fonts are supplied so that you can input and output any character in any language. Until Universal UTF/UCS becomes available, Korean UTF-8 supports the ISO-10646 code subset that is related to Korean characters, all other characters in the previous Korean standard codeset, and extended ASCII.
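The figure of 11,172 Korean characters follows from the way Unicode arranges precomposed Hangul syllables: 19 initial consonants, 21 vowels, and 28 final positions (27 final consonants plus "no final"), laid out contiguously from U+AC00. The Python sketch below illustrates that arithmetic. The function `compose` and the constant names are hypothetical, and the sketch does not describe how the Solaris input methods are implemented.

```python
HANGUL_BASE = 0xAC00
INITIALS, MEDIALS, FINALS = 19, 21, 28   # FINALS includes "no final consonant"


def compose(initial: int, medial: int, final: int = 0) -> str:
    """Build a precomposed Hangul syllable from zero-based jamo indexes."""
    return chr(HANGUL_BASE + (initial * MEDIALS + medial) * FINALS + final)


total = INITIALS * MEDIALS * FINALS   # number of precomposed syllables
han = compose(18, 0, 4)               # initial ㅎ, vowel ㅏ, final ㄴ
last = compose(18, 20, 27)            # last syllable in the block
```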
In the ko locale, the EUC scheme is used to encode KS X 1001. The ko.UTF-8 locale supports the KS X 1005–1/Unicode 2.0 codeset, which is a superset of KS X 1001. These two locales look the same to the end user, but the internal character encoding is different. The Korean Solaris product supports the following input methods:
For the ko locale:
Hangul 2–BeolSik (one set of consonants and one set of vowels)
Hangul-Hanja conversion
Special character
Hexadecimal code
For the ko.UTF-8 locale:
Hangul 2–BeolSik (one set of consonants and one set of vowels)
Hangul-Hanja conversion
Special character
Hexadecimal code
The following table shows the Korean bitmap fonts for the ko locale.
Table 4–16 Solaris 9 Korean Bitmap Fonts for the ko Locale
Full Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Gothic | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
Graphic | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
Haeso | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
Kodig | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
Myeongijo | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
Pilki | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
Round gothic | R/B | PCF (12,14,16,18,20,24) | KS X 1001 |
The following table shows the Korean bitmap fonts for the ko.UTF-8 locale.
Table 4–17 Solaris 9 Korean Bitmap Fonts for the ko.UTF-8 Locale
Full Family Name | Subfamily | Format | Encoding |
---|---|---|---|
Gothic | R/B | PCF (12,14,16,18,20,24) | KS X 1001 (Johap) |
Graphic | R/B | PCF (12,14,16,18,20,24) | KS X 1001 (Johap) |
Haeso | R/B | PCF (12,14,16,18,20,24) | KS X 1001 (Johap) |
Kodig | R/B | PCF (12,14,16,18,20,24) | KS X 1001 (Johap) |
Myeongijo | R/B | PCF (12,14,16,18,20,24) | KS X 1001 (Johap) |
Pilki | R/B | PCF (12,14,16,18,20,24) | KS X 1001 (Johap) |
The following table shows the Korean TrueType Fonts for the ko/ko.UTF-8 locales.
Table 4–18 Solaris 9 Korean TrueType Fonts for the ko/ko.UTF-8 Locales
Full Family Name | Subfamily | Format | Vendor | Encoding |
---|---|---|---|---|
Kodig/Gothic | R | TrueType | Hanyang | Unicode |
Myeongijo | R | TrueType | Hanyang | Unicode |
Haeso | R | TrueType | Hanyang | Unicode |
Round gothic | R | TrueType | Hanyang | Unicode |
The following table shows the supported Korean iconv codeset conversions.
Table 4–19 Korean iconv
Code | Symbol | Target Code | Symbol |
---|---|---|---|
IBM CP933 | cp933 | UTF-8 (Unicode 2.0) | ko_KR-UTF-8 |
ISO646 | 646 | KS X 1001 | 5601 |
ISO2022-KR | iso2022-7 | KS X 1001 | ko_KR-euc |
ISO2022-KR | iso2022-7 | UTF-8 (Unicode 2.0) | ko_KR-UTF-8 |
KS X 1001 | 5601 | UTF-8 | UTF-8 |
KS X 1001 | EUC-KR | UTF-8 | UTF-8 |
KS X 1001 | KSC5601 | UTF-8 | UTF-8 |
KS X 1001 | ko_KR-euc | UTF-8 (Unicode 2.0) | ko_KR-UTF-8 |
KS X 1001 | ko_KR-euc | ISO2022-KR | ko_KR-iso2022-7 |
KS X 1001 | ko_KR-euc | KS X 1001 | ko_KR-johap |
KS X 1001 | ko_KR-euc | KS X 1001 | ko_KR-johap92 |
KS X 1001 | ko_KR-euc | KS X 1001 | ko_KR-nbyte |
KS X 1001 | ko-KR-nbyte | KS X 1001 | ko_KR-euc |
KS X 1001 | ko-KR-johap | UTF-8 (Unicode 2.0) | ko_KR-UTF-8 |
KS X 1001 | ko-KR-johap | KS X 1001 | ko_KR-euc |
KS X 1001 | ko-KR-johap92 | UTF-8 (Unicode 2.0) | ko_KR-UTF-8 |
KS X 1001 | ko-KR-johap92 | KS X 1001 | ko_KR-euc |
UTF-8 | UTF-8 | KS X 1001 | 5601 |
UTF-8 | UTF-8 | KS X 1001 | EUC-KR |
UTF-8 | UTF-8 | KS X 1001 | KSC5601 |
UTF-8 | ko-KR-UTF-8 | IBM CP 933 | cp 933 |
UTF-8 | ko-KR-UTF-8 | KS X 1001 | ko_KR-euc |
UTF-8 | ko-KR-UTF-8 | ISO2022-KR | ko_KR-iso2022-7 |
UTF-8 | ko-KR-UTF-8 | KS X 1001 | ko_KR-johap |
UTF-8 | ko-KR-UTF-8 | KS X 1001 | ko_KR-johap92 |