Solaris Internationalization Guide For Developers

Chapter 3 Contents of the Localized Solaris 7 Products

The European Localized Solaris 7 Product

European Solaris is available in three localized versions: French, German, and European. All three versions of Solaris share the same software media, which includes a fully localized CDE environment, error messages, and on-line documentation in six languages--French, German, Spanish, Swedish, Italian, and English. The difference is in the printed documentation. The French and German Solaris products include localized printed documentation, while the printed documentation for the European version is in English only.

Table 3-1 shows a list of locales in the European product. This includes both full and partial locales.

Table 3-1 European 7 Locales

Locale Name         Language/Territory
C                   POSIX English (7-bit) ASCII C
cz                  Czech Republic
da                  Denmark
de                  Germany
de_AT               Austria
de_CH               Switzerland
de.ISO8859-15       Germany
el                  Greece
en_AU               Australia
en_CA               Canada
en_IE               Ireland
en_NZ               New Zealand
en_UK               Great Britain
en_US               U.S.
es                  Spain
es_AR               Argentina
es_BO               Bolivia
es_CL               Chile
es_CO               Colombia
es_CR               Costa Rica
es_EC               Ecuador
es_GT               Guatemala
es_MX               Mexico
es_NI               Nicaragua
es_PA               Panama
es_PE               Peru
es_PY               Paraguay
es_SV               El Salvador
es_UY               Uruguay
es_VE               Venezuela
et                  Estonia
fr                  France
fr_BE               Belgium (French)
fr_CA               Canada (French)
fr_CH               Switzerland (French)
fr.ISO8859-15       France
fr.UTF-8            France
hu                  Hungary
it.ISO8859-15       Italy
it.UTF-8            Italy
lt.ISO8859-13       Lithuania
lv.ISO8859-13       Latvia
nl                  Netherlands
nl_BE               Netherlands/Belgium
no                  Norway
pl                  Poland
pt_BR               Portuguese Brazil
ru                  Russia
es.ISO8859-15       Spain
sv.ISO8859-15       Sweden
en_EU.ISO8859-15    Europe
en_GB.ISO8859-15    Britain
fr_BE.ISO8859-15    Belgium
nl.ISO8859-15       Netherlands
nl_BE.ISO8859-15    Belgium
pt.ISO8859-15       Portugal
de_AT.ISO8859-15    Austria
en_IE.ISO8859-15    Ireland
da.ISO8859-15       Denmark
fi.ISO8859-15       Finland
el_EURO             Greece
sun_eu_greek        Greece
de.UTF-8            Germany
es.UTF-8            Spain
sv.UTF-8            Sweden
en_EU.UTF-8         Europe

All of these locales are also present in the base Solaris 7 release.

As mentioned, the locales include partial locales. A partial locale is based on the core locale for its main language. For example, fr_CA (Canadian French) is based on the fr (French) locale. A partial locale uses the messages delivered in its parent locale (fr for fr_CA). If a locale has not been fully localized, it may contain only English messages.

A number of Eastern European locales have also been added to the Solaris 7 product. Previously, Sun locales were based on ISO-8859-1. The Eastern European locales are based on other ISO standards, as shown in Table 3-2.

Locales that are not listed are still based on ISO-8859-1.

Table 3-2 Eastern European Locales in the Solaris 7 Product

Locale Name   Language/Territory   ISO
de_AT         German (Austrian)    8859-1
et            Estonian             8859-15
cz            Czech                8859-2
hu            Hungarian            8859-2
pl            Polish               8859-2
lv            Latvian              8859-13
lt            Lithuanian           8859-13
ru            Russian              8859-5
el            Greek                8859-7
tr            Turkish              8859-9
sq_AL         Albanian             8859-2
sk_SK         Slovakian            8859-2
sl_SI         Slovenian            8859-2
hr_HR         Croatian             8859-2
nr            Bosnian              8859-2
ro_RO         Romanian             8859-2
sr_SP         Serbian              8859-5
bg_BG         Bulgarian            8859-5
mk_MK         Macedonian           8859-5
ru.KOI8-R     Russian              KOI8-R
ar            Arabic               8859-6
he            Hebrew               8859-8
th_TH         Thai                 8859-11 (TIS 620.2533)

All of the locales support character input and output. There is also iconv support for many of the major codesets; for more on iconv, see the iconv(1) man page. The iconv modules are available on the end-user cluster of the Euro product. See Table 3-3 for details.

Table 3-3 iconv Support

Code           Symbol   Target Code    Symbol   Comment
ISO 8859-2     iso2     MS 1250        win2     Windows Latin 2
ISO 8859-2     iso2     MS 852         dos2     MS-DOS Latin 2
ISO 8859-2     iso2     Mazovia        maz      Mazovia
ISO 8859-2     iso2     DHN            dhn      Dom Handlowy Nauki
MS 1250        win2     ISO 8859-2     iso2     ISO Latin 2
MS 1250        win2     MS 852         dos2     MS-DOS Latin 2
MS 1250        win2     Mazovia        maz      Mazovia
MS 1250        win2     DHN            dhn      Dom Handlowy Nauki
MS 852         dos2     ISO 8859-2     iso2     ISO Latin 2
MS 852         dos2     MS 1250        win2     Windows Latin 2
MS 852         dos2     Mazovia        maz      Mazovia
MS 852         dos2     DHN            dhn      Dom Handlowy Nauki
Mazovia        maz      ISO 8859-2     iso2     ISO Latin 2
Mazovia        maz      MS 1250        win2     Windows Latin 2
Mazovia        maz      MS 852         dos2     MS-DOS Latin 2
Mazovia        maz      DHN            dhn      Dom Handlowy Nauki
DHN            dhn      ISO 8859-2     iso2     ISO Latin 2
DHN            dhn      MS 1250        win2     Windows Latin 2
DHN            dhn      MS 852         dos2     MS-DOS Latin 2
DHN            dhn      Mazovia        maz      Mazovia
ISO 8859-5     iso5     KOI8-R         koi8     KOI8-R
ISO 8859-5     iso5     PC Cyrillic    alt      Alternative PC Cyrillic
ISO 8859-5     iso5     MS 1251        win5     Windows Cyrillic
ISO 8859-5     iso5     Mac Cyrillic   mac      Macintosh Cyrillic
KOI8-R         koi8     ISO 8859-5     iso5     ISO 8859-5 Cyrillic
KOI8-R         koi8     PC Cyrillic    alt      Alternative PC Cyrillic
KOI8-R         koi8     MS 1251        win5     Windows Cyrillic
KOI8-R         koi8     Mac Cyrillic   mac      Macintosh Cyrillic
PC Cyrillic    alt      ISO 8859-5     iso5     ISO 8859-5 Cyrillic
PC Cyrillic    alt      KOI8-R         koi8     KOI8-R
PC Cyrillic    alt      MS 1251        win5     Windows Cyrillic
PC Cyrillic    alt      Mac Cyrillic   mac      Macintosh Cyrillic
MS 1251        win5     ISO 8859-5     iso5     ISO 8859-5 Cyrillic
MS 1251        win5     KOI8-R         koi8     KOI8-R
MS 1251        win5     PC Cyrillic    alt      Alternative PC Cyrillic
MS 1251        win5     Mac Cyrillic   mac      Macintosh Cyrillic
Mac Cyrillic   mac      ISO 8859-5     iso5     ISO 8859-5 Cyrillic
Mac Cyrillic   mac      KOI8-R         koi8     KOI8-R
Mac Cyrillic   mac      PC Cyrillic    alt      Alternative PC Cyrillic
Mac Cyrillic   mac      MS 1251        win5     Windows Cyrillic

Table 3-4 contains a list of the Solaris 7 environment locales and their corresponding codeset names.

Table 3-4 New Locales and Corresponding Codeset Names

Locale nl_langinfo(CODESET) ICONV name Product

ar ISO8859-6 ISO8859-6 Base/Euro
bg_BG  ISO8859-5 ISO8859-5 Base/Euro
C 646 646 Base/Euro
cz ISO8859-2 ISO8859-2 Base/Euro
da ISO8859-1 ISO8859-1 Base/Euro
da.ISO8859-15  ISO8859-15 ISO8859-15 Base/Euro
de  ISO8859-1 ISO8859-1 Base/Euro
de.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
de.UTF-8 UTF-8 UTF-8 Base/Euro
de_AT ISO8859-1 ISO8859-1 Base/Euro
de_AT.ISO8859-15  ISO8859-15 ISO8859-15 Base/Euro
de_CH ISO8859-1 ISO8859-1 Base/Euro
el ISO8859-7 ISO8859-7 Base/Euro
el.sun_eu_greek  ISO8859-15 ISO8859-15 Base/Euro
en_AU  ISO8859-1 ISO8859-1 Base/Euro
en_CA  ISO8859-1 ISO8859-1 Base/Euro
en_EU.ISO8859-15 ISO8859-15 ISO8859-1 Base/Euro
en_EU.UTF-8 UTF-8 UTF-8 Base/Euro
en_GB ISO8859-1 ISO8859-1 Base/Euro
en_GB.ISO8859-15 ISO8859-15 ISO8859-1 Base/Euro
en_IE ISO8859-1 ISO8859-1 Base/Euro
en_IE.ISO8859-15 ISO8859-15 ISO8859-1 Base/Euro
en_NZ ISO8859-1 ISO8859-1 Base/Euro
en_US ISO8859-1 ISO8859-1 Base/Euro
en_US.UTF-8 UTF-8 UTF-8 Base/Euro
es ISO8859-1 ISO8859-1 Base/Euro
es.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
es_AR ISO8859-1 ISO8859-1 Base/Euro
es_BO ISO8859-1 ISO8859-1 Base/Euro
es_CL ISO8859-1 ISO8859-1 Base/Euro
es_CO ISO8859-1 ISO8859-1 Base/Euro
es_CR ISO8859-1 ISO8859-1 Base/Euro
es_EC ISO8859-1 ISO8859-1 Base/Euro
es_GT ISO8859-1 ISO8859-1 Base/Euro
es_MX ISO8859-1 ISO8859-1 Base/Euro
es_NI ISO8859-1 ISO8859-1 Base/Euro
es_PA ISO8859-1 ISO8859-1 Base/Euro
es_PE ISO8859-1 ISO8859-1 Base/Euro
es_PY ISO8859-1 ISO8859-1 Base/Euro
es_SV ISO8859-1 ISO8859-1 Base/Euro
es.UTF-8 UTF-8 UTF-8 Base/Euro
es_UY ISO8859-1 ISO8859-1 Base/Euro
es_VE ISO8859-1 ISO8859-1 Base/Euro
et ISO8859-1 ISO8859-1 Base/Euro
fi ISO8859-1 ISO8859-1 Base/Euro
fi.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
fr ISO8859-1 ISO8859-1 Base/Euro
fr.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
fr.UTF-8 UTF-8 UTF-8 Base/Euro
fr_BE ISO8859-1 ISO8859-1 Base/Euro
fr_BE.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
fr_CA ISO8859-1 ISO8859-1 Base/Euro
fr_CH ISO8859-1 ISO8859-1 Base/Euro
he ISO8859-8 ISO8859-8 Base/Euro
he_IL ISO8859-8 ISO8859-8 Base/Euro
hr_HR ISO8859-2 ISO8859-2 Base/Euro
hu ISO8859-2 ISO8859-2 Base/Euro
it ISO8859-1 ISO8859-1 Base/Euro
it.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
it.UTF-8 UTF-8 UTF-8 Base/Euro
ja eucJP eucJP Japanese
ja_JP.PCK PCK PCK Japanese
ja_JP.UTF-8 UTF-8 UTF-8 Japanese
ko 5601 ko_KR-euc Korean
ko.UTF-8 UTF-8 UTF-8 Korean
lt ISO8859-4 ISO8859-4 Base/Euro
lv ISO8859-4 ISO8859-4 Base/Euro
mk_MK ISO8859-5 ISO8859-5 Base/Euro
nl ISO8859-1 ISO8859-1 Base/Euro
nl.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
nl_BE ISO8859-1 ISO8859-1 Base/Euro
nl_BE.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
no ISO8859-1 ISO8859-1 Base/Euro
no_NY ISO8859-1 ISO8859-1 Base/Euro
nr ISO8859-2 ISO8859-2 Base/Euro
pl ISO8859-2 ISO8859-2 Base/Euro
POSIX 646 646 Base/Euro
pt ISO8859-1 ISO8859-1 Base/Euro
pt.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
pt_BR ISO8859-1 ISO8859-1 Base/Euro
ro_RO ISO8859-2 ISO8859-2 Base/Euro
ru ISO8859-5 ISO8859-5 Base/Euro
ru.KOI8-R KOI8-R KOI8-R Base/Euro
sk_SK ISO8859-2 ISO8859-2 Base/Euro
sl_SI ISO8859-2 ISO8859-2 Base/Euro
sq_AL ISO8859-2 ISO8859-2 Base/Euro
sr_SP ISO8859-5 ISO8859-5 Base/Euro
sv ISO8859-1 ISO8859-1 Base/Euro
sv.ISO8859-15 ISO8859-15 ISO8859-15 Base/Euro
sv.UTF-8 UTF-8 UTF-8 Base/Euro
th_TH TIS620.2533 TIS620.2533 Base/Euro
tr ISO8859-9 ISO8859-9 Base/Euro
zh gb2312 gb2312 Simplified Chinese
zh.GBK GBK zh_CN.gbk Simplified Chinese
zh_TW cns11643 zh_TW-euc Traditional Chinese
zh_TW.BIG5 BIG5 zh_TW_Big5 Traditional Chinese


Note -

Locale naming conventions are as follows:

language[_territory][.codeset] where language is from ISO639 and territory is from ISO3166.

All locales with Base/Euro in the Product column are also available as Japanese, Korean, Simplified Chinese, and Traditional Chinese products.

All Solaris product locales preserve the Portable Character Set characters with US-ASCII code values.
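
The naming convention above can be illustrated with a short Python sketch that splits a locale name into its parts. The function name parse_locale_name and the LocaleName type are hypothetical names chosen for this illustration; they are not part of Solaris. The sketch treats names such as C and POSIX as a bare language component, and it does not check the parts against ISO 639 or ISO 3166.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Parts of a name of the form language[_territory][.codeset]."""
    language: str
    territory: Optional[str]
    codeset: Optional[str]


# language, then optional "_territory", then optional ".codeset"
_LOCALE_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z]+))?"
    r"(?:\.(?P<codeset>.+))?$"
)


def parse_locale_name(name: str) -> LocaleName:
    """Split a locale name into language, territory, and codeset."""
    match = _LOCALE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a language[_territory][.codeset] name: {name!r}")
    return LocaleName(match["language"], match["territory"], match["codeset"])
```

For example, parse_locale_name("fr_BE.ISO8859-15") yields the language fr, the territory BE, and the codeset ISO8859-15, while a malformed name such as es-NI raises ValueError.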



Note -

5601 signifies the Korean EUC codeset containing KS C 5636 and KS C 5601-1987.

646 signifies ISO/IEC 646, which is US-ASCII.

eucJP signifies the Japanese EUC codeset. It contains JIS X0201-1976, JIS X0208-1983, and JIS X0212-1990.

gb2312 signifies Simplified Chinese EUC codeset, which contains GB 1988-80 and GB 2312-80.

PCK is also known as Shift JIS (SJIS).

UTF-8 is the UTF-8 of ISO/IEC 10646-1 containing various approved amendments and Unicode 2.1.

GBK signifies GB extensions. This includes all GB 2312-80 characters and all Unified Han characters of ISO/IEC 10646-1, as well as Japanese Hiragana and Katakana characters. It also includes many characters of Chinese, Japanese, and Korean character sets and of ISO/IEC 10646-1.
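
The nl_langinfo(CODESET) column in Table 3-4 can be checked from a program. The following Python sketch, which assumes a UNIX-like system where the locale module provides nl_langinfo, reports the codeset name of the locale selected by the environment. The function name current_codeset is a hypothetical name used only for this illustration, and the value returned depends on the system and on variables such as LANG and LC_ALL.

```python
import locale


def current_codeset() -> str:
    """Return the codeset name of the locale selected by the environment."""
    try:
        # An empty string selects the locale from LANG, LC_ALL, and so on.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed; fall back to the C locale.
        locale.setlocale(locale.LC_ALL, "C")
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    print(current_codeset())
```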


Font Formats

There are many different font formats. The file name extension identifies the font type.

Location of Fonts on the System

Fonts are located at:

/usr/openwin/lib/locale/iso_8859_x/X11/fonts/X11/Type1/afm

or

/usr/openwin/lib/locale/iso_8859_x/X11/fonts/X11/75dpi

Adding and Removing Font Packages

To manually add or remove font packages, follow this order:

  1. When you are adding font packages to the system, always add the required font packages before the optional font packages.

  2. When you are removing font packages from the system, remove the optional font packages first.

You must follow this procedure to add or remove fonts. The class action scripts in the font packages depend on this for proper function. The optional font packages contain scripts that concatenate information onto the required font packages that are already resident on the system. If the required font packages are not there, problems may occur.

Summary of Asian Locales

Table 3-5 shows the Asian locales supported by these Asian products.

Table 3-5 Summary of Asian Locales

CD Set                Locale Name                     Description                                              Supported Character Set
Korean                ko, ko.UTF-8                    Korean (UTF-8 locale)                                    KS C 5601-1992, KS C 5700-1995
Simplified Chinese    zh, zh.GBK                      Simplified Chinese (EUC), Simplified Chinese (GBK)       GB 2312-1980, GBK
Traditional Chinese   zh_TW, zh_TW.BIG5               Traditional Chinese (EUC), Traditional Chinese (BIG5)    CNS 11643 1992, BIG5
Japanese              ja, ja_JP.PCK, ja_JP.UTF-8      Japanese EUC, Japanese PCK [1], Japanese UTF-8           JIS X 0201-1976, JIS X 0208-1990, JIS X 0212-1990, VDC [2], UDC [3]

[1] ja_JP.PCK doesn't support JIS X 0212-1990.

[2] VDC: Vendor Defined Character. VDCs occupy unused (reserved) code points of JIS X 0208-1990 or JIS X 0212-1990.

[3] UDC: User Defined Character. UDCs occupy unused (reserved) code points of JIS X 0208-1990 or JIS X 0212-1990 (also unused for VDCs).

Korean in the Solaris 7 Product

In December 1995, the Korean government announced a standard Korean codeset, KSC-5700, which is based on ISO-10646-1/Unicode 2.0. The standard codeset replaces KSC 5601, which was based on ISO-2022.

The ISO-10646 character set uses 2 (UCS-2; Universal Character Set two-byte form) or 4 (UCS-4) bytes to represent each character.

The ISO-10646 character set cannot be used directly on IBM-PC-based operating systems. For example, the kernel and many other modules of the Solaris operating environment interpret certain byte values as control instructions, such as a null character (0x00) in any string. The ISO-10646 character set can be encoded with any bit combinations in the first or subsequent bytes. The ISO-10646 characters cannot be freely transmitted through the Solaris system with these limitations. In order to establish a migration path, the ISO-10646 character set defines the UCS Transformation Format (UTF), which recodes the ISO-10646 characters without using C0 controls (0x00..0x1F), C1 controls (0x80..0x9F), space (0x20), and DEL (0x7F).
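
The null-byte problem described above can be seen in a short Python sketch. It encodes the same two characters in the two-byte form and in UTF-8 and checks each result for a 0x00 byte. The codec name utf-16-be stands in for UCS-2 here, which is an assumption that holds only for characters in the Basic Multilingual Plane, and the function name has_null_byte is a hypothetical name used for this illustration. The sketch checks only the null-byte point and not the other byte ranges mentioned above.

```python
def has_null_byte(data: bytes) -> bool:
    """Return True if the byte string contains a 0x00 byte."""
    return 0 in data


# Latin capital letter A followed by the Hangul syllable U+D55C.
text = "A\ud55c"

# Two bytes per character; the ASCII letter becomes 0x00 0x41.
ucs2_bytes = text.encode("utf-16-be")

# UTF-8 keeps the ASCII letter as the single byte 0x41.
utf8_bytes = text.encode("utf-8")

print(ucs2_bytes.hex(), has_null_byte(ucs2_bytes))
print(utf8_bytes.hex(), has_null_byte(utf8_bytes))
```

The two-byte form contains a 0x00 byte that C string handling would treat as the end of the string, while the UTF-8 form of the same text does not.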

ko.UTF-8 is the Solaris locale that supports KSC-5700, the Korean standard codeset. It supports all characters in the previous KSC 5601 and all 11,172 Korean characters. Korean UTF-8 supports the Korean language-related ISO-10646 characters and fonts. Because ISO-10646 covers all characters in the world, all of the various input methods and fonts are supplied so that you may input and output any character in any language. Until universal UTF/UCS becomes available, Korean UTF-8 supports the ISO-10646 code subset that is related to Korean characters, all other characters in the previous Korean standard codeset, and Extended ASCII.

Table 3-6 lists the codeset conversions supported for Korean.

Table 3-6 Codeset Conversions Supported for Korean ko, ko.UTF-8

Code          Symbol            Target Code   Symbol
UTF-8         ko_KR-UTF-8       Wansung       ko_KR-euc
UTF-8         ko_KR-UTF-8       Johap         ko_KR-johap92
UTF-8         ko_KR-UTF-8       Packed        ko_KR-johap
UTF-8         ko_KR-UTF-8       ISO-2022-KR   ko_KR-iso2022-7
Wansung       ko_KR-euc         UTF-8         ko_KR-UTF-8
Johap         ko_KR-johap92     UTF-8         ko_KR-UTF-8
Packed        ko_KR-johap       UTF-8         ko_KR-UTF-8
ISO-2022-KR   ko_KR-iso2022-7   UTF-8         ko_KR-UTF-8
Wansung       ko_KR-euc         Johap         ko_KR-johap92
Wansung       ko_KR-euc         Packed        ko_KR-johap
Wansung       ko_KR-euc         N-Byte        ko_KR-nbyte
Wansung       ko_KR-euc         ISO-2022-KR   ko_KR-iso2022-7
Johap         ko_KR-johap92     Wansung       ko_KR-euc
Packed        ko_KR-johap       Wansung       ko_KR-euc
N-Byte        ko_KR-nbyte       Wansung       ko_KR-euc
ISO-2022-KR   ko_KR-iso2022-7   Wansung       ko_KR-euc

Chinese: Simplified and Traditional

Simplified Chinese in the Solaris 7 environment provides two locales: zh and zh.GBK. In the zh locale, the EUC scheme is used to encode GB2312-80. The zh.GBK locale supports the GBK codeset, which is a superset of GB2312-80.

Simplified Chinese is used mostly in the People's Republic of China (PRC) and in Singapore.

The following input methods are supported for the zh locale

The following input methods are supported for the zh.GBK locale

Table 3-7 shows the TrueType Fonts for the zh Locale

Table 3-7 Solaris 7 TrueType Fonts for the zh Locale
 Full Family Name Subfamily Format Vendor Encoding
 Fangsong R TrueType Hanyi GB2312.1980
 Hei R TrueType Monotype GB2312.1980
 Kai R TrueType Monotype GB2312.1980
 Song R TrueType Monotype GB2312.1980

Table 3-8 shows the Bitmap Fonts for the zh Locale

Table 3-8 Solaris 7 Bitmap Fonts for the zh Locale
 Full Family Name Subfamily Format Encoding
 Song B PCF (14,16) GB2312.1980
 Song R PCF (12,14,16,20,24) GB2312.1980

Table 3-9 shows the TrueType Fonts for the zh.GBK Locale

Table 3-9 TrueType Fonts for the zh.GBK Locale
 Full Family Name Subfamily Format Vendor Encoding
 Fangsong R TrueType Zhongyi GBK
 Hei R TrueType Zhongyi GBK
 Kai R TrueType Zhongyi GBK
 Song R TrueType Zhongyi GBK

Table 3-10 shows the Bitmap Fonts for the zh.GBK Locale

Table 3-10 Bitmap Fonts for the zh.GBK Locale
 Full Family Name Subfamily Format Encoding
 Song R PCF (12,14,16,20,24) GBK

Table 3-11 shows the supported codeset conversions for Simplified Chinese.

Table 3-11 Codeset Conversions for Simplified Chinese

Code          Symbol             Target Code   Symbol
GB2312-80     zh_CN.euc          ISO 2022-7    zh_CN.iso2022-7
ISO 2022-7    zh_CN.iso2022-7    GB2312-80     zh_CN.euc
GB2312-80     zh_CN.euc          ISO 2022-CN   zh_CN.iso2022-CN
ISO 2022-CN   zh_CN.iso2022-CN   GB2312-80     zh_CN.euc
UTF-8         UTF-8              GB2312-80     zh_CN.euc
GB2312-80     zh_CN.euc          UTF-8         UTF-8
zh.GBK        zh_CN.gbk          ISO 2022-CN   zh_CN.iso2022-CN
ISO 2022-CN   zh_CN.iso2022-CN   zh.GBK        zh_CN.gbk
zh.GBK        zh_CN.gbk          Big-5         zh_TW-Big5
Big-5         zh_TW-Big5         zh.GBK        zh_CN.gbk
GB2312-80     zh_CN.euc          Big-5         zh_TW-Big5
Big-5         zh_TW-Big5         GB2312-80     zh_CN.euc
UTF-8         UTF-8              zh.GBK        zh_CN.gbk
zh.GBK        zh_CN.gbk          UTF-8         UTF-8
UTF-8         UTF-8              ISO 2022-CN   zh_CN.iso2022-CN
ISO 2022-CN   zh_CN.iso2022-CN   UTF-8         UTF-8

Traditional Chinese in the Solaris 7 product provides two locales: zh_TW and zh_TW.BIG5. In the zh_TW locale, the EUC scheme is used to encode the CNS 11643.1992 codeset. The zh_TW.BIG5 locale supports the Big-5 codeset.

Traditional Chinese is used mostly in Taiwan and Hong Kong.

Traditional Chinese supports the following input methods:

Table 3-12 Traditional Chinese TrueType Fonts for the zh_TW Locales
 Full Family Name Subfamily Format Vendor Encoding
 Hei R TrueType Hanyi CNS11643.1992
 Kai R TrueType Hanyi CNS11643.1992
 Ming R TrueType Hanyi CNS11643.1992

Table 3-13 shows the Traditional Chinese BitMap Fonts for the zh_TW Locales

Table 3-13 Traditional Chinese BitMap Fonts for the zh_TW Locales
 Full Family Name Subfamily Format Encoding
 Ming R PCF (12,14,16,20,24) CNS11643.1992

Table 3-14 shows the Traditional Chinese TrueType Fonts for the zh_TW.BIG5 Locales

Table 3-14 Traditional Chinese TrueType Fonts for the zh_TW.BIG5 Locales
 Full Family Name Subfamily Format Vendor Encoding
 Hei R TrueType Hanyi Big5
 Kai R TrueType Hanyi Big5
 Ming R TrueType Hanyi Big5

Table 3-15 shows the Traditional Chinese BitMap Fonts for the zh_TW.BIG5 Locales

Table 3-15 Traditional Chinese BitMap Fonts for the zh_TW.BIG5 Locales
 Full Family Name Subfamily Format Encoding
 Ming R PCF (12,14,16,20,24) Big5

Table 3-16 shows the supported codeset conversions for Traditional Chinese.

Table 3-16 Codeset Conversions for Traditional Chinese

Code              Symbol                 Target Code       Symbol
CNS 11643         zh_TW-euc              Big-5             zh_TW-Big5
CNS 11643         zh_TW-euc              ISO 2022-7        zh_TW-iso2022-7
Big-5             zh_TW-Big5             CNS 11643         zh_TW-euc
Big-5             zh_TW-Big5             ISO 2022-7        zh_TW-iso2022-7
ISO 2022-7        zh_TW-iso2022-7        CNS 11643         zh_TW-euc
ISO 2022-7        zh_TW-iso2022-7        Big-5             zh_TW-Big5
CNS 11643         zh_TW-euc              ISO 2022-CN-EXT   zh_TW-iso2022-CN-EXT
ISO 2022-CN-EXT   zh_TW-iso2022-CN-EXT   CNS 11643         zh_TW-euc
Big-5             zh_TW-Big5             ISO 2022-CN       zh_TW-iso2022-CN
ISO 2022-CN       zh_TW-iso2022-CN       Big-5             zh_TW-Big5
UTF-8             UTF-8                  CNS 11643         zh_TW-euc
CNS 11643         zh_TW-euc              UTF-8             UTF-8
UTF-8             UTF-8                  Big-5             zh_TW-Big5
Big-5             zh_TW-Big5             UTF-8             UTF-8
UTF-8             UTF-8                  ISO 2022-7        zh_TW-iso2022-7
ISO 2022-7        zh_TW-iso2022-7        UTF-8             UTF-8
ISO 2022-CN-EXT   zh_TW-iso2022-CN-EXT   Big-5             zh_TW-Big5
Big-5             zh_TW-Big5             ISO 2022-CN-EXT   zh_TW-iso2022-CN-EXT

Japanese Input Systems

Three Japanese input systems are bundled in Japanese Solaris 7. They can be used in the ja, ja_JP.PCK and ja_JP.UTF-8 locales. However, some maintenance utilities do not support the PCK codeset.

The Japanese input systems are listed in Table 3-17.

Table 3-17 Japanese Input Systems

Name 

Description 

Wnn6

Wnn6 consists of the Kana-Kanji conversion server (jserver), interface module for htt (X Input Method Server) called xjsi.so, utilities, and dictionaries. Wnn6 is the default Japanese input system.

Wnn6 supports JIS X 0201-1976, JIS X 0208-1990 and JIS X 0212-1990 character sets.

ATOK8

ATOK8 consists of atok8 X Input Method Server, utilities, and dictionaries. ATOK8 is a popular Japanese input system facility in the Japanese PC market. ATOK7, which was released with Solaris 2.1 through 2.5.1, has been replaced by ATOK8.

ATOK8 supports JIS X 0201-1976 and JIS X 0208-1990 character sets.

cs00

cs00 consists of the Kana-Kanji conversion server (cs00), interface module for htt (X Input Method Server) called xci.so, utilities, and dictionaries. cs00 has been bundled with Japanese Solaris since Solaris 2.1.

cs00 supports JIS X 0201-1976, JIS X 0208-1990 and JIS X 0212-1990 character sets.

Japanese TrueType Fonts are shown below in Table 3-18.

Table 3-18 Japanese TrueType Fonts

Full Family Name 

Subfamily 

Format 

Vendor 

Encoding 

hg gothic b

TrueType 

RICOH 

JISX0208.1983, JISX0201.1976 

hg mincho l

TrueType 

RICOH 

JISX0208.1983, JISX0201.1976 

heiseimin

TrueType 

RICOH 

JISX0212.1990 

Japanese Bitmap Fonts are shown in Table 3-19 below.

Table 3-19 Japanese Bitmap Fonts

Full Family Name 

Subfamily 

Format 

Vendor 

Encoding 

gothic

R, B 

PCF(12,14,16,20,24) 

 

JISX0208.1983, 

JISX0201.1976 

minchou

PCF(12,14,16,20,24) 

 

JISX0208.1983, 

JISX0201.1976 

hg gothic b

PCF(12,14,16,18,20,24) 

RICOH 

JISX0208.1983, JISX0201.1976 

hg mincho l

PCF(12,14,16,18,20,24)

RICOH 

JISX0208.1983, JISX0201.1976 

heiseimin

PCF(12,14,16,18,20,24) 

RICOH 

JISX0212.1990 

Japanese Locales

Japanese Solaris 7 supports three locales. The ja locale is based on Japanese EUC. The ja_JP.PCK locale is based on PC-Kanji code (Shift JIS) and the ja_JP.UTF-8 locale is based on UTF-8.

Japanese Messages and man Pages

Some messages and manual pages have been translated into Japanese in Japanese Solaris 7.

Japanese Character Code Converter for iconv

Table 3-20 shows the conversions supported by iconv(1) and iconv(3). See the iconv_ja(5) man page for details.

Table 3-20 iconv Conversion Support

Source Code    Target Code
eucJP          PCK
eucJP          JIS7
eucJP          SJIS
eucJP          UTF-8
eucJP          jis
eucJP          ibmj
SJIS           eucJP
SJIS           ISO-2022-JP
SJIS           UTF-8
SJIS           jis
SJIS           ibmj
PCK            eucJP
PCK            UTF-8
PCK            ISO-2022-JP
PCK            jis
PCK            ibmj
ISO-2022-JP    eucJP
ISO-2022-JP    PCK
ISO-2022-JP    SJIS
UTF-8          eucJP
UTF-8          SJIS
UTF-8          PCK
JIS7           eucJP
jis            eucJP
jis            PCK
jis            SJIS
ibmj           eucJP
ibmj           PCK
UTF-8          ISO-2022-JP
ISO-2022-JP    UTF-8
eucJP          UTF-8-Java
UTF-8-Java     eucJP
PCK            UTF-8-Java
UTF-8-Java     PCK
eucJP          ISO-2022-JP.RFC1468
PCK            ISO-2022-JP.RFC1468
UTF-8          ISO-2022-JP.RFC1468
eucJP          ibmj-EBCDIK
ibmj-EBCDIK    eucJP
PCK            ibmj-EBCDIK
ibmj-EBCDIK    PCK

Japanese Character Code Converter for TTY STREAMS

There are TTY STREAMS modules that perform code conversion between an encoding for a specific terminal and an encoding for a specific locale. With an appropriate STREAMS module, a user can log in from a Japanese terminal into a Japanese locale, even if the encoding between the terminal and the Japanese locale does not match. tty(1) controls the behavior of those STREAMS modules.

Japanese-specific Printer Support

The Japanese Solaris 7 product supports the following Japanese-specific printers:

JLE Binary Compatibility Package

The Japanese Solaris 7 package also provides Japanese Solaris 1.1.x binary-compatibility packages that are the same as the base products.

User-Defined Character (UDC) Support

To handle User-Defined Characters, sdtudctool has been available since the Solaris 2.6 release. sdtudctool handles both outline (Type1) and bitmap (PCF) fonts. Some utilities are also available to migrate the UDC fonts that were created by old utilities, such as fontedit, type3creator, and fontmanager in prior releases.

Korean Solaris 7 Product

The Korean Solaris product, used mostly in Korea, supports all the locales available in the English/Euro products. Additionally, it supports two Korean locales: ko and ko.UTF-8. In the ko locale, the EUC scheme is used to encode KSC 5601-1987. The ko.UTF-8 locale supports the KSC 5700-1995/Unicode 2.0 codeset, which is a superset of KSC 5601-1987. These two locales look the same to the end user, but the internal character encoding is different. The Korean Solaris product supports the following input methods

for the ko locale:

for the ko.UTF-8 locale:

The following fonts are available in the Korean version of the Solaris 7 product:

Table 3-21 Solaris 7 Korean CID/Type 1 Fonts for the ko Locale
 Full Family Name Subfamily Format Vendor Encoding
 Gothic R CID/Type 1 Hanyang Adobe-Korean
 Graphic R CID/Type 1 Hanyang Adobe-Korean
 Haeso R CID/Type 1 Hanyang Adobe-Korean
 Kodig R CID/Type 1 Hanyang Adobe-Korean
 Myeongijo R CID/Type 1 Hanyang Adobe-Korean
 Pilki R CID/Type 1 Hanyang Adobe-Korean
 Roundgothic R CID/Type 1 Hanyang Adobe-Korean

Table 3-22 Solaris 7 Korean Bitmap Fonts for the ko Locale
 Full Family Name Subfamily Format Encoding
 Gothic R/B PCF (12,14,16,18,20,24) KSC 5601-1987
 Graphic R/B PCF (12,14,16,18,20,24) KSC 5601-1987
 Haeso R/B PCF (12,14,16,18,20,24) KSC 5601-1987
 Kodig R/B PCF (12,14,16,18,20,24) KSC 5601-1987
 Myeongijo R/B PCF (12,14,16,18,20,24) KSC 5601-1987
 Pilki R/B PCF (12,14,16,18,20,24) KSC 5601-1987
 Roundgothic R/B PCF (12,14,16,18,20,24) KSC 5601-1987

Table 3-23 Solaris 7 Korean CID/Type 1 Fonts for the ko.UTF-8 Locale
 Full Family Name Subfamily Format Vendor Encoding
 Gothic R CID/Type 1 Hanyang Adobe-Korean
 Graphic R CID/Type 1 Hanyang Adobe-Korean
 Haeso R CID/Type 1 Hanyang Adobe-Korean
 Kodig R CID/Type 1 Hanyang Adobe-Korean
 Myeongijo R CID/Type 1 Hanyang Adobe-Korean
 Pilki R CID/Type 1 Hanyang Adobe-Korean

Table 3-24 Solaris 7 Korean Bitmap Fonts for the ko.UTF-8 Locale
 Full Family Name Subfamily Format Encoding
 Gothic R/B PCF (12,14,16,18,20,24) KSC 5601-1992 (Johap)
 Graphic R/B PCF (12,14,16,18,20,24) KSC 5601-1992 (Johap)
 Haeso R/B PCF (12,14,16,18,20,24) KSC 5601-1992 (Johap)
 Kodig R/B PCF (12,14,16,18,20,24) KSC 5601-1992 (Johap)
 Myeongijo R/B PCF (12,14,16,18,20,24) KSC 5601-1992 (Johap)
 Pilki R/B PCF (12,14,16,18,20,24) KSC 5601-1992 (Johap)

Table 3-25 Korean ICONV
 Code                      Symbol            Target Code                Symbol
 KSC 5601-1987             5601              UTF-8                      UTF-8
 ISO 646                   646               KSC 5601-1987              5601
 KSC 5601-1987             EUC-KR            UTF-8                      UTF-8
 KSC 5601-1987             KSC5601           UTF-8                      UTF-8
 UTF-8                     UTF-8             KSC 5601-1987              5601
 UTF-8                     UTF-8             KSC 5601-1987              EUC-KR
 UTF-8                     UTF-8             KSC 5601-1987              KSC5601
 UTF-8                     ko_KR-UTF-8       IBM CP933                  cp933
 UTF-8                     ko_KR-UTF-8       KSC 5601-1987              ko_KR-euc
 UTF-8                     ko_KR-UTF-8       ISO 2022-KR                ko_KR-iso2022-7
 UTF-8                     ko_KR-UTF-8       KSC 5601-1987 - Johap      ko_KR-johap
 UTF-8                     ko_KR-UTF-8       KSC 5601-1992 - Johap      ko_KR-johap92
 IBM CP933                 cp933             UTF-8                      ko_KR-UTF-8
 KSC 5601-1987             ko_KR-euc         UTF-8                      ko_KR-UTF-8
 KSC 5601-1987             ko_KR-euc         ISO 2022-KR                ko_KR-iso2022-7
 KSC 5601-1987             ko_KR-euc         KSC 5601-1987 - Johap      ko_KR-johap
 KSC 5601-1987             ko_KR-euc         KSC 5601-1992 - Johap      ko_KR-johap92
 KSC 5601-1987             ko_KR-euc         KSC 5601-1992 - Annex:4    ko_KR-nbyte
 ISO 2022-KR               iso2022-7         UTF-8                      ko_KR-UTF-8
 ISO 2022-KR               iso2022-7         KSC 5601-1987              ko_KR-euc
 KSC 5601-1987 - Johap     ko_KR-johap       UTF-8                      ko_KR-UTF-8
 KSC 5601-1987 - Johap     ko_KR-johap       KSC 5601-1987              ko_KR-euc
 KSC 5601-1992 - Johap     ko_KR-johap92     UTF-8                      ko_KR-UTF-8
 KSC 5601-1992 - Johap     ko_KR-johap92     KSC 5601-1987              ko_KR-euc
 KSC 5601-1992 - Annex:4   ko_KR-nbyte       KSC 5601-1987              ko_KR-euc
 

How to Use the iconv Command

The iconv command converts the characters or sequences of characters in a file from one codeset to another, then writes the results to standard output. If there is no conversion for a particular character, it is converted into an underscore `_' in the target codeset. See the iconv(1) man page for more information.

The following options are supported:

To convert a mail file from one encoding into another, use the iconv command:

example% iconv -f from_codeset -t to_codeset mail.from_codeset > mail.to_codeset
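
The behavior described above, including the underscore substitution for characters that have no conversion, can be approximated in a program. The following Python sketch is an illustration only and is not how iconv is implemented. The function name convert and the error-handler name "underscore" are hypothetical names chosen for this example, and the codeset names are Python codec names (for example iso8859_2 and cp1250), not the Solaris iconv symbols listed in Table 3-3.

```python
import codecs


def _underscore_errors(exc):
    """Replace each character that cannot be encoded with an underscore."""
    if isinstance(exc, UnicodeEncodeError):
        # exc.start..exc.end can cover a run of several characters.
        return ("_" * (exc.end - exc.start), exc.end)
    raise exc


# Register the handler under a name chosen for this sketch.
codecs.register_error("underscore", _underscore_errors)


def convert(data: bytes, from_codeset: str, to_codeset: str) -> bytes:
    """Convert bytes from one codeset to another, as iconv -f ... -t ... does."""
    text = data.decode(from_codeset)
    return text.encode(to_codeset, errors="underscore")
```

For example, convert(mail_bytes, "iso8859_2", "cp1250") converts ISO 8859-2 text to MS 1250, where mail_bytes is assumed to hold the contents of the input file. Input bytes that are not valid in the source codeset still raise an error in this sketch.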