Solaris Internationalization Guide For Developers

Chapter 3 Contents of the Localized Solaris 7 Products

The European Localized Solaris 7 Product

European Solaris is available in three localized versions: French, German, and European. All three versions share the same software media, which include a fully localized CDE environment, error messages, and online documentation in six languages: French, German, Spanish, Swedish, Italian, and English. The versions differ only in their printed documentation. The French and German Solaris products include localized printed documentation, while the printed documentation for the European version is in English only.

Table 3-1 shows a list of locales in the European product. This includes both full and partial locales.

Table 3-1 European 7 Locales


Locale Name	Language/Territory
`C`	POSIX English (7-bit) ASCII C
`cz`	Czech Republic
`da`	Denmark
`de`	Germany
`de_AT`	Austria
`de_CH`	Switzerland
`de.ISO8859-15`	Germany
`el`	Greece
`en_AU`	Australia
`en_CA`	Canada
`en_IE`	Ireland
`en_NZ`	New Zealand
`en_UK`	Great Britain
`en_US`	U.S.
`es`	Spain
`es_AR`	Argentina
`es_BO`	Bolivia
`es_CL`	Chile
`es_CO`	Colombia
`es_CR`	Costa Rica
`es_EC`	Ecuador
`es_GT`	Guatemala
`es_MX`	Mexico
`es_NI`	Nicaragua
`es_PA`	Panama
`es_PE`	Peru
`es_PY`	Paraguay
`es_SV`	El Salvador
`es_UY`	Uruguay
`es_VE`	Venezuela
`et`	Estonia
`fr`	France
`fr_BE`	Belgium (French)
`fr_CA`	Canada (French)
`fr_CH`	Switzerland (French)
`fr.ISO8859-15`	France
`fr.UTF-8`	France
`hu`	Hungary
`it.ISO8859-15`	Italy
`it.UTF-8`	Italy
`it.ISO8859-15`	Italy
`lt.ISO8859-13`	Lithuania
`lv.ISO8859-13`	Latvia
`nl`	Netherlands
`nl_BE`	Netherlands/Belgium
`no`	Norway
`pl`	Poland
`pt_BR`	Portuguese Brazil
`ru`	Russia
`it.ISO8859-15`	Italy
`es.ISO8859-15`	Spain
`sv.ISO8859-15`	Sweden
`en_EU.ISO8859-15`	Europe
`en_GB.ISO8859-15`	Britain
`fr_BE.ISO8859-15`	Belgium
`nl.ISO8859-15`	Netherlands
`nl_BE.ISO8859-15`	Belgium
`pt.ISO8859-15`	Portugal
`de_AT.ISO8859-15`	Austria
`en_IE.ISO8859-15`	Ireland
`da.ISO8859-15`	Denmark
`fi.ISO8859-15`	Finland
`el_EURO`	Greece
`sun_eu_greek`	Greece
`de.UTF-8`	Germany
`de.ISO8859-15`	Germany
`fr.UTF-8`	France
`it.UTF-8`	Italy
`es.UTF-8`	Spain
`es.ISO8859-15`	Spain
`sv.UTF-8`	Sweden
`sv.ISO8859-15`	Sweden
`en_EU.UTF-8`	Europe
`en_EU.ISO8859-15`	Europe

All of these locales are also present in the base Solaris 7 release.

As mentioned, the list includes partial locales, which are based on the core locale for the main language. For example, the fr_CA (French Canadian) locale is based on the fr (French) locale. A partial locale uses the messages delivered in its parent locale (fr for fr_CA). If a locale has not been fully localized, it may contain only English messages.
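The relationship between a partial locale and its parent can be sketched in Python. This is only an illustrative model of the idea described above, not the actual Solaris message lookup; the function name `message_locale_chain` and the final fallback to the `C` locale for untranslated (English) messages are assumptions made for the example.

```python
def message_locale_chain(locale_name):
    """Return candidate locales to search for messages, most specific first."""
    # Strip an optional ".codeset" suffix, e.g. "fr_BE.ISO8859-15" -> "fr_BE".
    base = locale_name.split(".", 1)[0]
    chain = [locale_name]
    if base != locale_name:
        chain.append(base)
    # A partial locale such as "fr_CA" borrows messages from its parent "fr".
    language = base.split("_", 1)[0]
    if language != base:
        chain.append(language)
    # Messages that were never localized fall back to the English text of "C".
    if chain[-1] != "C":
        chain.append("C")
    return chain


print(message_locale_chain("fr_CA"))  # ['fr_CA', 'fr', 'C']
```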

A number of Eastern European locales have also been added to the Solaris 7 product. Previously, Sun locales were based on ISO-8859-1. The Eastern European locales are based on other ISO standards, as shown in Table 3-2.

Locales that are not listed are still based on ISO-8859-1.

Table 3-2 Eastern European Locales in the Solaris 7 Product


Locale Name	Language/Territory	ISO
`de_AT`	German (Austrian)	8859-1
`et`	Estonian	8859-15
`cz`	Czech	8859-2
`hu`	Hungarian	8859-2
`pl`	Polish	8859-2
`lv`	Latvian	8859-13
`lt`	Lithuanian	8859-13
`ru`	Russian	8859-5
`el`	Greek	8859-7
`tr`	Turkish	8859-9
`sq_AL`	Albanian	8859-2
`sk_SK`	Slovakian	8859-2
`sl_SI`	Slovenian	8859-2
`hr_HR`	Croatian	8859-2
`nr`	Bosnian	8859-2
`ro_RO`	Romanian	8859-2
`sr_SP`	Serbian	8859-5
`bg_BG`	Bulgarian	8859-5
`mk_MK`	Macedonian	8859-5
`ru.KOI8-R`	Russian	KOI8-R
`ar`	Arabic	8859-6
`he`	Hebrew	8859-8
`th_TH`	Thai	8859-11 (TIS 620.2533)

All of the locales support character input and output. There is also iconv support for many of the major codesets. (For more on iconv, see iconv(1).) The iconv modules are available on the end-user cluster of the Euro product. See Table 3-3 for details.

Table 3-3 iconv Support


Code	Symbol	Target Code	Symbol	Comment
`ISO 8859-2`	iso2	MS 1250	win2	Windows Latin 2
`ISO 8859-2`	iso2	MS 852	dos2	MS-DOS Latin 2
`ISO 8859-2`	iso2	Mazovia	maz	Mazovia
`ISO 8859-2`	iso2	DHN	dhn	Dom Handlowy Nauki
`MS 1250`	win2	ISO 8859-2	iso2	ISO Latin 2
`MS 1250`	win2	MS 852	dos2	MS-DOS Latin 2
`MS 1250`	win2	Mazovia	maz	Mazovia
`MS 1250`	win2	DHN	dhn	Dom Handlowy Nauki
`MS 852`	dos2	ISO 8859-2	iso2	ISO Latin 2
`MS 852`	dos2	MS 1250	win2	Windows Latin 2
`MS 852`	dos2	Mazovia	maz	Mazovia
`MS 852`	dos2	DHN	dhn	Dom Handlowy Nauki
`Mazovia`	maz	ISO 8859-2	iso2	ISO Latin 2
`Mazovia`	maz	MS 1250	win2	Windows Latin 2
`Mazovia`	maz	MS 852	dos2	MS-DOS Latin 2
`Mazovia`	maz	DHN	dhn	Dom Handlowy Nauki
`DHN`	dhn	ISO 8859-2	iso2	ISO Latin 2
`DHN`	dhn	MS 1250	win2	Windows Latin 2
`DHN`	dhn	MS 852	dos2	MS-DOS Latin 2
`DHN`	dhn	Mazovia	maz	Mazovia
`ISO 8859-5`	iso5	KOI8-R	koi8	KOI8-R
`ISO 8859-5`	iso5	PC Cyrillic	alt	Alternative PC Cyrillic
`ISO 8859-5`	iso5	MS 1251	win5	Windows Cyrillic
`ISO 8859-5`	iso5	Mac Cyrillic	mac	Macintosh Cyrillic
`KOI8-R`	koi8	ISO 8859-5	iso5	ISO 8859-5 Cyrillic
`KOI8-R`	koi8	PC Cyrillic	alt	Alternative PC Cyrillic
`KOI8-R`	koi8	MS 1251	win5	Windows Cyrillic
`KOI8-R`	koi8	Mac Cyrillic	mac	Macintosh Cyrillic
`PC Cyrillic`	alt	ISO 8859-5	iso5	ISO 8859-5 Cyrillic
`PC Cyrillic`	alt	KOI8-R	koi8	KOI8-R
`PC Cyrillic`	alt	MS 1251	win5	Windows Cyrillic
`PC Cyrillic`	alt	Mac Cyrillic	mac	Macintosh Cyrillic
`MS 1251`	win5	ISO 8859-5	iso5	ISO 8859-5 Cyrillic
`MS 1251`	win5	KOI8-R	koi8	KOI8-R
`MS 1251`	win5	PC Cyrillic	alt	Alternative PC Cyrillic
`MS 1251`	win5	Mac Cyrillic	mac	Macintosh Cyrillic
`Mac Cyrillic`	mac	ISO 8859-5	iso5	ISO 8859-5 Cyrillic
`Mac Cyrillic`	mac	KOI8-R	koi8	KOI8-R
`Mac Cyrillic`	mac	PC Cyrillic	alt	Alternative PC Cyrillic
`Mac Cyrillic`	mac	MS 1251	win5	Windows Cyrillic
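Several of the conversions in Table 3-3 can be tried out with Python's standard codecs. The mapping below from the table's symbols to Python codec names is an assumption made for this sketch (in particular, `cp866` is assumed to correspond to "Alternative PC Cyrillic"), and Mazovia and DHN are left out because the Python standard library has no codec for them. The helper name `convert` is hypothetical.

```python
# Python codec names standing in for the Table 3-3 symbols (an assumption).
PYTHON_CODEC = {
    "iso2": "iso8859_2",
    "win2": "cp1250",
    "dos2": "cp852",
    "iso5": "iso8859_5",
    "koi8": "koi8_r",
    "alt": "cp866",
    "win5": "cp1251",
    "mac": "mac_cyrillic",
}


def convert(data, from_symbol, to_symbol):
    """Convert bytes between two codesets by way of Unicode text."""
    text = data.decode(PYTHON_CODEC[from_symbol])
    return text.encode(PYTHON_CODEC[to_symbol])


polish_iso2 = "Łódź".encode("iso8859_2")
polish_win2 = convert(polish_iso2, "iso2", "win2")
russian_koi8 = "Привет".encode("koi8_r")
russian_win5 = convert(russian_koi8, "koi8", "win5")

# Converting back restores the original bytes.
print(convert(polish_win2, "win2", "iso2") == polish_iso2)
print(convert(russian_win5, "win5", "koi8") == russian_koi8)
```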

Table 3-4 contains a list of the Solaris 7 environment locales and their corresponding codeset names.

Table 3-4 New Locales and Corresponding Codeset Names


Locale	nl_langinfo (CODESET)	ICONV name	Product
`ar`	ISO8859-6	ISO8859-6	Base/Euro
`bg_BG`	ISO8859-5	ISO8859-5	Base/Euro
`C`	646	646	Base/Euro
`cz`	ISO8859-2	ISO8859-2	Base/Euro
`da`	ISO8859-1	ISO8859-1	Base/Euro
`da.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`de`	ISO8859-1	ISO8859-1	Base/Euro
`de.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`de.UTF-8`	UTF-8	UTF-8	Base/Euro
`de_AT`	ISO8859-1	ISO8859-1	Base/Euro
`de_AT.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`de_CH`	ISO8859-1	ISO8859-1	Base/Euro
`el`	ISO8859-7	ISO8859-7	Base/Euro
`el.sun_eu_greek`	ISO8859-15	ISO8859-15	Base/Euro
`en_AU`	ISO8859-1	ISO8859-1	Base/Euro
`en_CA`	ISO8859-1	ISO8859-1	Base/Euro
`en_EU.ISO8859-15`	ISO8859-15	ISO8859-1	Base/Euro
`en_EU.UTF-8`	UTF-8	UTF-8	Base/Euro
`en_GB`	ISO8859-1	ISO8859-1	Base/Euro
`en_GB.ISO8859-15`	ISO8859-15	ISO8859-1	Base/Euro
`en_IE`	ISO8859-1	ISO8859-1	Base/Euro
`en_IE.ISO8859-15`	ISO8859-15	ISO8859-1	Base/Euro
`en_NZ`	ISO8859-1	ISO8859-1	Base/Euro
`en_US`	ISO8859-1	ISO8859-1	Base/Euro
`en_US.UTF-8`	UTF-8	UTF-8	Base/Euro
`es`	ISO8859-1	ISO8859-1	Base/Euro
`es.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`es_AR`	ISO8859-1	ISO8859-1	Base/Euro
`es_BO`	ISO8859-1	ISO8859-1	Base/Euro
`es_CL`	ISO8859-1	ISO8859-1	Base/Euro
`es_CO`	ISO8859-1	ISO8859-1	Base/Euro
`es_CR`	ISO8859-1	ISO8859-1	Base/Euro
`es_EC`	ISO8859-1	ISO8859-1	Base/Euro
`es_GT`	ISO8859-1	ISO8859-1	Base/Euro
`es_MX`	ISO8859-1	ISO8859-1	Base/Euro
`es_NI`	ISO8859-1	ISO8859-1	Base/Euro
`es_PA`	ISO8859-1	ISO8859-1	Base/Euro
`es_PE`	ISO8859-1	ISO8859-1	Base/Euro
`es_PY`	ISO8859-1	ISO8859-1	Base/Euro
`es_SV`	ISO8859-1	ISO8859-1	Base/Euro
`es.UTF-8`	UTF-8	UTF-8	Base/Euro
`es_UY`	ISO8859-1	ISO8859-1	Base/Euro
`es_VE`	ISO8859-1	ISO8859-1	Base/Euro
`et`	ISO8859-1	ISO8859-1	Base/Euro
`fi`	ISO8859-1	ISO8859-1	Base/Euro
`fi.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`fr`	ISO8859-1	ISO8859-1	Base/Euro
`fr.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`fr.UTF-8`	UTF-8	UTF-8	Base/Euro
`fr_BE`	ISO8859-1	ISO8859-1	Base/Euro
`fr_BE.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`fr_CA`	ISO8859-1	ISO8859-1	Base/Euro
`fr_CH`	ISO8859-1	ISO8859-1	Base/Euro
`he`	ISO8859-8	ISO8859-8	Base/Euro
`he_IL`	ISO8859-8	ISO8859-8	Base/Euro
`hr_HR`	ISO8859-2	ISO8859-2	Base/Euro
`hu`	ISO8859-2	ISO8859-2	Base/Euro
`it`	ISO8859-1	ISO8859-1	Base/Euro
`it.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`it.UTF-8`	UTF-8	UTF-8	Base/Euro
`ja`	eucJP	eucJP	Japanese
`ja_JP.PCK`	PCK	PCK	Japanese
`ja_JP.UTF-8`	UTF-8	UTF-8	Japanese
`ko`	5601	ko_KR-euc	Korean
`ko.UTF-8`	UTF-8	UTF-8	Korean
`lt`	ISO8859-4	ISO8859-4	Base/Euro
`lv`	ISO8859-4	ISO8859-4	Base/Euro
`mk_MK`	ISO8859-5	ISO8859-5	Base/Euro
`nl`	ISO8859-1	ISO8859-1	Base/Euro
`nl.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`nl_BE`	ISO8859-1	ISO8859-1	Base/Euro
`nl_BE.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`no`	ISO8859-1	ISO8859-1	Base/Euro
`no_NY`	ISO8859-1	ISO8859-1	Base/Euro
`nr`	ISO8859-2	ISO8859-2	Base/Euro
`pl`	ISO8859-2	ISO8859-2	Base/Euro
`POSIX`	646	646	Base/Euro
`pt`	ISO8859-1	ISO8859-1	Base/Euro
`pt.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`pt_BR`	ISO8859-1	ISO8859-1	Base/Euro
`ro_RO`	ISO8859-2	ISO8859-2	Base/Euro
`ru`	ISO8859-5	ISO8859-5	Base/Euro
`ru.KOI8-R`	KOI8-R	KOI8-R	Base/Euro
`sk_SK`	ISO8859-2	ISO8859-2	Base/Euro
`sl_SI`	ISO8859-2	ISO8859-2	Base/Euro
`sq_AL`	ISO8859-2	ISO8859-2	Base/Euro
`sr_SP`	ISO8859-5	ISO8859-5	Base/Euro
`sv`	ISO8859-1	ISO8859-1	Base/Euro
`sv.ISO8859-15`	ISO8859-15	ISO8859-15	Base/Euro
`sv.UTF-8`	UTF-8	UTF-8	Base/Euro
`th_TH`	TIS620.2533	TIS620.2533	Base/Euro
`tr`	ISO8859-9	ISO8859-9	Base/Euro
`zh`	gb2312	gb2312	Simplified Chinese
`zh.GBK`	GBK	zh_CN.gbk	Simplified Chinese
`zh_TW`	cns11643	zh_TW-euc	Traditional Chinese
`zh_TW.BIG5`	BIG5	zh_TW_Big5	Traditional Chinese
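Table 3-4 shows that the name returned by nl_langinfo(CODESET) and the name expected by iconv are not always the same. The Python sketch below copies a handful of rows from the table into a dictionary to make that difference easy to look up. It is only an illustration: a running program would obtain the codeset name from nl_langinfo(CODESET) rather than from a hard-coded table, and the names `CODESET_NAMES` and `iconv_name` are hypothetical.

```python
# locale -> (nl_langinfo(CODESET) name, iconv name), copied from Table 3-4.
CODESET_NAMES = {
    "de": ("ISO8859-1", "ISO8859-1"),
    "ja": ("eucJP", "eucJP"),
    "ko": ("5601", "ko_KR-euc"),
    "zh.GBK": ("GBK", "zh_CN.gbk"),
    "zh_TW": ("cns11643", "zh_TW-euc"),
    "zh_TW.BIG5": ("BIG5", "zh_TW_Big5"),
}


def iconv_name(locale_name):
    """Return the iconv name recorded for a locale in the table above."""
    return CODESET_NAMES[locale_name][1]


# Locales whose two names differ need an explicit mapping like this one.
differing = sorted(
    name for name, (codeset, iconv) in CODESET_NAMES.items() if codeset != iconv
)
print(differing)
```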

Note -

Locale naming conventions are as follows:

language[_territory][.codeset] where language is from ISO639 and territory is from ISO3166.

All locales with Base/Euro in the Product column are also available as Japanese, Korean, Simplified Chinese, and Traditional Chinese products.

All Solaris product locales preserve the Portable Character Set characters with US-ASCII code values.
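The naming convention language[_territory][.codeset] can be expressed as a small parser. The following Python sketch is an assumption-laden illustration: the regular expression and the function name `parse_locale` are not part of Solaris, and the pattern only checks the shape of a name, not whether the language and territory codes are valid ISO 639 or ISO 3166 entries.

```python
import re

# language[_territory][.codeset]
LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z]+))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9_-]+))?$"
)


def parse_locale(name):
    """Split a locale name into (language, territory, codeset)."""
    match = LOCALE_RE.match(name)
    if match is None:
        raise ValueError("not a language[_territory][.codeset] name: %r" % name)
    return match.group("language"), match.group("territory"), match.group("codeset")


print(parse_locale("fr_BE.ISO8859-15"))  # ('fr', 'BE', 'ISO8859-15')
print(parse_locale("ko.UTF-8"))          # ('ko', None, 'UTF-8')
```

Missing parts are returned as `None`, so a caller can tell a bare language name such as `de` from a fully qualified name.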

Note -

5601 signifies the Korean EUC codeset containing KS C 5636 and KS C 5601-1987.

646 signifies ISO/IEC 646, which is US-ASCII.

eucJP signifies the Japanese EUC codeset. It contains JIS X0201-1976, JIS X0208-1983, and JIS X0212-1990.

gb2312 signifies the Simplified Chinese EUC codeset, which contains GB 1988-80 and GB 2312-80.

PCK is also known as Shift JIS (SJIS).

UTF-8 is the UTF-8 of ISO/IEC 10646-1, containing various approved amendments and Unicode 2.1.

GBK signifies GB extensions. This includes all GB 2312-80 characters and all Unified Han characters of ISO/IEC 10646-1, as well as Japanese Hiragana and Katakana characters. It also includes many characters of Chinese, Japanese, and Korean character sets and of ISO/IEC 10646-1.

Font Formats

There are many different font formats. The file name extension identifies the font type.

PostScript Type 1 Fonts, also known as Adobe Type Manager (ATM) fonts, Type 1 fonts, and outline fonts, contain information in outline form that allows a PostScript printer or ATM to generate fonts of any size. Most of these fonts also contain hints that make the rendered text more readable at a low resolution or a small type size.
Bitmap Fonts contain a picture of the font at a specific size that has been optimized to look good at that specific size. If the font is scaled larger or smaller, the quality may degrade. On the other hand, bitmap fonts display quickly.

Location of Fonts on the System

Fonts are located at:

/usr/openwin/lib/locale/iso_8859_x/X11/fonts/X11/Type1/afm

/usr/openwin/lib/locale/iso_8859_x/X11/fonts/X11/75dpi

Adding and Removing Font Packages

To manually add font packages to the system:

Always add the required font packages before the optional font packages.
When you are removing font packages from the system, remove the optional font packages first.

You must follow this procedure to add or remove fonts. The class action scripts in the font packages depend on this for proper function. The optional font packages contain scripts that concatenate information onto the required font packages that are already resident on the system. If the required font packages are not there, problems may occur.
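The ordering rule can be written down as a pair of small helpers. This Python sketch only computes and prints the order; it does not add or remove anything, and the package names are hypothetical placeholders rather than real Solaris font packages.

```python
def add_order(required, optional):
    """Required font packages come first, then the optional ones."""
    return list(required) + list(optional)


def remove_order(required, optional):
    """Optional font packages come first, then the required ones."""
    return list(optional) + list(required)


required = ["EXAMPLEreq"]                 # hypothetical package name
optional = ["EXAMPLEopt1", "EXAMPLEopt2"]  # hypothetical package names

for package in add_order(required, optional):
    print("add", package)
for package in remove_order(required, optional):
    print("remove", package)
```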

Summary of Asian Locales

Table 3-5 shows the Asian locales supported by the Asian products.

Table 3-5 Summary of Asian Locales


CD Set	Locale Name	Description	Supported Character Set
Korean	ko.UTF-8	Korean (UTF-8 locale)	KS C 5601-1992, KS C 5700-1995
Simplified Chinese	zh, zh.GBK	Simplified Chinese (EUC), Simplified Chinese (GBK)	GB 2312-1980, GBK
Traditional Chinese	zh_TW, zh_TW.BIG5	Traditional Chinese (EUC), Traditional Chinese (BIG5)	CNS 11643 1992, BIG5
Japanese	ja, ja_JP.PCK, ja_JP.UTF-8	Japanese EUC, Japanese PCK [ja_JP.PCK doesn't support JIS X 0212-1990], Japanese UTF-8	JIS X 0201-1976, JIS X 0208-1990, JIS X 0212-1990, VDC [VDC: Vendor Defined Character. VDCs occupy unused (reserved) code points of JIS X 0208-1990 or JIS X 0212-1990], UDC [UDC: User Defined Character. UDCs occupy unused (reserved) code points of JIS X 0208-1990 or JIS X 0212-1990 (also unused for VDCs.)]

Korean in the Solaris 7 Product

In December 1995, the Korean government announced a standard Korean codeset, KSC-5700, which is based on ISO-10646-1/Unicode 2.0. The standard codeset replaces KSC 5601, which was based on ISO-2022.

The ISO-10646 character set uses 2 (UCS-2; Universal Character Set two-byte form) or 4 (UCS-4) bytes to represent each character.

The ISO-10646 character set cannot be used directly on IBM-PC-based operating systems. For example, the kernel and many other modules of the Solaris operating environment interpret certain byte values as control instructions, such as a null character (0x00) in any string. An ISO-10646 character, however, can be encoded with any bit combination in its first or subsequent bytes. Because of these limitations, ISO-10646 characters cannot be transmitted freely through the Solaris system. To establish a migration path, the ISO-10646 character set defines the UCS Transformation Format (UTF), which recodes the ISO-10646 characters without using C0 controls (0x00..0x1F), C1 controls (0x80..0x9F), space (0x20), and DEL (0x7F).
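The null-byte problem is easy to see in a short Python sketch. It uses Python's `utf-16-be` codec as a stand-in for UCS-2 (the two are identical for the characters used here) and compares it with UTF-8, the transformation format used by the Solaris UTF-8 locales. The sample string is an arbitrary choice for the example.

```python
text = "A한"  # an ASCII letter followed by the Hangul syllable U+D55C

ucs2 = text.encode("utf-16-be")  # two bytes per character, no byte order mark
utf8 = text.encode("utf-8")

print(ucs2.hex())  # the ASCII letter is stored with a leading 0x00 byte
print(utf8.hex())  # the ASCII letter keeps its single-byte US-ASCII value

# A C string routine would stop at the first null byte of the UCS-2 form.
print(b"\x00" in ucs2, b"\x00" in utf8)
```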

The ko.UTF-8 locale is the Solaris locale that supports KSC-5700, the Korean standard codeset. It supports all characters in the previous KSC 5601 codeset and all 11,172 Korean characters. Korean UTF-8 supports the Korean language-related ISO-10646 characters and fonts. Because ISO-10646 covers all characters in the world, all of the various input methods and fonts are supplied so that you may input and output any character in any language. Until Universal UTF/UCS becomes available, Korean UTF-8 supports the ISO-10646 code subset that is related to Korean characters, as well as all other characters in the previous Korean standard codeset and Extended ASCII.
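The figure of 11,172 Korean characters matches the number of precomposed Hangul syllables. The arithmetic below is a sketch that relies on facts from the Unicode standard rather than from this guide: 19 initial consonants, 21 medial vowels, and 28 final positions (27 final consonants plus "no final"), with the syllables occupying the code points U+AC00 through U+D7A3.

```python
INITIAL_CONSONANTS = 19
MEDIAL_VOWELS = 21
FINAL_POSITIONS = 28  # 27 final consonants plus the "no final" case

syllables = INITIAL_CONSONANTS * MEDIAL_VOWELS * FINAL_POSITIONS

# The same count falls out of the size of the Hangul Syllables block.
first, last = 0xAC00, 0xD7A3
block_size = last - first + 1

print(syllables, block_size)  # 11172 11172
```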

Table 3-6 lists the Korean codesets.

Table 3-6 Codeset Conversions Supported for Korean ko, ko.UTF-8


Code	Symbol	Target Code	Symbol
`UTF-8`	ko_KR-UTF-8	Wansung	ko_KR-euc
`UTF-8`	ko_KR-UTF-8	Johap	ko_KR-johap92
`UTF-8`	ko_KR-UTF-8	Packed	ko_KR-johap
`UTF-8`	ko_KR-UTF-8	ISO-2022-KR	ko_KR-iso2022-7
`Wansung`	ko_KR-euc	UTF-8	ko_KR-UTF-8
`Johap`	ko_KR-johap92	UTF-8	ko_KR-UTF-8
`Packed`	ko_KR-johap	UTF-8	ko_KR-UTF-8
`ISO-2022-KR`	ko_KR-iso2022-7	UTF-8	ko_KR-UTF-8
`Wansung`	ko_KR-euc	Johap	ko_KR-johap92
`Wansung`	ko_KR-euc	Packed	ko_KR-johap
`Wansung`	ko_KR-euc	N-Byte	ko_KR-nbyte
`Wansung`	ko_KR-euc	ISO-2022-KR	ko_KR-iso2022-7
`Johap`	ko_KR-johap92	Wansung	ko_KR-euc
`Packed`	ko_KR-johap	Wansung	ko_KR-euc
`N-Byte`	ko_KR-nbyte	Wansung	ko_KR-euc
`ISO-2022-KR`	ko_KR-iso2022-7	Wansung	ko_KR-euc

Chinese: Simplified and Traditional

Simplified Chinese in the Solaris 7 environment provides two locales: zh and zh.GBK. In the zh locale, the EUC scheme is used to encode GB2312-80. The zh.GBK locale supports the GBK codeset, which is a superset of GB2312-80.

Simplified Chinese is used mostly in the People's Republic of China (PRC) and in Singapore.
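The superset relationship between GBK and GB2312-80 can be checked with a short Python sketch. It assumes that Python's `gb2312` and `gbk` codecs are reasonable stand-ins for the codesets of the zh and zh.GBK locales; the helper name `encodable` and the sample characters are choices made for the example.

```python
def encodable(text, codec):
    """Return True if every character of text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


common = "中文"  # simplified characters present in GB2312-80
extra = "國"     # traditional form, a Unified Han character outside GB2312-80

# Characters shared by both codesets keep the same byte values in GBK.
print(common.encode("gb2312") == common.encode("gbk"))

# GBK can represent the extra character; GB2312-80 cannot.
print(encodable(extra, "gbk"), encodable(extra, "gb2312"))
```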

The following input methods are supported for the zh locale:

New QuanPin
New ShuangPin
Quanpy
Location
PinYin
Stroke
Golden
Intelligent Pinyin
Simplified Chinese Symbol

The following input methods are supported for the zh.GBK locale:

New QuanPin
New ShuangPin
Quanpy
GBK Code
Japanese
Hanja
Zhuyin
Unicode

Table 3-7 shows the TrueType fonts for the zh locale.

Table 3-7 Solaris 7 TrueType Fonts for the zh Locale


Full Family Name	Subfamily	Format	Vendor	Encoding
Fangsong	R	TrueType	Hanyi	GB2312.1980
Hei	R	TrueType	Monotype	GB2312.1980
Kai	R	TrueType	Monotype	GB2312.1980
Song	R	TrueType	Monotype	GB2312.1980

Table 3-8 shows the bitmap fonts for the zh locale.

Table 3-8 Solaris 7 Bitmap Fonts for the zh Locale


Full Family Name	Subfamily	Format	Encoding
Song	B	PCF (14,16)	GB2312.1980
Song	R	PCF (12,14,16,20,24)	GB2312.1980

Table 3-9 shows the TrueType fonts for the zh.GBK locale.

Table 3-9 TrueType Fonts for the zh.GBK Locale


Full Family Name	Subfamily	Format	Vendor	Encoding
Fangsong	R	TrueType	Zhongyi	GBK
Hei	R	TrueType	Zhongyi	GBK
Kai	R	TrueType	Zhongyi	GBK
Song	R	TrueType	Zhongyi	GBK

Table 3-10 shows the bitmap fonts for the zh.GBK locale.

Table 3-10 Bitmap Fonts for the zh.GBK Locale


Full Family Name	Subfamily	Format	Encoding
Song	R	PCF (12,14,16,20,24)	GBK

Table 3-11 shows the supported codeset conversions for Simplified Chinese.

Table 3-11 Codeset Conversions for Simplified Chinese


Code	Symbol	Target Code	Symbol
`GB2312-80`	zh_CN.euc	ISO 2022-7	zh_CN.iso2022-7
`ISO 2022-7`	zh_CN.iso2022-7	GB2312-80	zh_CN.euc
`GB2312-80`	zh_CN.euc	ISO 2022-CN	zh_CN.iso2022-CN
`ISO 2022-CN`	zh_CN.iso2022-CN	GB2312-80	zh_CN.euc
`UTF-8`	UTF-8	GB2312-80	zh_CN.euc
`GB2312-80`	zh_CN.euc	UTF-8	UTF-8
`zh.GBK`	zh_CN.gbk	ISO 2022-CN	zh_CN.iso2022-CN
`ISO 2022-CN`	zh_CN.iso2022-CN	zh.GBK	zh_CN.gbk
`zh.GBK`	zh_CN.gbk	Big-5	zh_TW-Big5
`Big-5`	zh_TW-Big5	zh.GBK	zh_CN.gbk
`GB2312-80`	zh_CN.euc	Big-5	zh_TW-Big5
`Big-5`	zh_TW-Big5	GB2312-80	zh_CN.euc
`UTF-8`	UTF-8	zh.GBK	zh_CN.gbk
`zh.GBK`	zh_CN.gbk	UTF-8	UTF-8
`UTF-8`	UTF-8	ISO 2022-CN	zh_CN.iso2022-CN
`ISO 2022-CN`	zh_CN.iso2022-CN	UTF-8	UTF-8

Traditional Chinese in the Solaris 7 product provides two locales: zh_TW and zh_TW.BIG5. In the zh_TW locale, the EUC scheme is used to encode the CNS 11643.1992 codeset. The zh_TW.BIG5 locale supports the Big-5 codeset.

Traditional Chinese is used mostly in Taiwan and Hong Kong.

Traditional Chinese supports the following input methods:

Chuyin
I-Tien
Telecode
TsangChieh
CheinI
NeiMa
ChuangHsing
Array
BoShiaMy
DaYi

Table 3-12 shows the Traditional Chinese TrueType fonts for the zh_TW locale.

Table 3-12 Traditional Chinese TrueType Fonts for the zh_TW Locales


Full Family Name	Subfamily	Format	Vendor	Encoding
Hei	R	TrueType	Hanyi	CNS11643.1992
Kai	R	TrueType	Hanyi	CNS11643.1992
Ming	R	TrueType	Hanyi	CNS11643.1992

Table 3-13 shows the Traditional Chinese bitmap fonts for the zh_TW locale.

Table 3-13 Traditional Chinese Bitmap Fonts for the zh_TW Locales


Full Family Name	Subfamily	Format	Encoding
Ming	R	PCF (12,14,16,20,24)	CNS11643.1992

Table 3-14 shows the Traditional Chinese TrueType fonts for the zh_TW.BIG5 locale.

Table 3-14 Traditional Chinese TrueType Fonts for the zh_TW.BIG5 Locales


Full Family Name	Subfamily	Format	Vendor	Encoding
Hei	R	TrueType	Hanyi	Big5
Kai	R	TrueType	Hanyi	Big5
Ming	R	TrueType	Hanyi	Big5

Table 3-15 shows the Traditional Chinese bitmap fonts for the zh_TW.BIG5 locale.

Table 3-15 Traditional Chinese Bitmap Fonts for the zh_TW.BIG5 Locales


Full Family Name	Subfamily	Format	Encoding
Ming	R	PCF (12,14,16,20,24)	Big5

Table 3-16 shows the supported codeset conversions for Traditional Chinese.

Table 3-16 Codeset Conversions for Traditional Chinese


Code	Symbol	Target Code	Symbol
`CNS 11643`	zh_TW-euc	Big-5	zh_TW-Big5
`CNS 11643`	zh_TW-euc	ISO 2022-7	zh_TW-iso2022-7
`Big-5`	zh_TW-Big5	CNS 11643	zh_TW-euc
`Big-5`	zh_TW-Big5	ISO 2022-7	zh_TW-iso2022-7
`ISO 2022-7`	zh_TW-iso2022-7	CNS 11643	zh_TW-euc
`ISO 2022-7`	zh_TW-iso2022-7	Big-5	zh_TW-Big5
`CNS 11643`	zh_TW-euc	ISO 2022-CN-EXT	zh_TW-iso2022-CN-EXT
`ISO 2022-CN-EXT`	zh_TW-iso2022-CN-EXT	CNS 11643	zh_TW-euc
`Big-5`	zh_TW-Big5	ISO 2022-CN	zh_TW-iso2022-CN
`ISO 2022-CN`	zh_TW-iso2022-CN	Big-5	zh_TW-Big5
`UTF-8`	UTF-8	CNS 11643	zh_TW-euc
`CNS 11643`	zh_TW-euc	UTF-8	UTF-8
`UTF-8`	UTF-8	Big-5	zh_TW-Big5
`Big-5`	zh_TW-Big5	UTF-8	UTF-8
`UTF-8`	UTF-8	ISO 2022-7	zh_TW-iso2022-7
`ISO 2022-7`	zh_TW-iso2022-7	UTF-8	UTF-8
`ISO 2022-CN-EXT`	zh_TW-iso2022-CN-EXT	Big-5	zh_TW-Big5
`Big-5`	zh_TW-Big5	ISO 2022-CN-EXT	zh_TW-iso2022-CN-EXT

Japanese Input Systems

Three Japanese input systems are bundled in Japanese Solaris 7. They can be used in the ja, ja_JP.PCK and ja_JP.UTF-8 locales. However, some maintenance utilities do not support the PCK codeset.

The Japanese input systems are listed in Table 3-17.

Table 3-17 Japanese Input Systems


Name	Description
`Wnn6`	`Wnn6` consists of the Kana-Kanji conversion server (`jserver`), an interface module for `htt` (X Input Method Server) called `xjsi.so`, utilities, and dictionaries. `Wnn6` is the default Japanese input system. `Wnn6` supports the JIS X 0201-1976, JIS X 0208-1990, and JIS X 0212-1990 character sets.
`ATOK8`	`ATOK8` consists of the atok8 X Input Method Server, utilities, and dictionaries. `ATOK8` is a popular Japanese input system in the Japanese PC market. `ATOK7`, which was released with Solaris 2.1 through 2.5.1, has been replaced by `ATOK8`. `ATOK8` supports the JIS X 0201-1976 and JIS X 0208-1990 character sets.
`cs00`	`cs00` consists of the Kana-Kanji conversion server (`cs00`), an interface module for `htt` (X Input Method Server) called `xci.so`, utilities, and dictionaries. `cs00` has been bundled with Japanese Solaris since Solaris 2.1. `cs00` supports the JIS X 0201-1976, JIS X 0208-1990, and JIS X 0212-1990 character sets.

Japanese TrueType fonts are shown in Table 3-18.

Table 3-18 Japanese TrueType Fonts


Full Family Name	Subfamily	Format	Vendor	Encoding
`hg gothic b`	R	TrueType	RICOH	JISX0208.1983, JISX0201.1976
`hg mincho l`	R	TrueType	RICOH	JISX0208.1983, JISX0201.1976
`heiseimin`	R	TrueType	RICOH	JISX0212.1990

Japanese Bitmap Fonts are shown in Table 3-19 below.

Table 3-19 Japanese Bitmap Fonts


Full Family Name	Subfamily	Format	Vendor	Encoding
`gothic`	R, B	PCF(12,14,16,20,24)		JISX0208.1983, JISX0201.1976
`minchou`	R	PCF(12,14,16,20,24)		JISX0208.1983, JISX0201.1976
`hg gothic b`	R	PCF(12,14,16,18,20,24)	RICOH	JISX0208.1983, JISX0201.1976
`hg mincho l`	R	PCF(12,14,16,18,20,24)	RICOH	JISX0208.1983, JISX0201.1976
`heiseimin`	R	PCF(12,14,16,18,20,24)	RICOH	JISX0212.1990

Japanese Locales

Japanese Solaris 7 supports three locales. The ja locale is based on Japanese EUC. The ja_JP.PCK locale is based on PC-Kanji code (Shift JIS) and the ja_JP.UTF-8 locale is based on UTF-8.

Japanese Messages and `man` Pages

Some messages and manual pages have been translated into Japanese in Japanese Solaris 7.

Japanese Character Code Converter for `iconv`

Table 3-20 shows the conversions supported by iconv(1) and iconv(3). See the iconv_ja(5) man page for details.

Table 3-20 iconv Conversion Support


Source Code	Target Code
`eucJP`	`PCK`
`eucJP`	`JIS7`
`eucJP`	`SJIS`
`eucJP`	`UTF-8`
`eucJP`	`jis`
`eucJP`	`ibmj`
`SJIS`	`eucJP`
`SJIS`	`ISO-2022-JP`
`SJIS`	`UTF-8`
`SJIS`	`jis`
`SJIS`	`ibmj`
`PCK`	`eucJP`
`PCK`	`UTF-8`
`PCK`	`ISO-2022-JP`
`PCK`	`jis`
`PCK`	`ibmj`
`ISO-2022-JP`	`eucJP`
`ISO-2022-JP`	`PCK`
`ISO-2022-JP`	`SJIS`
`UTF-8`	`eucJP`
`UTF-8`	`SJIS`
`UTF-8`	`PCK`
`JIS7`	`eucJP`
`jis`	`eucJP`
`jis`	`PCK`
`jis`	`SJIS`
`ibmj`	`eucJP`
`ibmj`	`PCK`
`UTF-8`	`ISO-2022-JP`
`ISO-2022-JP`	`UTF-8`
`eucJP`	`UTF-8-Java`
`UTF-8-Java`	`eucJP`
`PCK`	`UTF-8-Java`
`UTF-8-Java`	`PCK`
`eucJP`	`ISO-2022-JP.RFC1468`
`PCK`	`ISO-2022-JP.RFC1468`
`UTF-8`	`ISO-2022-JP.RFC1468`
`eucJP`	`ibmj-EBCDIK`
`ibmj-EBCDIK`	`eucJP`
`PCK`	`ibmj-EBCDIK`
`ibmj-EBCDIK`	`PCK`
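A few of the conversions in Table 3-20 can be approximated with Python's standard codecs. The mapping from the table's names to Python codec names is an assumption made for this sketch (for example, `shift_jis` stands in for both PCK and SJIS), and names such as `jis`, `ibmj`, and `UTF-8-Java` are omitted because no matching Python codec is assumed. The helper name `iconv_like` is hypothetical.

```python
# Python codec names assumed to correspond to some Table 3-20 names.
PYTHON_CODEC = {
    "eucJP": "euc_jp",
    "PCK": "shift_jis",
    "SJIS": "shift_jis",
    "ISO-2022-JP": "iso2022_jp",
    "UTF-8": "utf_8",
}


def iconv_like(data, fromcode, tocode):
    """Convert bytes from one codeset to another by way of Unicode text."""
    return data.decode(PYTHON_CODEC[fromcode]).encode(PYTHON_CODEC[tocode])


text = "日本語"
euc = text.encode("euc_jp")
pck = iconv_like(euc, "eucJP", "PCK")
jis = iconv_like(euc, "eucJP", "ISO-2022-JP")
utf8 = iconv_like(pck, "PCK", "UTF-8")

# The same three characters occupy different byte sequences in each codeset.
for name, data in (("eucJP", euc), ("PCK", pck), ("ISO-2022-JP", jis), ("UTF-8", utf8)):
    print(name, data.hex())
```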

Japanese Character Code Converter for TTY STREAMS

There are TTY STREAMS modules that perform code conversion between an encoding for a specific terminal and an encoding for a specific locale. With an appropriate STREAMS module, a user can log in from a Japanese terminal into a Japanese locale, even if the encoding between the terminal and the Japanese locale does not match. tty(1) controls the behavior of those STREAMS modules.
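The kind of conversion such a module performs can be modeled in user space with Python's incremental codecs. This is only a sketch of the idea, not the STREAMS implementation: the function name `convert_stream` is hypothetical, and Shift JIS and EUC are assumed as the terminal and locale encodings. The point it illustrates is that a stream converter must cope with a multibyte character that is split across two reads.

```python
import codecs


def convert_stream(chunks, terminal_codec, locale_codec):
    """Yield chunks re-encoded for the locale, tolerating split characters."""
    decoder = codecs.getincrementaldecoder(terminal_codec)()
    encoder = codecs.getincrementalencoder(locale_codec)()
    for chunk in chunks:
        # The decoder buffers an incomplete trailing character until
        # the rest of it arrives in a later chunk.
        out = encoder.encode(decoder.decode(chunk))
        if out:
            yield out
    # Flush anything still buffered at the end of the input.
    tail = encoder.encode(decoder.decode(b"", final=True), final=True)
    if tail:
        yield tail


sjis = "日本語".encode("shift_jis")
# Split in the middle of the first two-byte character.
chunks = [sjis[:1], sjis[1:]]
euc = b"".join(convert_stream(chunks, "shift_jis", "euc_jp"))
print(euc.decode("euc_jp"))
```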

Japanese-specific Printer Support

The Japanese Solaris 7 product supports the following Japanese-specific printers:

Epson VP-5085 (based on ESC/P)
NEC PC-PR201 (based on 201PL)
Canon LASERSHOT (based on LIPS)
Japanese PostScript Printer

JLE Binary Compatibility Package

The Japanese Solaris 7 package also provides Japanese Solaris 1.1.x binary-compatibility packages that are the same as the base products.

User-Defined Character (UDC) Support

To handle User-Defined Characters, sdtudctool has been available since the Solaris 2.6 release. sdtudctool handles both outline (Type1) and bitmap (PCF) fonts. Some utilities are also available to migrate the UDC fonts that were created by old utilities, such as fontedit, type3creator, and fontmanager, in prior releases.

Korean Solaris 7 Product

The Korean Solaris product, used mostly in Korea, supports all the locales available in the English/Euro products. Additionally, it supports two Korean locales: ko and ko.UTF-8. In the ko locale, the EUC scheme is used to encode KSC 5601-1987. The ko.UTF-8 locale supports the KSC 5700-1995/Unicode 2.0 codeset, which is a superset of KSC 5601-1987. These two locales look the same to the end user, but the internal character encoding is different. The Korean Solaris product supports the following input methods.

For the ko locale:

Hangul 2-BeolSik (1 set of consonants and 1 set of vowels)
Hangul-Hanja conversion
Special character
Hexadecimal code

For the ko.UTF-8 locale:

Hangul 2-BeolSik (1 set of consonants and 1 set of vowels)
Hangul-Hanja conversion
Special character
Hexadecimal code

The following fonts are available in the Korean version of the Solaris 7 product:

Table 3-21 Solaris 7 Korean CID/Type 1 Fonts for the ko Locale


Full Family Name	Subfamily	Format	Vendor	Encoding
Gothic	R	CID/Type 1	Hanyang	Adobe-Korean
Graphic	R	CID/Type 1	Hanyang	Adobe-Korean
Haeso	R	CID/Type 1	Hanyang	Adobe-Korean
Kodig	R	CID/Type 1	Hanyang	Adobe-Korean
Myeongijo	R	CID/Type 1	Hanyang	Adobe-Korean
Pilki	R	CID/Type 1	Hanyang	Adobe-Korean
Roundgothic	R	CID/Type 1	Hanyang	Adobe-Korean

Table 3-22 Solaris 7 Korean Bitmap Fonts for the ko Locale


Full Family Name	Subfamily	Format	Encoding
Gothic	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987
Graphic	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987
Haeso	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987
Kodig	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987
Myeongijo	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987
Pilki	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987
Roundgothic	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1987

Table 3-23 Solaris 7 Korean CID/Type 1 Fonts for the ko.UTF-8 Locale


Full Family Name	Subfamily	Format	Vendor	Encoding
Gothic	R	CID/Type 1	Hanyang	Adobe-Korean
Graphic	R	CID/Type 1	Hanyang	Adobe-Korean
Haeso	R	CID/Type 1	Hanyang	Adobe-Korean
Kodig	R	CID/Type 1	Hanyang	Adobe-Korean
Myeongijo	R	CID/Type 1	Hanyang	Adobe-Korean
Pilki	R	CID/Type 1	Hanyang	Adobe-Korean

Table 3-24 Solaris 7 Korean Bitmap Fonts for the ko.UTF-8 Locale


Full Family Name	Subfamily	Format	Encoding
Gothic	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1992 (Johap)
Graphic	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1992 (Johap)
Haeso	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1992 (Johap)
Kodig	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1992 (Johap)
Myeongijo	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1992 (Johap)
Pilki	R/B	PCF (12,14,16,18,20,24)	KSC 5601-1992 (Johap)

Table 3-25 Korean ICONV


Code	Symbol	Target Code	Symbol
KSC 5601-1987	5601	UTF-8	UTF-8
ISO 646	646	KSC 5601-1987	5601
KSC 5601-1987	EUC-KR	UTF-8	UTF-8
KSC 5601-1987	KSC5601	UTF-8	UTF-8
UTF-8	UTF-8	KSC 5601-1987	5601
UTF-8	UTF-8	KSC 5601-1987	EUC-KR
UTF-8	UTF-8	KSC 5601-1987	KSC5601
UTF-8	ko_KR-UTF-8	IBM CP933	cp933
UTF-8	ko_KR-UTF-8	KSC 5601-1987	ko_KR-euc
UTF-8	ko_KR-UTF-8	ISO 2022-KR	ko_KR-iso2022-7
UTF-8	ko_KR-UTF-8	KSC 5601-1987 - Johap	ko_KR-johap
UTF-8	ko_KR-UTF-8	KSC 5601-1992 - Johap	ko_KR-johap92
IBM CP933	cp933	UTF-8	ko_KR-UTF-8
KSC 5601-1987	ko_KR-euc	UTF-8	ko_KR-UTF-8
KSC 5601-1987	ko_KR-euc	ISO 2022-KR	ko_KR-iso2022-7
KSC 5601-1987	ko_KR-euc	KSC 5601-1987 - Johap	ko_KR-johap
KSC 5601-1987	ko_KR-euc	KSC 5601-1992 - Johap	ko_KR-johap92
KSC 5601-1987	ko_KR-euc	KSC 5601-1992 - Annex:4	ko_KR-nbyte
ISO 2022-KR	ko_KR-iso2022-7	UTF-8	ko_KR-UTF-8
ISO 2022-KR	ko_KR-iso2022-7	KSC 5601-1987	ko_KR-euc
KSC 5601-1987 - Johap	ko_KR-johap	UTF-8	ko_KR-UTF-8
KSC 5601-1987 - Johap	ko_KR-johap	KSC 5601-1987	ko_KR-euc
KSC 5601-1992 - Johap	ko_KR-johap92	UTF-8	ko_KR-UTF-8
KSC 5601-1992 - Johap	ko_KR-johap92	KSC 5601-1987	ko_KR-euc
KSC 5601-1992 - Annex:4	ko_KR-nbyte	KSC 5601-1987	ko_KR-euc
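The statement that the ko and ko.UTF-8 locales look the same to the user but differ internally can be illustrated with Python's Korean codecs. The codec names used here (`euc_kr`, `johab`, `iso2022_kr`) are assumed to approximate the ko_KR-euc, ko_KR-johap92, and ko_KR-iso2022-7 conversions in Table 3-25; they are Python's implementations, not the Solaris iconv modules.

```python
text = "한글"  # the same two syllables in every encoding below

euc = text.encode("euc_kr")      # Wansung (EUC), as in the ko locale
utf8 = text.encode("utf-8")      # as in the ko.UTF-8 locale
johab = text.encode("johab")     # Johap
iso = text.encode("iso2022_kr")  # 7-bit ISO 2022-KR

for name, data in (("euc_kr", euc), ("utf-8", utf8), ("johab", johab), ("iso2022_kr", iso)):
    print(name, len(data), data.hex())

# Converting EUC bytes to UTF-8 and back restores the original bytes.
round_trip = euc.decode("euc_kr").encode("utf-8").decode("utf-8").encode("euc_kr")
print(round_trip == euc)
```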

How to Use the `iconv` Command

The iconv command converts the characters or sequences of characters in a file from one codeset to another, then writes the results to standard output. If there is no conversion for a particular character, it is converted into an underscore (`_`) in the target codeset. See the iconv(1) man page for more information.

The following options are supported:

-f fromcode Symbol of the input codeset.
-t tocode Symbol of the output codeset.

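To convert a mail file from one encoding into another, use the iconv command and redirect the output to a new file. Redirecting the output to the input file itself would truncate the input before it is read:

example% iconv -f from_codeset -t to_codeset mail.from_codeset > mail.to_codeset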
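The behavior described in this section, including the underscore substitution for characters that have no conversion, can be imitated in Python. This is a rough sketch and not a replacement for iconv(1): the error handler name `underscore`, the function `convert_file`, and the sample file names are all hypothetical, and Python codec names are used in place of the Solaris codeset symbols.

```python
import codecs
import os
import tempfile


def underscore_errors(error):
    """Replace each unconvertible character with '_' and continue."""
    if isinstance(error, UnicodeEncodeError):
        return ("_" * (error.end - error.start), error.end)
    raise error


codecs.register_error("underscore", underscore_errors)


def convert_file(src_path, dst_path, fromcode, tocode):
    """Convert src_path from fromcode to tocode, writing to dst_path."""
    # Read and write different files; writing to the input path would
    # truncate the input before it is read.
    with open(src_path, "r", encoding=fromcode, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=tocode, errors="underscore", newline="") as dst:
        dst.write(text)


with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "mail.utf8")
    dst = os.path.join(workdir, "mail.iso2")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write("Łódź €\n")  # the euro sign has no ISO 8859-2 equivalent
    convert_file(src, dst, "utf-8", "iso8859_2")
    with open(dst, "r", encoding="iso8859_2", newline="") as f:
        result = f.read()

print(result)
```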