International Language Environments Guide

Chapter 3 Contents of Solaris 8 Products

Overview of the Solaris 8 Locales

Multiple environments exist within the Solaris operating system for support of different national languages. Each of these national environments is called a locale, which considers the language, its characters, fonts, and the customs used to input and format data.

A locale defines the behavior of a program at run time according to the language and cultural conventions of a user's geographical area. Throughout the system, locales affect the following:

Summary of the Solaris 8 Locale

All Solaris 8 locale packages are classified into two categories. The first category is for partial locales, which are the enablers of the locales. With partial locales installed on the system, users can run applications on the target locales, while the OS/GUI messages from Solaris are English. All partial locale packages are available on the Solaris OS CDs.

The second category is for full locale packages. These packages include translations of software messages, on-line help files, optional fonts, and language specific features. Full locale packages provide the full set of language features to 9 languages.

Full locale packages are available on the languages CD. Partial locale packages (locale enablers) have to be installed in order for the full locales to be functional.

Localization Content on Solaris 8 CD-ROMs

Partial locales are selected at the beginning of the install procedure on the OS CD-ROM. Full locales are automatically installed from the Language CD-ROM according to the locale selections made at the beginning of the install procedure.

The distribution of locales is shown in the table below.

Table 3-1 Solaris 8 Installation CD-ROMs

Disk               Contents
Solaris OS CD-ROM  Solaris 8 Operating System; all partial locales
Language CD-ROM    Message translations for 9 languages; locale-specific utilities

As mentioned, the locales include partial locales, which are based on the core locale for the main language. For example, the fr_CA.ISO8859-1 (French Canadian) locale is based on the fr_FR.ISO8859-1 (French) locale. A partial locale uses the messages that are delivered in its parent locale (French for fr_CA). If a locale has not been fully localized, it might contain only English messages.

Localization Functions in Solaris Interfaces

The OS locale layer provides the basic locale database and functions that are plugged into the OS system interface at the application's run time. Applications will access these OS locale modules through standard APIs as described in Chapter 2.

The X11 locale layer provides the interface to X input method and X output method such that the X11 applications can allow local text input and display. Fonts are provided to allow applications to display characters from various languages.

CDE/Motif is built on top of the X11 window system. Hence, it can utilize the X11 locale capability through X11 APIs. Solaris localizations have various locale-specific configurations for CDE applications, in order to make the desktop functional within the target locale.

Message translations and on-line help contents are provided throughout different layers as described in the following diagram.

Figure 3-1 Functions and Structure of Locales in Solaris


Script Enabling for Solaris 8

The Solaris 8 base product provides multiple levels of script enabling, such as simple ASCII support, Latin/European support, Asian multibyte support, and Arabic/Hebrew bidirectional support.

The interfaces defined within the X/Open specification are capable of supporting a large set of languages and territories, including the following types of script:

Script          Description
Latin Language  Americas, Eastern/Western Europe, Turkey
Greek           Greece
East Asia       Japanese, Korean, and Chinese
Indic           Thai
Bidirectional   Arabic and Hebrew
Cyrillic        Russian

Localization in the Base and Multilingual Solaris Product

The base Solaris 8 product includes all partial locales (including multibyte locales), which provide the functionality needed to input, display, and print text in their target languages while using English user interfaces.

The multilingual Solaris 8 product is a superset of the base Solaris product. It additionally includes translations for 9 languages (user interface and documentation) and some additional software, such as BCP support, optional fonts, and optional utilities, on the Language CD.

The English Unicode locale (en_US.UTF-8) is installed by default, while other locales are installed when they are selected as install locales during the Solaris install process. Because the UTF-8 locales require fonts for all languages, basic fonts supporting all languages are also installed by default.

The File System Safe Universal Transformation Format, or UTF-8, is an encoding defined by X/Open as a multi-byte representation of Unicode. UTF-8 encompasses almost all of the characters for traditional single-byte and multi-byte locales for European and Asian languages for Solaris locales.

Additional locale support is packaged according to the geographic region that it supports. During the Solaris install process, you are prompted to choose the geographic regions for which you require support. The locale support available after installation depends on the choices made at this stage.

The following tables list all the locales supported by the Solaris 8 environment. The locale names have been updated from the Solaris 7 environment in keeping with international naming standards.

All of these locales are also present in the base Solaris 8 release.

Table 3-2 Asia

Locale       User Interface       Territory  Codeset      Language Support
ja           Japanese             Japan      eucJP        Japanese (EUC) JISX0201-1976, JISX0208-1990, JISX0212-1990
ja_JP.PCK    Japanese             Japan      PCK          Japanese (PC kanji) JISX0201-1976, JISX0208-1990
ja_JP.UTF-8  Japanese             Japan      UTF-8        Japanese (UTF-8) Unicode 3.0
ko           Korean               Korea      5601         Korean (EUC) KSC 5601-1987
ko.UTF-8     Korean               Korea      UTF-8        Korean (UTF-8) KSC Unicode 3.0
th           English              Thailand   TIS620.2533  Thai TIS620.2533
zh           Simplified Chinese   PRC        gb2312       Simplified Chinese (EUC) GB2312-1980
zh.GBK       Simplified Chinese   PRC        GBK          Simplified Chinese (GBK) GBK
zh.UTF-8     Simplified Chinese   PRC        UTF-8        Simplified Chinese (UTF-8) Unicode 3.0
zh_TW        Traditional Chinese  Taiwan     cns11643     Traditional Chinese (EUC) CNS 11643-1992
zh_TW.BIG5   Traditional Chinese  Taiwan     BIG5         Traditional Chinese (BIG5) BIG5
zh_TW.UTF-8  Traditional Chinese  Taiwan     UTF-8        Traditional Chinese (UTF-8) Unicode 3.0

Table 3-3 Australasia

Locale           User Interface  Territory    Codeset    Language Support
en_AU.ISO8859-1  English         Australia    ISO8859-1  English (Australia)
en_NZ.ISO8859-1  English         New Zealand  ISO8859-1  English (New Zealand)

Table 3-4 Central America

Locale           User Interface  Territory    Codeset    Language Support
es_CR.ISO8859-1  Spanish         Costa Rica   ISO8859-1  Spanish (Costa Rica)
es_GT.ISO8859-1  Spanish         Guatemala    ISO8859-1  Spanish (Guatemala)
es_MX.ISO8859-1  Spanish         Mexico       ISO8859-1  Spanish (Mexico)
es_NI.ISO8859-1  Spanish         Nicaragua    ISO8859-1  Spanish (Nicaragua)
es_PA.ISO8859-1  Spanish         Panama       ISO8859-1  Spanish (Panama)
es_SV.ISO8859-1  Spanish         El Salvador  ISO8859-1  Spanish (El Salvador)

Table 3-5 Central Europe

Locale            User Interface  Territory       Codeset     Language Support
cs_CZ.ISO8859-2   English         Czech Republic  ISO8859-2   Czech (Czech Republic)
de_AT.ISO8859-1   German          Austria         ISO8859-1   German (Austria)
de_AT.ISO8859-15  German          Austria         ISO8859-15  German (Austria, ISO8859-15 - Euro)
de_CH.ISO8859-1   German          Switzerland     ISO8859-1   German (Switzerland)
de_DE.UTF-8       German          Germany         UTF-8       German (Germany, Unicode 3.0)
de_DE.ISO8859-1   German          Germany         ISO8859-1   German (Germany)
de_DE.ISO8859-15  German          Germany         ISO8859-15  German (Germany, ISO8859-15 - Euro)
fr_CH.ISO8859-1   French          Switzerland     ISO8859-1   French (Switzerland)
hu_HU.ISO8859-2   English         Hungary         ISO8859-2   Hungarian (Hungary)
pl_PL.ISO8859-2   English         Poland          ISO8859-2   Polish (Poland)
sk_SK.ISO8859-2   English         Slovakia        ISO8859-2   Slovak (Slovakia)

Table 3-6 Eastern Europe

Locale                  User Interface  Territory  Codeset     Language Support
bg_BG.ISO8859-5         English         Bulgaria   ISO8859-5   Bulgarian (Bulgaria)
et_EE.ISO8859-15        English         Estonia    ISO8859-15  Estonian (Estonia)
hr_HR.ISO8859-2         English         Croatia    ISO8859-2   Croatian (Croatia)
lt_LT.ISO8859-13        English         Lithuania  ISO8859-13  Lithuanian (Lithuania)
lv_LV.ISO8859-13        English         Latvia     ISO8859-13  Latvian (Latvia)
mk_MK.ISO8859-5         English         Macedonia  ISO8859-5   Macedonian (Macedonia)
ro_RO.ISO8859-2         English         Romania    ISO8859-2   Romanian (Romania)
ru_RU.KOI8-R            English         Russia     KOI8-R      Russian (Russia, KOI8-R)
ru_RU.ANSI1251          English         Russia     ansi-1251   Russian (Russia, ANSI 1251)
ru_RU.ISO8859-5         English         Russia     ISO8859-5   Russian (Russia)
sh_BA.ISO8859-2@bosnia  English         Bosnia     ISO8859-2   Bosnian (Bosnia)
sl_SI.ISO8859-2         English         Slovenia   ISO8859-2   Slovenian (Slovenia)
sq_AL.ISO8859-2         English         Albania    ISO8859-2   Albanian (Albania)
sr_YU.ISO8859-5         English         Serbia     ISO8859-5   Serbian (Serbia)
tr_TR.ISO8859-9         English         Turkey     ISO8859-9   Turkish (Turkey)

Table 3-7 Middle East

Locale           User Interface  Territory  Codeset    Language Support
he_IL.ISO8859-6  English         Israel     ISO8859-6  Hebrew (Israel)

Table 3-8 North Africa

Locale           User Interface  Territory  Codeset    Language Support
ar_EY.ISO8859-1  English         Egypt      ISO8859-6  Arabic (Egypt)

Table 3-9 North America

Locale            User Interface  Territory  Codeset     Language Support
en_CA.ISO8859-1   English         Canada     ISO8859-1   English (Canada)
en_US.ISO8859-1   English         USA        ISO8859-1   English (U.S.A.)
en_US.ISO8859-15  English         USA        ISO8859-15  English (U.S.A., ISO8859-15 - Euro)
en_US.UTF-8       English         USA        UTF-8       English (U.S.A., Unicode 3.0)
fr_CA.ISO8859-1   French          Canada     ISO8859-1   French (Canada)

Table 3-10 North Europe

Locale                   User Interface  Territory  Codeset     Language Support
da_DK.ISO8859-1          English         Denmark    ISO8859-1   Danish (Denmark)
da_DK.ISO8859-15         English         Denmark    ISO8859-15  Danish (Denmark, ISO8859-15 - Euro)
fi_FI.ISO8859-1          English         Finland    ISO8859-1   Finnish (Finland)
fi_FI.ISO8859-15         English         Finland    ISO8859-15  Finnish (Finland, ISO8859-15 - Euro)
is_IS.ISO8859-1          English         Iceland    ISO8859-1   Icelandic (Iceland)
no_NO.ISO8859-1@bokmal   English         Norway     ISO8859-1   Norwegian (Norway -- Bokmal)
no_NO.ISO8859-1@nynorsk  English         Norway     ISO8859-1   Norwegian (Norway -- Nynorsk)
sv_SE.ISO8859-1          Swedish         Sweden     ISO8859-1   Swedish (Sweden)
sv_SE.ISO8859-15         Swedish         Sweden     ISO8859-15  Swedish (Sweden, ISO8859-15 - Euro)
sv_SE.UTF-8              Swedish         Sweden     UTF-8       Swedish (Sweden, Unicode 3.0)

Table 3-11 South America

Locale           User Interface  Territory  Codeset    Language Support
es_AR.ISO8859-1  Spanish         Argentina  ISO8859-1  Spanish (Argentina)
es_BO.ISO8859-1  Spanish         Bolivia    ISO8859-1  Spanish (Bolivia)
es_CL.ISO8859-1  Spanish         Chile      ISO8859-1  Spanish (Chile)
es_CO.ISO8859-1  Spanish         Colombia   ISO8859-1  Spanish (Colombia)
es_EC.ISO8859-1  Spanish         Ecuador    ISO8859-1  Spanish (Ecuador)
es_PE.ISO8859-1  Spanish         Peru       ISO8859-1  Spanish (Peru)
es_PY.ISO8859-1  Spanish         Paraguay   ISO8859-1  Spanish (Paraguay)
es_UY.ISO8859-1  Spanish         Uruguay    ISO8859-1  Spanish (Uruguay)
es_VE.ISO8859-1  Spanish         Venezuela  ISO8859-1  Spanish (Venezuela)
pt_BR.ISO8859-1  English         Brazil     ISO8859-1  Portuguese (Brazil)

Table 3-12 South Europe

Locale            User Interface  Territory  Codeset     Language Support
el_GR.ISO8859-7   English         Greece     ISO8859-7   Greek (Greece)
es_ES.ISO8859-1   Spanish         Spain      ISO8859-1   Spanish (Spain)
es_ES.ISO8859-15  Spanish         Spain      ISO8859-15  Spanish (Spain, ISO8859-15 - Euro)
es_ES.UTF-8       Spanish         Spain      UTF-8       Spanish (Spain, Unicode 3.0)
it_IT.ISO8859-1   Italian         Italy      ISO8859-1   Italian (Italy)
it_IT.ISO8859-15  Italian         Italy      ISO8859-15  Italian (Italy, ISO8859-15 - Euro)
it_IT.UTF-8       Italian         Italy      UTF-8       Italian (Italy, Unicode 3.0)
pt_PT.ISO8859-1   English         Portugal   ISO8859-1   Portuguese (Portugal)
pt_PT.ISO8859-15  English         Portugal   ISO8859-15  Portuguese (Portugal, ISO8859-15 - Euro)

Table 3-13 Western Europe

Locale            User Interface  Territory        Codeset     Language Support
en_GB.ISO8859-1   English         Great Britain    ISO8859-1   English (Great Britain)
en_GB.ISO8859-15  English         Great Britain    ISO8859-15  English (Great Britain, ISO8859-15 - Euro)
en_IE.ISO8859-1   English         Ireland          ISO8859-1   English (Ireland)
en_IE.ISO8859-15  English         Ireland          ISO8859-15  English (Ireland, ISO8859-15 - Euro)
fr_BE.ISO8859-1   French          Belgium-Walloon  ISO8859-1   French (Belgium-Walloon)
fr_BE.ISO8859-15  French          Belgium-Walloon  ISO8859-15  French (Belgium-Walloon, ISO8859-15 - Euro)
fr_FR.ISO8859-1   French          France           ISO8859-1   French (France)
fr_FR.ISO8859-15  French          France           ISO8859-15  French (France, ISO8859-15 - Euro)
fr_FR.UTF-8       French          France           UTF-8       French (France, Unicode 3.0)
nl_BE.ISO8859-1   English         Belgium-Flemish  ISO8859-1   Dutch (Belgium-Flemish)
nl_BE.ISO8859-15  English         Belgium-Flemish  ISO8859-15  Dutch (Belgium-Flemish, ISO8859-15 - Euro)
nl_NL.ISO8859-1   English         Netherlands      ISO8859-1   Dutch (Netherlands)
nl_NL.ISO8859-15  English         Netherlands      ISO8859-15  Dutch (Netherlands, ISO8859-15 - Euro)


Note -

Locale naming conventions are as follows:

language[_territory][.codeset]

where language is from ISO639 and territory is from ISO3166.

All Solaris product locales preserve the Portable Character Set characters with US-ASCII code values.

A single locale can have more than one locale name. For example, ja_JP.eucJP is the same as ja. Also, fr_FR.ISO8859-1 is the same as fr.
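
The naming convention above can be illustrated with a small shell function that splits a locale name into its parts. This is only a sketch, not a Solaris utility: the function name parse_locale is hypothetical, and the optional @modifier suffix is inferred from names such as no_NO.ISO8859-1@bokmal in the locale tables rather than from the convention stated in this note.

```shell
# parse_locale NAME
# Hypothetical helper: split language[_territory][.codeset][@modifier]
# using only POSIX parameter expansion.
parse_locale() {
    name=$1
    modifier=
    codeset=
    territory=
    # Strip the parts from right to left: modifier, codeset, territory.
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    # Whatever remains is the language code.
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$name" "$territory" "$codeset" "$modifier"
}

parse_locale fr_CA.ISO8859-1
parse_locale ja
```

For a short name such as ja, the territory and codeset fields come back empty, which matches the note that ja and ja_JP.eucJP name the same locale: the short form simply omits the optional parts.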



Note -

5601 signifies the Korean EUC codeset containing KS C 5636 and KS C 5601-1987.

eucJP signifies the Japanese EUC codeset. It contains JIS X0201-1976, JIS X0208-1983, and JIS X0212-1990.

gb2312 signifies the Simplified Chinese EUC codeset, which contains GB 1988-80 and GB 2312-80.

PCK is also known as Shift JIS (SJIS).

UTF-8 is the UTF-8 of ISO/IEC 10646-1 containing various approved amendments and Unicode 3.0.

GBK signifies GB extensions. This includes all GB 2312-80 characters and all Unified Han characters of ISO/IEC 10646-1, as well as Japanese Hiragana and Katakana characters. It also includes many characters of Chinese, Japanese, and Korean character sets and of ISO/IEC 10646-1.


European Localization

Solaris 8 software supports the euro currency. Local currency symbols are still available for backward compatibility.

Table 3-14 User Locales To Support the Euro Currency

Region            Locale Name       ISO Codeset
Austria           de_AT.ISO8859-15  8859-15
Belgium (French)  fr_BE.ISO8859-15  8859-15
Belgium (Dutch)   nl_BE.ISO8859-15  8859-15
Denmark           da_DK.ISO8859-15  8859-15
Finland           fi_FI.ISO8859-15  8859-15
France            fr_FR.ISO8859-15  8859-15
Germany           de_DE.ISO8859-15  8859-15
Ireland           en_IE.ISO8859-15  8859-15
Italy             it_IT.ISO8859-15  8859-15
Netherlands       nl_NL.ISO8859-15  8859-15
Portugal          pt_PT.ISO8859-15  8859-15
Spain             es_ES.ISO8859-15  8859-15
Sweden            sv_SE.ISO8859-15  8859-15
Great Britain     en_GB.ISO8859-15  8859-15
Europe            en_EU             8859-15
U.S.A.            en_US             8859-15
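
Most of the euro locales in this table differ from their ISO8859-1 counterparts only in the codeset suffix. The following sketch turns that pattern into a shell function. The name euro_locale is hypothetical, and the function only rewrites the string: it does not check that the resulting locale appears in the table or is installed on the system.

```shell
# euro_locale NAME
# Hypothetical helper: print the ISO8859-15 counterpart of an ISO8859-1
# locale name. Any other name is printed unchanged.
euro_locale() {
    case $1 in
        *.ISO8859-1) printf '%s\n' "${1%.ISO8859-1}.ISO8859-15" ;;
        *)           printf '%s\n' "$1" ;;
    esac
}

euro_locale de_AT.ISO8859-1
```

Names that already end in ISO8859-15, or that use another codeset such as UTF-8, pass through untouched, so the function is safe to apply to any value of LANG before checking the result against the table.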

Multiple Key Compose Sequences for Locales

The Solaris 8 operating environment supports "Compose Sequences" to create the diacritical marks used in writing the scripts covered in the following codesets:

These are the diacritic characters that can be created with the following keys and the Compose key.

Keyboard Support in the Solaris 8 Product

The following locales have keyboard layouts for SPARC (X-server) and IA (Xserver PLUS console):

Changing Between Keyboards on SPARC

Support for changing layouts in the Solaris product is achieved only by using the dip-switch settings under the keyboard. The keyboard layout is determined by the dip switches. A list of keyboard layouts and corresponding defined dip-switch settings is at /usr/openwin/share/etc/keytables/keytable.map.

The following is a layout table for a type 4 keyboard (1=switch up, 0=switch down).

Table 3-15 Layouts for Type 4 Keyboards

Dip Switch  Keyboard       Setting in Binary
51          Hungary5.kt    110011
52          Poland5.kt     110100
53          Czech5.kt      110101
54          Russia5.kt     110110
55          Latvia5.kt     110111
56          Turkey5.kt     111000
57          Greece5.kt     111001
58          Lithuania5.kt  111010

To change the layout from U.S./GB to Czech, change the dip switches to the setting defined for Czech in the file. The file defines the switch settings in hexadecimal, so convert the value to binary, set the switches accordingly, and then reboot the system.
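
The hexadecimal-to-binary conversion can be done in the shell. The following is a minimal sketch, not a bundled utility: the function name dip_bits is hypothetical, the six-switch width is assumed from the six-digit settings in the table above, and hexadecimal input is assumed to carry a 0x prefix (add one if the value in the file is written without it).

```shell
# dip_bits VALUE
# Print VALUE as a six-digit binary string (1 = switch up, 0 = switch down).
# VALUE may be decimal (53) or hexadecimal with a 0x prefix (0x35).
dip_bits() {
    n=$(( $1 ))
    bits=
    i=0
    # Peel off the lowest bit six times, prepending each to the result.
    while [ "$i" -lt 6 ]; do
        bits=$(( n % 2 ))$bits
        n=$(( n / 2 ))
        i=$(( i + 1 ))
    done
    printf '%s\n' "$bits"
}

dip_bits 0x35    # layout 53 (Czech); prints 110101
```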

Russian and Greek keyboard support can be toggled on and off using the SPARC Compose key (Ctrl+Shift+F1 on IA).

Changing Between Keyboards on IA

On IA, a keyboard is selected during the kdmconfig part of install. To change this at any time after installation, use kdmconfig:

  1. Exit CDE/OW to the command line.

  2. Type kdmconfig -u (kdmconfig unconfigure).

  3. Type kdmconfig to run the program.

  4. Follow instructions to get a keyboard layout.

There are no `utilities' for either SPARC or IA (apart from standard UNIX tools such as xmodmap, pcmapkeys) bundled into Solaris 8 for switching keyboards.

Codesets for IA

The default codeset on the Solaris system for IA is ISO-8859-1. The IBM DOS 437 codeset is provided as an option in text mode. If you choose to load the IBM DOS 437 codeset, type the following commands. The limitations described after the commands then apply.


loadfont -c 437
pcmapkeys -f /usr/share/lib/keyboards/437/en_US

Nonstandard U.S. date, time, currency, numbers, units, and collation are not supported. Non-English message and text presentation is not supported, nor is multibyte character support. Therefore, non-Microsoft Windows users should use the IBM DOS 437 codeset only in the default C locale.

All of the locales support character input and output. There is also iconv support for many of the major codesets. (For more on iconv, see iconv(1).)

Table 3-16 iconv Support

Source Code   Source Symbol  Target Code   Target Symbol  Language Support
ISO 8859-2    iso2           MS 1250       win2           Windows Latin 2
ISO 8859-2    iso2           MS 852        dos2           MS-DOS Latin 2
ISO 8859-2    iso2           Mazovia       maz            Mazovia
ISO 8859-2    iso2           DHN           dhn            Dom Handlowy Nauki
MS 1250       win2           ISO 8859-2    iso2           ISO Latin 2
MS 1250       win2           MS 852        dos2           MS-DOS Latin 2
MS 1250       win2           Mazovia       maz            Mazovia
MS 1250       win2           DHN           dhn            Dom Handlowy Nauki
MS 852        dos2           ISO 8859-2    iso2           ISO Latin 2
MS 852        dos2           MS 1250       win2           Windows Latin 2
MS 852        dos2           Mazovia       maz            Mazovia
MS 852        dos2           DHN           dhn            Dom Handlowy Nauki
Mazovia       maz            ISO 8859-2    iso2           ISO Latin 2
Mazovia       maz            MS 1250       win2           Windows Latin 2
Mazovia       maz            MS 852        dos2           MS-DOS Latin 2
Mazovia       maz            DHN           dhn            Dom Handlowy Nauki
DHN           dhn            ISO 8859-2    iso2           ISO Latin 2
DHN           dhn            MS 1250       win2           Windows Latin 2
DHN           dhn            MS 852        dos2           MS-DOS Latin 2
DHN           dhn            Mazovia       maz            Mazovia
ISO 8859-5    iso5           KOI8-R        koi8           KOI8-R
ISO 8859-5    iso5           PC Cyrillic   alt            Alternative PC Cyrillic
ISO 8859-5    iso5           MS 1251       win5           Windows Cyrillic
ISO 8859-5    iso5           Mac Cyrillic  mac            Macintosh Cyrillic
KOI8-R        koi8           ISO 8859-5    iso5           ISO 8859-5 Cyrillic
KOI8-R        koi8           PC Cyrillic   alt            Alternative PC Cyrillic
KOI8-R        koi8           MS 1251       win5           Windows Cyrillic
KOI8-R        koi8           Mac Cyrillic  mac            Macintosh Cyrillic
PC Cyrillic   alt            ISO 8859-5    iso5           ISO 8859-5 Cyrillic
PC Cyrillic   alt            KOI8-R        koi8           KOI8-R
PC Cyrillic   alt            MS 1251       win5           Windows Cyrillic
PC Cyrillic   alt            Mac Cyrillic  mac            Macintosh Cyrillic
MS 1251       win5           ISO 8859-5    iso5           ISO 8859-5 Cyrillic
MS 1251       win5           KOI8-R        koi8           KOI8-R
MS 1251       win5           PC Cyrillic   alt            Alternative PC Cyrillic
MS 1251       win5           Mac Cyrillic  mac            Macintosh Cyrillic
Mac Cyrillic  mac            ISO 8859-5    iso5           ISO 8859-5 Cyrillic
Mac Cyrillic  mac            KOI8-R        koi8           KOI8-R
Mac Cyrillic  mac            PC Cyrillic   alt            Alternative PC Cyrillic
Mac Cyrillic  mac            MS 1251       win5           Windows Cyrillic
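
Table 3-16 lists every conversion between two different codesets of the same family (Latin 2 or Cyrillic) and none across families. The following sketch encodes that rule so that a script can check a pair before calling iconv. The function names in_list, can_convert, and convert_file are hypothetical. The symbols are the ones shown in the table and are assumed to be accepted by the -f and -t options of iconv(1) on a Solaris 8 system.

```shell
# Conversion symbols from Table 3-16, grouped by family.
LATIN2_SYMBOLS="iso2 win2 dos2 maz dhn"
CYRILLIC_SYMBOLS="iso5 koi8 alt win5 mac"

# in_list WORD LIST: succeed if WORD is one of the space-separated items.
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# can_convert FROM TO: succeed if the table lists the pair, that is,
# both symbols differ and belong to the same family.
can_convert() {
    [ "$1" != "$2" ] || return 1
    for family in "$LATIN2_SYMBOLS" "$CYRILLIC_SYMBOLS"; do
        if in_list "$1" "$family" && in_list "$2" "$family"; then
            return 0
        fi
    done
    return 1
}

# convert_file FROM TO FILE: run iconv only for a listed pair.
convert_file() {
    if can_convert "$1" "$2"; then
        iconv -f "$1" -t "$2" "$3"
    else
        echo "no Table 3-16 conversion from $1 to $2" >&2
        return 1
    fi
}
```

For example, `convert_file iso2 win2 letter.txt > letter.win.txt` (with a hypothetical input file letter.txt) would convert ISO 8859-2 text to MS 1250, whereas `convert_file iso2 koi8 letter.txt` is rejected before iconv is invoked.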

Font Formats

Location of Fonts on the System

Fonts to support European locales are available in various formats, such as bitmap, PostScript Type 1, and TrueType. The actual availability varies per character set.

Fonts are located at:

/usr/openwin/lib/locale/iso_8859_x/X11/fonts/

Adding and Removing Font Packages

When you manually add or remove font packages, follow this order:

  1. Always add the required font packages before the optional font packages.

  2. Remove the optional font packages first, when you are removing font packages from the system.

You must follow this procedure to add or remove fonts. The class action scripts in the font packages depend on this to function properly. The optional font packages contain scripts that concatenate information onto the required font packages that are already resident on the system. If the required font packages are not there, problems can occur.
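
The ordering rule can be captured in a small dry-run script that prints the pkgadd and pkgrm commands in a safe order without executing them. This is a sketch only: the package names REQfontA, REQfontB, and OPTfontA are hypothetical placeholders, as are the function names, and you would substitute the actual required and optional font packages for your locale.

```shell
# Hypothetical package names; substitute the font packages for your locale.
REQUIRED_PKGS="REQfontA REQfontB"
OPTIONAL_PKGS="OPTfontA"

# print_add_plan DIR: list pkgadd commands, required packages first.
# DIR is the directory or device that holds the packages.
print_add_plan() {
    for pkg in $REQUIRED_PKGS $OPTIONAL_PKGS; do
        printf 'pkgadd -d %s %s\n' "$1" "$pkg"
    done
}

# print_remove_plan: list pkgrm commands, optional packages first.
print_remove_plan() {
    for pkg in $OPTIONAL_PKGS $REQUIRED_PKGS; do
        printf 'pkgrm %s\n' "$pkg"
    done
}
```

Printing the plan first makes it easy to review the order before running the commands as root. The two functions differ only in which list comes first, which mirrors the rule that the optional packages depend on the required ones being present.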

Summary of Asian Locales

The following table shows the Asian supported locales.

Table 3-17 Summary of Asian Locales

CD Set: Korean
  Locale names:              ko, ko.UTF-8
  Descriptions:              Korean (EUC), Korean (UTF-8)
  Supported character sets:  KSC 5601-1987, KSC 5601-1992

CD Set: Simplified Chinese
  Locale names:              zh, zh.GBK, zh.UTF-8
  Descriptions:              Simplified Chinese (EUC), Simplified Chinese (GBK), Simplified Chinese (UTF-8)
  Supported character sets:  GB 2312-1980, GBK, Unicode 3.0

CD Set: Traditional Chinese
  Locale names:              zh_TW, zh_TW.BIG5, zh_TW.UTF-8
  Descriptions:              Traditional Chinese (EUC), Traditional Chinese (BIG5), Traditional Chinese (UTF-8)
  Supported character sets:  CNS 11643-1992, BIG5, Unicode 3.0

CD Set: Japanese
  Locale names:              ja, ja_JP.PCK, ja_JP.UTF-8
  Descriptions:              Japanese (EUC), Japanese (PCK), Japanese (UTF-8)
  Supported character sets:  JIS X 0201-1976, JIS X 0208-1990, JIS X 0212-1990, VDC, UDC

Notes on the Japanese entry:

ja_JP.PCK does not support JIS X 0212-1990.

VDC: Vendor Defined Character. VDCs occupy unused (reserved) code points of JIS X 0208-1990 or JIS X 0212-1990.

UDC: User Defined Character. UDCs occupy unused (reserved) code points of JIS X 0208-1990 or JIS X 0212-1990 (also unused for VDCs).

Simplified Chinese Localization

Simplified Chinese in the Solaris 8 environment provides three locales: zh, zh.UTF-8, and zh.GBK. In the zh locale, the EUC scheme is used to encode GB2312-80. The zh.GBK locale supports the GBK codeset, which is a superset of GB2312-80.

Simplified Chinese is used mostly in the People's Republic of China (PRC) and in Singapore.

The following input methods are supported for the zh locale:

The following input methods are supported for both the zh.GBK and the zh.UTF-8 locales:

The following table shows the TrueType fonts for the zh locale.

Table 3-18 Solaris 8 TrueType Fonts for the zh Locale

Full Family Name  Subfamily  Format    Vendor    Encoding
Fangsong          R          TrueType  Hanyi     GB2312.1980
Hei               R          TrueType  Monotype  GB2312.1980
Kai               R          TrueType  Monotype  GB2312.1980
Song              R          TrueType  Monotype  GB2312.1980

The following table shows the Bitmap Fonts for the zh Locale.

Table 3-19 Solaris 8 Bitmap Fonts for the zh Locale

Full Family Name  Subfamily  Format                Encoding
Song              B          PCF (14,16)           GB2312.1980
Song              R          PCF (12,14,16,20,24)  GB2312.1980

The following table shows the TrueType fonts for the zh.GBK Locale.

Table 3-20 TrueType Fonts for the zh.GBK Locale

Full Family Name  Subfamily  Format    Vendor   Encoding
Fangsong          R          TrueType  Zhongyi  GBK
Hei               R          TrueType  Zhongyi  GBK
Kai               R          TrueType  Zhongyi  GBK
Song              R          TrueType  Zhongyi  GBK

The following table shows the Bitmap Fonts for the zh.GBK Locale.

Table 3-21 Bitmap Fonts for the zh.GBK Locale

Full Family Name  Subfamily  Format                Encoding
Song              R          PCF (12,14,16,20,24)  GBK

The following table shows the supported codeset conversions for Simplified Chinese.

Table 3-22 Codeset Conversions for Simplified Chinese

Source Code  Source Symbol     Target Code  Target Symbol
GB2312-80    zh_CN.euc         ISO 2022-7   zh_CN.iso2022-7
ISO 2022-7   zh_CN.iso2022-7   GB2312-80    zh_CN.euc
GB2312-80    zh_CN.euc         ISO 2022-CN  zh_CN.iso2022-CN
HZ-GB-2312   HZ-GB-2312        GB2312-80    zh_CN.euc
HZ-GB-2312   HZ-GB-2312        GBK          zh_CN.gbk
HZ-GB-2312   HZ-GB-2312        UTF-8        UTF-8
ISO 2022-CN  zh_CN.iso2022-CN  GB2312-80    zh_CN.euc
UTF-8        UTF-8             GB2312-80    zh_CN.euc
GB2312-80    zh_CN.euc         UTF-8        UTF-8
zh.GBK       zh_CN.gbk         ISO 2022-CN  zh_CN.iso2022-CN
ISO 2022-CN  zh_CN.iso2022-CN  zh.GBK       zh_CN.gbk
zh.GBK       zh_CN.gbk         Big-5        zh_TW-Big5
Big-5        zh_TW-Big5        zh.GBK       zh_CN.gbk
GB2312-80    zh_CN.euc         Big-5        zh_TW-Big5
Big-5        zh_TW-Big5        GB2312-80    zh_CN.euc
UTF-8        UTF-8             zh.GBK       zh_CN.gbk
zh.GBK       zh_CN.gbk         UTF-8        UTF-8
UTF-8        UTF-8             ISO 2022-CN  zh_CN.iso2022-CN
ISO 2022-CN  zh_CN.iso2022-CN  UTF-8        UTF-8

Traditional Chinese Localization

Traditional Chinese in the Solaris 8 product provides three locales: zh_TW, zh_TW.UTF-8, and zh_TW.BIG5. In the zh_TW locale, the EUC scheme is used to encode the CNS 11643-1992 codeset. The zh_TW.BIG5 locale supports the Big-5 codeset. The zh_TW.UTF-8 locale supports Unicode 3.0.

Traditional Chinese is used mostly in Taiwan and Hong Kong, and supports the following input methods:

Table 3-23 Traditional Chinese TrueType Fonts for the zh_TW Locales

Full Family Name  Subfamily  Format    Vendor  Encoding
Hei               R          TrueType  Hanyi   CNS11643.1992
Kai               R          TrueType  Hanyi   CNS11643.1992
Ming              R          TrueType  Hanyi   CNS11643.1992

The following table shows the Traditional Chinese BitMap Fonts for the zh_TW Locales.

Table 3-24 Traditional Chinese BitMap Fonts for the zh_TW Locales

Full Family Name  Subfamily  Format                Encoding
Ming              R          PCF (12,14,16,20,24)  CNS11643.1992

The following table shows the Traditional Chinese TrueType Fonts for the zh_TW.BIG5 Locales.

Table 3-25 Traditional Chinese TrueType Fonts for the zh_TW.BIG5 Locales

Full Family Name  Subfamily  Format    Vendor  Encoding
Hei               R          TrueType  Hanyi   Big5
Kai               R          TrueType  Hanyi   Big5
Ming              R          TrueType  Hanyi   Big5

The following table shows the Traditional Chinese BitMap Fonts for the zh_TW.BIG5 Locales.

Table 3-26 Traditional Chinese BitMap Fonts for the zh_TW.BIG5 Locales

Full Family Name  Subfamily  Format                Encoding
Ming              R          PCF (12,14,16,20,24)  Big5

The following table shows the supported codeset conversions for Traditional Chinese.

Table 3-27 Codeset Conversions for Traditional Chinese

Source Code      Source Symbol         Target Code      Target Symbol
CNS 11643        zh_TW-euc             Big-5            zh_TW-Big5
CNS 11643        zh_TW-euc             ISO 2022-7       zh_TW-iso2022-7
Big-5            zh_TW-Big5            CNS 11643        zh_TW-euc
Big-5            zh_TW-Big5            ISO 2022-7       zh_TW-iso2022-7
ISO 2022-7       zh_TW-iso2022-7       CNS 11643        zh_TW-euc
ISO 2022-7       zh_TW-iso2022-7       Big-5            zh_TW-Big5
CNS 11643        zh_TW-euc             ISO 2022-CN-EXT  zh_TW-iso2022-CN-EXT
ISO 2022-CN-EXT  zh_TW-iso2022-CN-EXT  CNS 11643        zh_TW-euc
Big-5            zh_TW-Big5            ISO 2022-CN      zh_TW-iso2022-CN
ISO 2022-CN      zh_TW-iso2022-CN      Big-5            zh_TW-Big5
UTF-8            UTF-8                 CNS 11643        zh_TW-euc
CNS 11643        zh_TW-euc             UTF-8            UTF-8
UTF-8            UTF-8                 Big-5            zh_TW-Big5
Big-5            zh_TW-Big5            UTF-8            UTF-8
UTF-8            UTF-8                 ISO 2022-7       zh_TW-iso2022-7
ISO 2022-7       zh_TW-iso2022-7       UTF-8            UTF-8
ISO 2022-CN-EXT  zh_TW-iso2022-CN-EXT  Big-5            zh_TW-Big5
Big-5            zh_TW-Big5            ISO 2022-CN-EXT  zh_TW-iso2022-CN-EXT

Japanese Localization

This section describes Japanese locale-specific information.

Japanese Locales

Three Japanese locales, which support different character encodings, are available in the Solaris 8 environment. The ja (or ja_JP.eucJP) locale is based on Japanese EUC. The ja_JP.PCK locale is based on PC-Kanji code (known as Shift-JIS), and the ja_JP.UTF-8 locale is based on UTF-8.

See eucJP(5) for a map between Japanese EUC and the character set. See PCK(5) for the map between PCK and the character set.

Japanese Character Set

Supported Japanese character sets are:

JISX0212-1990 is not supported in the ja_JP.PCK locale.

Vendor Defined Characters (VDCs) and User Defined Characters (UDCs) are also supported. VDCs occupy unused (reserved) code points of JISX0208-1990 or JISX0212-1990. UDCs occupy unused (reserved) code points of the same character sets, excluding the code points that are used for VDCs.

Japanese Font

Three Japanese font formats are supported. They are: Bitmap, TrueType and Type1. The Japanese Type1 font includes only JIS X0212 for printing. Type1 font is also used by UDC.

Japanese Bitmap Fonts are shown below.

Table 3-28 Japanese Bitmap Fonts

Full Family Name 

Subfamily 

Format 

Vendor 

Encoding 

gothic

R, B 

PCF(12,14,16,20,24) 

 

JISX0208.1983, 

JISX0201.1976 

minchou

PCF(12,14,16,20,24) 

 

JISX0208.1983, 

JISX0201.1976 

hg gothic b

PCF(12,14,16,18,20,24) 

RICOH 

JISX0208.1983, JISX0201.1976 

hg mincho l

PCF(12,14,16,18,20,24) 

RICOH 

JISX0208.1983, JISX0201.1976 

heiseimin

PCF(12,14,16,18,20,24) 

RICOH 

JISX0212.1990 

Japanese TrueType Fonts are shown below.

Table 3-29 Japanese TrueType Fonts

Full Family Name 

Subfamily 

Format 

Vendor 

Encoding 

hg gothic b

TrueType 

RICOH 

JISX0208.1983, JISX0201.1976 

hg mincho l

TrueType 

RICOH 

JISX0208.1983, JISX0201.1976 

heiseimin

TrueType 

RICOH 

JISX0212.1990 

Japanese Input Systems

Four Japanese input systems, ATOK12, ATOK8, Wnn6, and cs00, are available in the Solaris 8 environment for all Japanese locales. You can switch input systems from the workspace menu. The only Japanese input system available in the base Solaris product is cs00.

How to Input Japanese Strings by using cs00

When Kana-Kanji conversion mode is turned on, keyboard input is grabbed by Htt (the X Input Method Server) and sent to the cs00 daemon through the XCI (xci(7)) interface. The cs00 daemon converts the received strings to Japanese strings by using a dictionary and returns the result to the program that currently has the keyboard focus. See cs00(1M) for more details.

CUI based dictionary maintenance utilities are available. See udicm(1) and mdicm(1) for details.


Note -

GUI based maintenance utilities, sdtudicm(1) or udicmtool(1), are not available in the base Solaris product.


The basic Japanese input procedure is as follows:

  1. Turning Japanese conversion mode on/off: Control + Space

  2. Enter Kana character text: ex: Type "nihon"

  3. Conversion to Kanji character text: Control + N

  4. Commit the Kanji character text: Control + K

The following table shows cs00 operation list.

Table 3-30 cs00 Operation List

Function 

Operation 

Conversion mode on/off 

Control + Space 

Control + @ 

Kana/Kanji conversion 

next Control + N 

post Control + P 

lookup Control + W 

Commit 

Control + K 

Move focus 

forward Control + F 

back Control + B 

Focus scope 

increase Control + I 

decrease Control + U 

Delete (1 character) 

Control + H 

Delete or backspace 

Delete (all characters) 

Control + ] and Control + U 

Full/half Katakana => Hiragana 

Control + ] and Control + O 

Hiragana/half Katakana => full Katakana 

Control + ] and Control + Y 

Full Katakana/Hiragana => half Katakana 

Control + ] and Control + Z 

Half Roma/Num => full Roma/Num 

Control + ] and Control + T 

Full Roma/Num => half Roma/Num 

Control + ] and Control + R 

Learning Mode on/off 

Control + ] and Control + L 

Input Mode Switch: 

  • Hiragana mode

  • Full Katakana mode

  • Full Roma/Num mode

  • Half Katakana mode

  • Half Roma/Num mode

  • Kuten code input mode

  • Bushu input mode

 

 

Control + O 

Control + Y 

Control + T 

Control + Z 

Control + R 

Control + Q 

Control + V 
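
The character-type conversions listed in Table 3-30 have simple counterparts in Unicode, which can help when checking what a given key sequence should produce. The Python sketch below is an illustration only and does not reflect how cs00 works internally; the function names are hypothetical. It relies on two properties of Unicode: the full-width forms of printable ASCII sit at a fixed offset of 0xFEE0 from their half-width forms, and Katakana letters sit 0x60 above the corresponding Hiragana letters.

```python
import unicodedata


def half_to_full_roman(text):
    """Half Roma/Num => full Roma/Num (printable ASCII U+0021..U+007E)."""
    return "".join(
        chr(ord(ch) + 0xFEE0) if "!" <= ch <= "~" else ch for ch in text
    )


def full_to_half_roman(text):
    """Full Roma/Num => half Roma/Num (U+FF01..U+FF5E back to ASCII)."""
    return "".join(
        chr(ord(ch) - 0xFEE0) if "\uff01" <= ch <= "\uff5e" else ch
        for ch in text
    )


def hiragana_to_katakana(text):
    """Hiragana => full Katakana (U+3041..U+3096 shifted up by 0x60)."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def half_to_full_katakana(text):
    """Half Katakana => full Katakana via NFKC normalization.

    NFKC also folds full-width Roman letters to ASCII, so apply it
    only to text that should be normalized as a whole.
    """
    return unicodedata.normalize("NFKC", text)


print(half_to_full_roman("Abc123"))
print(hiragana_to_katakana("にほん"))
```

The ASCII space (0x20) is deliberately left alone in `half_to_full_roman`, because its full-width counterpart (U+3000) does not follow the 0xFEE0 offset.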

Terminal Setting for Japanese Terminals

Using Japanese locales on a character-based terminal (TTY) requires that you adjust the terminal settings so that line editing works correctly.

Japanese iconv Module

Several Japanese codeset conversions are supported with iconv(1) and iconv(3). See the iconv_ja(5) man page for details.

The following table lists the supported iconv conversions.

Table 3-31 iconv Conversion Support

Source Code   Target Code
------------  -------------------
eucJP         JIS7
eucJP         SJIS
eucJP         UTF-8
eucJP         jis
eucJP         ibmj
SJIS          eucJP
SJIS          ISO-2022-JP
SJIS          UTF-8
SJIS          jis
SJIS          ibmj
PCK           eucJP
PCK           UTF-8
PCK           ISO-2022-JP
PCK           jis
PCK           ibmj
ISO-2022-JP   eucJP
ISO-2022-JP   PCK
ISO-2022-JP   SJIS
UTF-8         eucJP
UTF-8         SJIS
UTF-8         PCK
JIS7          eucJP
jis           eucJP
jis           PCK
jis           SJIS
ibmj          eucJP
ibmj          PCK
UTF-8         ISO-2022-JP
ISO-2022-JP   UTF-8
eucJP         UTF-8-Java
UTF-8-Java    eucJP
PCK           UTF-8-Java
UTF-8-Java    PCK
eucJP         ISO-2022-JP.RFC1468
PCK           ISO-2022-JP.RFC1468
UTF-8         ISO-2022-JP.RFC1468
eucJP         ibmj-EBCDIK
ibmj-EBCDIK   eucJP
PCK           ibmj-EBCDIK
ibmj-EBCDIK   PCK

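
To show what one row of Table 3-31 means in practice, the following Python sketch re-encodes a short Japanese string between several of the listed codesets and checks that converting back reproduces the original bytes. It uses Python's built-in codecs rather than the Solaris iconv modules. The pairing of table names with Python codec names in PYTHON_CODEC is an assumption made for illustration, and the `convert` helper is hypothetical.

```python
# Assumed mapping from codeset names in the table to Python codec names.
# This pairing is an illustration and is not defined by Solaris.
PYTHON_CODEC = {
    "eucJP": "euc_jp",
    "PCK": "shift_jis",
    "SJIS": "shift_jis",
    "ISO-2022-JP": "iso2022_jp",
    "UTF-8": "utf-8",
}


def convert(data, source, target):
    """Re-encode bytes from one codeset to another, like one table row."""
    text = data.decode(PYTHON_CODEC[source])
    return text.encode(PYTHON_CODEC[target])


sample = "日本語"
euc = sample.encode("euc_jp")

for target in ("PCK", "ISO-2022-JP", "UTF-8"):
    converted = convert(euc, "eucJP", target)
    # Converting back should reproduce the original eucJP bytes.
    assert convert(converted, target, "eucJP") == euc
    print(target, len(converted), "bytes:", converted.hex())
```

The byte counts differ by target because ISO-2022-JP wraps the text in escape sequences and UTF-8 uses three bytes for each of these characters, while eucJP and PCK use two.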

Japanese Specific Printer Support

The Japanese Solaris 8 product supports the following Japanese-specific printers:

User Defined Character Support

User-defined characters (UDC) are handled with sdtudctool, which supports both outline (Type1) and bitmap (PCF) fonts. Utilities are also available to migrate UDC fonts that were created in prior releases with older utilities such as fontedit, type3creator, and fontmanager.

Not Included on the Base Solaris Product

The following components are included in the multilingual Solaris product (on the Languages CD), but not in the base Solaris product.

Korean Localization

In December 1995, the Korean government announced a standard Korean codeset, KS C 5700, which is based on ISO 10646-1/Unicode 2.0.

The ISO-10646 character set uses either 2 bytes (UCS-2, the two-byte form of the Universal Character Set) or 4 bytes (UCS-4) to represent each character.

The ISO-10646 character set cannot be used directly on IBM-PC-based operating systems. For example, the kernel and many other modules of the Solaris operating environment interpret certain byte values, such as the null character (0x00) in a string, as control instructions. An ISO-10646 character, however, can be encoded with any bit combination in its first or subsequent bytes. Because of these limitations, ISO-10646 characters cannot be transmitted freely through the Solaris system. To establish a migration path, ISO-10646 defines the UCS Transformation Format (UTF), which recodes the ISO-10646 characters without using C0 controls (0x00..0x1F), C1 controls (0x80..0x9F), space (0x20), and DEL (0x7F).
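
The problem with embedded control byte values can be seen by encoding a few characters both ways. The Python sketch below uses the `utf-16-be` codec as a stand-in for UCS-2, since the two agree for characters in the Basic Multilingual Plane. It uses UTF-8 as the transformation format because that is the format the ko.UTF-8 locale is named for; this is an assumption, as the paragraph above does not name a specific UTF. The sketch checks only one property: no byte of a multi-byte UTF-8 sequence falls in the ASCII range 0x00..0x7F. The helper name `has_null_byte` is hypothetical.

```python
def has_null_byte(data):
    """Return True if the bytes contain 0x00, the C string terminator."""
    return 0 in data


for ch in ("A", "한"):
    ucs2 = ch.encode("utf-16-be")  # stand-in for UCS-2 (BMP characters only)
    utf8 = ch.encode("utf-8")
    print(
        "U+%04X" % ord(ch),
        "UCS-2:", ucs2.hex(), "null byte:", has_null_byte(ucs2),
        "UTF-8:", utf8.hex(), "null byte:", has_null_byte(utf8),
    )
```

In the two-byte form, "A" carries a 0x00 byte, and the Korean syllable U+D55C carries 0x5C, the same value as the ASCII backslash. In UTF-8, "A" stays a single ASCII byte and every byte of the Korean syllable is 0x80 or above.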

The ko.UTF-8 locale is a Solaris locale that supports KSC-5700, the Korean standard codeset. It supports all characters in the previous KSC 5601 and all 11,172 Korean characters. Korean UTF-8 supports the Korean language-related ISO-10646 characters and fonts. Because ISO-10646 covers all characters in the world, all of the various input methods and fonts are supplied so that you can input and output any character in any language. Until Universal UTF/UCS becomes available, Korean UTF-8 supports the ISO-10646 code subset that is related to Korean characters, as well as all other characters in the previous Korean standard codeset and Extended ASCII.
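
The figure of 11,172 Korean characters follows from how modern Hangul syllables are built: each syllable combines one of 19 leading consonants, one of 21 vowels, and one of 28 trailing choices (27 final consonants or none). The Python sketch below works through that arithmetic against the precomposed Hangul syllable range of Unicode 2.0 and later (U+AC00 through U+D7A3). The constant and function names are hypothetical, chosen only for this illustration.

```python
# Counts of modern Hangul letters (jamo) used to build syllable blocks.
LEADING_CONSONANTS = 19
VOWELS = 21
TRAILING_CONSONANTS = 28  # 27 final consonants plus "no final consonant"

HANGUL_BASE = 0xAC00  # first precomposed syllable in Unicode 2.0 and later
HANGUL_LAST = 0xD7A3  # last precomposed syllable


def compose(leading, vowel, trailing=0):
    """Return the precomposed syllable for zero-based jamo indexes."""
    index = (leading * VOWELS + vowel) * TRAILING_CONSONANTS + trailing
    return chr(HANGUL_BASE + index)


total = LEADING_CONSONANTS * VOWELS * TRAILING_CONSONANTS
print(total, HANGUL_LAST - HANGUL_BASE + 1)

# Indexes 18, 0, 4 select the jamo HIEUH, A, and NIEUN.
print("U+%04X" % ord(compose(18, 0, 4)))
```

Both counts printed on the first line come to 11,172, which shows that the code range is exactly as large as the number of possible combinations.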

In the ko locale, the EUC scheme is used to encode KSC 5601-1987. The ko.UTF-8 locale supports the KSC 5700-1995/Unicode 2.0 codeset, which is a superset of KSC 5601-1987. These two locales look the same to the end user, but the internal character encoding is different. The Korean Solaris product supports the following input methods:

For the ko locale:

For the ko.UTF-8 locale:

Table 3-32 Solaris 8 Korean CID/Type 1 Fonts for the ko Locale

Full Family Name  Subfamily  Format      Vendor   Encoding
----------------  ---------  ----------  -------  ------------
Gothic            R          CID/Type 1  Hanyang  Adobe-Korean
Graphic           R          CID/Type 1  Hanyang  Adobe-Korean
Haeso             R          CID/Type 1  Hanyang  Adobe-Korean
Kodig             R          CID/Type 1  Hanyang  Adobe-Korean
Myeongijo         R          CID/Type 1  Hanyang  Adobe-Korean
Pilki             R          CID/Type 1  Hanyang  Adobe-Korean
Roundgothic       R          CID/Type 1  Hanyang  Adobe-Korean

Table 3-33 Solaris 8 Korean Bitmap Fonts for the ko Locale

Full Family Name  Subfamily  Format                   Encoding
----------------  ---------  -----------------------  -------------
Gothic            R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987
Graphic           R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987
Haeso             R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987
Kodig             R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987
Myeongijo         R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987
Pilki             R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987
Roundgothic       R/B        PCF (12,14,16,18,20,24)  KSC 5601-1987

Table 3-34 Solaris 8 Korean CID/Type 1 Fonts for the ko.UTF-8 Locale

Full Family Name  Subfamily  Format      Vendor   Encoding
----------------  ---------  ----------  -------  ------------
Gothic            R          CID/Type 1  Hanyang  Adobe-Korean
Graphic           R          CID/Type 1  Hanyang  Adobe-Korean
Haeso             R          CID/Type 1  Hanyang  Adobe-Korean
Kodig             R          CID/Type 1  Hanyang  Adobe-Korean
Myeongijo         R          CID/Type 1  Hanyang  Adobe-Korean
Pilki             R          CID/Type 1  Hanyang  Adobe-Korean

Table 3-35 Solaris 8 Korean Bitmap Fonts for the ko.UTF-8 Locale

Full Family Name  Subfamily  Format                   Encoding
----------------  ---------  -----------------------  ---------------------
Gothic            R/B        PCF (12,14,16,18,20,24)  KSC 5601-1992 (Johap)
Graphic           R/B        PCF (12,14,16,18,20,24)  KSC 5601-1992 (Johap)
Haeso             R/B        PCF (12,14,16,18,20,24)  KSC 5601-1992 (Johap)
Kodig             R/B        PCF (12,14,16,18,20,24)  KSC 5601-1992 (Johap)
Myeongijo         R/B        PCF (12,14,16,18,20,24)  KSC 5601-1992 (Johap)
Pilki             R/B        PCF (12,14,16,18,20,24)  KSC 5601-1992 (Johap)

Table 3-36 Solaris 8 Korean TrueType Fonts for the ko/ko.UTF-8 Locales

Full Family Name  Subfamily  Format     Vendor   Encoding
----------------  ---------  ---------  -------  --------
Kodig/Gothic                 True Type  Hanyang  Unicode
Myeongjo                     True Type  Hanyang  Unicode
Haeso                        True Type  Hanyang  Unicode
RoundGothic                  True Type  Hanyang  Unicode


Table 3-37 Korean ICONV

Source Code              Source Symbol  Target Code              Target Symbol
-----------------------  -------------  -----------------------  ---------------
KSC 5601-1987            1506           UTF-8                    UTF-8
ISO 646                  646            KSC 5601-1987            5601
KSC 5601-1987            EUC-KR         UTF-8                    UTF-8
KSC 5601-1987            KSC5601        UTF-8                    UTF-8
UTF-8                    UTF-8          KSC 5601-1987            5601
UTF-8                    UTF-8          KSC 5601-1987            EUC-KR
UTF-8                    UTF-8          KSC 5601-1987            KSC 5601
UTF-8                    ko-KR-UTF-8    IBM CP 933               cp 933
UTF-8                    ko-KR-UTF-8    KSC 5601-1987            ko_KR-euc
UTF-8                    ko-KR-UTF-8    ISO 2022-KR              ko_KR-iso2022-7
UTF-8                    ko-KR-UTF-8    KSC 5601-1987 - Johap    ko_KR-johap
UTF-8                    ko-KR-UTF-8    KSC 5601-1992 - Johap    ko_KR-johap92
IBM CP933                cp933          UTF-8                    ko_KR-UTF-8
KSC 5601-1987            ko_KR-euc      UTF-8                    ko_KR-UTF-8
KSC 5601-1987            ko_KR-euc      ISO 2022-KR              ko_KR-iso2022-7
KSC 5601-1987            ko_KR-euc      KSC 5601-1987 - Johap    ko_KR-johap
KSC 5601-1987            ko_KR-euc      KSC 5601-1992 - Johap    ko_KR-johap92
KSC 5601-1987            ko_KR-euc      KSC 5601-1992 - Annex:4  ko_KR-nbyte
ISO 2022-KR              iso2022-7      UTF-8                    ko_KR-UTF-8
ISO 2022-KR              iso2022-7      KSC 5601-1987            ko_KR-euc
KSC 5601-1987 - Johap    ko-KR-johap    UTF-8                    ko_KR-UTF-8
KSC 5601-1987 - Johap    ko-KR-johap    KSC 5601-1987            ko_KR-euc
KSC 5601-1992 - Johap    ko-KR-johap92  UTF-8                    ko_KR-UTF-8
KSC 5601-1992 - Johap    ko-KR-johap92  KSC 5601-1987            ko_KR-euc
KSC 5601-1992 - Annex:4  ko-KR-nbyte    KSC 5601-1987            ko_KR-euc
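
The ko and ko.UTF-8 locales present the same text to the user while storing it in different byte sequences, and the conversions in Table 3-37 move text between those forms. The Python sketch below illustrates this with Python's built-in codecs rather than the Solaris iconv modules. The pairing of table symbols with Python codec names in KOREAN_CODEC is an assumption made for illustration (in particular, Python's `johab` codec is taken here to correspond to the 1992 Johap form), and the `convert_korean` helper is hypothetical.

```python
# Assumed pairing of the table's symbols with Python codec names.
# This pairing is an illustration and is not defined by Solaris.
KOREAN_CODEC = {
    "ko_KR-euc": "euc_kr",
    "ko_KR-UTF-8": "utf-8",
    "ko_KR-iso2022-7": "iso2022_kr",
    "ko_KR-johap92": "johab",
}


def convert_korean(data, source, target):
    """Re-encode bytes from one Korean codeset to another."""
    text = data.decode(KOREAN_CODEC[source])
    return text.encode(KOREAN_CODEC[target])


sample = "한국어"
euc_bytes = sample.encode("euc_kr")  # encoding used in the ko locale

for target in ("ko_KR-UTF-8", "ko_KR-iso2022-7", "ko_KR-johap92"):
    converted = convert_korean(euc_bytes, "ko_KR-euc", target)
    # Converting back should reproduce the original EUC bytes.
    assert convert_korean(converted, target, "ko_KR-euc") == euc_bytes
    print(target, len(converted), "bytes:", converted.hex())
```

The same three syllables occupy six bytes in the EUC form and nine bytes in UTF-8, which is the difference in internal encoding between the two locales.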