Solaris Internationalization Guide For Developers

Chapter 2 Contents of the Base Solaris Product

Summary of the Base Product

Solaris 7 includes partial locales, which provide the functionality needed for entering, displaying, and printing text in local languages while using an English interface. It also includes the en_US.UTF-8 locale, which uses an English interface as well and supports the Unicode UTF-8 character encoding standard.

The base English Solaris 7 product includes the Euro full locales, a number of partial European locales, and the en_US.UTF-8 locale.

The File System Safe Universal Transformation Format, or UTF-8, is an encoding defined by X/Open as a multi-byte representation of Unicode. UTF-8 is an encoding form of Unicode rather than a separate character set. UTF-8 provides input and output support for the characters of all Solaris single-byte locales.
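
The multi-byte nature of UTF-8 can be illustrated with a short Python sketch. It is not part of the Solaris product; the helper name utf8_bytes is hypothetical and the sample characters are chosen only for illustration. It shows that an ASCII character occupies one byte while other characters occupy two or three.

```python
def utf8_bytes(char: str) -> bytes:
    """Return the UTF-8 byte sequence for a single character."""
    return char.encode("utf-8")


# Sample characters (illustrative): Latin letter, accented letter,
# Euro sign, and a CJK ideograph.
samples = ["A", "\u00e9", "\u20ac", "\u65e5"]

for ch in samples:
    data = utf8_bytes(ch)
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    print(f"U+{ord(ch):04X} -> {hex_bytes} ({len(data)} byte(s))")
```

Because ASCII characters keep their single-byte values, existing 7-bit text is already valid UTF-8.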

Partial locales can be split into two groups: the core set and the extended set. The core set is packaged in SUNWploc (operating system locale) and SUNWplow (window system locale). Because these packages are part of the end user cluster, they are installed automatically. The extended set of locales is packaged in SUNWploc1 (operating system locale) and SUNWplow1 (window system locale). SUNWpldte provides CDE support for the Eastern European locales.

SUNWploc1 and SUNWplow1 are available in the entire cluster only. Both packages must be added to your system before you can use the locales in the extended set.

Core Set of Locales

The core set of locales is installed automatically. The core set is listed in Table 2-1.

Table 2-1 Core Set of Locales in SUNWploc and SUNWplow

Locale                  Language  Country        Encoding
de                      German    Germany        ISO-8859-1
en_AU                   English   Australia      ISO-8859-1
en_CA                   English   Canada         ISO-8859-1
en_UK changed to en_GB  English   Great Britain  ISO-8859-1
en_US                   English   United States  ISO-8859-1
en_US.UTF-8             English   United States  UTF-8
es                      Spanish   Spain          ISO-8859-1
es_AR                   Spanish   Argentina      ISO-8859-1
es_BO                   Spanish   Bolivia        ISO-8859-1
es_CL                   Spanish   Chile          ISO-8859-1
es_CO                   Spanish   Colombia       ISO-8859-1
es_CR                   Spanish   Costa Rica     ISO-8859-1
es_EC                   Spanish   Ecuador        ISO-8859-1
es_GT                   Spanish   Guatemala      ISO-8859-1
es_MX                   Spanish   Mexico         ISO-8859-1
es_NI                   Spanish   Nicaragua      ISO-8859-1
es_PA                   Spanish   Panama         ISO-8859-1
es_PE                   Spanish   Peru           ISO-8859-1
es_PY                   Spanish   Paraguay       ISO-8859-1
es_SV                   Spanish   El Salvador    ISO-8859-1
es_UY                   Spanish   Uruguay        ISO-8859-1
es_VE                   Spanish   Venezuela      ISO-8859-1
fr                      French    France         ISO-8859-1
it                      Italian   Italy          ISO-8859-1
sv                      Swedish   Sweden         ISO-8859-1
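
The locale names in Table 2-1 follow the common language[_territory][.codeset] pattern. As a rough sketch of how a program might split such a name into its parts, the Python below uses a hypothetical helper, parse_locale_name, which is not a Solaris or libc interface.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]


def parse_locale_name(name: str) -> LocaleName:
    """Split a name such as 'en_US.UTF-8' into language, territory, codeset."""
    # Everything after the first '.' is the codeset, if present.
    base, _, codeset = name.partition(".")
    # Everything after the first '_' in the remainder is the territory.
    language, _, territory = base.partition("_")
    return LocaleName(language, territory or None, codeset or None)


for name in ("de", "es_AR", "en_US.UTF-8"):
    print(name, "->", parse_locale_name(name))
```

Names without a territory or codeset, such as de, simply leave those fields empty.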

New Locales

Solaris software already supports most of the Western European locales. This release focuses on expanding support for the Eastern European, Thai, and Middle Eastern regions. New and changed user locales in the Solaris 7 operating environment are listed in Table 2-2.

Table 2-2 New or Changed User Locales

Region            Locale Name             ISO Codeset   Comments
Albania           sq_AL                   8859-2
Bosnia            nr                      8859-2
Bulgaria          bg_BG                   8859-5
Croatia           hr_HR                   8859-2
Finland           su changed to fi        8859-15       Changed to comply with ISO standards
France            fr                      UTF-8
Germany           de                      UTF-8
Macedonia         mk_MK                   8859-5
Israel            he                      8859-8
Italy             it                      UTF-8
Norway (nynorsk)  no_NY                   8859-1
P.R. China        zh.GBK                  GBK           GBK is a superset of GB2312
Romania           ro_RO                   8859-2
Russia            ru                      KOI-8         The default codeset has been changed to KOI-8 from ISO 8859-5
Saudi Arabia      ar                      8859-6
Serbia            sr_SP                   8859-5
Slovakia          sk_SK                   8859-2
Slovenia          sl_SI                   8859-2
Spain             es                      UTF-8
Sweden            sv                      UTF-8
Thailand          th_TH                   TIS 620-2533  Thai character codeset has been registered to ISO 8859-11
Great Britain     en_UK changed to en_GB  8859-15       Changed to comply with ISO standards
United States     en_US                   UTF-8
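
A change of default codeset, such as the move from ISO 8859-5 to KOI-8 for the ru locale, changes the bytes that represent the same text. The Python sketch below illustrates this under two assumptions: that Python's koi8_r codec is a reasonable stand-in for the KOI-8 codeset named in the table, and that the sample word is arbitrary. The helper name encode_in is hypothetical.

```python
def encode_in(text: str, codeset: str) -> bytes:
    """Encode text in the named codeset (Python codec name)."""
    return text.encode(codeset)


# Arbitrary three-letter Cyrillic sample word.
word = "\u043c\u0438\u0440"

koi8_bytes = encode_in(word, "koi8_r")    # assumed stand-in for KOI-8
iso_bytes = encode_in(word, "iso8859_5")  # ISO 8859-5

print("KOI8-R    :", koi8_bytes.hex())
print("ISO 8859-5:", iso_bytes.hex())

# Reading ISO 8859-5 data as if it were KOI8-R yields different text.
print("Misread   :", iso_bytes.decode("koi8_r") == word)
```

Data files written under the old default therefore need conversion before they display correctly under the new one.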

 

Solaris 7 software has added support for the Euro currency by adding six new user locales. These are included in Table 2-3. Note that local currency symbols are still available for backward compatibility.

Table 2-3 New User Locales To Support the Euro Currency

Region            Locale Name       ISO Codeset
Austria           de_AT             8859-15
Belgium (French)  fr_BE             8859-15
Belgium (Dutch)   nl_BE             8859-15
Denmark           da                8859-15
England           en_EU             8859-15
Finland           su changed to fi  8859-15
France            fr                8859-15
Germany           de                8859-15
Ireland           en_IE             8859-15
Italy             it                8859-15
Netherlands       nl                8859-15
Portugal          pt                8859-15
Spain             es                8859-15
Sweden            sv                8859-15
Great Britain     en_GB             8859-15
Europe            en_EU             8859-15
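
The Euro locales use ISO 8859-15 because that codeset contains the Euro sign, which ISO 8859-1 does not. A minimal Python sketch of the difference follows. The codec names iso8859-1 and iso8859-15 are Python's, not Solaris identifiers, and the helper name can_encode is hypothetical.

```python
def can_encode(char: str, codeset: str) -> bool:
    """Report whether char has a representation in the named codeset."""
    try:
        char.encode(codeset)
        return True
    except UnicodeEncodeError:
        return False


EURO = "\u20ac"  # Euro sign

for codeset in ("iso8859-1", "iso8859-15"):
    print(codeset, "can encode the Euro sign:", can_encode(EURO, codeset))

# In ISO 8859-15 the Euro sign occupies a single byte.
print("ISO 8859-15 byte:", EURO.encode("iso8859-15").hex())
```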

Extended Set of Locales

The extended set of locales is not installed automatically. To use the locales listed in Table 2-4, you need to install them manually.

Table 2-4 Extended Set of Locales in SUNWploc1 and SUNWplow1

Locale  Language    Country         Encoding
cz      Czech       Czechoslovakia  ISO-8859-2
da      Danish      Denmark         ISO-8859-15
de_AT   German      Austria         ISO-8859-15
de_CH   German      Switzerland     ISO-8859-1
el      Greek       Greece          ISO-8859-7
en_IE   English     Ireland         ISO-8859-1
en_NZ   English     New Zealand     ISO-8859-1
et      Estonian    Estonia         ISO-8859-15
fr_BE   French      Belgium         ISO-8859-1
fr_CA   French      Canada          ISO-8859-1
fr_CH   French      Switzerland     ISO-8859-1
hu      Hungarian   Hungary         ISO-8859-2
lt      Lithuanian  Lithuania       ISO-8859-13
lv      Latvian     Latvia          ISO-8859-13
nl      Dutch       Netherlands     ISO-8859-1
nl_BE   Dutch       Belgium         ISO-8859-1
no      Norwegian   Norway          ISO-8859-1
pl      Polish      Poland          ISO-8859-2
pt      Portuguese  Portugal        ISO-8859-1
pt_BR   Portuguese  Brazil          ISO-8859-1
ru      Russian     Russia          ISO-8859-5
su      Finnish     Finland         ISO-8859-1
tr      Turkish     Turkey          ISO-8859-9

Unicode Locale: en_US.UTF-8

The en_US.UTF-8 locale is a multiscript locale that can input and output text in multiple scripts, including single-byte and multi-byte scripts. It is part of the developer cluster and is the first locale with this capability in the Solaris operating environment.

This locale uses UTF-8 (Universal Character Set Transformation Format for 8 bits) encoding, which was developed by the X/Open-Uniforum Joint Internationalization Working Group (XoJIG). This standard has been adopted by the Unicode Consortium, the International Organization for Standardization, and the International Electrotechnical Commission as a part of Unicode 2.0 and ISO/IEC 10646-1.

en_US.UTF-8 supports computation for every code point value defined in Unicode 2.0 and ISO/IEC 10646-1. In Solaris 7, language script support is not limited to pan-European locales but also includes Asian scripts such as Korean, Traditional Chinese, Simplified Chinese, and Japanese. Input method support has been enabled for the following language scripts only. Due to limited font resources, Solaris 7 software includes only character glyphs from the following codesets:

User Locales in the Base Solaris Product

The base Solaris 7 product includes the locale support listed in Table 2-5.

Table 2-5 User Locales Included in Solaris 7 Product

Country    Locale-Name                    ISO codeset
Austria    de_AT (German Partial Locale)  8859-1
Estonia    et                             8859-1
Czech      cz                             8859-2
Hungary    hu                             8859-2
Poland     pl                             8859-2
Latvia     lv                             8859-4
Lithuania  lt                             8859-13
Russia     ru                             8859-5
Greece     el.sun_eu_greek                8859-7 (modified)
Turkey     tr                             8859-9

These locales are supported through the SUNWploc1 (operating system support), SUNWplow1 (OpenWindows support), and SUNWpldte (CDE support) packages, which are part of the entire cluster. The font packages for these locales are named in the format SUNWiXxf.

SUNWi1rf contains the required font and SUNWi1of contains the optional font for an ISO 8859-1 codeset locale. These packages are in different clusters; install the entire cluster or selectively add the appropriate packages. After the packages have been installed, users can log in through dtlogin to either CDE or OpenWindows and use the characters associated with their locale.

Multiple Key Compose Sequences for Locales

The Solaris 7 operating environment supports "compose sequences" to create the diacritical marks used in writing the scripts covered in the following codesets:

These are the diacritic characters that can be created with the following keys and the Compose key.

Keyboard Support in the Base Solaris 7 Product

The following locales have keyboard layouts for SPARC (X server) and x86 (X server plus console):

[The X server covers CDE and OpenWindows (OW); the console is the command line.]

Changing Between Keyboards on SPARC

In the Solaris product, the only way to change keyboard layouts is through the dip switches under the keyboard, which determine the layout. The keyboard layouts and their corresponding dip-switch settings are listed in /usr/openwin/share/etc/keytables/keytable.map.

Table 2-6 lists the settings for a type 4 keyboard (1 = switch up, 0 = switch down).

Table 2-6 Layouts for Type 4 Keyboards

Dip Switch in Hex  Keyboard       Setting in Binary
51                 Hungary5.kt    110011
52                 Poland5.kt     110100
53                 Czech5.k       110101
54                 Russia5.kt     110110
55                 Latvia5.k      110111
56                 Turkey5.kt     111000
57                 Greece5.kt     111001
58                 Lithuania5.kt  111011

To change the layout from U.S./UK to Czech, set the dip switches to the setting defined in the file and then reboot. The file defines the switch settings in hex, so each value must be converted into binary, as shown in Table 2-6.
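
The conversion from a keytable.map value to a switch pattern can be sketched in Python. The function name dip_switch_bits and its parameters are hypothetical. Note that the binary settings in Table 2-6 equal the listed values read as base-10 numbers (51 is 110011) rather than base-16, so the sketch takes the base as a parameter and defaults to 10. It also produces 111010 for 58, where the table lists 111011, so check the value against keytable.map on the system.

```python
def dip_switch_bits(value: str, base: int = 10, width: int = 6) -> str:
    """Convert a layout value to a string of switch positions.

    In the result, '1' means switch up and '0' means switch down.
    The default base of 10 is an assumption that matches Table 2-6.
    """
    number = int(value, base)
    # Pad on the left so every switch position is shown.
    return format(number, "b").zfill(width)


for value in ("51", "52", "53"):
    print(value, "->", dip_switch_bits(value))

# The same digits read as hexadecimal give a different, longer pattern.
print("51 as hex ->", dip_switch_bits("51", base=16))
```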

Russian and Greek keyboard support can be toggled on and off using the SPARC Compose key (Ctrl+Shift+F1 on x86).

Changing Between Keyboards on x86

On x86, a keyboard is selected during the kdmconfig part of installation. To change the selection at any time after installation, use kdmconfig:

  1. Exit CDE/OW to the command line.

  2. Type kdmconfig -u (kdmconfig unconfigure).

  3. Type kdmconfig to run the program.

  4. Follow the instructions to select a keyboard layout.

No utilities for switching keyboards are bundled into Solaris 7 for either SPARC or x86, apart from standard UNIX tools such as xmodmap and pcmapkeys.

Codesets for x86

The default codeset on the Solaris system for x86 is ISO-8859-1. The IBM DOS 437 codeset is provided as an option in text mode. If you choose to load the IBM DOS 437 codeset by typing:


loadfont -c 437
pcmapkeys -f /usr/share/lib/keyboards/437/en_US

there is no support for non-U.S. date, time, currency, number, unit, and collation conventions. There is also no support for non-English message and text presentation, and no multibyte character support. Therefore, non-Microsoft Windows users should use the IBM DOS 437 codeset only in the default C locale.
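
One reason the two codesets are not interchangeable is that they assign different byte values to the same character above the 7-bit range. The Python sketch below illustrates this using Python's cp437 and iso8859-1 codecs, which are assumed here to correspond to the IBM DOS 437 and ISO-8859-1 codesets. The helper name byte_in_codeset is hypothetical.

```python
def byte_in_codeset(char: str, codeset: str) -> int:
    """Return the single byte value that encodes char in the codeset."""
    encoded = char.encode(codeset)
    if len(encoded) != 1:
        raise ValueError(f"{char!r} is not a single byte in {codeset}")
    return encoded[0]


E_ACUTE = "\u00e9"  # Latin small letter e with acute accent

for codeset in ("iso8859-1", "cp437"):
    print(codeset, hex(byte_in_codeset(E_ACUTE, codeset)))

# A byte written under one codeset reads back as a different character
# under the other.
iso_byte = bytes([byte_in_codeset(E_ACUTE, "iso8859-1")])
print("Read back as cp437 matches:", iso_byte.decode("cp437") == E_ACUTE)
```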

Locales in the Base Installation

The installation window in the base Solaris 7 product offers several English language locales. To use 8-bit characters, install one of the en_XX options, as shown in Table 2-7. The locale used in the installation becomes the default system locale.

Table 2-7 Locales Offered at Installation

Locale Name  Language/Territory  Codeset
C            U.S. English        7-bit
en_AU        Australian English  8-bit
en_CA        Canadian English    8-bit
en_UK        UK English          8-bit
en_US        U.S. English        8-bit

Using JumpStart

To enable JumpStart(TM) for the 8-bit locales, add the line locale xx (substituting the appropriate 8-bit locale for xx, for example, en_US) to the JumpStart profile file. (For complete instructions, see Chapter 4 of Automating Solaris Installation, available from SunSoft Press.) Current JumpStart users should set the default locale to bypass the language prompt during installation.