Korean Solaris User's Guide

Language Support

The Solaris Operating System builds inherent internationalization features into every localized product. Localization facilities support the ANSI C recommendations for internationalization and localization that define the locale and related categories.

Locale Attributes

A locale contains a language with culturally specific information and conventions for a particular global region. Each process in the Solaris Operating System has the following set of locale attributes:

Korean Locales

In December 1995, the Korean government announced a standard Korean code set, KS X 1005–1, which is based on ISO 10646-1/Unicode 2.0.

The ISO-10646 standard defines two forms of its universal character set (UCS): UCS-2, a two-octet form, and UCS-4, a four-octet form.

The ISO-10646 character set cannot be used directly on IBM PC-based operating systems. For example, the kernel and many other modules of the Korean Solaris Operating System interpret certain byte values, such as the null character (0x00), as control instructions wherever they occur in a string. Because ISO-10646 characters can be encoded with any bit combination in the first or subsequent bytes, they cannot be transmitted freely through the Solaris system.

In order to establish a migration path, the ISO-10646 character set defines the UCS Transformation Format (UTF), which encodes the ISO-10646 characters without using C0 controls (0x00..0x1F), C1 controls (0x80..0x9F), space (0x20), and DEL (0x7F).
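The byte-level property that lets an encoded character pass safely through byte-oriented code can be checked from the shell. The following is a minimal sketch for UTF-8 only. The helper names hangul_utf8, hex_bytes, and count_ascii_range_bytes are hypothetical, and the sample character U+D55C is an arbitrary choice. The sketch checks only that no byte of the multi-byte sequence falls in 0x00..0x7F, which covers the null character, the C0 controls, space, and DEL. UTF-8 continuation bytes occupy 0x80..0xBF, so the sketch makes no claim about the C1 range.

```shell
#!/bin/sh
# Emit the UTF-8 encoding of the Hangul syllable U+D55C as raw bytes.
# Octal escapes are used so the script does not depend on its own encoding.
hangul_utf8() {
    printf '\355\225\234'
}

# Read bytes on stdin and print each one as two hex digits, one per line.
hex_bytes() {
    od -An -v -tx1 | tr -s ' ' '\n' | sed '/^$/d'
}

# Count the bytes on stdin whose value is in 0x00..0x7F
# (first hex digit 0 through 7).
count_ascii_range_bytes() {
    hex_bytes | grep -c '^[0-7]' || true
}

hangul_utf8 | hex_bytes                  # prints ed, 95, 9c on separate lines
hangul_utf8 | count_ascii_range_bytes    # prints 0
```

A count of 0 means that code scanning for a null terminator or another ASCII control byte finds nothing to misinterpret inside the encoded character.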

The ko_KR.UTF-8 locale supports KS X 1005–1, the Korean standard code set. The locale supports the characters of the previous KS X 1001 code set, all 11,172 Korean characters, and the extended ASCII code set. Until the Universal UTF/UCS becomes available, the ko_KR.UTF-8 locale supports the subset of ISO-10646 that is related to Korean characters and fonts. The ISO-10646 standard covers all characters in the world. With the input methods and fonts provided in this release, you can enter, display, and print characters of any language.

In the ko_KR.EUC locale, the EUC scheme is used to encode KS X 1001. The ko_KR.UTF-8 locale supports the KS X 1005–1/Unicode 3.2 code set, which is a superset of KS X 1001. These two locales look the same to the end user, but the internal character encoding is different.
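The difference in internal encoding can be made visible by dumping the bytes of one character in each scheme. The following sketch is illustrative: hangul_euc, hangul_utf8, and to_hex are hypothetical helper names, and the sample character U+D55C is an arbitrary choice. The bytes are written with octal escapes so that the result does not depend on the encoding of the terminal or the script.

```shell
#!/bin/sh
# The same Hangul syllable (U+D55C) as raw bytes in each encoding.
hangul_euc()  { printf '\307\321'; }       # KS X 1001 in EUC: C7 D1
hangul_utf8() { printf '\355\225\234'; }   # UTF-8: ED 95 9C

# Show the bytes on stdin as space-separated hex digits.
to_hex() {
    # The unquoted expansion lets the shell collapse od's padding
    # into single spaces.
    echo $(od -An -v -tx1)
}

echo "EUC:   $(hangul_euc | to_hex)"    # EUC:   c7 d1
echo "UTF-8: $(hangul_utf8 | to_hex)"   # UTF-8: ed 95 9c
```

The character is stored in two bytes under EUC and in three bytes under UTF-8, which is why a file written in one locale must be converted before it is read in the other.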

The Korean Solaris Operating System provides simultaneous support for the locales in the following table.

Table 1–1 Korean Locales

Locale Name                 Description
ko_KR.EUC (ko)              Korean EUC (KS X)
ko_KR.UTF-8 (ko.UTF-8)      Korean UTF-8 (Unicode 3.2)

Korean Code Sets

The following table lists the supported code sets for each Korean locale.

Table 1–2 Korean Code Sets

Locale Name                 Code Set
ko_KR.EUC (ko)              KS X 1001
ko_KR.UTF-8 (ko.UTF-8)      KS X 1005–1/Unicode 3.2

Korean Input Methods and Fonts

The Korean Solaris Operating System provides input methods and fonts for all characters covered by the ISO-10646 standard. These methods and fonts enable you to enter, display, and print any character in any language.

The following features are supported by the Korean input methods that are available for the ko_KR.EUC (ko) and the ko_KR.UTF-8 (ko.UTF-8) locales:

For a complete list of scalable and bitmap fonts supported for the ko_KR.EUC (ko) and the ko_KR.UTF-8 (ko.UTF-8) locales, see Chapter 10, Fonts.

Note: You can use Hangul or standard Sun keyboards to enter Korean text.

Locale Categories

In the Korean Solaris Operating System, you can use the following general and specific categories as defined by ANSI C for the Korean and English locales:

For example, the Korean and the English/ASCII locales have the LC_TIME category that defines the display of the time and date according to the cultural format, as well as the actual Korean or English/ASCII characters used in the display.
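The effect of the LC_TIME category can be observed with the date command. The following sketch is illustrative: show_date_in is a hypothetical helper name. Only the C locale is assumed to be installed. The sketch tries a Korean locale name only if locale -a reports it, and the exact spelling of that name can vary between systems.

```shell
#!/bin/sh
# Print the full weekday and month names under the given LC_TIME setting.
# LC_ALL is cleared for the command because it would override LC_TIME.
show_date_in() {
    LC_ALL= LC_TIME="$1" date '+%A %B'
}

# English names from the C locale.
show_date_in C

# Korean names, but only if this locale name is installed on the system.
if locale -a 2>/dev/null | grep -Fqx 'ko_KR.UTF-8'; then
    show_date_in ko_KR.UTF-8
fi
```

Only the LC_TIME category changes here, so messages, collation, and character classification keep whatever settings the rest of the environment supplies.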

Locale Modifier

You can modify aspects of a locale-sensitive operation by using a locale modifier. The following example first displays the current locale settings and then the contents of a sample file, data_file:

system % locale

system % cat data_file
output of cat data_file

If you sort the data_file, the sort result is:

system % sort data_file
output of sort data_file

In this case, the text is sorted by the code point value of each character as defined in the current locale, ko (ko_KR.EUC). This ordering might not be the result you want, for example when you expect dictionary order.
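Code-point ordering can be reproduced on any system with the C locale, which compares byte values. The following sketch uses ASCII sample words rather than Korean text, so it shows the principle and not the actual contents of data_file. The helper name sort_by_code_point is hypothetical.

```shell
#!/bin/sh
# Sort the lines on stdin strictly by byte (code point) value.
sort_by_code_point() {
    LC_ALL=C sort
}

# Uppercase letters have lower code points than lowercase letters,
# so "Banana" sorts ahead of "apple".
printf 'apple\nBanana\ncherry\nApple\n' | sort_by_code_point
# prints: Apple, Banana, apple, cherry (one per line)
```

A dictionary would place apple before Banana. That gap between code-point order and the order a reader expects is the kind of mismatch a collation-aware locale is meant to close.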

The Solaris Operating System provides a locale modifier for this purpose. You can modify the behavior of sort by changing the current locale from ko to ko_KR.EUC@dict, as shown:

system % env LANG=ko_KR.EUC@dict sort data_file
output of sort data_file with locale modifier
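When a script must run on systems where the modified locale might not be installed, the per-command override can be wrapped in a check. The following is a sketch under stated assumptions: sort_with_collation is a hypothetical helper, not a Solaris command, and the fallback to the C locale is a choice made for this example. The sketch sets LC_COLLATE rather than LANG because, under the POSIX precedence rules, LC_ALL overrides LC_COLLATE, which in turn overrides LANG. A LANG setting therefore takes effect only when neither of the other two variables is set.

```shell
#!/bin/sh
# Sort stdin with the collation rules of the named locale if that locale
# is installed; otherwise fall back to code-point order in the C locale.
# Usage: sort_with_collation ko_KR.EUC@dict < data_file
sort_with_collation() {
    if locale -a 2>/dev/null </dev/null | grep -Fqx "$1"; then
        # Clear LC_ALL for this command so it cannot override LC_COLLATE.
        LC_ALL= LC_COLLATE="$1" sort
    else
        LC_ALL=C sort
    fi
}
```

Because the variables are set only for the sort command, the locale of the calling shell is left unchanged, as with the env form shown above.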