Korean Solaris User's Guide

Korean Locales

In December 1995, the Korean government announced a standard Korean code set, KS X 1005–1, which is based on ISO 10646-1/Unicode 2.0.

The ISO-10646 character set uses two universal character sets: UCS-2, a two-byte form, and UCS-4, a four-byte form.

The ISO-10646 character set cannot be used directly on IBM PC-based operating systems. For example, the kernel and many other modules of the Korean Solaris Operating System interpret certain byte values as control instructions, such as the null character (0x00) that terminates a string. ISO-10646 characters, however, can be encoded with any bit combination in the first or any subsequent byte, so these byte values can occur inside ordinary characters. Because of this limitation, ISO-10646 characters cannot be transmitted freely through the Solaris system.
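The problem described above can be illustrated with a short Python sketch. It assumes that Python's utf-16-be and utf-32-be codecs produce the same bytes as UCS-2 and UCS-4 for the two sample characters. The helper name has_null_byte is hypothetical and is not part of any Solaris interface.

```python
# Encode two characters in the fixed-width UCS forms and look for the
# 0x00 byte that byte-oriented code treats as a string terminator.
samples = ["A", "\uac00"]  # LATIN CAPITAL LETTER A, HANGUL SYLLABLE GA


def has_null_byte(text: str, codec: str) -> bool:
    """Return True if the encoded form of text contains a 0x00 byte."""
    return 0x00 in text.encode(codec)


for ch in samples:
    ucs2 = ch.encode("utf-16-be")  # assumed equal to UCS-2 for these characters
    ucs4 = ch.encode("utf-32-be")  # assumed equal to UCS-4
    print(f"U+{ord(ch):04X}  UCS-2: {ucs2.hex()}  UCS-4: {ucs4.hex()}")
    print("  contains 0x00:", has_null_byte(ch, "utf-16-be"))
```

Both sample characters contain a 0x00 byte in their fixed-width forms, so a routine that stops at the first null byte would truncate them.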

To establish a migration path, the ISO-10646 character set defines the UCS Transformation Format (UTF). UTF-8, the form used by the ko_KR.UTF-8 locale, encodes every non-ASCII character using only bytes of 0x80 and above, so C0 controls (0x00..0x1F), space (0x20), and DEL (0x7F) never appear inside a multibyte character.
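For UTF-8 in particular, the property that is easy to check is that bytes in the range 0x00..0x7F occur only as the ASCII characters themselves. The following Python sketch tests this on one sample string. The function name multibyte_bytes_are_high is hypothetical, and the sample text is chosen only for illustration.

```python
def multibyte_bytes_are_high(text: str) -> bool:
    """Check that every byte of every non-ASCII character is 0x80 or above."""
    for ch in text:
        encoded_char = ch.encode("utf-8")
        if len(encoded_char) > 1 and any(b < 0x80 for b in encoded_char):
            return False
    return True


# ASCII text, the word "Hangul" written in Korean, and a newline.
sample = "Solaris \ud55c\uae00\n"
encoded = sample.encode("utf-8")

# Bytes below 0x80 in the output come only from the ASCII characters.
ascii_bytes = bytes(b for b in encoded if b < 0x80)

print(multibyte_bytes_are_high(sample))
print(ascii_bytes)
```

Because no null byte or other ASCII control value can appear inside a Korean character, byte-oriented code can pass such strings through unchanged.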

The ko_KR.UTF-8 locale supports KS X 1005–1, the Korean standard code set. The locale supports the characters of the previous KS X 1001 code set, all 11,172 Korean characters, and the extended ASCII code set. Until the Universal UTF/UCS becomes available, the ko_KR.UTF-8 locale supports the subset of ISO-10646 codes that relates to Korean characters and fonts. The ISO-10646 standard covers all characters in the world. With the input methods and fonts provided in this release, you can enter, display, and print characters of any language.
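The figure of 11,172 Korean characters can be reproduced with a short Python sketch. It assumes that these characters are the precomposed Hangul syllables that Unicode places at U+AC00 through U+D7A3, formed from 19 initial consonants, 21 vowels, and 28 final positions (including no final consonant). The constant names are chosen for illustration.

```python
HANGUL_FIRST = 0xAC00  # HANGUL SYLLABLE GA
HANGUL_LAST = 0xD7A3   # HANGUL SYLLABLE HIH

# 28 final positions: 27 final consonants plus "no final consonant".
LEADS, VOWELS, TAILS = 19, 21, 28

syllables = [chr(cp) for cp in range(HANGUL_FIRST, HANGUL_LAST + 1)]

# Collect the distinct UTF-8 lengths of all syllables.
utf8_lengths = {len(s.encode("utf-8")) for s in syllables}

print(len(syllables), LEADS * VOWELS * TAILS)
print(utf8_lengths)
```

Every syllable in this range occupies three bytes in UTF-8.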

In the ko_KR.EUC locale, the EUC scheme is used to encode KS X 1001. The ko_KR.UTF-8 locale supports the KS X 1005–1/Unicode 3.2 code set, which is a superset of KS X 1001. These two locales look the same to the end user, but the internal character encoding is different.
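The difference in internal encoding can be seen by encoding the same text both ways. This Python sketch assumes that Python's euc_kr codec is a reasonable stand-in for the encoding used by the ko_KR.EUC locale. It does not call any Solaris interface.

```python
text = "\ud55c\uae00"  # the word "Hangul" written in Korean

euc_bytes = text.encode("euc_kr")   # stand-in for the ko_KR.EUC encoding
utf8_bytes = text.encode("utf-8")   # encoding used by ko_KR.UTF-8

print("EUC-KR:", euc_bytes.hex(" "), f"({len(euc_bytes)} bytes)")
print("UTF-8: ", utf8_bytes.hex(" "), f"({len(utf8_bytes)} bytes)")

# Decoding with the matching codec recovers the same text in both cases.
round_trip_ok = (
    euc_bytes.decode("euc_kr") == text and utf8_bytes.decode("utf-8") == text
)
print(round_trip_ok)
```

The same two characters take four bytes in EUC and six bytes in UTF-8, yet decode to identical text, which is why the locales look the same to the end user.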

The Korean Solaris Operating System provides simultaneous support for the locales in the following table.

Table 1–1 Korean Locales

Locale                    Description

ko_KR.EUC (ko)            Korean EUC (KS X)
ko_KR.UTF-8 (ko.UTF-8)    Korean UTF-8 (Unicode 3.2)
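The table can be expressed as a small lookup structure. The following Python sketch assumes that the names in parentheses are short aliases for the full locale names. The codec column holds Python codec names chosen for illustration, not Solaris identifiers, and the functions resolve_locale and codec_for are hypothetical.

```python
# Hypothetical lookup table built from Table 1-1.
LOCALES = {
    "ko_KR.EUC": {"alias": "ko", "description": "Korean EUC", "codec": "euc_kr"},
    "ko_KR.UTF-8": {"alias": "ko.UTF-8", "description": "Korean UTF-8", "codec": "utf-8"},
}


def resolve_locale(name: str) -> str:
    """Return the full locale name for a full name or its short alias."""
    for full_name, info in LOCALES.items():
        if name in (full_name, info["alias"]):
            return full_name
    raise KeyError(f"unknown Korean locale: {name}")


def codec_for(name: str) -> str:
    """Return the illustrative Python codec name for a locale or alias."""
    return LOCALES[resolve_locale(name)]["codec"]


for name in ("ko", "ko.UTF-8"):
    print(name, "->", resolve_locale(name), "->", codec_for(name))
```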