In December 1995, the Korean government announced a standard Korean codeset, KSC-5700, which is based on ISO-10646-1/Unicode 2.0. The standard codeset replaces KSC 5601, which was based on ISO-2022.
The ISO-10646 character set uses either 2 bytes (UCS-2, the Universal Character Set two-byte form) or 4 bytes (UCS-4, the four-byte form) to represent each character.
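As a small illustration of the two fixed-width forms, the sketch below encodes one Hangul syllable both ways. It assumes Python's `utf-16-be` and `utf-32-be` codecs as stand-ins for UCS-2 and UCS-4; the byte sequences are identical for characters in the Basic Multilingual Plane, which includes U+AC00.

```python
# Encode one Hangul syllable in the two fixed-width UCS forms.
# Assumption: "utf-16-be" / "utf-32-be" stand in for UCS-2 / UCS-4,
# which is byte-for-byte accurate for Basic Multilingual Plane characters.
ch = "\uac00"  # HANGUL SYLLABLE GA

ucs2 = ch.encode("utf-16-be")  # two bytes per character
ucs4 = ch.encode("utf-32-be")  # four bytes per character

print(len(ucs2), ucs2.hex())  # 2 ac00
print(len(ucs4), ucs4.hex())  # 4 0000ac00
```

Note that both forms contain a 0x00 byte even though the character is not a null character.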
The ISO-10646 character set cannot be used directly on IBM-PC-based operating systems. For example, the kernel and many other modules of the Solaris operating environment interpret certain byte values in any string as control instructions, such as the null character (0x00). A UCS-2 or UCS-4 code value, however, can contain any bit combination in its first or subsequent bytes, so ISO-10646 characters in these forms cannot be transmitted freely through the Solaris system. To establish a migration path, ISO-10646 defines the UCS Transformation Format (UTF), which recodes ISO-10646 characters so that these special byte values do not occur inside a multibyte character. In UTF-8, the form used by the ko.UTF-8 locale, ASCII characters keep their single-byte values, and every byte of a multibyte character lies in the range 0x80..0xFF, so C0 controls (0x00..0x1F), space (0x20), and DEL (0x7F) never appear as part of another character.
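The null-byte problem can be shown in a few lines. The sketch below is illustrative only: `c_strlen` is a hypothetical helper that mimics how NUL-terminated string code measures a buffer, and `utf-16-be` is assumed as a stand-in for UCS-2.

```python
def c_strlen(buf: bytes) -> int:
    """Hypothetical helper: length as seen by NUL-terminated string code,
    that is, the number of bytes before the first 0x00."""
    idx = buf.find(b"\x00")
    return len(buf) if idx < 0 else idx

text = "A\uac00"  # LATIN CAPITAL LETTER A + HANGUL SYLLABLE GA

ucs2 = text.encode("utf-16-be")  # bytes: 00 41 ac 00
utf8 = text.encode("utf-8")      # bytes: 41 ea b0 80

# The UCS-2 buffer starts with 0x00, so C-style code sees an empty string.
print(c_strlen(ucs2), len(ucs2))  # 0 4
# The UTF-8 buffer contains no 0x00, so the whole string survives.
print(c_strlen(utf8), len(utf8))  # 4 4

# Every byte of the multibyte UTF-8 sequence has its high bit set,
# so it cannot be mistaken for an ASCII control, space, or DEL.
hangul_bytes = "\uac00".encode("utf-8")
print(all(b >= 0x80 for b in hangul_bytes))  # True
```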
ko.UTF-8 is the Solaris locale that supports KSC-5700, the Korean standard codeset. It supports all characters in the previous KSC 5601 standard and all 11,172 Korean (Hangul) characters. Korean UTF-8 supports the Korean language-related ISO-10646 characters and fonts. Because ISO-10646 covers characters from all of the world's languages, full support requires input methods and fonts for every language. Until such universal UTF/UCS support becomes available, Korean UTF-8 supports the ISO-10646 code subset that is related to Korean characters, all other characters in the previous Korean standard codeset, and Extended ASCII.
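The figure of 11,172 follows from how Unicode composes Hangul syllables: 19 initial consonants, 21 vowels, and 28 final positions (27 final consonants plus "no final"). The sketch below is a minimal check of that arithmetic, assuming the 11,172 characters are the Unicode Hangul Syllables block starting at U+AC00; the function and constant names are illustrative.

```python
HANGUL_BASE = 0xAC00               # first precomposed syllable, GA
LEADS, VOWELS, TAILS = 19, 21, 28  # TAILS includes the "no final consonant" case

def syllable(lead: int, vowel: int, tail: int = 0) -> str:
    """Compose a precomposed Hangul syllable from zero-based jamo indexes."""
    return chr(HANGUL_BASE + (lead * VOWELS + vowel) * TAILS + tail)

total = LEADS * VOWELS * TAILS
first = syllable(0, 0, 0)
last = syllable(LEADS - 1, VOWELS - 1, TAILS - 1)

print(total)                               # 11172
print(hex(ord(first)), hex(ord(last)))     # 0xac00 0xd7a3
# Each of these syllables occupies three bytes in UTF-8.
print(len(first.encode("utf-8")), len(last.encode("utf-8")))  # 3 3
```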
Table 3-6 lists the codeset conversions supported for Korean.
Table 3-6 Codeset Conversions Supported for Korean ko, ko.UTF-8
Source Code | Source Symbol | Target Code | Target Symbol |
---|---|---|---|
UTF-8 | ko_KR-UTF-8 | Wansung | ko_KR-euc |
UTF-8 | ko_KR-UTF-8 | Johap | ko_KR-johap92 |
UTF-8 | ko_KR-UTF-8 | Packed | ko_KR-johap |
UTF-8 | ko_KR-UTF-8 | ISO-2022-KR | ko_KR-iso2022-7 |
Wansung | ko_KR-euc | UTF-8 | ko_KR-UTF-8 |
Johap | ko_KR-johap92 | UTF-8 | ko_KR-UTF-8 |
Packed | ko_KR-johap | UTF-8 | ko_KR-UTF-8 |
ISO-2022-KR | ko_KR-iso2022-7 | UTF-8 | ko_KR-UTF-8 |
Wansung | ko_KR-euc | Johap | ko_KR-johap92 |
Wansung | ko_KR-euc | Packed | ko_KR-johap |
Wansung | ko_KR-euc | N-Byte | ko_KR-nbyte |
Wansung | ko_KR-euc | ISO-2022-KR | ko_KR-iso2022-7 |
Johap | ko_KR-johap92 | Wansung | ko_KR-euc |
Packed | ko_KR-johap | Wansung | ko_KR-euc |
N-Byte | ko_KR-nbyte | Wansung | ko_KR-euc |
ISO-2022-KR | ko_KR-iso2022-7 | Wansung | ko_KR-euc |
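The conversion pairs in Table 3-6 can be captured as data and checked before a conversion is attempted. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that the symbols in the table are the code names passed to the `iconv` command with its standard `-f` (from) and `-t` (to) options.

```python
UTF8 = "ko_KR-UTF-8"
EUC = "ko_KR-euc"            # Wansung
JOHAP92 = "ko_KR-johap92"    # Johap
PACKED = "ko_KR-johap"       # Packed
NBYTE = "ko_KR-nbyte"        # N-Byte
ISO2022 = "ko_KR-iso2022-7"  # ISO-2022-KR

# (source symbol, target symbol) pairs, transcribed from Table 3-6.
SUPPORTED = {
    (UTF8, EUC), (UTF8, JOHAP92), (UTF8, PACKED), (UTF8, ISO2022),
    (EUC, UTF8), (JOHAP92, UTF8), (PACKED, UTF8), (ISO2022, UTF8),
    (EUC, JOHAP92), (EUC, PACKED), (EUC, NBYTE), (EUC, ISO2022),
    (JOHAP92, EUC), (PACKED, EUC), (NBYTE, EUC), (ISO2022, EUC),
}

def is_supported(src: str, dst: str) -> bool:
    """Return True if Table 3-6 lists a direct conversion from src to dst."""
    return (src, dst) in SUPPORTED

def iconv_argv(src: str, dst: str) -> list[str]:
    """Build an iconv argument list (assumed usage), or raise if unlisted."""
    if not is_supported(src, dst):
        raise ValueError(f"no direct conversion listed: {src} -> {dst}")
    return ["iconv", "-f", src, "-t", dst]

print(iconv_argv(EUC, UTF8))         # ['iconv', '-f', 'ko_KR-euc', '-t', 'ko_KR-UTF-8']
print(is_supported(UTF8, NBYTE))     # False: N-Byte converts only to and from Wansung
print(is_supported(JOHAP92, PACKED)) # False: convert through Wansung or UTF-8 instead
```

Every listed conversion has either UTF-8 or Wansung on one side, so a pair such as Johap to Packed would need two steps through one of those codesets.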