The Korean government announced the standard Korean codeset KS C 5700, which is based on Unicode 2.0. KS C 5700 will be widely used in the Korean market, replacing the previous standard, KS C 5601, which is based on ISO 2022.
To comply with this new standard, the ko.UTF-8 locale was developed. UTF-8 is a file-system-safe encoding (Universal Character Set Transformation Format) of Unicode, which is based on ISO 10646-1/Unicode 2.0.
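As a rough illustration of what "file-system-safe" means in practice, the following Python sketch encodes one Hangul syllable and one ASCII letter to UTF-8. The helper name utf8_bytes is hypothetical and is not part of the locale. The point it shows is that ASCII characters remain single bytes, while every byte of a multi-byte sequence is 0x80 or above, so a byte such as "/" or NUL never appears inside an encoded Korean character.

```python
def utf8_bytes(ch: str) -> list[int]:
    """Return the UTF-8 byte values for a single character (illustrative helper)."""
    return list(ch.encode("utf-8"))


# U+D55C (HANGUL SYLLABLE HAN) becomes a three-byte sequence.
han = utf8_bytes("\ud55c")
print([hex(b) for b in han])

# An ASCII letter stays a single byte, identical to its ASCII value.
print([hex(b) for b in utf8_bytes("A")])

# No byte of the multi-byte sequence falls in the ASCII range,
# so it cannot be mistaken for "/" (0x2F) or NUL (0x00).
print(all(b >= 0x80 for b in han))
```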
ko.UTF-8 supports all the characters of KS C 5601 and 11,172 characters from Johap. ko.UTF-8 supports all Korean-related Unicode 2.0 characters and fonts. All Unicode characters can be accepted and processed, but some cannot be correctly displayed because of input and output limitations.
ko.UTF-8 supports the following subset of Unicode:
Basic Latin and Latin-1 (190 characters) - Row 00 of BMP (Basic Multilingual Plane)
Symbolic characters - Row 20 to Row 27, and Row 32 of BMP, including box (line) drawing characters that are defined in KS C 5601
Numerals that are defined in KS C 5601 (20 characters) - Row 21 and Row FF of BMP
Roman, Greek, Japanese, and Cyrillic alphabet characters that are defined in KS C 5601 (362 characters) - Row 03, Row 04, Row 30 and Row FF of BMP
Jamo (Hangul alphabet) characters (94 characters) - Row 31 of BMP
Pre-composed Hangul syllables (11,172 characters) - From Row AC to Row D7 of BMP
Hanja characters defined in KS C 5601 (4,888 characters) - From Row 4E to Row 9F and from Row F9 to Row FA of BMP
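The row numbers in the list above are the high byte of a 16-bit BMP code point. The Python sketch below turns the list into a row-level lookup. The names ROW_CATEGORIES, bmp_row, and candidate_categories are hypothetical and chosen only for illustration. Because the locale supports a subset of each row, and some rows (21 and FF) appear under more than one entry, the lookup returns candidate categories only. It does not confirm that a given character is supported. The U+AC00 to U+D7A3 syllable range used in the final check comes from the Unicode standard rather than from the list itself.

```python
# Row ranges (inclusive) copied from the list above; a row-level sketch only.
ALPHABETS = "Roman, Greek, Japanese, and Cyrillic alphabet characters"
ROW_CATEGORIES = [
    ((0x00, 0x00), "Basic Latin and Latin-1"),
    ((0x20, 0x27), "Symbolic characters"),
    ((0x32, 0x32), "Symbolic characters"),
    ((0x21, 0x21), "Numerals"),
    ((0xFF, 0xFF), "Numerals"),
    ((0x03, 0x04), ALPHABETS),
    ((0x30, 0x30), ALPHABETS),
    ((0xFF, 0xFF), ALPHABETS),
    ((0x31, 0x31), "Jamo"),
    ((0xAC, 0xD7), "Pre-composed Hangul syllables"),
    ((0x4E, 0x9F), "Hanja"),
    ((0xF9, 0xFA), "Hanja"),
]


def bmp_row(ch: str) -> int:
    """Return the BMP row (high byte of the code point) of one character."""
    code_point = ord(ch)
    if code_point > 0xFFFF:
        raise ValueError("character is outside the Basic Multilingual Plane")
    return code_point >> 8


def candidate_categories(ch: str) -> list[str]:
    """List every category above whose row range contains the character's row."""
    row = bmp_row(ch)
    return [name for (low, high), name in ROW_CATEGORIES if low <= row <= high]


print(hex(bmp_row("\ud55c")), candidate_categories("\ud55c"))  # Hangul syllable
print(hex(bmp_row("\u3131")), candidate_categories("\u3131"))  # Jamo
print(hex(bmp_row("\u2160")), candidate_categories("\u2160"))  # Roman numeral one

# The syllable block U+AC00..U+D7A3 holds exactly the 11,172 syllables listed.
print(0xD7A3 - 0xAC00 + 1)
```

An empty result from candidate_categories means the character's row is not in the list at all. A non-empty result still needs a per-character check against KS C 5601 before the character can be assumed displayable.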