man pages section 7: Standards, Environments, Macros, Character Sets, and Miscellany

Updated: Wednesday, July 27, 2022

iconv_zh (7)

Name

iconv_zh - codeset conversion for Chinese encodings

Description

The table below provides basic information on available Chinese code conversions for iconv and cconv.

The names listed in the table are canonical names. Aliases are also supported; for example, EUC-CN and zh_CN.euc are aliases for GB2312.

Code Conversions    Description

GB2312              Chinese Extended UNIX Code (EUC) representation, with
                    US-ASCII as the primary single-byte codeset and GB2312
                    as the two-byte codeset.

GBK                 An extension and superset of GB2312. Includes characters
                    from GB13000.1-93 and more.

GB18030             Chinese National Standard GB 18030 representation, also
                    a superset of the GB2312 character set. Represents
                    US-ASCII characters as the single-byte codeset and
                    GB18030 characters as the two-byte and four-byte
                    codesets.

HZ-GB2312           Chinese 7-bit encoded codeset based on GB2312, as
                    specified in RFC 1843. Can represent characters of the
                    US-ASCII and GB2312 character sets.

ISO-2022-CN         Chinese 7-bit encoded codeset based on the ISO/IEC 2022
                    extension mechanism, as specified in RFC 1922. Can
                    represent US-ASCII as the single-byte codeset, the
                    GB2312 character set, and CNS 11643 planes 1 and 2.

ISO-2022-CN-EXT     Chinese 7-bit encoded codeset based on the ISO/IEC 2022
                    extension mechanism, as specified in RFC 1922. Can
                    represent US-ASCII as the single-byte codeset, the
                    GB2312 character set, and CNS 11643 planes 1, 2, 3, 4,
                    5, 6, 7, and 15.

IBM-935             IBM EBCDIC mixed multibyte code page for Chinese.

CP936               Windows code page 936 for Simplified Chinese. Identical
                    to GB2312 and expanded to cover most of GBK.
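As a rough illustration of the byte-level properties described in the table, the following sketch uses Python's built-in codecs rather than the iconv modules themselves. The codec names gb2312, gbk, gb18030, and hz are Python's own names and are only assumed to correspond to the GB2312, GBK, GB18030, and HZ-GB2312 conversions listed here; the helper encoded_lengths is hypothetical.

```python
# Illustration only: these are Python's codec implementations, assumed
# (not verified) to match the iconv conversions of the same names.

def encoded_lengths(text, codec):
    """Return the number of bytes each character of `text` takes in `codec`."""
    return [len(ch.encode(codec)) for ch in text]

SAMPLE = "A\u4e2d"      # "A" (US-ASCII) followed by U+4E2D, a GB2312 character
RARE = "\U00020000"     # U+20000, a character outside GB2312 and GBK

# US-ASCII takes one byte and a GB2312 character two bytes in all three.
for codec in ("gb2312", "gbk", "gb18030"):
    print(codec, encoded_lengths(SAMPLE, codec))

# GBK cannot represent U+20000, while GB18030 uses its four-byte form.
try:
    RARE.encode("gbk")
    gbk_can_encode_rare = True
except UnicodeEncodeError:
    gbk_can_encode_rare = False
print("gbk can encode U+20000:", gbk_can_encode_rare)
print("gb18030", encoded_lengths(RARE, "gb18030"))

# HZ wraps GB2312 text in ~{ ... ~} and emits only 7-bit bytes.
hz_bytes = "\u4e2d".encode("hz")
print(hz_bytes, all(b < 0x80 for b in hz_bytes))
```

The same comparison can be made with iconv(1) by converting a small UTF-8 sample to each codeset and inspecting the resulting bytes.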

To list the iconv and cconv conversions available on the current system, run iconv -l, as described in the iconv(1) manual page.

For additional information on the mappings between canonical names and supported aliases with optional variant levels, refer to the alias(5) manual page and the /usr/lib/iconv/alias file.
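The idea of resolving an alias to a canonical name can be sketched with Python's codec registry. This is only an analogy: Python keeps its own alias table, which is not the /usr/lib/iconv/alias file, and the function canonical_name below is a hypothetical helper.

```python
import codecs

def canonical_name(alias):
    """Resolve a codeset alias to the name Python treats as canonical."""
    # codecs.lookup() raises LookupError for names it does not know.
    return codecs.lookup(alias).name

# Different spellings of the same codeset resolve to one canonical name.
for alias in ("EUC-CN", "euccn", "GB2312"):
    print(alias, "->", canonical_name(alias))
```

On a system with the iconv modules described here, alias(5) remains the authoritative source for which aliases are accepted.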

Files

/usr/lib/iconv/*.so

iconv conversion modules

/usr/lib/iconv/*.bt

cconv code conversion binary tables for iconv(1), cconv(3C), and iconv(3C)

/usr/lib/iconv/geniconvtbl/binarytables/*.bt

geniconvtbl conversion binary tables for iconv(1) and iconv(3C)

/usr/lib/iconv/alias

alias table file of codeset names

See Also

geniconvtbl(1), iconv(1), cconv(3C), cconv_close(3C), cconv_open(3C), cconvctl(3C), iconv(3C), iconv_close(3C), iconv_open(3C), iconvctl(3C), alias(5), geniconvtbl(5), geniconvtbl-cconv(5), iconv_extra(7), iconv_ja(7), iconv_ko(7), iconv_unicode(7), iconv_zh_HK(7), iconv_zh_TW(7)

Lee, F., HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and ASCII characters, RFC 1843, Stanford University, August 1995.

Wei, Y., Y. Zhang, J. Li, J. Ding, and Y. Jiang, ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages, RFC 1842, AsiaInfo Services Inc., Harvard University, Rice University, University of Maryland, August 1995.

Zhu, H., D. Hu, Z. Wang, T. Kao, W. Chang, and M. Crispin, Chinese Character Encoding for Internet Messages, RFC 1922, Tsinghua University, China Information Technology Standardization Technical Committee (CITS), Institute for Information Industry (III), University of Washington, March 1996.

Chinese national standards GB 2312-80 and GB 18030-2000, Standardization Administration of the People's Republic of China.

Unicode Standard Annex #38: Unicode Han Database (Unihan), https://www.unicode.org/reports/tr38/

Notes

The cconv code conversions for the Chinese codesets listed here also support the following variant levels, each based on a field of the Unihan database:

level 1:

kSimplifiedVariant

level 2:

kTraditionalVariant

level 3:

kSemanticVariant

level 4:

kZVariant

level 5:

kCompatibilityVariant

level 6:

kSpecializedSemanticVariant