The following table lists the supported code conversions.
Unicode* includes all of the following codesets: UTF-8, UCS-2, UCS-2BE, UCS-2LE, UCS-4, UCS-4BE, UCS-4LE, UTF-16, UTF-16BE, UTF-16LE.
ISO 8859 codesets can also be referenced without the ISO prefix; for example, ISO 8859-1 can be written as 8859-1.
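The two naming conventions above (the Unicode* shorthand and the optional ISO prefix) can be sketched in a few lines of Python. This is only an illustration of how the names relate to each other: `UNICODE_FAMILY`, `canonical_iso8859`, and `expand` are hypothetical names introduced here and are not part of any conversion utility.

```python
from typing import List

# Codesets covered by the "Unicode*" shorthand, exactly as listed above.
UNICODE_FAMILY = [
    "UTF-8", "UCS-2", "UCS-2BE", "UCS-2LE", "UCS-4", "UCS-4BE",
    "UCS-4LE", "UTF-16", "UTF-16BE", "UTF-16LE",
]


def canonical_iso8859(name: str) -> str:
    """Hypothetical helper: add the ISO prefix back to a bare 8859-N name."""
    stripped = name.strip()
    if stripped.startswith("8859-"):
        return "ISO " + stripped
    return stripped


def expand(name: str) -> List[str]:
    """Hypothetical helper: expand 'Unicode*' to its member codesets.

    Any other name is returned as a one-element list, with a bare
    8859-N name rewritten to its ISO-prefixed form.
    """
    if name.strip() == "Unicode*":
        return list(UNICODE_FAMILY)
    return [canonical_iso8859(name)]
```

With these helpers, `expand("Unicode*")` yields the ten codesets named above, and `expand("8859-1")` and `expand("ISO 8859-1")` both yield the same single name.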
Code | Code | Description |
---|---|---|
Unicode* | ISO 646 | Unicode* <--> ISO 646 (ASCII) |
Unicode* | ISO 8859-1 | Unicode* <--> ISO 8859-1 (Latin-1) |
Unicode* | ISO 8859-2 | Unicode* <--> ISO 8859-2 (Latin-2) |
Unicode* | ISO 8859-3 | Unicode* <--> ISO 8859-3 (Latin-3) |
Unicode* | ISO 8859-4 | Unicode* <--> ISO 8859-4 (Latin-4) |
Unicode* | ISO 8859-5 | Unicode* <--> ISO 8859-5 (Cyrillic) |
Unicode* | ISO 8859-6 | Unicode* <--> ISO 8859-6 (Arabic) |
Unicode* | ISO 8859-7 | Unicode* <--> ISO 8859-7 (Greek) |
Unicode* | ISO 8859-8 | Unicode* <--> ISO 8859-8 (Hebrew) |
Unicode* | ISO 8859-9 | Unicode* <--> ISO 8859-9 (Latin-5) |
Unicode* | ISO 8859-10 | Unicode* <--> ISO 8859-10 (Latin-6) |
Unicode* | ISO 8859-13 | Unicode* <--> ISO 8859-13 |
Unicode* | ISO 8859-14 | Unicode* <--> ISO 8859-14 |
Unicode* | ISO 8859-15 | Unicode* <--> ISO 8859-15 |
Unicode* | KOI8-R, KOI8-U, koi8-r, koi8-u | Unicode* <--> KOI8-R, KOI8-U, koi8-r, koi8-u (Cyrillic) |
UTF-7 | UCS-2, UCS-4, UTF-8 | UTF-7 <--> UCS-2, UCS-4, UTF-8 |
UTF-8 | UCS-2, UCS-4, UTF-16 | UTF-8 <--> UCS-2, UCS-4, UTF-16 |
UTF-8 | UCS-2BE, UCS-2LE, UCS-4BE, UCS-4LE, UTF-16BE, UTF-16LE | UTF-8 <--> UCS-2BE, UCS-2LE, UCS-4BE, UCS-4LE, UTF-16BE, UTF-16LE |
UCS-4, UCS-4BE, UCS-4LE | UCS-2, UCS-2BE, UCS-2LE, UTF-16, UTF-16BE, UTF-16LE | UCS-4, UCS-4BE, UCS-4LE <--> UCS-2, UCS-2BE, UCS-2LE, UTF-16, UTF-16BE, UTF-16LE |
UTF-8 | UTF-EBCDIC | UTF-8 <--> UTF-EBCDIC |
UTF-8 | IBM-037, -273, -277, -278, -280, -284, -285, -297, -420, -424, -500, -850, -852, -855, -856, -857, -862, -864, -866, -869, -870, -875, -880, -921, -922, -1025, -1026, -1046, -1112, -1122 | UTF-8 <--> various IBM code pages (PC and EBCDIC) |
UTF-8 | CP850, CP852, CP855, CP857, CP862, CP864, CP866, CP869, CP874, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255, CP1256, CP1257, CP1258 | UTF-8 <--> various Microsoft code pages |
UTF-8 | eucJP | UTF-8 <--> Japanese EUC (JIS X0201-1976, JIS X0208-1983 and JIS X0212-1990) |
UTF-8 | PCK | UTF-8 <--> Japanese PC Kanji (a.k.a. SJIS) |
UTF-8 | ISO-2022-JP | UTF-8 <--> Japanese MIME charset |
UTF-8-Java | eucJP | UTF-8-Java to Japanese EUC (JIS X0201-1976, JIS X0208-1983 and JIS X0212-1990) |
UTF-8-Java | PCK | UTF-8-Java to Japanese PC Kanji (a.k.a. SJIS) |
UTF-8-Java | ISO-2022-JP.RFC1468 | UTF-8-Java to Japanese MIME charset (one-way conversion) |
UTF-8 | ko_KR-euc | UTF-8 <--> Korean EUC (KS C 5636 and KS C 5601-1987) |
UTF-8 | ko_KR-johap | UTF-8 <--> Korean Johap (of KS C 5601-1987) |
UTF-8 | ko_KR-johap92 | UTF-8 <--> Korean Johap (of KS C 5601-1992) |
UTF-8 | ko_KR-iso2022-7 | UTF-8 <--> Korean MIME charset (ISO-2022-KR) |
UTF-8 | ko_KR-cp933 | UTF-8 <--> IBM MBCS CP933 ko_KR-euc |
UTF-8 | gb2312 | UTF-8 <--> Simplified Chinese EUC (GB 1988-1980 and GB 2312-1980) |
UTF-8 | iso2022 | UTF-8 <--> Simplified Chinese MIME charset (ISO-2022-CN) |
UTF-8 | GBK | UTF-8 <--> Simplified Chinese GBK |
UTF-8 | zh_TW-euc | UTF-8 <--> Traditional Chinese EUC (CNS 11643-1992) |
UTF-8 | zh_TW-big5 | UTF-8 <--> Traditional Chinese Big5 |
UTF-8 | zh_TW-iso2022-7 | UTF-8 <--> Traditional Chinese MIME charset (ISO-2022-TW) |
UTF-8 | zh_TW-cp937 | UTF-8 <--> IBM MBCS CP937 |
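
To illustrate what a bidirectional (`<-->`) entry in the table means in practice, here is a minimal sketch that uses Python's built-in codecs as a stand-in for the converter. The `convert` helper is a hypothetical name, and the Python codec names (`utf-8`, `iso-8859-1`, `koi8-r`) are Python's own spellings, assumed here to correspond to the UTF-8, ISO 8859-1, and KOI8-R codesets in the table; they are not the codeset names the table itself uses.

```python
def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Hypothetical helper: re-encode bytes from one codeset to another.

    Uses Python's codec names, which are an assumption and may differ
    from the codeset names listed in the table.
    """
    return data.decode(from_code).encode(to_code)


# Round trip for the "Unicode* <--> ISO 8859-1 (Latin-1)" row.
utf8_bytes = "Grüße".encode("utf-8")
latin1_bytes = convert(utf8_bytes, "utf-8", "iso-8859-1")
round_trip = convert(latin1_bytes, "iso-8859-1", "utf-8")

# Round trip for the KOI8-R (Cyrillic) row.
cyrillic_utf8 = "Привет".encode("utf-8")
koi8_bytes = convert(cyrillic_utf8, "utf-8", "koi8-r")
cyrillic_back = convert(koi8_bytes, "koi8-r", "utf-8")

# Converting to a single-byte codeset fails for characters the target
# codeset does not contain: Cyrillic letters are not in ISO 8859-1.
try:
    convert(cyrillic_utf8, "utf-8", "iso-8859-1")
    unmappable = False
except UnicodeEncodeError:
    unmappable = True
```

The round trips return the original UTF-8 bytes because every character in each sample string exists in the target codeset. When it does not, as in the last case, a converter has to report an error or substitute a replacement character, so a `<-->` entry guarantees a lossless round trip only for text within the smaller codeset's repertoire.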