#include <sys/sunddi.h>

kiconv_t kiconv_open(const char *tocode, const char *fromcode);
Solaris DDI specific (Solaris DDI).
tocode Points to a target codeset name string.
fromcode Points to a source codeset name string.
The kiconv_open() function returns a code conversion descriptor that describes a conversion from the codeset specified by fromcode to the codeset specified by tocode. For state-dependent encodings, the conversion descriptor is in a codeset-dependent initial state (ready for immediate use with the kiconv() function).
Supported code conversions are between UTF-8 and the following:
Name Description
Big5 Traditional Chinese Big5
Big5-HKSCS Traditional Chinese Big5-Hong Kong Supplementary Character Set
CP720 DOS Arabic
CP737 DOS Greek
CP850 DOS Latin-1 (Western European)
CP852 DOS Latin-2 (Eastern European)
CP857 DOS Latin-5 (Turkish)
CP862 DOS Hebrew
CP866 DOS Cyrillic Russian
CP932 Japanese Shift JIS (Windows)
CP950-HKSCS Traditional Chinese HKSCS-2001 (Windows)
CP1250 Central Europe
CP1251 Cyrillic
CP1252 Western Europe
CP1253 Greek
CP1254 Turkish
CP1255 Hebrew
CP1256 Arabic
CP1257 Baltic
EUC-CN Simplified Chinese EUC
EUC-JP Japanese EUC
EUC-JP-MS Japanese EUC MS
EUC-KR Korean EUC
EUC-TW Traditional Chinese EUC
GB18030 Simplified Chinese GB18030
GBK Simplified Chinese GBK
ISO-8859-1 Latin-1 (Western European)
ISO-8859-2 Latin-2 (Eastern European)
ISO-8859-3 Latin-3 (Southern European)
ISO-8859-4 Latin-4 (Northern European)
ISO-8859-5 Cyrillic
ISO-8859-6 Arabic
ISO-8859-7 Greek
ISO-8859-8 Hebrew
ISO-8859-9 Latin-5 (Turkish)
ISO-8859-10 Latin-6 (Nordic)
ISO-8859-13 Latin-7 (Baltic)
ISO-8859-15 Latin-9 (Western European with euro sign)
KOI8-R Cyrillic
Shift_JIS Japanese Shift JIS (JIS)
TIS_620 Thai (a.k.a. ISO 8859-11)
Unified-Hangul Korean Unified Hangul
UTF-8 and the names above can be used as tocode and fromcode to specify the desired code conversion. The following aliases are also supported as alternative names:
Aliases        Original Name
720            CP720
737            CP737
850            CP850
852            CP852
857            CP857
862            CP862
866            CP866
932            CP932
936, CP936     GBK
949, CP949     Unified-Hangul
950, CP950     Big5
1250           CP1250
1251           CP1251
1252           CP1252
1253           CP1253
1254           CP1254
1255           CP1255
1256           CP1256
1257           CP1257
ISO-8859-11    TIS_620
PCK, SJIS      Shift_JIS
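The alias table above can be read as a simple lookup from an alias to its original name. The following is a minimal sketch of such a lookup, not the kernel's implementation: the structure, the table name, and the canonical_codeset() helper are hypothetical, and exact (case-sensitive) string matching is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: an alias and the original name it maps to. */
struct codeset_alias {
	const char *alias;
	const char *name;
};

/* Entries copied from the alias table in this manual page. */
static const struct codeset_alias alias_table[] = {
	{ "720", "CP720" }, { "737", "CP737" }, { "850", "CP850" },
	{ "852", "CP852" }, { "857", "CP857" }, { "862", "CP862" },
	{ "866", "CP866" }, { "932", "CP932" },
	{ "936", "GBK" }, { "CP936", "GBK" },
	{ "949", "Unified-Hangul" }, { "CP949", "Unified-Hangul" },
	{ "950", "Big5" }, { "CP950", "Big5" },
	{ "1250", "CP1250" }, { "1251", "CP1251" }, { "1252", "CP1252" },
	{ "1253", "CP1253" }, { "1254", "CP1254" }, { "1255", "CP1255" },
	{ "1256", "CP1256" }, { "1257", "CP1257" },
	{ "ISO-8859-11", "TIS_620" },
	{ "PCK", "Shift_JIS" }, { "SJIS", "Shift_JIS" }
};

/*
 * Return the original name for a known alias, or the argument itself
 * when it is not an alias. Matching is exact; that is an assumption.
 */
const char *
canonical_codeset(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof (alias_table) / sizeof (alias_table[0]); i++) {
		if (strcmp(name, alias_table[i].alias) == 0)
			return (alias_table[i].name);
	}
	return (name);
}
```

Under these assumptions, canonical_codeset("PCK") yields "Shift_JIS", while a name that is not an alias, such as "EUC-JP", is returned unchanged.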
A conversion descriptor remains valid until it is closed by using kiconv_close().
Upon successful completion, kiconv_open() returns a code conversion descriptor for use on subsequent calls to kiconv(). Otherwise, if the conversion specified by fromcode and tocode is not supported, or if the code conversion descriptor cannot be allocated for any other reason, kiconv_open() returns (kiconv_t)-1 to indicate the error.
kiconv_open() can be called from user context only.
The following example shows how to open a code conversion from ISO 8859-15 to UTF-8:
#include <sys/sunddi.h>

kiconv_t cd;

cd = kiconv_open("UTF-8", "ISO-8859-15");
if (cd == (kiconv_t)-1) {
        /* Cannot open up the code conversion. */
        return (-1);
}
See attributes(5) for descriptions of the following attributes:
iconv(3C), iconv_close(3C), iconv_open(3C), u8_strcmp(3C), u8_textprep_str(3C), u8_validate(3C), uconv_u16tou32(3C), uconv_u16tou8(3C), uconv_u32tou16(3C), uconv_u32tou8(3C), uconv_u8tou16(3C), uconv_u8tou32(3C), attributes(5), kiconv(9F), kiconvstr(9F), kiconv_close(9F), u8_strcmp(9F), u8_textprep_str(9F), u8_validate(9F), uconv_u16tou32(9F), uconv_u16tou8(9F), uconv_u32tou16(9F), uconv_u32tou8(9F), uconv_u8tou16(9F), uconv_u8tou32(9F)
The Unicode Standard
http://www.unicode.org/standard/standard.html
The code conversions are available between UTF-8 and the above noted codesets. For example, to convert from EUC-JP to Shift_JIS, first convert EUC-JP to UTF-8 and then convert UTF-8 to Shift_JIS.
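Because kiconv_open() is available only inside the kernel, the following sketch uses the userland counterparts iconv_open(3C), iconv(3C), and iconv_close(3C) to illustrate the same two-step pivot through UTF-8. The helper names one_step() and convert_via_utf8() and the fixed 256-byte pivot buffer are assumptions made for illustration, and the codeset name spellings accepted by a given userland iconv implementation may differ from the kernel names listed above.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: run a single conversion from fromcode to tocode.
 * Returns the number of bytes written to out, or (size_t)-1 on failure.
 */
static size_t
one_step(const char *tocode, const char *fromcode,
    const char *in, size_t inlen, char *out, size_t outsz)
{
	iconv_t cd;
	char *ip = (char *)in;
	char *op = out;
	size_t ileft = inlen;
	size_t oleft = outsz;
	size_t rv;

	cd = iconv_open(tocode, fromcode);
	if (cd == (iconv_t)-1)
		return ((size_t)-1);

	rv = iconv(cd, &ip, &ileft, &op, &oleft);
	if (rv != (size_t)-1) {
		/* Flush any pending shift sequence for stateful codesets. */
		rv = iconv(cd, NULL, NULL, &op, &oleft);
	}
	(void) iconv_close(cd);

	return (rv == (size_t)-1 ? (size_t)-1 : outsz - oleft);
}

/*
 * Hypothetical helper: convert between two non-UTF-8 codesets by
 * converting to UTF-8 first and then from UTF-8 to the target.
 * The pivot buffer size is an arbitrary choice for short inputs.
 */
size_t
convert_via_utf8(const char *tocode, const char *fromcode,
    const char *in, size_t inlen, char *out, size_t outsz)
{
	char pivot[256];
	size_t n;

	n = one_step("UTF-8", fromcode, in, inlen, pivot, sizeof (pivot));
	if (n == (size_t)-1)
		return ((size_t)-1);

	return (one_step(tocode, "UTF-8", pivot, n, out, outsz));
}
```

A kernel version would follow the same shape with kiconv_open(), kiconv(), and kiconv_close(), opening one descriptor per step and checking each for (kiconv_t)-1.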
The code conversions supported are based on simple one-to-one mappings. There is no special treatment or processing done during code conversions such as case conversion, Unicode Normalization, or mapping between combining or conjoining sequences of UTF-8 and pre-composed characters in non-UTF-8 codesets.
All supported non-UTF-8 codesets use pre-composed characters only. UTF-8, however, also allows combining and conjoining characters. For this reason, it might be necessary to apply one of the Unicode Normalization forms to UTF-8 text with u8_textprep_str() before or after doing code conversions.
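One way to decide whether normalization is worth applying is to check the UTF-8 text for combining characters first. The sketch below is a deliberately narrow illustration: the has_combining_diacritic() helper is hypothetical, it assumes the input is already valid UTF-8, and it detects only the Combining Diacritical Marks block (U+0300 through U+036F, encoded as the bytes CC 80 through CD AF), not every combining or conjoining character.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the valid UTF-8 buffer s contains a
 * code point in U+0300..U+036F, otherwise 0. In valid UTF-8 the bytes
 * 0xCC and 0xCD occur only as lead bytes of two-byte sequences, so a
 * linear scan over adjacent byte pairs is sufficient.
 */
int
has_combining_diacritic(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i++) {
		unsigned char lead = (unsigned char)s[i];
		unsigned char next = (unsigned char)s[i + 1];

		/* The second byte must be a continuation byte. */
		if (next < 0x80 || next > 0xBF)
			continue;
		if (lead == 0xCC)		/* U+0300..U+033F */
			return (1);
		if (lead == 0xCD && next <= 0xAF)	/* U+0340..U+036F */
			return (1);
	}
	return (0);
}
```

For example, "e" followed by U+0301 (bytes 65 CC 81) is flagged, while the pre-composed U+00E9 (bytes C3 A9) is not; only the latter has a direct one-to-one mapping in a codeset such as ISO-8859-1.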