man pages section 3: Basic Library Functions

Updated: Thursday, June 13, 2019

cconv_open (3C)

Name

cconv_open - character sequence based code conversion allocation function

Synopsis

#include <iconv.h>

cconv_t cconv_open(const char *tocode, int variant_tocode,
    const char *fromcode, int variant_fromcode, int flag);

Description

The cconv_open() function returns a character sequence based code conversion descriptor that describes a conversion from the codeset specified by the string pointed to by the fromcode argument to the codeset specified by the string pointed to by the tocode argument.

A non-zero value for the variant_tocode argument indicates a variation of the codeset specified by the tocode argument. A non-zero value for the variant_fromcode argument indicates a variation of the codeset specified by the fromcode argument. For the supported variations and the corresponding variant values, refer to the alias(5) file and the section 5 man pages shown in the SEE ALSO section.

The flag argument modifies the conversion behavior. It is constructed by a bitwise-inclusive OR of the following values:

CCONV_CANONICAL_NAMES

By default, the cconv_open() function normalizes supplied tocode and fromcode names and searches corresponding canonical names and variant values to support aliases as described in alias(5). If this flag value is specified, the function assumes that the names supplied are already canonical names and will not go through the normalization and search process.

If this flag value is not specified and the normalization and search process finds a canonical name and a variant value, the found canonical name is used to locate the code conversion. The variant value is taken from the corresponding input argument (variant_tocode or variant_fromcode) if that argument is not 0. If the argument is 0, the found variant value is used instead.
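The variant selection rule above can be modeled in a few lines. This is a minimal sketch, not the library's implementation: struct codeset_ref and resolve_codeset() are hypothetical names introduced for illustration, and the alias lookup that produces the "found" pair is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: a canonical codeset name plus a variant value, as an
 * alias lookup might return it. Not part of <iconv.h>. */
struct codeset_ref {
    const char *canonical_name;
    int variant;
};

/* Models the rule for the case where CCONV_CANONICAL_NAMES is not set and
 * the lookup succeeded: keep the found canonical name, and let a non-zero
 * caller-supplied variant override the variant found by the lookup. */
struct codeset_ref resolve_codeset(const struct codeset_ref *found,
    int input_variant)
{
    assert(found != NULL && found->canonical_name != NULL);

    struct codeset_ref result = *found;
    if (input_variant != 0)
        result.variant = input_variant;
    return result;
}
```

With a found pair of ("UTF-8", 3), an input variant of 0 keeps variant 3, while an input variant of 2 selects variant 2.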

CCONV_CONV_ILLEGAL_DISCARD

When specified, during subsequent cconv(3C) code conversion, a sequence of illegal input bytes that does not form a valid character in the codeset specified by the fromcode argument is silently discarded as if there are no such illegal bytes in the input buffer and the conversion continues.

CCONV_CONV_ILLEGAL_REPLACE_HEX

For any illegal input bytes, the cconv(3C) code conversion converts each such byte into a hexadecimal number with a specific leading four-letter designator sequence, as if the value were a valid input byte, and the conversion continues. More specifically, each such hexadecimal number has a leading four-letter designator sequence of "IL--" followed by two hexadecimal digits in uppercase. For example, IL--01 for 1, IL--0A for 10, IL--0B for 11, IL--EF for 239, and so on.
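The replacement format can be reproduced with a small helper. The sketch below only illustrates the "IL--" plus two uppercase hexadecimal digits layout described above; format_illegal_byte() is a hypothetical name and is not part of the cconv interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the 6-character replacement for one illegal
 * input byte into out as a NUL-terminated string. Returns 0 on success,
 * or -1 if out cannot hold 6 characters plus the terminating NUL. */
int format_illegal_byte(unsigned char byte, char *out, size_t outsize)
{
    static const char hex[] = "0123456789ABCDEF";

    assert(out != NULL);
    if (outsize < 7)
        return -1;

    memcpy(out, "IL--", 4);
    out[4] = hex[byte >> 4];     /* high nibble */
    out[5] = hex[byte & 0x0F];   /* low nibble */
    out[6] = '\0';
    return 0;
}
```

Calling format_illegal_byte(10, buf, sizeof buf) with a buffer of at least 7 bytes leaves "IL--0A" in buf, matching the example in the text.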

CCONV_CONV_ILLEGAL_RESTORE_HEX

When specified, the cconv(3C) code conversion simply converts back the above mentioned hexadecimal numbers for illegal input bytes into corresponding byte values regardless of the codeset specified by the tocode argument. For example, IL--0A will be converted back into a byte with 10 as the value and IL--FF into 255.

If the characters following the leading four-letter designator sequence do not form a valid hexadecimal number, such a sequence will not be treated as a hexadecimal number for illegal input bytes.
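The restore direction, including the rule that an invalid hexadecimal suffix leaves the sequence untouched, can be sketched as follows. restore_illegal_byte() and hex_digit_value() are hypothetical helpers. Accepting only uppercase digits is an assumption of this sketch; the text does not say whether lowercase digits are also restored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 0..15 for an uppercase hexadecimal digit, -1 otherwise. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: if the len bytes at s start with "IL--" followed by
 * two hexadecimal digits, returns the byte value (0..255). Returns -1 when
 * the sequence should not be treated as a replaced illegal byte. */
int restore_illegal_byte(const char *s, size_t len)
{
    assert(s != NULL);
    if (len < 6 || memcmp(s, "IL--", 4) != 0)
        return -1;

    int hi = hex_digit_value(s[4]);
    int lo = hex_digit_value(s[5]);
    if (hi < 0 || lo < 0)
        return -1;   /* not a valid hexadecimal number: leave input as is */
    return hi * 16 + lo;
}
```

Under these assumptions, "IL--0A" restores to 10 and "IL--FF" to 255, while "IL--0G" is rejected and would be converted like any other input.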

CCONV_CONV_NON_IDENTICAL_DISCARD

During subsequent cconv(3C) code conversion, the conversion may encounter a non-identical character, that is, a character in the input buffer that is legal in the codeset specified by the fromcode argument but for which an identical character does not exist in the target codeset specified by the tocode argument. When this flag value is specified, the conversion discards such characters instead of writing the result of an implementation-defined conversion to the output buffer.

The number of such conversions is, nonetheless, still counted and returned as the return value of cconv(3C).

CCONV_CONV_NON_IDENTICAL_REPLACE_HEX

For non-identical characters, the cconv(3C) code conversion converts each byte of such characters into a hexadecimal number with a specific leading four-letter designator sequence. More specifically, each such hexadecimal number has a leading four-letter designator sequence of "NI--" followed by two hexadecimal digits in uppercase. For example, NI--02 for 2, NI--0C for 12, NI--EF for 239, and so on.

The number of such non-identical characters is counted and returned as the return value of cconv(3C).
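Because the replacement is applied to each byte, a multibyte non-identical character expands into several designator sequences. The sketch below illustrates this with a hypothetical helper, format_non_identical(), which is not part of the cconv interfaces. The three-byte input in the usage note is the UTF-8 encoding of the euro sign, chosen only as an example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "NI--XX" for every byte of one non-identical
 * character into out as a NUL-terminated string. Returns the number of
 * characters written (excluding the NUL), or 0 if out is too small. */
size_t format_non_identical(const unsigned char *ch, size_t chlen,
    char *out, size_t outsize)
{
    static const char hex[] = "0123456789ABCDEF";

    assert(ch != NULL && out != NULL);
    if (outsize < chlen * 6 + 1)
        return 0;

    for (size_t i = 0; i < chlen; i++) {
        char *p = out + i * 6;
        memcpy(p, "NI--", 4);
        p[4] = hex[ch[i] >> 4];     /* high nibble */
        p[5] = hex[ch[i] & 0x0F];   /* low nibble */
    }
    out[chlen * 6] = '\0';
    return chlen * 6;
}
```

For the bytes 0xE2 0x82 0xAC the helper produces "NI--E2NI--82NI--AC". In this model the input still counts as one non-identical character, even though it yields three designator sequences.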

CCONV_CONV_NON_IDENTICAL_RESTORE_HEX

When specified, the cconv(3C) code conversion converts back the above mentioned non-identical hexadecimal numbers into corresponding byte values regardless of the codeset specified by the tocode argument. For instance, NI--0B will be converted back into a byte with 11 as the value and NI--FF into 255.

If the characters following the leading four-letter designator sequence do not form a valid hexadecimal number, such a sequence will not be treated as a non-identical hexadecimal number.

CCONV_CONV_NON_IDENTICAL_TRANSLITERATE

For non-identical characters, if applicable, the cconv(3C) code conversion transliterates each of such characters into one or more characters of the target codeset best resembling the input character.

The number of such non-identical characters is counted and returned as the return value of cconv(3C).

When conflicting values are specified together, the values for discarding and then replacing into hexadecimal numbers will supersede other values specified.
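One reading of this precedence rule, shown here for the non-identical flags, is that discarding wins over hexadecimal replacement, which in turn wins over any other value. The sketch below models that reading only. The SK_* bit values and sk_pick_action() are hypothetical and stand in for the real CCONV_* constants, whose numeric values are not given in this page.

```c
#include <assert.h>

/* Hypothetical bit values for this sketch only. */
enum {
    SK_NON_IDENTICAL_DISCARD       = 1 << 0,
    SK_NON_IDENTICAL_REPLACE_HEX   = 1 << 1,
    SK_NON_IDENTICAL_TRANSLITERATE = 1 << 2,
    SK_ALL                         = (1 << 3) - 1
};

enum sk_action {
    SK_ACT_DEFAULT,        /* implementation-defined conversion */
    SK_ACT_DISCARD,
    SK_ACT_REPLACE_HEX,
    SK_ACT_TRANSLITERATE
};

/* Picks one action when several conflicting bits are set: discard first,
 * then hexadecimal replacement, then anything else. */
enum sk_action sk_pick_action(int flag)
{
    assert((flag & ~SK_ALL) == 0);   /* the sketch models three bits only */

    if (flag & SK_NON_IDENTICAL_DISCARD)
        return SK_ACT_DISCARD;
    if (flag & SK_NON_IDENTICAL_REPLACE_HEX)
        return SK_ACT_REPLACE_HEX;
    if (flag & SK_NON_IDENTICAL_TRANSLITERATE)
        return SK_ACT_TRANSLITERATE;
    return SK_ACT_DEFAULT;
}
```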

For state-dependent encodings, the conversion descriptor will be in a codeset-dependent initial shift state, ready for immediate use with the cconv(3C) function.

Settings of fromcode, tocode, variant_fromcode, variant_tocode, flag, and their permitted combinations are implementation-dependent.

The cconv_open() function supports aliases of the codeset names specified in tocode and fromcode. The relationship between canonical names and aliases, and the format of the external alias table, if any, are described in alias(5).

When an empty string ("") or "char" is supplied as the string value for fromcode, tocode, or both, it is interpreted by the function as the codeset name of the current locale. Similarly, when "wchar_t" is supplied, the function interprets it as the wide character encoding of the current locale in the natural byte order of the current processor or as defined by the locale.
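The handling of "" and "char" resembles asking nl_langinfo(3C) for the codeset of the current locale. The sketch below shows that analogy. It is an assumption that the result matches what cconv_open() resolves internally, and resolve_locale_codeset() is a hypothetical helper. The "wchar_t" case is not modeled.

```c
#include <assert.h>
#include <langinfo.h>
#include <string.h>

/* Hypothetical helper: maps "" and "char" to the codeset name of the
 * current locale and returns any other name unchanged. The returned
 * pointer for the locale case belongs to nl_langinfo() and may be
 * overwritten by later calls to it. */
const char *resolve_locale_codeset(const char *name)
{
    assert(name != NULL);
    if (name[0] == '\0' || strcmp(name, "char") == 0)
        return nl_langinfo(CODESET);
    return name;
}
```

An application would normally call setlocale(LC_ALL, "") before relying on the locale's codeset. Without that call, the program runs in the "C" locale.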

A conversion descriptor remains valid in a process until that process closes it.

See the EXAMPLES section of cconv(3C) for examples of how to use the function.

Return Values

Upon successful completion, cconv_open() returns a conversion descriptor for use on subsequent calls to cconv(). Otherwise, cconv_open() returns (cconv_t)-1 and sets errno to indicate the error.

Errors

The cconv_open() function may fail if:

EMFILE

{OPEN_MAX} file descriptors are currently open in the calling process.

ENFILE

Too many files are currently open in the system.

ENOMEM

Insufficient storage space is available.

EINVAL

The conversion specified by fromcode, tocode, variant_fromcode, variant_tocode, and flag is not supported by the implementation.

Files

/usr/lib/iconv/*.bt

cconv code conversion binary table files for iconv(1), cconv(3C), and iconv(3C).

/usr/lib/iconv/alias

Alias table file of codeset names.

Attributes

See attributes(7) for descriptions of the following attributes:

ATTRIBUTE TYPE         ATTRIBUTE VALUE
Interface Stability    Committed
MT-Level               MT-Safe

See Also

geniconvtbl(1), iconv(1), exec(2), cconv(3C), cconv_close(3C), cconvctl(3C), iconv(3C), iconv_close(3C), iconvctl(3C), iconvstr(3C), malloc(3C), nl_langinfo(3C), setlocale(3C), iconv.h(3HEAD), alias(5), geniconvtbl(5), geniconvtbl-cconv(5), attributes(7), iconv_extra(7), iconv_ja(7), iconv_ko(7), iconv_unicode(7), iconv_zh(7), iconv_zh_TW(7), standards(7)

Notes

Portable applications must assume that the conversion descriptors are not valid after a call to one of the exec functions (see exec(2)).