man pages section 3: Basic Library Functions

Updated: July 2014
 
 

iconv_open(3C)

Name

iconv_open - code conversion allocation function

Synopsis

#include <iconv.h>

iconv_t iconv_open(const char *tocode, const char *fromcode);

Description

The iconv_open() function returns a conversion descriptor that describes a conversion from the codeset specified by the string pointed to by the fromcode argument to the codeset specified by the string pointed to by the tocode argument. For state-dependent encodings, the conversion descriptor will be in a codeset-dependent initial shift state, ready for immediate use with the iconv(3C) function.

Settings of fromcode and tocode and their permitted combinations are implementation-dependent.

The iconv_open() function accepts aliases for the encoding names specified in tocode and fromcode. The alias table for encoding names is described in the file /usr/lib/iconv/alias. See alias(4).

When an empty string (“”) or char is supplied as the string value for the fromcode argument, the tocode argument, or both, the function interprets it as the codeset name of the current locale. Similarly, when wchar_t is supplied, the function interprets it as the wide character encoding of the current locale, in the natural byte order of the current processor or as defined by the locale.
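As a minimal sketch of the empty-string form, the helper below asks iconv_open() to take the source codeset from the current locale. The name open_from_locale() is hypothetical and chosen only for illustration. The caller is assumed to have called setlocale(3C) beforehand if a locale other than the default is wanted.

```c
#include <iconv.h>

/*
 * Hypothetical helper: open a conversion from the codeset of the
 * current locale to tocode. Passing "" as fromcode asks iconv_open()
 * to use the codeset name of the current locale.
 * Returns (iconv_t)-1 with errno set on failure.
 */
iconv_t
open_from_locale(const char *tocode)
{
    return (iconv_open(tocode, ""));
}
```

A caller would check the result against (iconv_t)-1 and release the descriptor with iconv_close(3C) when finished.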

When one or more of the following indicators are appended to the string values pointed to by the arguments, code conversion behavior is modified as specified below:

//ILLEGAL_DISCARD

When specified, during subsequent iconv() code conversion, a sequence of illegal input bytes that does not form a valid character in the codeset specified by the fromcode argument is silently discarded, as if no such illegal bytes were in the input buffer, and the conversion continues.

//ILLEGAL_REPLACE_HEX

For any illegal input bytes, the iconv() code conversion converts each such byte into a hexadecimal number with a specific leading four-letter designator sequence, as if it were a valid input byte, and the conversion continues. More specifically, each such hexadecimal number consists of the leading four-letter designator sequence “IL--” followed by two uppercase hexadecimal digits, for instance, “IL--01” for 1, “IL--0A” for 10, “IL--0B” for 11, “IL--EF” for 239, and so on.

//ILLEGAL_RESTORE_HEX

When specified, the iconv() code conversion simply converts back the above mentioned hexadecimal numbers for illegal input bytes into corresponding byte values regardless of the codeset specified by the tocode argument. For instance, “IL--0A” will be converted back into a byte with 10 as the value and “IL--FF” into 255.

If the characters following the leading four-letter designator sequence do not form a valid hexadecimal number, such a sequence will not be treated as a hexadecimal number for illegal input bytes.

//NON_IDENTICAL_DISCARD

During subsequent iconv() code conversion, the input buffer may contain a character that is legal in the codeset specified by the fromcode argument but has no identical character in the target codeset specified by the tocode argument. Such characters are called non-identical characters. With this indicator, the conversion discards them from the output buffer instead of doing an implementation-defined conversion.

The number of such conversions is, nonetheless, still counted and returned as the return value of iconv().

//NON_IDENTICAL_REPLACE_HEX

For non-identical characters, the iconv() code conversion converts each byte of such characters into a hexadecimal number with a specific leading four-letter designator sequence. More specifically, each such hexadecimal number consists of the leading four-letter designator sequence “NI--” followed by two uppercase hexadecimal digits, for instance, “NI--02” for 2, “NI--0C” for 12, “NI--EF” for 239, and so on.

The number of such non-identical characters is counted and returned as the return value of iconv().

//NON_IDENTICAL_RESTORE_HEX

When specified, the iconv() code conversion converts back the above mentioned non-identical hexadecimal numbers into corresponding byte values regardless of the codeset specified by the tocode argument. For instance, “NI--0B” will be converted back into a byte with 11 as the value and “NI--FF” into 255.

If the characters following the leading four-letter designator sequence do not form a valid hexadecimal number, such a sequence will not be treated as a non-identical hexadecimal number.

//NON_IDENTICAL_TRANSLITERATE

For non-identical characters, if applicable, the iconv() code conversion transliterates each such character into one or more characters of the target codeset that best resemble the input character.

The number of such non-identical characters is counted and returned as the return value of iconv().

//IGNORE

A convenience alias for the //NON_IDENTICAL_DISCARD and //ILLEGAL_DISCARD indicators.

//REPLACE_HEX

A convenience alias for the //NON_IDENTICAL_REPLACE_HEX and //ILLEGAL_REPLACE_HEX indicators.

//RESTORE_HEX

A convenience alias for the //NON_IDENTICAL_RESTORE_HEX and //ILLEGAL_RESTORE_HEX indicators.

//TRANSLIT

A convenience alias for the //NON_IDENTICAL_TRANSLITERATE indicator.
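The “IL--” and “NI--” designator format used by the REPLACE_HEX and RESTORE_HEX indicators can be illustrated with two small helpers. This is only a sketch of the text format described above, not the code that iconv() itself runs. The function names hex_designator() and parse_hex_designator() are hypothetical, and accepting only uppercase digits when parsing is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one byte as a six-character designator such as "IL--0A".
 * prefix is the four-letter sequence "IL--" or "NI--".
 * out must have room for 7 bytes (6 characters plus the NUL).
 */
void
hex_designator(const char *prefix, unsigned char byte, char out[7])
{
    (void) snprintf(out, 7, "%.4s%02X", prefix, (unsigned int)byte);
}

/*
 * Parse a designator back into its byte value.
 * Returns 0..255 on success, or -1 if s does not start with prefix
 * followed by two uppercase hexadecimal digits (assumption: this
 * sketch rejects lowercase digits).
 */
int
parse_hex_designator(const char *prefix, const char *s)
{
    static const char digits[] = "0123456789ABCDEF";
    const char *hi, *lo;

    if (strncmp(s, prefix, 4) != 0)
        return (-1);
    if (s[4] == '\0' || s[5] == '\0')
        return (-1);
    hi = strchr(digits, s[4]);
    lo = strchr(digits, s[5]);
    if (hi == NULL || lo == NULL)
        return (-1);
    return ((int)((hi - digits) * 16 + (lo - digits)));
}
```

With these helpers, byte value 10 formats as “IL--0A”, and “NI--FF” parses back to 255. A sequence such as “IL--ZZ” is rejected, mirroring the rule that an invalid hexadecimal number is not treated as a designator.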

When conflicting indicators are specified, the right-most indicator within an argument overrides those that precede it, and an indicator specified in the tocode argument overrides a conflicting one specified in the fromcode argument. As an example, in the following:

cd = iconv_open("UTF-8//IGNORE//REPLACE_HEX", "ISO8859-1//ILLEGAL_REPLACE_HEX");

Of the three indicators specified, //REPLACE_HEX is honored. For more details on the associated error numbers and return values of iconv(), see iconv(3C).
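The precedence rule can be modeled with a short sketch. It is a simplification that treats every indicator as conflicting with every other one, which holds for the example above but not in general, and it does not reflect how iconv_open() parses its arguments internally. The names last_indicator() and winning_indicator() are hypothetical.

```c
#include <string.h>

/*
 * Return a pointer to the right-most indicator in code (the text
 * after the last "//"), or NULL if code carries no indicator.
 */
static const char *
last_indicator(const char *code)
{
    const char *p = code;
    const char *last = NULL;

    while ((p = strstr(p, "//")) != NULL) {
        p += 2;         /* skip the slashes */
        last = p;
    }
    return (last);
}

/*
 * Simplified model: the right-most indicator in tocode wins; only if
 * tocode has none is the right-most indicator in fromcode used.
 */
const char *
winning_indicator(const char *tocode, const char *fromcode)
{
    const char *w = last_indicator(tocode);

    return (w != NULL ? w : last_indicator(fromcode));
}
```

Applied to the arguments in the example above, winning_indicator() returns a pointer to “REPLACE_HEX”.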

A conversion descriptor remains valid in a process until that process closes it.

For examples using the iconv_open() function, see the Examples section below and iconv(3C).

Return Values

Upon successful completion, iconv_open() returns a conversion descriptor for use on subsequent calls to iconv(). Otherwise, iconv_open() returns (iconv_t)-1 and sets errno to indicate the error.
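The following sketch shows one way a descriptor might be opened, checked, used, and released. The helper name convert_string() is hypothetical. The sketch assumes that iconv() declares its input buffer argument as char **, that the whole input fits in one call, and that the encodings involved are not state-dependent, so the reset call described in iconv(3C) is omitted.

```c
#include <iconv.h>
#include <string.h>

/*
 * Hypothetical helper: convert the NUL-terminated string in from
 * fromcode to tocode and store a NUL-terminated result in out.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the descriptor cannot be opened or the conversion fails.
 */
long
convert_string(const char *tocode, const char *fromcode,
    const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv() takes char **; input is only read */
    char *outp = out;
    size_t inleft = strlen(in);
    size_t outleft;
    size_t rv;

    if (outsize == 0)
        return (-1);
    outleft = outsize - 1;      /* reserve one byte for the NUL */

    cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return (-1);            /* errno was set by iconv_open() */

    rv = iconv(cd, &inp, &inleft, &outp, &outleft);
    (void) iconv_close(cd);     /* release the descriptor in all cases */

    if (rv == (size_t)-1)
        return (-1);
    *outp = '\0';
    return ((long)(outp - out));
}
```

A call such as convert_string("UTF-8", "ISO8859-1", text, buf, sizeof (buf)) would then convert text into buf, provided the implementation supports that pair of codeset names.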

Errors

The iconv_open() function may fail if:

EMFILE

{OPEN_MAX} file descriptors are currently open in the calling process.

ENFILE

Too many files are currently open in the system.

ENOMEM

Insufficient storage space is available.

EINVAL

The conversion specified by fromcode and tocode is not supported by the implementation.
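A caller might turn these error numbers into diagnostics along the following lines. Both helper names, iconv_open_reason() and probe_conversion(), are hypothetical, and the message strings are paraphrases of the descriptions above rather than text produced by the library.

```c
#include <errno.h>
#include <iconv.h>

/*
 * Hypothetical helper: map an errno value left by a failed
 * iconv_open() call to a short explanation.
 */
const char *
iconv_open_reason(int err)
{
    switch (err) {
    case EMFILE:
        return ("too many file descriptors open in this process");
    case ENFILE:
        return ("too many files open in the system");
    case ENOMEM:
        return ("insufficient storage space");
    case EINVAL:
        return ("conversion not supported");
    default:
        return ("unexpected error");
    }
}

/*
 * Hypothetical helper: test whether a conversion can be opened.
 * Returns 0 on success (the descriptor is closed again), otherwise
 * the errno value set by iconv_open().
 */
int
probe_conversion(const char *tocode, const char *fromcode)
{
    iconv_t cd = iconv_open(tocode, fromcode);

    if (cd == (iconv_t)-1)
        return (errno);
    (void) iconv_close(cd);
    return (0);
}
```

For instance, iconv_open_reason(probe_conversion(tocode, fromcode)) could be printed to stderr whenever probe_conversion() returns a nonzero value.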

Examples

Example 1 Use iconv_open() to open a simple code conversion.
#include <stdio.h>
#include <errno.h>
#include <iconv.h>

    :
iconv_t cd;
    :

/* Open an iconv code conversion from ISO 8859-1 to UTF-8. */
cd = iconv_open("UTF-8", "ISO8859-1");
if (cd == (iconv_t)-1) {
    (void) fprintf(stderr, "iconv_open(UTF-8, ISO8859-1) failed.\n");
    return (1);
}

Example 2 Change conversion behavior by supplying conversion behavior modification indicators.
#include <stdio.h>
#include <errno.h>
#include <iconv.h>

    :
iconv_t cd;
    :

/*
 * Open an iconv code conversion from UTF-8 to ISO 8859-1 with
 * conversion behavior modification indicators that will remove
 * illegal byte sequences and replace non-identicals into hexadecimal
 * number strings.
 */

cd = iconv_open("ISO8859-1//ILLEGAL_DISCARD//NON_IDENTICAL_REPLACE_HEX",
    "UTF-8");
if (cd == (iconv_t)-1) {
    (void) fprintf(stderr, "iconv_open(ISO8859-1, UTF-8) failed.\n");
    return (1);
}

Files

/usr/lib/iconv/alias

alias table file of the encoding name

Attributes

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE          ATTRIBUTE VALUE
Interface Stability     Committed
MT-Level                MT-Safe with exceptions
Standard                See standards(5).

The iconv_open() function is MT-Safe with exceptions if and only if fromcode, tocode, or both arguments point to a value that is “” (empty string), char, or wchar_t, since in such cases the function has to call nl_langinfo(3C) to determine the codeset of the current locale. See the Attributes and Notes sections of setlocale(3C) for more detail. Otherwise, it is fully MT-Safe.
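One way to stay within the fully MT-Safe case is to look up the locale codeset once, before other threads are started, and pass that explicit name to iconv_open() afterwards. The sketch below assumes that approach suits the application. The helper name snapshot_locale_codeset() is hypothetical.

```c
#include <iconv.h>
#include <langinfo.h>
#include <string.h>

/*
 * Hypothetical helper: copy the codeset name of the current locale
 * into buf. Intended to be called once, while the process is still
 * single-threaded and after any setlocale() call.
 * Returns 0 on success, -1 if the name is empty or does not fit.
 */
int
snapshot_locale_codeset(char *buf, size_t bufsize)
{
    const char *name = nl_langinfo(CODESET);
    size_t len = strlen(name);

    if (len == 0 || len >= bufsize)
        return (-1);
    (void) memcpy(buf, name, len + 1);  /* include the NUL */
    return (0);
}
```

Threads can then call iconv_open(tocode, saved_name) with the saved string instead of “” or char, so that iconv_open() is given an explicit codeset name.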

See Also

exec(2), iconv(3C), iconv_close(3C), malloc(3C), nl_langinfo(3C), setlocale(3C), alias(4), attributes(5), standards(5)

Notes

The iconv_open() function uses malloc(3C) to allocate space for internal buffer areas. iconv_open() may fail if there is insufficient storage space to accommodate these buffers.

Portable applications must assume that conversion descriptors are not valid after a call to one of the exec functions (see exec(2)).

Depending on the implementation of an individual code conversion, one or more (possibly all) conversion behavior modification indicators may not be supported by that code conversion, in which case iconv_open() may fail.
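Because an indicator may be unsupported by a particular code conversion, an application might try the indicator first and fall back to the plain conversion. The following is a minimal sketch of that idea. The helper name open_with_optional_indicator(), the applied output parameter, and the 128-byte buffer size are all assumptions made for illustration.

```c
#include <iconv.h>
#include <stdio.h>

/*
 * Hypothetical helper: try to open tocode with "//indicator" appended
 * and fall back to the plain tocode if that fails.
 * *applied is set to 1 if the indicator form was accepted, else 0.
 * Returns (iconv_t)-1 with errno set if both attempts fail.
 */
iconv_t
open_with_optional_indicator(const char *tocode, const char *fromcode,
    const char *indicator, int *applied)
{
    char spec[128];     /* assumed large enough for typical names */
    int n;
    iconv_t cd;

    *applied = 0;
    n = snprintf(spec, sizeof (spec), "%s//%s", tocode, indicator);
    if (n > 0 && (size_t)n < sizeof (spec)) {
        cd = iconv_open(spec, fromcode);
        if (cd != (iconv_t)-1) {
            *applied = 1;
            return (cd);
        }
    }
    /* Indicator rejected or name too long: use the plain conversion. */
    return (iconv_open(tocode, fromcode));
}
```

The applied flag lets the caller decide whether to handle illegal or non-identical input itself when the indicator was not accepted.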