International Language Environments Guide for Oracle® Solaris 11.2

Updated: July 2014
 
 

JPRS idnkit-2 Library

The idnkit-2 library is an open-source IDN implementation distributed under the idnkit-2 JPRS Public License. The dedicated idnkit-2 conversion utility, idnconv(1), provides IDN conversions with various options. For more information about the options that control the conversion details, see the idnconv(1) man page.

Oracle Solaris 11 also supports IDN conversions through the iconv(3C) interface by leveraging the conversion routines in libidnkit(3). The iconv(1) utility can also be used for conversions between ACE and UTF-8, as shown in the following table.

Because IDNA2008 explicitly defines terms for two operational modes, lookup and registration, corresponding iconv code conversion name aliases are also supplied: IDNA2008-LOOKUP (an alias for ACE-ALLOW-UNASSIGNED) and IDNA2008-REGIST (an alias for ACE).

Table 6-1  iconv IDN Code Conversions
From Code                                  To Code
ACE or IDNA2008-REGIST                     UTF-8
ACE-ALLOW-UNASSIGNED or IDNA2008-LOOKUP    UTF-8
UTF-8                                      ACE or IDNA2008-REGIST
UTF-8                                      ACE-ALLOW-UNASSIGNED or IDNA2008-LOOKUP

The ACE and the ACE-ALLOW-UNASSIGNED iconv code conversion names (and their aliases) have the following meanings:

  • ACE or IDNA2008-REGIST

    ACE is a fromcode or tocode name that can be used in iconv code conversions to refer to the ASCII Compatible Encoding defined in RFC 5890. This conversion uses STD3 ASCII rules and does not allow unassigned characters. ACE is typically used when storing host or domain names or when giving them to machines.

  • ACE-ALLOW-UNASSIGNED or IDNA2008-LOOKUP

    ACE-ALLOW-UNASSIGNED performs the same operations as ACE, except that it allows unassigned characters. ACE-ALLOW-UNASSIGNED is typically used for queries.
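To make the relationship between the two forms concrete, the following sketch converts a name between Unicode and its ASCII Compatible Encoding. It is an illustration only and does not use the Solaris iconv conversions: it relies on Python's built-in idna codec, which implements the older IDNA2003 rules (RFC 3490), so some names might convert differently than they do with libidnkit. The host name and the helper names to_ace and from_ace are hypothetical.

```python
# Illustration only: Python's built-in "idna" codec follows IDNA2003 (RFC 3490),
# not the IDNA2008 rules described in this section, so results can differ
# from libidnkit for some names.

def to_ace(name: str) -> str:
    """Convert a Unicode domain name to its ASCII Compatible Encoding form."""
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Convert an ACE domain name back to its Unicode form."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    unicode_name = "bücher.example"   # hypothetical host name
    ace_name = to_ace(unicode_name)
    print(ace_name)                   # xn--bcher-kva.example
    print(from_ace(ace_name))         # bücher.example
```

Each label that contains non-ASCII characters is encoded separately and gains the xn-- prefix, while labels that are already ASCII pass through unchanged.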

The following example shows a conversion from ACE to UTF-8 with input from the hostnames.txt file. Output goes to standard output.

$ iconv -f ACE -t UTF-8 hostnames.txt
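The same conversion can be driven from a script. The following is a minimal sketch, assuming that an iconv(1) utility with the ACE conversions from Table 6-1 is on the PATH. The helper names iconv_command and convert_file are hypothetical, and the check against the table is a convenience of this sketch rather than something that iconv requires.

```python
import subprocess

# Code conversion names from Table 6-1, including the IDNA2008 aliases.
ACE_NAMES = {"ACE", "IDNA2008-REGIST", "ACE-ALLOW-UNASSIGNED", "IDNA2008-LOOKUP"}


def iconv_command(fromcode, tocode, path):
    """Build the iconv(1) argument list for one of the Table 6-1 conversions."""
    pair_ok = (fromcode in ACE_NAMES and tocode == "UTF-8") or (
        fromcode == "UTF-8" and tocode in ACE_NAMES
    )
    if not pair_ok:
        raise ValueError(f"not an IDN conversion pair: {fromcode} -> {tocode}")
    return ["iconv", "-f", fromcode, "-t", tocode, path]


def convert_file(fromcode, tocode, path):
    """Run iconv(1) on a file and return the converted text."""
    result = subprocess.run(
        iconv_command(fromcode, tocode, path), capture_output=True, check=True
    )
    # ACE output is plain ASCII, so decoding as UTF-8 works in both directions.
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    try:
        print(convert_file("ACE", "UTF-8", "hostnames.txt"), end="")
    except (OSError, subprocess.CalledProcessError) as err:
        # Raised when iconv is missing, the file is unreadable, or the
        # installed iconv does not provide the ACE conversions.
        print(f"conversion failed: {err}")
```

Swapping the first two arguments, for example convert_file("UTF-8", "ACE", "hostnames.txt"), requests the reverse conversion from UTF-8 to ACE.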

For more information about the idnkit-2 library and iconv code conversions, see the libidnkit(3) and iconv_en_US.UTF-8(5) man pages.