Internationalizing and Localizing Applications in Oracle Solaris


Updated: July 2014
 
 

Wide-Character Classification Functions

The following functions classify wide characters. Each function returns a nonzero value for TRUE and 0 for FALSE. The functions check the given wide character against named character classes, such as alpha, lower, or jkana. Because these classes are defined in the LC_CTYPE category of the current locale, the functions are locale sensitive.

iswalpha()

Test for an alphabetic wide character

iswalnum()

Test for an alphanumeric wide character

iswascii()

Test whether a wide character represents a 7-bit US-ASCII character

iswblank()

Test for a blank wide character

iswcntrl()

Test for a control wide character

iswdigit()

Test for a decimal digit wide character

iswgraph()

Test for a visible wide character

iswlower()

Test for a lowercase letter wide character

iswprint()

Test for a printable wide character

iswpunct()

Test for a punctuation wide character

iswspace()

Test for a white-space wide character

iswupper()

Test for an uppercase letter wide character

iswxdigit()

Test for a hexadecimal digit wide character

isenglish()

Test for a wide character representing an English language character, excluding US-ASCII characters

isideogram()

Test for a wide character representing an ideographic language character, excluding US-ASCII characters

isnumber()

Test for a wide character representing a digit, excluding US-ASCII characters

isphonogram()

Test for a wide character representing a phonetic language character, excluding US-ASCII characters

isspecial()

Test for a wide character representing a special language character, excluding US-ASCII characters
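As a minimal sketch of how the standard classification functions might be used together, the following helper tallies a few character classes in a wide-character string. The names struct wc_tally and tally_classes() are hypothetical and are not part of any library. The results depend on the LC_CTYPE category of the current locale, so the caller is assumed to have called setlocale() beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper type: counts for a few character classes. */
struct wc_tally {
    size_t alpha;
    size_t digit;
    size_t space;
    size_t punct;
};

/*
 * Classify each wide character in s according to the LC_CTYPE category
 * of the current locale. The checks are chained with else-if, so each
 * character is counted in at most one field.
 */
struct wc_tally tally_classes(const wchar_t *s)
{
    struct wc_tally t = {0, 0, 0, 0};

    assert(s != NULL);

    for (; *s != L'\0'; s++) {
        wint_t wc = (wint_t)*s;

        if (iswalpha(wc))
            t.alpha++;
        else if (iswdigit(wc))
            t.digit++;
        else if (iswspace(wc))
            t.space++;
        else if (iswpunct(wc))
            t.punct++;
    }
    return t;
}
```

In the default C locale, tally_classes(L"ab 12!") counts two alphabetic characters, two digits, one white-space character, and one punctuation character. In other locales, the counts for non-ASCII input can differ.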

The following character classes are defined in all the locales:

  • alnum

  • alpha

  • blank

  • cntrl

  • digit

  • graph

  • lower

  • print

  • punct

  • space

  • upper

  • xdigit

The isenglish(), isideogram(), isnumber(), isphonogram(), and isspecial() functions are legacy wide-character classification functions that are specific to Oracle Solaris. The character classes for these functions are defined only in the following Asian locales and their variants: ko_KR.EUC, zh_CN.EUC, zh_CN.GBK, zh_CN.GB18030, zh_HK.BIG5HK, zh_TW.BIG5, and zh_TW.EUC. In all other locales, including Unicode locales, these functions always return false.

You can query for a specific character class in a generic way by using the following functions:

wctype()

Define character class

iswctype()

Test character for specified class
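The generic interface can be driven by class name, which makes table-driven code possible. The following sketch looks up each of the twelve standard class names with wctype() and reports the classes to which a wide character belongs. The names std_classes and list_classes() are hypothetical and are used here for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* The twelve class names that are defined in all locales. */
static const char *const std_classes[] = {
    "alnum", "alpha", "blank", "cntrl", "digit", "graph",
    "lower", "print", "punct", "space", "upper", "xdigit"
};

/*
 * Hypothetical helper: write the space-separated names of the standard
 * classes that wc belongs to into buf, and return how many matched.
 */
size_t list_classes(wint_t wc, char *buf, size_t bufsize)
{
    size_t i, matched = 0;

    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof (std_classes) / sizeof (std_classes[0]); i++) {
        wctype_t desc = wctype(std_classes[i]);

        /* wctype() returns 0 for a name the locale does not define. */
        if (desc == (wctype_t)0 || !iswctype(wc, desc))
            continue;

        /* Stop if the name (plus separator and NUL) would not fit. */
        if (strlen(buf) + strlen(std_classes[i]) + 2 > bufsize)
            break;
        if (matched > 0)
            strcat(buf, " ");
        strcat(buf, std_classes[i]);
        matched++;
    }
    return matched;
}
```

For the standard classes, iswctype(wc, wctype("alpha")) is equivalent to iswalpha(wc), and likewise for the other eleven names. In the default C locale, list_classes(L'A', buf, sizeof (buf)) stores "alnum alpha graph print upper xdigit" in buf.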

Example 2-11  Querying Character Class of a Wide Character

In the following example, the iswctype() and wctype() functions are used to check whether a given Unicode character belongs to the jhira character class. The jhira character class covers the Japanese Hiragana script.

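  #include <locale.h>
  #include <stdlib.h>
  #include <wchar.h>
  #include <wctype.h>

  wchar_t wc;     /* mbtowc() stores its result in a wchar_t */
  int     ret;

  setlocale(LC_ALL, "ja_JP.UTF-8");

  /* "\xe3\x81\xba" is UTF-8 for HIRAGANA LETTER PE */
  ret = mbtowc(&wc, "\xe3\x81\xba", 3);
  if (ret == -1) {
          /* Invalid character sequence. */
          :
  }

  /* %lc prints a wide character; %c would expect a narrow character. */
  if (iswctype(wc, wctype("jhira"))) {
          wprintf(L"%lc is a hiragana character.\n", (wint_t)wc);
  }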

The example will produce the following output:

ぺ is a hiragana character.
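Class names such as jhira are locale specific. If the current locale does not define the named class, wctype() returns 0, and iswctype() called with that value also returns 0, so an undefined class cannot be distinguished from a character that is simply not in the class. The following sketch separates the two cases. The function name in_named_class() and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper. Returns:
 *    1  if wc belongs to the class named class_name,
 *    0  if the class exists but wc is not a member,
 *   -1  if the current locale does not define class_name.
 */
int in_named_class(wint_t wc, const char *class_name)
{
    wctype_t desc;

    assert(class_name != NULL);

    /* The descriptor is valid only for the current LC_CTYPE setting. */
    desc = wctype(class_name);
    if (desc == (wctype_t)0)
        return -1;

    return iswctype(wc, desc) ? 1 : 0;
}
```

A caller that checks for the jhira class in this way can report an unsupported locale instead of silently treating every character as a nonmember.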