wctype_ja - Define a character class for the Japanese locale
#include <wchar.h>
wctype_t wctype(const char *charclass);
wctype() constructs a value of type wctype_t that describes the wide-character class named by the charclass argument. The actual classification is performed by iswctype(); wctype() returns the value that iswctype() takes as its second argument.
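The division of labor described above (look up the class once with wctype(), then test characters with iswctype()) can be sketched as follows. This is a minimal illustration, not part of the original page: count_in_class() is a hypothetical helper name, and <wctype.h> is included alongside <wchar.h> on the assumption that the build environment follows the ISO C placement of these declarations.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>   /* assumption: ISO C declares wctype()/iswctype() here */

/*
 * Hypothetical helper: count the wide characters in s that belong to
 * the class named by charclass in the current LC_CTYPE locale.
 * Returns -1 if the current locale does not define that class name,
 * which wctype() signals by returning (wctype_t)0.
 */
long count_in_class(const wchar_t *s, const char *charclass)
{
    wctype_t desc = wctype(charclass);   /* look the class up once */
    long n = 0;

    if (desc == (wctype_t)0)
        return -1;

    for (; *s != L'\0'; s++) {
        if (iswctype((wint_t)*s, desc))  /* test each character */
            n++;
    }
    return n;
}
```

A caller might write count_in_class(L"abC1", "alpha") to count the alphabetic characters in a string. Obtaining the wctype_t value once and reusing it avoids repeating the name lookup for every character.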
The following character class names are defined in every locale:
alnum alpha blank cntrl digit graph lower print punct space upper xdigit
In addition to the above character classes, the Japanese locale (ja, ja_JP.eucJP, ja_JP.PCK and ja_JP.UTF-8) defines the following character classes specific to the Japanese locale.
jkanji jkata hira jdigit jparen line jisx0201r jisx0208 jisx0212 udc vdc
The following character classes are supported in ja and ja_JP.eucJP locales only:
jalpha jspecial jgreek jrussian junit jsci jgen jpunct
The following character classes are supported in the ja_JP.eucJP and ja_JP.UTF-8 locales only:
ascii paren jisx0201 gaiji jhankana jspace
These can also be used as charclass arguments to wctype(). However, the use of these classes is limited to applications for the Japanese locale only:
Character class that represents any uppercase letter
Alphabet uppercase letters (C/1–D/10)
Roman character uppercase letters (3/33–3/58)
Greek character uppercase letters (6/1–24)
Russian character uppercase letters (7/1–33)
Greek alphabet uppercase letters with diacritical marks (6/65–69, 71, 73, 74, 76)
Cyrillic alphabet uppercase letters (7/34–46)
Latin alphabet uppercase letters (9/1, 2, 4, 6, 8, 9, 11, 12, 13, 15, 16)
Latin alphabet uppercase letters with diacritical marks (10/01–24, 26–87)
Character class that represents any lowercase letter
Alphabet lowercase letters (E/1–F/10)
Roman character lowercase letters (3/65–90)
Greek character lowercase letters (6/33–56)
Russian character lowercase letters (7/49–81)
Greek alphabet lowercase letters with diacritical marks (6/81–92)
Cyrillic alphabet lowercase letters (7/82–94)
Latin alphabet lowercase letters (9/33–48)
Latin alphabet lowercase letters with diacritical marks (11/1–27, 29–35, 37–87)
Class that determines the numbers 0 to 9 for decimal representation.
Numbers (B/0–9)
Class that determines a space.
Space (A/9–13)
Space characters
Space (1/1)
Class that determines symbols and special characters.
A/1–15, B/10–C/0, D/11–E/0, F/11–14
Class that determines control characters.
All characters
All characters
Class that determines field delimiters.
A/9
Space characters
Space (1/1)
Class that determines alphanumerics used for hexadecimal representation.
Numbers (B/0–9)
A–F, a–f (C/1–6, E/1–6)
Class that determines alphabets.
upper class and lower class letters
Class that determines printable characters.
Space characters
All the characters except in character undefined areas
All the characters except in character undefined areas
All the characters except in character undefined areas
All the characters except in character undefined areas in vdc class.
All the characters including character undefined areas in udc class.
Class that determines graphic characters.
All the characters in print class except those in space class.
Class that determines Kanji (symbol or ideographic characters used for Kanji representation).
Character defined areas from Ku 16 to Ku 84.
Character defined areas from Ku 16 to Ku 77.
Class that determines Katakana.
5/1–86, 1/11, 12, 19, 20
Class that determines Hiragana.
4/1–83, 1/11, 12, 21, 22, 26
Class that determines numbers except in digit.
3/16–25
Class that determines characters such as parentheses.
1/38–59
Class that determines ruled line primitives.
8/1–32
Class that determines characters included in JIS X 0201 Katakana character graphic set.
All the characters from A/1 to D/15.
Class that determines characters included in JIS X 0208.
All the characters including those in JIS X 0208 character undefined areas. From Ku 1 to Ku 84 (Ku 13 Vendor-defined character area is included).
Class that determines characters included in JIS X 0212.
All the characters including those in JIS X 0212 character undefined areas. From Ku 1 to Ku 84 (Ku 83 and 84 Vendor-defined character areas are also included). No characters in ja_JP.PCK locale are included in this class.
Class that determines user-defined characters.
All the characters including those in character undefined areas in the user-defined character area.
0xf5a1–0xfefe
0x8ff5a1–0x8ffefe
0xf040–0xf9fc
0xe000–0xf8ff
Class that determines vendor-defined characters.
All the characters including those in character undefined areas in the vendor-defined character area.
JIS X 0208 Ku 13: Special symbols
JIS X 0212 Ku 83 – 84
IBM Extended characters not included in JIS X 0212.
JIS X 0208 Ku 13: Special symbols
NEC-selective IBM Extended characters 0xed40–0xeffc
IBM Extended characters: 0xfa40–0xfcfc
Not defined
Class that determines alphabet letters.
3/33–58, 3/65–90
Class that determines special symbol characters.
1/2–94, 2/1–14, 2/26–33, 2/42–48, 2/60–74, 2/82–89, 94
2/15–25, 2/34–36, 2/75–81
IBM Extended characters
Special characters defined by NEC-selective IBM Extended characters
Class that determines Greek characters.
6/1–24, 6/33–56
Class that determines Russian characters.
7/1–33, 7/49–81
Class that determines unit symbols.
1/75–83, 2/82, 83
2/80
Class that determines scientific symbols.
1/60–74, 2/26–33, 2/42–48, 2/60–74
Class that determines general symbols.
1/84–94, 2/1–14, 2/84–89, 94
2/35, 75, 2/79–81
Class that determines punctuation symbols.
1/2–37
2/34, 36
Class that determines JIS X 0201 Functional character set, Space characters, Roman character graphic set, and Kill characters.
Class that determines characters such as parentheses.
Class that determines characters included in JIS X 0212.
Class that determines implementation-defined characters. The udc and vdc classes are included.
Class that determines characters used for Japanese representation included in JIS X 0212.
Class that determines space characters included in JIS X 0208 and JIS X 0212.
XX/YY in JIS X 0201 Functional character set, Roman character graphic set, and Katakana character graphic set denotes Column XX and Row YY. XX/YY in JIS X 0208 and JIS X 0212 denotes Ku XX and Point YY.
In case of JIS X 0212 characters, this rule only applies to ja or ja_JP.UTF-8 locale.
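To make the Ku/Point notation concrete, the sketch below converts a JIS X 0208 Ku/Point pair to a two-byte EUC-JP code value. It relies on the usual EUC-JP convention that each byte is 0xA0 plus the Ku or Point number; that convention is an assumption brought in for illustration and is not stated on this page, and eucjp_from_kuten() is a hypothetical helper name.

```c
/*
 * Hypothetical helper: map a JIS X 0208 Ku/Point pair (each 1..94)
 * to the corresponding two-byte EUC-JP code value.
 * Assumption: EUC-JP encodes Ku as (0xA0 + ku) in the first byte and
 * Point as (0xA0 + ten) in the second byte.
 * Returns 0 for an out-of-range Ku or Point.
 */
unsigned int eucjp_from_kuten(int ku, int ten)
{
    if (ku < 1 || ku > 94 || ten < 1 || ten > 94)
        return 0;

    return ((0xA0u + (unsigned int)ku) << 8) | (0xA0u + (unsigned int)ten);
}
```

Under this assumption, Ku 85 Point 1 maps to 0xf5a1 and Ku 94 Point 94 maps to 0xfefe, which appears to correspond to the first range listed for the udc class above.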
Example 1 Determining if a Wide Character Is Included in a Class
The following example shows how to determine if the wide character wc is included in udc class.
iswctype(wc, wctype("udc"))
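Because the Japanese-specific classes exist only in certain locales, a fuller version of this check may want to select the locale first and confirm that the class name is defined there. The following is a minimal sketch under those assumptions: in_class_for_locale() is a hypothetical helper name, the 256-byte buffer for the saved locale name is an arbitrary choice, and whether a locale such as ja_JP.eucJP is installed depends on the system.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>   /* assumption: ISO C declares wctype()/iswctype() here */

/*
 * Hypothetical helper. Returns:
 *    1  wc belongs to charclass in the given locale
 *    0  wc does not belong to charclass
 *   -1  the locale could not be selected
 *   -2  the locale does not define charclass (wctype() returned 0)
 * The caller's LC_CTYPE setting is restored before returning.
 */
int in_class_for_locale(const char *locale_name, const char *charclass,
                        wint_t wc)
{
    char saved[256];   /* arbitrary size for the saved locale name */
    const char *cur = setlocale(LC_CTYPE, NULL);
    wctype_t desc;
    int result;

    /* Copy the name: later setlocale() calls may overwrite the string. */
    strncpy(saved, cur != NULL ? cur : "C", sizeof saved - 1);
    saved[sizeof saved - 1] = '\0';

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    desc = wctype(charclass);
    if (desc == (wctype_t)0)
        result = -2;
    else
        result = iswctype(wc, desc) ? 1 : 0;

    setlocale(LC_CTYPE, saved);
    return result;
}
```

For instance, in_class_for_locale("ja_JP.eucJP", "udc", wc) would report whether wc is a user-defined character, provided that locale is available. Distinguishing the -2 case matters because iswctype() called with a zero class value simply reports that the character is not in the class.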
See attributes(5) for descriptions of the following attributes:
iswctype(3C), wctype(3C), wctrans_ja(3C), eucJP(5), PCK(5)