man pages section 3: Basic Library Functions


Updated: July 2014
 
 

wctype_ja (3C)

Name

wctype_ja - Define a character class for the Japanese locale

Synopsis

#include <wchar.h> 

wctype_t wctype(const char *charclass);

Description

wctype() constructs a value of type wctype_t that identifies the wide-character class named by the charclass argument. The actual classification is performed by iswctype(); wctype() returns the value that iswctype() needs as its second argument.
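As a minimal sketch of this two-step pattern, the following function looks up a class once with wctype() and reuses the result with iswctype() for every character of a string. The helper name count_in_class is hypothetical and not part of any library. The sketch also includes <wctype.h>, where ISO C declares both functions.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: count the characters of s that belong to the
 * class named by charclass in the current LC_CTYPE locale.
 * Returns -1 if the class name is not valid in this locale.
 */
long
count_in_class(const wchar_t *s, const char *charclass)
{
	wctype_t desc = wctype(charclass);	/* look up the class once */
	long n = 0;

	if (desc == (wctype_t)0)
		return (-1);			/* unknown class name */

	for (; *s != L'\0'; s++) {
		if (iswctype((wint_t)*s, desc))	/* actual determination */
			n++;
	}
	return (n);
}
```

Because the value returned by wctype() depends on the LC_CTYPE category, it should be obtained again after any call to setlocale() that changes that category.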

The following character class names are defined in every locale:


alnum		alpha		blank		cntrl
digit		graph		lower		print
punct		space		upper		xdigit

In addition to the above character classes, the Japanese locale (ja, ja_JP.eucJP, ja_JP.PCK and ja_JP.UTF-8) defines the following character classes specific to the Japanese locale.


jkanji      jkata      jhira     jdigit
jparen      line       jisx0201r jisx0208
jisx0212    udc        vdc

The following character classes are supported in ja and ja_JP.eucJP locales only:


jalpha      jspecial    jgreek  jrussian
junit       jsci        jgen    jpunct

The following character classes are supported in the ja_JP.eucJP and ja_JP.UTF-8 locales only:


ascii      paren      jisx0201
gaiji      jhankana   jspace
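Because the set of available class names differs between locales, an application may want to probe for a name before relying on it. The following is a minimal sketch, assuming only that wctype() returns 0 for a name that the current LC_CTYPE locale does not define. The helper names class_is_defined and report_classes are hypothetical.

```c
#include <stdio.h>
#include <wctype.h>

/*
 * Hypothetical helper: return 1 if the current LC_CTYPE locale defines
 * the named character class, 0 otherwise.
 */
int
class_is_defined(const char *name)
{
	return (wctype(name) != (wctype_t)0);
}

/* Print the availability of a few of the class names listed above. */
void
report_classes(void)
{
	static const char *names[] = {
		"alnum", "jkanji", "jkata", "jalpha", "jhankana", "udc"
	};
	size_t i;

	for (i = 0; i < sizeof (names) / sizeof (names[0]); i++)
		printf("%-10s %s\n", names[i],
		    class_is_defined(names[i]) ? "defined" : "not defined");
}
```

Call setlocale(LC_CTYPE, "") before report_classes() so that the report reflects the locale selected in the environment rather than the default C locale.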

The Japanese-specific class names can also be used as charclass arguments to wctype(). However, the use of these classes is limited to applications for the Japanese locale. The classes are defined as follows:

upper

Character class that represents any uppercase letter

JIS X 0201 Roman character graphic set

Alphabet uppercase letters (C/1–D/10)

JIS X 0208

Roman character uppercase letters (3/33–3/58)

Greek character uppercase letters (6/1–24)

Russian character uppercase letters (7/1–33)

JIS X 0212

Greek alphabet uppercase letters with diacritical marks (6/65–69, 71, 73, 74, 76)

Cyrillic alphabet uppercase letters (7/34–46)

Latin alphabet uppercase letters (9/1, 2, 4, 6, 8, 9, 11, 12, 13, 15, 16)

Latin alphabet uppercase letters with diacritical marks (10/1–24, 26–87)

lower

Character class that represents any lowercase letter

JIS X 0201 Roman character graphic set

Alphabet lowercase letters (E/1–F/10)

JIS X 0208

Roman character lowercase letters (3/65–90)

Greek character lowercase letters (6/33–56)

Russian character lowercase letters (7/49–81)

JIS X 0212

Greek alphabet lowercase letters with diacritical marks (6/81–92)

Cyrillic alphabet lowercase letters (7/82–94)

Latin alphabet lowercase letters (9/33–48)

Latin alphabet lowercase letters with diacritical marks (11/1–27, 29–35, 37–87)

digit

Class that determines the digits 0 to 9 used for decimal representation.

JIS X 0201 Roman character graphic set

Numbers (B/0–9)

space

Class that determines a space.

JIS X 0201 Control character set

Space (A/9–13)

Space characters

JIS X 0208

Space (1/1)

punct

Class that determines symbols and special characters.

JIS X 0201 Roman character graphic set

A/1–15, B/10–C/0, D/11–E/0, F/11–14

cntrl

Class that determines control characters.

JIS X 0201 Control character set

All characters

DELETE character

 

C1 control characters

All characters

blank

Class that determines field delimiters.

JIS X 0201 Control character set

A/9

Space characters

JIS X 0208

Space (1/1)

xdigit

Class that determines alphanumerics used for hexadecimal representation.

JIS X 0201 Roman character graphic set

Numbers (B/0–9)

A–F, a–f (C/1–6, E/1–6)

alpha

Class that determines alphabetic characters.

All the characters in the upper class and the lower class.

print

Class that determines printable characters.

JIS X 0201 Roman character graphic set

Space characters

JIS X 0201 Katakana character graphic set

All the characters except in character undefined areas

JIS X 0208

All the characters except in character undefined areas

JIS X 0212

All the characters except in character undefined areas

Vendor-defined character areas

All the characters except in character undefined areas in vdc class.

User-defined character areas

All the characters including character undefined areas in udc class.

graph

Class that determines graphic characters.

All the characters in print class except those in space class.
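The relationship between graph, print, and space can be written directly in terms of iswctype(). The following is a sketch only; the helper name is_graph_by_definition is hypothetical, and an application would normally just test the graph class itself.

```c
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: return 1 if wc is in the print class but not in
 * the space class, which is how the graph class is described above.
 */
int
is_graph_by_definition(wint_t wc)
{
	return (iswctype(wc, wctype("print")) &&
	    !iswctype(wc, wctype("space")));
}
```

For a given wc, the result is expected to agree with iswctype(wc, wctype("graph")) != 0.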

jkanji

Class that determines Kanji (symbol or ideographic characters used for Kanji representation).

JIS X 0208

Character defined areas from Ku 16 to Ku 84.

JIS X 0212

Character defined areas from Ku 16 to Ku 77.

jkata

Class that determines Katakana.

JIS X 0208

5/1–86, 1/11, 12, 19, 20

jhira

Class that determines Hiragana.

JIS X 0208

4/1–83, 1/11, 12, 21, 22, 26

jdigit

Class that determines numeric characters other than those in the digit class.

JIS X 0208

3/16–25

jparen

Class that determines characters such as parentheses.

JIS X 0208

1/38–59

line

Class that determines ruled line primitives.

JIS X 0208

8/1–32

jisx0201r

Class that determines characters included in JIS X 0201 Katakana character graphic set.

JIS X 0201 Katakana character graphic set

All the characters from A/1 to D/15.

jisx0208

Class that determines characters included in JIS X 0208.

All the characters including those in JIS X 0208 character undefined areas. From Ku 1 to Ku 84 (Ku 13 Vendor-defined character area is included).

jisx0212

Class that determines characters included in JIS X 0212.

All the characters including those in JIS X 0212 character undefined areas. From Ku 1 to Ku 84 (Ku 83 and 84 Vendor-defined character areas are also included). No characters in ja_JP.PCK locale are included in this class.

udc

Class that determines user-defined characters.

All the characters including those in character undefined areas in the user-defined character area.

ja locale
User-defined characters (Ku 1–20)

0xf5a1–0xfefe

0x8ff5a1–0x8ffefe

ja_JP.PCK locale
User-defined characters (Ku 1–20)

0xf040–0xf9fc

ja_JP.UTF-8 locale
User-defined characters (6400 characters)

0xe000–0xf8ff
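The ranges above are given as encoded character values. The following sketch checks such a value against each range byte by byte, which is stricter than a plain numeric comparison for the multibyte encodings. The function names are hypothetical. Two points are assumptions not stated on this page: the exclusion of trail byte 0x7f for PCK follows the usual Shift_JIS encoding rules, and the mapping between these encoded values and wchar_t values is implementation-specific. In an application, iswctype(wc, wctype("udc")) remains the supported test.

```c
#include <stdint.h>

/*
 * ja and ja_JP.eucJP: two-byte code 0xf5a1-0xfefe, or the same range
 * behind the 0x8f prefix (0x8ff5a1-0x8ffefe).
 */
int
udc_code_eucjp(uint32_t code)
{
	uint32_t lead, trail;

	if ((code >> 16) != 0 && (code >> 16) != 0x8f)
		return (0);
	lead = (code >> 8) & 0xff;
	trail = code & 0xff;
	return (lead >= 0xf5 && lead <= 0xfe &&
	    trail >= 0xa1 && trail <= 0xfe);
}

/* ja_JP.PCK: 0xf040-0xf9fc (trail byte 0x7f excluded by assumption). */
int
udc_code_pck(uint32_t code)
{
	uint32_t lead = code >> 8;
	uint32_t trail = code & 0xff;

	return (lead >= 0xf0 && lead <= 0xf9 &&
	    trail >= 0x40 && trail <= 0xfc && trail != 0x7f);
}

/* ja_JP.UTF-8: 0xe000-0xf8ff, which spans 6400 values. */
int
udc_code_unicode(uint32_t cp)
{
	return (cp >= 0xe000 && cp <= 0xf8ff);
}
```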

vdc

Class that determines vendor-defined characters.

All the characters including those in character undefined areas in the vendor-defined character area.

ja and ja_JP.eucJP locale

JIS X 0208 Ku 13: Special symbols

JIS X 0212 Ku 83 – 84

IBM Extended characters not included in JIS X 0212.

ja_JP.PCK locale

JIS X 0208 Ku 13: Special symbols

NEC-selective IBM Extended characters 0xed40–0xeffc

IBM Extended characters: 0xfa40–0xfcfc

ja_JP.UTF-8 locale

Not defined

jalpha

Class that determines alphabet letters.

JIS X 0208

3/33–58, 3/65–90

jspecial

Class that determines special symbol characters.

JIS X 0208

1/2–94, 2/1–14, 2/26–33, 2/42–48, 2/60–74, 2/82–89, 94

JIS X 0212

2/15–25, 2/34–36, 2/75–81

JIS X 0208 Ku 13: Special symbols

IBM Extended characters

Special characters defined by NEC-selective IBM Extended characters

jgreek

Class that determines Greek characters.

JIS X 0208

6/1–24, 6/33–56

jrussian

Class that determines Russian characters.

JIS X 0208

7/1–7/33, 7/49–81

junit

Class that determines unit symbols.

JIS X 0208

1/75–83, 2/82, 83

JIS X 0212

2/80

jsci

Class that determines scientific symbols.

JIS X 0208

1/60–74, 2/26–33, 2/42–48, 2/60–74

jgen

Class that determines general symbols.

JIS X 0208

1/84–94, 2/1–14, 2/84–89, 94

JIS X 0212

2/35, 75, 2/79–81

jpunct

Class that determines punctuation symbols.

JIS X 0208

1/2–37

JIS X 0212

2/34, 36

ascii

Class that determines JIS X 0201 Functional character set, Space characters, Roman character graphic set, and Kill characters.

paren

Class that determines characters such as parentheses.

jisx0201

Class that determines characters included in JIS X 0201.

gaiji

Class that determines implementation-defined characters. The udc and vdc classes are included.

jhankana

Class that determines characters used for Japanese representation included in JIS X 0201.

jspace

Class that determines space characters included in JIS X 0208 and JIS X 0212.

XX/YY in JIS X 0201 Functional character set, Roman character graphic set, and Katakana character graphic set denotes Column XX and Row YY. XX/YY in JIS X 0208 and JIS X 0212 denotes Ku XX and Point YY.

For JIS X 0212 characters, this rule applies only to the ja and ja_JP.UTF-8 locales.
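To relate the Ku/Point notation to encoded values, the following sketch converts a Ku and Point pair to its EUC-JP code, assuming the standard EUC-JP layout in which each byte is 0xa0 plus the Ku or Point number and JIS X 0212 characters carry a leading 0x8f byte. The function names are hypothetical. For example, Ku 85, Point 1 yields 0xf5a1, which matches the start of the udc range listed for the ja locale.

```c
#include <stdint.h>

/*
 * Hypothetical helper: EUC-JP code for the JIS X 0208 character at
 * Ku ku, Point ten (both 1-94). Returns 0 if either is out of range.
 */
uint32_t
jisx0208_to_eucjp(int ku, int ten)
{
	if (ku < 1 || ku > 94 || ten < 1 || ten > 94)
		return (0);
	return (((uint32_t)(0xa0 + ku) << 8) | (uint32_t)(0xa0 + ten));
}

/* Same for JIS X 0212, which EUC-JP places behind the 0x8f prefix. */
uint32_t
jisx0212_to_eucjp(int ku, int ten)
{
	uint32_t two = jisx0208_to_eucjp(ku, ten);

	return (two == 0 ? 0 : (0x8f0000u | two));
}
```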

Example 1 Determining if a Wide Character Is Included in a Class

The following example shows how to determine if the wide character wc is included in udc class.

iswctype(wc, wctype("udc"))
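A slightly fuller sketch of the same check follows. It selects a locale first and reports an error when the locale is unavailable or does not define the udc class. The function name is_udc_in_locale is hypothetical; a typical locale_name argument would be one of the Japanese locales named above, such as "ja_JP.UTF-8".

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: return 1 if wc is in the udc class, 0 if it is
 * not, and -1 if the requested locale cannot be selected or does not
 * define the udc class.
 */
int
is_udc_in_locale(const char *locale_name, wint_t wc)
{
	wctype_t udc;

	if (setlocale(LC_CTYPE, locale_name) == NULL)
		return (-1);		/* locale not available */

	udc = wctype("udc");
	if (udc == (wctype_t)0)
		return (-1);		/* class not defined here */

	return (iswctype(wc, udc) != 0);
}
```

Note that this helper changes the LC_CTYPE category for the whole process. A real application would usually call setlocale() once at startup and then call wctype() and iswctype() as needed.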

Attributes

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE		ATTRIBUTE VALUE
Availability		system/locale for ja_JP.UTF-8, system/locale/extra for ja_JP.eucJP and ja_JP.PCK

See Also

iswctype(3C), wctype(3C), wctrans_ja(3C), eucJP(5), PCK(5)