This chapter contains information for making programs backward-compatible with earlier versions of Asian Solaris software. Every utility described is supported. For this version of Solaris, you are encouraged to use the XPG4 internationalization APIs described in the International Language Environments Guide.
These utilities test various aspects of the Simplified Chinese (GB-2312-80) national standard character set. They also assume that the character being tested is part of the national standard character set.
The argument to each function in these tables must be a wide character (WC) of type wchar_t. For more information, see the cctype(3x) man page.
Table 12–1 Simplified Chinese Character Classification Functions
Routine | Description |
---|---|
ischanzi | Returns true if it is a Hanzi ideogram in GB-2312-80. |
iscaccent | Returns true if it is an accent notation in GB-2312-80. |
iscphonetic | Returns true if it is a phonetic symbol in GB-2312-80. |
iscpinyin | Returns true if it is a Pinyin symbol in GB-2312-80. |
iscalpha | Returns true if it is a Roman alphabetic in GB-2312-80. |
iscdigit | Returns true if it is a Roman digit in GB-2312-80. |
iscnumber | Returns true if it is a number in GB-2312-80. |
isclower | Returns true if it is a Roman lowercase in GB-2312-80. |
iscupper | Returns true if it is a Roman uppercase in GB-2312-80. |
iscblank | Returns true if it is a white space character from GB-2312-80. |
iscspace | Returns true if it is a space character from GB-2312-80. |
iscgen | Returns true if it is a graphic or general symbol in GB-2312-80. |
iscsci | Returns true if it is a scientific symbol in GB-2312-80. |
iscline | Returns true if it is a ruled line symbol in GB-2312-80. |
iscunit | Returns true if it is a unit character in GB-2312-80. |
iscparen | Returns true if it is a right or left parenthesis in GB-2312-80. |
iscpunct | Returns true if it is a punctuation character in GB-2312-80. |
iscgreek | Returns true if it is a Greek character in GB-2312-80. |
iscrussian | Returns true if it is a Russian character in GB-2312-80. |
iscspecial | Returns true if it is a Greek or Russian character in GB-2312-80. |
ischira | Returns true if it is a Japanese Hiragana character in GB-2312-80. |
isckata | Returns true if it is a Japanese Katakana character in GB-2312-80. |
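To give a feel for what classification of this kind involves, the following is a minimal sketch that classifies a two-byte EUC-encoded GB-2312-80 character by its row. It is not the libcle implementation, and it works on EUC bytes rather than on the wchar_t values that the routines in Table 12–1 expect. All function names are hypothetical. The sketch relies only on the layout of the standard: Hiragana in row 4, Katakana in row 5, Greek in row 6, Russian in row 7, and Hanzi in rows 16 through 87. Because it checks rows only, it also accepts the few unassigned cells inside those rows.

```c
#include <stdbool.h>

/* Hypothetical helpers, not the libcle routines. They classify a two-byte
 * EUC-encoded GB-2312-80 character by row (lead byte minus 0xA0). */

/* Row number 1..94 of an EUC lead byte, or 0 if the byte is out of range. */
int gb_euc_row(unsigned char hi)
{
    return (hi >= 0xA1 && hi <= 0xFE) ? hi - 0xA0 : 0;
}

/* True if lo is a valid EUC trail byte (cell 1..94). */
bool gb_euc_trail_ok(unsigned char lo)
{
    return lo >= 0xA1 && lo <= 0xFE;
}

/* Rows 16..87 hold the Hanzi ideograms. */
bool gb_euc_is_hanzi(unsigned char hi, unsigned char lo)
{
    int row = gb_euc_row(hi);
    return gb_euc_trail_ok(lo) && row >= 16 && row <= 87;
}

/* Row 4 holds Hiragana, row 5 Katakana. */
bool gb_euc_is_hiragana(unsigned char hi, unsigned char lo)
{
    return gb_euc_trail_ok(lo) && gb_euc_row(hi) == 4;
}

bool gb_euc_is_katakana(unsigned char hi, unsigned char lo)
{
    return gb_euc_trail_ok(lo) && gb_euc_row(hi) == 5;
}

/* Row 6 holds Greek, row 7 Russian (Cyrillic). */
bool gb_euc_is_greek(unsigned char hi, unsigned char lo)
{
    return gb_euc_trail_ok(lo) && gb_euc_row(hi) == 6;
}

bool gb_euc_is_russian(unsigned char hi, unsigned char lo)
{
    return gb_euc_trail_ok(lo) && gb_euc_row(hi) == 7;
}
```

For example, gb_euc_is_hanzi(0xD6, 0xD0) is true, because the bytes 0xD6 0xD0 fall in row 54.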
Two additional routines for Simplified Chinese, iscgb and isceuc, test for characters from the GB-2312-80 character set. The iscgb routine expects a wide character, and isceuc expects a GB-2312-80 character in EUC format. For more information, see the cctype(3x) man page.
Table 12–2 Simplified Chinese General Character Classification Functions
Routine | Description |
---|---|
iscgb | Returns true if it is in GB-2312-80. |
isceuc | Returns true if it is a GB-2312-80 character in EUC format. |
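As an illustration of what "a GB-2312-80 character in EUC format" means at the byte level, the sketch below walks a buffer that mixes ASCII with two-byte EUC characters and counts the characters it finds. The function name is hypothetical and the code is not taken from libcle. It assumes that only code set 0 (ASCII) and code set 1 (GB-2312-80) appear, so it treats the EUC single-shift bytes for code sets 2 and 3 as errors.

```c
#include <stddef.h>

/* Hypothetical helper, not a libcle routine. Counts the characters in a
 * buffer of EUC text made of ASCII bytes (below 0x80) and two-byte
 * GB-2312-80 characters (both bytes in 0xA1..0xFE).
 * Returns the character count, or -1 if the buffer is malformed. */
long gb_euc_count_chars(const unsigned char *s, size_t len)
{
    long count = 0;
    size_t i = 0;

    while (i < len) {
        if (s[i] < 0x80) {
            i += 1;                       /* code set 0: one byte */
        } else if (s[i] >= 0xA1 && s[i] <= 0xFE &&
                   i + 1 < len &&
                   s[i + 1] >= 0xA1 && s[i + 1] <= 0xFE) {
            i += 2;                       /* code set 1: two bytes */
        } else {
            return -1;                    /* stray or truncated byte */
        }
        count++;
    }
    return count;
}
```

A truncated final character, or a lead byte followed by an ASCII byte, makes the function return -1 rather than guess at the intended text.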
This section describes functions for wide character and string input and output, character classification, and conversion functions for the Simplified Chinese character sets. Solaris 2.7 software implements a wide character library for handling Simplified Chinese character codes according to industry standards.
Routines with a Chinese language-specific dependency are in their own language-specific library, which is linked with the corresponding C compiler option. For Simplified Chinese Solaris, the library is libcle, which is linked with -lcle.
Refer to the appropriate man pages for more information.
Asian Solaris software defines WC as a constant-width, four-byte code. WC uses the ANSI C data type wchar_t, which Solaris software defines in wchar.h as follows:
typedef long wchar_t;
In Solaris software, long is four bytes.
The conversion functions described in this section are available, but the standard iconv() function is the preferred interface for new code.
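The following is a minimal sketch of a code set conversion done through the standard iconv_open(), iconv(), and iconv_close() calls. The wrapper name gb_to_utf8 is hypothetical, and the code set names "GB2312" and "UTF-8" are assumptions: the names accepted by iconv_open() vary between implementations, so check the ones your system supports.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper around the standard iconv interface.
 * Converts inlen bytes of EUC-encoded GB-2312-80 text to UTF-8.
 * Returns the number of bytes written to out, or -1 on any failure
 * (unsupported code set name, invalid input, or output buffer too small).
 * The code set names passed to iconv_open() are assumptions. */
long gb_to_utf8(const char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd = iconv_open("UTF-8", "GB2312");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;       /* iconv() takes a non-const pointer */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1 || inleft != 0)
        return -1;
    return (long)(outcap - outleft);
}
```

The function opens and closes a conversion descriptor on every call, which keeps the sketch short. A program that converts many buffers would normally keep one descriptor open and reuse it.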
Simplified Chinese Solaris software provides facilities for various conversions, for example:
Between characters within a code set, such as converting uppercase ASCII to lowercase.
Between different conventions for national standard character sets, such as GB and EUC.
Between code formats, such as converting between EUC and WC.
Programs using the general multibyte conversion utilities should include three header files: wctype.h, widec.h, and zh/xctype.h. The locale-specific header, locale/xctype.h (zh/xctype.h for Simplified Chinese), declares the Chinese locale-specific routines, which have names of the form iscxxx.
As with the classification functions described in the previous section, the use of these functions can be controlled by the setlocale function (described elsewhere in this and other chapters).
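The sketch below shows the general pattern of selecting a locale with setlocale() before calling locale-sensitive routines. It uses the standard wide-character function iswalpha() rather than the Chinese-specific routines, so it does not need zh/xctype.h or libcle. The helper names init_locale and count_alpha are hypothetical.

```c
#include <locale.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Selects the locale named by the environment, falling back to "C" if
 * that locale is not installed. Returns the name of the locale in effect. */
const char *init_locale(void)
{
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL)
        name = setlocale(LC_ALL, "C");
    return name;
}

/* Counts the alphabetic wide characters in a NUL-terminated wide string,
 * as judged by the LC_CTYPE category of the current locale. */
size_t count_alpha(const wchar_t *ws)
{
    size_t n = 0;

    for (; *ws != L'\0'; ws++) {
        if (iswalpha((wint_t)*ws))
            n++;
    }
    return n;
}
```

Calling init_locale() once at program start is usually enough. The classification result for non-ASCII characters then depends on the locale that was selected.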
Locale-specific conversion routines (such as Chinese cgbtoeuc) are contained in the libcle library, which can be linked during compilation using the C compiler option -lcle.
The multibyte conversion functions are similar to the one-byte conversion functions toupper and tolower. These functions convert wide characters to other wide characters. For more information on conversion routines, see the man pages for wconv(3) and cconv(3).
The following routines are in the regular Chinese C library.
Table 12–3 Simplified Chinese Case Conversion Functions (declared in zh/xctype.h)
Function | Description |
---|---|
tocupper | Converts code set 1 Roman lowercase to uppercase |
toclower | Converts code set 1 Roman uppercase to lowercase |
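The idea behind case conversion in code set 1 can be sketched on EUC bytes. In GB-2312-80, the Roman letters of code set 1 sit in row 3, so in EUC they have the lead byte 0xA3 and a trail byte equal to the ASCII code with the high bit set. The helpers below are hypothetical and are not tocupper or toclower, which operate on wchar_t values.

```c
#include <stdbool.h>

/* Hypothetical helpers, not the zh/xctype.h routines. They work on the
 * EUC form of code set 1 Roman letters: lead byte 0xA3, trail byte equal
 * to the ASCII code with the high bit set. */

bool gb_euc_is_roman_lower(unsigned char hi, unsigned char lo)
{
    return hi == 0xA3 && lo >= 0xE1 && lo <= 0xFA;   /* 'a'..'z' | 0x80 */
}

bool gb_euc_is_roman_upper(unsigned char hi, unsigned char lo)
{
    return hi == 0xA3 && lo >= 0xC1 && lo <= 0xDA;   /* 'A'..'Z' | 0x80 */
}

/* Returns the trail byte of the uppercase form. The lead byte stays 0xA3.
 * Characters other than code set 1 lowercase letters are returned unchanged. */
unsigned char gb_euc_toupper_trail(unsigned char hi, unsigned char lo)
{
    return gb_euc_is_roman_lower(hi, lo) ? (unsigned char)(lo - 0x20) : lo;
}

/* Returns the trail byte of the lowercase form, with the same conventions. */
unsigned char gb_euc_tolower_trail(unsigned char hi, unsigned char lo)
{
    return gb_euc_is_roman_upper(hi, lo) ? (unsigned char)(lo + 0x20) : lo;
}
```

As with ASCII, uppercase and lowercase forms differ by 0x20, so only the trail byte changes.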
In the Simplified Chinese character sets, the Roman characters and numbers in code set 0 are repeated in code set 1. The following functions convert wide characters between the two code sets.
Table 12–4 Simplified Chinese Code Set Conversion Functions
Function | Description |
---|---|
atocgb | Converts alphabetic or numeric characters in ASCII (code set 0) to the corresponding characters in GB-2312-80 (code set 1). |
cgbtoa | Converts alphabetic or numeric characters in GB-2312-80 (code set 1) to the corresponding characters in ASCII (code set 0). |
For further information on these functions, see the cconv(3x) man page.
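The relationship between the two code sets can be sketched as follows. An ASCII letter or digit maps to the code set 1 character whose EUC lead byte is 0xA3 and whose trail byte is the ASCII code with the high bit set. The function names and the packed representation (lead byte in bits 8 to 15, trail byte in bits 0 to 7) are assumptions made for this illustration. The libcle routines take and return wchar_t values instead.

```c
#include <stdbool.h>

/* Hypothetical helpers, not atocgb or cgbtoa. A two-byte EUC character
 * is packed into an unsigned value as (lead << 8) | trail. */

static bool is_ascii_alnum(unsigned c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* Maps an ASCII letter or digit to its code set 1 counterpart in EUC.
 * Any other value is returned unchanged. */
unsigned ascii_to_gb_euc(unsigned c)
{
    if (is_ascii_alnum(c))
        return (0xA3u << 8) | (c | 0x80u);
    return c;
}

/* Maps a code set 1 letter or digit in EUC back to ASCII.
 * Any other value is returned unchanged. */
unsigned gb_euc_to_ascii(unsigned euc)
{
    unsigned hi = euc >> 8;
    unsigned lo = euc & 0xFFu;

    if (hi == 0xA3u && is_ascii_alnum(lo & 0x7Fu) && (lo & 0x80u))
        return lo & 0x7Fu;
    return euc;
}
```

The sketch restricts itself to letters and digits, matching the descriptions in Table 12–4. Row 3 of GB-2312-80 does not mirror ASCII exactly for every symbol, so extending the mapping to punctuation would need a lookup table.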
The following routines do character-based code conversion on the GB-2312-80 character set. They convert characters and strings between EUC format and GB-2312-80 format. To use these routines, the library libcle must be linked using the C compiler option -lcle. For further information, see the cconv(3) man page.
Table 12–5 Simplified Chinese Character-Based Functions
Function | Description |
---|---|
cgbtoeuc | Converts a character in GB-2312-80 format (7 bit) to EUC format |
scgbtoeuc | Converts a string in GB-2312-80 format (7 bit) to EUC format |
sncgbtoeuc | Converts part of a string in GB-2312-80 format (7 bit) to EUC format |
euctocgb | Converts a character in EUC format to GB-2312-80 format (7 bit) |
seuctocgb | Converts a string in EUC format to GB-2312-80 format (7 bit) |
sneuctocgb | Converts part of a string in EUC format to GB-2312-80 format (7 bit) |
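The arithmetic behind these conversions is simple enough to sketch. A 7-bit GB-2312-80 code has both bytes in the range 0x21 through 0x7E, and the EUC form sets the high bit of each byte. The names and signatures below are hypothetical and do not reproduce the libcle interfaces. The buffer routine also assumes that the input holds only two-byte GB-2312-80 characters with no ASCII mixed in.

```c
#include <stddef.h>

/* Hypothetical helpers, not the libcle routines. A two-byte character
 * is packed into an unsigned value as (first << 8) | second. */

/* 7-bit GB-2312-80 code to EUC: set the high bit of both bytes. */
unsigned gb7_to_euc(unsigned gb)
{
    return gb | 0x8080u;
}

/* EUC to 7-bit GB-2312-80 code: clear the high bit of both bytes. */
unsigned euc_to_gb7(unsigned euc)
{
    return euc & 0x7F7Fu;
}

/* Converts at most n bytes of 7-bit GB-2312-80 text from src into EUC
 * in dst. Assumes src holds only two-byte characters. An odd trailing
 * byte is left unconverted so that no character is split.
 * Returns the number of bytes converted. */
size_t gb7_buf_to_euc(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;

    n -= n % 2;
    for (i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] | 0x80u);
    return n;
}
```

For example, the 7-bit code 0x5650 becomes 0xD6D0 in EUC, and the reverse conversion restores the original value.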
Applications compiled under Chinese OpenWindows 2.x, Solaris 1.x, or SunOS 4.x systems have a different binary format from that of the current Chinese Solaris release. Older applications can nevertheless run under the current Chinese release without being recompiled, by using the included binary compatibility package (BCP).
SUNWowbcp must be included in your system configuration in order for you to run the following commands. See your system administrator for installation.
The following BCP command runs the compiled binary code of earlier SunOS 4.x, Solaris 1.x, or Chinese OpenWindows 2.x applications without recompilation. However, OpenWindows V2 Chinese applications will display no input server status region. As shown in the following examples, the command calls the application by its old name (old_application_name) and sets the basic locale, input language, and display language using the older version's specific locale name (old-locale):
system% old_application_name -lc_basiclocale old-locale -lc_inputlang old-locale \
-lc_displaylang old-locale
The following example shows the command for running the compiled binary code of an earlier version of the textedit application in the current Simplified Chinese Solaris environment:
system% textedit -lc_displaylang chinese -lc_basiclocale chinese \
-lc_inputlang chinese
Due to incompatibilities between Simplified Chinese Solaris 2.x and 1.x applications, you cannot cut and paste Chinese characters between them.