A character set determines how a computer's internal character codes (numbers) are mapped to recognizable characters. For most languages, one byte per character is sufficient to represent the entire character set. However, some languages use thousands of characters and require two, three, or four bytes to represent each character uniquely.
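As a rough illustration of this mapping, the following Python sketch decodes one byte value under three single-byte character sets and counts the bytes needed for a character in a multibyte character set. It uses Python's built-in codecs, not the Help System itself, and the helper name `encoded_length` is hypothetical.

```python
# Illustration only: Python's built-in codecs stand in for the
# character sets discussed here; this is not Help System code.

def encoded_length(text: str, charset: str) -> int:
    """Return the number of bytes needed to store text in charset."""
    return len(text.encode(charset))

# The same internal code (0xE9) maps to a different character
# depending on the character set used to interpret it.
code = bytes([0xE9])
for charset in ("iso-8859-1", "iso-8859-5", "iso-8859-7"):
    char = code.decode(charset)
    print(f"{charset}: 0x{code[0]:02X} -> U+{ord(char):04X}")

# A Latin letter fits in one byte; a kanji (U+65E5) needs two
# bytes in the Japanese EUC character set.
print(encoded_length("A", "iso-8859-1"))
print(encoded_length("\u65e5", "euc_jp"))
```

The loop reports a Latin, a Cyrillic, and a Greek code point for the same byte, which is why the character set of a help volume has to be known before its text can be displayed correctly.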
Table 14-1 lists the character sets supported by the Help System. However, some character sets may not exist on all platforms.
Table 14-1 Common Desktop Environment Character Sets
| Language | Character Set Name | Description |
|---|---|---|
| Western Europe and Americas | ISO-8859-1 | ISO Latin 1 |
|  | HP-ROMAN8 | HP Roman8 |
|  | IBM-850 | PC Multi-lingual |
| Central Europe | ISO-8859-2 | ISO Latin 2 |
| Cyrillic | ISO-8859-5 | ISO Latin/Cyrillic |
| Arabic | ISO-8859-6 | ISO Latin/Arabic |
|  | HP-ARABIC8 | HP Arabic8 |
|  | IBM-1046 | PC Arabic |
| Hebrew | ISO-8859-8 | ISO Latin/Hebrew |
|  | HP-HEBREW8 | HP Hebrew8 |
|  | IBM-856 | PC Hebrew |
| Greek | ISO-8859-7 | ISO Latin/Greek |
|  | HP-GREEK8 | HP Greek8 |
| Turkish | ISO-8859-9 | ISO Latin 5 |
|  | HP-TURKISH8 | HP Turkish8 |
| Japanese | EUC-JP | Japanese EUC (JISX0201, JISX0208, JISX0212) |
|  | HP-SJIS | HP Japanese Shift JIS |
|  | HP-KANA8 | HP Japanese Katakana8 (JISX0201 1976) |
|  | IBM-932 | PC Japanese Shift JIS |
| Korean | EUC-KR | Korean EUC |
| Chinese | EUC-CN | Simplified Chinese EUC (China) (GB2312) |
|  | EUC-TW | Traditional Chinese EUC (Taiwan) (CNS 11643.*) |
|  | HP-BIG5 | HP Traditional Chinese Big5 |
|  | HP-CCDC | HP Traditional Chinese CCDC |
|  | HP-15CN | HP Traditional Chinese EUC |
| Thai | TIS-620 | Thai |
When writing HelpTag files, you may use multibyte characters for any help text. However, the HelpTag markup itself (tag names, entity names, IDs, and so on) must be entered using eight-bit characters.
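One way to check this rule before processing a file is sketched below in Python. It assumes that "eight-bit" means each character of a markup name occupies exactly one byte in the file's character set. The function `is_single_byte` is a hypothetical helper, not part of HelpTag, and Python's `euc_jp` codec stands in for the EUC-JP character set.

```python
# Hypothetical pre-check, not part of the HelpTag tools.

def is_single_byte(text: str, charset: str) -> bool:
    """True if every character of text occupies one byte in charset."""
    try:
        return all(len(ch.encode(charset)) == 1 for ch in text)
    except UnicodeEncodeError:
        # The character does not exist in this character set at all.
        return False

# An ID written in ASCII is acceptable as markup.
print(is_single_byte("chapter-1", "euc_jp"))   # True
# An ID containing a kanji (U+65E5) is two bytes in EUC-JP,
# so it belongs in help text, not in markup.
print(is_single_byte("id-\u65e5", "euc_jp"))   # False
```

Such a check would be applied only to tag names, entity names, and IDs; the help text itself may freely contain multibyte characters.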