Common Desktop Environment: Help System Author's and Programmer's Guide

Chapter 14 Native Language Support

This chapter identifies files used by the Help System that require modification when a help volume is provided in multiple languages.

Internationalized Online Help

If your product is intended for an international audience, then providing online help in the user's native language is important. The Help System supports the authoring and displaying of online help in virtually any language.

When you process a help volume to create run-time help files, the HelpTag software must be told what language and character set you used to author your files. The language and character set information is also used to determine the proper fonts for displaying the help volume.

Internationalization Factors

Several factors, which are explained in the following sections, contribute to providing online help in the user's native language.

Character Sets and Multibyte Characters

A character set determines how a computer's internal character codes (numbers) are mapped to recognizable characters. In most languages, single-byte characters are sufficient for representing an entire character set. However, some languages use thousands of characters and require two, three, or four bytes to represent each character uniquely.

Character sets supported by the Help System are listed in Table 14-1. However, some character sets may not exist on all platforms.

Table 14-1 Common Desktop Environment Character Sets

Language                      Character Set Name   Description

Western Europe and Americas   ISO-8859-1           ISO Latin 1
                              HP-ROMAN8            HP Roman
                              IBM-850              PC Multi-lingual
Central Europe                ISO-8859-2           ISO Latin 2
Cyrillic                      ISO-8859-5           ISO Latin/Cyrillic
Arabic                        ISO-8859-6           ISO Latin/Arabic
                              HP-ARABIC8           HP Arabic8
                              IBM-1046             PC Arabic
Hebrew                        ISO-8859-8           ISO Latin/Hebrew
                              HP-HEBREW8           HP Hebrew8
                              IBM-856              PC Hebrew
Greek                         ISO-8859-7           ISO Latin/Greek
                              HP-GREEK8            HP Greek8
Turkish                       ISO-8859-9           ISO Latin 5
                              HP-TURKISH8          HP Turkish8
Japanese                      EUC-JP               Japanese EUC (JISX0201, JISX0208, JISX0212)
                              HP-SJIS              HP Japanese Shift JIS
                              HP-KANA8             HP Japanese Katakana8 (JISX0201 1976)
                              IBM-932              PC Japanese Shift JIS
Korean                        EUC-KR               Korean EUC
Chinese                       EUC-CN               Simplified Chinese EUC (China) (GB2312)
                              EUC-TW               Traditional Chinese EUC (Taiwan) (CNS 11643.*)
                              HP-BIG5              HP Traditional Chinese Big5
                              HP-CCDC              HP Traditional Chinese CCDC
                              HP-15CN              HP Traditional Chinese EUC
Thai                          TIS-620              Thai

When writing HelpTag files, you may use multibyte characters for any help text. However, the HelpTag markup itself (tag names, entity names, IDs, and so on) must be entered using eight-bit characters.

Language and Territory Names

When choosing a language, you select both a character set and a language and territory name. The language and territory name is used to accommodate variations, such as currency and date format, for a given country or region.

The language and territory names supported by the Help System are listed in the following table. Before you choose a language, refer to your system documentation to identify the languages and character sets supported on your platform.

Table 14-2 Help System Language and Territory Names

Languages                 Language/Territory Name   Language, Territory

Standards compliance
                                                    POSIX
Western Europe/Americas
                          da_DK                     Danish, Denmark
                          de_AT                     German, Austria
                          de_CH                     German, Switzerland
                          de_DE                     German, Germany
                          en_AU                     English, Australia
                          en_CA                     English, Canada
                          en_DK                     English, Denmark
                          en_GB                     English, U.K.
                          en_IE                     English, Ireland
                          en_MY                     English, Malaysia
                          en_NZ                     English, New Zealand
                          en_US                     English, USA
                          es_AR                     Spanish, Argentina
                          es_BO                     Spanish, Bolivia
                          es_CL                     Spanish, Chile
                          es_CO                     Spanish, Colombia
                          es_CR                     Spanish, Costa Rica
                          es_EC                     Spanish, Ecuador
                          es_ES                     Spanish, Spain
                          es_GT                     Spanish, Guatemala
                          es_MX                     Spanish, Mexico
                          es_PE                     Spanish, Peru
                          es_UR                     Spanish, Uruguay
                          es_VE                     Spanish, Venezuela
                          et_EE                     Estonian, Estonia
                          fi_FI                     Finnish, Finland
                          fo_FO                     Faroese, Faroe Islands
                          fr_BE                     French, Belgium
                          fr_CA                     French, Canada
                          fr_CH                     French, Switzerland
                          fr_FR                     French, France
                          is_IS                     Icelandic, Iceland
                          it_CH                     Italian, Switzerland
                          it_IT                     Italian, Italy
                          kl_GL                     Greenlandic, Greenland
                          lt_LT                     Lithuanian, Lithuania
                          lv_LV                     Latvian, Latvia
                          nl_BE                     Dutch, Belgium
                          nl_NL                     Dutch, The Netherlands
                          no_NO                     Norwegian, Norway
                          pt_BR                     Portuguese, Brazil
                          pt_PT                     Portuguese, Portugal
                          sv_FI                     Swedish, Finland
                          sv_SE                     Swedish, Sweden
Central Europe
                          cs_CS                     Czech
                          hr_HR                     Croatian, Croatia
                          hu_HU                     Hungarian, Hungary
                          pl_PL                     Polish, Poland
                          ro_RO                     Rumanian, Romania
                          sh_YU                     Serbocroatian, Yugoslavia
                          si_CS                     Slovenian
                          si_SI                     Slovenian
                          sk_SK                     Slovak
Cyrillic
                          bg_BG                     Bulgarian, Bulgaria
                          mk_MK                     Macedonian
                          ru_RU                     Russian
                          ru_SU                     Russian
                          sp_YU                     Serbian, Yugoslavia
Arabic *
                          ar_SA                     Arabic
                          ar_AA                     Arabic
                          ar_DZ                     Arabic
Hebrew
                          iw_IL                     Hebrew, Israel
Greek
                          el_GR                     Greek, Greece
Turkish
                          tr_TR                     Turkish, Turkey
Asia
                          ja_JP                     Japanese, Japan
                          ko_KR                     Korean, Korea
                          zh_CN                     Chinese, China
                          zh_TW                     Chinese, Taiwan
Thai
                          th_TH                     Thai, Thailand

* No ISO territory name exists for the Arabic-speaking regions of the world. Vendors have supplied their own, which have been adopted for use in the Common Desktop Environment.

Locale and Character Set

A help volume's default language and character set can be defined as an entity in the helplang.ent file. To specify a complete locale name, combine the language and territory name with the character set name using this syntax:

language-and-territory-name.character-set-name

For a description of the helplang.ent file, see "helplang.ent File".

Examples

If the locale is not specified in the helplang.ent file, then the value is derived from the value of the LANG environment variable.
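The syntax and the LANG fallback described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and are not part of the Help System, and the locale values are taken from elsewhere in this chapter.

```python
import os


def make_locale(lang_territory, charset):
    """Join the parts as language-and-territory-name.character-set-name."""
    if not lang_territory or not charset:
        raise ValueError("both a language/territory name and a character set are required")
    return f"{lang_territory}.{charset}"


def split_locale(locale_name):
    """Split a complete locale name at the first period."""
    lang_territory, _, charset = locale_name.partition(".")
    return lang_territory, charset


def effective_locale(declared, environ=os.environ):
    """Use the declared locale if there is one; otherwise fall back to LANG."""
    return declared or environ.get("LANG", "")


if __name__ == "__main__":
    print(make_locale("ja_JP", "EUC-JP"))
    print(split_locale("C.ISO-8859-1"))
    print(effective_locale(None))
```

Passing the environment as a parameter keeps the fallback easy to exercise without changing the real LANG setting.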

HelpTag Software

When you process a help volume to create run-time help files, the HelpTag software must be told what language and character set you used to author your files. The language and character set information is used to determine the proper fonts for displaying help topics. If you do not specify a language and character set, HelpTag assumes the default, which is English and ISO-8859-1.

The language and character set can be defined in the helplang.ent file (see "helplang.ent File"). Alternatively, the character set can be specified as a command-line option when running dthelptag in a terminal window.


Note -

When writing HelpTag files, you may use multibyte characters for any help text. However, the HelpTag markup itself (tag names, entity names, IDs, and so on) must be entered using eight-bit characters.


DtHelp Message Catalog

The menus, buttons, and labels that appear in help dialogs should also be displayed in the user's native language. To enable this, Help dialogs read such strings from a message catalog named DtHelp.cat.

The message catalog source file, DtHelp.msg, contains strings for menus, buttons, and messages. If the language you need is not supplied, you must translate the sample message catalog (/usr/dt/dthelp/nls/C/DtHelp.msg) and then use the gencat command to create the run-time message catalog file. See "To Create a Message Catalog" for instructions.

Refer to your system documentation to determine the correct directory where your new message catalog should be installed.

LANG Environment Variable

The user's LANG environment variable is important for two reasons:

helplang.ent File

The helplang.ent file defines text entities used by the HelpTag software to determine the default locale and character set for a help volume. See "Locale and Character Set" to learn how to specify a language and character set for your help volume.

The helplang.ent file also defines text entities for default strings such as Note, Caution, and Warning. If you want to override the English strings built into the HelpTag software, copy the file and localize the strings. The file is located in the directory /usr/dt/dthelp/dthelptag.

Here is an excerpt from the helplang.ent file:

<!ENTITY LanguageElementDefaultLocale          SDATA "C.ISO-8859-1">
<!ENTITY NoteElementDefaultHeadingString       SDATA "NOTE">
<!ENTITY CautionElementDefaultHeadingString    SDATA "CAUTION">
<!ENTITY WarningElementDefaultHeadingString    SDATA "WARNING">
<!ENTITY ChapterElementDefaultHeadingString    SDATA "Chapter">
<!ENTITY FigureElementDefaultHeadingString     SDATA "Figure">
<!ENTITY GlossaryElementDefaultHeadingString   SDATA "Glossary">
.
.
.

Formatting Tables

A multibyte language, such as Japanese or Chinese, requires a formatting table. This table lists the characters that cannot start a line and the characters that cannot end a line. When help files are processed, the formatting table ensures that lines wrap correctly. "Creating a Formatting Table" explains how to create a new table or edit the sample table provided in the Help Developer's Kit.

Font Schemes

One of the primary functions of the HelpTag software is to convert your marked-up files into a run-time format that the Help System understands. Text is formatted by specifying particular attributes such as type family, size, slant, and weight. A font scheme is simply a name, like an alias, that the Help System uses to assign fonts to HelpTag elements such as heads, procedures, lists, and so forth. It provides a way to map a group of text attributes used by the Help System with specific fonts.

Applications that use the standard Common Desktop Environment fonts do not need to define additional font resources. If your application relies on a different set of fonts, you must create and add a font scheme to your application.

Understanding Font Schemes

When you write a help volume using the HelpTag markup language, you don't specify the fonts and sizes of the text. When you run the HelpTag software, the elements that you've entered are formatted into run-time help files that include text attributes.

A font scheme maps text attributes to actual font specifications. For example, if a help topic is formatted using a bold, sans serif typeface, the font scheme identifies which Common Desktop Environment standard font or X font is actually used to display the text.

One of the primary uses of font schemes is to provide a choice of font sizes. The HelpTag software formats the body of most topics as 10-point text. However, because the actual display font is determined by the font scheme being used, all 10-point text could be specified to use a 14-point font.

Font Resources

Each font scheme is actually a set of X resources. These resources are read by the application displaying the help. If you want to change the font scheme, you can set font resources in your application's application defaults file.

Each resource within a font scheme has this general form:

*pitch.size.slant.weight.style.lang.char-set: font specification

Where:

pitch

Specifies the horizontal spacing of characters. This field should be either p (proportional) or m (monospace).

size

Specifies the height of the desired font. For help files formatted with HelpTag, this value should be 6, 8, 10, 12, or 14.

slant

Specifies the slant of the desired font. Usually this field is either roman for upright letters or italic for slanted letters.

weight

Specifies the weight of the desired font. Usually this field is either medium or bold.

style

Specifies the general style of the desired font. For help files formatted with HelpTag, this value should be either serif or sans_serif.

lang

Specifies that volumes compiled using this language should use these fonts. Usually the entry uses an * (asterisk) so that all volumes using the specified char-set will use these fonts.

char-set

Specifies the character set used to author the help text. This value must match the character set that was specified when HelpTag was run. The default is ISO-8859-1. Some special characters are displayed using a symbol character set.

An * (asterisk) can be used in a field to specify a font that has any value of that particular attribute. For instance, the symbol set (for special characters and special symbols) distinguishes a unique font based only on size and character set.

Its font resources appear like this within a font scheme:

*.6.*.*.*.*.DT-SYMBOL-1:  -adobe-symbol-medium-r-normal-*-*-60-*-*-p-*-adobe-fontspecific
*.8.*.*.*.*.DT-SYMBOL-1:  -adobe-symbol-medium-r-normal-*-*-80-*-*-p-*-adobe-fontspecific
*.10.*.*.*.*.DT-SYMBOL-1: -adobe-symbol-medium-r-normal-*-*-100-*-*-p-*-adobe-fontspecific
*.12.*.*.*.*.DT-SYMBOL-1: -adobe-symbol-medium-r-normal-*-*-120-*-*-p-*-adobe-fontspecific
*.14.*.*.*.*.DT-SYMBOL-1: -adobe-symbol-medium-r-normal-*-*-140-*-*-p-*-adobe-fontspecific

The char-set field is the only field that cannot use the * (asterisk).
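To make the field layout concrete, here is a minimal Python sketch that splits one resource line into the seven fields described above and applies the asterisk rule. It is not how the Help System itself reads resources (that is done through the X resource mechanism); the names FontResource, parse_font_resource, and matches are hypothetical, and the wildcard handling is a simplified reading of the description above.

```python
from typing import NamedTuple


class FontResource(NamedTuple):
    pitch: str
    size: str
    slant: str
    weight: str
    style: str
    lang: str
    char_set: str
    font_spec: str


def parse_font_resource(line):
    """Split '*pitch.size.slant.weight.style.lang.char-set: font specification'."""
    name, sep, spec = line.partition(":")
    if not sep:
        raise ValueError("missing ':' between resource name and font specification")
    parts = name.strip().split(".", 6)
    if len(parts) != 7:
        raise ValueError("expected seven period-separated fields")
    # The resource name starts with '*'; drop it when a pitch value follows.
    if len(parts[0]) > 1 and parts[0].startswith("*"):
        parts[0] = parts[0][1:]
    if parts[6] == "*":
        raise ValueError("the char-set field cannot be '*'")
    return FontResource(*parts, font_spec=spec.strip())


def matches(resource, **wanted):
    """True if each requested attribute equals the field or the field is '*'."""
    return all(getattr(resource, key) in ("*", value) for key, value in wanted.items())


if __name__ == "__main__":
    symbol = parse_font_resource(
        "*.6.*.*.*.*.DT-SYMBOL-1:  "
        "-adobe-symbol-medium-r-normal-*-*-60-*-*-p-*-adobe-fontspecific"
    )
    print(symbol.size, symbol.char_set)
    print(matches(symbol, size="6", slant="italic", char_set="DT-SYMBOL-1"))
```

Because only the size and char-set fields of the symbol resource carry values, a request for any pitch, slant, weight, style, or language matches it as long as the size and character set agree.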

To display multibyte languages, such as Japanese or Korean, font resources must be specified using a font set. A font set is a group of fonts. A resource entry for a font set is similar to an entry for a single font, except that a , (comma) separates the font names and the specification ends with a : (colon). Here is an example of a fully specified font resource for a Japanese font set.

bridge-gothic-medium-r-normal--18-180-75-75-c-80-jisx0201.1976-0,
bridge-gothic-medium-r-normal--18-180-75-75-c-160-jisx0208.1983-0,
bridge-gothic-medium-r-normal--18-180-75-75-c-160-jisx0212.1990-0: 

You can also specify fonts for a multibyte language by providing a minimal XLFD font specification and allowing the system to supply the character set value to produce a font set.

*.12.roman.medium.*.ja_JP.EUC-JP: -*-*-*-*-*-*-*-120-*-*-*-*-*-*:

When specifying a font set, remember to end the specification with a : (colon). This instructs the Help System to load a set of fonts to display the information. Font sets are used to display multibyte languages. For volumes containing single-byte information, use the standard font specification.
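The trailing-colon convention is easy to check mechanically. The following Python sketch, with hypothetical helper names, tells a font set from a single font specification and lists the font names in a set. It only inspects the text of the specification and does not load any fonts.

```python
def is_font_set(font_spec):
    """A specification that ends with ':' names a font set."""
    return font_spec.rstrip().endswith(":")


def font_names(font_spec):
    """Return the comma-separated font names, without the trailing colon."""
    spec = font_spec.strip()
    if spec.endswith(":"):
        spec = spec[:-1]
    return [name.strip() for name in spec.split(",") if name.strip()]


if __name__ == "__main__":
    japanese = (
        "bridge-gothic-medium-r-normal--18-180-75-75-c-80-jisx0201.1976-0,\n"
        "bridge-gothic-medium-r-normal--18-180-75-75-c-160-jisx0208.1983-0,\n"
        "bridge-gothic-medium-r-normal--18-180-75-75-c-160-jisx0212.1990-0: "
    )
    print(is_font_set(japanese))
    for name in font_names(japanese):
        print(name)
```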

Sample Font Schemes

The /usr/dt/dthelp/fontschemes directory contains four font schemes:

fontDef.fns

Default fonts used by the Help System

fontLarge.fns

Example of a larger font

fontMulti.fns

Example of a multibyte font

fontX11.fns

Example of standard X11 fonts

To Choose a Font Scheme

    Edit the application-defaults file for the application that displays the online help. Replace the current font resources (if any) with the new scheme.

If you are making this change just for yourself, copy the application-defaults file into your home directory before editing it.

Example

To use a larger font in the help dialogs of a personal application named DtStopWatch, perform these steps:

Change to your home directory:

cd

Then copy the DtStopWatch application-defaults file and make it writable:

cp /usr/dt/app-defaults/C/DtStopWatch .
chmod u+w DtStopWatch

Edit the DtStopWatch file to add the larger font scheme (fontLarge.fns). Go to the end of the file, and insert the contents of this file:

/usr/dt/dthelp/fontschemes/fontLarge.fns

Save your new DtStopWatch file.

Start the DtStopWatch application, select Help, and verify that help topics are displayed using the new font scheme.

Creating a Formatting Table

A multibyte language, such as Japanese or Chinese, requires a formatting table. This table contains three message sets. The first set consists of characters that cannot start a line; the second set lists any characters that cannot end a line; and the third set indicates how to handle newline characters that occur between a single-byte and a multibyte character.
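To show what the first two message sets are for, here is a small Python sketch of line wrapping that honors a "cannot start a line" set and a "cannot end a line" set. It is an illustration of the idea only, not the Help System's formatter: the function names are hypothetical, the character sets in the example are chosen for demonstration rather than taken from the sample table, and the third message set (newline handling) is not modeled.

```python
def find_break(text, width, no_start, no_end):
    """Pick a break index for text that is longer than width."""
    for cut in range(width, 0, -1):
        next_line_ok = text[cut] not in no_start      # first character of the next line
        this_line_ok = text[cut - 1] not in no_end    # last character of this line
        if next_line_ok and this_line_ok:
            return cut
    return width  # no legal break was found; break at the width anyway


def wrap(text, width, no_start=frozenset(), no_end=frozenset()):
    """Break text into lines of at most width characters."""
    if width < 1:
        raise ValueError("width must be at least 1")
    lines = []
    while len(text) > width:
        cut = find_break(text, width, no_start, no_end)
        lines.append(text[:cut])
        text = text[cut:]
    if text:
        lines.append(text)
    return lines


if __name__ == "__main__":
    closing = {"、", "。"}   # punctuation that should not begin a line
    for line in wrap("これは、テストです。", 3, no_start=closing):
        print(line)
```

The break point moves left until both rules are satisfied, so a line may be shorter than the width but never begins or ends with a restricted character unless no legal break exists.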

A formatting table is an ASCII file whose file name must end with a .msg extension. Figure 14-1 shows an excerpt from a formatting table for Simplified Chinese.

Figure 14-1 Sample formatting table

Any line that begins with a $ (dollar sign) followed by a space is a comment.

Sample Formatting Table

A sample formatting table for a multibyte character set is located in the /usr/dt/dthelp/nls/zh_CN.dt-eucCN directory and is named fmt_tbl.msg.

The sample table can be modified by adding or removing characters. To edit the formatting table, use an editor capable of composing characters in the language you have chosen for the help information. If you intend to create help information using a multibyte language, you need to create a formatting table.

To Create a Message Catalog

If you translate the DtHelp.msg file, create a new formatting table, or modify the sample table (fmt_tbl.msg), you must update the message catalog used by the Help System.

    Use this command syntax to generate the catalog file:

gencat file.cat file.msg

Message catalogs for the standard desktop applications are located in the /usr/dt/lib/nls/msg/lang directory. To install a message catalog, refer to your operating system documentation for guidelines.
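If catalog generation is part of a build script, the gencat call can be wrapped as in the following Python sketch. The helper names are hypothetical, and the sketch assumes that gencat is on the PATH and that the catalog should be written next to its source file; adjust both to suit your platform.

```python
import subprocess
from pathlib import Path


def gencat_command(msg_file):
    """Build the argument list for: gencat file.cat file.msg"""
    msg = Path(msg_file)
    if msg.suffix != ".msg":
        raise ValueError("message source files must end in .msg")
    return ["gencat", str(msg.with_suffix(".cat")), str(msg)]


def build_catalog(msg_file):
    """Run gencat and raise CalledProcessError if it reports a failure."""
    subprocess.run(gencat_command(msg_file), check=True)


if __name__ == "__main__":
    # Prints the command only; call build_catalog() to run it.
    print(" ".join(gencat_command("DtHelp.msg")))
```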

Displaying a Localized Help Volume

To view a help volume created for a locale different from your current system, you must set your LANG environment variable to match the help volume. The value of the LANG environment variable is platform-specific. If you are not familiar with this variable, check with your system administrator for the correct value and procedure to set your environment.
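When a viewer is started from a script, LANG can be set for that one process instead of for the whole session. The Python sketch below shows one way to do this. The helper names are hypothetical, the command used to display the volume is left to the caller, and the locale value shown is a placeholder because the correct LANG value is platform-specific.

```python
import os
import subprocess


def env_with_lang(locale_name, base_env=None):
    """Return a copy of the environment with LANG set to locale_name."""
    env = dict(os.environ if base_env is None else base_env)
    env["LANG"] = locale_name
    return env


def launch_with_lang(argv, locale_name):
    """Start the given command with LANG set for that process only."""
    return subprocess.Popen(argv, env=env_with_lang(locale_name))


if __name__ == "__main__":
    # Placeholder value; check your platform's name for this locale.
    print(env_with_lang("ja_JP.EUC-JP", {"LANG": "C"})["LANG"])
```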

Preparing Online Help for International Audiences

The following checklist summarizes the questions you should answer when providing online help for international audiences.
