iPlanet Directory Server Gateway Customization Guide



Chapter 3   Gateway Localization


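This chapter describes gateway localization and identifies the tasks required to set up additional gateway locales. Topics include:

  • Unicode and iPlanet Support for UTF-8

  • How the Gateway Selects a Character Set

  • Special Characters

  • Gateway Locales

  • Setting Up Locales for Translation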



Unicode and iPlanet Support for UTF-8

Unicode is a character set that covers the characters of all the world's major languages. Unicode can be encoded in several standard ways, including UCS-2, the encoding used by Windows NT, and UTF-8, the encoding specified by version 3 of the LDAP protocol.

iPlanet products use UTF-8 in versions 2 and 3 of LDAP. Most software included with the directory server also uses UTF-8 internally and at interfaces other than LDAP (for example, in command-line parameters and LDIF files).

The NT Synchronization Server, installed with the directory server, converts UTF-8 to and from NT's Unicode representation (UCS-2).
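The difference between the two encodings can be illustrated with a short Python sketch. It is not taken from the NT Synchronization Server. Python has no codec named UCS-2, so the sketch assumes little-endian UTF-16, which yields the same 16-bit units for characters in the Basic Multilingual Plane. The sample string and function names are hypothetical.

```python
# Illustrative only: compare the UTF-8 and 16-bit forms of one string.
# Assumption: "utf-16-le" stands in for UCS-2, which holds for characters
# in the Basic Multilingual Plane.
text = "Gr\u00fc\u00dfe"  # "Gruesse" written with u-umlaut and sharp s

utf8_bytes = text.encode("utf-8")      # variable width: 1 to 4 bytes per character
ucs2_bytes = text.encode("utf-16-le")  # 2 bytes per BMP character


def utf8_to_ucs2(data: bytes) -> bytes:
    """Convert UTF-8 bytes to little-endian 16-bit code units."""
    return data.decode("utf-8").encode("utf-16-le")


def ucs2_to_utf8(data: bytes) -> bytes:
    """Convert little-endian 16-bit code units back to UTF-8 bytes."""
    return data.decode("utf-16-le").encode("utf-8")
```

The round trip is lossless in both directions because both forms encode the same Unicode code points.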



Note: Netscape Communicator 4.x supports UTF-8. Netscape Navigator 3.x does not.





How the Gateway Selects a Character Set



The gateway can output web pages in many character sets. The gateway selects a character set for each HTTP client based on a combination of input from the client and from the gateway's configuration files. Releases 3.x and 4.0 of the gateway select a charset for transmission according to this priority:

  • Character set defined in the client's HTTP Accept-Charset header (in release 4.0, this can be overridden for a particular browser using the ignoreAcceptCharsetFrom parameter).

  • Character set associated with the language in the client's HTTP Accept-Language header (for instance, for Japanese, the charset is defined in ../dsgw/ja/dsgwcharset.conf).

  • Character set defined in the gateway's .conf file by the charset parameter.
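The priority order above amounts to a fallback chain, sketched below in Python. This is an illustration, not the gateway's code. The function and parameter names are hypothetical, each argument stands for the charset already derived from one source (or None if that source supplied nothing), and the final Latin-1 fallback follows the behavior this chapter describes for an undefined charset parameter.

```python
from typing import Optional

LATIN_1 = "ISO-8859-1"


def choose_charset(from_accept_charset: Optional[str],
                   from_accept_language: Optional[str],
                   from_conf: Optional[str]) -> str:
    """Return the first available charset, in the documented priority order.

    All three parameters are hypothetical inputs for this sketch:
    the charset taken from Accept-Charset, the charset looked up for the
    Accept-Language locale, and the charset parameter from the .conf file.
    """
    for candidate in (from_accept_charset, from_accept_language, from_conf):
        if candidate:
            return candidate
    # No source supplied a charset: fall back to Latin-1.
    return LATIN_1
```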


How the Gateway Selects from Multiple Requested Character Sets

When a client includes more than one character set in a request header, and the gateway supports more than one of these, it selects a character set according to this priority:

  • UTF-8

  • Of the possible character sets, the character set with the highest Q value (for example, "de;q=1, en;q=0.5, fr;q=0.7" would give German the highest Q value)

  • The character set that appears first in the request header.

  • Latin-1 (ISO-8859-1)
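One way to read the priority list above is sketched here in Python. It is an interpretation, not the gateway's implementation. It assumes that ties in Q value are broken by header order and that Latin-1 is used when no requested character set is supported. The SUPPORTED set and both function names are hypothetical.

```python
from typing import List, Tuple

# Illustrative set of charsets the sketch treats as supported.
SUPPORTED = {"utf-8", "iso-8859-1", "shift_jis", "big5", "euc-kr"}


def parse_accept_charset(header: str) -> List[Tuple[str, float]]:
    """Split an Accept-Charset value into (name, q) pairs in header order."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        name = parts[0].lower()
        if not name:
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((name, q))
    return entries


def select_charset(header: str, supported=SUPPORTED) -> str:
    """Apply the priority: UTF-8, highest q, first in header, Latin-1."""
    candidates = [(n, q) for n, q in parse_accept_charset(header)
                  if n in supported and q > 0]
    if not candidates:
        return "iso-8859-1"
    if any(n == "utf-8" for n, _ in candidates):
        return "utf-8"
    best_q = max(q for _, q in candidates)
    # The first entry with the best q wins, which covers "appears first".
    return next(n for n, q in candidates if q == best_q)
```

Under these assumptions, a header of "big5;q=0.5, shift_jis;q=0.7" selects shift_jis, while any header that lists a supported UTF-8 entry selects UTF-8 regardless of its Q value.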


HTTP Clients that Request UTF-8

Browsers designed for localization are configured to request the UTF-8 character set by default. To support localization, the gateway is pre-configured to transmit the UTF-8 character set to these clients: Netscape Communicator version 4.0 and greater, and Internet Explorer version 4.0 and greater. Release 4.0 of the gateway allows this pre-configuration to be overridden using the ignoreAcceptCharsetFrom parameter. For more information about this parameter, see ignoreAcceptCharsetFrom.

The conversion from UTF-8 to the gateway client's chosen charset is performed shortly before output.


HTTP Clients that Do Not Request UTF-8

For browsers that do not request UTF-8 by default (including Netscape Navigator 3.x and pre-4.0 releases of Internet Explorer), the gateway selects a character set from the Accept-Charset request header or from the Accept-Language request header, depending on the HTTP client.

Some HTTP clients don't request any character set information. For these clients, the gateway's charset parameter definition is the default. When the charset parameter is not defined in dsgw.conf, the gateway uses Latin-1 (which is the default in HTTP).

In addition to UTF-8 and Latin-1, the gateway can convert to and from several national character sets, depending on the client's needs and configuration, including:

  • Shift_JIS

  • Big5

  • EUC-KR



Special Characters

The following sections describe how special characters are interpreted by the gateway.


Non-Breaking Space

If the client's charset lacks a character for non-breaking space, but has ideographic space, non-breaking spaces are converted to ideographic spaces before charset conversion.

See the changeHTML directive, in the gateway configuration file dsgw.conf.
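The substitution described above can be sketched in Python as follows. This is an illustration of the idea only; it does not reproduce the changeHTML directive, and the function names are hypothetical. It relies on Python's codecs to decide whether a charset has a given character.

```python
NBSP = "\u00a0"               # NO-BREAK SPACE
IDEOGRAPHIC_SPACE = "\u3000"  # IDEOGRAPHIC SPACE


def can_encode(ch: str, charset: str) -> bool:
    """Return True if the charset has a representation for ch."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def encode_for_client(text: str, charset: str) -> bytes:
    """Swap non-breaking spaces for ideographic spaces when needed, then encode."""
    if not can_encode(NBSP, charset) and can_encode(IDEOGRAPHIC_SPACE, charset):
        text = text.replace(NBSP, IDEOGRAPHIC_SPACE)
    # Any other character the charset lacks is replaced with "?".
    return text.encode(charset, errors="replace")
```

With Python's codecs, Latin-1 keeps the non-breaking space unchanged, whereas Shift_JIS has no non-breaking space and receives the ideographic space instead.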


Query Strings

When the gateway needs to embed a UTF-8 string in a URL, it encodes it in a query string (the query string is the part of the URL that follows the question mark).

This works around a problem with Japanese NT, which garbles environment variables that are in UTF-8 (or any charset except Shift_JIS). The Web server passes information to the gateway CGI programs in environment variables, but the query string environment variable $QUERY_STRING is URL-encoded, so it can handle UTF-8 (from NT's point of view, it's ASCII).
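The effect of URL-encoding can be shown with Python's standard urllib.parse module. This sketch only demonstrates why the encoded form is safe to pass through an environment variable; the function name and the sample value are hypothetical and do not come from the gateway.

```python
from urllib.parse import quote, unquote


def encode_for_query_string(value: str) -> str:
    """Percent-encode the UTF-8 bytes of value so the result is pure ASCII."""
    return quote(value, safe="", encoding="utf-8")


# Sample value: three Japanese characters, written as escapes.
original = "\u65e5\u672c\u8a9e"
encoded = encode_for_query_string(original)
decoded = unquote(encoded, encoding="utf-8")
```

Here `encoded` is "%E6%97%A5%E6%9C%AC%E8%AA%9E", which contains only ASCII characters, and decoding it restores the original string.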



Gateway Locales



The gateway's default language is US English. Release 4.0 of the directory server gateway interface is also translated into the following locales:

  • Japanese

  • Spanish

  • German

  • French


Support for Multiple Locales

A single gateway instance supports clients in multiple locales concurrently.

Support for multiple locales is accomplished by translating documentation (including online help), the string resource database, and the configuration and HTML template files. A single copy of the compiled code handles all supported locales.

Locale-dependent information is kept in translated files, which are stored in subdirectories named for the locale. These editable files are stored separately from the gateway code. For example, the German translation of config/search.html is stored in config/de/search.html, the French translation is stored in config/fr/search.html, and the Japanese translation is stored in config/ja/search.html.
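The directory convention in the example above can be expressed as a small path mapping. The Python sketch below only illustrates that convention; the function name is hypothetical and the gateway's own lookup logic is not shown in this chapter.

```python
import posixpath


def localized_path(path: str, locale: str) -> str:
    """Insert the locale name as a subdirectory just above the file name."""
    directory, filename = posixpath.split(path)
    return posixpath.join(directory, locale, filename)
```

For instance, `localized_path("config/search.html", "de")` yields "config/de/search.html".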



Setting Up Locales for Translation



The default gateway can be configured to support locales in addition to English (the default locale), French, German, Spanish, and Japanese. This is part of the overall localization effort, which includes localizing all the configuration and HTML files, including the online help and the string resource database. Additional locales are made possible by including a pointer to the mapping table in dsgw-l10n.conf, which is stored during directory server installation in /usr/iplanet/servers/dsgw/config/lang.


dsgw-l10n.conf

dsgw-l10n.conf provides translation in the Search and Advanced Search pull-down menus for the default gateway (dsgw.conf). If dsgw-l10n.conf is not present in the /config/lang directory, translation of the UI does not occur and English characters appear in the pull-down menus for Standard Search and Advanced Search.

The following example shows how to create a new locale using Chinese as the language for translation:

  1. Create a "zh" directory in /usr/iplanet/servers/dsgw/context.

  2. Copy dsgw.conf to /usr/iplanet/servers/dsgw/context/zh.

  3. Uncomment this line in the gateway's .conf file:

    include "../config/dsgw-l10n.conf"

  4. Create a "zh" directory in /usr/iplanet/servers/dsgw/config.

  5. Copy the file dsgw-l10n.conf, stored during gateway installation in /usr/iplanet/servers/dsgw/config/lang, to /usr/iplanet/servers/dsgw/config/zh, or create it there.



    Note: If you are using the US version of the gateway, dsgw.conf contains a sample of dsgw-l10n.conf.
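The five steps can be scripted. The Python sketch below is one possible automation, run here against a scratch directory rather than a real installation. It makes several assumptions that this chapter does not confirm: that the source dsgw.conf lives in the context directory, that step 3 applies to the copied file, and that the include line is commented out with a leading "#". The function name create_locale is hypothetical.

```python
import re
import shutil
import tempfile
from pathlib import Path


def create_locale(root: Path, locale: str) -> None:
    """Set up directories and files for a new locale under the gateway root."""
    context_dir = root / "context" / locale
    config_dir = root / "config" / locale

    context_dir.mkdir(parents=True, exist_ok=True)           # step 1
    conf_copy = context_dir / "dsgw.conf"
    # Assumption: the original dsgw.conf is in <root>/context.
    shutil.copy(root / "context" / "dsgw.conf", conf_copy)   # step 2

    # Step 3. Assumption: the include line is disabled with a leading "#".
    text = conf_copy.read_text(encoding="utf-8")
    text = re.sub(r'(?m)^[ \t]*#[ \t]*(include\s+"\.\./config/dsgw-l10n\.conf")',
                  r"\1", text)
    conf_copy.write_text(text, encoding="utf-8")

    config_dir.mkdir(parents=True, exist_ok=True)            # step 4
    shutil.copy(root / "config" / "lang" / "dsgw-l10n.conf",
                config_dir / "dsgw-l10n.conf")               # step 5


# Demonstration in a temporary tree that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "context").mkdir()
    (root / "config" / "lang").mkdir(parents=True)
    (root / "context" / "dsgw.conf").write_text(
        '# include "../config/dsgw-l10n.conf"\n', encoding="utf-8")
    (root / "config" / "lang" / "dsgw-l10n.conf").write_text(
        "# mapping table placeholder\n", encoding="utf-8")

    create_locale(root, "zh")

    new_conf = (root / "context" / "zh" / "dsgw.conf").read_text(encoding="utf-8")
    copied = (root / "config" / "zh" / "dsgw-l10n.conf").exists()
```

On a real installation, the root would be /usr/iplanet/servers/dsgw. Back up the existing configuration before running anything like this against it.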




Copyright © 2001 Sun Microsystems, Inc. Some preexisting portions Copyright © 2001 Netscape Communications Corp. All rights reserved.

Last Updated March 21, 2001