iPlanet Web Server, Enterprise Edition Administrator's Guide



Appendix D   Internationalized iPlanet Web Server


The internationalized version of the iPlanet Web Server contains features tailored for non-U.S. environments. These features include a choice of user-interface language (Japanese, French, or German) and a choice of search engines that let you run text searches in a variety of languages.

This appendix contains the following sections:

  • General Information

  • Search Information

  • Servlet Internationalization

  • Posting to JSPs


General Information

The following information covers the international considerations for general server capabilities:


Installing the Server

When you install the server, you choose what user-interface language to use, as well as what search engines to install.

For information on installing the international version of the server, see the iPlanet Web Server, Enterprise Edition 6.0 Release Notes. You can access the Release Notes online via the link provided in the README file.


Entering UTF-8 Data

If you want to enter UTF-8 data on the Server Manager or the Administration Server pages, you need to be aware of the following issues:


File or Directory Names

If a file or directory name is to appear in a URL, it cannot contain 8-bit or multi-byte characters.


LDAP Users and Groups

For email addresses, use only those characters permitted in RFC 1700 (ftp://ds.internic.net/rfc/rfc1700.txt). User ID and password information must be stored in ASCII.

To make sure you enter characters in the correct format for users and groups, use a UTF-8 form-capable client (such as Netscape Communicator) to input 8-bit or multi-byte data.

If you let users access their own user and group information, they will need to use a UTF-8 form-capable client.


Using the Accept-language Header

When clients contact a server using HTTP, they can send header information that describes the various languages they accept. You can configure your server to parse this language information as described in "Setting up Server-Parsed HTML".

You can enable or disable Accept-language header parsing with the acceptlanguage directive in the server.xml file.

Figure D-1    International Settings in server.xml

Directive         Values     Description
acceptlanguage    on, off    Enables or disables the Accept-language header parsing.

For example, if acceptlanguage is set to on, and a client sends the Accept-language header with the value fr-CH,de, when requesting the following URL:

http://www.someplace.com/somepage.html

the server searches for the file in the following order:

  1. Each language in the Accept-language list (fr-CH, then de):

    http://www.someplace.com/fr_ch/somepage.html

    http://www.someplace.com/somepage_fr_ch.html

    http://www.someplace.com/de/somepage.html

    http://www.someplace.com/somepage_de.html

  2. Language codes without the country codes (fr in the case of fr-CH):

    http://www.someplace.com/fr/somepage.html

    http://www.someplace.com/somepage_fr.html

  3. The DefaultLanguage, such as en, defined in the magnus.conf file:

    http://www.someplace.com/en/somepage.html

    http://www.someplace.com/somepage_en.html

  4. If none of these are found, the server tries:

    http://www.someplace.com/somepage.html

    Note Keep in mind when naming your localized files that country codes like CH and TW are converted to lower case and dashes (-) are converted to underscores (_).
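
The search order above can be sketched in Java as follows. This is only an illustration of the ordering described in this section, not the server's implementation. The class name AcceptLanguageLookup and the method candidatePaths are hypothetical, and the sketch assumes that quality values (such as ;q=0.8) are ignored and that duplicate candidates are not removed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: lists candidate paths in the order described above.
class AcceptLanguageLookup {

    // Adds the directory-style and suffix-style candidates for one language tag.
    private static void addCandidates(List<String> out, String dir,
                                      String name, String ext, String tag) {
        out.add(dir + tag + "/" + name + ext);   // e.g. /fr_ch/somepage.html
        out.add(dir + name + "_" + tag + ext);   // e.g. /somepage_fr_ch.html
    }

    static List<String> candidatePaths(String path, String acceptLanguage,
                                       String defaultLanguage) {
        // Split the requested path into directory, base name, and extension.
        int slash = path.lastIndexOf('/');
        String dir = path.substring(0, slash + 1);
        String file = path.substring(slash + 1);
        int dot = file.lastIndexOf('.');
        String name = dot < 0 ? file : file.substring(0, dot);
        String ext = dot < 0 ? "" : file.substring(dot);

        // Lower-case each tag and turn dashes into underscores (fr-CH -> fr_ch).
        // Assumption: quality values such as ";q=0.8" are dropped, not ranked.
        List<String> tags = new ArrayList<>();
        for (String part : acceptLanguage.split(",")) {
            String tag = part.split(";", 2)[0].trim()
                             .toLowerCase(Locale.ROOT).replace('-', '_');
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }

        List<String> out = new ArrayList<>();
        // 1. Every tag in the Accept-language list, in the order given.
        for (String tag : tags) {
            addCandidates(out, dir, name, ext, tag);
        }
        // 2. Language codes without their country codes (fr for fr_ch).
        for (String tag : tags) {
            int underscore = tag.indexOf('_');
            if (underscore > 0) {
                addCandidates(out, dir, name, ext, tag.substring(0, underscore));
            }
        }
        // 3. The DefaultLanguage from magnus.conf.
        addCandidates(out, dir, name, ext, defaultLanguage);
        // 4. The path exactly as requested.
        out.add(path);
        return out;
    }

    public static void main(String[] args) {
        for (String p : candidatePaths("/somepage.html", "fr-CH,de", "en")) {
            System.out.println(p);
        }
    }
}
```

Running main with the values from the example prints the nine paths in the same order as the list above, starting with /fr_ch/somepage.html and ending with /somepage.html.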




Using Other Language Settings

The following directives in the magnus.conf file specify language defaults:


Table D-1    Language Settings in magnus.conf

Directive          Values            Description
ClientLanguage     en, fr, de, ja    Specifies the language in which client messages,
                                     such as "Not Found" or "Access denied" are to be
                                     expressed. This value is used to determine which
                                     ns-httpd.db database to use for the localized
                                     messages.
DefaultLanguage    en, fr, de, ja    Specifies the language used if a resource cannot
                                     be found for the client language.



Search Information



Search capabilities are supported for the following languages:

  • English

  • German

  • French

  • Italian

  • Spanish

  • Swedish

  • Dutch

  • Japanese


International Search

To view documents in different character set encodings, users must change the character set encoding in their browsers. In addition, because text search works with one character set encoding at a time, you might receive inaccurate results when a collection contains documents in more than one encoding. For best results, use one specific character set for all documents when creating search collections from them.


Searching in Japanese

The following information is specific to searching in Japanese.


Query Operators

This release supports the following query operators for Japanese:


Table D-2    Query operators for Japanese

Operator      Japanese Character
AND           Yes
CONTAINS      No
ENDS          Yes
MATCHES       Yes
NEAR          Yes
NEAR/N        Yes
NOT           Yes
OR            Yes
PHRASE        Yes
STARTS        Yes
STEM          English only
SUBSTRING     Yes
WILDCARD *    Yes
WILDCARD ?    Yes
WORD          Yes


Document Formats

This release supports the following document formats for the Japanese language:

  • HTML

  • ASCII

  • NEWS

  • MAIL

    Note The PDF document format is not supported for Japanese.




Searching in Japanese

The following sections give additional information about searching in the Japanese character set.


Document Encodings

This release supports the following document encodings for the Japanese language:

  • euc-jp

  • Shift_JIS

    Note ISO-2022-JP is not supported.




Search Words

This release supports the following search words:

  • kanji

  • hiragana

  • katakana (full-width and half-width)

  • ASCII (full-width and half-width)

The search engine translates half-width katakana to full-width katakana, and translates full-width ASCII to half-width ASCII. Users can use full-width and half-width as the same characters.

This release also supports phrase and sentence search.
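
The width folding described above (half-width katakana to full-width katakana, full-width ASCII to half-width ASCII) can be illustrated with a short Java sketch. It does not show how the search engine itself performs the translation. It uses Unicode NFKC normalization from the standard java.text.Normalizer class as a stand-in that happens to perform both mappings (NFKC also folds other compatibility characters). The class name WidthFolding and the method fold are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical illustration of width folding using NFKC normalization.
class WidthFolding {

    // Folds half-width katakana to full-width and full-width ASCII to ASCII.
    // Assumption: NFKC is used here only as a convenient stand-in; the
    // search engine's own translation tables are not documented here.
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String halfWidthKatakana = "\uFF76\uFF80\uFF76\uFF85"; // half-width "katakana"
        String fullWidthKatakana = "\u30AB\u30BF\u30AB\u30CA"; // full-width "katakana"
        String fullWidthAscii = "\uFF21\uFF22\uFF23";          // full-width "ABC"

        // Both comparisons print true: the two widths fold to the same string.
        System.out.println(fold(halfWidthKatakana).equals(fullWidthKatakana));
        System.out.println(fold(fullWidthAscii).equals("ABC"));
    }
}
```

Folding both the indexed text and the query in the same way is what lets users treat full-width and half-width forms as the same characters.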



Servlet Internationalization



When form data is submitted from a browser to the server using POST, the browser:

  • URL-encodes the POST data

  • Sets the Content-Type to application/x-www-form-urlencoded

  • Does not send any charset information in the Content-Type header

On the server side, if a servlet tries to access POST data using getParameter or getParameterValues, the servlet container does not have any information about which character encoding to use for getParameter strings.
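
The following Java sketch shows why the missing charset matters. It is a standalone illustration using the standard java.net.URLEncoder and java.net.URLDecoder classes, not servlet container code; the class name PostDecodingDemo and its methods are hypothetical. The same URL-encoded bytes yield the original text only when decoded with the encoding the browser used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo: URL-encoded form data carries bytes, not characters.
class PostDecodingDemo {

    // Simulates the URL-encoded form value a browser would send when the
    // page is in the given charset.
    static String encodeAsBrowser(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Decodes the wire form using the charset the receiver assumes.
    static String decode(String wire, String charset) {
        try {
            return URLDecoder.decode(wire, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u30C6\u30B9\u30C8"; // a three-character katakana word
        String wire = encodeAsBrowser(original, "Shift_JIS");

        // Decoding with the sender's charset recovers the text.
        System.out.println(decode(wire, "Shift_JIS").equals(original));
        // Decoding with a different charset does not.
        System.out.println(decode(wire, "ISO-8859-1").equals(original));
    }
}
```

Because the request itself does not say which charset produced the bytes, the receiving side has to be told, which is the purpose of the configuration described next.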

You can configure iPlanet Web Server 6.0 to instruct the servlet container which character encoding to use for interpreting POST data strings. To do this, specify the character encoding using the parameter-encoding element in web-apps.xml:

<parameter-encoding enc="value"/>

where value can be one of the following:

  • auto (default)

  • none

  • any valid encoding

These values are described below.


auto

auto tells the servlet container to look for a hint about the character encoding to use. The hint can be provided in either of the following ways:

  • A request attribute with the name: com.iplanet.server.http.servlet.parameterEncoding. The value is of type String. The request attribute must be set before any calls to getParameter() or getParameterValues(). Example:

    request.setAttribute(
        "com.iplanet.server.http.servlet.parameterEncoding", "Shift_JIS");
    request.getParameter("test");

    This option is used if the servlet knows beforehand what the charset of the posted data is.

  • A j_encoding parameter in the form data. The form that is being submitted can have a hidden element. Example:

    <input type=hidden name="j_encoding" value="Shift_JIS" >

    This option is typically used if the servlet that is reading the data does not necessarily know what the charset of the posted data is. The hint parameter name, which by default is j_encoding, can be changed using the parameter-encoding element in web-apps.xml.


none

Use this option if you wish the platform default encoding to be used for the servlet parameter data.


any valid encoding

If none of the above options are specified, the servlet container interprets this string itself as the encoding, so this can be any valid encoding string, such as Shift_JIS or UTF-8. For example, you would specify this as UTF-8 if you know that the form POST data is always in UTF-8.


Note The server will always try to resolve the charset from the Content-Type header of the request first.
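
The way these settings combine can be sketched in plain Java. This is a simplified model and not the servlet container's code: the class name ParameterEncodingSketch and its methods are hypothetical, the precedence of the request attribute over the j_encoding form parameter is an assumption, and the sketch returns an empty result when auto finds no hint because this appendix does not say what the container does in that case.

```java
import java.nio.charset.Charset;
import java.util.Optional;

// Hypothetical model of how the parameter-encoding settings might combine.
class ParameterEncodingSketch {

    // Extracts the charset parameter from a Content-Type value, or null.
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String v = p.substring(8).trim();
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1); // strip optional quotes
                }
                return v.isEmpty() ? null : v;
            }
        }
        return null;
    }

    // enc is the value of <parameter-encoding enc="..."/>; null means auto.
    static Optional<String> resolve(String contentType, String enc,
                                    String attributeHint, String formHint) {
        // The Content-Type charset is always tried first.
        String fromHeader = charsetFromContentType(contentType);
        if (fromHeader != null) {
            return Optional.of(fromHeader);
        }
        if ("none".equals(enc)) {
            // none: use the platform default encoding.
            return Optional.of(Charset.defaultCharset().name());
        }
        if (enc == null || "auto".equals(enc)) {
            // auto: look for a hint. Assumption: the request attribute is
            // checked before the j_encoding form parameter.
            if (attributeHint != null) {
                return Optional.of(attributeHint);
            }
            return Optional.ofNullable(formHint);
        }
        // Any other value is itself taken as the encoding name.
        return Optional.of(enc);
    }

    public static void main(String[] args) {
        String ct = "application/x-www-form-urlencoded";
        System.out.println(resolve(ct, "auto", "Shift_JIS", null));
        System.out.println(resolve(ct + "; charset=UTF-8", "Shift_JIS", null, null));
    }
}
```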



For more information on parameter-encoding, see the Programmer's Guide to Servlets.



Posting to JSPs



You can configure parameter-encoding to work the same way when you are posting to a JSP instead of a servlet. The following example demonstrates a JSP that uses the auto setting to read parameters that are in Japanese Shift_JIS encoding:

<%@ page contentType="text/html;charset=Shift_JIS" %>
<html>
<head>
<title>JSP Test Case</title>
</head>
<body>
<% request.setAttribute("com.iplanet.server.http.servlet.parameterEncoding", "Shift_JIS");
%>
<h1>The Entered Name is : <%= request.getParameter("test") %> </h1>
</body>
</html>


Copyright © 2001 Sun Microsystems, Inc. Some preexisting portions Copyright © 2001 Netscape Communications Corp. All rights reserved.

Last Updated May 09, 2002