The Java Desktop System is a fully Unicode-enabled, multilingual system that supports languages with Unicode UTF-8 encoding. The Java Desktop System also provides codeset conversion to support legacy language encodings.
The language selection menu in the login screen shows only the supported Unicode UTF-8 languages, listed by language name rather than by locale name. To support the migration to Unicode, the Java Desktop System also provides legacy non-UTF-8 locales, which system administrators can add to the login selection menu as options.
The list of languages shown in the language selection menu in the login screen is configured in the following file: /etc/X11/gdm/locale.alias
Each supported legacy locale is listed in this file in a commented out line preceded by the # character. For example, Japanese support is listed in the following way:
| Normal Line | Commented Line |
|---|---|
| Japanese ja_JP.UTF-8 | #Japanese ja_JP.eucJP |
To show ja_JP.eucJP as an option in the language selection menu, open the locale.alias file with a text editor and remove the # character at the start of the line.
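The same edit can be scripted instead of made by hand. The following is a minimal sketch, not a supported tool: the function name enable_legacy_locale is hypothetical, and it assumes that each line of the file has the shape shown in the example above, with the locale name as the last field.

```shell
# Hypothetical helper: uncomment the line for one legacy locale.
# Assumes each entry looks like "#Japanese ja_JP.eucJP", locale name last.
enable_legacy_locale() {
    alias_file=$1
    locale=$2
    tmp=$(mktemp) || return 1
    # Remove the leading '#' only from commented lines whose last field
    # is exactly the requested locale; copy every other line unchanged.
    if awk -v loc="$locale" \
        '/^#/ && $NF == loc { sub(/^#/, "") } { print }' \
        "$alias_file" > "$tmp"; then
        # Write back through the original file to keep its permissions.
        cat "$tmp" > "$alias_file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (as root, after making a backup copy of the file):
#   enable_legacy_locale /etc/X11/gdm/locale.alias ja_JP.eucJP
```

Matching on the exact locale name leaves the other commented legacy locales untouched.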
There are a number of methods of importing and exporting data that are affected by the migration to Unicode multilingual computing.
The system administrator must configure the codepage and iocharset mount options for FAT and VFAT file systems, which are typically used for floppy disks, zip drives, and removable hard disks on Microsoft Windows. For example, to import data from Traditional Chinese Windows, use the settings in the following table so that Traditional Chinese file names are displayed correctly.
| Mount Option | Traditional Chinese Setting |
|---|---|
| codepage | 950 |
| iocharset | big5 |
Sample entries for /etc/fstab for the Traditional Chinese example are as follows:
/dev/fd0h1440 /media/fd0h1440 vfat noauto,iocharset=big5,codepage=950
/dev/sda1 /media/iee1394disk vfat noauto,iocharset=big5,codepage=950
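Entries for other devices follow the same pattern. As an illustration only, the sketch below prints an /etc/fstab line from its parts; the function name vfat_fstab_entry is hypothetical, and the output still has to be added to /etc/fstab by the administrator.

```shell
# Hypothetical helper: print one /etc/fstab line for a VFAT device,
# using the same field layout as the sample entries above.
vfat_fstab_entry() {
    device=$1
    mount_point=$2
    iocharset=$3
    codepage=$4
    printf '%s %s vfat noauto,iocharset=%s,codepage=%s\n' \
        "$device" "$mount_point" "$iocharset" "$codepage"
}

# Example: reproduce the second sample entry.
vfat_fstab_entry /dev/sda1 /media/iee1394disk big5 950
```

To try the options before editing /etc/fstab, the same settings can be passed to a one-off mount run as root: mount -t vfat -o iocharset=big5,codepage=950 /dev/sda1 /media/iee1394disk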
A system administrator must configure the codepage and iocharset mount options to mount a remote Microsoft Windows file system shared using CIFS, or a file system exported from another system by SMB. For example, to import legacy files encoded in big5 from Traditional Chinese Windows, set iocharset to big5 and codepage to 950 so that Traditional Chinese file names are displayed correctly. A sample /etc/fstab entry is as follows:
server:/data /data smbfs iocharset=big5,codepage=950,username=foo,password=bar
The Java Desktop System can remotely access a file system on UNIX and Linux systems by using SMB. The export server must run Samba or equivalent to export the remote file system. The client side can specify the file system encoding if the legacy data is stored in legacy encodings. The codeset conversion of the file names is done automatically.
Microsoft Office files are encoded in Unicode. StarOffice applications can read and write the Unicode-encoded files without problems.
HTML files authored using HTML editors such as Mozilla Composer, or HTML files saved by a web browser, usually contain a charset encoding tag. After exporting or importing, you can browse such HTML files with the Mozilla Navigator web browser, or edit the files with Mozilla Composer, according to the encoding tag in the HTML file.
Some HTML files might be displayed as garbled characters. This problem typically has one of the following causes:
The charset encoding tag is incorrect.
The charset encoding tag is missing.
To find the charset encoding tag in the HTML file, perform the following actions:
Open the file with Mozilla.
Press Ctrl+I, or click View to open the View menu.
Click on Page Info.
The charset information is at the bottom of the General tab, for example: Content-Type text/html; charset=us-ascii
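When many files need checking, the tag can also be read from the command line. The following is a rough sketch: the function name html_charset is hypothetical, it reports only the first charset= value that appears in the file, and it does not verify that the file is really in that encoding.

```shell
# Hypothetical helper: print the first charset value declared in an
# HTML file, in lowercase. Prints nothing if no charset tag is found.
html_charset() {
    # Match charset=value, with or without a quote before the value.
    grep -i -o "charset=[\"']\{0,1\}[A-Za-z0-9._:-]*" "$1" |
        head -n 1 |
        # Drop everything up to the '=' and any opening quote.
        sed "s/^[^=]*=[\"']\{0,1\}//" |
        tr '[:upper:]' '[:lower:]'
}

# Usage:
#   html_charset page.html
```

An empty result corresponds to the missing-tag case described above.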
If the string charset=us-ascii does not match the actual encoding of the file, the file might appear broken. To change the encoding of the HTML file, perform the following actions:
Open the file with Mozilla Composer.
Open the File menu.
Select Save As Charset.
Choose the correct encoding. Mozilla Composer automatically converts the encoding and the charset tag as appropriate.
Modern emails are tagged with the MIME charset tag. The mail application of the Java Desktop System, Evolution, accepts MIME charset tags. You do not need to perform any encoding conversion.
Plain text files do not have a charset tag. If the files are not in UTF-8 encoding, encoding conversion is needed. For example, to convert a plain text file encoded in Traditional Chinese big5 to UTF-8, execute the following command: iconv -f big5 -t UTF-8 inputfilename > outputfilename
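Because plain text files carry no tag, it can help to check whether a file is already valid UTF-8 before converting it. The sketch below wraps the iconv command shown above; the function names is_utf8 and convert_to_utf8 are hypothetical. It assumes that a file that decodes cleanly as UTF-8 needs no conversion, which holds for plain ASCII but is a heuristic, not a guarantee, for other content.

```shell
# Hypothetical helper: succeed if the file decodes cleanly as UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical helper: write a UTF-8 copy of a file.
# Arguments: source encoding, input file, output file.
convert_to_utf8() {
    from=$1
    input=$2
    output=$3
    if is_utf8 "$input"; then
        # Already valid UTF-8 (this includes plain ASCII): copy as-is.
        cp "$input" "$output"
    else
        iconv -f "$from" -t UTF-8 "$input" > "$output"
    fi
}

# Usage:
#   convert_to_utf8 big5 inputfilename outputfilename
```

The encoding names that the installed iconv accepts can be listed with iconv -l.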