The Java Desktop System is a fully Unicode-enabled, multilingual system that supports languages with Unicode UTF-8 encoding. The Java Desktop System also provides codeset conversion to support legacy (non-UTF-8) encodings. This section describes issues you might encounter when you migrate to Unicode multilingual computing.
There are a number of methods of importing and exporting data that are affected by the migration to Unicode multilingual computing.
Microsoft Office files are encoded in Unicode. StarOffice applications can read and write the Unicode-encoded files.
HTML files authored using HTML editors such as Mozilla Composer, or HTML files saved by a web browser, usually contain a charset encoding tag. After exporting or importing, you can browse such HTML files with the Mozilla Navigator web browser, or edit the files with Mozilla Composer, according to the encoding tag in the HTML file.
Some HTML files might be displayed as garbage characters. This problem typically has one of the following causes:
The charset encoding tag is incorrect.
The charset encoding tag is missing.
To find the charset encoding tag in the HTML file, perform the following actions:
Open the file with Mozilla.
Press Ctrl+I, or open the View menu and choose Page Info.
The charset information is at the bottom of the General tab, for example: Content-Type text/html; charset=us-ascii
If the charset value, such as charset=us-ascii, does not match the actual encoding of the file, the file might appear broken. To change the encoding of the HTML file, perform the following actions:
Open the file with Mozilla Composer.
Open the File menu.
Select Save As Charset.
Choose the correct encoding. Mozilla Composer automatically converts the encoding and the charset tag as appropriate.
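For a quick check outside the browser, the declared charset can also be read from the command line. The following is a minimal sketch, assuming the file declares its encoding in a meta tag; html_charset is a hypothetical helper name, not a Java Desktop System command.

```shell
# html_charset FILE
# Print the charset value from the first line of FILE that contains a
# charset declaration. Prints nothing when the file has no charset tag.
html_charset() {
    sed -n 's/.*[Cc][Hh][Aa][Rr][Ss][Ee][Tt]=["'\'']*\([A-Za-z0-9_.:-]*\).*/\1/p' "$1" | head -n 1
}
```

The helper reports only what the tag claims. It does not detect the actual encoding of the file, so a wrong tag still has to be corrected as described above.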
Modern emails are tagged with the MIME charset tag. The Email and Calendar application accepts MIME charset tags. You do not need to perform any encoding conversion.
Plain text files do not have a charset tag. If the files are not in UTF-8 encoding, encoding conversion is needed. For example, to convert a plain text file encoded in Traditional Chinese Big5 to UTF-8, execute the following command:
iconv -f big5 -t UTF-8 inputfilename > outputfilename
You can also use File System Examiner for the encoding conversion.
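When several files need conversion, it can help to skip files that are already valid UTF-8 and to discard partial output when a conversion fails. The following is a minimal sketch under those assumptions; is_utf8 and to_utf8 are hypothetical helper names, and the .utf8 suffix is an arbitrary choice.

```shell
# is_utf8 FILE
# Succeed when FILE is valid UTF-8. iconv exits with a non-zero status
# when it meets a byte sequence that is invalid in the source encoding.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# to_utf8 ENCODING FILE
# Convert FILE from ENCODING to UTF-8 and write the result to FILE.utf8.
# The original file is left unchanged.
to_utf8() {
    from=$1
    src=$2
    if iconv -f "$from" -t UTF-8 "$src" > "$src.utf8"; then
        echo "converted: $src -> $src.utf8"
    else
        # Remove the incomplete output so that it is not mistaken for a good file.
        rm -f "$src.utf8"
        echo "failed: $src" >&2
        return 1
    fi
}
```

For example, to_utf8 big5 inputfilename writes inputfilename.utf8. The source encoding still has to be known in advance, because plain text files carry no charset tag.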
You can use Text Editor to read and write text in different character encodings, either automatically or by specifying an encoding explicitly when you open or save a file.
To start Text Editor, click Launch, then choose Applications -> Accessories -> Text Editor.
If file names and directory names that use multibyte characters are not in UTF-8 encoding, encoding conversion is needed. You can use File System Examiner to convert file and directory names and the contents of plain text files from legacy character encodings to UTF-8 encoding. Refer to the online Help for File System Examiner for more information.
To start File System Examiner, click Launch, then choose Applications -> Utilities -> File System Examiner.
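A single file name can also be converted by hand with iconv. The following is a minimal sketch, assuming you already know the legacy encoding of the name; rename_to_utf8 is a hypothetical helper and is not part of File System Examiner.

```shell
# rename_to_utf8 ENCODING PATH
# Rename the file at PATH so that its name is encoded in UTF-8.
rename_to_utf8() {
    from=$1
    path=$2
    dir=$(dirname "$path")
    old=$(basename "$path")
    # Convert only the name; the file contents are not touched.
    new=$(printf '%s' "$old" | iconv -f "$from" -t UTF-8) || return 1
    # Nothing to do when the name is unchanged (for example, plain ASCII).
    [ "$old" = "$new" ] && return 0
    # Refuse to overwrite an existing file.
    if [ -e "$dir/$new" ]; then
        echo "target already exists: $dir/$new" >&2
        return 1
    fi
    mv "$path" "$dir/$new"
}
```

The function converts only the last path component, so directories need to be renamed one level at a time.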
When you use File Manager to access files or directories on Microsoft Windows via SMB, you can access non-UTF-8 file or directory names without encoding conversion.
For applications that are not ready to migrate to Unicode UTF-8, you can create a launcher on a front panel to start the application in legacy locales. You can also launch the applications directly from the command line. Perform the following steps to create a launcher for an application:
Right-click on the panel where you want to place the launcher.
Choose Add to Panel -> Launcher.
Use the following format to type the entry in the Command field in the Create Launcher dialog:
env LANG=locale LC_ALL=locale application name
For example, if you want to launch an application called motif-app from /usr/dt/bin in the Chinese Big5 locale, enter the following text in the Command field of the Create Launcher dialog:
env LANG=zh_TW.BIG5 LC_ALL=zh_TW.BIG5 /usr/dt/bin/motif-app
You might also need to specify an appropriate LD_LIBRARY_PATH for the application.
Click OK to create the launcher on the panel.
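The same env prefix can be wrapped in a small shell function, which is convenient for trying an application from the command line before you create the launcher. The following is a minimal sketch; run_in_locale is a hypothetical name.

```shell
# run_in_locale LOCALE COMMAND [ARGS...]
# Run COMMAND with LANG and LC_ALL set to LOCALE for that command only.
# The environment of the calling shell is not changed.
run_in_locale() {
    loc=$1
    shift
    env LANG="$loc" LC_ALL="$loc" "$@"
}
```

For example, run_in_locale zh_TW.BIG5 /usr/dt/bin/motif-app is equivalent to the launcher command shown above.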
When you need to run CLI (command line interface) applications that are specific to a legacy locale, open a Terminal window in the legacy locale first and then run the CLI applications in the same Terminal window. To open a Terminal window in a legacy locale, enter the following command:
env LANG=locale LC_ALL=locale gnome-terminal --disable-factory
Instead of opening a new Terminal window in a legacy locale, you can switch the locale setting from UTF-8 to a legacy locale in the current Terminal window by changing the encoding in the Set Character Encoding menu in the Terminal window. Then you must also set the LANG and LC_ALL environment variables in the current shell.
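In a Bourne-compatible shell, the two variables can be set as follows. This is a minimal sketch that reuses the zh_TW.BIG5 locale from the earlier example and assumes that locale is installed; substitute the legacy locale that matches the encoding you selected in the menu.

```shell
# Set and export both variables so that programs started from this shell
# inherit the legacy locale.
LANG=zh_TW.BIG5
LC_ALL=zh_TW.BIG5
export LANG LC_ALL

# Show the values now in effect.
printf 'LANG=%s LC_ALL=%s\n' "$LANG" "$LC_ALL"
```

The change applies only to the current shell and to the programs started from it. Other Terminal windows keep their own settings.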