Solaris 10 10/09 Release Notes

Migration Note to UTF-8 locales

When you migrate to UTF-8 locales, the type of each file determines the method that you use to import or export its data.

Microsoft Office Files

Microsoft Office files are encoded in Unicode. StarOffice applications can read and write the Unicode encoded files.

HTML Files

HTML files authored using HTML editors such as Mozilla Composer, or HTML files saved by a web browser, usually contain a charset encoding tag. After exporting or importing, you can browse such HTML files with the Mozilla Navigator web browser, or edit the files with Mozilla Composer, according to the encoding tag in the HTML file.

Fixing Broken HTML Files

Some HTML files might be displayed with garbage characters. This problem typically occurs when the charset encoding tag is incorrect or missing.

To find the charset encoding tag in the HTML file, perform the following actions:

  1. Open the file with Mozilla.

  2. Press Ctrl-i, or click View to open the View menu.

  3. Click Page Info.

The charset information is at the bottom of the General tab, for example:


Content-Type text/html; charset=us-ascii

If the string charset=us-ascii does not match the actual encoding of the file, the file might appear broken. To change the encoding of the HTML file, perform the following actions:

  1. Open the file with Mozilla Composer.

  2. Open the File menu.

  3. Select Save as Charset.

  4. Choose the correct encoding. Mozilla Composer automatically converts the encoding and the charset tag as appropriate.
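
The charset tag can also be inspected from a command line before the file is opened in Mozilla. The following is a minimal sketch, not part of the original procedure: html_charset is a hypothetical helper name, and the sketch assumes the tag appears on a single line in the usual charset=name form.

```shell
# html_charset (hypothetical helper): print the first charset name
# declared in an HTML file, for example "us-ascii".
# The match is case-insensitive and tolerates an optional quote after "=".
html_charset() {
    sed -n "s/.*[Cc][Hh][Aa][Rr][Ss][Ee][Tt]=[\"']\{0,1\}\([A-Za-z0-9._:-]*\).*/\1/p" "$1" | head -n 1
}
```

For example, html_charset page.html (page.html is a placeholder name) prints the declared charset, which you can then compare with the encoding you expect the file to use. If nothing is printed, the file probably has no charset tag.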

Emails Saved As Portable Format

Modern email messages are tagged with a MIME charset tag. The Email and Calendar application accepts MIME charset tags, so you do not need to perform any encoding conversion.

Plain Text Files

Plain text files do not have a charset tag. If the files are not in UTF-8 encoding, encoding conversion is needed. For example, to convert a plain text file encoded in Traditional Chinese big5 to UTF-8, execute the following command:


iconv -f big5 -t UTF-8 inputfilename > outputfilename
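
Because plain text files carry no charset tag, it can help to check whether a file is already valid UTF-8 before converting it, so that a file is not converted twice. The following is a sketch under stated assumptions: is_utf8 and to_utf8 are hypothetical helper names, and the check relies on iconv exiting with a nonzero status when the input contains byte sequences that are invalid in the source encoding. A legacy-encoded file that happens to contain only ASCII passes the check, which is harmless because ASCII text is already valid UTF-8.

```shell
# is_utf8 (hypothetical helper): succeed if the file is already valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# to_utf8 (hypothetical helper): convert only when the file is not valid UTF-8.
# Arguments: source encoding, input file, output file.
to_utf8() {
    if is_utf8 "$2"; then
        # Already UTF-8 (or plain ASCII): copy unchanged.
        cp "$2" "$3"
    else
        iconv -f "$1" -t UTF-8 "$2" > "$3"
    fi
}
```

With these helpers, the conversion from the example becomes to_utf8 big5 inputfilename outputfilename.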

You can also use the File System Examiner for the encoding conversion.

You can use Text Editor to read and write text in various character encodings. Text Editor can detect the encoding automatically, or you can specify an encoding explicitly when you open or save a file.

To start Text Editor, click Launch, then choose Applications->Accessories->Text Editor.

File Names and Directory Names

If file names and directory names using multibyte characters are not in UTF-8 encoding, encoding conversion is needed. You can use File System Examiner to convert file and directory names and the contents of plain text files from legacy character encodings to UTF-8 encoding. Refer to the online Help for File System Examiner for more information.

To start File System Examiner, click Launch, then choose Applications->Utilities->File System Examiner.
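
To get a quick idea of which names in a directory need conversion, the names themselves can be tested from a shell. The following is a minimal sketch, not a replacement for File System Examiner: list_non_utf8_names is a hypothetical helper name, and the check assumes that iconv exits with a nonzero status when a name is not valid UTF-8. The sketch only reports names and does not rename anything.

```shell
# list_non_utf8_names (hypothetical helper): print the names of entries
# in a directory whose names are not valid UTF-8.
list_non_utf8_names() {
    for path in "$1"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$path" ] || continue
        name=${path##*/}
        if ! printf '%s' "$name" | iconv -f UTF-8 -t UTF-8 > /dev/null 2>&1; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, list_non_utf8_names . checks the current directory. Names made up only of ASCII characters are never reported, because ASCII is valid UTF-8.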

When you use File Manager to access files or directories on Microsoft Windows via SMB, you can access non-UTF-8 file and directory names without encoding conversion.

Launching Legacy Locale Applications

For applications that are not ready to migrate to Unicode UTF-8, you can create a launcher on a panel to start the application in a legacy locale. You can also launch the application directly from the command line. Perform the following steps to create a launcher for an application.

  1. Right-click on the panel where you want to place the launcher.

  2. Choose Add to Panel->Launcher.

  3. Use the following format to type the entry in the Command field in the Create Launcher dialog:


    env LANG=locale LC_ALL=locale application-name
    

    For example, to launch an application called motif-app from /usr/dt/bin in the Chinese Big5 locale, enter the following text in the Command field of the Create Launcher dialog:


    env LANG=zh_TW.BIG5 LC_ALL=zh_TW.BIG5 /usr/dt/bin/motif-app
    
  4. Click OK to create the launcher on the panel.
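
The env command in the launcher changes the locale only for the application that it starts, not for the rest of the desktop session. The following sketch illustrates this from a shell: show_launch_locale is a hypothetical helper name, and sh -c stands in for the real application so that the variables it receives can be printed.

```shell
# show_launch_locale (hypothetical helper): print the locale variables
# that a command started through env would see.
show_launch_locale() {
    env LANG="$1" LC_ALL="$1" sh -c 'echo "LANG=$LANG LC_ALL=$LC_ALL"'
}

# zh_TW.BIG5 is the example locale from the text.
show_launch_locale zh_TW.BIG5
```

After the command returns, LANG and LC_ALL in the calling shell are unchanged, which is why a launcher can run one legacy application while the session stays in a UTF-8 locale.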

When you need to run CLI (command line interface) applications which are specific to a legacy locale, open a Terminal window in the legacy locale first and then run the CLI applications in the same Terminal window. To open a Terminal window in a legacy locale, enter the following command:


env LANG=locale LC_ALL=locale gnome-terminal --disable-factory

Instead of opening a new Terminal window in a legacy locale, you can switch the locale setting from UTF-8 to a legacy locale in the current Terminal window by changing the encoding in the Set Character Encoding menu in the Terminal window. Then you must also set the LANG and LC_ALL environment variables in the current shell.
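
Setting the variables in the current shell might look like the following sketch for a Bourne-compatible shell. It assumes the zh_TW.BIG5 locale from the earlier example and that this locale is installed on the system. Other shells, such as csh, use a different syntax.

```shell
# Set the legacy locale for the current shell and for every command
# started from it. zh_TW.BIG5 is the example locale used earlier.
LANG=zh_TW.BIG5
LC_ALL=zh_TW.BIG5
export LANG LC_ALL
```

The setting lasts until the shell exits or until the variables are changed again. The encoding selected in the Set Character Encoding menu should match the locale that you set here.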