International Language Environments Guide for Oracle® Solaris 11.3

Updated: December 2018
 
 

Migrating From Non-UTF-8 Locales to UTF-8 Locales

When migrating to UTF-8 locales, the method used for importing or exporting data depends on the file type.

Plain Text Files

Plain text files carry no explicit indication of their character encoding. If a file is not encoded in UTF-8, you must convert it. For example, you would run the following command to convert a plain text file encoded in Traditional Chinese Big5 to UTF-8:

$ iconv -f big5 -t UTF-8 inputfilename > outputfilename
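When a directory holds a mix of converted and unconverted files, it can help to skip files that are already valid UTF-8. The following is a minimal sketch, not part of the original guide: the function name to_utf8 and its arguments are hypothetical, and it assumes that a file that passes a UTF-8 to UTF-8 round trip through iconv is already in UTF-8 (pure ASCII files also pass this check).

```shell
# to_utf8 SRC_ENCODING INPUT OUTPUT   (hypothetical helper)
# Write INPUT to OUTPUT as UTF-8, converting from SRC_ENCODING only
# when INPUT is not already valid UTF-8.
to_utf8() {
    src_enc=$1
    src=$2
    dst=$3
    # A UTF-8 to UTF-8 conversion fails on the first invalid byte
    # sequence, so it doubles as a validity check.
    if iconv -f UTF-8 -t UTF-8 "$src" > /dev/null 2>&1; then
        cp "$src" "$dst"
    else
        iconv -f "$src_enc" -t UTF-8 "$src" > "$dst"
    fi
}
```

For example, `to_utf8 big5 inputfilename outputfilename` behaves like the command above for a Big5 file and copies the file unchanged if it is already UTF-8. The check is a heuristic: a legacy-encoded file whose bytes happen to form valid UTF-8 would be copied unconverted, so inspect the results for important data.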

The Text Editor application can detect the character encoding of a text file automatically, or you can specify an encoding explicitly when opening or saving a file. To start the Text Editor, choose Launch→Applications→Accessories→Text Editor.

File Names and Directory Names

File systems such as UFS and ZFS store file and directory names in the character set of the locale that you use. If you created names in a non-UTF-8 locale, or you mount a file system whose names are not in UTF-8, and then switch to a UTF-8 locale, the file names might be displayed as garbled characters. To fix this problem, use convmv(1) to convert a single file name, a directory tree together with the files it contains, or a whole file system to a different encoding. convmv converts only the file names, not the contents of the files.

See the convmv(1) man page for more information. The tool works on any file system.
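To illustrate what a name-only conversion involves, the following sketch re-encodes the name of a single file with iconv and mv. It is an illustration under stated assumptions, not a substitute for convmv: the function names utf8_name and rename_to_utf8 and the --apply argument are hypothetical, and the sketch assumes the name really is in the given source encoding (a name that is already UTF-8 would be converted a second time).

```shell
# utf8_name SRC_ENCODING NAME   (hypothetical helper)
# Print NAME re-encoded from SRC_ENCODING to UTF-8.
utf8_name() {
    printf '%s' "$2" | iconv -f "$1" -t UTF-8
}

# rename_to_utf8 SRC_ENCODING PATH [--apply]   (hypothetical helper)
# Re-encode only the last component of PATH. The file contents are
# never touched. Without --apply, only report what would be renamed.
rename_to_utf8() {
    src_enc=$1
    path=$2
    dir=$(dirname -- "$path")
    old=$(basename -- "$path")
    new=$(utf8_name "$src_enc" "$old") || return 1
    # Pure ASCII names are identical in both encodings: nothing to do.
    if [ "$old" = "$new" ]; then
        return 0
    fi
    if [ "${3:-}" = "--apply" ]; then
        mv -- "$dir/$old" "$dir/$new"
    else
        printf 'would rename: %s -> %s\n' "$dir/$old" "$dir/$new"
    fi
}
```

Reporting by default and renaming only on request keeps a mistaken source encoding from damaging names before you have reviewed the result.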

Alternatively, fsexam(1) can be used for this purpose. For more information, see File Examiner (fsexam).

ZFS File System

ZFS is the main file system used in Oracle Solaris 11. Like other file systems, ZFS stores file and directory names in the character set of the locale in use. For names in the UTF-8 character set, the normalization property selects the normalization algorithm that the file system applies when comparing names, so that a single directory cannot contain more than one entry with the same file name.
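The following sketch shows why normalization matters when names are compared. It builds the same visible name, "café", in two Unicode forms and prints the bytes of each. The variable names and the hex helper are hypothetical and exist only for this illustration.

```shell
# "é" as a single precomposed code point (U+00E9), the NFC form.
nfc=$(printf 'caf\303\251')
# "e" followed by a combining acute accent (U+0301), the NFD form.
nfd=$(printf 'cafe\314\201')

# hex STRING   (hypothetical helper): print the bytes of STRING in hex.
hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

hex "$nfc"; echo   # 636166c3a9
hex "$nfd"; echo   # 63616665cc81
```

The two names look identical on screen but differ byte for byte. A file system that compares names without normalization therefore treats them as two different entries.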

If the utf8only property is enabled, the file system rejects file names that include characters not present in the UTF-8 character set.

See the zfs(1M) man page for more information.
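As a sketch of how these properties might be applied, the following functions wrap the zfs create and zfs get commands. The function names are hypothetical, the dataset name you pass in is your own, and formD is only one of the available normalization forms, chosen here for illustration. Both properties can be set only when the file system is created.

```shell
# create_utf8_dataset DATASET   (hypothetical helper)
# Create a file system that accepts only UTF-8 names and compares
# them after normalization. formD is an illustrative choice.
create_utf8_dataset() {
    zfs create -o utf8only=on -o normalization=formD "$1"
}

# show_name_properties DATASET   (hypothetical helper)
# Display the current values of both properties for DATASET.
show_name_properties() {
    zfs get utf8only,normalization "$1"
}
```

Because neither property can be changed later, decide on the values before migrating data into the new file system.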

NFS File System

For more information, see Interoperability with Other Platforms.