International Language Environment Guide     Oracle Solaris 11 Information Library

Interoperability with Other Platforms

The following sections describe certain considerations for multi-platform environments.

NFS Server Considerations

The NFS version 4 protocol (the default in Oracle Solaris) uses UTF-8 to handle file names and other strings, so in most use cases no character set adjustments are necessary. However, if some or all clients use a different character set, the charset option can be used to declare it.

For example, to share the /export directory with clients that use the ISO8859-1 character set, use the following command:

# share -o iso8859-1 /export

To share a directory using a specific character set for some machines only, the charset=access_list option can be used:

# share -o iso8859-1=isomachine.example.com,koi8-r=koimachine.example.com /export

All file and path names created by the clients will be converted to UTF-8 at the server.
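
The effect of this conversion can be illustrated with a short Python sketch. It only shows what re-encoding a name from a client character set to UTF-8 means at the byte level and is not the NFS server's actual implementation; the function name to_server_name and the sample file name are hypothetical.

```python
def to_server_name(client_bytes: bytes, client_charset: str) -> bytes:
    """Re-encode a file name from the client's character set to UTF-8."""
    return client_bytes.decode(client_charset).encode("utf-8")


# "café.txt" as an ISO8859-1 client would send it: é is the single byte 0xE9.
latin1_name = b"caf\xe9.txt"

# After conversion, é is stored as the two-byte UTF-8 sequence 0xC3 0xA9.
utf8_name = to_server_name(latin1_name, "iso8859-1")
print(utf8_name)
```

The same bytes interpreted with the wrong character set would produce a different name, which is why the character set has to be declared per client.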

For more information, see the share_nfs(1M) man page.

File System Considerations

mount_pcfs(1M) does not support the MS-DOS codepages, so non-ASCII characters on FAT file systems created by MS-DOS, very old versions of MS Windows, or the Linux "msdos" driver may be garbled. Later FAT implementations use Unicode for character representation, which Oracle Solaris fully supports by default for both reading and writing.

Archives Containing Non-ASCII Filenames

Archiving files with non-ASCII characters in their names may cause problems, because implementations of the same archive format differ significantly in how they handle non-ASCII file names, although the situation is improving.

Recent tar implementations on UNIX and UNIX-like systems support the archive format specified by POSIX.1-2001, so non-ASCII file names are handled safely. On the MS Windows platform, a number of archival utilities store file names using the current codepage, so the names of files extracted from such archives can become garbled.

In that case, if the codepage is known, the convmv(1) tool can be used to repair the names:

$ convmv -f cp437 -t utf8 my_extracted_filename 
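
What such a repair amounts to can be sketched in Python: the stored bytes are decoded using the original codepage and re-encoded as UTF-8. This is a minimal illustration and not how convmv itself is implemented; the helper names is_utf8 and repair_name and the sample file name are hypothetical, and the codepage (cp437 here) is assumed to be known.

```python
def is_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def repair_name(raw: bytes, codepage: str = "cp437") -> bytes:
    """Return raw re-encoded as UTF-8, assuming it was stored in codepage."""
    if is_utf8(raw):
        return raw  # already valid UTF-8 (this includes plain ASCII)
    return raw.decode(codepage).encode("utf-8")


# Bytes of the name as an archiver using cp437 would have written them.
stored = "résumé.txt".encode("cp437")

# How a UTF-8 environment displays those bytes: invalid bytes become U+FFFD.
garbled = stored.decode("utf-8", errors="replace")

repaired = repair_name(stored).decode("utf-8")
print(garbled)
print(repaired)
```

Leaving names that are already valid UTF-8 untouched is a design choice of this sketch: it avoids converting the same name twice, at the cost of missing the rare codepage name that happens to be valid UTF-8.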

In Zip files, the original specification sets the encoding of file names and file comments to IBM437. In 2007, PKWARE extended the specification to also allow UTF-8. In the meantime, various zip implementations (usually on the MS Windows platform) had adopted the strategy of using the current codepage as the file name encoding.
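
In the extended format, bit 11 (0x800) of an entry's general purpose flags records that the entry's name is stored as UTF-8. The following Python sketch uses the standard zipfile module to build a small in-memory archive and report that flag for each entry. The entry names are made up for illustration, and the behavior shown is that of Python's zipfile module, not of any particular zip utility.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: name and comment are UTF-8

# Write two entries to an in-memory archive: one ASCII name, one non-ASCII.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("plain.txt", "ascii name")
    zf.writestr("résumé.txt", "non-ASCII name")

# Read the archive back and record whether each entry has the UTF-8 flag set.
with zipfile.ZipFile(buf) as zf:
    flags = {info.filename: bool(info.flag_bits & UTF8_FLAG)
             for info in zf.infolist()}

for name, has_flag in flags.items():
    print(name, "UTF-8 flag set" if has_flag else "UTF-8 flag not set")
```

An archive whose non-ASCII entries lack this flag was most likely written with IBM437 or with the creating system's codepage, which is the case where extracted names may need repair.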

Info-ZIP's Zip 3.0, used in Oracle Solaris 10 and Oracle Solaris 11, stores file names in UTF-8, so if both the compression and the decompression utility are of this version, the file names in the archive are not garbled.

When a zip archive that stores file names in a non-UTF-8 encoding is extracted on Oracle Solaris, the file names might be garbled. If the codepage is known, you can use the convmv(1) tool to repair them:

$ convmv -f cp437 -t utf8 my-unzipped-filename