Unicode Support in the Solaris Operating Environment

3.5 Unicode Font Resources

The Unicode Standard, Version 3.0, contains 49,194 characters from the world's scripts, including over 25,000 ideographic characters for Chinese, Japanese, and Korean. The font resources that represent these characters, however, do not always map one to one: some Unicode code points are associated with more than one glyph, so that a given code point can be rendered correctly for its context. For example, the unified Han ideographs are written and displayed differently in Simplified Chinese, Traditional Chinese, Japanese kanji, and Korean hanja.

To manage these difficulties, the Solaris operating environment provides an output method that combines existing fonts into a Unicode font set, rather than supplying a single Unicode font. The Solaris 8 operating environment supports the following range of scripts:

English/European
Greek, Turkish, Cyrillic
Arabic, Hebrew, Thai
Simplified Chinese, Traditional Chinese, Japanese, Korean
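
As a rough illustration of how a font set built from existing fonts might route a character to one of the script groups listed above, the following Python sketch classifies a code point by its Unicode block. The block ranges are those of the Unicode Standard; the group names and the `script_group` function are hypothetical and do not describe the actual Solaris implementation.

```python
# Hypothetical sketch: route a character to a script group by Unicode block.
# The ranges are Unicode block boundaries; the group names are illustrative only.
SCRIPT_RANGES = [
    (0x0000, 0x024F, "latin"),     # Basic Latin through Latin Extended-B
    (0x0370, 0x03FF, "greek"),     # Greek
    (0x0400, 0x04FF, "cyrillic"),  # Cyrillic
    (0x0590, 0x05FF, "hebrew"),    # Hebrew
    (0x0600, 0x06FF, "arabic"),    # Arabic
    (0x0E00, 0x0E7F, "thai"),      # Thai
    (0x3040, 0x30FF, "kana"),      # Hiragana and Katakana
    (0x4E00, 0x9FFF, "han"),       # CJK Unified Ideographs
    (0xAC00, 0xD7AF, "hangul"),    # Hangul Syllables
]


def script_group(ch):
    """Return the illustrative script group for one character, or None."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return None
```

A font set could then keep one component font (or one list of fonts) per group and consult only that entry when rendering a character.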

For European scripts, there is a one-to-one mapping between Unicode characters and corresponding glyphs. For Complex Text Layout language text (Arabic, Hebrew, Thai), the Solaris Universal Multiscript Layout Engine pre-processes the text (right-to-left swapping, contextual analysis, and so on) before rendering the associated glyphs.

For Asian characters, the Solaris output methods dynamically remap the font and glyph index according to the locale definition. Each locale contains a font table whose mapping mechanism specifies which font and glyph to use for each character code. The mechanism remaps Unicode code point values to font and glyph index pairs in existing Chinese, Japanese, and Korean fonts. A locale administrator can define the search priority among fonts. For example, the mechanism may search the Simplified Chinese fonts for the appropriate glyph first, then the Traditional Chinese fonts, and so on.
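
The prioritized search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Solaris font table format: the font names, glyph indices, priority lists, and the `lookup_glyph` function are all hypothetical, and the locale names are used only as dictionary keys.

```python
# Hypothetical font tables: each maps a Unicode code point to a glyph index
# inside an existing (legacy) font. All names and indices are invented.
FONTS = {
    "zh_CN_font": {0x6C49: 101, 0x5B57: 102},
    "zh_TW_font": {0x6F22: 201, 0x5B57: 202},
    "ja_font": {0x6F22: 301, 0x5B57: 302, 0x3042: 303},
}

# Hypothetical per-locale search order, as a locale administrator might define it.
LOCALE_PRIORITY = {
    "zh_CN.UTF-8": ["zh_CN_font", "zh_TW_font", "ja_font"],
    "ja_JP.UTF-8": ["ja_font", "zh_TW_font", "zh_CN_font"],
}


def lookup_glyph(locale, ch):
    """Return (font, glyph_index) from the first font in the locale's
    priority list that covers the character, or None if no font does."""
    cp = ord(ch)
    for font in LOCALE_PRIORITY[locale]:
        glyph_index = FONTS[font].get(cp)
        if glyph_index is not None:
            return font, glyph_index
    return None
```

In this sketch the same code point, U+5B57, resolves to a different font and glyph index depending on the locale, while a character missing from the preferred font falls through to the next font in the list.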