International Language Environments Guide

Multibyte Support Environment

A multibyte character is a character that cannot be stored in a single byte, such as a Chinese, Japanese, or Korean character. Depending on the code set, such a character requires 2, 3, or 4 bytes of storage. A more precise definition can be found in ISO/IEC 9899:1990, subclause 3.13.

Amendment 1 to ISO/IEC 9899:1990, the international standard that corresponds to ANSI C, added new internationalization features, collectively known as the Multibyte Support Environment (MSE). The amendment defines additional internationalization APIs for multibyte code sets that carry shift state and for better wide-character handling.

In this programming model, multibyte characters are read in as logical units and stored internally as wide characters. The program processes these wide characters as logical entities. Finally, the wide characters are written out as logical units, translated back to multibyte form as appropriate.
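The read, process, and write steps of this model can be sketched with the standard conversion functions mbstowcs() and wcstombs(). The helper name mb_upper and the fixed 256-element working buffer are assumptions made for this illustration only. A real program would normally call setlocale(LC_ALL, "") first so that the conversions follow the code set of the user's locale; without that call, the default "C" locale applies.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

#define WBUF_LEN 256   /* illustrative limit, not part of the MSE */

/*
 * Hypothetical helper: uppercase a multibyte string by way of wide characters.
 * Returns 0 on success, or -1 if the input contains an invalid sequence or
 * does not fit in the working or output buffer.
 */
int mb_upper(const char *in, char *out, size_t outsize)
{
    wchar_t wbuf[WBUF_LEN];
    size_t n, m, i;

    /* Step 1: read the multibyte characters in as wide characters. */
    n = mbstowcs(wbuf, in, WBUF_LEN);
    if (n == (size_t)-1 || n >= WBUF_LEN)
        return -1;

    /* Step 2: process each wide character as one logical entity. */
    for (i = 0; i < n; i++)
        wbuf[i] = (wchar_t)towupper((wint_t)wbuf[i]);

    /* Step 3: write the wide characters out in multibyte form. */
    m = wcstombs(out, wbuf, outsize);
    if (m == (size_t)-1 || m >= outsize)
        return -1;

    return 0;
}
```

Because the loop in step 2 indexes wide characters rather than bytes, it never splits a character, whatever the number of bytes that character occupies in the multibyte form.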

This procedure is analogous to the way single-byte characters are read in, manipulated, and written out again. The MSE enables programs to handle multibyte characters using the same programming model that is used for single-byte characters.
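As a rough illustration of this parallel, the loop below has the same shape as the familiar getc()/putc() copy loop, but uses the wide-character functions fgetwc() and fputwc(). The helper name copy_upper is hypothetical, and the sketch assumes that both streams are unoriented or already wide-oriented and that the locale has been set as the program requires.

```c
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: copy one stream to another, uppercasing each character.
 * Returns the number of characters (not bytes) copied, or -1 on a read,
 * write, or encoding error.
 */
long copy_upper(FILE *in, FILE *out)
{
    long count = 0;
    wint_t wc;

    /* Same shape as: while ((c = getc(in)) != EOF) putc(toupper(c), out); */
    while ((wc = fgetwc(in)) != WEOF) {
        if (fputwc((wchar_t)towupper(wc), out) == WEOF)
            return -1;
        count++;
    }

    /* WEOF signals either end of file or an error; tell the two apart. */
    return feof(in) ? count : -1;
}
```

The count returned is in characters rather than bytes, because each call to fgetwc() consumes one complete multibyte character from the stream.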