Bookshelf v7.8: Non-Unicode (Traditional) Character Sets

Non-Unicode (Traditional) Character Sets

Before the emergence of Unicode, non-Unicode (traditional) character sets were available to address storage and processing requirements for a specific language or group of languages.

Examples of non-Unicode character sets are Code Page 1252 for languages spoken in Western European countries as well as in the Americas and elsewhere, and Code Page 932 for the Japanese language.

Because non-Unicode character sets are regional, character data for languages that a character set does not cover cannot be processed in the same environment. Therefore, when a need arises to process data belonging to multiple character sets, customers are forced to provide multiple environments.
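The limitation described above can be illustrated with a minimal Python sketch. The helper name encodable_in and the sample strings are hypothetical and chosen only for illustration. The code page names are the ones Python's standard codecs use; Code Page 1256 (Arabic) is included alongside the two code pages named earlier.

```python
# Hypothetical helper: list the code pages that can encode every
# character of a piece of text. An empty list means that no single
# code page in the candidate list covers the whole text.
CODE_PAGES = ["cp1252", "cp932", "cp1256"]


def encodable_in(text, code_pages=CODE_PAGES):
    usable = []
    for name in code_pages:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no code point in this code page.
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    # German, Japanese, Arabic, and a mixed German/Japanese sample.
    for sample in ["Grüße", "日本語", "سلام", "Grüße 日本語"]:
        print(repr(sample), "->", encodable_in(sample))
```

In this sketch each single-language sample fits in exactly one of the listed code pages, while the mixed sample fits in none of them, which is the situation that forces multiple environments.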

Also, because character sets are expressed in code pages, the numeric representation of a character in one code page may differ from its representation in another code page, and in many cases the character does not exist in the other code page at all.

For example, the letter a-umlaut (ä) in the Western European character set does not exist in the Arabic character set. In a Western European code page, such as 1252 or ISO 8859-1, the a-umlaut occupies code point E4 (hexadecimal). In an Arabic code page, such as 1256 or ISO 8859-6, the E4 code point is an Arabic character and not the a-umlaut. Thus, you cannot represent the a-umlaut character on an Arabic system, or represent the Arabic character on a Western European system.
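The E4 example can be checked with a short Python sketch that decodes the same byte under each of the four code pages named above. The function names describe_byte and can_encode are hypothetical. The codec names are those of Python's standard library, and unicodedata.name is used only to print a readable character name.

```python
import unicodedata

# Python codec names for the code pages discussed in the text.
CODE_PAGES = ["cp1252", "iso8859_1", "cp1256", "iso8859_6"]


def describe_byte(value, code_pages=CODE_PAGES):
    """Map each code page to the Unicode name of the character at `value`."""
    raw = bytes([value])
    return {name: unicodedata.name(raw.decode(name)) for name in code_pages}


def can_encode(char, code_page):
    """Return True if `char` has a code point in `code_page`."""
    try:
        char.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for code_page, char_name in describe_byte(0xE4).items():
        print(f"{code_page:10} 0xE4 -> {char_name}")
    print("a-umlaut in cp1256:", can_encode("ä", "cp1256"))
    print("a-umlaut in iso8859_6:", can_encode("ä", "iso8859_6"))
```

The two Western European code pages report the a-umlaut at E4, while the two Arabic code pages each report an Arabic letter there, and neither Arabic code page can encode the a-umlaut.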

There is a set of characters that is common to most generally used non-Unicode character sets and code pages. These characters are known as the ASCII characters. They include the common characters used in the English language, and they occupy the first 128 code points (Hex 00-7F) in the non-Unicode code pages.
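As a rough check of the shared ASCII range, the following Python sketch decodes each of the first 128 byte values under several code pages and compares the result with plain ASCII. The function name ascii_range_is_shared is hypothetical, and the list of code pages is only a sample of those mentioned in this topic, not an exhaustive survey.

```python
# Python codec names for a sample of the code pages discussed in the text.
CODE_PAGES = ["cp1252", "iso8859_1", "cp1256", "iso8859_6", "cp932"]


def ascii_range_is_shared(code_pages=CODE_PAGES):
    """Return True if byte values 0x00 through 0x7F (128 code points)
    decode to the same characters as ASCII in every listed code page."""
    for value in range(0x80):
        raw = bytes([value])
        expected = raw.decode("ascii")
        for name in code_pages:
            if raw.decode(name) != expected:
                return False
    return True


if __name__ == "__main__":
    print("ASCII range shared:", ascii_range_is_shared())
    # Text limited to ASCII characters produces identical bytes everywhere.
    print({name: "Siebel".encode(name) for name in CODE_PAGES})
```

Text restricted to ASCII characters therefore survives a move between these code pages unchanged. Only characters above this range are at risk.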

NOTE: It is the customer's responsibility to choose a character set that includes the characters required by the customer's business. Because the character set is a property of the database configuration, which the customer performs, Siebel Systems has no control over this setting. Choosing an inappropriate character set may require database reconfiguration later, together with conversion of the large amounts of transaction data that have accumulated in the wrong character set. This is generally a time-consuming and costly process. Character set conversion to Unicode must be done with the assistance of Siebel Expert Services.

For more information, see System Requirements and Supported Platforms on Siebel SupportWeb.

Global Deployment Guide