Overview of Global Deployments

This chapter provides overview information about global deployments of Siebel CRM. It includes the following topics:

Global Deployment Terminology

This guide uses many specialized terms relevant to global deployments of Siebel CRM. This topic contains the following information, which describes some of these terms and related concepts:

General Terminology

This topic is part of Global Deployment Terminology.

The following table identifies and provides brief descriptions for some of the terms relevant to global deployments of Siebel CRM. Additional information is provided in subsequent topics.

Table Global Deployment Terminology

Term	Meaning
Character set	A group of characters (alphanumeric, text, or special) usually associated with one or more languages or scripts (writing systems). There are many character sets used in the computing industry. Character sets are identified by a character set name, such as Western European or Latin 1. These names are not well-standardized and many character sets have multiple names. However, you can use formal identifiers to clearly specify a character set when necessary. See also About Supported Character Sets.
Character set encoding	Also known as character encoding. A specific computer representation of a character set. Some character sets can have multiple encodings. For example, Western European or Latin 1 is encoded differently in ISO 8859-1, 8859-15, ANSI 1252 (Microsoft standard), and EBCDIC. Unicode also comes in different encodings, such as UTF-8 or UTF-16. In general, the differences in encodings are between ISO, ANSI, and EBCDIC. Aside from Unicode, a prominent example of a character set with multiple encodings is JIS (Japan Industrial Standard). Shift-JIS, EUC, and ISO 2022-JP are all encodings based on JIS. As with character set names, industry standardization is minimal and there are multiple names for the same encoding. The character set encoding is also known as a code page or codepage (one word), which often refers to a vendor implementation of a character set encoding. For Microsoft Windows, the term code page is used for ANSI code page (Windows) and OEM code page (DOS), but not for ISO character sets. IBM uses a numbering system which is similar, and often identical, to Microsoft. However, IBM renumbers extensions while Microsoft does not, which can lead to statements like “Use IBM 943 with Siebel applications and MS 932; IBM 932 is an older version.” See also About Character Set Encodings and Siebel Applications.
Code point	A single data value representing a single character in a code page.
Global deployment	The process of installing, configuring, testing, and deploying Siebel CRM in more than one locale and language.
Internationalization	The process of making a software product that can correctly process data for any customer, including data entry and display. Internationalization makes localization possible. For more information, see Internationalization.
Internet Corporation for Assigned Names and Numbers (ICANN)	An Internet body that manages Internet addresses, domain names and protocol parameters. ICANN conventions are used for the World Wide Web, email, and XML.
Language	The language or languages of the Siebel CRM software installed on the system. Language pack is another term for a language that you install with Siebel CRM or the Siebel database, or that you add to an existing deployment. For more information, see Language.
Locale	A set of user preference information related to the user’s language, location, and cultural conventions, including the formatting or presentation style of data such as dates, time, numbers, and currency. For more information, see Locale and About Date, Time, Number, and Currency Formatting.
Localization	The process of readying a product for use in a particular target country through the use of a locale. For more information, see Localization.
Multicurrency support	A feature that allows automatic currency conversion and currency formatting. For more information, see About Date, Time, Number, and Currency Formatting.
Non-Unicode (traditional) character set	Non-Unicode (traditional) character sets refer to character sets other than Unicode and imply that the character set supports a restricted set of characters. Typically, a traditional character set supports the alphabet of a single language or of a collection of languages that use the same or similar alphabets. See also About Supported Character Sets.
Platform	A platform includes the operating system of the various entities of a Siebel CRM deployment (the database, the Siebel Servers, and the clients) and the character set used by these entities. For more information, see Platform.
Script	A system of writing that requires graphical symbols to be placed in a certain order to communicate information. The symbols can be based on an alphabet, on pictures of objects in the world around us, or on some other system. The Roman script (sometimes called the Roman alphabet) is a script that uses 26 symbols to represent sounds made by the human mouth, organized into an alphabet. Originally the script used to write Latin (the language of the Romans), it has been extended with diacritics on many of the characters to express other sounds present in Western European languages. (In a Siebel CRM development context, a very different meaning of the term script is a program written using a language such as Siebel eScript.)
Unicode character set	A character set defined by the Unicode Consortium that is the union of most of the major character sets used in the computing industry. See also About Supported Character Sets.
Universal Time Coordinated (UTC)	Also known as coordinated universal time. A time scale that is defined and recommended by the International Radio Consultative Committee (CCIR) and maintained by the Bureau International des Poids et Mesures (BIPM). The global time zone feature uses UTC in order to provide a standard internal time for the Siebel applications, which can then be adapted to each user’s time zone. UTC allows activities and other events to be scheduled across time zones. For more information, see Deploying Siebel CRM with Global Time Zone.
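
The difference between a character set and its encodings, as described in the table above, can be sketched in a few lines of Python. This is only an illustration: the codec names (`latin-1`, `iso8859-15`, `cp1252`, `cp500`, `shift_jis`, `euc_jp`, `iso2022_jp`) are the names used by the Python standard library, not Siebel CRM settings, and `cp500` is assumed here as one representative EBCDIC code page.

```python
# Encode the same characters with several encodings to show that one
# character set can have more than one byte-level representation.

def encode_with(text, codecs):
    """Return {codec name: encoded bytes} for the given text."""
    return {name: text.encode(name) for name in codecs}

# Western European: two ISO encodings, the ANSI (Windows) code page, and
# one EBCDIC code page. ISO and ANSI agree on this particular character;
# EBCDIC uses a different byte value.
western = encode_with("ä", ["latin-1", "iso8859-15", "cp1252", "cp500"])

# Japanese: three encodings that are all based on JIS.
japanese = encode_with("日本", ["shift_jis", "euc_jp", "iso2022_jp"])

# Unicode: the same character in UTF-8 and in UTF-16 (big-endian, no BOM).
unicode_forms = encode_with("ä", ["utf-8", "utf-16-be"])

for label, table in [("Western", western), ("Japanese", japanese),
                     ("Unicode", unicode_forms)]:
    for codec, data in table.items():
        print(f"{label:9} {codec:11} {data.hex(' ')}")
```

The printed byte values differ from row to row even though each group represents the same characters, which is why the encoding must be stated explicitly whenever data moves between systems.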

Platform

This topic is part of Global Deployment Terminology.

The platform determines what data can be processed in a Siebel CRM deployment. The character set encoding and operating system language of a platform will determine what data can be handled correctly and what data will not be handled correctly by the platform.

This guide uses the term platform in several places to discuss deployment options as well as specific functionality available in Siebel CRM.

Siebel CRM generally does not support mixed character encoding environments. The reason is that it is not technically possible to manage an environment that uses multiple character set encodings on databases and servers without a genuine risk of losing data in the process.

For example, suppose a database is set up with a Western European character set encoding and a user tries to insert Japanese data through a Siebel Server set up for Japanese. Depending on the actual database, the effect could be that the user’s data would be rejected and not stored in the database, or the data could get converted by the database and stored incorrectly as unreadable characters (substitution characters), resulting in loss of the original Japanese data.
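
The two outcomes described above (rejection, or storage as substitution characters) can be imitated with Python's codecs. This is a minimal sketch under stated assumptions: `store_strict` and `store_lossy` are hypothetical helper names, `cp1252` stands in for a Western European database encoding, and real database behavior depends on the vendor and configuration.

```python
# Hypothetical illustration: Japanese text arriving at a store that only
# understands a Western European encoding (cp1252 here).

japanese_text = "東京都"

def store_strict(text, encoding="cp1252"):
    """Reject data that the target encoding cannot represent."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None  # the data is rejected and nothing is stored

def store_lossy(text, encoding="cp1252"):
    """Substitute unrepresentable characters, as a lossy conversion would."""
    return text.encode(encoding, errors="replace")

rejected = store_strict(japanese_text)
converted = store_lossy(japanese_text)

print(rejected)                    # None
print(converted.decode("cp1252"))  # ???
```

In the lossy case the original Japanese characters cannot be recovered from the stored value, because every one of them has been replaced by the same substitution character.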

Field Size Issues

In some cases, a user might try to enter data into the application and receive an error message saying that the language of the text entered “is not compatible with the database language” or that the length of the text entered “is bigger than the corresponding length allocated in the database.”

Such an error might occur when the character set of the data does not match the character set of the database and cannot be converted without data loss. Or, the error might occur when the data string is too long to fit into the database column.

If you have an existing Siebel deployment and upgrade the database default character set to Unicode (any encoding), then all links to other information technology systems must be examined for compatibility. In many cases, data previously encoded in a traditional codepage will now be larger in terms of bytes, and could overflow the fields that are used for transfers. In such cases, Siebel Unicode data values will be truncated, especially if the data value is large enough to approach the maximum defined size of the field.

You must examine the field sizes in tables used by Siebel EAI or Siebel EIM, in any linking or replication software, and in any extension columns that have been added.

Depending on your existing code page, the languages and specific characters representing your data, and the Unicode encoding that you are migrating to, field sizes might need to be enlarged. In some scenarios, field sizes must be doubled.
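
To estimate how much a field might grow, you can compare the byte length of representative values in the current code page and in the target Unicode encoding. The following Python sketch is illustrative only: the sample values, the helper names `byte_lengths` and `fits`, and the codec names are assumptions, and the encoding your database actually uses might differ.

```python
# Compare how many bytes the same value needs before and after a move
# to Unicode. Sample values and codec choices are illustrative only.

def byte_lengths(text, encodings):
    """Return {encoding: number of bytes needed to store text}."""
    return {enc: len(text.encode(enc)) for enc in encodings}

def fits(text, encoding, max_bytes):
    """True if text, once encoded, fits in a field of max_bytes bytes."""
    return len(text.encode(encoding)) <= max_bytes

german = byte_lengths("Müller", ["cp1252", "utf-8", "utf-16-le"])
japanese = byte_lengths("顧客管理", ["cp932", "utf-8", "utf-16-le"])

print(german)    # {'cp1252': 6, 'utf-8': 7, 'utf-16-le': 12}
print(japanese)  # {'cp932': 8, 'utf-8': 12, 'utf-16-le': 8}

# A field sized for the old code page can overflow after migration.
print(fits("顧客管理", "cp932", 8))   # True
print(fits("顧客管理", "utf-8", 8))   # False
```

In this sample, Western European text doubles in size only in UTF-16, while Japanese text grows by half in UTF-8, so the required enlargement depends on both the data and the target encoding.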

Customers can set byte limits on columns in the database. At the user interface level, the input size can be controlled to make sure that this column width is not exceeded. In most cases, there is a 1:1 mapping between bytes and characters.

Field Sizes by Database Platforms

When specifying new field sizes, be careful as to whether they must be given in bytes or in characters. Field size units will vary by the RDBMS vendor:

Oracle Database. Specify field sizes in characters (using character semantics).
IBM DB2. Specify field sizes in bytes.
IBM DB2 for z/OS. Specify field sizes in bytes.
Microsoft SQL Server. Specify field sizes in bytes.

Note: For more information about using IBM DB2 for z/OS, see Implementing Siebel Business Applications on DB2 for z/OS and Siebel Database Upgrade Guide for DB2 for z/OS.

Considerations for Deployments Using Code Page 932 (Shift-JIS)

The following are special considerations for Japanese-language deployments using Shift-JIS, which is also known as Code Page 932 (or 943 on IBM DB2, or JA16SJIS on Oracle Database). It is assumed that the client computer uses the same code page as the database.

Some Japanese characters require two bytes per character. However, user input can combine single-byte and double-byte characters. Because there is no direct correlation between the length limit in characters and the limit in bytes for double-byte languages, it is impossible to provide direct validation of byte limits. For validation purposes, assume that all characters typed are double-byte. This assumption can leave unused space in fields when only single-byte characters are entered. Custom validation based on analysis of the input string might be able to solve this problem, if a customer chooses to implement it.
If your deployment is migrating from Code Page 932 (Shift-JIS) to Unicode, then you must be careful with characters entered on a system where the data is stored in Code Page 932.
Over 300 characters present in Code Page 932 have only one representation in Unicode, so when these characters are moved to a Unicode system, they are converted permanently to the new value used in Unicode. Because of this conversion, users will see that the character they originally entered has been slightly changed, but the meaning should be the same as before.
An example is the replacement of the WAVE DASH character by the FULLWIDTH TILDE character, often used in expressing appointment times. There is no correction for this situation, because it is a result of the design of both the Code Page 932 and Unicode character sets, which are industry standards.

Limitation for Arabic and Numeric Fields

Although the Arabic language is supported, Arabic digits cannot be used in numeric fields in Siebel CRM.
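
If numeric input might arrive containing such digits, an integration layer or custom validation could detect or normalize them before the value reaches a numeric field. The following Python sketch assumes that "Arabic digits" refers to the Arabic-Indic digits U+0660 through U+0669. The helper names are hypothetical, and nothing here is a Siebel CRM feature.

```python
import unicodedata

def has_non_ascii_digits(value):
    """True if value contains a decimal digit other than ASCII 0-9."""
    return any(ch.isdecimal() and not ch.isascii() for ch in value)

def to_ascii_digits(value):
    """Replace every Unicode decimal digit with its ASCII equivalent."""
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch
        for ch in value
    )

arabic_indic = "\u0661\u0662\u0663"  # Arabic-Indic digits one, two, three

print(has_non_ascii_digits(arabic_indic))  # True
print(has_non_ascii_digits("123"))         # False
print(to_ascii_digits(arabic_indic))       # 123
```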

Language

This topic is part of Global Deployment Terminology.

The language for a Siebel application can mean multiple things, and might involve different system or application elements. These elements are independent from the language of the data that the user enters in the Siebel database. You must install seed data according to how you want to use languages in the Siebel applications.

Primary Language, Active Language, and Resource Language

The term language has the following major meanings with respect to Siebel CRM:

The primary language (sometimes called the base language) is the first language installed for this Siebel product installation, particularly the first language installed in the Siebel database.

Note: In general, the term primary language refers to the first language installed in the Siebel database. Sometimes this phrase also refers to the language for Siebel Enterprise Server messages and logging, which is specified during installation and initial configuration. For most deployments, these would be the same language. See also the Siebel Installation Guide for the operating system you are using.
The active language is the language in effect for an individual user’s session and the language of user interface elements, including multilingual lists of values (MLOVs) that have been enabled. The same language is used for system messages (if the resource language is not separately defined).
- For a Siebel Web Client session, the language is determined by the Application Object Manager component that is invoked through the specified URL. This language cannot be changed by the user, except by logging in to a different language-specific Application Object Manager where the language is installed and available. The URL includes the language code.
- For the Siebel Mobile Web Client or Developer Web Client, the user can explicitly specify the language to use for an application session by double-clicking the corresponding shortcut, where the language is installed and available.
The resource language, if it is defined, is used as the default language for system messages.

Languages and Siebel Installations and Upgrades

Installing Siebel Language Packs on the Siebel Server (or on other Siebel components) installs the language-specific run-time environment: resource libraries such as DLL files, configuration (CFG) files, error messages, help files, and so on. In general, you install the same languages on all of the components in your Siebel Enterprise. For more information, see Scenarios for Installing and Deploying Siebel Languages.

When you install the Siebel database for a new installation, language-specific seed data is added for the primary language only. For multilingual deployments, you must add seed data for additional languages separately after your initial Siebel database installation. The Siebel database also includes the Siebel runtime repository.

When you upgrade the Siebel database from a prior Siebel version, all of the existing languages are upgraded at the same time. For more information, see Siebel Database Upgrade Guide.

The languages allowed in data are constrained only by the character encoding of the database platform. For example, although a user might be using a U.S. English version of a Siebel application with a Western European code page database, the user can enter or view contact data in French, because all French characters are representable in the Western European code page.

With a Unicode code page, and appropriate fonts locally installed, languages using dissimilar scripts, such as French and Japanese, can be used together.

Language Codes

Each language used by Siebel CRM is identified by a three-letter code, such as ENU for U.S. English, FRA for French, THA for Thai, and so on. Two-letter language codes do not work and are not supported.
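
A script that prepares configuration values might check the shape of a language code before using it. The following Python sketch is a hypothetical helper, not a Siebel CRM API. It checks only that the code has three letters, and the small lookup table contains only codes mentioned in this guide, not the full supported list.

```python
import re

# Codes mentioned in this guide; the complete list is on My Oracle Support.
KNOWN_CODES = {
    "ENU": "U.S. English",
    "FRA": "French",
    "THA": "Thai",
    "DEU": "German",
    "JPN": "Japanese",
}

_THREE_LETTERS = re.compile(r"[A-Za-z]{3}")

def is_well_formed(code):
    """True if code consists of exactly three letters."""
    return _THREE_LETTERS.fullmatch(code) is not None

def describe(code):
    """Return the language name for a known code, or None."""
    return KNOWN_CODES.get(code.upper())

print(is_well_formed("ENU"))  # True
print(is_well_formed("EN"))   # False
print(describe("fra"))        # French
```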

Note: For a list of the languages supported by Siebel CRM, and their language codes, see 1513102.1 (Article ID) on My Oracle Support. They are also listed in the second table in Configuring Cascading Style Sheets to Specify Different Fonts. Special requirements apply when you localize an unshipped language, as noted in Localizing an Unshipped Language.

Languages and Application Development and Deployment

For application development using Siebel Tools or Siebel Web Tools, you set the language mode to work with object definitions for a particular language. For information about the language mode in Siebel Tools or Siebel Web Tools and how to set it, see Using Siebel Tools. The Siebel runtime repository includes language-specific content.

Although one Application Object Manager component can support only one language, multiple Application Object Managers can run at the same time on the same Siebel Server, each configured for a different language.

Additional Information

The following topics contain more information about language deployment for Siebel applications:

Scenarios for Installing and Deploying Siebel Languages
About Parameters for Language and Locale
Creating Language and Locale Records
Siebel Installation Guide for the operating system you are using
For Siebel language support, Unicode support, and legacy code page support, see 1513102.1 (Article ID) on My Oracle Support

Locale

This topic is part of Global Deployment Terminology.

A locale is based on the language, country (territory), and character set applicable to a particular place or region. Siebel CRM cannot control the character set supported by the database and does not directly support the concept of a country, so the locale is primarily based on the language. Organization is sometimes used as a proxy for country.

Locales are defined in the Siebel seed data and associated with Application Object Manager components, as are languages. A Siebel locale includes a collection of user profile inputs, including keyboard layout and the formats used for numbers, dates, currencies, and times. Bidirectionality is also a function of locale, as noted in Verifying Bidirectional Capability.

The Siebel Web Client adopts the locale settings in effect for the Application Object Manager component on the Siebel Server.

The Siebel Mobile Web Client and Developer Web Client adopt the locale settings defined in the client operating system’s regional settings.

For more information about locales defined in Siebel CRM, see Siebel Applications Administration Guide.

Types of Locales

Different types of locales are described as follows:

User locale. The current language and country settings active for this session.
You can set a locale to provide data to users in their native format, including the formatting of numeric information such as numbers, times, dates, and currencies. Typically, user locales contain the symbols for the thousand separator, decimal point, negative number representation, time separator, short date format, long date format, and currency symbols. A country specification is often used to select default values for user locale settings.
Both the Siebel database and the Siebel applications have locale settings, which are independent of the operating system (except for the Siebel Mobile Web Client and Developer Web Client).
Input locale. The current language used for entering data from the keyboard.
The input locale affects the layout of keys on the keyboard, and for some languages, the way in which those key entries are then processed before the application enters the data into the current form on the screen. The input locale describes the language being entered and the input method, which could be a particular keyboard layout or a speech-to-text converter.
Keyboard layout is a defined input locale that correlates the keys on the keyboard to their subsequent character definition mapping within the code page of the operating system.
An input method editor (IME) allows you to enter complex characters, such as those in Asian languages, directly from the keyboard. For information about setting the IME mode on applet controls and list columns, see Configuring Siebel Business Applications.
System locale. If you are using a Microsoft Windows operating system, then the system locale is a systemwide setting that designates which code page is used as the default for all of the users on the system. If you are using a UNIX operating system, then the settings for formatting and code page locales are not systemwide. These code pages and fonts allow non-Unicode applications to run as they would on a system localized to the language of the system locale.
For more information about specifying the system locale on UNIX, see Siebel Installation Guide for UNIX.

Note: If you are using a Windows operating system, then you must restart the system after changing a system locale.

Locale Usage

You can use locale rules to vary the appearance of data for different regions of your implementation. Typically, this data would include dates and times, numbers, and currencies.

For example, the date and time 4:30 in the afternoon on May 9, 2018 can appear differently depending on the locale. It might appear as:

05/09/2018, 04:30 PM, if the locale used is U.S. English (ENU).
09.05.2018, 16:30, if the locale used is German (DEU).
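
The two renderings above can be reproduced with explicit format patterns. The following Python sketch is a simplified illustration: the `PATTERNS` table and `format_for_locale` are hypothetical, the patterns use Python's `strftime` syntax rather than Siebel locale settings, and the PM marker assumes Python's default C time locale.

```python
from datetime import datetime

# Hypothetical per-locale patterns in Python strftime syntax.
PATTERNS = {
    "ENU": "%m/%d/%Y, %I:%M %p",  # month first, 12-hour clock
    "DEU": "%d.%m.%Y, %H:%M",     # day first, 24-hour clock
}

def format_for_locale(moment, locale_code):
    """Render moment using the pattern registered for locale_code."""
    return moment.strftime(PATTERNS[locale_code])

moment = datetime(2018, 5, 9, 16, 30)

print(format_for_locale(moment, "ENU"))  # 05/09/2018, 04:30 PM
print(format_for_locale(moment, "DEU"))  # 09.05.2018, 16:30
```

The stored value is the same in both cases. Only the presentation changes with the locale.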

Locales specify thousand separators and decimal symbols for numbers. They determine the position of the currency symbol in relation to the currency amount.

Locales also guide what characters are available through the computer keyboard. Users can remap their keyboards through the locale setting to get access to additional characters when typing.

Internationalization

This topic is part of Global Deployment Terminology.

Internationalization includes designing software to handle and display data, such as text, diagrams, and numbers, according to the orthography or rules of the language as used in a particular locale. Internationalization is often abbreviated as I18N (there are 18 characters between the initial I and the terminal N).

The software might have to input, display, and print characters, sort text, recognize numbers and dates in different formats, and display and print text right-to-left as well as left-to-right. Therefore, certain engineering features must be incorporated into the code to handle these requirements.

Developing an internationalized program means that the feature and code designs do not make assumptions based on a single language or locale and that the source code base simplifies the creation of different language editions of a program.

Internationalization Features

Some aspects of internationalization include:

A base version enabled for international environments
Localizable items separated from the core functionality on which they are running
Software that takes advantage of supporting platforms, such as the Windows operating systems and the database platform the software is running on

Your Siebel applications have been internationalized. Specific features include:

A base version, enabled for international environments
Support for localization built into the data model
Support for separate language-specific modules (where necessary)
For example, some resource library files (such as DLL files) are language-independent, while other such files are language-dependent. In general, language-dependent files are located in language-specific installation directories.
Euro (€) currency support
String, number, and date handling
Support for multilingual user data, such as:
- Multilingual picklists (MLOV seed data)
- Multilingual data for product- and catalog-related entities
Support for major Unicode and non-Unicode (traditional) character sets
For a list of the languages supported by Siebel CRM, and the supported code pages for each database, see 1513102.1 (Article ID) on My Oracle Support.
The ability to support both left-to-right and right-to-left displays, referred to as bidirectionality

Localization

This topic is part of Global Deployment Terminology.

Localization is the process of readying a product for use in a particular target country. Localization is often abbreviated as L10N, because there are 10 characters between the initial L and the terminal N. (The product must have been internationalized, or else most localization cannot be performed.)

Localization tasks are described in Localizing Global Deployments.

Siebel applications are localized as required by the customer base. Local language releases are translated and elements of the user interface, including buttons, error messages, and log files, are configured to meet local requirements.

The features that make the product internationalized are part of the software architecture; they do not require a special version of the product.

Customers must perform certain tasks to complete localization. The necessary tasks might vary according to the language requirements. For example, implementing any language that displays using a right-to-left directionality, such as Arabic or Hebrew, requires a particular set of tasks.

General Activities for Localization

Localization consists of these general activities:

Translation. Taking all of the applicable strings that appear on-screen in the application user interface and translating them into the language used in the target country.
Adaptation. The process of making sure the product is suitable for use in the target country. Example activities are:
- Modifying the user interface to display language-specific elements, for example, hiding or displaying fields or modifying the position, height, and width of controls to accommodate the target language. For example, if a target country does not have a governmental equivalent to a state, then the State field might be hidden for the target country.
- Modifying images used in the application to those appropriate for the target country.
- Ensuring that the default configuration for the target country includes the right date format, currency, address format, salutations, names of provinces or states, and so on. User interface labels and master data might need to be modified.
  For example, a U.S.-specific term like SSN (Social Security number) is not translatable, but might be replaced with an equivalent term for the target country, such as national ID number.
  For another example, the State field is prepopulated with the names of the U.S. states. These values are incorrect in other countries that have states (or equivalent), such as Mexico and Brazil. Where applicable, replace the LOV containing state names with the list of states (or equivalent) for the target country.
  Addresses use a single format for each language, and there are more than 400 address applets across the applications. For each supported language, Siebel CRM predefines the address formats for the target country.
  For example, the address format for France is used with the French language pack. French-speaking users in Canada will find that this is the wrong address format, so you will likely want to change it. Similarly, the U.S. address format, used for the ENU language pack, is incorrect for English-speaking users outside of the U.S.
- Changing from a left-to-right display to a right-to-left display. (The ability to support both left-to-right and right-to-left displays, referred to as bidirectionality, is an internationalization feature.)
- Defining and implementing access control mechanisms that are appropriate for the users in the target country and the data they work with. Data might need to be visible in multiple countries or visible only in particular countries.

Localization Example

A Siebel CRM localization example for Japanese (JPN) is shown in the following figure. In this example, the localization that was done is part of the standard product.

Example of Localized User Interface - Japanese (JPN)

About Date, Time, Number, and Currency Formatting

Siebel CRM supports formatting of data such as dates and time, numbers, phone numbers, and currency, based on locale settings. More information about formatting for these types of data can be found in Siebel Applications Administration Guide.

The following are examples of different formats based on locales:

Date and time
- 11/4/2018 or 3/21/2018 (U.S. English format, for November 4, 2018 or March 21, 2018)
- 04.11.2018 (German format, for November 4, 2018)
Number
- 1,234.34 (U.S. format, with a comma as the digit grouping symbol and a period for the decimal symbol)
- 1 234,34 (French format, with a space as the digit grouping symbol and a comma for the decimal symbol)
- 1.234,34 (German format, with a period as the digit grouping symbol and a comma for the decimal symbol)
Phone number
- +33 1-23 42 34 56 (French phone number, as shown in U.S. regional settings)
- (415) 295-5000 (U.S. phone number, as shown in U.S. regional settings)
Currency
- $32.45 (U.S. format, with U.S. dollar currency symbol in front of the amount)
- 99.40 kr (Swedish format, with Krona currency symbol behind the amount)
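
The number and currency examples above differ only in the grouping symbol, the decimal symbol, and the position of the currency symbol. The following Python sketch makes those three choices explicit parameters. The function names are hypothetical, and in Siebel CRM these values come from locale settings rather than from code.

```python
def format_number(value, group_sep, decimal_sep, places=2):
    """Format value with the given grouping and decimal symbols."""
    text = f"{value:,.{places}f}"  # for example 1,234.34
    # Swap the symbols through a placeholder so they do not collide.
    return (text.replace(",", "\x00")
                .replace(".", decimal_sep)
                .replace("\x00", group_sep))

def format_currency(value, symbol, symbol_first, group_sep, decimal_sep):
    """Place the currency symbol before or after the formatted amount."""
    amount = format_number(value, group_sep, decimal_sep)
    return f"{symbol}{amount}" if symbol_first else f"{amount} {symbol}"

print(format_number(1234.34, ",", "."))             # 1,234.34
print(format_number(1234.34, " ", ","))             # 1 234,34
print(format_number(1234.34, ".", ","))             # 1.234,34
print(format_currency(32.45, "$", True, ",", "."))  # $32.45
print(format_currency(99.40, "kr", False, ",", "."))  # 99.40 kr
```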

Application handling of multicurrency transactions for multinational businesses includes automatic currency conversion, with full euro support. Siebel CRM allows you to conduct currency transactions using multiple currencies, and to define additional currencies as needed. Currencies are converted as needed within the application, such as when rolling up forecasts.

For information about administering currency conversion, see Siebel Applications Administration Guide. For information about configuring dual-currency display, see Configuring Siebel Business Applications.

About Supported Character Sets

This topic provides information about the non-Unicode (traditional) and Unicode character sets supported for Siebel CRM. It contains the following information:

In this guide, the terms character set and code page are used to cover closely related concepts used by the various platform vendors.

Note: Siebel CRM does not support any character that has been added to a font by mapping it to an open code point that is not within an official character set extension area, such as the Private Use Area (PUA) of Unicode.

Non-Unicode (Traditional) Character Sets

This topic is part of About Supported Character Sets.

Before the emergence of Unicode, non-Unicode (traditional) character sets were available to address storage and processing requirements for a specific language or group of languages.

Examples of non-Unicode character sets are Code Page 1252 for languages spoken in Western European countries as well as in the Americas and elsewhere, and Code Page 932 for the Japanese language.

Because of the regional aspect of non-Unicode character sets, character data for languages that are not part of the character set cannot be processed in the same environment. Therefore, when a need to process data belonging to multiple character sets arises, customers are forced to provide multiple environments.

Also, because character sets are expressed in code pages, the numeric representation of a character in one code page might be different from the representation in another code page, and often the character does not even exist.

For example, the letter a-umlaut (ä) in the Western European character set does not exist in the Arabic character set. In a Western European code page, such as 1252 or ISO 8859-1, the a-umlaut occupies code point E4 (Hex value). In an Arabic code page, such as 1256 or ISO 8859-6, the E4 code point is an Arabic character and not the a-umlaut. Thus, you cannot represent the a-umlaut character on an Arabic system, or represent the Arabic character in a Western European system.
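
The a-umlaut example can be checked directly with Python's codecs. In this sketch, `cp1252`, `latin-1`, `cp1256`, and `iso8859_6` are Python codec names for the four code pages mentioned above, and `can_encode` is a hypothetical helper.

```python
import unicodedata

byte_e4 = b"\xe4"

# Decode the same byte value under each code page.
for codec in ["cp1252", "latin-1", "cp1256", "iso8859_6"]:
    ch = byte_e4.decode(codec)
    print(f"{codec:10} U+{ord(ch):04X} {unicodedata.name(ch)}")

def can_encode(text, codec):
    """True if every character of text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

print(can_encode("ä", "cp1252"))     # True
print(can_encode("ä", "iso8859_6"))  # False
```

The byte value is valid in all four code pages, so a system that interprets it with the wrong code page does not raise an error. It silently displays a different character.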

There is a set of characters that are common to most generally used non-Unicode character sets and code pages. These characters are known as the ASCII characters. They include the common characters used in the English language and they occupy the first 128 code points (Hex 00-7F) in the non-Unicode code pages. For a list of the languages supported by Siebel CRM, and the supported code pages for each database, see 1513102.1 (Article ID) on My Oracle Support.
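
As a quick check of the shared ASCII range, the following Python sketch decodes the first 128 byte values under several code pages and compares the results. The list of codecs is an illustrative selection of Python codec names, not a list of code pages supported by Siebel CRM.

```python
# Byte values hex 00 through 7F, decoded as plain ASCII for reference.
ascii_bytes = bytes(range(128))
reference = ascii_bytes.decode("ascii")

codecs_to_check = ["cp1252", "cp1256", "latin-1", "iso8859_6", "cp932", "utf-8"]

# For each code page, record whether the ASCII range decodes identically.
shared = {name: ascii_bytes.decode(name) == reference
          for name in codecs_to_check}

print(shared)
```

Data that uses only these characters, such as unaccented English text, survives a move between any of these code pages unchanged.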

Note: It is the customer’s responsibility to choose a supported character set that includes the characters required by the customer’s business. Because the character set is a property of database configuration performed by the customer, Siebel CRM has no control over this setting. Choosing an inappropriate character set might require database reconfiguration later, and a corresponding need to convert large amounts of transaction data that has built up in the wrong character set. Converting transaction data is generally a time-consuming and costly experience. For help with character set conversion to Unicode, you must engage Oracle’s Application Expert Services. Contact your Oracle sales representative for Oracle Advanced Customer Services to request assistance from Oracle’s Application Expert Services.

Unicode Character Set

This topic is part of About Supported Character Sets.

To meet the needs of global operations, a number of software and hardware providers started the Unicode Consortium and created a Unicode standard during the 1990s. The repertoire of this international character code for information processing includes characters for the major scripts of the world, as well as technical symbols in common use. The Unicode code space is organized as 17 planes of 65,536 code points each, for a total of 1,114,112 code points. Unicode character encoding treats alphabetic characters, ideographic characters, such as Kanji, and symbols identically, which means that they can be used in any mixture with equal facility.

The original Unicode standard (1.0) defined a 16-bit entity as the basic unit to represent a character. This standard became the basis of the UCS-2 encoding of Unicode, which specifies 16 bits for each character, regardless of which language it might represent.

However, the UCS-2 standard considered 8 consecutive bits of zero value to be valid data, which has a different meaning to programs written in C, where it means the end of string. Because most Web and communications software was written in C at the time the Unicode standard was introduced, an alternative encoding of Unicode called UTF-8 became popular. It encodes exactly the same set of characters, but avoids the null byte problem. To do this, it represents data in variable amounts: 1, 2, or 3 bytes in length, depending on the character.
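The null byte problem can be seen in a short Python sketch. Python has no codec named UCS-2, so the sketch assumes the utf-16-be codec as a stand-in, which produces the same bytes as big-endian UCS-2 for these characters. The function c_string_view() is a hypothetical helper that mimics a C routine stopping at the first zero byte.

```python
text = "Siebel"

# 16 bits for each character: every ASCII character gains a zero byte.
ucs2_like = text.encode("utf-16-be")
# UTF-8: ASCII characters stay as single non-zero bytes.
utf8 = text.encode("utf-8")

print(ucs2_like.hex(" "))  # 00 53 00 69 00 65 00 62 00 65 00 6c
print(utf8.hex(" "))       # 53 69 65 62 65 6c


def c_string_view(data: bytes) -> bytes:
    """Return the bytes a C string function would see (up to the first zero byte)."""
    return data.split(b"\x00", 1)[0]


print(c_string_view(ucs2_like))  # b'' because the first byte is already zero
print(c_string_view(utf8))       # b'Siebel'
```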

Today the Unicode standard has advanced further, and has defined an extension mechanism to encode more than 16 bits worth of information. The 16-bit encoding that includes this mechanism is referred to as UTF-16. UTF-8 has remained popular for Web use, and now allows a fourth byte in order to address characters defined through the Unicode extension mechanism. Today there are two forms of Unicode in active use, UTF-16 and UTF-8, and Siebel CRM uses both of them.

For more information about Unicode, see the Web site of the Unicode Consortium:

http://www.unicode.org

For more information about databases supported by Siebel CRM, see the Certifications tab on My Oracle Support. For a list of the languages supported by Siebel CRM, and the supported code pages for each database, see 1513102.1 (Article ID) on My Oracle Support.

UCS-2

UCS-2 stands for Universal Character Set - 2 Bytes. In this standard, all characters are represented by two bytes (16 bits), no matter the origin.

UTF-8

UTF-8 stands for Unicode Transformation Format, 8-bit Encoding. UTF-8 is an encoding of Unicode which is more efficient for storage of English (ASCII), whereas other language data is expanded and can be represented by up to four bytes.

For example, English (ASCII) characters use one byte for each character, accented European characters use two bytes, and Asian languages use three bytes for each character.
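The byte counts can be confirmed with a short Python sketch. The four sample characters are arbitrary choices made for illustration, written as escapes so that the listing itself stays ASCII.

```python
# Sample characters: label -> character (arbitrary illustrative choices).
samples = {
    "ASCII letter a": "a",
    "a-umlaut": "\u00e4",
    "Kanji (U+65E5)": "\u65e5",
    "supplementary character (U+1F600)": "\U0001F600",
}

# Number of bytes each character occupies in UTF-8.
utf8_lengths = {label: len(ch.encode("utf-8")) for label, ch in samples.items()}

for label, length in utf8_lengths.items():
    print(f"{label}: {length} byte(s)")
```

The fourth entry lies outside the original 16-bit range, which is why it needs the four-byte form.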

UTF-16

UTF-16 replaces the original UCS-2. UTF-16 can represent approximately 63,000 characters as single 16-bit units (65,536 code values less the 2,048 values reserved for surrogates) and approximately one million additional characters (1,048,576) through a mechanism known as surrogate pairs.

For surrogate pairs, two ranges of Unicode code values are reserved for the high (first) and low (second) values of these pairs. High values are from 0xD800 to 0xDBFF, and low values are from 0xDC00 to 0xDFFF. The number of characters requiring surrogate pairs is fairly limited, because the most common characters have already been included in the first 64,000 values.
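The arithmetic behind surrogate pairs can be sketched as follows. The function name to_surrogate_pair() is hypothetical. The calculation follows the UTF-16 definition: subtract 0x10000 from the code point, then place the upper 10 bits of the result in the high value and the lower 10 bits in the low value.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into (high, low) surrogate values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not require a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # upper 10 bits -> 0xD800..0xDBFF
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits -> 0xDC00..0xDFFF
    return high, low


high, low = to_surrogate_pair(0x1F600)
print(f"U+1F600 -> {high:04X} {low:04X}")  # U+1F600 -> D83D DE00

# Cross-check against Python's own UTF-16 encoder.
encoded = chr(0x1F600).encode("utf-16-be")
print(encoded.hex(" "))  # d8 3d de 00
```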

About Character Set Encodings and Siebel Applications

Character set encodings are used in multiple places in Siebel CRM. See also About Supported Character Sets.

Enterprise DB Server Code Page system preference. This system preference is set during Siebel Enterprise Server installation and configuration to reflect the character set that the administrator believes has been set up in the database. This value must not be modified, because it is used at configuration time to select the correct database schema to be used. (Siebel CRM provides customized schemas to match each database and character set.)
For more information, see the Siebel Installation Guide for the operating system you are using.
SIEBEL_CODEPAGE (UNIX environment variable). This environment variable is created and set for Siebel CRM to indicate the code page that the applications will assume if Siebel configuration files (CFG files, CSS files, and so on) have not been saved as Unicode UTF-8, as they would normally be saved. This variable generally does not need to be set explicitly. If you must set it, then the value can be a subset of character set encodings, except UTF-8 and UTF-16.
For more information, see Siebel Installation Guide for UNIX.
Character conversion argument. This argument is available in the following business services:
- Transcode Service business service. Accepts all of the supported character set encoding names. This business service is normally used for data validation and to prevent data that cannot be converted to the appropriate code page from entering or leaving the Siebel application. For more information, see About the Transcode Service Business Service.
  
  Note: Whenever possible, use EAI business services such as the XML Converter business service to convert data.
- EAI business services. These business services accept a variety of character set encodings.

When business services are invoked from a workflow, the valid set of encodings is controlled by a picklist. If the business services are invoked through scripting or a similar mechanism, then the character set name is supplied textually.
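The kind of validation described above, which prevents data that cannot be converted to a target code page from entering or leaving an application, can be illustrated outside of Siebel CRM with a short Python sketch. This is not the Transcode Service API. The function unconvertible() is a hypothetical name, and the encoding names are Python codec names, which are assumptions that might not match the Siebel character set encoding names.

```python
from typing import List, Tuple


def unconvertible(text: str, encoding: str) -> List[Tuple[int, str]]:
    """Return (index, 'U+XXXX') for each character the target encoding lacks."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, f"U+{ord(char):04X}"))
    return problems


# A name followed by a space and the euro sign (U+20AC).
value = "M\u00fcller \u20ac"

print(unconvertible(value, "cp1252"))   # [] because cp1252 includes the euro sign
print(unconvertible(value, "latin_1"))  # [(7, 'U+20AC')] because ISO 8859-1 does not
```

A caller could reject or flag any value for which the returned list is not empty.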

Updating Currency Symbols

In some situations, you might need to update your currency symbols. For example, if you are operating in a Unicode environment, but your currency seed data was originally installed in a non-Unicode environment, then you must update your currencies to include any currency symbols that you require that were not part of your prior non-Unicode environment.

For information about activating and defining currencies, see Siebel Applications Administration Guide.

About the Database Collation Sequence

The collation sequence for your database, also called sort order, is the ordering relationship, or sequence, between data records. Each database has a collation sequence so that records returned by queries can be returned in a certain order, such as an alphabetic order for text strings. The collation sequence determines the order in which records are displayed in the Siebel client, most noticeably in list views.

A collation sequence is defined when you set up the Siebel database. All of the sorting is done in the Siebel database by the database server. Sorting is not set or performed within the Siebel application and does not depend on the operating system.

Note: For more information about creating and configuring the Siebel database, see the Siebel Installation Guide for the operating system you are using. For more information about collation sequences for upgrade environments, see Siebel Database Upgrade Guide.

For the collation sequences supported for each supported RDBMS platform for the Siebel database, see the Certifications tab on My Oracle Support. For a list of the languages supported by Siebel CRM, and the supported code pages for each database, see 1513102.1 (Article ID) on My Oracle Support. Also consult your RDBMS vendor documentation.

The collation sequence in effect for a database is determined by one of the following implementation methods:

Indexes created in the Siebel database provide a default collation sequence. In Oracle Database, indexes always use binary collation sequence.
Post-query sorting might also be supported for an RDBMS platform. However, this method of sorting yields slower performance and requires all of the records to have been retrieved first. For this reason, it is impractical for Siebel applications, which always perform open-ended queries.

For the development environment, only binary collation sequence is supported. For a production environment, you can specify the collation sequence most suitable for your deployment.

Note: Changing the collation sequence after the Siebel database has been installed requires rebuilding your indexes. On a fully loaded production database, this task is time-consuming and database resource-intensive. For help with planning a project of this complexity, Oracle recommends that you engage Oracle’s Application Expert Services. Contact your Oracle sales representative for Oracle Advanced Customer Services to request assistance from Oracle’s Application Expert Services.

Which collation sequence is best for your deployment depends on factors such as RDBMS support, performance requirements, database availability requirements, the code page in use, the needs of your users, and the nature of the data that is to be retrieved by different groups of users.

Binary collation sequence offers the best performance and does not require you to rebuild your indexes for the production environment. This collation sequence works well for users working with English-language data, because the ASCII character set is based on the English alphabet and corresponds to the binary collation sequence. However, sorting might be unsuitable for users and data in languages other than English.
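The effect of a binary collation sequence on mixed data can be approximated with a short Python sketch that sorts strings by their UTF-8 byte values. This only imitates a binary sort order. The actual sorting in a Siebel deployment is performed by the database server, and the sample words are arbitrary.

```python
# Sample values; "\u00c4pfel" begins with A-umlaut.
words = ["zebra", "Apple", "apple", "\u00c4pfel", "Zoo"]

# Binary order: compare the raw encoded bytes, as a binary collation would.
binary_order = sorted(words, key=lambda word: word.encode("utf-8"))

# Result: ['Apple', 'Zoo', 'apple', 'zebra', '\u00c4pfel']
# All uppercase ASCII letters sort before all lowercase ASCII letters, and the
# accented word sorts last, which is rarely what users of such data expect.
print([ascii(word) for word in binary_order])
```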

For multilingual deployments using Unicode, a linguistic collation sequence based on the Unicode Collation Algorithm (UCA), which goes by different names for different RDBMS vendors, might be a suitable collation sequence. UCA, which is aligned with the ISO/IEC 14651 standard, provides reasonably good results with mixed-language data.

Other linguistic, or dictionary, collation sequences might offer optimal sorting results for particular languages or groups of languages. Such collation sequences might be suitable for certain deployments, such as those requiring compatibility with the CP932 (Japanese Shift-JIS) sort order.

Linguistic collation sequences that are not based on UCA might associate multiple characters (such as accented and unaccented versions of a particular letter) so they will be treated the same for sorting purposes, but will also be treated the same in unique indexes. If you are changing to a case- or accent-insensitive collation sequence, then you will need to first clean out any data that is unique only due to a case or accent difference.
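Before changing to a case- or accent-insensitive collation sequence, it can help to identify values that differ only by case or accent. The following Python sketch shows one possible approach on an in-memory list. The functions fold() and find_collisions() are hypothetical names, and the folding rule (Unicode decomposition, removal of combining marks, then case folding) is only an approximation. The actual equivalence rules are defined by the collation sequence of your RDBMS.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List


def fold(value: str) -> str:
    """Approximate a case- and accent-insensitive comparison key."""
    decomposed = unicodedata.normalize("NFD", value)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold()


def find_collisions(values: Iterable[str]) -> Dict[str, List[str]]:
    """Group values that would become duplicates under the folded key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for value in values:
        groups[fold(value)].append(value)
    return {key: group for key, group in groups.items() if len(group) > 1}


names = ["Muller", "M\u00fcller", "MULLER", "Smith"]
collisions = find_collisions(names)
print({key: [ascii(v) for v in group] for key, group in collisions.items()})
```

In this sample, the three spellings of Muller fall into one group and would violate a unique index under such a collation sequence, whereas Smith is unaffected.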

Database Collation for the Local Database

As of Siebel Innovation Pack 2016, database collation for the Siebel Mobile Web Client, for which the local database uses Oracle Database XE, is the same as for Oracle Database Enterprise Edition.

A local database used for development with Siebel Tools must use the binary collation sequence. Using Siebel Tools against a non-binary collation sequence is not supported.