Oracle ATG Web Commerce - Character Encodings

Character Encodings

A character encoding is a mapping between sequences of bytes and sequences of characters (text). For example, content from a web page is stored on the server as a sequence of bytes; when it is sent to a web browser, it is converted to human-readable text using the appropriate character encoding. Different character encodings handle the requirements of different languages. Languages such as English have a relatively small number of characters and can use a single-byte character set such as ISO-8859-1, which allows up to 256 symbols, including punctuation and accented characters. Languages such as Chinese, however, use thousands of characters and require a multi-byte character set. Unicode, for example, was originally designed around 16-bit (double-byte) values, which allow up to 65,536 symbols, and its current encodings (such as UTF-8 and UTF-16) can represent more than a million code points.
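To make the difference between single-byte and multi-byte encodings concrete, the following minimal sketch uses only standard JDK classes to compare how many bytes the same text occupies in ISO-8859-1 and in UTF-8. The class name EncodingSizes and its helper methods are hypothetical names chosen for this illustration; they are not part of Oracle ATG Web Commerce.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Oracle ATG Web Commerce.
class EncodingSizes {

    // Bytes needed to store the text in ISO-8859-1 (one byte per character).
    static int latin1Length(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed to store the text in UTF-8 (one to four bytes per character).
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes to ISO-8859-1 and decodes again. Characters that ISO-8859-1
    // cannot represent are replaced with '?' by String.getBytes.
    static String latin1RoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String french = "caf\u00e9";      // "cafe" with an acute accent on the e
        String chinese = "\u4e2d\u6587";  // two Chinese characters

        System.out.println(latin1Length(french));     // prints 4
        System.out.println(utf8Length(french));       // prints 5
        System.out.println(utf8Length(chinese));      // prints 6
        System.out.println(latin1RoundTrip(chinese)); // prints ??
    }
}
```

The last line shows why a single-byte character set is not sufficient for Chinese content: the characters have no representation in ISO-8859-1 and are lost when the text is encoded.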

You can create internationalized web sites with Oracle ATG Web Commerce in any character encoding supported by the Java Development Kit (JDK). Java bases all character data on Unicode: every Java String is a sequence of Unicode characters, and the Java I/O classes convert character data between native encodings and Unicode. For a list of the character encodings that Oracle ATG Web Commerce supports, see the Oracle ATG Commerce Supported Environments Matrix document in the My Oracle Support knowledge base (https://support.oracle.com/).
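As a rough way to check which encodings a particular JDK installation provides, the standard java.nio.charset.Charset class can be queried directly. The sketch below assumes nothing about Oracle ATG Web Commerce itself; the class name SupportedCharsets and the method isAvailable are hypothetical. The result reflects only what the JDK offers and does not replace the Supported Environments Matrix.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; not part of Oracle ATG Web Commerce.
class SupportedCharsets {

    // Returns true if this JDK can encode and decode the named charset.
    // Syntactically illegal names are reported as unavailable instead of throwing.
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // UTF-8 and ISO-8859-1 are required on every Java platform.
        System.out.println(isAvailable("UTF-8"));      // prints true
        System.out.println(isAvailable("ISO-8859-1")); // prints true

        // Other charsets, such as Shift_JIS, depend on the JDK installation.
        System.out.println(isAvailable("Shift_JIS"));

        // List the canonical name of every charset this JDK provides.
        for (String canonicalName : Charset.availableCharsets().keySet()) {
            System.out.println(canonicalName);
        }
    }
}
```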

Developers and web designers generally create their content in a native encoding. Oracle ATG Web Commerce handles natively encoded content the same way Java does. When Oracle ATG Web Commerce reads in character data, the GenericConverter that is included with the product converts it to Unicode. The GenericConverter handles any character encoding supported by Java and by your version of the JDK. Whenever data is written out and sent to a web browser, the GenericConverter converts the data back to a native encoding. Typically, the encoding written out to the browser is the same as the encoding of the document that Oracle ATG Web Commerce read in. The Java InputStreamReader and OutputStreamWriter classes are used to convert text from the locale-specific encoding to Unicode and then to convert the text back to the locale-specific encoding for display to the user. For more information, see Using the EncodingTyper to Set the Character Encoding in this chapter.
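The read-then-write conversion described above can be sketched with the InputStreamReader and OutputStreamWriter classes alone. This is an illustration of the general JDK technique, not the GenericConverter implementation, whose internals are not shown in this chapter. The class name RoundTripDemo and the methods decode and encode are hypothetical, and ISO-8859-1 is assumed as the native encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it is not the GenericConverter.
class RoundTripDemo {

    // Reads natively encoded bytes and returns them as a Unicode String.
    static String decode(byte[] nativeBytes, Charset encoding) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(nativeBytes), encoding)) {
            char[] buffer = new char[1024];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // Writes a Unicode String back out as natively encoded bytes.
    static byte[] encode(String text, Charset encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, encoding)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Charset nativeEncoding = StandardCharsets.ISO_8859_1; // assumed native encoding
        // "cafe" with an acute accent on the e, as stored in ISO-8859-1.
        byte[] stored = {0x63, 0x61, 0x66, (byte) 0xE9};

        String unicode = decode(stored, nativeEncoding);      // bytes -> Unicode
        byte[] written = encode(unicode, nativeEncoding);     // Unicode -> bytes

        System.out.println(unicode.length());                 // prints 4
        System.out.println(Arrays.equals(stored, written));   // prints true
    }
}
```

Using the same Charset for reading and writing mirrors the typical case in which the encoding sent to the browser matches the encoding of the source document.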


ATG Platform Programming Guide