Oracle Commerce Platform 11.1 - Character Encodings

Character Encodings

A character encoding is a scheme for mapping between a sequence of bytes and a sequence of characters (text). For example, content from a web page is stored on the server as a sequence of bytes; when it is sent to a web browser, the browser converts it to human-readable text using the appropriate character encoding. Different character encodings handle the requirements of different languages. Languages such as English have a relatively small number of characters and can use a single-byte character set such as ISO-8859-1, which allows up to 256 symbols, including punctuation and accented characters. Languages such as Chinese, however, use thousands of characters and require a multi-byte encoding, such as one of the Unicode encodings. A fixed two-byte encoding allows up to 65,536 symbols; Unicode as a whole defines more than a million code points, which encodings such as UTF-8 and UTF-16 represent with a variable number of bytes per character.
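To make the byte-to-character mapping concrete, the following minimal sketch encodes the same text under different encodings and compares the byte counts. It uses only standard JDK classes; the class name EncodingDemo and its helper methods are hypothetical names chosen for illustration and are not part of the Oracle Commerce Platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only; not an Oracle Commerce Platform API.
class EncodingDemo {

    // Encode text (Unicode characters) into bytes using the given encoding.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decode bytes back into text using the given encoding.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String french = "caf\u00e9";   // "cafe" with an accented final e
        String chinese = "\u4e2d";     // a single Chinese character

        // The accented character fits in one byte in ISO-8859-1 but needs two in UTF-8.
        System.out.println(encode(french, StandardCharsets.ISO_8859_1).length); // 4
        System.out.println(encode(french, StandardCharsets.UTF_8).length);      // 5

        // The Chinese character needs three bytes in UTF-8.
        System.out.println(encode(chinese, StandardCharsets.UTF_8).length);     // 3

        // ISO-8859-1 cannot represent it, so the JDK substitutes '?'.
        byte[] lossy = encode(chinese, StandardCharsets.ISO_8859_1);
        System.out.println(decode(lossy, StandardCharsets.ISO_8859_1));         // ?
    }
}
```

Decoding bytes with a different encoding than the one used to write them generally produces garbled text, which is why the encoding of the content has to be known when it is read.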

You can create internationalized web sites with the Oracle Commerce Platform in any character encoding supported by the Java Development Kit (JDK). Java bases all character data on Unicode: every Java String is a sequence of Unicode characters, and the I/O classes convert character data between native encodings and Unicode. For a list of the character encodings that the Oracle Commerce Platform supports, see the Oracle Commerce Supported Environments document in the My Oracle Support knowledge base.
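As a quick check of what a particular JDK installation can convert, the sketch below asks the standard java.nio.charset.Charset class whether an encoding name is available. The class name JdkCharsetCheck and its method are hypothetical. Note the assumption: this reports JDK support only, and the Supported Environments document remains the authority on what the Oracle Commerce Platform itself supports.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for illustration only; not an Oracle Commerce Platform API.
class JdkCharsetCheck {

    // Returns true if the running JDK provides a converter for the named encoding.
    static boolean isJdkSupported(String encodingName) {
        try {
            return Charset.isSupported(encodingName);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isJdkSupported("ISO-8859-1"));
        System.out.println(isJdkSupported("UTF-8"));
        // List every encoding name the running JDK knows about.
        Charset.availableCharsets().keySet().forEach(System.out::println);
    }
}
```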

Developers and web designers generally use a native encoding for their content. The Oracle Commerce Platform handles natively encoded content the same way Java does. When the Oracle Commerce Platform reads in character data, the GenericConverter included with the system converts it to Unicode. The GenericConverter handles any character encoding supported by Java and by your version of the JDK. Whenever data is written out and sent to a web browser, the GenericConverter converts the data back to a native encoding. Typically, the encoding written out to a browser is the same as the encoding of the document that the Oracle Commerce Platform read in. The Java InputStreamReader and OutputStreamWriter classes convert text from the locale-specific encoding to Unicode and then convert it back to the locale-specific encoding for display to the user. For more information, see Using the EncodingTyper to Set the Character Encoding in this chapter.
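The read-then-write conversion described above can be sketched with the two JDK classes the text names. This is a minimal illustration under stated assumptions, not the GenericConverter implementation: the class name TranscodeSketch and its methods are hypothetical, and in-memory byte arrays stand in for the source document and the browser response.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch for illustration only; not the GenericConverter.
class TranscodeSketch {

    // Read natively encoded bytes into a Java String (Unicode).
    static String readText(byte[] nativeBytes, Charset encoding) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(nativeBytes), encoding)) {
            char[] buffer = new char[1024];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // Write a Java String back out as bytes in the given native encoding.
    static byte[] writeText(String text, Charset encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, encoding)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // "cafe" with an accented final e, stored as ISO-8859-1 bytes.
        byte[] source = {0x63, 0x61, 0x66, (byte) 0xE9};
        String unicodeText = readText(source, StandardCharsets.ISO_8859_1);
        byte[] response = writeText(unicodeText, StandardCharsets.ISO_8859_1);
        System.out.println(unicodeText.length()); // 4 characters
        System.out.println(response.length);      // 4 bytes, identical to the source
    }
}
```

Passing the encoding explicitly to both classes, rather than relying on the platform default, keeps the output encoding the same as the input encoding regardless of the server's locale settings.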

Copyright © 1997, 2015 Oracle and/or its affiliates. All rights reserved.