
Use with Multibyte Environments


Internationalization in WebLogic Server

Overview of Internationalization

Main features of Internationalization (I18n) in WebLogic Server:

When you configure a distributed system that handles multibyte character data with WebLogic Server, you need to understand how encodings are specified, particularly for Java and J2EE. You also need to know how encoding is handled in the operating system, the Internet, and the backend systems linked to WebLogic Server, so that you can control encoding conversions correctly.

The following is a concise description of the encoding handling in WebLogic Server.

Use of Unicode

WebLogic Server is a 100% pure Java application server program. All encodings inside the Server use Unicode.

This allows WebLogic Server to handle characters of all languages at the same time, provided that their characters can be handled by Unicode.

Encoding Conversion

The encoding conversion is needed when WebLogic Server exchanges character data with the outside.

On common operating systems, environments that natively use Unicode (the internal code of Java) are rare. Instead, each platform generally uses its own native encoding. Examples of native encodings are: for Windows, the code page that corresponds to each language; for UNIX, the encoding that corresponds to the locale specified by the LANG environment variable; and for databases, the character set specified when the database is created or the character set of the clients.

For this reason, every time WebLogic Server exchanges character data with the operating system or an external resource, the characters must be converted between the native encoding and Unicode, in one direction or the other (character set conversion).
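As a minimal sketch of this conversion in plain Java (not WebLogic-specific code; the class and method names are illustrative assumptions), external bytes are decoded into a Unicode String on input and encoded back into bytes on output. UTF-8 and UTF-16BE stand in here for whatever native encodings the platform actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative helper; the names are assumptions, not part of WebLogic Server.
final class EncodingConversionSketch {

    // Decodes bytes from the source encoding into a Unicode String,
    // then encodes that String into the target encoding.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String unicode = new String(input, from); // external bytes -> Unicode
        return unicode.getBytes(to);              // Unicode -> external bytes
    }

    public static void main(String[] args) {
        // Two Japanese characters, written as escapes so the source file encoding does not matter.
        byte[] utf8 = "\u65E5\u672C".getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        System.out.println(utf8.length + " bytes in, " + utf16.length + " bytes out");
    }
}
```

Both conversion steps visit every character, which is why the number of conversions matters for performance.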

Note: Characters in a stream to which a Java class is serialized do not require code conversion, because they are encoded in UTF-8 and Unicode is preserved as the internal information of the class. Hence, with EJB or RMI, you generally do not need to consider encoding.

Note also that encoding conversion consumes a relatively large amount of CPU resources, because the conversion must be done for each individual character. For better performance, design your application to keep code conversions to a minimum.

Separation between the Encoding Conversion for the Server Itself and the Encoding Conversion for Application Components and Resources on WebLogic Server

In WebLogic Server, the encoding conversion for the server itself and the encoding conversion for application components and resources on WebLogic Server are separate.

In WebLogic Server, the encoding of the server log and of the Administration Console is determined by the default encoding of the server's Java VM or by the browser's language setting. It is independent of the encoding of the application components and of the language of the content that WebLogic Server serves.

Moreover, you can configure an application component deployed to WebLogic Server to behave identically regardless of the locale (language) environment in which WebLogic Server is running.

You can also set the encoding conversion individually for each resource configured on the WebLogic Server container (for example, a JDBC connection pool).

The encoding conversions of WebLogic Server itself include:

The encoding conversions of individual applications include:

Resources on WebLogic Server include:

When you specify an encoding on WebLogic Server, be clear about which of the above three categories it applies to. You must also always check whether the correct character object is created in WebLogic Server, and whether the character object inside WebLogic Server is encoded and output as intended.

As described above, to handle multibyte characters on WebLogic Server, you must understand the entire encoding conversion process and make any necessary settings. In some cases, an application cannot handle multibyte characters correctly unless encoding conversion is configured.

Whenever an encoding is not specified, some default encoding is applied. Which default applies varies with the specification and the environment.

Example of the Default Encoding

The default encodings relating to the behavior of the WebLogic Server include:

Example:

Because the default encoding varies with the technical specification in use, as shown above, specifying no encoding at all leads to incorrect multibyte handling in WebLogic Server. It is therefore strongly recommended that you understand each of the ways to specify encodings described in the following chapters, so that you can control encoding conversions.

The terms "encoding" and "character set" refer to the same concept. Several words are used to describe it, and the definition of each differs slightly.

An encoding, or character set, is a definition that assigns computer-readable codes to the characters of a specific language so that a computer can process them. This definition is called an "encoding" in Java terminology and a "character set" in Internet terminology.

Java absorbs the differences between encodings at the input/output stage, so that it can always use Unicode internally. This is a strength of Java: it can handle any character set for which an encoding definition is available. In other words, Java has the potential to absorb all the encoding differences that exist among various systems. At present, however, no encoding conversion table handles every minute difference, and the existing encoding tables have some limitations that stem from keeping consistency with Unicode.

What is particularly important for Java Web application servers is the difference between Java encoding names and the MIME character set names defined by IANA, which are used on the Internet and in XML. To absorb this difference, WebLogic Server has a mapping table between Java encoding names and IANA character set names (see Predefined MIME-Java Encoding Mapping Table in WebLogic Server). With this table, for example, a file declared as Shift_JIS in a JSP can be treated as SJIS in Java. For Web components, you can also change this mapping table and treat, for example, the IANA character set name ‘Shift_JIS’ as the Java encoding ‘cp943’ (see Mapping Change for Java Encoding and IANA Character Set Involving HTTP Responses (Not J2EE-Compliant)).

Xerces, the XML parser embedded in WebLogic Server, has its own mapping table between IANA and Java names, which users cannot customize. For example, the IANA character set name ‘Shift_JIS’ is mapped to the Java encoding name ‘SJIS’.

In WebLogic Server, encodings are basically specified using Java encoding names, while IANA character set names are used for J2EE, the Internet, and XML. Change this mapping as necessary.


How to specify Default Encoding for WebLogic Server's Container

WebLogic Server lets you specify encodings at various scopes. For example, for JSP, the page directive defined by the JSP 2.0 specification specifies the encoding of an individual page. The encoding for such a scope has nothing to do with the default encoding of the Java VM on which WebLogic Server runs, that is, the encoding that the Java VM implementation determines from the locale of the operating platform. Even if the locale of the Java VM is English, you can serve JSP files that contain multibyte characters without problems. For the following items, however, character strings are handled using the default encoding of the Java VM.

These operate with the default encoding of the Java VM. To switch the language and encoding of WebLogic Server log messages by changing the platform locale, specify the settings below. You cannot change the Java VM default encoding dynamically once the VM has started, so check these settings before you restart WebLogic Server.

For Windows

From Control Panel - Region (or Regional Options), select a language, such as English (United States), Japanese, Korean, Chinese (PRC), or Chinese (Taiwan). With these selections, the server operates using CP1252, MS932, MS949, GBK, or MS950, respectively, as the default encoding.

For UNIX

Specify the locale supported by your platform in the LANG environment variable.

Some examples of server encodings and the corresponding LANG environment variable values are shown below. For other combinations, consult your platform manuals.

Platform | Encoding | LANG environment variable
Solaris | EUC-JP, SJIS | ja, ja_JP.eucJP, or ja_JP.PCK
Solaris | EUC-KR | ko or ko_KR
Solaris | GB2312, GBK | zh_CN or zh_CN.GBK
Solaris | GB18030 | zh_CN.GB18030
Solaris | Big5 | zh_TW.BIG5
HP | EUCJIS, SJIS | ja_JP.eucJP, ja_JP.SJIS
HP | EUC-KR | ko.eucKR or ko_KR
HP | GB2312 | zh_CN.hp15CN
HP | GB18030 | zh_CN.gb18030
HP | Big5 | zh_TW.big5

For example, if you specify EUC-JP on Solaris, the LANG setting looks like this:


    LANG=ja
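To confirm which defaults a Java VM actually picked up from the platform settings, a small check such as the following can be run with the same JDK and environment as the server. This is a sketch, not a WebLogic utility; the class name is an assumption. Note that recent JDK releases may report UTF-8 as the default charset regardless of the platform locale.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Illustrative check; run it under the same environment (LANG or regional settings) as the server.
final class DefaultEncodingCheck {

    // Returns a one-line summary of the defaults fixed when this Java VM started.
    static String describeDefaults() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", default locale=" + Locale.getDefault()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```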

Notes on Configuring Administration and Managed Servers

Use the same encoding for all WebLogic Server instances throughout a domain.

For example, when a Windows platform exists within the domain, standardize on MS932. The log of any server that uses a different encoding will not display correctly.

Notes on Configuring Clusters

Use the same encoding for all WebLogic Server instances in a cluster.

For example, when a Windows platform exists within the cluster, standardize on MS932. The log of any server that uses a different encoding will not display correctly.

Encoding for config.xml

The config.xml file is read and written in UTF-8. When you edit the file directly with a text editor, open and save it as UTF-8.

JDBC connection

When creating a JDBC connection pool, you must specify an appropriate encoding for connections to a database that uses multibyte characters. Depending on the requirements of the system being built, the encoding conversion mappings for the Web layer and the database layer may also have to match.

Deployment

In WebLogic Server, multibyte characters in the deployment descriptor (DD) files of J2EE components are handled according to the XML declaration. If a DD file has no encoding attribute in its XML declaration, or has no XML declaration, the file is handled as UTF-8.

Notes on Using Administration Console

Displayed Language on Administration Console

The language displayed when the Administration Console starts is the language specified in the language preference of your Web browser. For example, if you have not changed this setting in IE on Japanese Windows, the Administration Console is displayed in Japanese. To display it in English, set the browser's language setting to "English" and delete all other languages from the list. Note that the output encoding of the Administration Console is always UTF-8, regardless of language.

Encoding for sending an e-mail

WebLogic Server uses JavaMail to send e-mail. You can therefore change the encoding of outgoing e-mail by adding the JavaMail system property mail.mime.charset to the WebLogic Server startup options. (When this property is omitted, the default encoding of the Java VM is used.)

Example:

    -Dmail.mime.charset=ISO-2022-JP
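The fallback rule described above can be sketched in plain Java as follows. This is only an illustration of the rule, not JavaMail's actual lookup code (which may, for example, also translate Java encoding names into MIME names); the class and method names are assumptions.

```java
import java.nio.charset.Charset;

// Illustrative sketch of the rule "mail.mime.charset if set, otherwise the Java VM default".
final class MailCharsetSketch {

    // Returns the charset name that would be used for outgoing mail under that rule.
    static String effectiveMailCharset() {
        String configured = System.getProperty("mail.mime.charset");
        if (configured != null && !configured.isEmpty()) {
            return configured; // value supplied with -Dmail.mime.charset=...
        }
        return Charset.defaultCharset().name(); // fall back to the Java VM default encoding
    }

    public static void main(String[] args) {
        System.out.println("Effective mail charset: " + effectiveMailCharset());
    }
}
```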

A typical case of sending e-mail from WebLogic Server is an SMTP notification from the diagnostic service used in system management.


Programming

As described in Overview of Internationalization, all characters inside WebLogic Server are handled as Unicode, but any input or output of character data involving external resources requires encoding conversion. This section contains notes on processing multibyte characters from the viewpoint of application programming.

Security

UTF-8 Encoding Support with Public Key Certificates (CR090467)

Conforming to RFC3280, WebLogic Server supports UTF-8 encoding with public key certificates. For details of RFC3280, see Internet X.509 Public Key Infrastructure: Certificate and CRL Profile.

Browser Locale when Setting Security Policy (CR285384)

Security Policy setup will fail if one of the following locales is specified.

Workaround:

Please change your browser locale to en-us. For how to change the locale setting of your browser, please refer to the browser's help.

Fixed in:

This problem was fixed in WLS9.2MP1.

Web Components

From the viewpoint of WebLogic Server, the external resources that require encoding conversion here are those that use the HTTP protocol. HTTP is designed to transport messages in various encodings. How Web components convert between the Unicode character strings handled inside the server and messages in a specific encoding on HTTP is therefore very important. To address this, several encoding conversion settings are provided as APIs and parameters, some defined by the J2EE specification and some proprietary to WebLogic Server. Read the following explanation and find the combination of settings that best meets the requirements of the system you are building.

Targets of encoding settings for Web components

The targets of encoding settings for J2EE Web components are as follows:

The J2EE specification defines the default encoding used when these settings are omitted. The default encoding for each component is shown below.

Component Name | Default Encoding
Servlet | ISO-8859-1
JSP | ISO-8859-1
XML format JSP Document | UTF-8
Tag File | ISO-8859-1
XML format Tag File | UTF-8

Because ISO-8859-1 is the default encoding for everything except XML components, an encoding setting is essential when multibyte characters are used. The settings for each Web component are detailed below. The meaning of each column in the tables is as follows:

Specifying the Encoding for a Response

For Servlets

There are three ways to specify the encoding for a servlet response.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
ServletResponse#setContentType method | Each HTTP response | MIME type with Charset attribute (IANA name) | YES | 1 | setContentType("text/html;charset=Shift_JIS");
ServletResponse#setCharacterEncoding method | Each HTTP response | IANA name | YES | 1 | setCharacterEncoding("EUC-JP");
ServletResponse#setLocale method | Each HTTP response | Locale name (Note 1) | YES | 2 | setLocale(new Locale("ja"));

Note 1: Encoding is determined by IANA name that is identified by locale name. See Locale-to-IANA Mapping for locale vs. IANA name.

Note that you need to call these methods before obtaining Writer, as shown below.


    res.setContentType("text/html;charset=Shift_JIS");
    PrintWriter out = res.getWriter();
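The reason for this ordering can be illustrated without a servlet container: a Java Writer is bound to its encoding when it is created, so an encoding chosen afterwards cannot affect it. The following sketch uses only JDK classes; the class and method names are assumptions made for illustration. It writes the same text through a Writer bound to ISO-8859-1 (the servlet default) and through one bound to UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch; not servlet API code.
final class WriterEncodingSketch {

    // Writes text through a Writer bound to the given charset and returns the bytes produced.
    static byte[] writeWith(Charset charset, String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u65E5\u672C\u8A9E"; // three Japanese characters
        // ISO-8859-1 cannot represent them, so each one is replaced by '?'.
        System.out.println("ISO-8859-1 bytes: " + writeWith(StandardCharsets.ISO_8859_1, text).length);
        System.out.println("UTF-8 bytes: " + writeWith(StandardCharsets.UTF_8, text).length);
    }
}
```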

For JSPs

There are five ways to specify the encoding for a JSP response.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
contentType attribute of Page directive | Each file | MIME type with Charset attribute (IANA name) | YES | 1 | <%@ page contentType="text/html; charset=EUC-JP" %>
page-encoding element in web.xml | Within specified URL pattern | IANA name and URL pattern | YES | 2 | <jsp-config>
  <jsp-property-group>
    <url-pattern>/euc/*</url-pattern>
    <page-encoding>EUC-JP</page-encoding>
  </jsp-property-group>
  <jsp-property-group>
    <url-pattern>/utf8/*</url-pattern>
    <page-encoding>UTF-8</page-encoding>
  </jsp-property-group>
</jsp-config>
pageEncoding attribute of Page directive | Each file | IANA name | YES | 2 (Note 1) | <%@ page pageEncoding="Windows-31J" %>
encoding element in weblogic.xml (not recommended) | Entire Web application | Java encoding name | NO | 3 | <jsp-descriptor>
  <encoding>Windows-31J</encoding>
</jsp-descriptor>
webapp.encoding.default parameter of application-param element in weblogic-application.xml (Note 2) | Entire Enterprise application | IANA name | NO | 4 | <application-param>
  <param-name>webapp.encoding.default</param-name>
  <param-value>EUC-JP</param-value>
</application-param>

Note 1: Under the JSP 2.0 specification, if the page-encoding element in web.xml and the pageEncoding attribute of the page directive do not match, an error occurs when the JSP is compiled. As a result, both have the same priority.

Note 2: The value set here is reflected in the argument of the ServletResponse#setContentType method in the servlet code into which the JSP is compiled. Therefore, when webapp.encoding.default is changed, the JSP files of the entire enterprise application must be rebuilt for the change to take effect.

For JSP Documents

There is one way to specify the encoding for a JSP Document response.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
contentType attribute of Page directive | Each file | MIME type with Charset attribute (IANA name) | YES | 1 | <jsp:directive.page contentType="text/html; CHARSET=euc-jp"/>

Specifying the Encoding for a Request

Of the ways to specify the encoding of an HTTP request, the one most compliant with the HTTP specification is to specify a character set in the charset attribute of the Content-Type header of the request. WebLogic Server, on the receiving side, can then recognize the request encoding correctly at the protocol level. However, major Web browsers such as Microsoft IE and Netscape cannot specify this value, so the HTTP request encoding also needs to be specified on the WebLogic Server side.

The request encoding settings are common to JSPs and servlets. There are three ways to specify them.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
ServletRequest#setCharacterEncoding method | Each HTTP request | IANA name | YES | 1 | setCharacterEncoding("EUC-JP");
input-charset element in weblogic.xml | Within specified URL pattern | Java encoding name and URL pattern | NO | 2 | <charset-params>
  <input-charset>
    <resource-path>/*</resource-path>
    <java-charset-name>EUC_JP</java-charset-name>
  </input-charset>
  <input-charset>
    <resource-path>/rus/joe/*</resource-path>
    <java-charset-name>Shift_JIS</java-charset-name>
  </input-charset>
</charset-params>
webapp.encoding.default parameter of application-param element in weblogic-application.xml | Entire Enterprise application | IANA name | NO | 3 | <application-param>
  <param-name>webapp.encoding.default</param-name>
  <param-value>EUC-JP</param-value>
</application-param>
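The effect of the request encoding can be seen with the JDK's own URL decoder. The sketch below is an illustration only, not the container's implementation, and the class and method names are assumptions. It decodes the same percent-encoded parameter value once with the encoding the browser actually used (UTF-8 in this example) and once with the ISO-8859-1 default.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustrative sketch; a servlet container performs the equivalent step for request parameters.
final class RequestDecodingSketch {

    // Decodes a percent-encoded parameter value using the named encoding.
    static String decodeParameter(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%E6%97%A5"; // one Japanese character sent as UTF-8
        // With the right encoding the three bytes become one character.
        System.out.println("UTF-8: " + decodeParameter(raw, "UTF-8").length() + " character(s)");
        // With the wrong encoding each byte becomes a separate, unrelated character.
        System.out.println("ISO-8859-1: " + decodeParameter(raw, "ISO-8859-1").length() + " character(s)");
    }
}
```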

Specifying the Encoding for a File

Web components other than servlets are files that the Web container must read with an appropriate encoding at run time. For example, the JSP compiler reads a JSP file using some encoding when it translates the file into servlet Java code. For such components, the file encoding must therefore be set correctly.

For JSPs

There are four ways to specify the encoding for JSP files.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
page-encoding element in web.xml | Within specified URL pattern | IANA name and URL pattern | YES | 1 | <jsp-config>
  <jsp-property-group>
    <url-pattern>/euc/*</url-pattern>
    <page-encoding>EUC-JP</page-encoding>
  </jsp-property-group>
  <jsp-property-group>
    <url-pattern>/utf8/*</url-pattern>
    <page-encoding>UTF-8</page-encoding>
  </jsp-property-group>
</jsp-config>
pageEncoding attribute of Page directive | Each file | IANA name | YES | 1 (Note 1) | <%@ page pageEncoding="Windows-31J" %>
contentType attribute of Page directive | Each file | MIME type with Charset attribute (IANA name) | YES | 2 | <%@ page contentType="text/html; charset=EUC-JP" %>
encoding element in weblogic.xml (not recommended) | Entire Web application | Java encoding name | NO | 3 | <jsp-descriptor>
  <encoding>Windows-31J</encoding>
</jsp-descriptor>

Note 1: Under the JSP 2.0 specification, if the page-encoding element in web.xml and the pageEncoding attribute of the page directive do not match, an error occurs at translation time. As a result, both have the same priority.

For JSP Documents

Because a JSP Document is written in XML, the way to specify its file encoding follows the XML specification.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
encoding attribute in the XML declaration | Each file | IANA name | YES | 1 | <?xml version='1.0' encoding='utf-8' ?>

Under the JSP 2.0 specification, if a file encoding is set for a JSP Document by a page-encoding element in web.xml or by the pageEncoding attribute of the page directive, and it does not match the encoding attribute of the XML declaration of the JSP Document, an error occurs at translation time.

For Tag Files

There is one way to specify the encoding for Tag Files.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
pageEncoding attribute of tag directive | Each file | IANA name | YES | 1 | <%@ tag pageEncoding="Windows-31J" %>

For XML format Tag Files

There is one way to specify the encoding of XML format Tag Files.

Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example
encoding attribute in the XML declaration | Each file | IANA name | YES | 1 | <?xml version='1.0' encoding='utf-8' ?>

Under the JSP 2.0 specification, specifying a file encoding with the pageEncoding attribute of the tag directive in an XML format Tag File causes a compile-time error.

Parse Method of JSP

Under the JSP 2.0 specification, if the same attribute of the page directive appears two or more times with different values, an error occurs at translation time. This happens, for example, when a single file contains two or more contentType attributes that specify different encodings.

Differences between Static and Dynamic for include tag, and Specifying Encoding in page tag

Static Include

The static include for JSP is described as follows:

    <%@ include file="relativeURL" %>

In this case, all the included files are first read and merged into one file, and then the JSP is compiled. Therefore, if page directives in both the including JSP and the included JSP specify encodings and they differ, an error occurs at translation time, as described in Parse Method of JSP.

Dynamic Include

The dynamic include for JSP is described as follows:

    <jsp:include page="{ relativeURL | <%= expression %>}" flush="true" />

For jsp:include, the include operation will not happen when this page is loaded, and the tag will remain. The page will be included when the JSP is executed. Therefore, the encoding set in the JSP that does the including will not apply to the included file(s). Hence, you must also specify the encoding in the included file.

Mapping Change for Java Encoding and IANA Character Set Involving HTTP Responses (Not J2EE-Compliant)

When you specify the encoding with the setContentType() method or the contentType attribute of the page directive, use an IANA character set name. However, because WebLogic Server is a Java application, it must handle these values internally as Java encoding names. WebLogic Server has default mappings for this purpose and normally uses them. The default mappings also include names that are not defined by IANA but are conventionally used in the Content-Type of HTML documents (see Predefined MIME-Java Encoding Mapping Table in WebLogic Server).

Example: x-sjis ----> Shift_JIS

You can change this mapping. This can be set in weblogic.xml as follows:

For example, 'Shift_JIS' setting in the contentType is handled as SJIS in WebLogic Server, because the IANA character set 'Shift_JIS' is mapped to the Java encoding 'Shift_JIS' (Shift_JIS is used as an alias for SJIS in JDK1.4).

Note: In Java 1.3, the IANA character set Shift_JIS was handled as MS932 (this applied from JDK 1.1.8 through JDK 1.4.0; from JDK 1.4.1 onward, Shift_JIS was changed back to SJIS).

Consequently, MS932-specific characters cannot be used when Shift_JIS is used with the default setting.

To assign an encoding other than the default mapping, override the default mapping by specifying the following in <charset-mapping> in weblogic.xml.

In the example below, Shift_JIS is mapped to MS932.

    <charset-params>
      <charset-mapping>
        <iana-charset-name>Shift_JIS</iana-charset-name>
        <java-charset-name>MS932</java-charset-name>
      </charset-mapping>
    </charset-params>
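Conceptually, such a mapping behaves like a lookup table with overridable entries. The following sketch models that idea with a plain Map; it is a hypothetical illustration, not WebLogic Server's implementation, and it contains only the two default entries mentioned in this section.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of an IANA-to-Java name mapping with overrides.
final class CharsetMappingSketch {

    private final Map<String, String> ianaToJava = new HashMap<>();

    CharsetMappingSketch() {
        // Illustrative defaults taken from the text above; the real predefined table is larger.
        ianaToJava.put(key("x-sjis"), "Shift_JIS");
        ianaToJava.put(key("Shift_JIS"), "Shift_JIS");
    }

    // IANA character set names are case-insensitive, so normalize the lookup key.
    private static String key(String ianaName) {
        return ianaName.toLowerCase(Locale.ROOT);
    }

    // Plays the role of a <charset-mapping> entry: replaces the default for one IANA name.
    void override(String ianaName, String javaName) {
        ianaToJava.put(key(ianaName), javaName);
    }

    // Returns the Java encoding name for an IANA name, or the name itself when no entry exists.
    String resolve(String ianaName) {
        String mapped = ianaToJava.get(key(ianaName));
        return mapped != null ? mapped : ianaName;
    }

    public static void main(String[] args) {
        CharsetMappingSketch mapping = new CharsetMappingSketch();
        System.out.println(mapping.resolve("Shift_JIS"));
        mapping.override("Shift_JIS", "MS932");
        System.out.println(mapping.resolve("Shift_JIS"));
    }
}
```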

Note: This setting is valid only for HTTP responses. For example, it has no effect on the file encoding (page encoding) of files such as JSPs.

CGIServlet

When migrating a CGI service that uses multibyte characters to the CGIServlet on WebLogic Server, you must specify the appropriate charset parameter in the Content-Type HTTP header generated by the CGI program. If it is not set, ISO-8859-1, the default encoding of the J2EE servlet container, is used. To receive input strings from a client correctly, you must also set the input-charset parameter in weblogic.xml, the DD file of the target Web application. If it is not set, ISO-8859-1 is used.

Specifying Input Encoding for Form-Based Authentication

To specify the input encoding for form-based authentication, set the name of the encoding in the j_character_encoding field of the form. Note that this feature is proprietary to WebLogic Server.

    <form method="POST" action="j_security_check">
      Username: <input type="text" name="j_username">
      Password: <input type="password" name="j_password">
      <input type="hidden" name="j_character_encoding" value="Shift_JIS">
      <input type="submit" value="Login">
      <input type="reset" value="Reset">
    </form>

Use of Multibyte Character for Request URL

Default Behavior when Decoding URL

If the following type of HTTP request is received,

    http://myHostName:port/myContextPath/myRequest/?myRequestParameter

and nothing is set, WebLogic Server handles the myContextPath and myRequest portions as follows:

  1. Performs URL decoding on myContextPath and myRequest portions
  2. Decodes the byte stream obtained in 1 into a UTF-8 character string

For example, if the User Agent (Web browser) is MS IE (Microsoft Internet Explorer), multibyte characters entered in the address bar are by default first encoded in UTF-8 and then URL encoded. With its default settings, WebLogic Server correctly reconstructs the URL string from this UTF-8 encoded URL.

Note: In IE's Internet Options - Advanced, there is an option called Always send URLs as UTF-8 (requires restart), and this option must be ON (checked).

Remember that the myRequestParameter portion is decoded as described in Specifying the Encoding for a Request. For the myHostName portion, the IESG is standardizing internationalized domain names.

When a proprietary User Agent is used and multibyte characters are needed in the request URL, first convert the character string to a UTF-8 byte string, then URL encode it and send it to WebLogic Server. The W3C recommends UTF-8 as the basis for encoding when creating a URI. (http://www.w3.org/TR/charmod/#sec-URIs)
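Both sides of this exchange can be sketched with JDK classes: the client converts to UTF-8 and percent-encodes, and the receiver performs the two default decoding steps described earlier (URL decoding to bytes, then decoding the bytes as UTF-8). The class and method names are assumptions, and the decoder assumes well-formed input. Note that java.net.URLEncoder implements form encoding, which differs from strict path encoding in details such as writing a space as '+'.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative sketch of UTF-8 based URL encoding and decoding.
final class UrlPathEncodingSketch {

    // Client side: convert to UTF-8 bytes, then percent-encode.
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Step 1 on the receiving side: URL-decode into raw bytes (assumes well-formed input).
    static byte[] percentDecode(String encoded) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%' && i + 2 < encoded.length()) {
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(c); // unescaped ASCII passes through unchanged
            }
        }
        return bytes.toByteArray();
    }

    // Step 2 on the receiving side: decode the byte stream as a UTF-8 character string.
    static String decodeSegment(String encoded) {
        return new String(percentDecode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encodeSegment("\u65E5\u672C\u8A9E");
        System.out.println(encoded);
        System.out.println(decodeSegment(encoded).length() + " characters after decoding");
    }
}
```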

Method for Specifying Character Encoding when Decoding URLs

Some User Agents do not URL encode the request URL in UTF-8. The Netscape browser first encodes the characters in the address bar in the character set of the environment in which it runs, and then URL encodes the result before sending it to WebLogic Server. For example, the Netscape browser on Japanese Windows URL encodes the request URL in Windows-31J. To handle this, WebLogic Server must be configured to convert the URL-decoded byte stream into a String using Windows-31J. The following WebLogic Server startup option changes the encoding used for URL decoding.

    -Dweblogic.http.URIDecodeEncoding=Windows-31J

Note that only one such setting is allowed per server instance.

 

Web services

Using Multibyte Characters in JMS Transfer

To send a message containing multibyte characters over JMS transport, send the message as a BytesMessage. To do this, obtain a port and set the message type to BytesMessage using either of the following methods.

  1. ((Stub)port)._setProperty(WLStub.JMS_TRANSPORT_MESSAGE_TYPE, WLStub.JMS_BYTESMESSAGE)
  2. weblogic.wsee.util.JmsUtil.setJmsTransportBytesMessage((Stub)port)

No special settings are required on the receiver because it automatically selects the same message type as that of the sent message.

Receiving SOAP messages

Web services of WebLogic Server 9.0 implement Enterprise Web Services 1.1 specification (JSR-921). In JSR-921, SOAP1.1 is adopted. HTTP/SOAP messages based on the SOAP1.1 specification have text/xml media type, and the encoding for these messages is handled according to RFC2376. Hence the encoding operations of receiving SOAP messages in Web services of WebLogic Server 9.0 are as follows:

SOAP1.1:

Make sure that the ContentType charset is specified correctly for the client which calls the developed Web service(s) using HTTP/SOAP.

Sending SOAP messages

WebLogic Server generates HTTP/SOAP messages in UTF-8 and adds UTF-8 as the charset attribute of the Content-Type header of the SOAP message.

UDDI Explorer

The UDDI Explorer supports only US-ASCII characters. Multibyte characters cannot be processed correctly.

 

XMLs

Multibyte Handling in Streaming API for XML (StAX)

To add encoding information to the XML header generated with the Streaming API for XML (StAX), use the createStartDocument() method of the ElementFactory class, as shown below.

    XMLOutputStreamFactory factory = XMLOutputStreamFactory.newInstance();
    XMLOutputStream output = factory.newOutputStream(new
                        OutputStreamWriter(new FileOutputStream(fname),"Shift_JIS"));
    output.add(ElementFactory.createStartDocument("Shift_JIS","1.0"));
    output.flush();
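For comparison, the same idea can be sketched with the standard javax.xml.stream API that ships with the JDK, which is a different API from the WebLogic classes used above. The class and method names below are assumptions for illustration. The encoding passed to writeStartDocument() should match the encoding of the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Illustrative sketch using the JDK's javax.xml.stream API, not the WebLogic classes above.
final class StaxDeclarationSketch {

    // Writes a small document in UTF-8 with the encoding recorded in the XML declaration.
    static String writeDocument(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(sink, "UTF-8");
            writer.writeStartDocument("UTF-8", "1.0"); // declaration carries the encoding name
            writer.writeStartElement("message");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.flush();
            writer.close();
            return sink.toString("UTF-8");
        } catch (XMLStreamException | UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeDocument("\u65E5\u672C\u8A9E"));
    }
}
```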

The following are notes on parsing an XML document that contains multibyte characters with StAX. The main points are the same as in the notes on using the Xerces parser.

 

JDBC

BEA WebLogic Type4 Oracle Driver (To be noted only when Japanese language is used)

In the Case of Using codePageOverride Property

An Oracle database has, for each character set, a map between Unicode and the code points in the database. This map is used when characters are stored in or retrieved from the database. For example, with the Oracle Thin driver, the Oracle database server uses this map to convert between Unicode and the database code points.

The WebLogic Type4 Driver for Oracle provides a property called codePageOverride that performs this conversion with the JDK converter map instead. The possible values of codePageOverride and their behavior are shown in the following table:

Value | Destination database to be assured | Operation
SJIS | Character set is JA16SJIS, JA16SJISTILDE or JA16SJISYEN | Assures the conversion by the map that matches the converter for SJIS of JDK, among all the maps that can be used for the character set of the destination database. It does not assure the conversion when the map does not match.
MS932 | Character set is JA16SJIS, JA16SJISTILDE or JA16SJISYEN | Assures the conversion by the map that matches the converter for MS932 of JDK, among all the maps that can be used for the character set of the destination database. It does not assure the conversion when the map does not match.
UTF8 | All databases | The driver will use UTF-8 for character encoding when communicating with database. Consequently, the handling of characters to be stored in database will be the same as Oracle Thin driver.

The difference between specifying codePageOverride=SJIS and codePageOverride=MS932 is exactly the difference between the JDK's SJIS converter and its MS932 converter. For example, it affects symbols such as ~ (Wave Dash) and ¢ (Cent Sign), which the two converters map to different Unicode code points. Choose the setting that meets the requirements of the system you are building. For details, see Countermeasure for Garbled Characters Caused by Unicode Definition and Java Converter (To be noted only when Japanese language is used).

Note: codePageOverride=UTF8 is available in WebLogic Server 9.1 and later.
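As an illustration of where the property might be set, the sketch below assembles a connection URL that carries codePageOverride. The jdbc:bea:oracle:// URL layout and the SID property are assumptions about the Type4 driver's URL syntax, and the class name, host, port, and SID are hypothetical placeholders; check the driver documentation for the exact form. Only the three values from the table above are accepted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it only builds a string and does not open a connection.
class CodePageOverrideUrl {

    // The three values listed in the table above.
    static final List<String> ALLOWED = Arrays.asList("SJIS", "MS932", "UTF8");

    // Builds a connection URL. The jdbc:bea:oracle:// layout and the SID
    // property are assumptions about the driver's URL syntax.
    static String build(String host, int port, String sid, String codePage) {
        if (!ALLOWED.contains(codePage)) {
            throw new IllegalArgumentException("Unsupported codePageOverride: " + codePage);
        }
        return "jdbc:bea:oracle://" + host + ":" + port
                + ";SID=" + sid
                + ";codePageOverride=" + codePage;
    }

    public static void main(String[] args) {
        // dbhost, 1521, and ORCL are placeholder values.
        System.out.println(build("dbhost", 1521, "ORCL", "MS932"));
    }
}
```

With the driver on the classpath, a URL of this kind would typically be passed to java.sql.DriverManager.getConnection or entered in the data source configuration.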

In the Case of Omitting codePageOverride Property

In WebLogic Server 9.0 and later, when the codePageOverride property is omitted and the character set of the destination database is JA16SJIS, JA16SJISTILDE, or JA16SJISYEN, characters to be stored in the database are handled in the same way as with the Oracle Thin driver. For details of this change and notes on upgrading from earlier versions, see About codePageOverride Property of BEA WebLogic Type4 JDBC Driver for Oracle.

Notes on Migration from jDriver for Oracle

If you use jDriver for Oracle with a database whose character set is JA16SJIS, and the ~ (Wave Dash) character becomes garbled after you migrate to the WebLogic Type4 Oracle Driver, you can solve the problem either by changing the database character set to JA16SJISTILDE or by specifying codePageOverride=MS932.

 

Miscellaneous

Countermeasure for Garbled Characters Caused by Unicode Definition and Java Converter (To be noted only when Japanese language is used)

If the conversion from the platform native encoding to Unicode and the conversion from Unicode back to the platform native encoding do not use the same converter, characters may be handled incorrectly. This section explains, with examples, the case in which the MS932 converter and the SJIS converter map the same characters to different Unicode code points.

For example, consider the following application, in which a JSP deployed on WebLogic Server displays data stored in a database.

    Database ----------------> WebLogic Server ----------------> Web browser
    (Native)       MS932          (Unicode)          SJIS         (Native)

Although the application is simple, data that passes through WebLogic Server is converted between the platform native encoding and Unicode at least twice, as described in Encoding Conversion. In this example, the MS932 converter is used between the database and WebLogic Server, and the SJIS converter is used between WebLogic Server and the Web browser. In this configuration, the following codes cannot be handled correctly, which causes problems such as garbled characters.

SJIS Code
"~" (0x8160)
"∥" (0x8161)
"-" (0x817C)
"¢" (0x8191)
"£" (0x8192)
"¬" (0x81CA)

To avoid garbled characters, change the encoding conversion either between WebLogic Server and the Web browser or between the database and WebLogic Server, so that both conversions use the same converter.
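The mismatch described above can be observed with a few lines of plain Java. The sketch below assumes a JDK on which Charset.forName("Shift_JIS") selects the SJIS converter and Charset.forName("windows-31j") selects the MS932 converter, which matches the mapping tables later in this document; the class name is hypothetical. It decodes each of the six Shift_JIS codes with both converters, then decodes with MS932 and re-encodes with SJIS to mimic the database-to-browser path.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class; relies only on java.nio.charset.
class ConverterMismatchDemo {

    // Assumed to select the JDK's SJIS and MS932 converters respectively.
    static final Charset SJIS = Charset.forName("Shift_JIS");
    static final Charset MS932 = Charset.forName("windows-31j");

    // The six double-byte Shift_JIS codes discussed above.
    static final int[] CODES = {0x8160, 0x8161, 0x817C, 0x8191, 0x8192, 0x81CA};

    // Splits a double-byte code such as 0x8160 into its two bytes.
    static byte[] toBytes(int code) {
        return new byte[] {(byte) (code >> 8), (byte) code};
    }

    // Decodes with MS932 and re-encodes with SJIS, as on the
    // database -> WebLogic Server -> Web browser path.
    static byte[] viaMs932ThenSjis(int code) {
        String unicode = new String(toBytes(code), MS932);
        return unicode.getBytes(SJIS);
    }

    public static void main(String[] args) {
        for (int code : CODES) {
            char viaSjis = new String(toBytes(code), SJIS).charAt(0);
            char viaMs932 = new String(toBytes(code), MS932).charAt(0);
            boolean survives = Arrays.equals(viaMs932ThenSjis(code), toBytes(code));
            System.out.printf("0x%04X  SJIS: U+%04X  MS932: U+%04X  survives: %b%n",
                    code, (int) viaSjis, (int) viaMs932, survives);
        }
    }
}
```

When the same converter is used in both directions, the original bytes come back unchanged, which is the goal of the two remedies described next.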

Changing Encoding Conversion Between WebLogic Server and Web Browser

When, for example, the JSP page directive specifies:

    <%@ page contentType="text/html; charset=Shift_JIS" %>

(Shift_JIS here is the IANA name), there are two ways to use the MS932 converter between WebLogic Server and the Web browser:

a) Change the charset in the page directive from Shift_JIS (IANA name) to Windows-31J (IANA name).

b) Add the following definition to weblogic.xml to change WebLogic Server's internal default encoding mapping from Shift_JIS (IANA name) -> SJIS (Java converter name) to Shift_JIS (IANA name) -> MS932 (Java converter name).

    <charset-params>
      <charset-mapping>
        <iana-charset-name>Shift_JIS</iana-charset-name>
        <java-charset-name>MS932</java-charset-name>
      </charset-mapping>
    </charset-params>

For details on method b), see Mapping Change for Java Encoding and IANA Character Set Involving HTTP Responses (Not J2EE-Compliant). Method b) is useful when method a) is impractical, for example when too many pages would have to be modified.
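The effect of a charset-mapping entry can be pictured as an override in a lookup table. The following sketch is purely illustrative and is not WebLogic Server's implementation: the class CharsetMappingSketch and its methods are hypothetical, and the case-insensitive key handling is an assumption. It starts from the default Shift_JIS -> SJIS mapping stated above and shows how an override redirects the same IANA name to MS932.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; not WebLogic Server's actual implementation.
class CharsetMappingSketch {

    private final Map<String, String> ianaToJava = new HashMap<>();

    CharsetMappingSketch() {
        // Default taken from the text above: Shift_JIS -> SJIS.
        ianaToJava.put(key("Shift_JIS"), "SJIS");
    }

    // Upper-casing the key is an assumption made here so lookups ignore case.
    private static String key(String ianaName) {
        return ianaName.toUpperCase(Locale.ROOT);
    }

    // Plays the role of one <charset-mapping> element.
    void override(String ianaName, String javaName) {
        ianaToJava.put(key(ianaName), javaName);
    }

    // Returns the mapped Java name, or the IANA name itself when unmapped.
    String resolve(String ianaName) {
        return ianaToJava.getOrDefault(key(ianaName), ianaName);
    }

    public static void main(String[] args) {
        CharsetMappingSketch mapping = new CharsetMappingSketch();
        System.out.println(mapping.resolve("Shift_JIS"));
        mapping.override("Shift_JIS", "MS932");
        System.out.println(mapping.resolve("Shift_JIS"));
    }
}
```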

Changing Encoding Conversion Between Database and WebLogic Server

When you use the BEA WebLogic Type4 Oracle Driver, you can change the encoding conversion between the database and WebLogic Server by using the codePageOverride property.

In the Case of Using iMode Characters (To be noted only when Japanese language is used)

Java's MS932 encoding table supports conversion of external characters (gaiji). By using MS932, you can provide content using iMode external characters.

Specifying Encoding for WTC TUXEDO Domain

You can specify the encoding that WebLogic Tuxedo Connector (WTC) uses for TUXEDO domains. Specify the following parameter at startup; to do so, modify the WebLogic Server start script (for example, the StartWebLogic.cmd file).

    -Dweblogic.wtc.encoding=Java encoding name

The encoding specified here applies to the entire TUXEDO domain.
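Before editing the start script, it can help to confirm that the JVM actually recognizes the encoding name you intend to pass. The sketch below is a hypothetical stand-alone check, not part of WTC: it reads the weblogic.wtc.encoding system property (which a -D option sets) and asks java.nio.charset.Charset whether a converter with that name exists. How WTC itself consumes the property is not shown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical stand-alone check; not part of WebLogic Tuxedo Connector.
class WtcEncodingCheck {

    // True if this JVM has a converter registered under the given name.
    static boolean isUsable(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // A -Dweblogic.wtc.encoding=... option makes the value visible here.
        String configured = System.getProperty("weblogic.wtc.encoding");
        System.out.println("weblogic.wtc.encoding=" + configured
                + " usable=" + isUsable(configured));
    }
}
```

For example, running the class with -Dweblogic.wtc.encoding=SJIS reports whether the JVM in use provides an SJIS converter.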


Predefined MIME-Java Encoding Mapping Table in WebLogic Server

IANA-to-Java Mapping

US-ASCII ANSI_X3.4-1968
US-ASCII -> US-ASCII
ISO-IR-6 -> US-ASCII
ANSI_X3.4-1986 -> US-ASCII
ANSI_X3.4-1968 -> US-ASCII
ISO_646.IRV:1991 -> US-ASCII
ASCII -> US-ASCII
ISO646-US -> US-ASCII
US -> US-ASCII
IBM367 -> US-ASCII
CP367 -> US-ASCII
CSASCII -> US-ASCII
IBM-367 -> US-ASCII

Latin1
ISO-8859-1 -> ISO-8859-1
ISO-IR-100 -> ISO-8859-1
ISO_8859-1 -> ISO-8859-1
LATIN1 -> ISO-8859-1
L1 -> ISO-8859-1
IBM819 -> ISO-8859-1
CP819 -> ISO-8859-1
CSISOLATIN1 -> ISO-8859-1
IBM-819 -> ISO-8859-1

Latin3
ISO-8859-3 -> ISO-8859-3
ISO-IR-109 -> ISO-8859-3
ISO_8859-3 -> ISO-8859-3
LATIN3 -> ISO-8859-3
L3 -> ISO-8859-3
CSISOLATIN3 -> ISO-8859-3

Latin4
ISO-8859-4 -> ISO-8859-4
ISO-IR-110 -> ISO-8859-4
ISO_8859-4 -> ISO-8859-4
LATIN4 -> ISO-8859-4
L4 -> ISO-8859-4
CSISOLATIN4 -> ISO-8859-4

Cyrillic
ISO-8859-5 -> ISO-8859-5
ISO-IR-144 -> ISO-8859-5
ISO_8859-5 -> ISO-8859-5
CYRILLIC -> ISO-8859-5
CSISOLATINCYRILLIC -> ISO-8859-5

Arabic
ISO-8859-6 -> ISO-8859-6
ISO-IR-127 -> ISO-8859-6
ISO_8859-6 -> ISO-8859-6
ECMA-114 -> ISO-8859-6
ASMO-708 -> ISO-8859-6
ARABIC -> ISO-8859-6
CSISOLATINARABIC -> ISO-8859-6

Greek
ISO-8859-7 -> ISO-8859-7
ISO-IR-126 -> ISO-8859-7
ISO_8859-7 -> ISO-8859-7
ELOT_928 -> ISO-8859-7
ECMA-118 -> ISO-8859-7
GREEK -> ISO-8859-7
GREEK8 -> ISO-8859-7
CSISOLATINGREEK -> ISO-8859-7

Hebrew
ISO-8859-8 -> ISO-8859-8
ISO-IR-138 -> ISO-8859-8
ISO_8859-8 -> ISO-8859-8
HEBREW -> ISO-8859-8
CSISOLATINHEBREW -> ISO-8859-8
ISO-8859-8-I -> ISO-8859-8
ISO_8859-8-I -> ISO-8859-8
CSISO88598I -> ISO-8859-8

Latin5
ISO-8859-9 -> ISO-8859-9
ISO-IR-148 -> ISO-8859-9
ISO_8859-9 -> ISO-8859-9
LATIN5 -> ISO-8859-9
L5 -> ISO-8859-9
CSISOLATIN5 -> ISO-8859-9

MIBenum: 109
ISO-8859-13 -> ISO-8859-13

Latin9
ISO-8859-15 -> ISO-8859-15
ISO_8859-15 -> ISO-8859-15
LATIN-9 -> ISO-8859-15

Simplified Chinese
GB2312 -> GB2312
CSGB2312 -> GB2312
GB18030 -> GB18030
ISO-2022-CN -> ISO2022CN

Chinese for Taiwan
BIG5 -> Big5
CSBIG5 -> Big5

MIBenum 2101
BIG5-HKSCS -> Big5-HKSCS

Korean
EUC-KR -> EUC-KR
CSEUCKR -> EUC-KR
ISO-2022-KR -> ISO-2022-KR
CSISO2022KR -> ISO-2022-KR

Japanese
SHIFT_JIS -> Shift_JIS
SHIFT-JIS -> Shift_JIS
CSSHIFTJIS -> Shift_JIS
MS_KANJI -> Shift_JIS
X-SJIS -> Shift_JIS
SJIS -> Shift_JIS
WINDOWS-31J -> Windows-31J
CSWINDOWS31J -> Windows-31J
EUC-JP -> EUC-JP
CSEUCPKDFMTJAPANESE -> EUC-JP
EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE -> EUC-JP
ISO-2022-JP -> ISO-2022-JP
CSISO2022JP -> ISO-2022-JP
X0201 -> JIS0201
JIS_X0201 -> JIS0201
CSHALFWIDTHKATAKANA -> JIS0201
X0208 -> JIS0208
JIS_C6226-1983 -> JIS0208
ISO-IR-87 -> JIS0208
JIS_X0208-1983 -> JIS0208
CSISO87JISX0208 -> JIS0208
X0212 -> JIS0212
JIS_X0212-1990 -> JIS0212
ISO-IR-159 -> JIS0212
CSISO159JISX02121990 -> JIS0212

Russian
KOI8-R -> KOI8-R
CSKOI8R -> KOI8-R

Thai
TIS-620 -> TIS-620

Traditional Chinese
CNS11643 -> EUC-TW
EUC-TW -> EUC-TW
EUCTW -> EUC-TW

MIBenum: 106
UTF-8 -> UTF-8
UTF8 -> UTF-8

Unicode
UTF-16 -> UTF-16
UTF-16BE -> UTF-16BE
UTF-16LE -> UTF-16LE

MIBenum: 2250 - 2258
WINDOWS-1250 -> Cp1250
WINDOWS-1251 -> Cp1251
WINDOWS-1252 -> Cp1252
WINDOWS-1253 -> Cp1253
WINDOWS-1254 -> Cp1254
WINDOWS-1255 -> Cp1255
WINDOWS-1256 -> Cp1256
WINDOWS-1257 -> Cp1257
WINDOWS-1258 -> Cp1258

EBCDIC
IBM037 -> Cp037
CP037 -> Cp037
CSIBM037 -> Cp037
EBCDIC-CP-US -> Cp037
EBCDIC-CP-CA -> Cp037
EBCDIC-CP-NL -> Cp037
EBCDIC-CP-WT -> Cp037
IBM273 -> Cp273
CP273 -> Cp273
CSIBM273 -> Cp273
IBM277 -> Cp277
CP277 -> Cp277
CSIBM277 -> Cp277
EBCDIC-CP-DK -> Cp277
EBCDIC-CP-NO -> Cp277
IBM278 -> Cp278
CP278 -> Cp278
CSIBM278 -> Cp278
EBCDIC-CP-FI -> Cp278
EBCDIC-CP-SE -> Cp278
IBM280 -> Cp280
CP280 -> Cp280
CSIBM280 -> Cp280
EBCDIC-CP-IT -> Cp280
IBM284 -> Cp284
CP284 -> Cp284
CSIBM284 -> Cp284
EBCDIC-CP-ES -> Cp284
EBCDIC-CP-GB -> Cp285
IBM285 -> Cp285
CP285 -> Cp285
CSIBM285 -> Cp285
EBCDIC-JP-KANA -> Cp290
IBM290 -> Cp290
CP290 -> Cp290
CSIBM290 -> Cp290
EBCDIC-CP-FR -> Cp297
IBM297 -> Cp297
CP297 -> Cp297
CSIBM297 -> Cp297
EBCDIC-CP-AR1 -> Cp420
IBM420 -> Cp420
CP420 -> Cp420
CSIBM420 -> Cp420
EBCDIC-CP-HE -> Cp424
IBM424 -> Cp424
CP424 -> Cp424
CSIBM424 -> Cp424
IBM437 -> Cp437
437 -> Cp437
CP437 -> Cp437
CSPC8CODEPAGE437 -> Cp437
EBCDIC-CP-CH -> Cp500
IBM500 -> Cp500
CP500 -> Cp500
CSIBM500 -> Cp500
EBCDIC-CP-CH -> Cp500
EBCDIC-CP-BE -> Cp500
IBM775 -> Cp775
CP775 -> Cp775
CSPC775BALTIC -> Cp775
IBM-THAI -> Cp838
CSIBMTHAI -> Cp838
IBM850 -> Cp850
850 -> Cp850
CP850 -> Cp850
CSPC850MULTILINGUAL -> Cp850
IBM852 -> Cp852
852 -> Cp852
CP852 -> Cp852
CSPCP852 -> Cp852
IBM855 -> Cp855
855 -> Cp855
CP855 -> Cp855
CSIBM855 -> Cp855
IBM857 -> Cp857
857 -> Cp857
CP857 -> Cp857
CSIBM857 -> Cp857
IBM00858 -> Cp858
CP00858 -> Cp858
CCSID00858 -> Cp858
IBM860 -> Cp860
860 -> Cp860
CP860 -> Cp860
CSIBM860 -> Cp860
IBM861 -> Cp861
861 -> Cp861
CP861 -> Cp861
CP-IS -> Cp861
CSIBM861 -> Cp861
IBM862 -> Cp862
862 -> Cp862
CP862 -> Cp862
CSPC862LATINHEBREW -> Cp862
IBM863 -> Cp863
863 -> Cp863
CP863 -> Cp863
CSIBM863 -> Cp863
IBM864 -> Cp864
CP864 -> Cp864
CSIBM864 -> Cp864
IBM865 -> Cp865
865 -> Cp865
CP865 -> Cp865
CSIBM865 -> Cp865
IBM866 -> Cp866
866 -> Cp866
CP866 -> Cp866
CSIBM866 -> Cp866
IBM868 -> Cp868
CP868 -> Cp868
CSIBM868 -> Cp868
CP-AR -> Cp868
IBM869 -> Cp869
CP869 -> Cp869
CSIBM869 -> Cp869
CP-GR -> Cp869
IBM870 -> Cp870
CP870 -> Cp870
CSIBM870 -> Cp870
EBCDIC-CP-ROECE -> Cp870
EBCDIC-CP-YU -> Cp870
IBM871 -> Cp871
CP871 -> Cp871
CSIBM871 -> Cp871
EBCDIC-CP-IS -> Cp871
IBM918 -> Cp918
CP918 -> Cp918
CSIBM918 -> Cp918
EBCDIC-CP-AR2 -> Cp918
IBM00924 -> Cp924
CP00924 -> Cp924
CCSID00924 -> Cp924
EBCDIC-LATIN9--EURO -> Cp924
IBM1026 -> Cp1026
CP1026 -> Cp1026
CSIBM1026 -> Cp1026
IBM01140 -> Cp1140
CP01140 -> Cp1140
CCSID01140 -> Cp1140
IBM01141 -> Cp1141
CP01141 -> Cp1141
CCSID01141 -> Cp1141
IBM01142 -> Cp1142
CP01142 -> Cp1142
CCSID01142 -> Cp1142
IBM01143 -> Cp1143
CP01143 -> Cp1143
CCSID01143 -> Cp1143
IBM01144 -> Cp1144
CP01144 -> Cp1144
CCSID01144 -> Cp1144
IBM01145 -> Cp1145
CP01145 -> Cp1145
CCSID01145 -> Cp1145
IBM01146 -> Cp1146
CP01146 -> Cp1146
CCSID01146 -> Cp1146
IBM01147 -> Cp1147
CP01147 -> Cp1147
CCSID01147 -> Cp1147
IBM01148 -> Cp1148
CP01148 -> Cp1148
CCSID01148 -> Cp1148
IBM01149 -> Cp1149
CP01149 -> Cp1149
CCSID01149 -> Cp1149

MIBenum: 2028 - 2063, 2091 - 2100
IBM-1047 -> Cp1047
IBM1047 -> Cp1047
CP1047 -> Cp1047
IBM-37 -> Cp037
IBM-273 -> Cp273
IBM-277 -> Cp277
IBM-278 -> Cp278
IBM-280 -> Cp280
IBM-284 -> Cp284
IBM-285 -> Cp285
IBM-297 -> Cp297
IBM-420 -> Cp420
IBM-424 -> Cp424
IBM-437 -> Cp437
IBM-500 -> Cp500
IBM-775 -> Cp775
IBM-850 -> Cp850
IBM-852 -> Cp852
IBM-855 -> Cp855
IBM-857 -> Cp857
IBM-858 -> Cp858
IBM-860 -> Cp860
IBM-861 -> Cp861
IBM-862 -> Cp862
IBM-863 -> Cp863
IBM-864 -> Cp864
IBM-865 -> Cp865
IBM-866 -> Cp866
IBM-868 -> Cp868
IBM-869 -> Cp869
IBM-870 -> Cp870
IBM-871 -> Cp871
IBM-918 -> Cp918
IBM-924 -> Cp924
IBM-1026 -> Cp1026
IBM-1140 -> Cp1140
IBM-1141 -> Cp1141
IBM-1142 -> Cp1142
IBM-1143 -> Cp1143
IBM-1144 -> Cp1144
IBM-1145 -> Cp1145
IBM-1146 -> Cp1146
IBM-1147 -> Cp1147
IBM-1148 -> Cp1148
IBM-1149 -> Cp1149

Java-to-IANA Mapping

ASCII and its aliases
ASCII -> US-ASCII
US-ASCII -> US-ASCII
646 -> US-ASCII
ISO_646.IRV:1983 -> US-ASCII
ANSI_X3.4-1968 -> US-ASCII
ISO646-US -> US-ASCII
DEFAULT -> US-ASCII
ASCII7 -> US-ASCII

ISO8859_1 and its aliases
ISO8859_1 -> ISO-8859-1
8859_1 -> ISO-8859-1
ISO_8859-1:1987 -> ISO-8859-1
ISO-IR-100 -> ISO-8859-1
ISO_8859-1 -> ISO-8859-1
ISO-8859-1 -> ISO-8859-1
ISO8859-1 -> ISO-8859-1
LATIN1 -> ISO-8859-1
L1 -> ISO-8859-1
IBM819 -> ISO-8859-1
IBM-819 -> ISO-8859-1
CP819 -> ISO-8859-1
819 -> ISO-8859-1
CSISOLATIN1 -> ISO-8859-1

ISO8859_2 and its aliases
ISO8859_2 -> ISO-8859-2
8859_2 -> ISO-8859-2
ISO_8859-2:1987 -> ISO-8859-2
ISO-IR-101 -> ISO-8859-2
ISO_8859-2 -> ISO-8859-2
ISO-8859-2 -> ISO-8859-2
ISO8859-2 -> ISO-8859-2
LATIN2 -> ISO-8859-2
L2 -> ISO-8859-2
IBM912 -> ISO-8859-2
IBM-912 -> ISO-8859-2
CP912 -> ISO-8859-2
912 -> ISO-8859-2
CSISOLATIN2 -> ISO-8859-2

ISO8859_3 and its aliases
ISO8859_3 -> ISO-8859-3
8859_3 -> ISO-8859-3
ISO_8859-3:1988 -> ISO-8859-3
ISO-IR-109 -> ISO-8859-3
ISO_8859-3 -> ISO-8859-3
ISO-8859-3 -> ISO-8859-3
ISO8859-3 -> ISO-8859-3
LATIN3 -> ISO-8859-3
L3 -> ISO-8859-3
IBM913 -> ISO-8859-3
IBM-913 -> ISO-8859-3
CP913 -> ISO-8859-3
913 -> ISO-8859-3
CSISOLATIN3 -> ISO-8859-3

ISO8859_4 and its aliases
ISO8859_4 -> ISO-8859-4
8859_4 -> ISO-8859-4
ISO_8859-4:1988 -> ISO-8859-4
ISO-IR-110 -> ISO-8859-4
ISO_8859-4 -> ISO-8859-4
ISO-8859-4 -> ISO-8859-4
ISO8859-4 -> ISO-8859-4
LATIN4 -> ISO-8859-4
L4 -> ISO-8859-4
IBM914 -> ISO-8859-4
IBM-914 -> ISO-8859-4
CP914 -> ISO-8859-4
914 -> ISO-8859-4
CSISOLATIN4 -> ISO-8859-4

ISO8859_5 and its aliases
ISO8859_5 -> ISO-8859-5
8859_5 -> ISO-8859-5
ISO_8859-5:1988 -> ISO-8859-5
ISO-IR-144 -> ISO-8859-5
ISO_8859-5 -> ISO-8859-5
ISO-8859-5 -> ISO-8859-5
ISO8859-5 -> ISO-8859-5
CYRILLIC -> ISO-8859-5
CSISOLATINCYRILLIC -> ISO-8859-5
IBM915 -> ISO-8859-5
IBM-915 -> ISO-8859-5
CP915 -> ISO-8859-5
915 -> ISO-8859-5

ISO8859_6 and its aliases
ISO8859_6 -> ISO-8859-6
8859_6 -> ISO-8859-6
ISO_8859-6:1987 -> ISO-8859-6
ISO-IR-127 -> ISO-8859-6
ISO_8859-6 -> ISO-8859-6
ISO-8859-6 -> ISO-8859-6
ISO8859-6 -> ISO-8859-6
ECMA-114 -> ISO-8859-6
ASMO-708 -> ISO-8859-6
ARABIC -> ISO-8859-6
CSISOLATINARABIC -> ISO-8859-6
IBM1089 -> ISO-8859-6
IBM-1089 -> ISO-8859-6
CP1089 -> ISO-8859-6
1089 -> ISO-8859-6

ISO8859_7 and its aliases
ISO8859_7 -> ISO-8859-7
8859_7 -> ISO-8859-7
ISO_8859-7:1987 -> ISO-8859-7
ISO-IR-126 -> ISO-8859-7
ISO_8859-7 -> ISO-8859-7
ISO-8859-7 -> ISO-8859-7
ISO8859-7 -> ISO-8859-7
ELOT_928 -> ISO-8859-7
ECMA-118 -> ISO-8859-7
GREEK -> ISO-8859-7
GREEK8 -> ISO-8859-7
CSISOLATINGREEK -> ISO-8859-7
IBM813 -> ISO-8859-7
IBM-813 -> ISO-8859-7
CP813 -> ISO-8859-7
813 -> ISO-8859-7

ISO8859_8 and its aliases
ISO8859_8 -> ISO-8859-8
8859_8 -> ISO-8859-8
ISO_8859-8:1988 -> ISO-8859-8
ISO-IR-138 -> ISO-8859-8
ISO_8859-8 -> ISO-8859-8
ISO-8859-8 -> ISO-8859-8
ISO8859-8 -> ISO-8859-8
HEBREW -> ISO-8859-8
CSISOLATINHEBREW -> ISO-8859-8
IBM916 -> ISO-8859-8
IBM-916 -> ISO-8859-8
CP916 -> ISO-8859-8
916 -> ISO-8859-8

ISO8859_9 and its aliases
ISO8859_9 -> ISO-8859-9
8859_9 -> ISO-8859-9
ISO-IR-148 -> ISO-8859-9
ISO_8859-9 -> ISO-8859-9
ISO-8859-9 -> ISO-8859-9
ISO8859-9 -> ISO-8859-9
LATIN5 -> ISO-8859-9
L5 -> ISO-8859-9
IBM920 -> ISO-8859-9
IBM-920 -> ISO-8859-9
CP920 -> ISO-8859-9
920 -> ISO-8859-9
CSISOLATIN5 -> ISO-8859-9

ISO8859_13 and its aliases
ISO8859_13 -> ISO-8859-13
8859_13 -> ISO-8859-13
ISO_8859-13 -> ISO-8859-13
ISO-8859-13 -> ISO-8859-13
ISO8859-13 -> ISO-8859-13

ISO8859_15 and its aliases
ISO8859_15 -> ISO-8859-15
8859_15 -> ISO-8859-15
ISO-8859-15 -> ISO-8859-15
ISO_8859-15 -> ISO-8859-15
ISO8859-15 -> ISO-8859-15
IBM923 -> ISO-8859-15
IBM-923 -> ISO-8859-15
CP923 -> ISO-8859-15
923 -> ISO-8859-15
LATIN0 -> ISO-8859-15
LATIN9 -> ISO-8859-15
CSISOLATIN0 -> ISO-8859-15
CSISOLATIN9 -> ISO-8859-15
ISO8859_15_FDIS -> ISO-8859-15

Simplified Chinese
EUC_CN -> GB2312
GB2312 -> GB2312
GB2312-80 -> GB2312
GB2312-1980 -> GB2312
EUC-CN -> GB2312
EUCCN -> GB2312
ISO2022CN -> ISO-2022-CN
GB18030 -> GB18030

Chinese for Taiwan
BIG5 -> Big5

Big5_HKSCS
BIG5_HKSCS -> Big5-HKSCS
BIG5-HKSCS -> Big5-HKSCS
BIG5HK -> Big5-HKSCS
BIG5-HKSCS:UNICODE3.0 -> Big5-HKSCS

Korean
KSC5601 -> EUC-KR
EUC_KR -> EUC-KR
EUC-KR -> EUC-KR
EUCKR -> EUC-KR
KS_C_5601-1987 -> EUC-KR
KSC5601-1987 -> EUC-KR
KSC5601_1987 -> EUC-KR
KSC_5601 -> EUC-KR
5601 -> EUC-KR
ISO2022KR -> ISO-2022-KR
ISO-2022-KR -> ISO-2022-KR
CSISO2022KR -> ISO-2022-KR

Japanese
SJIS -> Shift_JIS
SHIFT_JIS -> Shift_JIS
SHIFT-JIS -> Shift_JIS
CSSHIFTJIS -> Shift_JIS
X-SJIS -> Shift_JIS
MS_KANJI -> Shift_JIS
PCK -> Shift_JIS
MS932 -> Windows-31J
WINDOWS-31J -> Windows-31J
CSWINDOWS31J -> Windows-31J
EUC_JP -> EUC-JP
EUC-JP -> EUC-JP
EUCJIS -> EUC-JP
EUCJP -> EUC-JP
CSEUCPKDFMTJAPANESE -> EUC-JP
EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE -> EUC-JP
X-EUC-JP -> EUC-JP
X-EUCJP -> EUC-JP
ISO2022JP -> ISO-2022-JP
JIS -> ISO-2022-JP
ISO-2022-JP -> ISO-2022-JP
CSISO2022JP -> ISO-2022-JP
JIS_ENCODING -> ISO-2022-JP
CSJISENCODING -> ISO-2022-JP
JIS0201 -> X0201
JIS0208 -> X0208
JIS0212 -> ISO-IR-159

Russian
KOI8_R -> KOI8-R
KOI8-R -> KOI8-R
KOI8 -> KOI8-R
CSKOI8R -> KOI8-R

Thai
TIS620 -> TIS-620
TIS620.2533 -> TIS-620
TIS-620 -> TIS-620

Traditional Chinese
EUC_TW -> CNS11643
CNS11643 -> CNS11643
EUC-TW -> CNS11643
EUCTW -> CNS11643

UTF8
UTF8 -> UTF-8
UTF-8 -> UTF-8
UNICODE-1-1-UTF-8 -> UTF-8

Unicode
UTF16 -> UTF-16
UTF-16 -> UTF-16
UNICODE -> UTF-16
UTF-16BE -> UTF-16BE
UNICODEBIG -> UTF-16BE
UTF-16LE -> UTF-16LE
UNICODELITTLE -> UTF-16LE

MIBenum: 2250 - 2258
CP1250 -> windows-1250
CP1251 -> windows-1251
CP1252 -> windows-1252
CP1253 -> windows-1253
CP1254 -> windows-1254
CP1255 -> windows-1255
CP1256 -> windows-1256
CP1257 -> windows-1257
CP1258 -> windows-1258

EBCDIC
CP037 -> EBCDIC-CP-US
IBM037 -> EBCDIC-CP-US
IBM-037 -> EBCDIC-CP-US
037 -> EBCDIC-CP-US
CP273 -> IBM273
IBM273 -> IBM273
IBM-273 -> IBM273
273 -> IBM273
CP277 -> EBCDIC-CP-DK
IBM277 -> EBCDIC-CP-DK
IBM-277 -> EBCDIC-CP-DK
277 -> EBCDIC-CP-DK
CP278 -> EBCDIC-CP-FI
IBM278 -> EBCDIC-CP-FI
IBM-278 -> EBCDIC-CP-FI
278 -> EBCDIC-CP-FI
CP280 -> EBCDIC-CP-IT
IBM280 -> EBCDIC-CP-IT
IBM-280 -> EBCDIC-CP-IT
280 -> EBCDIC-CP-IT
CP284 -> EBCDIC-CP-ES
IBM284 -> EBCDIC-CP-ES
IBM-284 -> EBCDIC-CP-ES
CP284 -> EBCDIC-CP-ES
284 -> EBCDIC-CP-ES
CP285 -> EBCDIC-CP-GB
IBM285 -> EBCDIC-CP-GB
IBM-285 -> EBCDIC-CP-GB
285 -> EBCDIC-CP-GB
CP290 -> EBCDIC-JP-KANA
CP297 -> EBCDIC-CP-FR
IBM297 -> EBCDIC-CP-FR
IBM-297 -> EBCDIC-CP-FR
297 -> EBCDIC-CP-FR
CP420 -> EBCDIC-CP-AR1
IBM420 -> EBCDIC-CP-AR1
IBM-420 -> EBCDIC-CP-AR1
420 -> EBCDIC-CP-AR1
CP424 -> EBCDIC-CP-HE
IBM424 -> EBCDIC-CP-HE
IBM-424 -> EBCDIC-CP-HE
424 -> EBCDIC-CP-HE
CP437 -> IBM437
IBM437 -> IBM437
IBM-437 -> IBM437
437 -> IBM437
CSPC8CODEPAGE437 -> IBM437
CP500 -> EBCDIC-CP-CH
IBM500 -> EBCDIC-CP-CH
IBM-500 -> EBCDIC-CP-CH
500 -> EBCDIC-CP-CH
CP775 -> IBM775
IBM775 -> IBM775
IBM-775 -> IBM775
775 -> IBM775
CP838 -> IBM-Thai
IBM838 -> IBM-Thai
IBM-838 -> IBM-Thai
838 -> IBM-Thai
CP850 -> IBM850
IBM850 -> IBM850
IBM-850 -> IBM850
850 -> IBM850
CSPC850MULTILINGUAL -> IBM850
CP852 -> IBM852
IBM852 -> IBM852
IBM-852 -> IBM852
852 -> IBM852
CSPCP852 -> IBM852
CP855 -> IBM855
IBM855 -> IBM855
IBM-855 -> IBM855
855 -> IBM855
CSPCP855 -> IBM855
CP857 -> IBM857
IBM857 -> IBM857
IBM-857 -> IBM857
857 -> IBM857
CSIBM857 -> IBM857
CP858 -> IBM00858
CP860 -> IBM860
IBM860 -> IBM860
IBM-860 -> IBM860
860 -> IBM860
CSIBM860 -> IBM860
CP861 -> IBM861
IBM861 -> IBM861
IBM-861 -> IBM861
CP-IS -> IBM861
861 -> IBM861
CSIBM861 -> IBM861
CP862 -> IBM862
IBM862 -> IBM862
IBM-862 -> IBM862
862 -> IBM862
CSPC862LATINHEBREW -> IBM862
CP863 -> IBM863
IBM863 -> IBM863
IBM-863 -> IBM863
863 -> IBM863
CSIBM863 -> IBM863
CP864 -> IBM864
IBM864 -> IBM864
IBM-864 -> IBM864
CSIBM864 -> IBM864
CP865 -> IBM865
IBM865 -> IBM865
IBM-865 -> IBM865
865 -> IBM865
CSIBM865 -> IBM865
CP866 -> IBM866
IBM866 -> IBM866
IBM-866 -> IBM866
866 -> IBM866
CSIBM866 -> IBM866
CP868 -> IBM868
IBM868 -> IBM868
IBM-868 -> IBM868
868 -> IBM868
CP869 -> IBM869
IBM869 -> IBM869
IBM-869 -> IBM869
869 -> IBM869
CP-GR -> IBM869
CSIBM869 -> IBM869
CP870 -> EBCDIC-CP-ROECE
IBM870 -> EBCDIC-CP-ROECE
IBM-870 -> EBCDIC-CP-ROECE
870 -> EBCDIC-CP-ROECE
CP871 -> EBCDIC-CP-IS
IBM871 -> EBCDIC-CP-IS
IBM-871 -> EBCDIC-CP-IS
871 -> EBCDIC-CP-IS
CP918 -> EBCDIC-CP-AR2
IBM918 -> EBCDIC-CP-AR2
IBM-918 -> EBCDIC-CP-AR2
918 -> EBCDIC-CP-AR2
CP924 -> IBM00924
CP1026 -> IBM1026
IBM1026 -> IBM1026
IBM-1026 -> IBM1026
1026 -> IBM1026
CP1140 -> IBM01140
CP1141 -> IBM01141
CP1142 -> IBM01142
CP1143 -> IBM01143
CP1144 -> IBM01144
CP1145 -> IBM01145
CP1146 -> IBM01146
CP1147 -> IBM01147
CP1148 -> IBM01148
CP1149 -> IBM01149
CP1047 -> IBM1047

Locale-to-IANA Mapping

ar -> ISO-8859-6
be -> ISO-8859-5
bg -> ISO-8859-5
ca -> ISO-8859-1
cs -> ISO-8859-2
da -> ISO-8859-1
de -> ISO-8859-1
el -> ISO-8859-7
en -> ISO-8859-1
es -> ISO-8859-1
et -> ISO-8859-1
fi -> ISO-8859-1
fr -> ISO-8859-1
hr -> ISO-8859-2
hu -> ISO-8859-2
is -> ISO-8859-1
it -> ISO-8859-1
iw -> ISO-8859-8
ja -> Shift_JIS
ko -> EUC-KR
lt -> ISO-8859-2
lv -> ISO-8859-2
mk -> ISO-8859-5
nl -> ISO-8859-1
no -> ISO-8859-1
pl -> ISO-8859-2
pt -> ISO-8859-1
ro -> ISO-8859-2
ru -> ISO-8859-5
sh -> ISO-8859-5
sk -> ISO-8859-2
sl -> ISO-8859-2
sq -> ISO-8859-2
sr -> ISO-8859-5
sv -> ISO-8859-1
th -> TIS-620
tr -> ISO-8859-9
uk -> ISO-8859-5
zh -> GB2312
zh_TW -> Big5

 
