Use with Multibyte Environments
Main features of Internationalization (I18n) in WebLogic Server:
To configure a distributed system that handles multibyte character data with WebLogic Server, you need to understand how encodings are specified, particularly in Java and J2EE. You also need to know how encoding is handled by the operating system, the Internet protocols, and the backend systems linked to WebLogic Server, so that you can control the encoding conversions correctly.
The following is a concise description of the encoding handling in WebLogic Server.
WebLogic Server is a 100% pure Java application server program. All encodings inside the Server use Unicode.
This allows WebLogic Server to handle characters of all languages at the same time, provided that their characters can be handled by Unicode.
The encoding conversion is needed when WebLogic Server exchanges character data with the outside.
Few operating system environments use Unicode, Java's internal code, directly. Most use a native encoding that is defined individually for each platform. Examples of native encodings are: for Windows, the code page that corresponds to each language; for UNIX, the encoding that corresponds to the locale specified by the LANG environment variable; and for databases, the character set specified when the database was created or the character set of the client.
For this reason, every time character data is input to or output from WebLogic Server, whether through the operating system or an external resource, it must be converted between the native encoding and Unicode (character set conversion).
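The conversion described above can be tried outside the server with plain JDK classes. The following is a minimal sketch, not WebLogic Server code: the class and method names are made up for illustration, and UTF-8 and UTF-16BE stand in for the external encoding because every JDK is required to provide them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NativeConversionSketch {

    // Unicode String -> bytes in an external ("native") encoding.
    static byte[] toNative(String text, Charset external) {
        return text.getBytes(external);
    }

    // Bytes in an external encoding -> Unicode String.
    static String toUnicode(byte[] bytes, Charset external) {
        return new String(bytes, external);
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as Unicode escapes so that the
        // source file itself is encoding-neutral.
        String text = "\u65e5\u672c\u8a9e";
        Charset[] candidates = {StandardCharsets.UTF_8, StandardCharsets.UTF_16BE};
        for (Charset cs : candidates) {
            byte[] external = toNative(text, cs);
            String back = toUnicode(external, cs);
            System.out.println(cs.name() + ": " + external.length
                    + " bytes, round trip ok = " + text.equals(back));
        }
    }
}
```

The same string occupies a different number of bytes in each external encoding, while the in-memory String is identical after a round trip through a single encoding.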
Note: Characters in a stream to which a Java class is serialized do not require code conversion, because they are encoded in UTF-8 and the Unicode values are preserved as internal information of the class. Hence EJB and RMI generally require no consideration of encoding.
Note also that encoding conversion consumes a relatively large amount of CPU resources, because every character must be converted individually. For better performance, design the application to keep the number of code conversions as small as possible.
In WebLogic Server, the encoding conversion for the server itself and the encoding conversion for application components and resources on WebLogic Server are separate.
In WebLogic Server, the encoding of the server log or the Administration Console is determined by the default encoding of a server's Java VM or a browser's language setting independently of the encoding of the application component or the language of the content that the WebLogic Server is serving.
Moreover, by deploying an application component to WebLogic Server, you can configure it to behave identically, regardless of any locale (language) environment the WebLogic Server is running in.
Also, you can set the encoding conversion individually for each resource configured on the WebLogic Server's container (for example, a JDBC connection pool).
The encoding conversions of WebLogic Server itself include:
The encoding conversions of individual applications include:
Resources on WebLogic Server include:
When you specify an encoding on WebLogic Server, be clear about which of the above three categories it applies to. You must also always verify that character objects are created correctly inside WebLogic Server, and that character objects inside WebLogic Server are encoded and output as intended.
As above, when multibyte characters are to be handled on WebLogic Server, the entire process of encoding conversion must be understood and any setting as necessary must be made. In some cases, the application software may not be able to handle the multibyte characters correctly without setting encoding conversion.
In any case, when encoding is not specified, some default encoding will be applied. The default encoding applied may vary with each specification and/or environment.
The default encodings relating to the behavior of the WebLogic Server include:
Example:
Since, as shown above, a default encoding varies with the technical specification employed, specifying no encoding at all will lead to incorrect multibyte handling in WebLogic Server. Therefore the full understanding of each way to specify encodings described in the following chapters is strongly recommended to control encoding conversions.
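To see which default the Java VM itself would apply in a given environment, the standard JDK calls below can be used. This is a minimal sketch with a made-up class name. It reports only the Java VM defaults, not the defaults defined by the J2EE or XML specifications.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class DefaultEncodingSketch {

    // Collects the Java VM defaults that are derived from the platform locale.
    static String describeDefaults() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding")
                + ", locale=" + Locale.getDefault();
    }

    public static void main(String[] args) {
        // The printed values depend on the operating system settings,
        // so no particular output is assumed here.
        System.out.println(describeDefaults());
    }
}
```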
What Java calls an "encoding" corresponds to a "character set". Several terms are used to describe a character set, and their definitions differ slightly.
An encoding, or character set, is a definition that assigns computer-readable codes to the characters of a specific language so that a computer can process them. This definition is called an "encoding" in Java terminology and a "character set" in Internet terminology.
Java absorbs these differences at the input/output stage, so that it always uses only Unicode internally. This is a strength of Java: it can handle any character set for which an encoding definition is available. In other words, Java can in principle absorb all the encoding differences that exist among systems. At present, however, no encoding conversion table handles every minor difference, and the existing tables have some limitations imposed by consistency with Unicode.
What is particularly important with Java Web application servers is the difference between encoding names of Java and MIME character set which is defined by IANA used in Internet and XML. To absorb this difference, WebLogic Server has a mapping table between Java encoding names and IANA character set names (see Predefined MIME-Java Encoding Mapping Table in WebLogic Server). Using this, for example, the file defined as Shift_JIS in JSP can be treated as SJIS in Java. Also with Web components, you can change this mapping table of WebLogic Server system and treat, for example, IANA character set name ‘Shift_JIS’ as a Java encoding ‘cp943’ (see Mapping Change for Java Encoding and IANA Character Set Involving HTTP Responses (Not J2EE-Compliant)).
Xerces, the XML parser embedded in WebLogic Server, has its own mapping table between IANA and Java names. Users cannot customize this table. For example, the IANA character set name ‘Shift_JIS’ is mapped to the Java encoding name ‘SJIS’.
In WebLogic Server, the encoding is basically specified by using encoding names of Java. Also, for J2EE, Internet and XML, IANA character set names are used. The user is requested to change this mapping as necessary.
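The JDK also keeps its own table of aliases for each charset, which gives a feel for how one encoding can be known under several names. The sketch below queries that JDK table only. It does not read WebLogic Server's MIME-Java mapping table or the Xerces table, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

public class CharsetNameSketch {

    // Returns the JDK's canonical name for a charset label,
    // or null if this JDK does not provide that charset.
    static String canonicalName(String label) {
        return Charset.isSupported(label) ? Charset.forName(label).name() : null;
    }

    public static void main(String[] args) {
        String[] labels = {"UTF8", "latin1", "Shift_JIS", "x-sjis", "Windows-31J"};
        for (String label : labels) {
            // Results for the Japanese labels depend on which charsets
            // the installed JDK ships, so none are assumed here.
            System.out.println(label + " -> " + canonicalName(label));
        }
    }
}
```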
WebLogic Server lets you specify encodings for various scopes. For example, JSP provides the page directive, compliant with the JSP 2.0 specification, to specify the encoding of an individual page. The encoding for such a scope has nothing to do with the default encoding of the Java VM on which WebLogic Server runs, that is, the encoding that the Java VM implementation determines from the locale of the operating platform. Even if the Java VM locale is English, you can serve JSP files that contain multibyte characters without problems. For the following items, however, character strings are handled using the default encoding of the Java VM.
These operate with the default encoding of the Java VM. To switch the language and encoding of WebLogic Server log messages by changing the platform locale, make the settings described below. The Java VM default encoding cannot be switched dynamically once the VM has started, so make these settings before you restart WebLogic Server.
From Control Panel - Region (or Regional Options), select a language, such as English (United States), Japanese, Korean, Chinese (PRC) and Chinese (Taiwan). By this selection, the server will operate using CP1252, MS932, MS949, GBK or MS950 as the default encoding.
Specify the locale supported by your platform in the LANG environment variable.
Some examples of encoding for server vs. LANG environment variables are shown below. For other combinations, consult with your platform manuals.
Platform | Encoding | LANG environment variable |
---|---|---|
Solaris | EUC-JP, SJIS | ja, ja_JP.eucJP, or ja_JP.PCK |
Solaris | EUC-KR | ko or ko_KR |
Solaris | GB2312, GBK | zh_CN or zh_CN.GBK |
Solaris | GB18030 | zh_CN.GB18030 |
Solaris | Big5 | zh_TW.BIG5 |
HP | EUCJIS, SJIS | ja_JP.eucJP, ja_JP.SJIS |
HP | EUC-KR | ko.eucKR or ko_KR |
HP | GB2312 | zh_CN.hp15CN |
HP | GB18030 | zh_CN.gb18030 |
HP | Big5 | zh_TW.big5 |
For example, if you specify EUC-JP on Solaris, the LANG setting looks like this:
LANG=ja
Use the same encoding for all the WebLogic Servers throughout a domain.
In WebLogic Server, it is necessary to have the same encoding settings for all the servers in the domain.
For example, when a Windows platform exists within a domain, standardize on MS932 encoding. If a server uses a different encoding, that server's log will not be displayed correctly.
Use the same encoding for all the WebLogic Servers in a cluster.
In WebLogic Server, it is necessary to have the same encoding settings for all the servers in the cluster.
For example, when a Windows platform exists within a cluster, standardize on MS932 encoding. If a server uses a different encoding, that server's log will not be displayed correctly.
The config.xml file is input/output in UTF-8. When editing the file directly with a text editor, read and save in UTF-8.
When creating a JDBC connection pool, you must specify an appropriate encoding for a connection to a DB which uses multibyte characters. Also depending upon the requirements from the system to be built, encoding conversion mappings for Web layer and DB layer may have to be matched.
In WebLogic Server, multibyte characters in DD files of J2EE components are handled according to XML declaration. If the DD file has no encoding attribute in the XML declaration or has no XML declaration, the file is handled as UTF-8.
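The rule that the XML declaration decides the encoding, with UTF-8 as the fallback, is ordinary XML parser behavior and can be observed with the parser bundled in the JDK. The sketch below is an illustration under that assumption. It does not use WebLogic Server's descriptor loader, and the class name and the sample element are made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlDeclarationSketch {

    // Parses raw bytes and returns the text of the root element.
    // No charset is passed in: the parser decides from the XML
    // declaration, or falls back to UTF-8 when there is none.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Declared as ISO-8859-1 and stored in ISO-8859-1.
        byte[] declared =
                "<?xml version='1.0' encoding='ISO-8859-1'?><display-name>caf\u00e9</display-name>"
                        .getBytes(StandardCharsets.ISO_8859_1);
        // No declaration at all, stored in UTF-8.
        byte[] undeclared = "<display-name>\u65e5\u672c\u8a9e</display-name>"
                .getBytes(StandardCharsets.UTF_8);

        System.out.println("declared parsed correctly: "
                + "caf\u00e9".equals(rootText(declared)));
        System.out.println("undeclared parsed correctly: "
                + "\u65e5\u672c\u8a9e".equals(rootText(undeclared)));
    }
}
```

A file saved in an encoding other than the one it declares, or saved in a non-UTF-8 encoding with no declaration, is the case to avoid.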
The language displayed when Administration Console is started is the language you specify in the language property for your Web browser. For example, if you have not changed the setting in your IE under Japanese Windows, Japanese language will be displayed when the Administration Console is started. If you wish to change it into English, set the language setting of the browser to "English" and delete all other languages in the list. Note that all output encoding of Administration Console is standardized to UTF-8, regardless of languages.
WebLogic Server uses JavaMail to send e-mail. Therefore, adding mail.mime.charset, a JavaMail system property, to the WebLogic Server startup options lets you change the encoding of outgoing e-mail. (When this property is omitted, the default encoding of the Java VM is used.)
Example:
-Dmail.mime.charset=ISO-2022-JP
A typical example of sending an e-mail from WebLogic Server would be to use SMTP for notification of diagnosis service at system management.
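The lookup order described above (the mail.mime.charset system property first, then the Java VM default) can be approximated as follows. This is a simplified sketch of the rule, not JavaMail's own code, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

public class MailCharsetSketch {

    // Approximates the rule: use mail.mime.charset when it is set,
    // otherwise fall back to the Java VM default charset.
    static String effectiveMailCharset() {
        return System.getProperty("mail.mime.charset",
                Charset.defaultCharset().name());
    }

    public static void main(String[] args) {
        System.out.println("without the property: " + effectiveMailCharset());

        // Equivalent to starting the VM with -Dmail.mime.charset=ISO-2022-JP.
        System.setProperty("mail.mime.charset", "ISO-2022-JP");
        System.out.println("with the property: " + effectiveMailCharset());
    }
}
```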
As already described in Overview of Internationalization, all characters inside WebLogic Server are handled by Unicode, but any input/output of character data with external resources will lead to encoding conversion. This section includes topics on some useful notes when processing multibyte characters in view of application programming.
Conforming to RFC3280, WebLogic Server supports UTF-8 encoding with public key certificates. For details of RFC3280, see Internet X.509 Public Key Infrastructure: Certificate and CRL Profile.
Security Policy setup will fail if one of the following locales is specified.
Workaround:
Please change your browser locale to en-us. For how to change the locale setting of your browser, please refer to the browser's help.
Fixed in:
This problem was fixed in WLS9.2MP1.
From the viewpoint of WebLogic Server, the external resources that require encoding conversion are those that use the HTTP protocol. HTTP is designed to carry messages in various encodings. How Web components convert between the Unicode character strings handled inside the server and messages in a specific encoding on HTTP is therefore very important. To address this, encoding conversion settings are provided as several APIs and parameters, some defined by the J2EE specification and some proprietary to WebLogic Server. Read the following explanation and choose the combination of settings that best meets the requirements of the system you are building.
The targets of encoding settings for J2EE Web components are as follows:
The J2EE specification defines the default encoding used when these settings are omitted. The default encoding for each component is shown below.
Component Name | Default Encoding |
---|---|
Servlet | ISO-8859-1 |
JSP | ISO-8859-1 |
XML format JSP Document | UTF-8 |
Tag File | ISO-8859-1 |
XML format Tag File | UTF-8 |
Because ISO-8859-1 is the default encoding for everything except the XML components, an encoding setting is essential when multibyte characters are used. The details of the settings for each Web component are shown below. The meaning of each column in the table is as follows:
There are three ways to specify the encoding of a servlet response.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
ServletResponse#setContentType method | Each HTTP response | MIME type with Charset attribute (IANA name) | YES | 1 | setContentType("text/html;charset=Shift_JIS"); |
ServletResponse#setCharacterEncoding method | Each HTTP response | IANA name | YES | 1 | setCharacterEncoding("EUC-JP"); |
ServletResponse#setLocale method | Each HTTP response | Locale name (Note 1) | YES | 2 | setLocale(Locale.JAPANESE); |
Note 1: Encoding is determined by IANA name that is identified by locale name. See Locale-to-IANA Mapping for locale vs. IANA name.
Note that you need to call these methods before obtaining Writer, as shown below.
res.setContentType("text/html;charset=Shift_JIS");
PrintWriter out = res.getWriter();
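The ordering matters because a writer fixes its encoder when it is created. The stand-alone sketch below imitates this with plain java.io classes. It is not the servlet container's implementation, and the class and method names are invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterCharsetSketch {

    // Writes the body through a writer that is bound to one charset at
    // construction time, the way a response writer is bound when obtained.
    static byte[] render(String body, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charset));
        out.print(body);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String body = "\u65e5\u672c\u8a9e"; // three Japanese characters

        byte[] good = render(body, StandardCharsets.UTF_8);
        byte[] lossy = render(body, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes: " + good.length);
        // ISO-8859-1 cannot represent these characters, so the encoder
        // substitutes a replacement byte for each of them.
        System.out.println("ISO-8859-1 bytes: " + lossy.length
                + ", first byte = " + (char) lossy[0]);
    }
}
```

Once the writer exists, changing the declared charset no longer changes the bytes that writer produces, which is why the charset has to be set first.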
There are five ways to specify the encoding of a JSP response.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
contentType attribute of Page directive | Each file | MIME type with Charset attribute (IANA name) | YES | 1 | <%@ page contentType="text/html; charset=EUC-JP" %> |
page-encoding element in web.xml | Within specified URL pattern | IANA name and URL pattern | YES | 2 | <jsp-config> |
pageEncoding attribute of Page directive | Each file | IANA name | YES | 2 (Note 1) | <%@ page pageEncoding="Windows-31J" %> |
encoding element in weblogic.xml (not recommended) | Entire Web application | Java encoding name | NO | 3 | <jsp-descriptor> |
webapp.encoding.default parameter of application-param element in weblogic-application.xml (Note 2) | Entire Enterprise application | IANA name | NO | 4 | <application-param> |
Note 1: Under the JSP 2.0 specification, if the page-encoding element in web.xml and the pageEncoding attribute of the page directive do not match, an error occurs when the JSP is compiled. As a result, the two have the same priority.
Note 2: The value set here is reflected in the parameter of the ServletResponse#setContentType method in the servlet code into which the JSP is compiled. Therefore, when webapp.encoding.default is changed, the JSP files of the entire enterprise application must be recompiled for the change to take effect.
The encoding of a JSP Document response can be specified as follows.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
contentType attribute of Page directive | Each file | MIME type with Charset attribute (IANA name) | YES | 1 | <jsp:directive.page contentType="text/html; CHARSET=euc-jp"/> |
Among the methods of specifying the encoding of an HTTP request, the one most compliant with the HTTP specification is to specify a character set in the charset attribute of the Content-Type header of the request. WebLogic Server, on the receiving side, can then recognize the request encoding at the protocol level. However, major Web browsers such as Microsoft IE and Netscape cannot specify this value. Therefore the HTTP request encoding also needs to be specified on the WebLogic Server side.
The request encoding setting is common to JSP and Servlet, and there are three ways to specify it.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
ServletRequest#setCharacterEncoding method | Each HTTP request | IANA name | YES | 1 | setCharacterEncoding("EUC-JP"); |
input-charset element in weblogic.xml | Within specified URL pattern | Java encoding name and URL pattern | NO | 2 | <charset-params> |
webapp.encoding.default parameter of application-param element in weblogic-application.xml | Entire Enterprise application | IANA name | NO | 3 | <application-param> |
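What goes wrong when none of these settings is made can be reproduced without a server. In the sketch below, request bytes that a browser sent in UTF-8 are decoded with the ISO-8859-1 default. The class and method names are made up, and the re-decoding step is shown only to explain the symptom. Setting the request encoding as described in the table is the intended approach.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RequestDecodingSketch {

    // Decodes bytes from the wire with whatever charset the server assumes.
    static String decodeAs(byte[] wire, Charset assumed) {
        return new String(wire, assumed);
    }

    // Recovers text that was decoded with the wrong charset. This works only
    // when the wrong charset maps every byte to a character, as ISO-8859-1 does.
    static String redecode(String garbled, Charset wronglyAssumed, Charset actual) {
        return new String(garbled.getBytes(wronglyAssumed), actual);
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e"; // what the user entered
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8); // what the browser sent

        String garbled = decodeAs(wire, StandardCharsets.ISO_8859_1);
        System.out.println("characters after wrong decoding: " + garbled.length());

        String repaired = redecode(garbled, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("repaired equals original: " + typed.equals(repaired));
    }
}
```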
Web components other than servlets are files that the Web container must read with an appropriate encoding at run time. For example, the JSP compiler reads a JSP file using some encoding when it translates the file into servlet Java code. The file encoding of these components must therefore be set correctly.
There are four ways to specify the file encoding of JSP files.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
page-encoding element in web.xml | Within specified URL pattern | IANA name and URL pattern | YES | 1 | <jsp-config> |
pageEncoding attribute of Page directive | Each file | IANA name | YES | 1 (Note 1) | <%@ page pageEncoding="Windows-31J" %> |
contentType attribute of Page directive | Each file | MIME type with Charset attribute (IANA name) | YES | 2 | <%@ page contentType="text/html; charset=EUC-JP" %> |
encoding element in weblogic.xml (not recommended) | Entire Web application | Java encoding name | NO | 3 | <jsp-descriptor> |
Note 1: Under the JSP 2.0 specification, if the page-encoding element in web.xml and the pageEncoding attribute of the page directive do not match, an error occurs at translation time. As a result, the two have the same priority.
Since JSP Document is described in XML, how the encoding is specified for JSP Document file will be compliant to XML specification.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
encoding attribute in the XML declaration | Each file | IANA name | YES | 1 | <?xml version='1.0' encoding='utf-8' ?> |
Under the JSP 2.0 specification, if a page-encoding element in web.xml or a pageEncoding attribute of the page directive sets a file encoding for a JSP Document, and that encoding does not match the encoding attribute of the JSP Document's XML declaration, an error occurs at translation time.
The file encoding of Tag Files can be specified as follows.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
pageEncoding attribute of tag directive | Each file | IANA name | YES | 1 | <%@ tag pageEncoding="Windows-31J" %> |
The file encoding of XML format Tag Files can be specified as follows.
Setting Location | Effective Area | Setting Value | J2EE Compliant | Priority | Setting Example |
---|---|---|---|---|---|
encoding attribute in the XML declaration | Each file | IANA name | YES | 1 | <?xml version='1.0' encoding='utf-8' ?> |
Due to JSP2.0 specification, when a file encoding setting by pageEncoding attributes of tag directive is applied to XML format Tag files, an error will occur at compiling.
Due to JSP2.0 specification, if the same element of page directive appears twice or more, and if these are different, then an error will occur at the time of translation. This happens, for example, when there are two or more contentType with different encodings specified in a single file.
The static include for JSP is described as follows:
<%@ include file="relativeURL" %>
In this case, all files to be included are read and formed into one file first, and then compiling of JSP is performed. Therefore when encoding settings in page directives are done both for JSP to include and JSP to be included, and if these are different, an error will occur at the time of translation as was earlier described in Parse Method of JSP.
The dynamic include for JSP is described as follows:
<jsp:include page="{ relativeURL | <%= expression %>}" flush="true" />
For jsp:include, the include operation will not happen when this page is loaded, and the tag will remain. The page will be included when the JSP is executed. Therefore, the encoding set in the JSP that does the including will not apply to the included file(s). Hence, you must also specify the encoding in the included file.
When you specify the encoding using the setContentType() method or the contentType directive in the page tag, use an IANA character set name. However, when you handle the encodings in WebLogic Server, which is a Java application, these values must be Java encoding names. WebLogic Server has the default mappings internally and normally uses them. The default mappings also include mappings which are not defined in IANA, but are conventionally used in the Content-Type for HTML (see Predefined MIME-Java Encoding Mapping Table in WebLogic Server).
Example: x-sjis ----> Shift_JIS
You can change this mapping. This can be set in weblogic.xml as follows:
For example, 'Shift_JIS' setting in the contentType is handled as SJIS in WebLogic Server, because the IANA character set 'Shift_JIS' is mapped to the Java encoding 'Shift_JIS' (Shift_JIS is used as an alias for SJIS in JDK1.4).
Note: In Java1.3, Shift_JIS of IANA character set used to be handled as MS932 (in JDK1.1.8 and thereafter, up to JDK1.4.0; Shift_JIS was changed back to SJIS in JDK1.4.1 and thereafter.)
Consequently, MS932-specific characters cannot be used when Shift_JIS is used with the default setting.
To assign an encoding other than the default mapping, override the default mapping by specifying a <charset-mapping> element in weblogic.xml.
In the example below, Shift_JIS is mapped to MS932.
<charset-params>
<charset-mapping>
<iana-charset-name>Shift_JIS</iana-charset-name>
<java-charset-name>MS932</java-charset-name>
</charset-mapping>
</charset-params>
Note: This setting is valid only for HTTP responses. For example, it has no effect on file encoding (page encoding) such as that of JSP files.
When migrating a CGI service which uses multibyte characters to a CGI servlet on the WebLogic Server, you must specify the appropriate ContentType charset parameter in the HTTP header generated by the CGI program. If the ContentType is not set, ISO-8859-1 is used, this being the default encoding for the J2EE Servlet container. You must also use the input-charset parameter in weblogic.xml in order to receive input strings from a client correctly. You need to write it in the DD file of the target Web application. If it does not exist, ISO-8859-1 is used.
To specify the input encoding for Form-Based authentication inside the form, specify the encoding name in j_character_encoding. Note that this function is proprietary to WebLogic Server.
<form method="POST" action="j_security_check">
Username: <input type="text" name="j_username">
Password: <input type="password" name="j_password">
<input type="hidden" name="j_character_encoding" value="Shift_JIS">
<input type="submit" value="Login">
<input type="reset" value="Reset">
</form>
If the following type of HTTP request is received,
http://myHostName:port/myContextPath/myRequest/?myRequestParameter
and nothing is set, WebLogic Server handles the myContextPath portion and the myRequest portion as follows:
For example, if the User Agent (web browser) is MS IE (Microsoft Internet Explorer), by default, the multibyte characters entered in the address bar are first encoded to UTF-8, and it is then URL encoded. WebLogic Server, with default settings, correctly creates the URL string from this UTF-8 encoded URL.
Note: In IE's Internet Options - Advanced, there is an option called Always send URLs as UTF-8 (requires restart), and this option must be ON (checked).
Remember that myRequestParameter portion is decoded in line with Specifying the Encoding for a Request. For myHostName portion, IESG is standardizing it as an international domain name.
In the case where proprietary User Agent is used and multibyte is necessary in the request URL, first make the character string a UTF-8 byte string, then URL encode and send it to the WebLogic Server. It is recommended by the W3C that the URL be encoded with UTF-8 base when creating the URI. (http://www.w3.org/TR/charmod/#sec-URIs)
Some User Agents do not perform URL encoding of request URL into UTF-8. With Netscape browser, the characters in address bar are encoded first by the character set of the environment where Netscape browser operates, and then the character string is URL encoded to be sent to WebLogic Server. For example, the Netscape browser on Japanese Windows will URL encode the request URL into Windows-31J. To handle this situation, the setting must be made so that the byte stream that is URL decoded in WebLogic Server is converted to String in Windows-31J. Through the following WebLogic Server startup option, encoding which is used for URL decoding can be changed.
-Dweblogic.http.URIDecodeEncoding=Windows-31J
Note that only one such setting is allowed per server instance.
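The two-step procedure described above (convert the string to bytes, then URL encode the bytes) and its reverse on the server side can be tried with the JDK's URLEncoder and URLDecoder. This is a minimal sketch with hypothetical class and method names. It does not call any WebLogic Server API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlEncodingSketch {

    // Client side: characters -> bytes in the given charset -> percent-escapes.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Server side: percent-escapes -> bytes -> characters in the given charset.
    static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String segment = "\u65e5\u672c"; // two Japanese characters
        String escaped = encode(segment, "UTF-8");
        System.out.println(escaped);

        // Decoding with the same charset restores the text; decoding with a
        // different one yields a different, garbled string.
        System.out.println(segment.equals(decode(escaped, "UTF-8")));
        System.out.println(segment.equals(decode(escaped, "ISO-8859-1")));
    }
}
```

The decoding charset on the server has to match what the User Agent used, which is the role of the startup option shown above.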
To send a message containing multibyte characters in JMS transfer, the message must be sent as BytesMessage.
To do this, obtain a port and set the message type to BytesMessage using any of the following methods.
No special settings are required on the receiver because it automatically selects the same message type as that of the sent message.
Web services of WebLogic Server 9.0 implement Enterprise Web Services 1.1 specification (JSR-921). In JSR-921, SOAP1.1 is adopted. HTTP/SOAP messages based on the SOAP1.1 specification have text/xml media type, and the encoding for these messages is handled according to RFC2376. Hence the encoding operations of receiving SOAP messages in Web services of WebLogic Server 9.0 are as follows:
SOAP1.1:
Make sure that the ContentType charset is specified correctly for the client which calls the developed Web service(s) using HTTP/SOAP.
For WebLogic Server, HTTP/SOAP messages are generated with UTF-8. In this process, UTF-8 is added as charset attribute of ContentType header for the SOAP message.
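A client or test harness can check what charset a Content-Type header actually declares with a small helper like the one below. This is a sketch with made-up names, not WebLogic Server's parser. The us-ascii fallback used in the example follows the RFC 2376 default for text/xml without a charset parameter. Whether WebLogic Server applies exactly that fallback is an assumption here.

```java
import java.util.Locale;

public class ContentTypeCharsetSketch {

    // Extracts the charset parameter from a Content-Type header value,
    // or returns the supplied fallback when no charset is present.
    static String charsetOf(String contentType, String fallback) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // The parameter value may be quoted.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/xml; charset=utf-8", "us-ascii"));
        System.out.println(charsetOf("text/xml", "us-ascii"));
    }
}
```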
UDDI explorer only supports us-ascii characters. Multibyte characters cannot be processed correctly.
Use the ElementFactory class' createStartDocument() as shown below in order to add encoding information to the XML header generated using the Streaming API for XML (StAX).
XMLOutputStreamFactory factory = XMLOutputStreamFactory.newInstance();
XMLOutputStream output = factory.newOutputStream(new
OutputStreamWriter(new FileOutputStream(fname),"Shift_JIS"));
output.add(ElementFactory.createStartDocument("Shift_JIS","1.0"));
output.flush();
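For comparison, the same idea can be expressed with the standard javax.xml.stream API that ships with the JDK, rather than the WebLogic classes used above. This is a minimal sketch under that substitution. The class name is hypothetical, and UTF-8 is used so that the example runs on any JDK.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StaxDeclarationSketch {

    // Writes a one-element document whose XML declaration names the same
    // encoding that the underlying stream writer uses.
    static String write(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(sink, "UTF-8");
            writer.writeStartDocument("UTF-8", "1.0"); // emits the XML declaration
            writer.writeStartElement("doc");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.flush();
            writer.close();
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write("\u65e5\u672c\u8a9e"));
    }
}
```

As in the example above, the encoding named in the declaration and the encoding of the stream are kept identical. A mismatch would produce a file that declares one encoding and contains another.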
The following are notes on parsing an XML document containing multibyte characters using StAX. The main points are the same as in the notes on using the Xerces parser.
Oracle database has a map between Unicode and code point on the database, for each character set. This map is used when characters are stored in a database or retrieved from the database. For example, when using Oracle Thin driver, the Oracle database server side will use the map to perform the conversion between Unicode and code point on the database.
In the WebLogic Type4 Driver for Oracle, a property called codePageOverride is provided to perform this conversion using JDK converter map. Possible values for codePageOverride property and the behaviors are in the following table:
Value | Destination database to be assured | Operation |
---|---|---|
SJIS | Character set is JA16SJIS, JA16SJISTILDE or JA16SJISYEN | Assures the conversion by the map that matches the converter for SJIS of JDK, among all the maps that can be used for the character set of the destination database. It does not assure the conversion when the map does not match. |
MS932 | Character set is JA16SJIS, JA16SJISTILDE or JA16SJISYEN | Assures the conversion by the map that matches the converter for MS932 of JDK, among all the maps that can be used for the character set of the destination database. It does not assure the conversion when the map does not match. |
UTF8 | All databases | The driver will use UTF-8 for character encoding when communicating with database. Consequently, the handling of characters to be stored in database will be the same as Oracle Thin driver. |
The difference between codePageOverride=SJIS and codePageOverride=MS932 is exactly the difference between the SJIS converter and the MS932 converter. For example, it affects the handling of symbols such as ~ (Wave Dash) and ¢ (Cent Sign), which the two converters map to different Unicode characters. Choose the setting that meets the requirements of the system you are building. See Countermeasure for Garbled Characters Caused by Unicode Definition and Java Converter (To be noted only when Japanese language is used), etc.
Note: codePageOverride=UTF8 can be used from WebLogic Server 9.1 or later.
In WebLogic Server 9.0 or thereafter, when codePageOverride property is omitted, the handling of the characters to be stored in database is the same as Oracle Thin Driver provided that the character set of destination database is any one of JA16SJIS, JA16SJISTILDE or JA16SJISYEN. See About codePageOverride Property of BEA WebLogic Type4 JDBC Driver for Oracle for the changed contents and some notes on version upgrade from the earlier versions.
If you are using jDriver for Oracle, for a database with JA16SJIS character set, and if you encounter garbled ~ (Wave Dash) after migrating to WebLogic Type4 Oracle Driver, you will be able to solve this problem by changing the database to JA16SJISTILDE or by specifying codePageOverride=MS932.
If the conversion from the platform native encoding to Unicode and the conversion from Unicode back to the native encoding do not use the same converter, characters may be handled incorrectly. This section explains, with examples, a case in which the MS932 converter and the SJIS converter map the same characters to different Unicode code points.
Assume for example the application below where the data stored in a database is displayed by JSP deployed in WebLogic Server.
Database -------------> WebLogic Server -------------> Web browser
(Native) MS932 (Unicode) SJIS (Native)
It is a rather simple application, but as shown in Encoding Conversion, via WebLogic Server, the conversion between platform native encoding and Unicode encoding is performed at least twice. In this example, MS932 converter is used between database and WebLogic Server, and SJIS converter is used between WebLogic Server and Web browser. In this case, following codes cannot be handled correctly, giving some problems such as garbled characters.
SJIS Code |
---|
"~" (0x8160) |
"∥" (0x8161) |
"-" (0x817C) |
"¢" (0x8191) |
"£" (0x8192) |
"¬" (0x81CA) |
In order to avoid garbled characters, you need to change encoding conversion between WebLogic Server and Web browser or between database and WebLogic Server, to harmonize the two conversions.
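The mismatch can be observed directly with the JDK converters, using the wave dash (0x8160) as the sample. The sketch below assumes that the installed JDK provides the Shift_JIS and windows-31j (MS932) charsets and returns -1 where a charset is missing. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class WaveDashSketch {

    // The wave dash as stored in a Shift_JIS based database or file.
    static final byte[] WAVE_DASH = {(byte) 0x81, (byte) 0x60};

    // Returns the Unicode code point produced by the named converter,
    // or -1 when this JDK does not provide that charset.
    static int decode(String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            return -1;
        }
        return new String(WAVE_DASH, Charset.forName(charsetName)).codePointAt(0);
    }

    // True when bytes decoded with one converter and encoded with another
    // come back unchanged. Both charsets must be supported.
    static boolean survives(String inbound, String outbound) {
        String unicode = new String(WAVE_DASH, Charset.forName(inbound));
        return Arrays.equals(WAVE_DASH, unicode.getBytes(Charset.forName(outbound)));
    }

    public static void main(String[] args) {
        int viaSjis = decode("Shift_JIS");
        int viaMs932 = decode("windows-31j");
        System.out.println("Shift_JIS   -> U+" + Integer.toHexString(viaSjis).toUpperCase());
        System.out.println("windows-31j -> U+" + Integer.toHexString(viaMs932).toUpperCase());
        if (viaSjis != -1 && viaMs932 != -1) {
            System.out.println("MS932 in, SJIS out survives: "
                    + survives("windows-31j", "Shift_JIS"));
            System.out.println("SJIS in, SJIS out survives: "
                    + survives("Shift_JIS", "Shift_JIS"));
        }
    }
}
```

On JDKs that ship both charsets, the two converters are expected to yield different code points for the same two bytes, and the mixed round trip is expected to fail while the matched one succeeds.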
When for example JSP page tag specifies:
<%@ page contentType="text/html; charset=Shift_JIS" %>
(Shift_JIS here is IANA name) there are following two ways for using MS932 converter between WebLogic Server and Web browser.
a) Rewrite page tag specification from Shift_JIS (IANA name) to Windows-31J (IANA name).
b) Specify the following definition in weblogic.xml and change the default encoding mapping table that WebLogic Server has internally: from Shift_JIS (IANA name) -> SJIS (Java converter name) to Shift_JIS (IANA name) -> MS932 (Java converter name).
<charset-params>
<charset-mapping>
<iana-charset-name>Shift_JIS</iana-charset-name>
<java-charset-name>MS932</java-charset-name>
</charset-mapping>
</charset-params>
For method b), see Mapping Change for Java Encoding and IANA Character Set Involving HTTP Responses (Not J2EE-Compliant). This is useful when method a) is not practical, for example because too many files would have to be modified.
When using BEA WebLogic Type4 Oracle Driver, you can change the encoding conversion between database and WebLogic Server by using codePageOverride property.
Java's MS932 encoding table supports conversion of external characters (gaiji). By using MS932, you can provide content using iMode external characters.
The domain encoding for WTC can be specified for TUXEDO domains. Specify the following parameter at startup. The WebLogic Server start scripts (such as the StartWebLogic.cmd file) need to be changed.
-Dweblogic.wtc.encoding=Java encoding name
The encoding specified by this is effective for the entire TUXEDO domain.