Oracle8i JDBC Developer's Guide and Reference
Release 3 (8.1.7)

Part Number A83724-01


JDBC and NLS

After a brief overview, this section covers the following topics:

Oracle's JDBC drivers support NLS (National Language Support). NLS lets you retrieve data or insert data into a database in any character set that Oracle supports. If the clients and the server use different character sets, then the driver provides the support to perform the conversions between the database character set and the client character set.

For more information on NLS, NLS environment variables, and the character sets that Oracle supports, see the Oracle8i National Language Support Guide. See the Oracle8i Reference for more information on the database character set and how it is created.

Here are a few examples of commonly used Java methods for JDBC that rely heavily on NLS character set conversion:

How JDBC Drivers Perform NLS Conversions

The techniques that the Oracle JDBC drivers use to perform character set conversion for Java applications depend on the character set the database uses. The simplest case is where the database uses the US7ASCII or WE8ISO8859P1 character set. In this case, the driver converts the data directly from the database character set to UCS-2, which is used in Java applications, and vice versa.

If the database uses a character set other than US7ASCII or WE8ISO8859P1 (for example, a Japanese or Korean character set), then the driver first converts the data to UTF-8 and then to UCS-2. The UTF-8 step does not apply to the server-side internal driver. For example, the driver always converts CHAR and VARCHAR2 data stored in such a character set. It does not convert RAW data.
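The two conversion paths described above can be sketched with the standard Java charset classes. This is only an illustration of the idea, not the driver's internal code: the class and method names are hypothetical, and the sketch assumes that the bytes reaching Java are already either in the database character set (direct path) or in UTF-8 (two-step path).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConversionPathSketch {

    /** True when, per the text above, data passes through UTF-8 first. */
    static boolean usesUtf8Step(String oracleCharsetName) {
        return !(oracleCharsetName.equals("US7ASCII")
              || oracleCharsetName.equals("WE8ISO8859P1"));
    }

    /** Decodes incoming bytes into a Java String for the given database character set. */
    static String decode(byte[] incomingBytes, String oracleCharsetName) {
        Charset source;
        if (usesUtf8Step(oracleCharsetName)) {
            source = StandardCharsets.UTF_8;       // two-step path: bytes arrive as UTF-8
        } else if (oracleCharsetName.equals("US7ASCII")) {
            source = StandardCharsets.US_ASCII;    // direct path
        } else {
            source = StandardCharsets.ISO_8859_1;  // direct path (WE8ISO8859P1)
        }
        return new String(incomingBytes, source);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 };  // e-acute as a single ISO 8859-1 byte
        byte[] utf8 = "\u65E5\u672C".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(latin1, "WE8ISO8859P1").length()); // one Java char
        System.out.println(decode(utf8, "JA16SJIS").length());       // two Java chars
    }
}
```

In an application, none of this is written by hand. The sketch only makes the difference between the direct path and the UTF-8 path visible.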


Note:

The JDBC drivers perform all character set conversions transparently. No user intervention is necessary for the conversions to occur.  


JDBC OCI Driver and NLS

If you are using the JDBC OCI driver, then NLS is handled as in any other Oracle client situation. The client character set, language, and territory settings are in the NLS_LANG environment variable, which is set at client-installation time.
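To see which settings an OCI client would pick up, a Java program can read NLS_LANG from its environment. The following sketch assumes the usual LANGUAGE_TERRITORY.CHARSET layout of the variable (for example, JAPANESE_JAPAN.JA16SJIS). The class and method names are hypothetical, and missing components are returned as empty strings.

```java
public class NlsLangSketch {

    /**
     * Splits an NLS_LANG value of the assumed form LANGUAGE_TERRITORY.CHARSET
     * into {language, territory, charset}. Missing parts become "".
     */
    static String[] parse(String nlsLang) {
        String language = "";
        String territory = "";
        String charset = "";
        String rest = (nlsLang == null) ? "" : nlsLang.trim();

        int dot = rest.indexOf('.');
        if (dot >= 0) {                       // everything after '.' is the character set
            charset = rest.substring(dot + 1);
            rest = rest.substring(0, dot);
        }
        int underscore = rest.indexOf('_');
        if (underscore >= 0) {                // LANGUAGE_TERRITORY
            language = rest.substring(0, underscore);
            territory = rest.substring(underscore + 1);
        } else {
            language = rest;
        }
        return new String[] { language, territory, charset };
    }

    public static void main(String[] args) {
        String[] parts = parse(System.getenv("NLS_LANG"));
        System.out.println("language  = " + parts[0]);
        System.out.println("territory = " + parts[1]);
        System.out.println("charset   = " + parts[2]);
    }
}
```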

Note that there are also server-side settings for these parameters, determined during database creation. So, when performing character set conversion, the JDBC OCI driver has to take three factors into consideration:

The JDBC OCI driver transfers the data from the server to the client in the character set of the database. Depending on the value of the NLS_LANG environment variable, the driver handles character set conversions in one of two ways:

or:

JDBC Thin Driver and NLS

If you are using the JDBC Thin driver, then there is typically no Oracle client installation, so the NLS_LANG environment variable is not available. NLS conversions must therefore be handled differently.

Language and Territory

The Thin driver obtains language and territory settings (NLS_LANGUAGE and NLS_TERRITORY) from the Java locale in the JVM user.language property. The date format (NLS_DATE_FORMAT) is set according to the territory setting.
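Because the Thin driver takes these settings from the Java locale, it can help to check what the JVM reports before connecting. The sketch below only prints the standard Java values. How the driver maps them to NLS_LANGUAGE and NLS_TERRITORY names is internal to the driver and is not shown; the class name is hypothetical.

```java
import java.util.Locale;

public class LocaleSketch {

    /** Formats the language and country codes of a Java locale. */
    static String describe(Locale locale) {
        return "language=" + locale.getLanguage() + ", country=" + locale.getCountry();
    }

    public static void main(String[] args) {
        // The JVM property named in the text above.
        System.out.println("user.language = " + System.getProperty("user.language"));
        // The default locale derived from the JVM settings.
        System.out.println(describe(Locale.getDefault()));
    }
}
```

Starting the JVM with a different `-Duser.language` value changes what this program reports.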

Character Set

If the database character set is US7ASCII or WE8ISO8859P1, then the data is transferred to the client without any conversion. The driver then converts the data to UCS-2 in Java.

If the database character set is something other than US7ASCII or WE8ISO8859P1, then the server first translates the data to UTF-8 before transferring it to the client. On the client, the JDBC Thin driver converts the data to UCS-2 in Java.

Server-Side Internal Driver and NLS

If your JDBC code running in the server accesses the database, then the JDBC server-side internal driver performs a character set conversion based on the database character set. The target character set of all Java programs is UCS-2.

NLS Support and Object Types

The Oracle JDBC class files, classes12.zip and classes111.zip, provide NLS support for the Thin and OCI drivers. The files contain all the necessary classes to provide complete NLS support for all Oracle character sets for CHAR, VARCHAR, LONGVARCHAR, and CLOB type data not retrieved or inserted as part of an Oracle object or collection type.

However, in the case of the CHAR and VARCHAR data portion of Oracle objects and collections, the JDBC class files provide support for only these commonly used character sets:

To provide support for all NLS character sets, the Oracle8i JDBC driver installation includes two additional files: nls_charset12.zip for JDK 1.2.x and nls_charset11.zip for JDK 1.1.x. The OCI and Thin drivers require these files to support all Oracle character sets for CHAR and VARCHAR data in Oracle object types and collections. To obtain this support, you must add the appropriate nls_charset*.zip file to your CLASSPATH.

The nls_charset*.zip files are very large, because they must support a large number of character sets. To save space, you can keep only the classes you need from the nls_charset*.zip file. To do this, follow these steps:

  1. Unzip the appropriate nls_charset*.zip file.

  2. Add only the needed character set classes to the CLASSPATH.

  3. Remove the unneeded character set files from your system.

The character set extension class files are named in the following format:

CharacterConverter<OracleCharacterSetId>.class

where <OracleCharacterSetId> is the hexadecimal representation of the Oracle character set ID that corresponds to a character set name.
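As a rough aid for finding the class file that belongs to a character set, the name can be built from the numeric ID. This sketch is an assumption-laden illustration: the text above does not say whether the hexadecimal digits are zero-padded or which letter case they use, so compare the result with the actual file listing of the zip file. The class name, the method name, and the sample ID 832 are hypothetical inputs chosen for the example.

```java
public class ConverterNameSketch {

    /**
     * Builds a converter class file name from an Oracle character set ID.
     * Assumption: plain lowercase hex without padding; the real files may differ.
     */
    static String converterClassFile(int oracleCharsetId) {
        return "CharacterConverter" + Integer.toHexString(oracleCharsetId) + ".class";
    }

    public static void main(String[] args) {
        // 832 is sample input only; look up the real ID for your character set.
        System.out.println(converterClassFile(832));
    }
}
```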


Note:

The preceding discussion does not apply to the server-side internal driver, which provides complete NLS support and does not require the NLS character set classes.  


CHAR and VARCHAR2 Data Size Restrictions with the Thin Driver

If the database character set is neither ASCII (US7ASCII) nor ISO-LATIN-1 (WE8ISO8859P1), then the Thin driver must impose size restrictions for CHAR and VARCHAR2 bind parameters that are more restrictive than normal database size limitations. This is necessary to allow for data expansion during conversion.

The Thin driver checks CHAR or VARCHAR2 bind sizes when the setXXX() method is called. If the data size exceeds the size restriction, then the driver throws a SQL exception (ORA-17070 "Data size bigger than max size for this type") from the setXXX() call. This limitation is necessary to avoid the chance of data corruption whenever an NLS conversion occurs and increases the length of the data. This limitation is enforced when you are doing all the following:

Role of NLS Ratio

As previously discussed, when the database character set is neither US7ASCII nor WE8ISO8859P1, the Thin driver converts Java UCS-2 characters to UTF-8 encoding bytes for CHAR or VARCHAR2 binds. The UTF-8 encoding bytes are then transferred to the database, and the database converts the UTF-8 encoding bytes to the database character set encoding.

This conversion to the character set encoding might result in a size increase. The NLS ratio for a database character set indicates the maximum possible expansion in converting from UTF-8 to the character set:

NLS ratio = (maximum possible value of) [(size in database character set) / (size in UTF-8)]
 

Size Restriction Formulas

Table 18-1 shows the database size limitations for CHAR and VARCHAR2 data, and the Thin driver size restriction formulas for CHAR and VARCHAR2 binds. Database limits are in bytes. Formulas determine the maximum size of the UTF-8 encoding, in bytes.

Table 18-1 Maximum CHAR and VARCHAR2 Bind Sizes, Thin Driver

Oracle Version        Datatype  Max Size Allowed by  Formula for Thin Driver
                                Database (bytes)     Max Bind Size (UTF-8 bytes)
--------------------  --------  -------------------  ---------------------------
Oracle8 and Oracle8i  CHAR      2000                 min(2000, 4000/NLS_ratio)
Oracle8 and Oracle8i  VARCHAR2  4000                 4000/NLS_ratio
Oracle7               CHAR      255                  255
Oracle7               VARCHAR2  2000                 2000/NLS_ratio

The formulas guarantee that after the data is converted from UTF-8 to the database character set, the size will not exceed the database maximum size.
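The formulas in Table 18-1 are simple enough to evaluate in code. The following sketch is a hypothetical helper, not part of the driver, and assumes that the NLS ratio is a whole number of at least 1 and that fractional results are rounded down.

```java
public class ThinBindLimitSketch {

    /** Maximum VARCHAR2 bind size in UTF-8 bytes, per Table 18-1. */
    static int maxVarchar2BindBytes(int nlsRatio, boolean oracle7) {
        checkRatio(nlsRatio);
        return (oracle7 ? 2000 : 4000) / nlsRatio;   // integer division rounds down
    }

    /** Maximum CHAR bind size in UTF-8 bytes, per Table 18-1. */
    static int maxCharBindBytes(int nlsRatio, boolean oracle7) {
        checkRatio(nlsRatio);
        return oracle7 ? 255 : Math.min(2000, 4000 / nlsRatio);
    }

    private static void checkRatio(int nlsRatio) {
        if (nlsRatio < 1) {
            throw new IllegalArgumentException("NLS ratio must be at least 1");
        }
    }

    public static void main(String[] args) {
        for (int ratio = 1; ratio <= 3; ratio++) {
            System.out.println("ratio " + ratio
                + ": VARCHAR2 " + maxVarchar2BindBytes(ratio, false)
                + ", CHAR " + maxCharBindBytes(ratio, false));
        }
    }
}
```

For an NLS ratio of 3 on Oracle8, both methods return 1333, because 4000/3 rounded down is below the 2000-byte CHAR limit.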

The number of UCS-2 characters that can be supported is determined by the number of bytes per character in the data. All ASCII characters are one byte long in UTF-8 encoding. Other character types can be two or three bytes long.
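Because the limits are expressed in UTF-8 bytes rather than characters, an application can measure a string before binding it. The sketch below is one possible pre-check using only standard Java; the class and method names are hypothetical, and the limit passed in is assumed to come from the formulas in Table 18-1.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BindCheckSketch {

    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the UTF-8 encoding of the string fits within the given bind limit. */
    static boolean fitsBind(String value, int maxUtf8Bytes) {
        return utf8Length(value) <= maxUtf8Bytes;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("abc"));            // ASCII: one byte each
        System.out.println(utf8Length("\u00E9"));         // Latin letter with accent: two bytes
        System.out.println(utf8Length("\u65E5\u672C"));   // two kanji: three bytes each
    }
}
```

With a limit of 1333 bytes, for example, 1333 ASCII characters fit, but only 444 three-byte characters do.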

NLS Ratios and Calculated Size Restrictions for Common Character Sets

Table 18-2 lists the NLS ratios of some common server character sets, then shows the Thin driver maximum bind sizes for CHAR and VARCHAR2 data for each character set, as determined by using the NLS ratio in the appropriate formula.

Again, maximum bind sizes are for UTF-8 encoding, in bytes.

Table 18-2 NLS Ratio and Size Limits, Oracle8, Common Character Sets

Server Character Set   NLS Ratio  Thin Driver Max VARCHAR2     Thin Driver Max CHAR
                                  Bind Size (UTF-8 bytes)      Bind Size (UTF-8 bytes)
---------------------  ---------  ---------------------------  -----------------------
WE8DEC                 1          4000                         2000
JA16SJIS               2          2000                         2000
ISO 8859-1 through 10  3          1333                         1333



Oracle
Copyright © 1996-2000, Oracle Corporation.

All Rights Reserved.
