This chapter contains information useful for SQL programming in a globalization support environment. This chapter includes the following topics:
NLS parameters can be specified for all SQL functions whose behavior depends on globalization support conventions. These functions are:
Explicitly specifying the optional NLS parameters for these functions enables the functions to be evaluated independently of the session's NLS parameters. This feature can be important for SQL statements that contain numbers and dates as string literals.
For example, the following query is evaluated correctly if the language specified for dates is
SELECT last_name FROM employees WHERE hire_date > '01-JAN-2005';
Such a query can be made independent of the current date language by using a statement similar to the following:
SELECT last_name FROM employees WHERE hire_date > TO_DATE('01-JAN-2005','DD-MON-YYYY', 'NLS_DATE_LANGUAGE = AMERICAN');
In this way, SQL statements that are independent of the session language can be defined where necessary. Such statements are necessary when string literals appear in SQL statements in views,
CHECK constraints, or triggers.
Only SQL statements that must be independent of the session NLS parameter values should explicitly specify optional NLS parameters in locale-dependent SQL functions. Using session default values for NLS parameters in SQL functions usually results in better performance.
All character functions support both single-byte and multibyte characters. Except where explicitly stated, character functions operate character by character, rather than byte by byte.
The rest of this section includes the following topics:
When SQL functions evaluate views and triggers, default values from the current session are used for the NLS function parameters. When SQL functions evaluate
CHECK constraints, they use the default values that were specified for the NLS parameters when the database was created.
NLS parameters are specified in SQL functions as follows:
'parameter = value'
'NLS_DATE_LANGUAGE = AMERICAN'
The following NLS parameters can be specified in SQL functions:
Table 9-1 shows which NLS parameters are valid for specific SQL functions.
Table 9-1 SQL Functions and Their Valid NLS Parameters
|SQL Function||Valid NLS Parameters|
The following examples show how to use NLS parameters in SQL functions:
TO_DATE ('1-JAN-99', 'DD-MON-YY', 'nls_date_language = American') TO_CHAR (hire_date, 'DD/MON/YYYY', 'nls_date_language = French') TO_CHAR (SYSDATE, 'DD/MON/YYYY', 'nls_date_language=''Traditional Chinese'' ') TO_NUMBER ('13.000,00', '99G999D99', 'nls_numeric_characters = '',.''') TO_CHAR (salary, '9G999D99L', 'nls_numeric_characters = '',.'' nls_currency = ''EUR''') TO_CHAR (salary, '9G999D99C', 'nls_numeric_characters = ''.,'' nls_iso_currency = Japan') NLS_UPPER (last_name, 'nls_sort = Swiss') NLSSORT (last_name, 'nls_sort = German')
In some languages, some lowercase characters correspond to more than one uppercase character or vice versa. As a result, the length of the output from the
NLS_INITCAP functions can differ from the length of the input.
The following NLS parameters are not accepted in SQL functions except for
NLS_TERRITORY_FORMAT are not accepted as parameters because they can interfere with required format masks. A date format must always be specified if an NLS parameter is in a
TO_DATE function. As a result,
NLS_TERRITORY_FORMAT are not valid NLS parameters for the
TO_DATE functions. If you specify
NLS_TERRITORY_FORMAT in the
TO_DATE function, then an error is returned.
NLS_LANGUAGE can interfere with the session value of
NLS_DATE_LANGUAGE. If you specify
NLS_LANGUAGE in the
TO_CHAR function, for example, then its value is ignored if it differs from the session value of
This section includes the following topics:
CONVERT function converts the binary representation of a character string in one character set to another. It uses exactly the same technique as conversion between database and client character sets. Hence, it uses replacement characters and has the same limitations.
The syntax for
CONVERT is as follows:
CONVERT(char, dest_char_set[, source_char_set])
char is the value to be converted.
source_char_set is the source character set and
dest_char_set is the destination character set. If the
source_char_set parameter is not specified, then it defaults to the database character set.
Oracle provides SQL functions that work in accordance with different length semantics. There are three groups of such SQL functions:
INSTR. Each function in a group is based on a different kind of length semantics and is distinguished by the character or number appended to the function name. For example,
SUBSTRB is based on byte semantics.
SUBSTR functions return a requested portion of a substring. The
LENGTH functions return the length of a string. The
INSTR functions search for a substring in a string.
SUBSTR functions calculate the length of a string differently. Table 9-2 summarizes the calculation methods.
Table 9-2 How the SUBSTR Functions Calculate the Length of a String
Calculates the length of a string in characters based on the length semantics associated with the character set of the data type. For example, AL32UTF8 characters are calculated in UCS-4 characters. UTF8 and AL16UTF16 characters are calculated in UCS-2 characters. A supplementary character is counted as one character in AL32UTF8 and as two characters in UTF8 and AL16UTF16. Because
Calculates the length of a string in bytes.
Calculates the length of a string in UCS-2 characters, which is compliant with Java strings and Windows client environments. Characters are represented in UCS-2 or 16-bit Unicode values. Supplementary characters are counted as two characters.
Calculates the length of a string in UCS-4 characters. Characters are represented in UCS-4 or 32-bit Unicode values. Supplementary characters are counted as one character.
Calculates the length of a string in Unicode composed characters. Supplementary characters and composed characters are counted as one character.
INSTR functions calculate string length in the same way, according to the character or number added to the function name.
The following examples demonstrate the differences between
SUBSTRB on a database whose character set is AL32UTF8.
For the string
Fußball, the following statement returns a substring that is 4 characters long, beginning with the second character:
SELECT SUBSTR ('Fußball', 2 , 4) SUBSTR FROM DUAL; SUBS ---- ußba
For the string
Fußball, the following statement returns a substring 4 bytes long, beginning with the second byte:
SELECT SUBSTRB ('Fußball', 2 , 4) SUBSTRB FROM DUAL; SUB --- ußb
Oracle Database SQL Language Reference for more information about the
LIKE conditions specify a test that uses pattern-matching. The equality operator (=) exactly matches one character value to another, but the
LIKE conditions match a portion of one character value to another by searching the first value for the pattern specified by the second.
LIKE calculates the length of strings in characters using the length semantics associated with the input character set. The
LIKEC conditions are summarized in Table 9-3.
Table 9-3 LIKE Conditions
Use when characters are represented in UCS-2 semantics. A supplementary character is considered as two characters.
Use when characters are represented in UCS-4 semantics. A supplementary character is considered as one character.
Use when characters are represented in Unicode complete character semantics. A composed character is treated as one character.
There is no
Two SQL functions,
NLS_CHARSET_ID, can convert between character set ID numbers and character set names. They are used by programs that need to determine character set ID numbers for binding variables through OCI.
Another SQL function,
NLS_CHARSET_DECL_LEN, returns the declaration length of a column in number of characters, given the byte length of the column.
This section includes the following topics:
n) function returns the name of the character set corresponding to ID number
n. The function returns
n is not a recognized character set ID value.
text) returns the character set ID corresponding to the name specified by
text is defined as a run-time
VARCHAR2 quantity, a character set name. Values for
text can be
NLSRTL names that resolve to character sets that are not the database character set or the national character set.
If the value
CHAR_CS is entered for
text, then the function returns the ID of the database character set. If the value
NCHAR_CS is entered for
text, then the function returns the ID of the database's national character set. The function returns
text is not a recognized name.
The value for
text must be entered in uppercase characters.
NLSSORT function enables you to use any linguistic sort for an
ORDER BY clause. It replaces a character string with the equivalent sort string used by the linguistic sort mechanism so that sorting the replacement strings produces the desired sorting sequence. For a binary sort, the sort string is the same as the input string.
The kind of linguistic sort used by an
BY clause is determined by the
NLS_SORT session parameter, but it can be overridden by explicitly using the
The following example specifies a German sort with the
NLS_SORT session parameter.
ALTER SESSION SET NLS_SORT = GERMAN; SELECT * FROM table1 ORDER BY column1;
The following example first sets the
NLS_SORT session parameter to German, but the
NLSSORT function overrides it by specifying a French sort.
ALTER SESSION SET NLS_SORT = GERMAN; SELECT * FROM table1 ORDER BY NLSSORT(column1, 'NLS_SORT=FRENCH');
WHERE clause uses binary comparison when
NLS_COMP is set to
BINARY, but this can be overridden by using the
NLSSORT function in the
The following example makes a linguistic comparison using the
ALTER SESSION SET NLS_COMP = BINARY; SELECT * FROM table1 WHERE NLSSORT(column1, 'NLS_SORT=FRENCH')> NLSSORT(column2, 'NLS_SORT=FRENCH');
NLS_COMP session parameter to
LINGUISTIC causes the
NLS_SORT value to be used in the
The rest of this section contains the following topics:
There are four ways to use
NLSSORT(), which relies on the
NLS_LANG parameter of the
NLSSORT function is not the same as the
NLS_LANG client environment setting. In the
NLS_LANG specifies the abbreviated language name, such as
US for American or
PL for Polish. For example:
SELECT * FROM table1 ORDER BY NLSSORT(column1, 'NLS_LANG=PL');
NLSSORT enables applications to perform string matching that follows alphabetic conventions. Normally, character strings in a
WHERE clause are compared by using the binary values of the characters. One character is considered greater than another character if it has a greater binary value in the database character set. Because the sequence of characters based on their binary values might not match the alphabetic sequence for a language, such comparisons may not follow alphabetic conventions. For example, if a column (
column1) contains the values ABC, ABZ, BCD, and ÄBC in the ISO 8859-1 8-bit character set, then the following query returns both
Ä has a higher numeric value than
SELECT column1 FROM table1 WHERE column1 > 'B';
Ä is sorted alphabetically before
B, but in Swedish,
Ä is sorted after
Z. Linguistic comparisons can be made by using
NLSSORT in the
WHERE NLSSORT(col) comparison_operator NLSSORT(comparison_string)
NLSSORT must be on both sides of the comparison operator. For example:
SELECT column1 FROM table1 WHERE NLSSORT(column1) > NLSSORT('B');
If a German linguistic sort has been set, then the statement does not return strings beginning with
Ä comes before
B in the German alphabet. If a Swedish linguistic sort has been set, then strings beginning with
Ä are returned because
Ä comes after
Z in the Swedish alphabet.
Comparison in the
WHERE clause or PL/SQL blocks is binary by default. Using the
NLSSORT function for linguistic comparison can be tedious, especially when the linguistic sort has already been specified in the
NLS_SORT session parameter. You can use the
NLS_COMP parameter to indicate that the comparisons in a
WHERE clause or in PL/SQL blocks must be linguistic according to the
NLS_SORT session parameter.
If a linguistic sort is in use, then
ORDER BY clauses use an implicit
NLSSORT on character data. The sort mechanism (linguistic or binary) for an
BY clause is transparent to the application. However, if the
NLSSORT function is explicitly specified in an
BY clause, then the implicit
NLSSORT is not done.
If a linguistic sort has been defined by the
NLS_SORT session parameter, then an
ORDER BY clause in an application uses an implicit
NLSSORT function. If you specify an explicit
NLSSORT function, then it overrides the implicit
When the sort mechanism has been defined as linguistic, the
NLSSORT function is usually unnecessary in an
When the sort mechanism either defaults or is defined as binary, then a query like the following uses a binary sort:
SELECT last_name FROM employees ORDER BY last_name;
A German linguistic sort can be obtained as follows:
SELECT last_name FROM employees ORDER BY NLSSORT(last_name, 'NLS_SORT = GERMAN');
This section contains the following topics:
RM (Roman Month) format element returns a month as a Roman numeral. You can specify either upper case or lower case by using
rm. For example, for the date 7 Sep 2007,
Note that the
DY format masks explicitly support month and day abbreviations that may not be three characters in length. For example, the abbreviations "Lu" and "Ma" can be specified for the French "Lundi" and "Mardi", respectively.
To support the ISO standard, the
IW format element is provided. It returns the ISO week number. In addition, the
IYYY format elements, equivalent in behavior to the
YYYY format elements, return the year relating to the ISO week number.
In the ISO standard, the year relating to an ISO week number can be different from the calendar year. For example, 1st Jan 1988 is in ISO week number 53 of 1987. A week always starts on a Monday and ends on a Sunday. The week number is determined according the following rules:
If January 1 falls on a Friday, Saturday, or Sunday, then the week including January 1 is the last week of the previous year, because most of the days in the week belong to the previous year.
If January 1 falls on a Monday, Tuesday, Wednesday, or Thursday, then the week is the first week of the new year, because most of the days in the week belong to the new year.
For example, January 1, 1991, is a Tuesday, so Monday, December 31, 1990, to Sunday, January 6, 1991, is in week 1. Thus, the ISO week number and year for December 31, 1990, is 1, 1991. To get the ISO week number, use the
IW format mask for the week number and one of the
IY formats for the year.
For Roman numerals, you can specify either upper case or lower case, using
rn, respectively. The number being converted must be an integer in the range 1 to 3999.
DBMS_LOB PL/SQL package can load external
BFILE data into
LOB columns. Oracle Database performs character set conversion before loading the binary data into
NCLOB columns. Thus, the
BFILE data does not need to be in the same character set as the database or national character set to work properly. The APIs convert the data from the specified
BFILE character set into the database character set for the
CLOB data type, or the national character set for the
NCLOB data type. The loading takes place on the server because
BFILE data is not supported on the client.
DBMS_LOB.LOADBLOBFROMFILE to load
DBMS_LOB.LOADCLOBFROMFILE to load