Go to main content
Oracle® Developer Studio 12.6: C User's Guide

Exit Print View

Updated: July 2017

7.8 Internationalization

Multibyte Characters and Wide Characters introduced the internationalization of the standard libraries. This section discusses the affected library functions and provides some guidelines on how programs should be written to take advantage of these features. The section discusses internationalization only with respect to the 1990 ISO/IEC C standard. The 1999 ISO/IEC C standard has no significant extension to support internationalization besides those discussed here.

7.8.1 Locales

At any time, a C program has a current locale: a collection of information that describes the conventions appropriate to some nationality, culture, and language. Locales have names that are strings. The only two standardized locale names are "C" and "". Each program begins in the "C" locale, which causes all library functions to behave just as they have historically. The "" locale is the implementation’s best guess at the correct set of conventions appropriate to the program’s invocation. "C" and "" can cause identical behavior. Other locales may be provided by implementations.

For the purposes of practicality and expediency, locales are partitioned into a set of categories. A program can change the complete locale, or just one or more categories. Generally, each category affects a set of functions disjoint from the functions affected by other categories, so temporarily changing one category for a limited duration can make sense.

7.8.2 setlocale() Function

The setlocale() function is the interface to the program’s locale. Any program that uses the invocation country’s conventions should place a call such as the following example early in the program’s execution path.

#include <locale.h>
setlocale(LC_ALL, "");

This call causes the program’s current locale to change to the appropriate local version, because LC_ALL is the macro that specifies the entire locale instead of one category. The standard categories are:


Sorting information


Character classification information


Currency printing information


Numeric printing information


Date and time printing information

Any of these macros can be passed as the first argument to setlocale() to specify that category.

The setlocale() function returns the name of the current locale for a given category (or LC_ALL) and serves in an inquiry-only capacity when its second argument is a null pointer. Thus, code similar to the following example can be used to change the locale or a portion thereof for a limited duration:

#include <locale.h>
char *oloc;
oloc = setlocale(LC_category, NULL);
if (setlocale(LC_category, "new") != 0)
        /* use temporarily changed locale */
    (void)setlocale(LC_category, oloc);

Most programs do not need this capability.

7.8.3 Changed Functions

Wherever possible and appropriate, existing library functions were extended to include locale-dependent behavior. These functions came in two groups:

  • Those declared by the ctype.h header (character classification and conversion), and

  • Those that convert to and from printable and internal forms of numeric values, such as printf() and strtod().

All ctype.h predicate functions, except isdigit() and isxdigit(), can return nonzero (true) for additional characters when the LC_CTYPE category of the current locale is other than "C". In a Spanish locale, isalpha(’ñ’) should be true. Similarly, the character conversion functions, tolower() and toupper(), should appropriately handle any extra alphabetic characters identified by the isalpha() function. The ctype.h functions are almost always macros that are implemented using table lookups indexed by the character argument. Their behavior is changed by resetting the tables to the new locale’s values, and therefore have no performance impact.

Those functions that write or interpret printable floating values can change to use a decimal-point character other than period (.) when the LC_NUMERIC category of the current locale is other than "C". There is no provision for converting any numeric values to printable form with thousands separator-type characters. When converting from a printable form to an internal form, implementations are allowed to accept such additional forms, again in other than the "C" locale. Those functions that make use of the decimal-point character are the printf() and scanf() families, atof(), and strtod(). Those functions that are allowed implementation-defined extensions are atof(), atoi(), atol(), strtod(), strtol(), strtoul(), and the scanf() family.

7.8.4 New Functions

Certain locale-dependent capabilities were added as new standard functions. Besides setlocale(), which allows control over the locale itself, the standard includes the following new functions:


Numeric/monetary conventions


Collation order of two strings


Translate string for collation


Format date and time

In addition, there are the multibyte functions mblen(), mbtowc(), mbstowcs(), wctomb(), and wcstombs().

The localeconv() function returns a pointer to a structure containing information useful for formatting numeric and monetary information appropriate to the current locale’s LC_NUMERIC and LC_MONETARY categories. This is the only function whose behavior depends on more than one category. For numeric values, the structure describes the decimal-point character, the thousands separator, and where the separators should be located. Fifteen other structure members describe how to format a monetary value.

The strcoll() function is analogous to the strcmp() function except that the two strings are compared according to the LC_COLLATE category of the current locale. The strxfrm() function can also be used to transform a string into another, such that any two such after-translation strings can be passed to strcmp() and result in an ordering analogous to what strcoll() would have returned if passed the two pre-translation strings.

The strftime() function provides formatting similar to that used with sprintf() of the values in a struct tm, along with some date and time representations that depend on the LC_TIME category of the current locale. This function is based on the ascftime() function released as part of UNIX System V Release 3.2 .