1 Internationalization Enhancements

Recent releases of the JDK include enhancements to the internationalization process to support updated standards.

Internationalization Enhancements in JDK 11

Internationalization enhancements for JDK 11 include:

Unicode 10.0.0

Support has been added for Unicode 10.0.0. Java Platform, Standard Edition (Java SE) 9 and 10 supported Unicode 8.0.

The Unicode 10.0 standard includes 16,018 characters and 10 scripts that were introduced since Unicode 8.0, all of which are supported in Java SE 11.
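One way to confirm that a runtime knows about a Unicode 10.0 character is to ask the Character class about a code point that was first assigned in that version. The sketch below uses U+20BF (BITCOIN SIGN), which was added in Unicode 10.0. The class and method names are hypothetical and chosen only for illustration.

```java
public class Unicode10Check {
    // U+20BF BITCOIN SIGN was first assigned in Unicode 10.0.
    static final int BITCOIN_SIGN = 0x20BF;

    // Returns true if the running JDK's Unicode tables define the code point.
    static boolean isKnownToRuntime(int codePoint) {
        return Character.isDefined(codePoint);
    }

    public static void main(String[] args) {
        // Character.getName returns null for unassigned code points,
        // which is what a runtime older than JDK 11 would report here.
        System.out.println(Character.getName(BITCOIN_SIGN));
        System.out.println(isKnownToRuntime(BITCOIN_SIGN));
    }
}
```

On a runtime whose Unicode support predates 10.0, the same check is expected to report the code point as undefined.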

Internationalization Enhancements in JDK 10

Internationalization enhancements for JDK 10 include:

Additional Unicode Language-Tag Extensions

The IETF BCP (best current practice) 47 language tags standard, which has been supported in the Locale class since Java SE 7, includes a Unicode extension subtag. Through Java SE 9, only the -ca (calendar) and -nu (number) extensions were supported.

Java SE 10 added support for the following additional extensions in the relevant JDK classes:

  • -cu (currency type)

  • -fw (first day of week)

  • -rg (region override)

  • -tz (time zone)
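As an illustration of one of these extensions, the following sketch reads the first day of the week through java.time.temporal.WeekFields, once for plain en-US and once with the -fw extension set to Monday. It assumes JDK 10 or later and the default locale providers. The class and helper names are hypothetical.

```java
import java.time.DayOfWeek;
import java.time.temporal.WeekFields;
import java.util.Locale;

public class FirstDayOfWeekDemo {
    // Resolves the first day of the week for a BCP 47 language tag.
    static DayOfWeek firstDayOfWeek(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return WeekFields.of(locale).getFirstDayOfWeek();
    }

    public static void main(String[] args) {
        // Plain US English: the locale data's own first day of week.
        System.out.println(firstDayOfWeek("en-US"));
        // Same locale, with the -fw extension requesting Monday.
        System.out.println(firstDayOfWeek("en-US-u-fw-mon"));
    }
}
```

The language and region stay the same in both calls; only the extension changes the result.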

Since JDK 10, if an application specifies a locale of en-US-u-cu-EUR, which means US English with Euro currency, java.util.Currency.getInstance(locale) instantiates a Euro Currency. If the locale is en-US-u-cu-JPY, a Japanese Yen Currency is instantiated.
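The currency behavior described above can be sketched as follows. The example assumes JDK 10 or later. The class and helper names are hypothetical, while the language tags are the ones used in the text.

```java
import java.util.Currency;
import java.util.Locale;

public class CurrencyExtensionDemo {
    // Returns the ISO 4217 code of the currency selected for a language tag.
    static String currencyCodeFor(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return Currency.getInstance(locale).getCurrencyCode();
    }

    public static void main(String[] args) {
        // No extension: the currency follows the region (US).
        System.out.println(currencyCodeFor("en-US"));
        // The -cu extension overrides the region's currency.
        System.out.println(currencyCodeFor("en-US-u-cu-EUR"));
        System.out.println(currencyCodeFor("en-US-u-cu-JPY"));
    }
}
```

Locale.forLanguageTag is used here because it parses the Unicode extension directly from the tag string.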

Internationalization Enhancements in JDK 9

Internationalization enhancements for JDK 9 include:

Unicode 8.0

Support has been added for Unicode 8.0. Java Platform, Standard Edition (Java SE) 8 supported Unicode 6.2.

The Unicode 6.3, 7.0, and 8.0 standards introduced 10,555 characters, 29 scripts, and 42 blocks, all of which are supported in Java SE 9.

CLDR Locale Data Enabled by Default

The XML-based locale data of the Unicode Common Locale Data Repository (CLDR), first added in JDK 8, is the default locale data since JDK 9.

There are four distinct sources for locale data, identified by the following keywords:

  • CLDR represents the locale data provided by the Unicode CLDR project.

  • HOST represents the current user's customization of the underlying operating system's settings. It works only with the user's default locale, and the customizable settings may vary depending on the operating system. The settings that are supported are primarily the date, time, number, and currency formats.

  • SPI represents the locale-sensitive services implemented by the installed Service Provider Interface (SPI) providers.

  • COMPAT represents the locale data that is compatible with releases prior to JDK 9.

To select a locale data source, use the java.locale.providers system property, listing the data sources in the preferred order. If a provider cannot offer the requested locale data, the search proceeds to the next provider in order. For example:

java.locale.providers=HOST,SPI,CLDR,COMPAT

If you do not set this property, the default behavior is equivalent to the following setting:

java.locale.providers=CLDR,COMPAT,SPI

To enable behavior that is compatible with JDK 8, set the java.locale.providers system property to a value with COMPAT to the left of CLDR.
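The ordering rule can be made concrete with a small helper that interprets a java.locale.providers value the way the text describes it. This is only an illustrative sketch: it does not reproduce the JDK's internal lookup, the class and method names are hypothetical, and the fallback value is the default documented above. The property is normally supplied when the JVM is launched, for example with -Djava.locale.providers=COMPAT,CLDR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LocaleProviderOrder {
    // Default order as documented for JDK 9 when the property is not set.
    static final String DOCUMENTED_DEFAULT = "CLDR,COMPAT,SPI";

    // Splits a property value into provider keywords, in preference order.
    static List<String> preferredOrder(String propertyValue) {
        String value = (propertyValue == null || propertyValue.trim().isEmpty())
                ? DOCUMENTED_DEFAULT
                : propertyValue;
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // True if COMPAT is listed and is consulted before CLDR (JDK 8-compatible order).
    static boolean compatBeforeCldr(List<String> order) {
        int compat = order.indexOf("COMPAT");
        int cldr = order.indexOf("CLDR");
        return compat >= 0 && (cldr < 0 || compat < cldr);
    }

    public static void main(String[] args) {
        List<String> order = preferredOrder(System.getProperty("java.locale.providers"));
        System.out.println("Provider order: " + order);
        System.out.println("JDK 8-compatible order: " + compatBeforeCldr(order));
    }
}
```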

For supported locales, use the search field on the Technical Resources from Oracle page and search for "Supported Locales". See the java.util.spi.LocaleServiceProvider API specification for the related API.

UTF-8 Properties Files

Since Java SE 9, properties files are loaded in UTF-8 encoding. In previous releases, ISO-8859-1 encoding was used for loading property resource bundles. UTF-8 is a much more convenient way to represent non-Latin characters.

Most existing properties files should not be affected: UTF-8 and ISO-8859-1 encode ASCII characters identically, and ISO-8859-1 text that contains human-readable non-ASCII characters is typically not valid UTF-8. If an invalid UTF-8 byte sequence is detected, the Java runtime automatically rereads the file in ISO-8859-1.
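A minimal sketch of the UTF-8 case, assuming JDK 9 or later and no override of the encoding system property: the bytes of a one-line properties file are encoded as UTF-8 in memory and loaded through PropertyResourceBundle. The class name, helper name, and the "greeting" key are hypothetical. The non-ASCII value is written with Unicode escapes so that the source file itself stays ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;

public class Utf8BundleDemo {
    // Loads properties-file bytes and returns the value of the "greeting" key.
    static String loadGreeting(byte[] fileBytes) {
        try {
            PropertyResourceBundle bundle =
                    new PropertyResourceBundle(new ByteArrayInputStream(fileBytes));
            return bundle.getString("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A Japanese greeting, stored as raw UTF-8 bytes rather than as
        // backslash-u escapes inside the properties file.
        String greeting = "\u3053\u3093\u306b\u3061\u306f";
        byte[] fileBytes = ("greeting=" + greeting + "\n").getBytes(StandardCharsets.UTF_8);
        System.out.println(greeting.equals(loadGreeting(fileBytes)));
    }
}
```

Before Java SE 9, the same bytes would have been decoded as ISO-8859-1, so the value would not have round-tripped.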

If there is an issue, consider the following options:

  • Convert the properties file into UTF-8 encoding.

  • Specify the runtime system property for the properties file's encoding, as in this example:

    java.util.PropertyResourceBundle.encoding=ISO-8859-1

See java.util.PropertyResourceBundle.