UTF-16

UTF-16 replaces the original UCS-2. UTF-16 can access 63,000 characters as single Unicode 16-bit units and an additional one million characters through a mechanism known as surrogate pairs.

For surrogate pairs, two ranges of Unicode code values are reserved for the high (first) and low (second) values of these pairs. High values are from 0xD800 to 0xDBFF, and low values are from 0xDC00 to 0xDFFF. The number of characters requiring surrogate pairs is fairly limited, because the most common characters have already been included in the first 64,000 values.