public final class StringUtil extends Object
PROP_ENCODING_EXT
.
An implementation must throw a StringException
with reason StringException.UNSUPPORTED_ENCODING
if a requested character encoding is not supported.
UTF-8 encodes each of the code points in the Unicode character set using one to four 8-bit bytes.
Unicode code points are the numerical values that make up the Unicode code space.
The Unicode Standard, version 4.0 is available from the Unicode Consortium at http://www.unicode.org.
The UTF-8 transformation format is specified by RFC 3629.
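To make the "one to four 8-bit bytes" statement concrete, here is a minimal plain Java SE sketch of the RFC 3629 bit layout. The class and method names (`Utf8EncodeSketch`, `encode`) are hypothetical and are not part of `StringUtil`; the sketch uses `int` rather than `short` for brevity.

```java
final class Utf8EncodeSketch {
    /**
     * Writes the UTF-8 form of codePoint into dst at off and returns the
     * number of bytes written (1 to 4). Illustrative helper, not a StringUtil API.
     */
    static int encode(int codePoint, byte[] dst, int off) {
        if (codePoint < 0 || codePoint > 0x10FFFF
                || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("not a Unicode scalar value");
        }
        if (codePoint < 0x80) {               // 0xxxxxxx
            dst[off] = (byte) codePoint;
            return 1;
        }
        if (codePoint < 0x800) {              // 110xxxxx 10xxxxxx
            dst[off] = (byte) (0xC0 | (codePoint >> 6));
            dst[off + 1] = (byte) (0x80 | (codePoint & 0x3F));
            return 2;
        }
        if (codePoint < 0x10000) {            // 1110xxxx 10xxxxxx 10xxxxxx
            dst[off] = (byte) (0xE0 | (codePoint >> 12));
            dst[off + 1] = (byte) (0x80 | ((codePoint >> 6) & 0x3F));
            dst[off + 2] = (byte) (0x80 | (codePoint & 0x3F));
            return 3;
        }
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        dst[off] = (byte) (0xF0 | (codePoint >> 18));
        dst[off + 1] = (byte) (0x80 | ((codePoint >> 12) & 0x3F));
        dst[off + 2] = (byte) (0x80 | ((codePoint >> 6) & 0x3F));
        dst[off + 3] = (byte) (0x80 | (codePoint & 0x3F));
        return 4;
    }
}
```

For example, U+0041 occupies one byte, U+20AC three bytes and U+1F600 four bytes.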
The encoded character sequences handled by this class are stored in byte arrays.
Each string or character sequence is designated by a byte array, an offset in the byte array
indicating the start of the character sequence and a length indicating the length in bytes of the character sequence.
If a designated character sequence is outside the bounds of its containing array, an ArrayIndexOutOfBoundsException is thrown (note: for readability reasons, these exceptions are assumed and not systematically documented in the methods of this class).
IndexOutOfBoundsException
is thrown.
This class provides two categories of methods:
indexOf
method.
StringException
with reason StringException.INVALID_BYTE_SEQUENCE
when encountering an ill-formed UTF-8 byte sequence; as an example of such methods,
see the convertTo
method.
codePointCount
method.
check
method should be used.
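The class description mentions ill-formed UTF-8 byte sequences and a check method. As a rough illustration of what well-formedness means under RFC 3629, the following plain Java SE sketch validates a byte range. It is not the Java Card implementation; the names `Utf8CheckSketch` and `isWellFormed` are hypothetical.

```java
final class Utf8CheckSketch {
    /** Returns true if bytes [off, off + len) form well-formed UTF-8 (RFC 3629). */
    static boolean isWellFormed(byte[] a, int off, int len) {
        int i = off;
        int end = off + len;
        while (i < end) {
            int b0 = a[i] & 0xFF;
            if (b0 < 0x80) {          // single-byte (Basic Latin) code point
                i++;
                continue;
            }
            int need;                 // number of continuation bytes expected
            int min;                  // smallest code point allowed at this length
            int cp;                   // code point being decoded
            if (b0 >= 0xC2 && b0 <= 0xDF) {
                need = 1; min = 0x80; cp = b0 & 0x1F;
            } else if (b0 >= 0xE0 && b0 <= 0xEF) {
                need = 2; min = 0x800; cp = b0 & 0x0F;
            } else if (b0 >= 0xF0 && b0 <= 0xF4) {
                need = 3; min = 0x10000; cp = b0 & 0x07;
            } else {
                return false;         // stray continuation byte or invalid lead byte
            }
            if (i + need >= end) {
                return false;         // sequence is truncated
            }
            for (int k = 1; k <= need; k++) {
                int b = a[i + k] & 0xFF;
                if ((b & 0xC0) != 0x80) {
                    return false;     // not a continuation byte
                }
                cp = (cp << 6) | (b & 0x3F);
            }
            // Reject overlong forms, surrogates and values beyond U+10FFFF.
            if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
                return false;
            }
            i += need + 1;
        }
        return true;
    }
}
```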
Because Unicode case conversion may require locale-sensitive mappings, context-sensitive mappings, and 1:M character mappings, and in order to limit footprint, the case conversion supported by the methods toLowerCase, toUpperCase and compare is only available by default for the Basic Latin Unicode block (US-ASCII character set: U+0000 - U+007F). Other character blocks may be supported.
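A case conversion limited to the Basic Latin block can be sketched in plain Java SE as follows. The assumption here is that every byte outside 'A'..'Z' is copied unchanged; the actual treatment of ill-formed sequences is implementation-defined and not modelled. `AsciiCaseSketch` and `toLowerAscii` are hypothetical names.

```java
final class AsciiCaseSketch {
    /**
     * Copies src[srcOff, srcOff + len) to dst at dstOff, lower-casing only
     * 'A'..'Z', and returns the number of bytes copied.
     */
    static int toLowerAscii(byte[] src, int srcOff, int len, byte[] dst, int dstOff) {
        for (int i = 0; i < len; i++) {
            byte b = src[srcOff + i];
            // Bytes of multi-byte UTF-8 sequences are negative as Java bytes,
            // so they never fall into the 'A'..'Z' range.
            dst[dstOff + i] = (b >= 'A' && b <= 'Z') ? (byte) (b + 0x20) : b;
        }
        return len;
    }
}
```

Because only single-byte code points are changed, the output always has the same byte length as the input.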
Modifier and Type | Field and Description |
---|---|
static byte | GSM_7 - The GSM Septet character encoding. |
static byte | ISO_8859_1 - The ISO 8859-1 (Latin-1) character encoding. |
static byte | PROP_ENCODING_EXT - Start of proprietary character encoding numbering. |
static byte | UCS_2 - The UCS-2 character encoding. |
static byte | UTF_16 - The UTF-16 character encoding. |
static byte | UTF_16_BE - The UTF-16BE (Big Endian) character encoding. |
static byte | UTF_16_LE - The UTF-16LE (Little Endian) character encoding. |
static byte | UTF_8 - The UTF-8 character encoding. |
Modifier and Type | Method and Description |
---|---|
static boolean | check(byte[] aString, short offset, short length) - Checks if the provided byte array contains a valid UTF-8 encoded character or character sequence. |
static short | codePointAt(byte[] aString, short offset, short length, short index, byte[] dstBuffer, short dstOffset) - Copies to the destination buffer the character (Unicode code point) at the specified index in the UTF-8 encoded character sequence designated by aString, offset and length. |
static short | codePointBefore(byte[] aString, short offset, short length, short index, byte[] dstBuffer, short dstOffset) - Copies to the destination buffer the character (Unicode code point) before the specified index in the UTF-8 encoded character sequence designated by aString, offset and length. |
static short | codePointCount(byte[] aString, short offset, short length) - Returns the number of characters (Unicode code points) in the UTF-8 encoded character sequence designated by aString, offset and length. |
static short | compare(boolean ignoreCase, byte[] aString, short offset, short length, byte[] anotherString, short ooffset, short olength) - Compares two strings lexicographically, optionally ignoring case considerations. |
static short | convertFrom(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset, byte encoding) - Converts from the specified character encoding to the UTF-8 character encoding all of the characters from the provided source character string and copies them to the provided destination array, starting at the provided dstOffset. |
static short | convertTo(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset, byte encoding) - Converts to the specified character encoding all of the characters from the provided source UTF-8 encoded character string and copies them to the provided destination array, starting at the provided dstOffset. |
static boolean | endsWith(byte[] aString, short offset, short length, byte[] suffix, short soffset, short slength, short codePointCount) - Tests if the UTF-8 encoded character sequence designated by aString, offset and length ends with the first codePointCount characters of the character sequence designated by suffix, soffset and slength. |
static short | indexOf(byte[] aString, short offset, short length, byte[] subString, short soffset, short slength) - Returns the index within the provided UTF-8 encoded character string of the first occurrence of the specified substring. |
static short | offsetByCodePoints(byte[] aString, short offset, short length, short index, short codePointOffset) - Returns the byte index within the UTF-8 encoded character sequence designated by aString, offset and length that is offset from the given index by codePointOffset code points. |
static boolean | parseBoolean(byte[] aString, short offset, short length) - Parses the string argument as a boolean. |
static short | parseLongInteger(byte[] aString, short offset, short length, short[] integer, short ioffset) - Parses the provided UTF-8 encoded character sequence into a signed integer of up to 64 bits. |
static short | parseShortInteger(byte[] aString, short offset, short length) - Parses the provided UTF-8 encoded character sequence into a signed (short) integer of up to 16 bits. |
static short | replace(byte[] srcString, short srcOffset, short srcLength, byte[] oldSubstring, short oOffset, short oLength, byte[] newSubstring, short nOffset, short nLength, byte[] dstString, short dstOffset) - Copies to the destination byte array the string resulting from replacing all occurrences of the old substring in the provided source string with the new substring. |
static boolean | startsWith(byte[] aString, short offset, short length, byte[] prefix, short poffset, short plength, short codePointCount) - Tests if the UTF-8 encoded character sequence designated by aString, offset and length starts with the first codePointCount characters of the character sequence designated by prefix, poffset and plength. |
static short | substring(byte[] srcString, short srcOffset, short srcLength, short codePointBeginIndex, short codePointEndIndex, byte[] dstString, short dstOffset) - Copies to the destination byte array the specified substring of the designated source string. |
static short | toLowerCase(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset) - Converts to lower case and copies all of the characters from the provided source UTF-8 encoded character string to the provided destination array, starting at the provided dstOffset. |
static short | toUpperCase(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset) - Converts to upper case and copies all of the characters from the provided source UTF-8 encoded character string to the provided destination array, starting at the provided dstOffset. |
static short | trim(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset) - Removes white space from both ends of the provided UTF-8 encoded character string and copies the resulting character sequence to the provided destination array, starting at the provided dstOffset. |
static short | valueOf(boolean b, byte[] dstString, short dstOffset) - Copies the UTF-8 encoded character string representation of the boolean argument into the provided destination array, starting at the provided offset. |
static short | valueOf(short[] l, byte[] dstString, short dstOffset) - Copies the UTF-8 encoded, signed decimal string representation of the signed integer argument of up to 64 bits, provided as an array of short integers, into the provided destination array, starting at the provided offset. |
static short | valueOf(short i, byte[] dstString, short dstOffset) - Copies the UTF-8 encoded, signed decimal string representation of the signed (short) argument of up to 16 bits into the provided destination array, starting at the provided offset. |
public static final byte UTF_8
public static final byte UTF_16
public static final byte UTF_16_LE
public static final byte UTF_16_BE
public static final byte UCS_2
public static final byte GSM_7
public static final byte ISO_8859_1
public static final byte PROP_ENCODING_EXT
public static short codePointCount(byte[] aString, short offset, short length)
Returns the number of characters (Unicode code points) in the UTF-8 encoded character sequence designated by aString, offset and length.
Ill-formed or incomplete byte sequences within the text range count as one code point each.
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence.
offset - the starting offset of the character sequence in the byte array.
length - the length (in bytes) of the contained character sequence.
Throws:
NullPointerException - if aString is null.
public static short codePointAt(byte[] aString, short offset, short length, short index, byte[] dstBuffer, short dstOffset)
Copies to the destination buffer the character (Unicode code point) at the specified index in the UTF-8 encoded character sequence designated by aString, offset and length.
index is an index in the byte array relative to the offset of the designated character sequence within the byte array (that is, relative to offset).
The resulting code point copied to the destination array is UTF-8 encoded and is therefore from one to four bytes long.
Ill-formed or incomplete byte sequences within the text range count as one code point each and are returned as-is.
Parameters:
aString - the byte array containing the UTF-8 encoded reference character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
index - the byte index (relative to offset) of the character to be returned.
dstBuffer - the byte array for copying the resulting character.
dstOffset - the starting offset in dstBuffer for copying the UTF-8 byte sequence of the character (Unicode code point) at the specified index.
Throws:
IndexOutOfBoundsException - if index is negative or not less than the length of aString.
NullPointerException - if aString or dstBuffer is null.
public static short codePointBefore(byte[] aString, short offset, short length, short index, byte[] dstBuffer, short dstOffset)
Copies to the destination buffer the character (Unicode code point) before the specified index in the UTF-8 encoded character sequence designated by aString, offset and length.
index is an index in the byte array relative to the offset of the designated character sequence within the byte array (that is, relative to offset).
The resulting code point copied to the destination array is UTF-8 encoded and is therefore from one to four bytes long.
Ill-formed or incomplete byte sequences within the text range count as one code point each and are returned as-is.
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
index - the byte index (relative to offset) following the character to be returned.
dstBuffer - the byte array for copying the resulting character.
dstOffset - the starting offset in dstBuffer for copying the UTF-8 byte sequence of the character (Unicode code point) before the specified index.
Throws:
IndexOutOfBoundsException - if index is less than 1 or greater than the length of aString.
NullPointerException - if aString or dstBuffer is null.
public static short offsetByCodePoints(byte[] aString, short offset, short length, short index, short codePointOffset)
Returns the byte index within the UTF-8 encoded character sequence designated by aString, offset and length that is offset from the given index by codePointOffset code points.
Ill-formed or incomplete UTF-8 byte sequences within the text range given by index and codePointOffset count as one code point each.
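The forward case of this walk can be sketched in plain Java SE as shown below. The sketch assumes well-formed UTF-8 and a non-negative code point offset; it does not model the handling of ill-formed sequences or negative offsets. `CodePointOffsetSketch` is a hypothetical name, and `int` is used instead of `short` for brevity.

```java
final class CodePointOffsetSketch {
    /**
     * Returns the byte index (relative to off) reached by moving n code points
     * forward from byte index `index`. Assumes well-formed UTF-8 and n >= 0.
     */
    static int offsetByCodePoints(byte[] a, int off, int len, int index, int n) {
        if (index < 0 || index > len) {
            throw new IndexOutOfBoundsException();
        }
        int i = index;
        while (n > 0) {
            if (i >= len) {
                // Fewer than n code points remain after index.
                throw new IndexOutOfBoundsException();
            }
            i++;                                          // consume the lead byte
            while (i < len && (a[off + i] & 0xC0) == 0x80) {
                i++;                                      // skip continuation bytes (10xxxxxx)
            }
            n--;
        }
        return i;
    }
}
```

With the five-byte sequence for "a€b" (61 E2 82 AC 62), moving two code points from index 0 yields byte index 4.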
This method can be used to extract a substring from a string. For example, to copy to buffer the substring of the string aString that begins at codePointBeginIndex and extends to the character at index codePointEndIndex - 1, one can call:
    short beginOffset = StringUtil.offsetByCodePoints(aString, offset, length, (short) 0, codePointBeginIndex);
    short endOffset = StringUtil.offsetByCodePoints(aString, offset, length, (short) 0, codePointEndIndex);
    // The returned indexes are relative to offset, so the copy starts at offset + beginOffset.
    short l = Util.arrayCopy(aString, (short) (offset + beginOffset), buffer, (short) 0, (short) (endOffset - beginOffset));
The copied substring thus has a length in code points (that is, a code point count) equal to codePointEndIndex - codePointBeginIndex.
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
index - the byte index to be offset (relative to offset).
codePointOffset - the offset in code points.
Returns:
the byte index within aString, relative to the beginning of the string contained in aString, that is, relative to offset.
Throws:
IndexOutOfBoundsException - if index is negative or larger than the length of aString, or if codePointOffset is positive and the substring starting with index has fewer than codePointOffset code points, or if codePointOffset is negative and the substring before index has fewer than the absolute value of codePointOffset code points.
NullPointerException - if aString is null.
public static short compare(boolean ignoreCase, byte[] aString, short offset, short length, byte[] anotherString, short ooffset, short olength)
Compares two strings lexicographically, optionally ignoring case considerations. The character sequence designated by aString, offset and length is compared lexicographically to the character sequence designated by anotherString, ooffset and olength.
The result is a negative number if the character sequence contained in aString lexicographically precedes the character sequence contained in anotherString. The result is a positive number if the character sequence contained in aString lexicographically follows the character sequence contained in anotherString. The result is zero if the two character sequences are equal.
This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compare returns the difference of the first mismatching bytes of the UTF-8 encoded representations of the two characters at position k in the two strings. If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compare returns the difference of the lengths of the strings.
When ignoring case considerations, this method behaves as if comparing (using the same algorithm as described above) normalized versions of the strings where case differences have been eliminated by calling toLowerCase(toUpperCase(string)) on both argument strings.
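The ordering described above can be sketched in plain Java SE as follows. Two points are assumptions rather than statements of the specification: mismatching bytes are compared as unsigned values (which matches code point order for well-formed UTF-8), and case folding covers only 'A'..'Z'. `CompareSketch` is a hypothetical name.

```java
final class CompareSketch {
    /** Folds 'A'..'Z' to 'a'..'z' when ignoreCase is set (Basic Latin only). */
    private static int fold(int b, boolean ignoreCase) {
        return (ignoreCase && b >= 'A' && b <= 'Z') ? b + 0x20 : b;
    }

    /**
     * Returns the difference of the first mismatching bytes, or the difference
     * of the lengths when one sequence is a prefix of the other.
     */
    static int compare(boolean ignoreCase,
                       byte[] a, int aOff, int aLen,
                       byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int k = 0; k < n; k++) {
            int x = fold(a[aOff + k] & 0xFF, ignoreCase);   // unsigned byte value (assumption)
            int y = fold(b[bOff + k] & 0xFF, ignoreCase);
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }
}
```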
Parameters:
ignoreCase - whether case must be ignored.
aString - the byte array containing the reference UTF-8 encoded character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
anotherString - the byte array containing the UTF-8 encoded character sequence to be compared.
ooffset - the starting offset in anotherString of the character sequence to be compared.
olength - the length (in bytes) of the character sequence to be compared.
Returns:
0 if the character sequence contained in anotherString is equal to the character sequence contained in aString; a value less than 0 if the character sequence contained in aString is lexicographically less than the character sequence contained in anotherString; and a value greater than 0 if the character sequence contained in aString is lexicographically greater than the character sequence contained in anotherString, optionally ignoring case considerations.
Throws:
NullPointerException - if aString or anotherString is null.
public static short indexOf(byte[] aString, short offset, short length, byte[] subString, short soffset, short slength)
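Returns the index within the provided UTF-8 encoded character string of the first occurrence of the specified substring. The index returned is the smallest value k such that:
    compare(false, aString, offset + k, slength, subString, soffset, slength) == 0
If no such value of k exists, then -1 is returned.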
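The search for the smallest k can be sketched as a byte-level scan in plain Java SE. This is only an illustration of the definition in terms of k, not the Java Card implementation; `IndexOfSketch` is a hypothetical name.

```java
final class IndexOfSketch {
    /**
     * Returns the smallest k (relative to off) at which sub[sOff, sOff + sLen)
     * matches a byte-for-byte, or -1 if there is no such k.
     */
    static int indexOf(byte[] a, int off, int len, byte[] sub, int sOff, int sLen) {
        for (int k = 0; k + sLen <= len; k++) {
            int j = 0;
            while (j < sLen && a[off + k + j] == sub[sOff + j]) {
                j++;
            }
            if (j == sLen) {
                return k;     // all sLen bytes matched at position k
            }
        }
        return -1;
    }
}
```

Note that in this sketch an empty substring matches at index 0; the source does not say how the API treats that case.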
Parameters:
aString - the byte array containing the reference UTF-8 encoded character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
subString - the byte array containing the UTF-8 encoded character sequence of the substring.
soffset - the starting offset in subString of the substring's character sequence.
slength - the length (in bytes) of the substring's character sequence.
Returns:
if the character sequence designated by subString, soffset and slength occurs as a substring within the string designated by aString, offset and length, then the index (relative to offset) of the first byte of the first such substring is returned; if it does not occur as a substring, -1 is returned.
Throws:
NullPointerException - if aString or subString is null.
public static short replace(byte[] srcString, short srcOffset, short srcLength, byte[] oldSubstring, short oOffset, short oLength, byte[] newSubstring, short nOffset, short nLength, byte[] dstString, short dstOffset)
Copies to the destination byte array the string resulting from replacing all occurrences of the old substring in the provided source string with the new substring.
If the character sequence (substring) designated by oldSubstring, oOffset and oLength does not occur in the source character sequence designated by srcString, srcOffset and srcLength, then the source character sequence is copied as is to dstString, starting at dstOffset. Otherwise, a character sequence identical to the character sequence designated by srcString, srcOffset and srcLength is copied to dstString, starting at dstOffset, except that every occurrence of the substring designated by oldSubstring, oOffset and oLength is replaced by an occurrence of the substring designated by newSubstring, nOffset and nLength.
The replacement proceeds from the beginning of the source string to the end.
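A left-to-right replacement of this kind can be sketched in plain Java SE as below. Two choices are assumptions: the return value is taken to be the number of bytes written, and an empty old substring is treated as "no occurrence". `ReplaceSketch` is a hypothetical name.

```java
final class ReplaceSketch {
    /** Returns true if a[aOff, aOff + n) equals b[bOff, bOff + n). */
    private static boolean matches(byte[] a, int aOff, byte[] b, int bOff, int n) {
        for (int j = 0; j < n; j++) {
            if (a[aOff + j] != b[bOff + j]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Copies src to dst, replacing each occurrence of oldS by newS while
     * scanning from the beginning to the end; returns the bytes written.
     */
    static int replace(byte[] src, int sOff, int sLen,
                       byte[] oldS, int oOff, int oLen,
                       byte[] newS, int nOff, int nLen,
                       byte[] dst, int dOff) {
        int out = dOff;
        int i = 0;
        while (i < sLen) {
            if (oLen > 0 && i + oLen <= sLen && matches(src, sOff + i, oldS, oOff, oLen)) {
                System.arraycopy(newS, nOff, dst, out, nLen);
                out += nLen;
                i += oLen;            // resume after the replaced occurrence
            } else {
                dst[out++] = src[sOff + i++];
            }
        }
        return out - dOff;
    }
}
```

Because the scan resumes after each replaced occurrence, replacing "aa" by "b" in "aaa" yields "ba".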
Parameters:
srcString - the byte array containing the source UTF-8 encoded character sequence.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
oldSubstring - the byte array containing the UTF-8 encoded character sequence to be replaced.
oOffset - the starting offset of the replaced character sequence in oldSubstring.
oLength - the length (in bytes) of the replaced character sequence.
newSubstring - the byte array containing the replacement UTF-8 encoded character sequence.
nOffset - the starting offset of the replacement character sequence in newSubstring.
nLength - the length (in bytes) of the replacement character sequence.
dstString - the byte array for copying the resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
Throws:
NullPointerException - if srcString, oldSubstring, newSubstring or dstString is null.
public static short toLowerCase(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset)
Converts to lower case and copies all of the characters from the provided source UTF-8 encoded character string to the provided destination array, starting at the provided dstOffset.
This method skips/ignores any unrecognized (ill-formed or incomplete) byte sequence.
Parameters:
srcString - the byte array containing the source UTF-8 encoded character sequence.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
dstString - the byte array for copying the resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
Throws:
NullPointerException - if srcString or dstString is null.
See Also:
toUpperCase(byte[], short, short, byte[], short)
public static short toUpperCase(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset)
Converts to upper case and copies all of the characters from the provided source UTF-8 encoded character string to the provided destination array, starting at the provided dstOffset.
This method skips/ignores any unrecognized (ill-formed or incomplete) byte sequence.
Parameters:
srcString - the byte array containing the source UTF-8 encoded character sequence.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
dstString - the byte array for copying the resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
Throws:
NullPointerException - if srcString or dstString is null.
See Also:
toLowerCase(byte[], short, short, byte[], short)
public static short trim(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset)
Removes white space from both ends of the provided UTF-8 encoded character string and copies the resulting character sequence to the provided destination array, starting at the provided dstOffset.
If the source string designated by srcString, srcOffset and srcLength represents an empty character sequence, or the first and last characters of the character sequence of the source string both have codes greater than '\u0020' (the space character), then the source string is copied as is to dstString, starting at dstOffset.
Otherwise, if there is no character with a code greater than '\u0020' in the source string, then no character is copied and 0 is returned.
Otherwise, let k be the index of the first character in the source string whose code is greater than '\u0020', and let m be the index of the last character in the source string whose code is greater than '\u0020'. The substring of the source string that begins with the character at index k and ends with the character at index m is copied to dstString, starting at dstOffset.
This method may be used to trim whitespace from the beginning and end of a string; in fact, it trims all ASCII control characters as well.
Illegal byte sequences are considered as non-white spaces.
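The k/m rule above can be sketched in plain Java SE as follows. Bytes are compared as unsigned values, so every byte of a multi-byte UTF-8 sequence (0x80 and above) counts as non-white space; returning the number of bytes copied is an assumption. `TrimSketch` is a hypothetical name.

```java
final class TrimSketch {
    /**
     * Copies src[off, off + len) without leading and trailing bytes <= 0x20
     * to dst at dOff and returns the number of bytes copied.
     */
    static int trim(byte[] src, int off, int len, byte[] dst, int dOff) {
        int k = 0;
        while (k < len && (src[off + k] & 0xFF) <= 0x20) {
            k++;                      // k: first byte with a code above U+0020
        }
        if (k == len) {
            return 0;                 // empty or white space only: nothing copied
        }
        int m = len - 1;
        while ((src[off + m] & 0xFF) <= 0x20) {
            m--;                      // m: last byte with a code above U+0020
        }
        int n = m - k + 1;
        System.arraycopy(src, off + k, dst, dOff, n);
        return n;
    }
}
```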
Parameters:
srcString - the byte array containing the source UTF-8 encoded character sequence.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
dstString - the byte array for copying the resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
Throws:
NullPointerException - if srcString or dstString is null.
public static short valueOf(boolean b, byte[] dstString, short dstOffset)
Copies the UTF-8 encoded character string representation of the boolean argument into the provided destination array, starting at the provided offset.
If the argument is true, a string equal to "true" is copied; otherwise, a string equal to "false" is copied.
Parameters:
b - a boolean.
dstString - the destination UTF-8 encoded character string, as a byte array.
dstOffset - the starting offset in the destination array.
Throws:
NullPointerException - if dstString is null.
public static boolean parseBoolean(byte[] aString, short offset, short length)
Parses the string argument as a boolean. The boolean returned represents the value true if the string argument is not null and is equal, ignoring case, to the string "true".
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence to be parsed.
offset - the starting offset of the character sequence in aString.
length - the length (in bytes) of the character sequence to be parsed.
Throws:
NullPointerException - if aString is null.
public static short valueOf(short i, byte[] dstString, short dstOffset)
Copies the UTF-8 encoded, signed decimal string representation of the signed (short) argument of up to 16 bits into the provided destination array, starting at the provided offset.
Parameters:
i - a short.
dstString - the destination UTF-8 encoded character string, as a byte array.
dstOffset - the starting offset in the destination array.
Throws:
NullPointerException - if dstString is null.
public static short parseShortInteger(byte[] aString, short offset, short length)
Parses the provided UTF-8 encoded character sequence into a signed (short) integer of up to 16 bits.
Accepts decimal and hexadecimal numbers given by the following grammar:
- DecodableString:
  - Sign(opt) DecimalNumeral
  - Sign(opt) 0x HexDigits
  - Sign(opt) 0X HexDigits
  - Sign(opt) # HexDigits
- Sign:
  - -
DecimalNumeral and HexDigits are defined in §3.10.1 of the Java Language Specification.
The sequence of characters following an (optional) negative sign and/or radix specifier ("0x", "0X", "#", or leading zero) is parsed as a (short) integer in the specified radix (10 or 16). This sequence of characters must represent a positive value or a StringException will be thrown with reason StringException.ILLEGAL_NUMBER_FORMAT. The result is negated if the first character of the specified character string is the minus sign. No whitespace characters are permitted in the character string.
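The grammar above can be sketched as a small parser in plain Java SE. The sketch throws `NumberFormatException` as a stand-in for `StringException` (which is a Java Card class), and it accepts decimal input with leading zeros, which is an assumption about a point the grammar leaves open. `ParseShortSketch` is a hypothetical name.

```java
final class ParseShortSketch {
    /** Parses [Sign] (DecimalNumeral | 0x HexDigits | 0X HexDigits | # HexDigits). */
    static short parse(byte[] a, int off, int len) {
        int i = off;
        int end = off + len;
        boolean negative = false;
        if (i < end && a[i] == '-') {
            negative = true;
            i++;
        }
        int radix = 10;
        if (i + 1 < end && a[i] == '0' && (a[i + 1] == 'x' || a[i + 1] == 'X')) {
            radix = 16;
            i += 2;
        } else if (i < end && a[i] == '#') {
            radix = 16;
            i++;
        }
        if (i == end) {
            throw new NumberFormatException("no digits");
        }
        int value = 0;
        for (; i < end; i++) {
            int c = a[i];
            int d;
            if (c >= '0' && c <= '9') {
                d = c - '0';
            } else if (c >= 'a' && c <= 'f') {
                d = c - 'a' + 10;
            } else if (c >= 'A' && c <= 'F') {
                d = c - 'A' + 10;
            } else {
                d = -1;               // includes white space and a second sign
            }
            if (d < 0 || d >= radix) {
                throw new NumberFormatException("illegal character");
            }
            value = value * radix + d;
            if (value > 32768) {      // magnitude can no longer fit in a short
                throw new NumberFormatException("out of range");
            }
        }
        if (negative) {
            value = -value;
        }
        if (value > Short.MAX_VALUE) {
            throw new NumberFormatException("out of range");
        }
        return (short) value;
    }
}
```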
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence to be parsed.
offset - the starting offset of the character sequence in aString.
length - the length (in bytes) of the character sequence to be parsed.
Returns:
the short integer value represented by the designated character sequence.
Throws:
StringException - if the designated character sequence does not contain a parsable (short) integer.
NullPointerException - if aString is null.
public static short valueOf(short[] l, byte[] dstString, short dstOffset)
Copies the UTF-8 encoded, signed decimal string representation of the signed integer argument of up to 64 bits, provided as an array of short integers, into the provided destination array, starting at the provided offset.
Parameters:
l - an array of short integers representing a signed long integer of up to 64 bits; the most significant short integer is at index 0.
dstString - the destination UTF-8 encoded character string, as a byte array.
dstOffset - the starting offset in the destination array.
Throws:
NullPointerException - if l or dstString is null.
public static short parseLongInteger(byte[] aString, short offset, short length, short[] integer, short ioffset)
Parses the provided UTF-8 encoded character sequence into a signed integer of up to 64 bits.
Accepts decimal and hexadecimal numbers given by the following grammar:
- DecodableString:
  - Sign(opt) DecimalNumeral
  - Sign(opt) 0x HexDigits
  - Sign(opt) 0X HexDigits
  - Sign(opt) # HexDigits
- Sign:
  - -
DecimalNumeral and HexDigits are defined in §3.10.1 of the Java Language Specification.
The sequence of characters following an (optional) negative sign and/or radix specifier ("0x", "0X", "#", or leading zero) is parsed as a (long) integer in the specified radix (10 or 16). This sequence of characters must represent a positive value or a StringException will be thrown with reason StringException.ILLEGAL_NUMBER_FORMAT. The result is negated if the first character of the specified character string is the minus sign. No whitespace characters are permitted in the character string.
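The result of this method is delivered as an array of short integers with the most significant short at index 0. The plain Java SE sketch below shows one way a 64-bit value maps onto such an array. Always using exactly four shorts is an assumption made for simplicity; the API speaks of "up to" 64 bits. `LongShortsSketch` and its methods are hypothetical names.

```java
final class LongShortsSketch {
    /** Splits value into four shorts at dst[off..off+3], most significant first. */
    static void toShorts(long value, short[] dst, int off) {
        for (int i = 0; i < 4; i++) {
            dst[off + i] = (short) (value >>> (48 - 16 * i));
        }
    }

    /** Reassembles a 64-bit value from four shorts, most significant first. */
    static long fromShorts(short[] src, int off) {
        long v = 0;
        for (int i = 0; i < 4; i++) {
            v = (v << 16) | (src[off + i] & 0xFFFFL);   // mask to avoid sign extension
        }
        return v;
    }
}
```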
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence to be parsed.
offset - the starting offset of the character sequence in aString.
length - the length (in bytes) of the character sequence to be parsed.
integer - the array of short integers to contain the value represented by the designated character sequence; the most significant short integer is at index 0.
ioffset - the starting offset in integer for copying the resulting short sequence.
Returns:
the number of short integers written into the array, ignoring leading zero short values.
Throws:
StringException - if the designated character sequence does not contain a parsable (long) integer.
NullPointerException - if aString or integer is null.
public static short convertTo(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset, byte encoding)
Converts to the specified character encoding all of the characters from the provided source UTF-8 encoded character string and copies them to the provided destination array, starting at the provided dstOffset.
Parameters:
srcString - the byte array containing the source UTF-8 encoded character sequence.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
dstString - the byte array for copying the resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
encoding - the character encoding to be used.
Throws:
StringException - with reason StringException.UNSUPPORTED_ENCODING if the requested character encoding is not supported.
StringException - with reason StringException.INVALID_BYTE_SEQUENCE if an invalid byte sequence is encountered.
NullPointerException - if srcString or dstString is null.
See Also:
convertFrom(byte[], short, short, byte[], short, byte)
public static short convertFrom(byte[] srcString, short srcOffset, short srcLength, byte[] dstString, short dstOffset, byte encoding)
Converts from the specified character encoding to the UTF-8 character encoding all of the characters from the provided source character string and copies them to the provided destination array, starting at the provided dstOffset.
Parameters:
srcString - the byte array containing the source character sequence encoded in the character encoding designated by encoding.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
dstString - the byte array for copying the UTF-8 encoded resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
encoding - the character encoding of the source character string.
Throws:
StringException - with reason StringException.UNSUPPORTED_ENCODING if the requested character encoding is not supported.
StringException - with reason StringException.INVALID_BYTE_SEQUENCE if an invalid byte sequence is encountered.
NullPointerException - if srcString or dstString is null.
See Also:
convertTo(byte[], short, short, byte[], short, byte)
public static boolean check(byte[] aString, short offset, short length)
Checks if the provided byte array contains a valid UTF-8 encoded character or character sequence.
Parameters:
aString - the byte array containing the UTF-8 encoded character sequence to be checked.
offset - the starting offset of the character sequence in aString.
length - the length (in bytes) of the character sequence to be checked.
Throws:
NullPointerException - if aString is null.
public static boolean startsWith(byte[] aString, short offset, short length, byte[] prefix, short poffset, short plength, short codePointCount)
Tests if the UTF-8 encoded character sequence designated by aString, offset and length starts with the first codePointCount characters of the character sequence designated by prefix, poffset and plength.
If codePointCount is negative, the whole prefix character sequence is considered; in which case calling this method is equivalent to calling arrayCompare as follows:
    return length >= plength && arrayCompare(aString, offset, prefix, poffset, plength) == 0;
Otherwise, if codePointCount is positive, calling this method is equivalent to calling arrayCompare as follows:
    short endOffset = StringUtil.offsetByCodePoints(prefix, poffset, plength, (short) 0, codePointCount);
    return length >= endOffset && arrayCompare(aString, offset, prefix, poffset, endOffset) == 0;
Parameters:
aString - the byte array containing the reference UTF-8 encoded character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
prefix - the byte array containing the prefixing UTF-8 encoded character sequence.
poffset - the starting offset in prefix of the prefixing character sequence.
plength - the length (in bytes) of the prefixing character sequence.
codePointCount - the number of code points to be used for testing.
Returns:
true if the character sequence designated by prefix, poffset and plength is a prefix of the character sequence designated by aString, offset and length; false otherwise.
Throws:
NullPointerException - if aString or prefix is null.
public static boolean endsWith(byte[] aString, short offset, short length, byte[] suffix, short soffset, short slength, short codePointCount)
Tests if the UTF-8 encoded character sequence designated by aString, offset and length ends with the first codePointCount characters of the character sequence designated by suffix, soffset and slength.
If codePointCount is negative, the whole suffix character sequence is considered; in which case calling this method is equivalent to calling arrayCompare as follows:
    return length >= slength && arrayCompare(aString, (short) (offset + length - slength), suffix, soffset, slength) == 0;
Otherwise, if codePointCount is positive, calling this method is equivalent to calling arrayCompare as follows:
    short endOffset = StringUtil.offsetByCodePoints(suffix, soffset, slength, (short) 0, codePointCount);
    return length >= endOffset && arrayCompare(aString, (short) (offset + length - endOffset), suffix, soffset, endOffset) == 0;
Parameters:
aString - the byte array containing the reference UTF-8 encoded character sequence.
offset - the starting offset of the reference character sequence in aString.
length - the length (in bytes) of the reference character sequence.
suffix - the byte array containing the suffixing UTF-8 encoded character sequence.
soffset - the starting offset in suffix of the suffixing character sequence.
slength - the length (in bytes) of the suffixing character sequence.
codePointCount - the number of code points to be used for testing.
Returns:
true if the character sequence designated by suffix, soffset and slength is a suffix of the character sequence designated by aString, offset and length; false otherwise.
Throws:
NullPointerException - if aString or suffix is null.
public static short substring(byte[] srcString, short srcOffset, short srcLength, short codePointBeginIndex, short codePointEndIndex, byte[] dstString, short dstOffset)
Copies to the destination byte array the specified substring of the designated source string. The substring begins at codePointBeginIndex and extends to the character at index codePointEndIndex - 1. Thus the length of the substring in code points (that is, its code point count) is codePointEndIndex - codePointBeginIndex.
Ill-formed or incomplete byte sequences within the text range count as one code point each.
If codePointEndIndex is negative, then the whole remaining character sequence from the source string is considered.
Parameters:
srcString - the byte array containing the source UTF-8 encoded character sequence.
srcOffset - the starting offset of the source character sequence in srcString.
srcLength - the length (in bytes) of the source character sequence.
codePointBeginIndex - the beginning index (relative to srcOffset), inclusive.
codePointEndIndex - the ending index (relative to srcOffset), exclusive.
dstString - the byte array for copying the resulting character sequence.
dstOffset - the starting offset in dstString for copying the resulting character sequence.
Throws:
NullPointerException - if srcString or dstString is null.
Copyright © 1998, 2015, Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.