public class URLUtils
extends java.lang.Object
java.net.URLDecoder and java.net.URLEncoder.
The URL escaping/unescaping functions use the ISO-8859-1
encoding for conversion between characters and bytes.
If you need a customized set of reserved (unescaped) ASCII characters, or a UTF-8 character encoding, then use a URLEscaper. This static implementation delegates to a URLEscaperISO88591 object (with the slight performance overhead for dispatching and data access in a non-static implementation).
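The practical difference between the two encodings shows up as soon as a string contains a non-ASCII character. The sketch below uses only the JDK's java.net.URLEncoder, not URLUtils itself, and the class name EncodingComparisonSketch is hypothetical; it illustrates how the same character becomes one escaped byte under ISO-8859-1 and two under UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class EncodingComparisonSketch {
    /** Percent-encodes text with the JDK encoder using the named charset. */
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Both charsets used below are required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(encode(text, "ISO-8859-1")); // caf%E9
        System.out.println(encode(text, "UTF-8"));      // caf%C3%A9
    }
}
```

A receiver that decodes with the other charset will see a different character, so both sides need to agree on the encoding.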
See Also:
URLDecoder, URLEncoder, URLEscaper
Modifier and Type | Field and Description |
---|---|
static java.lang.String |
CLASS_VERSION
Class version string
|
protected static java.lang.String |
EMAIL_ADDRESS_DELIMITER |
protected static java.lang.String |
NUM_CHAR_REF_END |
protected static java.lang.String |
NUM_CHAR_REF_START |
protected static java.lang.String |
PERCENT |
static java.util.regex.Pattern |
RESERVED_URI_CHARS |
protected static atg.core.net.URLEscaperUTF8 |
sUrlEscaperUTF8 |
Constructor and Description |
---|
URLUtils() |
Modifier and Type | Method and Description |
---|---|
static boolean |
containsNumCharRefs(java.lang.String pVal)
Checks to see if a string contains any numeric character
references of the form &#N;, where N is either
a decimal or hexadecimal value of a character's Unicode code point. |
static java.lang.String |
emailAddressFromPunycode(java.lang.String pEmailAddress)
Converts the domain portion of an email address from Punycode into Unicode.
|
static java.lang.String |
emailAddressToPunycode(java.lang.String pEmailAddress)
Converts an email address containing non-ASCII characters
or numeric character references into Punycode.
|
static java.lang.String |
encodeEmailAddresses(java.lang.String pAddressString)
Encodes the email address, or comma-delimited list of email addresses.
|
static void |
escapeAndAppendUrl(java.lang.StringBuffer sb,
char[] str)
URL encode the contents of a character array
and append them to the provided StringBuffer.
|
static void |
escapeAndAppendUrlString(java.lang.StringBuffer pDestBuffer,
java.lang.String pStr)
URL encode the contents of the String and append them to the provided
StringBuffer.
|
static void |
escapeAndAppendUrlString(java.lang.StringBuilder pDestBuffer,
java.lang.String pStr)
URL encode the contents of the String and append them to the provided
StringBuilder.
|
static java.lang.String |
escapeUrlString(java.lang.String pStr)
Return the passed in String as a URL escaped String.
|
static java.lang.String |
escapeUrlString(java.lang.StringBuffer pStrBuf) |
static java.lang.String |
escapeUTF8UrlString(java.lang.String pStr)
Return the passed in String as a URL escaped String.
|
static java.lang.String |
getAddressForUrl(java.lang.String pAddress)
Return the IPv6 address string wrapped inside '[' and ']' for the given IP address
|
static java.lang.String |
getAddressPortNumberSeparator(java.lang.String pAddress)
Return the port separator string for the given address
|
static java.lang.String |
getModifiedHostValue(java.lang.String pAddress)
Return the IPv6 address string by truncating the '%' suffix for the given IP address
|
static java.lang.String |
getUrlPortNumberSeparator(java.lang.String pUrl)
Return the port separator string for the given URL
|
static char |
hex2Char(char x,
char y)
Takes two ASCII characters,
x and y ,
which represent a two-digit hexadecimal number (%xy ),
and returns the unsigned byte value in the range
0-255 . |
static boolean |
isIPv6Address(java.lang.String pAddress)
Check to see if a given address is an IPv6 address
|
static boolean |
isRelative(java.lang.String pUrl)
Returns true if the URL is relative
|
static boolean |
needsIPv6Wrapper(java.lang.String pAddress)
Check to see if the address passed is an IPv6 address and it does not already have
the square bracket wrapper.
|
static java.lang.String |
numCharRefToUnicode(java.lang.String pVal)
Converts any unicode values in a string encoded as numeric character
references into their literal unicode character values.
|
static java.util.Hashtable |
parsePostData(int len,
java.io.InputStream in)
Parses posted data sent from the client to the
server, and builds a
Hashtable object
with key-value pairs. |
static java.util.Hashtable |
parseQuery(char[] str)
Parses a query string passed from the client to the
server and builds a
Hashtable object
with key-value pairs. |
static java.util.Hashtable |
parseQueryString(java.lang.String pQueryString)
Parses a query string passed from the client to the
server and builds a
Hashtable object
with key-value pairs. |
static java.util.Dictionary<java.lang.String,java.lang.String[]> |
parseQueryStringToDictionary(java.lang.String pQueryString)
Parses a query string passed from the client to the
server and builds a
Dictionary object
with key-value pairs (the returned Dictionary also implements Map). |
static java.util.Map<java.lang.String,java.lang.String[]> |
parseQueryStringToMap(java.lang.String pQueryString)
Parses a query string passed from the client to the
server and builds a
Map object
with key-value pairs. |
static java.lang.String |
processEmailAddressPunycode(java.lang.String pEmailAddress,
boolean pEncode) |
protected static java.lang.String |
processUrlPunycode(java.lang.String pURL,
boolean pEncode)
Finds the domain and path portions of a URL and
passes them to the URLProcessor to convert them to or from Punycode.
|
static java.lang.String |
removeRelativePaths(java.lang.String pURL)
Takes an absolute URI that may have "../" in it and removes
the "../" entries (by removing it and the directory above it in the
path as well).
|
static void |
resolvePureRelativePath(java.lang.StringBuffer pResultBuffer,
java.lang.String pPath,
java.lang.String pWorkingDirectory)
Resolves a pure relative path against a working directory, handling
".." references.
|
static void |
resolvePureRelativePath(java.lang.StringBuilder pResultBuffer,
java.lang.String pPath,
java.lang.String pWorkingDirectory)
Resolves a pure relative path against a working directory, handling
".." references.
|
static java.lang.String |
resolvePureRelativePath(java.lang.String pPath,
java.lang.String pWorkingDirectory)
Resolves a pure relative path against a working directory, handling
".." references.
|
static java.lang.String |
stripURIArgs(java.lang.String pURI)
Strip off any arguments etc found in the URI.
|
static int |
unescapeUrl(char[] str)
Utility wrapper.
|
static int |
unescapeUrl(char[] str,
int len)
Utility method for converting from a MIME format called
x-www-form-urlencoded . |
static java.lang.String |
unescapeUrlString(java.lang.String pString)
Utility method for converting from a MIME format called
x-www-form-urlencoded to a String. |
static java.lang.String |
unescapeUTF8UrlString(java.lang.String pString)
This is the UTF-8 version of the ISO-8859-1
URL unescape method.
|
static java.lang.String |
URIDirectory(java.lang.String pURI)
Returns the directory part of a URI (everything up to the filename), which
will begin and end with the "/" character.
|
static java.lang.String |
URIFilename(java.lang.String pURI)
Returns the filename part of a URI
|
static java.lang.String |
urlEncodeDisallowedOnly(java.lang.String pUrl)
Percent-encodes only the disallowed characters in a URL.
|
static java.lang.String |
urlFromPunycode(java.lang.String pURL)
Converts a URL containing Punycode or percent-encoded portions into Unicode.
|
static java.lang.String |
urlToPunycode(java.lang.String pURL)
Converts a URL containing non-ASCII characters or numeric
character references into Punycode, with the path and query parameter portions
percent-encoded.
|
public static java.lang.String CLASS_VERSION
protected static final java.lang.String NUM_CHAR_REF_START
protected static final java.lang.String NUM_CHAR_REF_END
protected static final java.lang.String PERCENT
protected static final java.lang.String EMAIL_ADDRESS_DELIMITER
public static java.util.regex.Pattern RESERVED_URI_CHARS
protected static atg.core.net.URLEscaperUTF8 sUrlEscaperUTF8
public static boolean isRelative(java.lang.String pUrl)
public static boolean isIPv6Address(java.lang.String pAddress)
Parameters:
pAddress - The address to be checked
public static boolean needsIPv6Wrapper(java.lang.String pAddress)
Parameters:
pAddress - The address to be checked
public static java.lang.String getAddressForUrl(java.lang.String pAddress)
Parameters:
pAddress - The IP address to be wrapped
public static java.lang.String getModifiedHostValue(java.lang.String pAddress)
Parameters:
pAddress - The IP address from which the '%' suffix is to be removed
public static java.lang.String getUrlPortNumberSeparator(java.lang.String pUrl)
Parameters:
pUrl - The URL to be checked
public static java.lang.String getAddressPortNumberSeparator(java.lang.String pAddress)
Parameters:
pAddress - The address to be checked
public static java.lang.String URIDirectory(java.lang.String pURI)
public static java.lang.String URIFilename(java.lang.String pURI)
public static java.lang.String stripURIArgs(java.lang.String pURI)
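The three signatures above carry no detailed description here, so the following is only a rough sketch of what splitting a URI into directory and filename parts could look like, based on the one-line summaries in the method table. The class UriPartsSketch and its method names are hypothetical, and treating '?', '#' and ';' as the start of the "arguments" is an assumption, not documented behavior.

```java
final class UriPartsSketch {
    /** Cuts the URI at the first '?', '#' or ';' (assumed argument markers). */
    static String stripArgs(String uri) {
        int cut = uri.length();
        for (char marker : new char[] {'?', '#', ';'}) {
            int i = uri.indexOf(marker);
            if (i >= 0 && i < cut) {
                cut = i;
            }
        }
        return uri.substring(0, cut);
    }

    /** Everything up to and including the last '/'. */
    static String directory(String uri) {
        return uri.substring(0, uri.lastIndexOf('/') + 1);
    }

    /** Everything after the last '/'. */
    static String filename(String uri) {
        return uri.substring(uri.lastIndexOf('/') + 1);
    }
}
```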
public static java.lang.String resolvePureRelativePath(java.lang.String pPath, java.lang.String pWorkingDirectory) throws java.lang.IllegalArgumentException
Parameters:
pPath - The pure relative path (i.e. does not start with "/") to be resolved.
pWorkingDirectory - The working directory to resolve against, which should begin and end with the "/" character.
Throws:
java.lang.IllegalArgumentException
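As an illustration of resolving a pure relative path against a working directory, here is a minimal sketch in plain Java. It is not the URLUtils implementation: the class RelativePathSketch is hypothetical, throwing IllegalArgumentException when ".." climbs above the root is an assumption, and a trailing "/" on the path is dropped for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RelativePathSketch {
    static String resolve(String path, String workingDir) {
        if (path.startsWith("/") || !workingDir.startsWith("/") || !workingDir.endsWith("/")) {
            throw new IllegalArgumentException("unexpected path shape");
        }
        // Start from the directories of the working directory.
        Deque<String> segments = new ArrayDeque<>();
        for (String dir : workingDir.split("/")) {
            if (!dir.isEmpty()) {
                segments.addLast(dir);
            }
        }
        // Apply each segment of the relative path in order.
        for (String part : path.split("/")) {
            if (part.equals("..")) {
                if (segments.isEmpty()) {
                    // Assumption: climbing above the root is an error.
                    throw new IllegalArgumentException("too many \"..\" segments");
                }
                segments.removeLast();
            } else if (!part.isEmpty()) {
                segments.addLast(part);
            }
        }
        StringBuilder out = new StringBuilder();
        for (String segment : segments) {
            out.append('/').append(segment);
        }
        return out.toString();
    }
}
```

For example, resolving "../img/logo.png" against "/docs/guide/" walks up one directory and yields "/docs/img/logo.png".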
public static void resolvePureRelativePath(java.lang.StringBuffer pResultBuffer, java.lang.String pPath, java.lang.String pWorkingDirectory) throws java.lang.IllegalArgumentException
Parameters:
pPath - The pure relative path (i.e. does not start with "/") to be resolved.
pWorkingDirectory - The working directory to resolve against, which should begin and end with the "/" character.
Throws:
java.lang.IllegalArgumentException
public static void resolvePureRelativePath(java.lang.StringBuilder pResultBuffer, java.lang.String pPath, java.lang.String pWorkingDirectory) throws java.lang.IllegalArgumentException
Parameters:
pPath - The pure relative path (i.e. does not start with "/") to be resolved.
pWorkingDirectory - The working directory to resolve against, which should begin and end with the "/" character.
Throws:
java.lang.IllegalArgumentException
public static java.lang.String removeRelativePaths(java.lang.String pURL) throws java.lang.IllegalArgumentException
Throws:
java.lang.IllegalArgumentException - if the path name has too many ../'s in it.
public static java.lang.String unescapeUrlString(java.lang.String pString)
Utility method for converting from a MIME format called x-www-form-urlencoded to a String.
Optimized version of java.net.URLDecoder#decode.
To convert to a String, each character is examined in turn:
- The characters 'a' through 'z', 'A' through 'Z', and '0' through '9', and the unreserved delimiters /-_.*!~'() remain the same.
- The plus sign '+' is converted into a space character ' '.
- A sequence of the form "%xy", where xy is the two-digit hexadecimal representation of a byte value, is converted to that byte value.
The conversion of hex escapes to byte values follows hex2Char usage.
The conversion of byte values to Unicode characters uses the ISO-8859-1 encoding.
If the string argument is null, an empty string is returned.
Parameters:
pString - String object to be decoded
null
See Also:
hex2Char( char, char ), unescapeUrl( char[], int ), URLDecoder.decode( String )
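The decoding rules listed for unescapeUrlString can be sketched in a few lines of plain Java. This is an illustration under stated assumptions rather than the library's code: the class UnescapeStringSketch is hypothetical, and copying a malformed "%" escape through unchanged is an assumption the text does not confirm.

```java
final class UnescapeStringSketch {
    static String unescape(String encoded) {
        if (encoded == null) {
            return ""; // null yields an empty string, as the text states
        }
        StringBuilder out = new StringBuilder(encoded.length());
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '+') {
                out.append(' ');
            } else if (c == '%' && i + 2 < encoded.length()) {
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    // In ISO-8859-1 a byte value maps to the char with the same value.
                    out.append((char) ((hi << 4) | lo));
                    i += 2;
                } else {
                    out.append(c); // assumption: malformed escapes are kept as-is
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```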
public static java.lang.String unescapeUTF8UrlString(java.lang.String pString)
Parameters:
pString - String object to be decoded
null
See Also:
#unescapeUrl( String )
public static int unescapeUrl(char[] str)
See Also:
unescapeUrl( char[], int )
public static int unescapeUrl(char[] str, int len)
Utility method for converting from a MIME format called x-www-form-urlencoded.
Optimized version of java.net.URLDecoder#decode.
This version operates on a mutable character array.
Each character is examined in turn:
- The characters 'a' through 'z', 'A' through 'Z', and '0' through '9', and the unreserved delimiters -_.*!~'() remain the same.
- The plus sign '+' is converted into a space character ' '.
- A sequence of the form "%xy", where xy is the two-digit hexadecimal representation of a byte value, is converted to that byte value.
The conversion of hex characters to byte values follows hex2Char usage.
The conversion of byte values to Unicode characters uses the ISO-8859-1 encoding.
Parameters:
str - character array to be unescaped, conversion occurs in place
Returns:
-1 if no change was necessary
See Also:
hex2Char( char, char ), URLDecoder.decode( String )
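In-place decoding works because the decoded text is never longer than the encoded text, so a write index can trail the read index in the same array. The sketch below shows that idea; the class InPlaceUnescapeSketch is hypothetical, and returning the new length when something changed is an assumption (the text only states the -1 case).

```java
final class InPlaceUnescapeSketch {
    /**
     * Decodes the first len chars of buf in place.
     * Returns the decoded length (assumed), or -1 if nothing had to change.
     */
    static int unescapeInPlace(char[] buf, int len) {
        int write = 0;
        boolean changed = false;
        for (int read = 0; read < len; read++) {
            char c = buf[read];
            if (c == '+') {
                c = ' ';
                changed = true;
            } else if (c == '%' && read + 2 < len) {
                int hi = Character.digit(buf[read + 1], 16);
                int lo = Character.digit(buf[read + 2], 16);
                if (hi >= 0 && lo >= 0) {
                    c = (char) ((hi << 4) | lo); // byte value as ISO-8859-1 char
                    read += 2;                   // skip the two hex digits
                    changed = true;
                }
            }
            buf[write++] = c; // write never overtakes read
        }
        return changed ? write : -1;
    }
}
```

A caller would then build the result with new String(buf, 0, n) when n is not -1, and reuse the original contents otherwise.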
public static final java.lang.String escapeUrlString(java.lang.String pStr)
Uses the ISO-8859-1 character encoding.
Equivalent to an optimized version of java.net.URLEncoder#encode.
If the string argument is null, an empty string is returned.
Parameters:
pStr - the String to be translated.
null
See Also:
URLEncoder.encode( String )
public static final java.lang.String escapeUrlString(java.lang.StringBuffer pStrBuf)
public static final java.lang.String escapeUTF8UrlString(java.lang.String pStr)
Uses the UTF8 character encoding.
If the string argument is null, an empty string is returned.
Parameters:
pStr - the String to be translated.
null
See Also:
( String )
public static final void escapeAndAppendUrlString(java.lang.StringBuffer pDestBuffer, java.lang.String pStr)
Parameters:
pDestBuffer - the StringBuffer to which to append the result
pStr - the String to be translated
See Also:
escapeAndAppendUrl( StringBuffer, char[] )
public static final void escapeAndAppendUrlString(java.lang.StringBuilder pDestBuffer, java.lang.String pStr)
Parameters:
pDestBuffer - the StringBuilder to which to append the result
pStr - the String to be translated
See Also:
escapeAndAppendUrl( StringBuffer, char[] )
public static void escapeAndAppendUrl(java.lang.StringBuffer sb, char[] str)
This algorithm is valid only for the lower 8 bits of the character value.
The Unicode range U+0000 - U+00FF is equivalent to ISO-8859-1, so it handles these characters correctly.
Unicode characters above U+00FF are truncated and mapped to a single character from ISO-8859-1.
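To make the truncation concrete, the sketch below escapes a character array while keeping only the lower 8 bits of each char. It is an illustration, not the URLUtils code: the class LowByteEscapeSketch is hypothetical, it appends to a StringBuilder rather than a StringBuffer, and the set of characters left unescaped ("-_.*" plus ASCII letters and digits, with space written as '+') is an assumption borrowed from java.net.URLEncoder.

```java
final class LowByteEscapeSketch {
    // Assumption: same unescaped punctuation as java.net.URLEncoder.
    private static final String SAFE = "-_.*";

    static void escapeAndAppend(StringBuilder dest, char[] chars) {
        for (char c : chars) {
            int b = c & 0xFF; // only the lower 8 bits survive
            if (b == ' ') {
                dest.append('+');
            } else if ((b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                    || (b >= '0' && b <= '9') || SAFE.indexOf(b) >= 0) {
                dest.append((char) b);
            } else {
                dest.append('%')
                    .append(Character.toUpperCase(Character.forDigit(b >> 4, 16)))
                    .append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
            }
        }
    }
}
```

With this sketch U+00E9 becomes "%E9" as expected, while the euro sign U+20AC is cut down to its low byte and comes out as "%AC", which is why characters above U+00FF need a UTF-8 escaper instead.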
Parameters:
sb - the StringBuffer to which to append the result
str - the character array to be translated
public static java.util.Hashtable parseQueryString(java.lang.String pQueryString)
Parses a query string passed from the client to the server and builds a Hashtable object with key-value pairs. Same functionality as javax.servlet.http.HttpUtils#parseQueryString, but this implementation attempts optimization by inlining character unescaping with the key-value delimiter parsing.
The query string should be in the form of a string packaged by the GET or POST method, that is, it should have key-value pairs in the form key=value, with each pair separated from the next by a '&' character.
A key can appear more than once in the query string with different values. However, the key appears only once in the hashtable, with its value being an array of strings containing the multiple values sent by the query string.
The keys and values in the hashtable are stored in their decoded form, so any '+' characters are converted to spaces ' ', and byte values in hexadecimal notation "%xy" are converted to Unicode characters using the ISO-8859-1 encoding.
Parameters:
pQueryString - a string containing the query to be parsed
Returns:
Hashtable object built from the parsed key-value pairs, may be empty, but never null
Throws:
java.lang.IllegalArgumentException - if the query string is invalid
See Also:
parseQuery( char[] ), HttpUtils.parseQueryString( String )
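The behavior described above (repeated keys collected into a String array, values stored decoded) can be sketched with JDK classes only. This is not the optimized URLUtils parser: the class QueryParseSketch is hypothetical, it decodes with java.net.URLDecoder instead of inlined unescaping, it returns a Map rather than a Hashtable, and rejecting a pair without '=' is an assumption about what "invalid" means.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class QueryParseSketch {
    static Map<String, String[]> parse(String query) {
        Map<String, List<String>> collected = new LinkedHashMap<>();
        if (query != null && !query.isEmpty()) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    // Assumption: a pair without '=' makes the query invalid.
                    throw new IllegalArgumentException("missing '=' in: " + pair);
                }
                String key = decode(pair.substring(0, eq));
                String value = decode(pair.substring(eq + 1));
                collected.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
            }
        }
        // Each key appears once; its values become a String array.
        Map<String, String[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : collected.entrySet()) {
            result.put(e.getKey(), e.getValue().toArray(new String[0]));
        }
        return result;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // ISO-8859-1 is always available
        }
    }
}
```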
public static java.util.Hashtable parseQuery(char[] str)
Parses a query string passed from the client to the server and builds a Hashtable object with key-value pairs.
The query string should be in the form of a string packaged by the GET or POST method, that is, it should have key-value pairs in the form key=value, with each pair separated from the next by a '&' character.
A key can appear more than once in the query string with different values. However, the key appears only once in the hashtable, with its value being an array of strings containing the multiple values sent by the query string.
The keys and values in the hashtable are stored in their decoded form, so any '+' characters are converted to spaces ' ', and byte values in hexadecimal notation "%xy" are converted to Unicode characters using the ISO-8859-1 encoding.
Parameters:
str - character array representing a query string
Returns:
Hashtable object built from the parsed key-value pairs, may be empty, but never null
Throws:
java.lang.IllegalArgumentException - if the query string is invalid
public static java.util.Map<java.lang.String,java.lang.String[]> parseQueryStringToMap(java.lang.String pQueryString)
Parses a query string passed from the client to the server and builds a Map object with key-value pairs.
The query string should be in the form of a string packaged by the GET or POST method, that is, it should have key-value pairs in the form key=value, with each pair separated from the next by a '&' character.
A key can appear more than once in the query string with different values. However, the key appears only once in the map, with its value being an array of strings containing the multiple values sent by the query string.
The keys and values in the map are stored in their decoded form, so any '+' characters are converted to spaces ' ', and byte values in hexadecimal notation "%xy" are converted to Unicode characters using the ISO-8859-1 encoding.
Parameters:
pQueryString - the query string to parse.
Returns:
Map object built from the parsed key-value pairs, may be empty, but never null
Throws:
java.lang.IllegalArgumentException - if the query string is invalid
public static java.util.Dictionary<java.lang.String,java.lang.String[]> parseQueryStringToDictionary(java.lang.String pQueryString)
Parses a query string passed from the client to the server and builds a Dictionary object with key-value pairs (the returned Dictionary also implements Map).
The query string should be in the form of a string packaged by the GET or POST method, that is, it should have key-value pairs in the form key=value, with each pair separated from the next by a '&' character.
A key can appear more than once in the query string with different values. However, the key appears only once in the dictionary, with its value being an array of strings containing the multiple values sent by the query string.
The keys and values in the dictionary are stored in their decoded form, so any '+' characters are converted to spaces ' ', and byte values in hexadecimal notation "%xy" are converted to Unicode characters using the ISO-8859-1 encoding.
Parameters:
pQueryString - the query string to parse.
Returns:
Dictionary object built from the parsed key-value pairs, may be empty, but never null
Throws:
java.lang.IllegalArgumentException - if the query string is invalid
public static java.util.Hashtable parsePostData(int len, java.io.InputStream in) throws java.lang.IllegalArgumentException
Parses posted data sent from the client to the server, and builds a Hashtable object with key-value pairs. Same functionality as javax.servlet.http.HttpUtils#parsePostData, but this implementation attempts optimization by inlining character unescaping with the key-value delimiter parsing.
The query string should be in the form of a string packaged by the GET or POST method, that is, it should have key-value pairs in the form key=value, with each pair separated from the next by a '&' character.
A key can appear more than once in the query string with different values. However, the key appears only once in the hashtable, with its value being an array of strings containing the multiple values sent by the query string.
The keys and values in the hashtable are stored in their decoded form, so any '+' characters are converted to spaces ' ', and byte values in hexadecimal notation "%xy" are converted to characters.
The conversion from bytes to characters uses the standard ISO-8859-1 encoding, which is compatible with the version in javax.servlet.http.HttpUtils, and also produces the String format expected by the Dynamo character Converter interface.
The return value will be an empty hashtable if len is not a positive value, or if there was an exception reading the stream, or if the stream ended before all the data was read.
Parameters:
len - an integer specifying the length, in characters, of the data to be parsed from the ServletInputStream argument
in - the ServletInputStream object that contains the data sent from the client
Returns:
Hashtable object built from the parsed key-value pairs (never null)
Throws:
java.lang.IllegalArgumentException - if the stream argument is null
See Also:
parseQuery( char[] ), javax.servlet.http.HttpUtils#parsePostData( int, ServletInputStream )
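The failure rules for reading the posted body (empty result for a non-positive length, a read error, or a short stream; an exception for a null stream) can be sketched as a small reader that hands the text to a query parser afterwards. The class PostBodySketch and its readBody method are hypothetical, and returning an empty string to stand for "nothing to parse" is a simplification of this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class PostBodySketch {
    /** Reads exactly len bytes as ISO-8859-1 text, or "" if that is not possible. */
    static String readBody(int len, InputStream in) {
        if (in == null) {
            throw new IllegalArgumentException("stream is null");
        }
        if (len <= 0) {
            return "";
        }
        byte[] buf = new byte[len];
        int total = 0;
        try {
            while (total < len) {
                int n = in.read(buf, total, len - total);
                if (n < 0) {
                    return ""; // stream ended before all the data was read
                }
                total += n;
            }
        } catch (IOException e) {
            return ""; // a read error also yields an empty result
        }
        return new String(buf, StandardCharsets.ISO_8859_1);
    }
}
```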
public static char hex2Char(char x, char y)
Takes two ASCII characters, x and y, which represent a two-digit hexadecimal number (%xy), and returns the unsigned byte value in the range 0-255.
If the character input values are outside the ASCII range (0x00-0x7f), their values are masked to 7 bits before the look-up.
If the (masked) input ASCII character is not a valid hex character (0-9,a-f,A-F), then it does not contribute to the result value.
If both inputs are invalid, then the result will be zero.
Neither the hex input nor the byte-valued output can represent the full repertoire of Unicode characters. If the byte return value is interpreted directly as a character, it will appear to be from the ISO-8859-1 repertoire, which is equivalent to the subset of Unicode with a zero upper byte.
The name hex2Char is purely for historical reasons; it should really be hex2Byte.
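The rules above (mask to 7 bits, ignore invalid digits, combine high and low nibble) fit in a few lines. The following is a sketch of those rules, not the library's look-up table: the class HexPairSketch and its method names are hypothetical, and Character.digit stands in for the table.

```java
final class HexPairSketch {
    /** Value 0-15 of a hex digit, or 0 when the masked char is not a hex digit. */
    static int digitValue(char c) {
        char masked = (char) (c & 0x7F);          // mask to 7 bits first
        int value = Character.digit(masked, 16);  // -1 for non-hex characters
        return value < 0 ? 0 : value;             // invalid digits contribute nothing
    }

    /** Combines a high-order and a low-order hex digit into a value 0-255. */
    static char hexPairToByte(char high, char low) {
        return (char) ((digitValue(high) << 4) | digitValue(low));
    }
}
```

For example, the pair ('4', '1') gives 0x41, which reads as 'A' when interpreted as an ISO-8859-1 character.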
Parameters:
x - high order digit of hexadecimal representation
y - low order digit of hexadecimal representation
public static java.lang.String encodeEmailAddresses(java.lang.String pAddressString) throws javax.mail.internet.AddressException, java.io.UnsupportedEncodingException
Parameters:
pAddressString -
Throws:
javax.mail.internet.AddressException - if there is an error parsing the address string
java.io.UnsupportedEncodingException
public static java.lang.String emailAddressFromPunycode(java.lang.String pEmailAddress)
Parameters:
pEmailAddress - the email address to decode
public static java.lang.String emailAddressToPunycode(java.lang.String pEmailAddress)
Parameters:
pEmailAddress - the email address to encode
public static java.lang.String processEmailAddressPunycode(java.lang.String pEmailAddress, boolean pEncode)
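The email methods above convert the domain portion of an address to or from Punycode. A rough equivalent can be written with the JDK's java.net.IDN class, shown here as a sketch: the class EmailPunycodeSketch is hypothetical, it leaves the local part untouched, and it does not handle numeric character references, which the URLUtils methods are described as supporting.

```java
import java.net.IDN;

final class EmailPunycodeSketch {
    /** Converts only the domain after the last '@'; encode=true goes to Punycode. */
    static String convertDomain(String email, boolean encode) {
        int at = email.lastIndexOf('@');
        if (at < 0) {
            return email; // assumption: input without '@' is returned unchanged
        }
        String localWithAt = email.substring(0, at + 1);
        String domain = email.substring(at + 1);
        return localWithAt + (encode ? IDN.toASCII(domain) : IDN.toUnicode(domain));
    }
}
```

Calling convertDomain("info@b\u00fccher.example", true) yields "info@xn--bcher-kva.example", and passing that result back with false restores the Unicode form.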
public static java.lang.String urlFromPunycode(java.lang.String pURL)
Parameters:
pURL - the URL to decode
public static java.lang.String urlToPunycode(java.lang.String pURL)
Parameters:
pURL - the URL to encode
protected static java.lang.String processUrlPunycode(java.lang.String pURL, boolean pEncode)
public static java.lang.String urlEncodeDisallowedOnly(java.lang.String pUrl) throws java.io.UnsupportedEncodingException
Parameters:
pUrl - a URL to be percent-encoded
Throws:
java.io.UnsupportedEncodingException - if the UTF-8 encoding is not supported
public static java.lang.String numCharRefToUnicode(java.lang.String pVal)
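Converts any Unicode values in a string encoded as numeric character references into their literal Unicode character values. Numeric character references have the form &#N;, where N is either the decimal or hexadecimal value of a character's Unicode code point.
For instance, the numeric character reference for U+110C (&#x110C; in hexadecimal form) would be converted into the Korean Hangul Jamo character choseong cieuc, ᄌ.
Parameters:
pVal - A string potentially containing numeric character references
public static boolean containsNumCharRefs(java.lang.String pVal)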
Checks to see if a string contains any numeric character references of the form &#N;, where N is either a decimal or hexadecimal value of a character's Unicode code point.
Parameters:
pVal - the string to check
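Detecting and expanding numeric character references, as numCharRefToUnicode and containsNumCharRefs are described as doing, can be sketched with a regular expression. This is an illustration only: the class NumCharRefSketch is hypothetical, and the accepted syntax (decimal &#N; and hexadecimal &#xN;) is an assumption about what the real methods recognize.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumCharRefSketch {
    // Assumption: decimal form &#N; and hexadecimal form &#xN;
    private static final Pattern NCR =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    static boolean containsRefs(String text) {
        return text != null && NCR.matcher(text).find();
    }

    static String toUnicode(String text) {
        Matcher m = NCR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            String literal = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(literal));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Both "&#x110C;" and "&#4364;" expand to the single character U+110C in this sketch.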