String functions

String functions perform different actions on Strings, such as converting an entire String to uppercase or removing whitespace from a String.

This table describes the String functions that Transform supports. The same functions are described in the Transform API Reference (Groovydoc).

| User Function | Return Data Type | Description |
| --- | --- | --- |
| concat(String... arguments) | String | Combines a list of String arguments into a single String. |
| concatWithToken(String joinToken, String... arguments) | String | Combines a list of String arguments into a single String using a join token. For example, concatWithToken("|", "merlot", "cabernet", "malbec") would return "merlot|cabernet|malbec". |
| contains(String originalString, String substring) | Boolean | Determines whether a String contains a substring. For example, contains("Boston", "Bos") would return true. |
| find(String originalString, String pattern) | String, null | Returns the first instance of a substring or regular expression within a String. Returns null if no match is found. |
| findAll(String originalString, String pattern) | String, null | Returns a (possibly empty) list of all occurrences of a regular expression (in String format) found within a String. |
| indexOf(String originalString, String substring) | Integer | Returns the index of a substring within a String. |
| isDouble(String s) | Boolean | Determines whether a String is a Double. |
| isLong(String s) | Boolean | Determines whether a String is a Long. |
| length(String s) | Integer | Returns the length of a String. |
| replace(String attribute, String searchString, String replaceString, Boolean useRegex) | String | Replaces every instance of a substring or regular expression within a String with a new String. The useRegex argument specifies whether searchString is a literal (false, the default) or a regular expression (true). |
| splitToSet(String originalString, String delimiter) | String | Splits a String around matches of a regular expression delimiter. |
| stripIndent(String s) | String | Removes leading spaces from a String. |
| substring(String s, Integer start, Integer end) | String | Returns a substring from the original String, based on its start point and end point. For example, substring("cabernet", 0, 3) returns cab. |
| substring(String s, Integer start) | String | Returns the substring that runs from the start point to the end of the original String. For example, substring("cabernet", 5) returns net. |
| toLowerCase(String s, String locale) | String | Converts a String to lowercase. You can optionally specify the String's locale; this defaults to "en". |
| toTitleCase(String s, String locale) | String | Converts a String to title case. For example, toTitleCase("sOMe STrING") would return "Some String". You can optionally specify the String's locale; this defaults to "en". Note: For lists of comma-separated values that do not include spaces, only the first item in the list is converted to title case. For example, toTitleCase("apple,cherry,plum") returns "Apple,cherry,plum". |
| toUpperCase(String s, String locale) | String | Converts a String to uppercase. You can optionally specify the String's locale; this defaults to "en". |
| trim(String s) | String | Removes leading and trailing whitespace from a String. |
| urlDecode(String s, String characterEncoding) | String | Decodes an application/x-www-form-urlencoded String using a specific encoding scheme. You can optionally specify the encoding scheme; this defaults to "UTF-8". |
| urlEncode(String s, String characterEncoding) | String | Translates a String into application/x-www-form-urlencoded format using a specific encoding scheme. You can optionally specify the encoding scheme; this defaults to "UTF-8". |
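
The direction of urlEncode and urlDecode is easy to mix up, so the round trip can be sketched outside the product. The following is plain Java, not Transform code: it assumes the two functions behave like the JDK's java.net.URLEncoder and java.net.URLDecoder, and the class name, method names, and sample value are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlCodecSketch {
    // Assumption: stands in for urlEncode. Produces application/x-www-form-urlencoded text.
    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + charset, e);
        }
    }

    // Assumption: stands in for urlDecode. Turns '+' back into a space and %XX back into characters.
    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("pinot noir & merlot", "UTF-8");
        System.out.println(encoded);                  // pinot+noir+%26+merlot
        System.out.println(decode(encoded, "UTF-8")); // pinot noir & merlot
    }
}
```

Encoding goes from readable text to the escaped form, and decoding reverses it, so decoding the encoded value gives back the original String.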

Example 20-11 Escaping special characters in regular expressions

Some String functions (such as find, findAll, replace, and splitToSet) use a regular expression (regex) as a String argument. Some characters have special meaning in the regular expression. For example, the dot (.) matches any character except the new line, the dollar sign ($) is an end-of-line anchor, and the vertical pipe (|) separates a series of alternatives.

To match a special character literally in a regular expression argument of a Groovy function, you must double-escape it. For example, if you do not escape the pipe character:
def attrs='aa|bb|cc|dd'
splitToSet(attrs, "|")
returns:
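d,b,c,a,|
An unescaped pipe is an empty alternation that matches between every pair of characters, so each character (including the pipe itself) becomes a separate element. If you double-escape the pipe character: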
def attrs='aa|bb|cc|dd'
splitToSet(attrs, "\\|")
returns:
dd,aa,bb,cc
Alternatively, you can enclose a single special character in square brackets (a character class), which has the same effect as double-escaping it. The last example can therefore be written as:
def attrs='aa|bb|cc|dd'
splitToSet(attrs, "[|]")
returns:
dd,aa,bb,cc
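
The three delimiter forms above can be compared outside the product with a small sketch. This is plain Java (8 or later), not Transform code: Groovy uses the same java.util.regex engine, but the helper splitDistinct is a hypothetical stand-in and is only assumed to resemble splitToSet. It keeps first-seen order for readability, whereas the order of a real set is not guaranteed.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class PipeSplitSketch {
    // Hypothetical helper: splits on a regex delimiter and keeps the distinct pieces.
    static Set<String> splitDistinct(String text, String regexDelimiter) {
        return new LinkedHashSet<>(Arrays.asList(text.split(regexDelimiter)));
    }

    public static void main(String[] args) {
        String attrs = "aa|bb|cc|dd";
        // Unescaped pipe: an empty alternation that matches between every character.
        System.out.println(splitDistinct(attrs, "|"));   // [a, |, b, c, d]
        // Double-escaped pipe: the regex engine receives \| and matches a literal pipe.
        System.out.println(splitDistinct(attrs, "\\|")); // [aa, bb, cc, dd]
        // Character class: the pipe loses its special meaning inside brackets.
        System.out.println(splitDistinct(attrs, "[|]")); // [aa, bb, cc, dd]
    }
}
```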

Example 20-12 Locating String patterns using find and regular expressions

This example shows how to use the find function. Assume you have an attribute amz_desc, and it has the following value:
The future is here people and it has arrived in the form of an LED digital bracelet watch. That's right. You'll never 
have to live the disappointing life of not owning a digital bracelet watch. This revolutionary piece of technology is not 
only stylish but it will completely change the way you read time.
You can use the following transformation code, which combines the find function with a regular expression. This script locates a String pattern that begins with "LED" and ends with "h.", with zero or more of any character (excluding a new line) in between:
find(amz_desc,'LED.*h\\.')
This script produces the following result:
LED digital bracelet watch. That's right. You'll never have to live the disappointing life
 of not owning a digital bracelet watch.
In the output, you can see that the match begins partway through the first sentence, at "LED". Because .* is greedy, the match then extends to the last occurrence of "h.", which is a letter "h" followed by a period, at the end of the third sentence.

Note also that the pattern must escape the second ".", so that it is treated as a literal period at the end of a sentence and not as the regular expression metacharacter for any character except a new line. Typically a single "\" escapes a character; in this case, however, it must be written twice ("\\"). The Groovy string literal consumes the first backslash and passes a single "\" to the regular expression engine, which then treats it as the escape character for the period.
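
The greedy match and the effect of the escaped period can be reproduced with a short sketch. This is plain Java rather than Transform code: findFirst is a hypothetical stand-in assumed to behave like find, and the attribute value is assumed to be a single line of text. The reluctant pattern (.*?) is not part of the example above and is included only for contrast.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyFindSketch {
    // The amz_desc value from the example, assumed to contain no line breaks.
    static final String AMZ_DESC =
        "The future is here people and it has arrived in the form of an LED digital bracelet watch. "
        + "That's right. You'll never have to live the disappointing life of not owning a digital "
        + "bracelet watch. This revolutionary piece of technology is not only stylish but it will "
        + "completely change the way you read time.";

    // Hypothetical helper: first regex match, or null when nothing matches.
    static String findFirst(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy .* with an escaped period: runs to the LAST "h." in the text.
        System.out.println(findFirst(AMZ_DESC, "LED.*h\\."));
        // Reluctant .*? stops at the FIRST "h.": LED digital bracelet watch.
        System.out.println(findFirst(AMZ_DESC, "LED.*?h\\."));
        // Unescaped period matches any character, so the match runs on to "...change the".
        System.out.println(findFirst(AMZ_DESC, "LED.*h."));
    }
}
```

The third call shows why the escape matters: without it, the final "." accepts the "e" in "the", and the match ends in the middle of the last sentence.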

Example 20-13 String replacement using replace

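This example removes the dollar sign from the cost attribute. Because the search string is written as an escaped regular expression, the useRegex argument is set to true:
replace(cost,'\\$','',true)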
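
How a dollar sign behaves as a literal search string and as a regular expression can be sketched as follows. This is plain Java, not Transform code: replaceSketch is a hypothetical helper that assumes the useRegex semantics described in the function table (false means literal, true means regular expression), and the cost value is invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarReplaceSketch {
    // Hypothetical helper: useRegex chooses between literal and regex matching.
    static String replaceSketch(String text, String search, String replacement, boolean useRegex) {
        Pattern p = useRegex
            ? Pattern.compile(search)
            : Pattern.compile(search, Pattern.LITERAL);
        // quoteReplacement keeps '$' and '\' in the replacement from being interpreted.
        return p.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String cost = "$1,250.00"; // hypothetical attribute value
        System.out.println(replaceSketch(cost, "\\$", "", true));  // 1,250.00
        System.out.println(replaceSketch(cost, "$", "", false));   // 1,250.00
        // Literal search for the two characters \$ finds nothing.
        System.out.println(replaceSketch(cost, "\\$", "", false)); // $1,250.00
        // Unescaped $ as a regex is an end-of-line anchor, not a dollar sign.
        System.out.println(replaceSketch(cost, "$", "", true));    // $1,250.00
    }
}
```

Under these assumptions, the escaped form belongs with a regular expression search and the bare dollar sign belongs with a literal search; mixing the two leaves the value unchanged.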

Example 20-14 Substring replacement using trim and replace

This example removes the County suffix from the pickup_county attribute:
trim(replace(pickup_county,'County',''))

The above code nests one function call inside another to perform multiple actions in a single statement. replace first locates the substring County in the attribute pickup_county and replaces it with an empty String (''), which removes it. trim then removes all leading and trailing whitespace from the result.
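
The reason trim is needed after the replacement can be seen in a minimal sketch. This is plain Java standing in for the Transform functions; the class, method, and county value are hypothetical.

```java
class CountySuffixSketch {
    // Step 1 of the nested call: literal replacement of the suffix.
    static String removeSuffix(String pickupCounty) {
        return pickupCounty.replace("County", "");
    }

    // Both steps together, mirroring trim(replace(...)).
    static String stripCounty(String pickupCounty) {
        return removeSuffix(pickupCounty).trim();
    }

    public static void main(String[] args) {
        // The inner step leaves the space that preceded the suffix.
        System.out.println("[" + removeSuffix("Kings County") + "]"); // [Kings ]
        System.out.println("[" + stripCounty("Kings County") + "]");  // [Kings]
    }
}
```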

Example 20-15 Substring replacement using regular expressions

The following code masks the digits in the medallion attribute by replacing each one with 'X':

replace(medallion,'[0-9]','X',true)

The replace function locates all numeric characters in the medallion attribute using the regular expression [0-9], which defines a range of characters. It then replaces any characters that match this pattern with the String X. The true argument specifies that a regular expression will be used instead of a literal String.
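
The masking step can be sketched in plain Java, which uses the same regular expression syntax for character ranges. The class name, method name, and medallion value below are hypothetical, and the sketch only assumes that replace with useRegex set to true behaves like String.replaceAll.

```java
class MedallionMaskSketch {
    // Replaces every character in the range 0-9 with X; other characters are kept.
    static String maskDigits(String medallion) {
        return medallion.replaceAll("[0-9]", "X");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("9Y76"));                 // XYXX
        // A literal search looks for the five characters "[0-9]" and finds none.
        System.out.println("9Y76".replace("[0-9]", "X"));       // 9Y76
    }
}
```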