String functions

String functions perform different actions on Strings, such as converting an entire String to uppercase or removing whitespace from a String.

This table describes the String functions that Transform supports. The same functions are described in the Transform API Reference (Groovydoc).

User Function	Return Data Type	Description
`concat(String... arguments)`	`String`	Combines a list of String arguments into a single String.
`concatWithToken(String joinToken, String... arguments)`	`String`	Combines a list of String arguments into a single String using a join token. For example, `concatWithToken("|", "merlot", "cabernet", "malbec")` would return `"merlot|cabernet|malbec"`.
`contains(String originalString, String substring)`	`Boolean`	Determines whether a String contains a substring. For example, `contains("Boston", "Bos")` would return `true`.
`find(String originalString, String pattern)`	`String`, `null`	Returns the first instance of a substring or regular expression within a String. Returns `null` if no match is found.
`findAll(String originalString, String pattern)`	`String`, `null`	Returns a (possibly empty) list of all occurrences of a regular expression (in String format) found within a String.
`indexOf(String originalString, String substring)`	`Integer`	Returns the index of a substring within a String.
`isDouble(String s)`	`Boolean`	Determines whether a String is a `Double`.
`isLong(String s)`	`Boolean`	Determines whether a String is a `Long`.
`length(String s)`	`Integer`	Returns the length of a String.
`replace(String attribute, String searchString, String replaceString, Boolean useRegex)`	`String`	Replaces every instance of a substring or regular expression within a String with a new text String. The `useRegex` argument specifies whether `searchString` is a literal (`false`, the default) or a regular expression (`true`).
`splitToSet(String originalString, String delimiter)`	`String`	Splits an original String into a set of substrings, using a regular expression as the delimiter.
`stripIndent(String s)`	`String`	Removes leading spaces from a String.
`substring(String s, Integer start, Integer end)`	`String`	Returns a substring from the original String, based on its start point and end point. For example, `substring("cabernet", 0, 3)` returns `cab`.
`substring(String s, Integer start)`	`String`	Returns a substring from the original String, based on its start point. The returned substring will be from the start point to the end of the original String. For example, `substring("cabernet", 5)` returns `net`.
`toLowerCase(String s, String locale)`	`String`	Converts a String to lowercase. You can optionally specify the String's locale; this defaults to `"en"`.
`toTitleCase(String s, String locale)`	`String`	Converts a String to title case. For example, `toTitleCase("sOMe STrING")` would return `"Some String"`. You can optionally specify the String's locale; this defaults to `"en"`. Note: For lists of comma-separated values that do not include spaces, only the first item in the list will be converted to title case. For example, `toTitleCase("apple,cherry,plum")` returns `"Apple,cherry,plum"`.
`toUpperCase(String s, String locale)`	`String`	Converts a String to uppercase. You can optionally specify the String's locale; this defaults to `"en"`.
`trim(String s)`	`String`	Removes leading and trailing whitespace from a String.
`urlDecode(String s, String characterEncoding)`	`String`	Decodes a String from application/x-www-form-urlencoded format using a specific encoding scheme. You can optionally specify the encoding scheme; this defaults to `"UTF-8"`.
`urlEncode(String s, String characterEncoding)`	`String`	Translates a String into application/x-www-form-urlencoded format using a specific encoding scheme. You can optionally specify the encoding scheme; this defaults to `"UTF-8"`.
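
The examples given in the table can be checked outside Transform. The following is a minimal sketch in plain Groovy. It uses standard JDK and Groovy String methods as stand-ins for the Transform functions, which is an assumption: the Transform functions themselves are available only inside a transformation script, and their internals may differ.

```groovy
// Plain Groovy/JDK stand-ins, not the Transform functions themselves.

// Comparable to concatWithToken("|", "merlot", "cabernet", "malbec")
def joined = ['merlot', 'cabernet', 'malbec'].join('|')
assert joined == 'merlot|cabernet|malbec'

// Comparable to contains("Boston", "Bos")
def hasPrefix = 'Boston'.contains('Bos')
assert hasPrefix

// Comparable to substring: the start index is inclusive, the end index is exclusive
def firstThree = 'cabernet'.substring(0, 3)
def fromFive = 'cabernet'.substring(5)
assert firstThree == 'cab'
assert fromFive == 'net'

// Comparable to trim: removes leading and trailing whitespace only
def trimmed = '  merlot  '.trim()
assert trimmed == 'merlot'
```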

Example 20-11 Escaping special characters in regular expressions

Some String functions (such as find, findAll, replace, and splitToSet) accept a regular expression (regex) as a String argument. Some characters have a special meaning in a regular expression. For example, the dot (.) matches any character except a newline, the dollar sign ($) is an end-of-line anchor, and the vertical pipe (|) separates a series of alternatives.

To match a special character literally in a regular expression that you pass to a Groovy function, you must double-escape it in the argument. For example, if you do not escape the pipe character:

def attrs='aa|bb|cc|dd'
splitToSet(attrs, "|")

returns:

d,b,c,a|

If you double-escape the pipe character:

def attrs='aa|bb|cc|dd'
splitToSet(attrs, "\\|")

returns:

dd,aa,bb,cc

Alternatively, you can enclose a single special character in square brackets, which has the same effect as double-escaping the character. Therefore, the last example can be written as:

def attrs='aa|bb|cc|dd'
splitToSet(attrs, "[|]")

returns:

dd,aa,bb,cc
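
The three delimiter forms above can be compared in plain Groovy. This is a sketch only: it uses Java's String.split and a conversion to a Set as a stand-in for splitToSet, on the assumption that splitToSet applies the delimiter as an ordinary Java regular expression. It also assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty element.

```groovy
def attrs = 'aa|bb|cc|dd'

// Unescaped: the regex "|" is an alternation of two empty patterns. It matches
// the empty string between characters, so the input is split into single characters.
def unescaped = attrs.split("|") as Set
assert unescaped == ['a', 'b', 'c', 'd', '|'] as Set

// Double-escaped: the Groovy String "\\|" holds the two characters \|,
// which the regex engine reads as a literal pipe.
def escaped = attrs.split("\\|") as Set
assert escaped == ['aa', 'bb', 'cc', 'dd'] as Set

// Character class: [|] also matches a literal pipe.
def bracketed = attrs.split("[|]") as Set
assert bracketed == escaped
```

Because a set has no defined order, the elements may be listed in any order, as in the outputs shown above.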

Example 20-12 Locating String patterns using find and regular expressions

This example shows how to use the find function. Assume you have an attribute amz_desc, and it has the following value:

The future is here people and it has arrived in the form of an LED digital bracelet watch. That's right. You'll never 
have to live the disappointing life of not owning a digital bracelet watch. This revolutionary piece of technology is not 
only stylish but it will completely change the way you read time.

You can use the following transformation code, which combines the find function with a regular expression. This script locates a String pattern that begins with "LED" and ends with "h.", with zero or more of any character (excluding a newline) in between:

find(amz_desc,'LED.*h\\.')

This script produces the following result:

LED digital bracelet watch. That's right. You'll never have to live the disappointing life
 of not owning a digital bracelet watch.

In the output, you can see that the script matches part of the first sentence because it starts with "LED". Because .* matches as many characters as possible, the match then extends to the last occurrence of "h.", which is the letter "h" followed by a period at the end of a sentence.

Note also that the script must escape the second ".", because it should be treated as a literal period at the end of the sentence, not as the regular expression metacharacter that matches any character except a newline. Typically, a single "\" is used to escape a character. In this case, however, "\" must be written twice ("\\"). The Groovy String literal consumes the first "\" as its own escape character, so that a single literal "\" reaches the regular expression, where it escapes the period.
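
The greedy matching and the double escape can be reproduced in plain Groovy. In this sketch, firstMatch is a hypothetical helper built on java.util.regex that stands in for find (first match, or null if there is none), and amzDesc holds the attribute value as a single line. Neither name comes from Transform.

```groovy
// The attribute value from the example, written as one line of text.
def amzDesc = "The future is here people and it has arrived in the form of an LED digital bracelet watch. " +
        "That's right. You'll never have to live the disappointing life of not owning a digital bracelet watch. " +
        "This revolutionary piece of technology is not only stylish but it will completely change the way you read time."

// In a single-quoted Groovy String, '\\' denotes one backslash,
// so the regex engine receives the eight characters LED.*h\.
def pattern = 'LED.*h\\.'
assert pattern.length() == 8

// Hypothetical stand-in for find: returns the first match, or null if none.
def firstMatch = { String text, String regex ->
    def m = text =~ regex
    m.find() ? m.group() : null
}

// Greedy .* runs to the LAST "h." in the text.
def greedy = firstMatch(amzDesc, pattern)
assert greedy == "LED digital bracelet watch. That's right. You'll never have to live " +
        "the disappointing life of not owning a digital bracelet watch."

// A reluctant quantifier (.*?) stops at the FIRST "h." instead.
def reluctant = firstMatch(amzDesc, 'LED.*?h\\.')
assert reluctant == 'LED digital bracelet watch.'

// No match yields null.
def missing = firstMatch(amzDesc, 'QLED')
assert missing == null
```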

Example 20-13 String replacement using replace

This example removes the dollar sign ($) from the cost attribute by replacing it with an empty String:

replace(cost,'\\$','',true)
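
The difference between a literal search String and a regular expression can be sketched in plain Groovy. Here Java's String.replace (literal) and String.replaceAll (regex) stand in for replace with useRegex set to false and true respectively. This mapping is an assumption, and the cost value is hypothetical.

```groovy
// Hypothetical attribute value.
def cost = '$12.50'

// Literal search: the dollar sign needs no escaping.
def literal = cost.replace('$', '')
assert literal == '12.50'

// Regex search with an unescaped $: it is an end-of-line anchor,
// so only an empty position is "replaced" and the text is unchanged.
def anchorOnly = cost.replaceAll('$', '')
assert anchorOnly == '$12.50'

// Regex search with the double-escaped form: matches the dollar sign itself.
def escapedRegex = cost.replaceAll('\\$', '')
assert escapedRegex == '12.50'
```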

Example 20-14 Substring replacement using trim and replace

This example removes the County suffix from the pickup_county attribute:

trim(replace(pickup_county,'County',''))

The above code nests one function call inside another to perform multiple actions with a single statement. replace first locates the substring County in the attribute pickup_county and replaces it with an empty String (''), which removes it. trim then removes all leading and trailing whitespace from the result.
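
The need for trim becomes visible when the intermediate result is inspected. This is a plain Groovy sketch with a hypothetical attribute value, using the JDK String methods replace and trim as stand-ins for the Transform functions of the same name.

```groovy
// Hypothetical attribute value.
def pickupCounty = 'Kings County'

// Removing the suffix leaves the space that separated it from the name.
def withoutSuffix = pickupCounty.replace('County', '')
assert withoutSuffix == 'Kings '

// trim removes that trailing space.
def cleaned = withoutSuffix.trim()
assert cleaned == 'Kings'
```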

Example 20-15 Substring replacement using regular expressions

The following code masks the digits in the medallion attribute by replacing each one with 'X':

replace(medallion,'[0-9]','X',true)

The replace function locates all numeric characters in the medallion attribute using the regular expression [0-9], which defines a range of characters. It then replaces any characters that match this pattern with the String X. The true argument specifies that a regular expression will be used instead of a literal String.
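
The masking behavior can be sketched in plain Groovy with a hypothetical medallion value. Java's String.replaceAll stands in for replace with useRegex set to true, and String.replace for the literal case. Both mappings are assumptions about how the Transform function behaves.

```groovy
// Hypothetical attribute value.
def medallion = '9Y42'

// Regex search: every character in the range 0-9 is replaced individually.
def masked = medallion.replaceAll('[0-9]', 'X')
assert masked == 'XYXX'

// \d is an equivalent character class; the backslash is doubled in the Groovy String.
def maskedShorthand = medallion.replaceAll('\\d', 'X')
assert maskedShorthand == masked

// Literal search: the text "[0-9]" does not occur, so nothing changes.
def unmasked = medallion.replace('[0-9]', 'X')
assert unmasked == '9Y42'
```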