String functions perform different actions on Strings, such as converting an entire String to uppercase or removing whitespace from a String.
This table describes the String functions that Transform supports. The same functions are described in the Transform API Reference (Groovydoc).
User Function | Return Data Type | Description |
---|---|---|
concat(String... arguments) | String | Combines a list of String arguments into a single String. |
concatWithToken(String joinToken, String... arguments) | String | Combines a list of String arguments into a single String using a join token. For example, concatWithToken("\|", "merlot", "cabernet", "malbec") would return "merlot\|cabernet\|malbec". |
contains(String originalString, String substring) | Boolean | Determines whether a String contains a substring. For example, contains("Boston", "Bos") would return true. |
find(String originalString, String pattern) | String, null | Returns the first instance of a substring or regular expression within a String. Returns null if no match is found. |
findAll(String originalString, String pattern) | String, null | Returns a (possibly empty) list of all occurrences of a regular expression (in String format) found within a String. |
indexOf(String originalString, String substring) | Integer | Returns the index of a substring within a String. |
isDouble(String s) | Boolean | Determines whether a String is a Double. |
isLong(String s) | Boolean | Determines whether a String is a Long. |
length(String s) | Integer | Returns the length of a String. |
replace(String attribute, String searchString, String replaceString, Boolean useRegex) | String | Replaces every instance of a substring or regular expression within a String with a new text String. The useRegex argument specifies whether searchString is a literal (false, the default) or a regular expression (true). |
splitToSet(String originalString, String delimiter) | String | Splits an original String on a delimiter that is specified as a regular expression. |
stripIndent(String s) | String | Removes leading spaces from a String. |
substring(String s, Integer start, Integer end) | String | Returns a substring from the original String, based on its start point and end point. For example, substring("cabernet", 0, 3) returns cab. |
substring(String s, Integer start) | String | Returns a substring from the original String, based on its start point. The returned substring runs from the start point to the end of the original String. For example, substring("cabernet", 5) returns net. |
toLowerCase(String s, String locale) | String | Converts a String to lowercase. You can optionally specify the String's locale; this defaults to "en". |
toTitleCase(String s, String locale) | String | Converts a String to title case. For example, toTitleCase("sOMe STrING") would return "Some String". You can optionally specify the String's locale; this defaults to "en". Note: For lists of comma-separated values that do not include spaces, only the first item in the list will be converted to title case. For example, toTitleCase("apple,cherry,plum") returns "Apple,cherry,plum". |
toUpperCase(String s, String locale) | String | Converts a String to uppercase. You can optionally specify the String's locale; this defaults to "en". |
trim(String s) | String | Removes leading and trailing whitespace from a String. |
urlDecode(String s, String characterEncoding) | String | Decodes a String from application/x-www-form-urlencoded format using a specific encoding scheme. You can optionally specify the encoding scheme; this defaults to "UTF-8". |
urlEncode(String s, String characterEncoding) | String | Translates a String into application/x-www-form-urlencoded format using a specific encoding scheme. You can optionally specify the encoding scheme; this defaults to "UTF-8". |

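
To make the application/x-www-form-urlencoded format concrete, the following plain Java sketch round-trips a value through the JDK's URLEncoder and URLDecoder. The class and the two helper methods are hypothetical stand-ins named after the table entries; whether Transform delegates to these JDK classes is an assumption, not something the reference states.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical stand-ins for urlEncode and urlDecode, built on the JDK codecs.
class UrlCodecSketch {

    // Encodes s as application/x-www-form-urlencoded using the given charset name.
    static String urlEncode(String s, String characterEncoding) {
        try {
            return URLEncoder.encode(s, characterEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + characterEncoding, e);
        }
    }

    // Decodes an application/x-www-form-urlencoded String back to plain text.
    static String urlDecode(String s, String characterEncoding) {
        try {
            return URLDecoder.decode(s, characterEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + characterEncoding, e);
        }
    }

    public static void main(String[] args) {
        String encoded = urlEncode("merlot & cabernet", "UTF-8");
        System.out.println(encoded);                     // merlot+%26+cabernet
        System.out.println(urlDecode(encoded, "UTF-8")); // merlot & cabernet
    }
}
```

In this format, spaces become plus signs and reserved characters such as the ampersand become percent-encoded bytes, which is why decoding is the exact inverse of encoding.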
Example 20-11 Escaping special characters in regular expressions
Some String functions (such as find, findAll, replace, and splitToSet) use a regular expression (regex) as a String argument. Some characters have special meaning in a regular expression. For example, the dot (.) matches any character except a newline, the dollar sign ($) is an end-of-line anchor, and the vertical pipe (|) separates a series of alternatives.
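To match such a character literally, escape it with a double backslash (\\) or enclose it in a character class ([]). For example, this code, in which the pipe is not escaped and is therefore treated as a regex operator:
def attrs='aa|bb|cc|dd'
splitToSet(attrs, "|")
returns:
d,b,c,a|
With the pipe escaped, this code:
def attrs='aa|bb|cc|dd'
splitToSet(attrs, "\\|")
returns:
dd,aa,bb,cc
Enclosing the pipe in a character class gives the same result. This code:
def attrs='aa|bb|cc|dd'
splitToSet(attrs, "[|]")
returns:
dd,aa,bb,cc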
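The same behavior can be reproduced outside Transform with the java.util.regex engine that Groovy runs on. The sketch below is plain Java (8 or later); the splitToSet helper is a hypothetical stand-in that only assumes the function splits on a regex and collects the unique pieces, and Pattern.quote is shown as one more way to obtain a literal delimiter.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in for splitToSet: split on a regex, keep unique pieces.
class SplitEscapeSketch {

    static Set<String> splitToSet(String original, String delimiterRegex) {
        return new LinkedHashSet<>(Arrays.asList(original.split(delimiterRegex)));
    }

    public static void main(String[] args) {
        String attrs = "aa|bb|cc|dd";

        // Unescaped: "|" is an alternation of two empty patterns, so it matches
        // between every pair of characters and the String falls apart.
        System.out.println(splitToSet(attrs, "|"));                // [a, |, b, c, d]

        // Escaped with a backslash (doubled inside the Java string literal).
        System.out.println(splitToSet(attrs, "\\|"));              // [aa, bb, cc, dd]

        // Enclosed in a character class, where "|" has no special meaning.
        System.out.println(splitToSet(attrs, "[|]"));              // [aa, bb, cc, dd]

        // Quoted as a literal by the regex library itself.
        System.out.println(splitToSet(attrs, Pattern.quote("|"))); // [aa, bb, cc, dd]
    }
}
```

The sketch uses a LinkedHashSet so the printed order is predictable; the order of the values returned by Transform may differ, as the outputs above suggest.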
Example 20-12 Locating String patterns using find and regular expressions
This example shows how to use regular expressions with the find function. Assume you have an attribute amz_desc, and it has the following value:
The future is here people and it has arrived in the form of an LED digital bracelet watch. That's right. You'll never have to live the disappointing life of not owning a digital bracelet watch. This revolutionary piece of technology is not only stylish but it will completely change the way you read time.
You can use the following transformation code, which calls the find function with a regular expression. This script locates a String pattern that begins with "LED" and ends with "h.", with zero or more of any characters (excluding a newline) in between:
find(amz_desc,'LED.*h\\.')
This script produces the following result:
LED digital bracelet watch. That's right. You'll never have to live the disappointing life of not owning a digital bracelet watch.
In the output, you can see that the script matches part of the first sentence, beginning where "LED" occurs. Because .* matches as many characters as possible, the match then extends to the last occurrence of "h.", which is the letter "h" followed by the period that ends the third sentence.
Note also that the script must escape the second ".", because it should be treated as a literal period at the end of the sentence, and not as the regular expression for any character excluding a newline. Typically, a single "\" is used to escape a character. In this case, however, the "\" must be written twice ("\\"). This is because Groovy processes the String literal first and reduces "\\" to a single literal "\", which is then passed to the regular expression engine, where it acts as the escape character for the "." period.
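The greedy behavior of .* can be checked with the following plain Java sketch. The find helper is a hypothetical stand-in that assumes the Transform function returns the first java.util.regex match or null, and the sample text is shortened for illustration. The reluctant form .*? is standard regex syntax rather than something the example above uses; it is included only for contrast.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for find: first regex match in the String, or null.
class FindGreedySketch {

    static String find(String original, String pattern) {
        Matcher m = Pattern.compile(pattern).matcher(original);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String desc = "An LED digital bracelet watch. That's right. "
                + "Never be without a watch. It tells time.";

        // Greedy: .* runs as far as it can, so the match ends at the LAST "h.".
        System.out.println(find(desc, "LED.*h\\."));
        // LED digital bracelet watch. That's right. Never be without a watch.

        // Reluctant: .*? stops at the FIRST "h." after "LED".
        System.out.println(find(desc, "LED.*?h\\."));
        // LED digital bracelet watch.

        // No match returns null, as the function table describes for find.
        System.out.println(find(desc, "OLED"));
    }
}
```

As in the Groovy script, the doubled backslash in the Java string literal produces the single backslash that the regex engine needs to treat the period literally.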
Example 20-13 String replacement using replace
replace(cost,'\\$','',false)
Example 20-14 SubString replacement using trim and replace
The following code removes the County suffix from the pickup_county attribute:
trim(replace(pickup_county,'County',''))
The above code nests one function call inside another to perform multiple actions with a single statement. replace first locates the substring County in the attribute pickup_county and replaces it with an empty String (''), which effectively removes it. trim then removes all leading and trailing whitespace from the result.
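To show why the trim step matters, here is a minimal plain Java sketch of the same two steps. The removeCountySuffix helper and the sample value "Kings County" are hypothetical, and the sketch assumes that a literal replace in Transform behaves like Java's String.replace.

```java
// Hypothetical helper mirroring trim(replace(pickup_county,'County','')).
class TrimReplaceSketch {

    static String removeCountySuffix(String pickupCounty) {
        // Step 1: remove the literal text "County" (this leaves "Kings ").
        String withoutSuffix = pickupCounty.replace("County", "");
        // Step 2: drop the space that the removal left behind.
        return withoutSuffix.trim();
    }

    public static void main(String[] args) {
        System.out.println("[" + removeCountySuffix("Kings County") + "]"); // [Kings]
    }
}
```

Without the second step, the value would keep the space that separated the name from the suffix.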
Example 20-15 SubString replacement using regular expressions
The following code masks the digits in the medallion attribute by replacing each one with 'X':
replace(medallion,'[0-9]','X',true)
The replace function locates all numeric characters in the medallion attribute using the regular expression [0-9], which defines a range of characters. It then replaces every character that matches this pattern with the String X. The true argument specifies that a regular expression is used instead of a literal String.
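The effect of the useRegex argument can be sketched in plain Java as follows. The replace helper is a hypothetical stand-in that assumes the flag switches between regex replacement (String.replaceAll) and literal replacement (String.replace); the medallion value "9Y77" is made up for illustration.

```java
// Hypothetical stand-in for replace(attribute, searchString, replaceString, useRegex).
class ReplaceSketch {

    static String replace(String attribute, String searchString,
                          String replaceString, boolean useRegex) {
        // Note: replaceAll gives "$" and "\" special meaning in the replacement
        // text; the plain "X" used below is not affected by that.
        return useRegex
                ? attribute.replaceAll(searchString, replaceString)
                : attribute.replace(searchString, replaceString);
    }

    public static void main(String[] args) {
        String medallion = "9Y77";

        // Regex mode: [0-9] matches each digit, so every digit is masked.
        System.out.println(replace(medallion, "[0-9]", "X", true));  // XYXX

        // Literal mode: the text "[0-9]" does not occur, so nothing changes.
        System.out.println(replace(medallion, "[0-9]", "X", false)); // 9Y77
    }
}
```

The second call shows what happens if the flag is left at its default: the pattern is searched for as plain text and the value is returned unchanged.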