2.11 Strings

2.11.1 String Representation
2.11.2 String Constants
2.11.3 String Assignment
2.11.4 String Conversion
2.11.5 String Comparison

DTrace provides support for tracing and manipulating strings. This section describes the complete set of D language features for declaring and manipulating strings. Unlike ANSI C, D gives strings their own built-in type and operator support, so you can use them easily and unambiguously in your tracing programs.

2.11.1 String Representation

Strings are represented in DTrace as an array of characters terminated by a null byte (that is, a byte whose value is zero, usually written as '\0'). The visible part of the string is of variable length, depending on the location of the null byte, but DTrace stores each string in a fixed-size array so that each probe traces a consistent amount of data. Strings may not exceed the length of this predefined string limit. The default string limit is 256 bytes, but you can change it in your D program or on the dtrace command line by tuning the strsize option. Refer to Chapter 10, Options and Tunables, for more information on tunable DTrace options.
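As a minimal sketch of tuning the limit described above, the following D program raises strsize from within the script. The value 1024 and the file name script.d are illustrative assumptions, not recommendations.

```d
/* Raise the string limit from the default of 256 bytes to 1024 bytes. */
#pragma D option strsize=1024

/*
 * The same option can instead be given on the command line, for example:
 *     dtrace -x strsize=1024 -s script.d
 * (script.d is a hypothetical file name.)
 */
BEGIN
{
    trace("strings in this program may now occupy up to 1024 bytes");
    exit(0);
}
```

Because every string is stored in a fixed-size array, a larger strsize increases the amount of data traced for each string, so it is usually worth raising the limit only as far as the strings you expect to see.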

The D language provides an explicit string type rather than using the type char * to refer to strings. The string type is equivalent to a char * in that it is the address of a sequence of characters, but the D compiler and D functions such as trace provide enhanced capabilities when applied to expressions of type string. For example, the string type removes the ambiguity of the type char * when you need to trace the actual bytes of a string. In the following D statement:

trace(s);

if s is of type char *, DTrace traces the value of the pointer s (that is, it traces an integer address value). In the D statement:

trace(*s);

by the definition of the * operator, the D compiler dereferences the pointer s and traces the single character at that location. These behaviors allow you to manipulate character pointers that refer to either single characters, or to arrays of byte-sized integers that are not strings and do not end with a null byte. In the D statement:

trace(s);

if s is of type string, the string type indicates to the D compiler that you want DTrace to trace a null-terminated string of characters whose address is stored in the variable s. You can also perform lexical comparison of expressions of type string, as described in Section 2.11.5, “String Comparison”.
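The difference between tracing an address and tracing a string can be sketched as follows. This example assumes that the syscall::open:entry probe is available on your system and that its first argument is the user-space address of the path name; on systems that use a different system call to open files, the probe name and argument index would need to change. Note that arg0 is an integer address rather than a char *, so it stands in for the pointer case only by analogy.

```d
syscall::open:entry
{
    /* arg0 holds an address, so this traces an integer value. */
    trace(arg0);

    /*
     * copyinstr() copies the null-terminated string at that user address
     * and returns a value of type string, so this traces the characters.
     */
    trace(copyinstr(arg0));
}
```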

2.11.2 String Constants

String constants are enclosed in pairs of double quotes ("") and are automatically assigned the type string by the D compiler. You can define string constants of any length, limited only by the amount of memory DTrace is permitted to consume on your system. The terminating null byte (\0) is added automatically by the D compiler to any string constants that you declare. The size of a string constant object is the number of bytes associated with the string plus one additional byte for the terminating null byte.

A string constant may not contain a literal newline character. To create strings containing newlines, use the \n escape sequence instead of a literal newline. String constants may also contain any of the special character escape sequences defined for character constants (see Table 2.6, “Character Escape Sequences”).
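For illustration, a short sketch of a string constant that uses escape sequences in place of literal control characters:

```d
BEGIN
{
    /*
     * \n and \t are escape sequences. A literal line break between the
     * double quotes would not be accepted by the D compiler.
     */
    printf("%s", "first line\nsecond line\twith a tab\n");
    exit(0);
}
```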

2.11.3 String Assignment

Unlike the assignment of char * variables, strings are copied by value, not by reference. The string assignment operator = copies the actual bytes of the string from the source operand up to and including the null byte to the variable on the left-hand side, which must be of type string. You can create a new string variable by assigning it an expression of type string. For example, the D statement:

s = "hello";

would create a new variable s of type string and copy the six bytes of the string "hello" into it (five printable characters plus the null byte). String assignment is analogous to the C library function strcpy(), except that if the source string exceeds the limit of the storage of the destination string, the resulting string is automatically truncated by a null byte at this limit.
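The copy-by-value behavior described above can be sketched with two global variables. The names s and t are arbitrary.

```d
BEGIN
{
    s = "hello";    /* creates s with type string and copies six bytes */
    t = s;          /* t receives its own copy of the bytes, not a reference to s */
    s = "world";    /* overwriting s does not affect the copy held in t */

    printf("s=%s t=%s\n", s, t);
    exit(0);
}
```

If assignment were by reference, as with char * in C, both variables would refer to the same storage after the second statement; in D, each string variable has its own fixed-size storage.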

You can also assign to a string variable an expression of a type that is compatible with strings. In this case, the D compiler automatically promotes the source expression to the string type and performs a string assignment. The D compiler permits any expression of type char * or of type char[n] (that is, a scalar array of char of any size) to be promoted to a string.

2.11.4 String Conversion

Expressions of other types may be explicitly converted to type string by using a cast expression or by applying the special stringof operator, which are equivalent in meaning:

s = (string) expression;

s = stringof (expression);

The stringof operator binds very tightly to the operand on its right-hand side. Typically, parentheses are used to surround the expression for clarity, although they are not strictly necessary.

Any expression of a scalar type, such as a pointer or an integer, or any scalar array address, may be converted to string. Expressions of other types, such as void, may not be converted to string. If you erroneously convert an invalid address to a string, the DTrace safety features will prevent you from damaging the system or DTrace, but you might end up tracing a sequence of undecipherable characters.
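The two conversion forms can be sketched side by side. This example assumes a Linux kernel in which curthread points to a task structure with a character array member named comm; on other systems, the member name differs and the expression would need to be adjusted.

```d
BEGIN
{
    /* Both statements convert the same char array to type string. */
    a = stringof(curthread->comm);
    b = (string)(curthread->comm);

    printf("%s %s\n", a, b);
    exit(0);
}
```

The parentheses around the operand are included only for clarity, as noted above.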

2.11.5 String Comparison

D overloads the binary relational operators and permits them to be used for string comparisons as well as integer comparisons. The relational operators perform string comparison whenever both operands are of type string, or when one operand is of type string and the other operand can be promoted to type string, as described in Section 2.11.3, “String Assignment”. Table 2.14, “D Relational Operators for Strings” lists the relational operators that can be used to compare strings.

Table 2.14 D Relational Operators for Strings

Operator   Description
--------   -----------
<          Left-hand operand is less than right-hand operand
<=         Left-hand operand is less than or equal to right-hand operand
>          Left-hand operand is greater than right-hand operand
>=         Left-hand operand is greater than or equal to right-hand operand
==         Left-hand operand is equal to right-hand operand
!=         Left-hand operand is not equal to right-hand operand


As with integers, each operator evaluates to a value of type int, which is equal to one if the condition is true or zero if it is false.

The relational operators compare the two input strings byte-by-byte, similar to the C library routine strcmp(). Each byte is compared using its corresponding integer value in the ASCII character set, as shown in the ascii(7) manual page, until a null byte is read or the maximum string length is reached. Some example D string comparisons and their results are:

D string comparison        Result
-------------------        ------
"coffee" < "espresso"      Returns 1 (true)
"coffee" == "coffee"       Returns 1 (true)
"coffee" >= "mocha"        Returns 0 (false)
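String comparison is most often used in probe predicates. The following sketch counts write system calls made by processes whose name is bash; the process name and the aggregation name writes are illustrative assumptions.

```d
syscall::write:entry
/execname == "bash"/
{
    /*
     * execname is a built-in variable of type string, so == performs a
     * byte-by-byte string comparison rather than a pointer comparison.
     */
    @writes[execname] = count();
}
```

Because the predicate compares strings rather than addresses, it matches any process with that name, regardless of where the name is stored in memory.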

Note

Seemingly identical Unicode strings might compare as being different if one or the other of the strings is not normalized.