C++ Migration Guide

String Literals and char*

Some history might help you understand this subtle issue. Standard C introduced the const keyword and the concept of constant objects, neither of which was present in the original C language ("K&R" C). A string literal such as "Hello world" logically should be const in order to prevent nonsensical results, as in the following example.


#define GREETING "Hello world"
char* greet = GREETING; // No compiler complaint
greet[0] = 'J';
printf("%s", GREETING); // Prints "Jello world" on some systems

In both C and C++, the results of attempting to modify a string literal are undefined. The previous example produces the odd result shown if the implementation chooses to use the same writable storage for identical string literals.

Because so much then-existing code looked like the second line in the preceding example, the C Standards Committee in 1989 did not wish to make string literals const. The C++ language originally followed the C language rule. The C++ Standards Committee later decided that the C++ goal of type safety was more important, and changed the rule.

In standard C++, string literals are constant and have type const char[]. The second line of code in the previous example is not valid in standard C++. Similarly, a string literal should no longer be passed to a function parameter declared as char*. However, the C++ standard also provides for a deprecated conversion of a string literal from const char[] to char*. Some examples are:


char *p1 = "Hello";         // Formerly OK, now deprecated
const char* p2 = "Hello";  // OK
void f(char*);
f(p1);         // Always OK, since p1 is not declared const
f(p2);         // Always an error, passing const char* to char*
f("Hello");   // Formerly OK, now deprecated
void g(const char*);
g(p1);        // Always OK
g(p2);        // Always OK
g("Hello");  // Always OK

If a function does not modify, directly or indirectly, a character array that is passed as an argument, the parameter should be declared const char* (or const char[]). You might find that the need to add const modifiers propagates through the program; as you add modifiers, still more become necessary. (This phenomenon is sometimes called "const poisoning.")

C++ 5.0 in standard mode issues a warning about the deprecated conversion of a string literal to char*. If you were careful to use const wherever it was appropriate in your existing programs, they probably compile without these warnings under the new rules.

For function overloading purposes, a string literal is always regarded as const in standard mode. For example:


void f(char*);
void f(const char*);
f("Hello"); // which f gets called?

If the above example is compiled in compatibility mode (or with the 4.2 compiler), function f(char*) is called. If compiled in standard mode, function f(const char*) is called.

In standard mode, the compiler puts string literals in read-only memory by default. If you then attempt to modify a string literal (which the automatic conversion to char* makes possible), the program aborts with a memory violation.

With the following example, the 4.2 compiler and the 5.0 compiler in compatibility mode put the string literal in writable memory. The program will run, although it technically has undefined behavior. The 5.0 compiler in standard mode puts the string literal in read-only memory by default, and the program aborts with a memory fault. You should therefore heed all warnings about conversion of string literals, and try to fix your program so the conversions do not occur. Such changes will ensure your program is correct for every C++ implementation.


void f(char* p) { p[0] = 'J'; }

int main()
{
    f("Hello"); // conversion from const char[] to char*
}

You can change the compiler behavior with the use of a compiler option:

You might find it convenient to use the standard C++ string class instead of C-style strings. The C++ string class does not have the problems associated with string literals, because standard string objects can be declared separately as const or not, and can be passed by reference, by pointer, or by value to functions.