Sun Studio 12: C User's Guide

6.5.1 ISO C Translation Phases

The order of these translation phases is specified by ISO C.

First, every trigraph sequence in the source file is replaced. ISO C defines exactly nine trigraph sequences. Each is a three-character sequence that names a character missing from the ISO 646-1983 invariant character set, and they were invented solely as a concession to deficient character sets:

Table 6–1 Trigraph Sequences

Trigraph Sequence    Converts to
??=                  #
??-                  ~
??(                  [
??)                  ]
??!                  |
??<                  {
??>                  }
??/                  \
??'                  ^

ISO C compilers must understand these sequences, but we do not recommend their use. When you use the -xtransition option, the ISO C compiler warns you whenever it replaces a trigraph while in transition (-Xt) mode, even in comments. For example, consider the following:


/* comment *??/
/* still comment? */

The ??/ becomes a backslash. This character and the following newline are removed. The resulting characters are:


/* comment */* still comment? */

The / that begins the second line now completes the */ that ends the comment. The next token is the *, and the rest of that line is no longer part of a comment.

  2. Every backslash/new-line character pair is deleted.

  3. The source file is converted into preprocessing tokens and sequences of white space. Each comment is effectively replaced by a space character.

  4. Every preprocessing directive is handled and all macro invocations are replaced. Each #included source file is run through the earlier phases before its contents replace the directive line.

  5. Every escape sequence (in character constants and string literals) is interpreted.

  6. Adjacent string literals are concatenated.

  7. Every preprocessing token is converted into a regular token; the compiler properly parses these and generates code.

  8. All external object and function references are resolved, resulting in the final program.