One of the choices made by Dennis Ritchie in the design of C was to give compilers a license to rearrange expressions involving adjacent operators that are mathematically commutative and associative, even in the presence of parentheses. This is explicitly noted in the appendix of The C Programming Language by Kernighan and Ritchie. However, ISO C does not grant compilers this same freedom.
This section discusses the differences between these two definitions of C and clarifies the distinctions between an expression’s side effects, grouping, and evaluation by considering the expression statement from the following code fragment.
    int i, *p, f(void), g(void);
    /*...*/
    i = *++p + f() + g();
The side effects of an expression are its modifications to memory and its accesses to volatile qualified objects. The side effects in the above expression are the updating of i and p and any side effects contained within the functions f() and g().
An expression’s grouping is the way values are combined with other values and operators. The above expression’s grouping is primarily the order in which the additions are performed.
An expression’s evaluation includes everything necessary to produce its resulting value. To evaluate an expression, all specified side effects must occur somewhere between the previous and next sequence points, and the specified operations must be performed with a particular grouping. For the above expression, the updating of i and p must occur after the previous statement and by the ; of this expression statement; the calls to the functions can occur in either order, at any time after the previous statement but before their return values are used. In particular, the operators that cause memory to be updated are not required to assign the new value before the value of the operation is used.
The K&R C rearrangement license applies to the above expression because addition is mathematically commutative and associative. To distinguish between regular parentheses and the actual grouping of an expression, the left and right curly braces designate grouping. The three possible groupings for the expression are:
    i = { {*++p + f()} + g() };
    i = { *++p + {f() + g()} };
    i = { {*++p + g()} + f() };
All of these are valid given K&R C rules. Moreover, all of these groupings are valid even if the expression were written instead, for example, in either of these ways:
    i = *++p + (f() + g());
    i = (g() + *++p) + f();
If this expression is evaluated on an architecture on which either overflows cause an exception or addition and subtraction are not inverses across an overflow, these three groupings behave differently when one of the additions overflows.
For such expressions on these architectures, the only recourse available in K&R C was to split the expression to force a particular grouping. The following are possible rewrites that respectively enforce the above three groupings:
    i = *++p; i += f(); i += g();
    i = f(); i += g(); i += *++p;
    i = *++p; i += g(); i += f();
ISO C does not allow the rearrangement of operations that are mathematically commutative and associative but are not actually so on the target architecture. Thus, the precedence and associativity of the ISO C grammar completely describe the grouping for all expressions; all expressions must be grouped as they are parsed. The expression under consideration is grouped in this manner:
    i = { {*++p + f()} + g() };
This code still does not mean that f() must be called before g(), or that p must be incremented before g() is called.
In ISO C, expressions need not be split to guard against unintended overflows.
ISO C is often erroneously described as honoring parentheses or evaluating according to parentheses due to an incomplete understanding or an inaccurate presentation.
Since ISO C expressions simply have the grouping specified by their parsing, parentheses still only serve as a way of controlling how an expression is parsed; the natural precedence and associativity of expressions carry exactly the same weight as parentheses.
The above expression could have been written as:
    i = (((*(++p)) + f()) + g());
with no different effect on its grouping or evaluation.
There were several reasons for the K&R C rearrangement rules:
The rearrangements provide many more opportunities for optimizations, such as compile-time constant folding.
The rearrangements do not change the result of integral-typed expressions on most machines.
Some of the operations are both mathematically and computationally commutative and associative on all machines.
The ISO C Committee eventually became convinced that the rearrangement rules were intended to be an instance of the as if rule when applied to the described target architectures. ISO C’s as if rule is a general license that permits an implementation to deviate arbitrarily from the abstract machine description as long as the deviations do not change the behavior of a valid C program.
Thus, all the binary bitwise operators (other than shifting) are allowed to be rearranged on any machine because there is no way to notice such regroupings. On typical two’s-complement machines in which overflow wraps around, integer expressions involving multiplication or addition can be rearranged for the same reason.
Therefore, this change in C does not have a significant impact on most C programmers.