Solaris 64-bit Developer's Guide

Sign Extension

Unintended sign extension is a common problem when converting code to 64 bits. It is hard to detect before it actually causes a failure because lint(1) does not warn you about it. Furthermore, the type conversion and promotion rules are somewhat obscure. To fix unintended sign extension, use explicit casts to achieve the intended result.

To understand why sign extension occurs, it helps to understand the conversion rules for ANSI C. The conversion rules that seem to cause the most sign extension problems between 32-bit and 64-bit integral values are:

  1. Integral promotion

    A char, short, enumerated type, or bit-field, whether signed or unsigned, can be used in any expression that calls for an int. If an int can hold all possible values of the original type, the value is converted to an int. Otherwise, it is converted to an unsigned int.

  2. Conversion between signed and unsigned integers

    When a negative signed integer is converted to an unsigned integer of the same or larger size, it is first promoted to the signed equivalent of the larger type, then converted to the unsigned value.

For a more detailed discussion of the conversion rules, refer to the ANSI C standard. The standard also defines the usual arithmetic conversions and the rules for the types of integer constants, both of which are useful to know.

When compiled as a 64-bit program, the addr variable in the following example becomes sign-extended, even though both addr and a.base are unsigned types.


Example 4–1 test.c

#include <stdio.h>

struct foo {
    unsigned int base:19, rehash:13;    /* two unsigned bit-fields */
};

int
main(int argc, char *argv[])
{
    struct foo a;
    unsigned long addr;

    a.base = 0x40000;
    addr = a.base << 13;                  /* Sign extension here! */
    printf("addr 0x%lx\n", addr);

    addr = (unsigned int)(a.base << 13);  /* No sign extension here! */
    printf("addr 0x%lx\n", addr);

    return 0;
}

This sign extension occurs because the conversion rules are applied as follows:

  1. a.base is a 19-bit unsigned bit-field. Because an int can hold all of its possible values, the integral promotion rule converts it to an int. Thus, the expression a.base << 13 is of type int, but no sign extension has yet occurred.

  2. The result of a.base << 13 has the bit pattern 0x80000000, which is a negative value when held in an int. Because of the rule for conversion between signed and unsigned integers, this int is converted to a long and then to an unsigned long before being assigned to addr. The sign extension occurs when it is converted from an int to a long.


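Compiling and running test.c as a 64-bit program shows the sign extension in the first line of output:

% cc -o test64 -xarch=v9 test.c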
% ./test64
addr 0xffffffff80000000
addr 0x80000000
%

When test.c is compiled as a 32-bit program, it does not display any sign extension, because int and long are both 32 bits wide and the assignment involves no widening:


% cc -o test32 test.c
% ./test32
addr 0x80000000
addr 0x80000000
%