A P P E N D I X C - Implementation-Defined ISO/IEC C Behavior

The ISO/IEC 9899:1990, Programming Languages - C standard specifies the form and establishes the interpretation of programs written in C. However, this standard leaves a number of issues as implementation-defined, that is, as varying from compiler to compiler. This appendix describes how this compiler handles each of those issues. The descriptions can be readily compared to the ISO/IEC 9899:1990 standard itself:

Each issue uses the same section text as found in the ISO standard.

Each issue is preceded by its corresponding section number in the ISO standard.

C.1 Implementation Compared to the ISO Standard

C.1.1 Translation (G.3.1)

The numbers in parentheses correspond to section numbers in the ISO/IEC 9899:1990 standard.

(5.1.1.3) Identification of diagnostics:

Error messages have the following format:

filename, line line number: message

Warning messages have the following format:

filename, line line number: warning message

Where:

filename is the name of the file containing the error or warning

line number is the number of the line on which the error or warning is found

message is the diagnostic message

C.1.2 Environment (G.3.2)

(5.1.2.2.1) Semantics of arguments to `main`:

int main (int argc, char *argv[])

....

argc is the number of command-line arguments with which the program is invoked, counted after any shell expansion. argc is always at least 1, because the name of the program counts as the first argument.

argv is an array of pointers to the command-line arguments.

(5.1.2.3) What constitutes an interactive device:

An interactive device is one for which the system library call isatty() returns a nonzero value.

C.1.3 Identifiers (G.3.3)

(6.1.2) The number of significant initial characters (beyond 31) in an identifier without external linkage:

The first 1,023 characters are significant. Identifiers are case-sensitive.

(6.1.2) The number of significant initial characters (beyond 6) in an identifier with external linkage:

The first 1,023 characters are significant. Identifiers are case-sensitive.

C.1.4 Characters (G.3.4)

(5.2.1) The members of the source and execution character sets, except as explicitly specified in the Standard:

Both sets are identical to the ASCII character sets, plus locale-specific extensions.

(5.2.1.2) The shift states used for the encoding of multibyte characters:

There are no shift states.

(5.2.4.2.1) The number of bits in a character in the execution character set:

There are 8 bits in a character for the ASCII portion; locale-specific multiple of 8 bits for locale-specific extended portion.

(6.1.3.4) The mapping of members of the source character set (in character and string literals) to members of the execution character set:

Mapping is identical between source and execution characters.

(6.1.3.4) The value of an integer character constant that contains a character or escape sequence not represented in the basic execution character set or the extended character set for a wide character constant:

It is the numerical value of the rightmost character. For example, '\q' equals 'q'. A warning is emitted if such an escape sequence occurs.

(6.1.3.4) The value of an integer character constant that contains more than one character or a wide character constant that contains more than one multibyte character:

A multiple-character constant that is not an escape sequence has a value derived from the numeric values of each character.

(6.1.3.4) The current locale used to convert multibyte characters into corresponding wide characters (codes) for a wide character constant:

The valid locale specified by LC_ALL, LC_CTYPE, or LANG environment variable.

(6.2.1.1) Whether a plain `char` has the same range of values as `signed char` or `unsigned char`:

A char is treated as a signed char (SPARC) (Intel).

C.1.5 Integers (G.3.5)

(6.1.2.5) The representations and sets of values of the various types of integers:


Integer	Bits	Minimum	Maximum
`char` (SPARC) (Intel)	8	-128	127
`signed char`	8	-128	127
`unsigned char`	8	0	255
`short`	16	-32768	32767
`signed short`	16	-32768	32767
`unsigned short`	16	0	65535
`int`	32	-2147483648	2147483647
`signed int`	32	-2147483648	2147483647
`unsigned int`	32	0	4294967295
`long` (SPARC) `v8`	32	-2147483648	2147483647
`long` (SPARC) `v9`	64	-9223372036854775808	9223372036854775807
`signed long` (SPARC) `v8`	32	-2147483648	2147483647
`signed long` (SPARC) `v9`	64	-9223372036854775808	9223372036854775807
`unsigned long` (SPARC) `v8`	32	0	4294967295
`unsigned long` (SPARC) `v9`	64	0	18446744073709551615
`long long`^[1]	64	-9223372036854775808	9223372036854775807
`signed long long`^[1]	64	-9223372036854775808	9223372036854775807
`unsigned long long`^[1]	64	0	18446744073709551615

(6.2.1.2) The result of converting an integer to a shorter signed integer, or the result of converting an unsigned integer to a signed integer of equal length, if the value cannot be represented:

When an integer is converted to a shorter signed integer, the low order bits are copied from the longer integer to the shorter signed integer. The result may be negative.

When an unsigned integer is converted to a signed integer of equal size, the low order bits are copied from the unsigned integer to the signed integer. The result may be negative.

(6.3) The results of bitwise operations on signed integers:

The result of a bitwise operation applied to a signed type is the bitwise operation of the operands, including the sign bit. For the bitwise AND operator, for example, each bit in the result is set if, and only if, the corresponding bit is set in both operands.

(6.3.5) The sign of the remainder on integer division:

The result is the same sign as the dividend; thus, the remainder of -23/4 is -3.

(6.3.7) The result of a right shift of a negative-valued signed integral type:

A right shift of a negative value is a signed (arithmetic) right shift: the sign bit is copied into the vacated high-order bits.

C.1.6 Floating-Point (G.3.6)

(6.1.2.5) The representations and sets of values of the various types of floating-point numbers:


float
Bits	32
Min	1.17549435E-38
Max	3.40282347E+38
Epsilon	1.19209290E-07


double
Bits	64
Min	2.2250738585072014E-308
Max	1.7976931348623157E+308
Epsilon	2.2204460492503131E-16


long double
Bits	128 (SPARC) 80 (Intel)
Min	3.362103143112093506262677817321752603E-4932 (SPARC) 3.3621031431120935062627E-4932 (Intel)
Max	1.189731495357231765085759326628007016E+4932 (SPARC) 1.1897314953572317650213E+4932 (Intel)
Epsilon	1.925929944387235853055977942584927319E-34 (SPARC) 1.0842021724855044340075E-19 (Intel)

(6.2.1.3) The direction of truncation when an integral number is converted to a floating-point number that cannot exactly represent the original value:

Numbers are rounded to the nearest value that can be represented.

(6.2.1.4) The direction of truncation or rounding when a floating-point number is converted to a narrower floating-point number:

Numbers are rounded to the nearest value that can be represented.

C.1.7 Arrays and Pointers (G.3.7)

(6.3.3.4, 7.1.1) The type of integer required to hold the maximum size of an array; that is, the type of the `sizeof` operator, `size_t`:

unsigned int as defined in stddef.h.

unsigned long for -xarch=v9

(6.3.4) The result of casting a pointer to an integer, or vice versa:

The bit pattern does not change for pointers and values of type int, long, unsigned int and unsigned long.

(6.3.6, 7.1.1) The type of integer required to hold the difference between two pointers to members of the same array, `ptrdiff_t`:

int as defined in stddef.h.

long for -xarch=v9

C.1.8 Registers (G.3.8)

(6.5.1) The extent to which objects can actually be placed in registers by use of the `register` storage-class specifier:

The number of effective register declarations depends on patterns of use and definition within each function and is bounded by the number of registers available for allocation. Neither the compiler nor the optimizer is required to honor register declarations.

C.1.9 Structures, Unions, Enumerations, and Bit-Fields (G.3.9)

(6.3.2.3) A member of a union object is accessed using a member of a different type:

The bit pattern stored in the union member is accessed, and the value interpreted, according to the type of the member by which it is accessed.

(6.5.2.1) The padding and alignment of members of structures.


Type	Alignment Boundary	Byte Alignment
`char`	Byte	1
`short`	Halfword	2
`int`	Word	4
`long` (SPARC) `v8`	Word	4
`long` (SPARC) `v9`	Doubleword	8
`float` (SPARC)	Word	4
`double` (SPARC)	Doubleword (SPARC) Word (Intel)	8 (SPARC) 4 (Intel)
`long double` (SPARC) `v8`	Doubleword (SPARC) Word (Intel)	8 (SPARC) 4 (Intel)
`long double` (SPARC) `v9`	Quadword	16
`pointer` (SPARC) `v8`	Word	4
`pointer` (SPARC) `v9`	Doubleword	8
`long long`^[2]	Doubleword (SPARC) Word (Intel)	8 (SPARC) 4 (Intel)

Structure members are padded internally, so that every element is aligned on the appropriate boundary.

A structure has the same alignment as its most strictly aligned member. For example, a struct containing only chars has no alignment restrictions, whereas a struct containing a double is aligned on an 8-byte boundary (SPARC).

(6.5.2.1) Whether a plain `int` bit-field is treated as a `signed int` bit-field or as an `unsigned int` bit-field:

It is treated as an unsigned int.

(6.5.2.1) The order of allocation of bit-fields within an `int`:

Bit-fields are allocated within a storage unit from high-order to low-order.

(6.5.2.1) Whether a bit-field can straddle a storage-unit boundary:

Bit-fields do not straddle storage-unit boundaries.

(6.5.2.2) The integer type chosen to represent the values of an enumeration type:

This is an int.

C.1.10 Qualifiers (G.3.10)

(6.5.5.3) What constitutes an access to an object that has volatile-qualified type:

Each reference to the name of an object constitutes one access to the object.

C.1.11 Declarators (G.3.11)

(6.5.4) The maximum number of declarators that may modify an arithmetic, structure, or union type:

No limit is imposed by the compiler.

C.1.12 Statements (G.3.12)

(6.6.4.2) The maximum number of `case` values in a `switch` statement:

No limit is imposed by the compiler.

C.1.13 Preprocessing Directives (G.3.13)

(6.8.1) Whether the value of a single-character character constant in a constant expression that controls conditional inclusion matches the value of the same character constant in the execution character set:

A character constant within a preprocessing directive has the same numeric value as it has within any other expression.

(6.8.1) Whether such a character constant may have a negative value:

Character constants in this context may have negative values (SPARC) (Intel).

(6.8.2) The method for locating includable source files:

A file whose name is delimited by < > is searched for first in the directories named by the -I option, and then in the standard directory. The standard directory is /usr/include, unless the -YI option is used to specify a different default location.

A file whose name is delimited by quotes is searched for first in the directory of the source file that contains the #include, then in directories named by the -I option, and last in the standard directory.

If a file name enclosed in < > or double quotes begins with a / character, the file name is interpreted as a path name beginning in the root directory. The search for this file begins in the root directory only.

(6.8.2) The support of quoted names for includable source files:

Quoted file names in include directives are supported.

(6.8.2) The mapping of source file character sequences:

Source file characters are mapped to their corresponding ASCII values.

(6.8.6) The behavior on each recognized `#pragma` directive:

The following pragmas are supported. See Section 2.8, Pragmas for more information.

align integer (variable[, variable])

does_not_read_global_data (funcname [, funcname])

does_not_return (funcname[, funcname])

does_not_write_global_data (funcname[, funcname])

error_messages (on|off|default, tag1[ tag2... tagn])

fini (f1[, f2..., fn])

ident string

init (f1[, f2..., fn])

inline (funcname[, funcname])

int_to_unsigned (funcname)

MP serial_loop

MP serial_loop_nested

MP taskloop

no_inline (funcname[, funcname])

nomemorydepend

no_side_effect (funcname[, funcname])

opt_level (funcname[, funcname])

pack(n)

pipeloop(n)

rarely_called (funcname[, funcname])

redefine_extname old_extname new_extname

returns_new_memory (funcname[, funcname])

unknown_control_flow (name[, name])

unroll (unroll_factor)

weak (symbol1 [= symbol2])

(6.8.8) The definitions for `__DATE__` and `__TIME__` when, respectively, the date and time of translation are not available:

These macros are always available from the environment.

C.1.14 Library Functions (G.3.14)

(7.1.6) The null pointer constant to which the macro `NULL` expands:

NULL equals 0.

(7.2) The diagnostic printed by and the termination behavior of the `assert` function:

The diagnostic is:

Assertion failed: statement. file filename, line number

Where:

statement is the statement which failed the assertion

filename is the name of the file containing the failure

line number is the number of the line on which the failure occurs

(7.3.1) The sets of characters tested for by the `isalnum`, `isalpha`, `iscntrl`, `islower`, `isprint`, and `isupper` functions:


`isalnum`	ASCII characters A-Z, a-z and 0-9
`isalpha`	ASCII characters A-Z and a-z, plus locale-specific single-byte letters
`iscntrl`	ASCII characters with value 0-31 and 127
`islower`	ASCII characters a-z
`isprint`	Locale-specific single-byte printable characters
`isupper`	ASCII characters A-Z

(7.5.1) The values returned by the mathematics functions on domain errors:


Error	Math Functions	`-Xs`, `-Xt` Modes	`-Xa`, `-Xc` Modes
DOMAIN	acos(|x|>1)	0.0	0.0
DOMAIN	asin(|x|>1)	0.0	0.0
DOMAIN	atan2(+-0,+-0)	0.0	0.0
DOMAIN	y0(0)	-HUGE	-HUGE_VAL
DOMAIN	y0(x<0)	-HUGE	-HUGE_VAL
DOMAIN	y1(0)	-HUGE	-HUGE_VAL
DOMAIN	y1(x<0)	-HUGE	-HUGE_VAL
DOMAIN	yn(n,0)	-HUGE	-HUGE_VAL
DOMAIN	yn(n,x<0)	-HUGE	-HUGE_VAL
DOMAIN	log(x<0)	-HUGE	-HUGE_VAL
DOMAIN	log10(x<0)	-HUGE	-HUGE_VAL
DOMAIN	pow(0,0)	0.0	1.0
DOMAIN	pow(0,neg)	0.0	-HUGE_VAL
DOMAIN	pow(neg,non-integral)	0.0	NaN
DOMAIN	sqrt(x<0)	0.0	NaN
DOMAIN	fmod(x,0)	x	NaN
DOMAIN	remainder(x,0)	NaN	NaN
DOMAIN	acosh(x<1)	NaN	NaN
DOMAIN	atanh(|x|>1)	NaN	NaN

(7.5.1) Whether the mathematics functions set the integer expression `errno` to the value of the macro `ERANGE` on underflow range errors:

Mathematics functions, except scalbn, set errno to ERANGE when underflow is detected.

(7.5.6.4) Whether a domain error occurs or zero is returned when the `fmod` function has a second argument of zero:

In this case, fmod reports a domain error and returns its first argument.

(7.7.1.1) The set of signals for the `signal` function:

The following table shows the semantics for each signal as recognized by the signal function:


Signal	No.	Default	Event
SIGHUP	1	Exit	`hangup`
SIGINT	2	Exit	`interrupt`
SIGQUIT	3	Core	`quit`
SIGILL	4	Core	`illegal instruction (not reset when caught)`
SIGTRAP	5	Core	`trace trap (not reset when caught)`
SIGIOT	6	Core	`IOT instruction`
SIGABRT	6	Core	`Used by abort`
SIGEMT	7	Core	`EMT instruction`
SIGFPE	8	Core	`floating point exception`
SIGKILL	9	Exit	`kill (cannot be caught or ignored)`
SIGBUS	10	Core	`bus error`
SIGSEGV	11	Core	`segmentation violation`
SIGSYS	12	Core	`bad argument to system call`
SIGPIPE	13	Exit	`write on a pipe with no one to read it`
SIGALRM	14	Exit	`alarm clock`
SIGTERM	15	Exit	`software termination signal from kill`
SIGUSR1	16	Exit	`user defined signal 1`
SIGUSR2	17	Exit	`user defined signal 2`
SIGCLD	18	Ignore	`child status change`
SIGCHLD	18	Ignore	`child status change alias`
SIGPWR	19	Ignore	`power-fail restart`
SIGWINCH	20	Ignore	`window size change`
SIGURG	21	Ignore	`urgent socket condition`
SIGPOLL	22	Exit	`pollable event occurred`
SIGIO	22	Exit	`socket I/O possible`
SIGSTOP	23	Stop	`stop (cannot be caught or ignored)`
SIGTSTP	24	Stop	`user stop requested from tty`
SIGCONT	25	Ignore	`stopped process has been continued`
SIGTTIN	26	Stop	`background tty read attempted`
SIGTTOU	27	Stop	`background tty write attempted`
SIGVTALRM	28	Exit	`virtual timer expired`
SIGPROF	29	Exit	`profiling timer expired`
SIGXCPU	30	Core	`exceeded cpu limit`
SIGXFSZ	31	Core	`exceeded file size limit`
SIGWAITING	32	Ignore	`process's lwps are blocked`

(7.7.1.1) The default handling and the handling at program startup for each `signal` recognized by the signal function:

See above.

(7.7.1.1) If the equivalent of `signal(sig, SIG_DFL);` is not executed prior to the call of a signal handler, the blocking of the signal that is performed:

The equivalent of signal(sig,SIG_DFL) is always executed.

(7.7.1.1) Whether the default handling is reset if the `SIGILL` signal is received by a handler specified to the signal function:

Default handling is not reset for SIGILL.

(7.9.2) Whether the last line of a text stream requires a terminating new-line character:

The last line does not need to end in a newline.

(7.9.2) Whether space characters that are written out to a text stream immediately before a new-line character appear when read in:

All characters appear when the stream is read.

(7.9.2) The number of null characters that may be appended to data written to a binary stream:

No null characters are appended to a binary stream.

(7.9.3) Whether the file position indicator of an append mode stream is initially positioned at the beginning or end of the file:

The file position indicator is initially positioned at the end of the file.

(7.9.3) Whether a write on a text stream causes the associated file to be truncated beyond that point:

A write on a text stream does not cause a file to be truncated beyond that point unless a hardware device forces it to happen.

(7.9.3) The characteristics of file buffering:

Output streams, with the exception of the standard error stream (stderr), are by default buffered if the output refers to a file, and line-buffered if the output refers to a terminal. The standard error output stream (stderr) is by default unbuffered.

A buffered output stream saves many characters, and then writes the characters as a block. An unbuffered output stream queues information for immediate writing to the destination file or terminal. Line-buffered output queues each line of output until the line is complete (a newline character is requested).

(7.9.3) Whether a zero-length file actually exists:

A zero-length file does exist since it has a directory entry.

(7.9.3) The rules for composing valid file names:

A valid file name can be from 1 to 1,023 characters in length and can use all characters except the null character and / (slash).

(7.9.3) Whether the same file can be open multiple times:

The same file can be opened multiple times.

(7.9.4.1) The effect of the `remove` function on an open file:

The file is deleted on the last call which closes the file. A program cannot open a file which has already been removed.

(7.9.4.2) The effect if a file with the new name exists prior to a call to the `rename` function:

If the file exists, it is removed and the new file is written over the previously existing file.

(7.9.6.1) The output for `%p` conversion in the `fprintf` function:

The output for %p is equivalent to %x.

(7.9.6.2) The input for `%p` conversion in the `fscanf` function:

The input for %p is equivalent to %x.

(7.9.6.2) The interpretation of a `-` character that is neither the first nor the last character in the scan list for `%[` conversion in the `fscanf` function:

The - character indicates an inclusive range; thus, [0-9] is equivalent to [0123456789].

C.1.15 Locale-Specific Behavior (G.4)

(7.12.1) The local time zone and Daylight Saving Time:

The local time zone is set by the environment variable TZ.

(7.12.2.1) The era for the `clock` function:

The era for the clock is represented as clock ticks with the origin at the beginning of the execution of the program.

The following characteristics of a hosted environment are locale-specific:

(5.2.1) The content of the execution character set, in addition to the required members:

Locale-specific (no extension in C locale).

(5.2.2) The direction of printing:

Printing is always left to right.

(7.1.1) The decimal-point character:

Locale-specific ("." in C locale).

(7.3) The implementation-defined aspects of character testing and case mapping functions:

Same as 7.3.1.

(7.11.4.4) The collation sequence of the execution character set:

Locale-specific (ASCII collation in C locale).

(7.12.3.5) The formats for time and date:

Locale-specific. Formats for the C locale are shown in the tables below.

The names of the months are:


January	May	September
February	June	October
March	July	November
April	August	December

The names of the days of the week are:


Days		Abbreviated Days
Sunday	Thursday	Sun	Thu
Monday	Friday	Mon	Fri
Tuesday	Saturday	Tue	Sat
Wednesday		Wed

The format for time is:

%H:%M:%S

The format for date is:

%m/%d/%y

The formats for AM and PM designation are: AM PM

^[1] Not valid in -Xc mode.

^[2] Not available in -Xc mode.