CHAPTER 9
Using Runtime Checking
Runtime checking (RTC) lets you automatically detect runtime errors, such as memory access errors and memory leaks, in a native code application during the development phase. It also lets you monitor memory usage. You cannot use runtime checking on Java code.
The following topics are covered in this chapter:
Because runtime checking is an integral debugging feature, you can perform all debugging operations while using runtime checking except collecting performance data using the Collector.
Compiling with the -g flag provides source line number correlation in the runtime checking error messages. Runtime checking can also check programs compiled with the optimization -O flag. There are some special considerations with programs not compiled with the -g option.
You can use runtime checking by using the check command.
One way to avoid seeing a large number of errors at once is to use runtime checking earlier in the development cycle--as you are developing the individual modules that make up your program. Write a unit test to drive each module and use runtime checking incrementally to check one module at a time. That way, you deal with a smaller number of errors at a time. When you integrate all of the modules into the full program, you are likely to encounter few new errors. When you reduce the number of errors to zero, you need to run runtime checking again only when you make changes to a module.
To use runtime checking, you must fulfill the following requirements:
Runtime checking does not handle program text areas and data areas larger than 8 megabytes on hardware that is not based on UltraSPARC® processors. For more information, see Runtime Checking's 8 Megabyte Limit.
A possible solution is to insert special files in the executable image to handle program text areas and data areas larger than 8 megabytes.
To use runtime checking, enable the type of checking you want to use before you run the program.
To turn on memory use and memory leak checking, type:
(dbx) check -memuse
When memory use checking or memory leak checking is turned on, the showblock command shows the details about the heap block at a given address. The details include the location of the block's allocation and its size. For more information, see showblock Command.
To turn on memory access checking only, type:
(dbx) check -access
To turn on memory leak, memory use, and memory access checking, type:
(dbx) check -all
For more information, see check Command.
To turn off runtime checking entirely, type:
(dbx) uncheck -all
For detailed information, see uncheck Command.
After turning on the types of runtime checking you want, run the program being tested, with or without breakpoints.
The program runs normally, but slowly, because each memory access is checked for validity just before it occurs. If dbx detects an invalid access, it displays the type and location of the error. Control then returns to you, unless the dbx environment variable rtc_auto_continue is set to on (see Setting dbx Environment Variables).
You can then issue dbx commands, such as where to get the current stack trace or print to examine variables. If the error is not a fatal error, you can continue execution of the program with the cont command. The program continues to the next error or breakpoint, whichever is detected first. For detailed information, see cont Command.
If rtc_auto_continue is set to on, runtime checking continues to find errors, and keeps running automatically. It redirects errors to the file named by the dbx environment variable rtc_error_log_file_name. (See Setting dbx Environment Variables.) The default log file name is /tmp/dbx.errlog.uniqueid.
You can limit the reporting of runtime checking errors using the suppress command. For detailed information, see suppress Command.
Below is a simple example showing how to turn on memory access and memory use checking for a program called hello.c.
The function access_error() reads variable j before it is initialized. Runtime checking reports this access error as a Read from uninitialized (rui).
The function memory_leak() does not free the variable local before it returns. When memory_leak() returns, this variable goes out of scope and the block allocated at line 20 becomes a leak.
The program uses global variables hello1 and hello2, which are in scope all the time. They both point to dynamically allocated memory, which is reported as Blocks in use (biu).
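A program with these three defects might look like the following minimal sketch. Only the names access_error(), memory_leak(), hello1, and hello2 come from the description above. The helper memory_use(), the buffer sizes, and the string contents are assumptions made for illustration, and the line numbers will not match those cited in the text.

```c
#include <stdlib.h>
#include <string.h>

/* Globals stay in scope for the whole run, so the blocks they point to
   are still referenced at exit: blocks in use (biu), not leaks. */
char *hello1, *hello2;

/* Hypothetical helper: allocates the two blocks held by the globals. */
void memory_use(void)
{
    hello1 = (char *)malloc(32);
    hello2 = (char *)malloc(32);
    if (hello1 != NULL && hello2 != NULL) {
        strcpy(hello1, "hello world");
        strcpy(hello2, hello1);
    }
}

void memory_leak(void)
{
    char *local = (char *)malloc(32);
    if (local != NULL)
        strcpy(local, "hello world");
    /* No free(local): the only pointer is lost on return (mel). */
}

void access_error(void)
{
    int i, j;
    i = j;      /* j is read before it is initialized (rui) */
    (void)i;
}
```

A main() that calls these three functions in turn and then returns would be expected to produce one rui report, one leak, and two blocks in use.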
Access checking checks whether your program accesses memory correctly by monitoring each read, write, and memory free operation.
Programs might incorrectly read or write memory in a variety of ways; these are called memory access errors. For example, the program may reference a block of memory that has been deallocated through a free() call for a heap block. Or a function might return a pointer to a local variable, and an error results when that pointer is accessed. Access errors might result in wild pointers in the program and can cause incorrect program behavior, including wrong outputs and segmentation violations. Some kinds of memory access errors can be very hard to track down.
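The local-variable case can be illustrated with a short sketch. The function name make_label() and the "item-%d" format are hypothetical. The faulty form appears only in a comment; the compiled function shows one possible repair, in which the caller supplies the storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Faulty form (shown for contrast, not compiled):
 *
 *     char *make_label(int n)
 *     {
 *         char buf[32];
 *         sprintf(buf, "item-%d", n);
 *         return buf;     // buf's storage is gone once the function returns
 *     }
 */

/* One possible repair: the caller owns the storage, so the returned
   pointer stays valid for as long as the caller's buffer does. */
char *make_label(char *buf, size_t len, int n)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "item-%d", n);
    return buf;
}
```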
Runtime checking maintains a table that tracks the state of each block of memory being used by the program. Runtime checking checks each memory operation against the state of the block of memory it involves and then determines whether the operation is valid. The possible memory states are:
Using runtime checking to find memory access errors is not unlike using a compiler to find syntax errors in your program. In both cases, a list of errors is produced, with each error message giving the cause of the error and the location in the program where the error occurred. In both cases, you should fix the errors in your program starting at the top of the error list and working your way down. One error can cause other errors in a chain reaction. The first error in the chain is, therefore, the "first cause," and fixing that error might also fix some subsequent errors.
For example, a read from an uninitialized section of memory can create an incorrect pointer, which when dereferenced can cause another invalid read or write, which can in turn lead to yet another error.
Runtime checking prints the following information for memory access errors:
The following example shows a typical access error.
Read from uninitialized (rui):
Attempting to read 4 bytes at address 0xefffee50
    which is 96 bytes above the current stack pointer
Variable is `j'
Current function is rui
   12           i = j;
Runtime checking detects the following memory access errors:
Note - Runtime checking does not perform array bounds checking and, therefore, does not report array bound violations as access errors.
A memory leak is a dynamically allocated block of memory that has no pointers pointing to it anywhere in the data space of the program. Such blocks are orphaned memory. Because there are no pointers pointing to the blocks, programs cannot reference them, much less free them. Runtime checking finds and reports such blocks.
Memory leaks result in increased virtual memory consumption and generally result in memory fragmentation. This might slow down the performance of your program and the whole system.
Typically, memory leaks occur because allocated memory is not freed and you lose a pointer to the allocated block. Here are some examples of memory leaks:
A leak can result from incorrect use of an API.
You can avoid memory leaks by always freeing memory when it is no longer needed and paying close attention to library functions that return allocated memory. If you use such functions, remember to free up the memory appropriately.
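As an illustration of a library function that returns allocated memory, the sketch below uses the POSIX strdup() function, whose result must be released with free(). The function stem_length() is a hypothetical example, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of name with its last extension removed.
   strdup() returns malloc'ed memory, so the copy must be freed here;
   dropping the free() would leak one block per call (mel). */
size_t stem_length(const char *name)
{
    char *copy, *dot;
    size_t len;

    assert(name != NULL);
    copy = strdup(name);
    if (copy == NULL)
        return 0;
    dot = strrchr(copy, '.');
    if (dot != NULL)
        *dot = '\0';         /* cut the copy at the last '.' */
    len = strlen(copy);
    free(copy);              /* release the library-allocated block */
    return len;
}
```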
Sometimes the term memory leak is used to refer to any block that has not been freed. This is a much less useful definition of a memory leak, because it is a common programming practice not to free memory if the program will terminate shortly. Runtime checking does not report a block as a leak if the program still retains one or more pointers to it.
Runtime checking detects the following memory leak errors:
Note - Runtime checking only finds leaks of malloc memory. If your program does not use malloc, runtime checking cannot find memory leaks.
There are two cases where runtime checking can report a "possible" leak. The first case is when no pointers are found pointing to the beginning of the block, but a pointer is found pointing to the interior of the block. This case is reported as an "Address in Block (aib)" error. If it was a stray pointer that pointed into the block, this would be a real memory leak. However, some programs deliberately move the only pointer to an array back and forth as needed to access its entries. In this case, it would not be a memory leak. Because runtime checking cannot distinguish between these two cases, it reports both of them as possible leaks, letting you determine which are real memory leaks.
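The deliberate case might look like the following sketch, in which the only pointer to a block is advanced through the block and rewound before the block is freed. The function name sum_by_walking() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Sums the values 1..n stored in a heap block by advancing the only
   pointer to that block. While p is past the start of the block, a leak
   scan would find no pointer to the block's beginning and could report
   it as Address in Block (aib), even though nothing is leaked. */
long sum_by_walking(int n)
{
    int *p, k;
    long total = 0;

    assert(n > 0);
    p = (int *)malloc((size_t)n * sizeof *p);
    if (p == NULL)
        return -1;
    for (k = 0; k < n; k++)
        p[k] = k + 1;          /* fill with 1..n */
    for (k = 0; k < n; k++)
        total += *p++;         /* p moves past the start of the block */
    p -= n;                    /* rewind to the start before freeing */
    free(p);
    return total;
}
```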
The second type of possible leak occurs when no pointers to a block are found in the data space, but a pointer is found in a register. This case is reported as an "Address in Register (air)" error. If the register points to the block accidentally, or if it is an old copy of a memory pointer that has since been lost, then this is a real leak. However, the compiler can optimize references and place the only pointer to a block in a register without ever writing the pointer to memory. Such a case would not be a real leak. Hence, if the program has been optimized and the report was the result of the showleaks command, it is likely not to be a real leak. In all other cases, it is likely to be a real leak. For more information, see showleaks Command.
Note - Runtime leak checking requires the use of the standard libc malloc/free/realloc functions or allocators based on those functions. For other allocators, see Runtime Checking Application Programming Interface.
If memory leak checking is turned on, a scan for memory leaks is automatically performed just before the program being tested exits. Any detected leaks are reported. The program should not be killed with the kill command. Here is a typical memory leak error message:
Memory leak (mel):
Found leaked block of size 6 at address 0x21718
At time of allocation, the call stack was:
    [1] foo() at line 63 in test.c
    [2] main() at line 47 in test.c
A UNIX program has a main procedure (called MAIN in f77) that is the top-level user function for the program. Normally, a program terminates either by calling exit(3) or by returning from main. In the latter case, all variables local to main go out of scope after the return, and any heap blocks they pointed to are reported as leaks (unless global variables point to those same blocks).
It is a common programming practice not to free heap blocks allocated to local variables in main, because the program is about to terminate and return from main without calling exit(). To prevent runtime checking from reporting such blocks as memory leaks, stop the program just before main returns by setting a breakpoint on the last executable source line in main. When the program halts there, use the showleaks command to report all the true leaks, omitting the leaks that would result merely from variables in main going out of scope.
For more information, see showleaks Command.
With leak checking turned on, you receive an automatic leak report when the program exits. All possible leaks are reported--provided the program has not been killed using the kill command. The level of detail in the report is controlled by the dbx environment variable rtc_mel_at_exit (see Setting dbx Environment Variables). By default, a nonverbose leak report is generated.
Reports are sorted according to the combined size of the leaks. Actual memory leaks are reported first, followed by possible leaks. The verbose report contains detailed stack trace information, including line numbers and source files whenever they are available.
Both reports include the following information for memory leak errors:
Call stack at time of allocation, as constrained by check -frames.
Here is the corresponding nonverbose memory leak report.
Following is a typical verbose leak report.
You can ask for a leak report at any time using the showleaks command, which reports new memory leaks since the last showleaks command. For more information, see showleaks Command.
Because the number of individual leaks can be very large, runtime checking automatically combines leaks allocated at the same place into a single combined leak report. The decision to combine leaks, or report them individually, is controlled by the number-of-frames-to-match parameter specified by the -match m option of the check -leaks command or the -m option of the showleaks command. If the call stacks at the time of allocation for two or more leaks match to m frames, down to the exact program counter level, these leaks are reported in a single combined leak report.
Consider the following three call sequences:
If all of these blocks lead to memory leaks, the value of m determines whether the leaks are reported as separate leaks or as one repeated leak. If m is 2, Blocks 1 and 2 are reported as one repeated leak because the 2 stack frames above malloc() are common to both call sequences. Block 3 will be reported as a separate leak because the trace for c() does not match the other blocks. For m greater than 2, runtime checking reports all leaks as separate leaks. (The malloc is not shown on the leak report.)
In general, the smaller the value of m, the fewer individual leak reports and the more combined leak reports are generated. The greater the value of m, the fewer combined leak reports and the more individual leak reports are generated.
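The matching rule can be sketched as a comparison of the innermost m return addresses of two allocation call stacks. This is an illustration of the rule as described, not the dbx implementation. The type alloc_stack, the function stacks_match(), and the treatment of stacks shorter than m frames are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* One allocation call stack: return addresses, innermost caller first.
   The frame for malloc itself is omitted. */
struct alloc_stack {
    const unsigned long *pc;
    size_t depth;
};

/* Returns 1 if the innermost m frames of a and b have identical program
   counters, so the two leaks would be combined; returns 0 otherwise.
   Assumption: a stack shorter than m frames is compared over its full
   depth and matches only a stack with the same number of compared frames. */
int stacks_match(const struct alloc_stack *a, const struct alloc_stack *b,
                 size_t m)
{
    size_t na, nb, k;

    assert(a != NULL && b != NULL);
    na = a->depth < m ? a->depth : m;
    nb = b->depth < m ? b->depth : m;
    if (na != nb)
        return 0;
    for (k = 0; k < na; k++) {
        if (a->pc[k] != b->pc[k])
            return 0;
    }
    return 1;
}
```

With hypothetical stacks whose two innermost addresses agree but whose third differs, stacks_match() returns 1 for m equal to 2 and 0 for m equal to 3, which mirrors the way a larger m splits a combined report into individual ones.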
Once you have obtained a memory leak report, follow these guidelines for fixing the memory leaks.
For more information, see showleaks Command.
Memory use checking lets you see all the heap memory in use. You can use this information to get a sense of where memory is allocated in your program or which program sections are using the most dynamic memory. This information can also be useful in reducing the dynamic memory consumption of your program and might help in performance tuning.
Memory use checking is useful during performance tuning or to control virtual memory use. When the program exits, a memory use report can be generated. Memory usage information can also be obtained at any time during program execution with the showmemuse command, which causes memory usage to be displayed. For information, see showmemuse Command.
Turning on memory use checking also turns on leak checking. In addition to a leak report at the program exit, you also get a blocks in use (biu) report. By default, a nonverbose blocks in use report is generated at program exit. The level of detail in the memory use report is controlled by the dbx environment variable rtc_biu_at_exit (see Setting dbx Environment Variables).
The following is a typical nonverbose memory use report.
The following is the corresponding verbose memory use report:
You can ask for a memory use report any time with the showmemuse command.
Runtime checking provides a powerful error suppression facility that allows great flexibility in limiting the number and types of errors reported. If an error occurs that you have suppressed, then no report is given, and the program continues as if no error had occurred.
You can suppress errors using the suppress command (see suppress Command).
You can undo error suppression using the unsuppress command (see unsuppress Command).
Suppression is persistent across run commands within the same debug session, but not across debug commands.
The following types of suppression are available:
You must specify which type of error to suppress. You can specify which parts of the program to suppress. The options are:
Applies to an entire load object, such as a shared library, or the main program.
By default, runtime checking suppresses the most recent error to prevent repeated reports of the same error. This is controlled by the dbx environment variable rtc_auto_suppress. When rtc_auto_suppress is set to on (the default), a particular access error at a particular location is reported only the first time it is encountered and suppressed thereafter. This is useful, for example, for preventing multiple copies of the same error report when an error occurs in a loop that is executed many times.
You can use the dbx environment variable rtc_error_limit to limit the number of errors that are reported. The error limit is applied separately to access errors and leak errors. For example, if the error limit is set to 5, then a maximum of five access errors are reported, and a maximum of five memory leaks are shown both in the leak report at the end of the run and for each showleaks command you issue. The default is 1000.
In the following examples, main.cc is a file name, foo and bar are functions, and a.out is the name of an executable.
Do not report memory leaks whose allocation occurs in function foo.
suppress mel in foo
Suppress reporting blocks in use allocated from libc.so.1.
suppress biu in libc.so.1
Suppress read from uninitialized in all functions in a.out.
suppress rui in a.out
Do not report read from unallocated in file main.cc.
suppress rua in main.cc
Suppress duplicate free at line 10 of main.cc.
suppress duf at main.cc:10
Suppress reporting of all errors in function bar.
suppress all in bar
For more information, see suppress Command.
Runtime checking does not require that the program be compiled using the -g option (symbolic) in order to detect all errors. However, symbolic information is sometimes needed to guarantee the correctness of certain errors, mostly rui errors. For this reason, certain errors (rui for a.out, and rui, aib, and air for shared libraries) are suppressed by default if no symbolic information is available. This behavior can be changed using the -d option of the suppress and unsuppress commands.
The following command causes runtime checking to no longer suppress read from uninitialized memory (rui) in code that does not have symbolic information (compiled without -g):
unsuppress -d rui
For more information, see unsuppress Command.
For the initial run on a large program, the large number of errors might be overwhelming. It might be better to take a phased approach. You can do so by using the suppress command to reduce the reported errors to a manageable number, fixing just those errors, and repeating the cycle, suppressing fewer and fewer errors with each iteration.
For example, you could focus on a few error types at one time. The most common error types typically encountered are rui, rua, and wua, usually in that order. rui errors are less serious (although they can cause more serious errors to happen later). Often a program might still work correctly with these errors. rua and wua errors are more serious because they are accesses to or from invalid memory addresses and always indicate a coding error.
You can start by suppressing rui and rua errors. After fixing all the wua errors that occur, run the program again, this time suppressing only rui errors. After fixing all the rua errors that occur, run the program again, this time with no errors suppressed. Fix all the rui errors. Lastly, run the program a final time to ensure no errors are left.
If you want to suppress the last reported error, use suppress -last.
To use runtime checking on a child process, you must have the dbx environment variable rtc_inherit set to on. By default, it is set to off. (See Setting dbx Environment Variables.)
dbx supports runtime checking of a child process if runtime checking is enabled for the parent and the dbx environment variable follow_fork_mode is set to child (see Setting dbx Environment Variables).
When a fork happens, dbx automatically performs runtime checking on the child. If the program calls exec(), the runtime checking settings of the program calling exec() are passed on to the program.
At any given time, only one process can be under runtime checking control. The following is an example.
Runtime checking works on an attached process with the exception that RUI cannot be detected if the affected memory has already been allocated. However, the process must have librtc.so preloaded when it starts. If the process to which you are attaching is a 64-bit SPARC V9 process, use the sparcv9 librtc.so. If the product is installed in /opt, librtc.so is at:
/opt/SUNWspro/lib/v9/librtc.so for SPARC V9
/opt/SUNWspro/lib for all other platforms
% setenv LD_PRELOAD path-to-librtc/librtc.so
Set LD_PRELOAD to preload librtc.so only when needed; do not keep it loaded all the time. For example:
% setenv LD_PRELOAD...
% start-your-application
% unsetenv LD_PRELOAD
Once you attach to the process, you can enable runtime checking.
If the program you want to attach to is forked or executed from some other program, you need to set LD_PRELOAD for the main program (which will fork). The setting of LD_PRELOAD is inherited across forks and execution.
Some versions of the Solaris Operating Environment support LD_PRELOAD_32 and LD_PRELOAD_64, which affect only 32-bit programs and 64-bit programs, respectively. See the Linker and Libraries Guide for the version of the Solaris Operating Environment you are running to determine if these variables are supported.
You can use runtime checking along with fix and continue to isolate and fix programming errors rapidly. Fix and continue provides a powerful combination that can save you a lot of debugging time. Here is an example:
For more information on using fix and continue, see Chapter 10.
Both leak detection and access checking require that the standard heap management routines in the shared library libc.so be used so that runtime checking can keep track of all the allocations and deallocations in the program. Many applications write their own memory management routines either on top of the malloc() or free() function or stand-alone. When you use your own allocators (referred to as private allocators), runtime checking cannot automatically track them; thus you do not learn of leak and memory access errors resulting from their improper use.
However, runtime checking provides an API for the use of private allocators. This API allows the private allocators the same treatment as the standard heap allocators. The API itself is provided in the header file rtc_api.h and is distributed as a part of Sun ONE Studio Compiler Collection software. The man page rtc_api(3x) details the runtime checking API entry points.
Some minor differences might exist with runtime checking access error reporting when private allocators do not use the program heap. The error report will not include the allocation item.
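To make the tracking problem concrete, the following is a minimal sketch of a private allocator layered on malloc(). All names (struct arena, arena_init(), arena_alloc(), arena_release()) and the 16-byte rounding are hypothetical. The sketch does not call any rtc_api.h entry points; see the rtc_api(3x) man page for those.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A tiny bump-pointer arena layered on malloc(). */
struct arena {
    char  *base;   /* the single block obtained from malloc() */
    size_t size;   /* total bytes in the block */
    size_t used;   /* bytes handed out so far */
};

/* Obtains the backing block. Returns 0 on success, -1 on failure. */
int arena_init(struct arena *a, size_t size)
{
    assert(a != NULL && size > 0);
    a->base = (char *)malloc(size);
    a->size = (a->base != NULL) ? size : 0;
    a->used = 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Hands out n bytes from the arena, or NULL when it is exhausted.
   No libc allocator call happens here, so a tool that watches only
   malloc/free/realloc cannot see this sub-allocation. */
void *arena_alloc(struct arena *a, size_t n)
{
    void *p;

    assert(a != NULL);
    if (n == 0 || n > a->size)
        return NULL;
    n = (n + 15u) & ~(size_t)15u;    /* round up to a multiple of 16 */
    if (n > a->size - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += n;
    return p;
}

/* Returns the whole arena to the heap in a single free() call. */
void arena_release(struct arena *a)
{
    assert(a != NULL);
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

From the point of view of the standard heap routines, this program makes one allocation and one deallocation, regardless of how many objects are carved out of the arena or lost inside it.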
The bcheck utility is a convenient batch interface to the runtime checking feature of dbx. It runs a program under dbx and, by default, places the runtime checking error output in the file program.errs.
The bcheck utility can perform memory leak checking, memory access checking, memory use checking, or all three. Its default action is to perform only leak checking. See the bcheck(1) man page for more details on its use.
bcheck [-V] [-access | -all | -leaks | -memuse] [-o logfile] [-q] [-s script] program [args]
Use the -o logfile option to specify a different name for the logfile. Use the -s script option to read in the dbx commands contained in the file script before executing the program. The script file typically contains commands like suppress and dbxenv to tailor the error output of the bcheck utility.
The -q option makes the bcheck utility completely quiet, returning with the same status as the program. This option is useful when you want to use the bcheck utility in scripts or makefiles.
To perform only leak checking on hello, type:
bcheck hello
To perform only access checking on mach with the argument 5, type:
bcheck -access mach 5
To perform memory use checking on cc quietly and exit with normal exit status, type:
bcheck -memuse -q cc -c prog.c
The program does not stop when runtime errors are detected in batch mode. All error output is redirected to your error log file logfile. The program stops when breakpoints are encountered or if the program is interrupted.
In batch mode, the complete stack backtrace is generated and redirected to the error log file. The number of stack frames can be controlled using the dbx environment variable stack_max_size.
If the file logfile already exists, bcheck erases the contents of that file before it redirects the batch output to it.
You can also enable a batch-like mode directly from dbx by setting the dbx environment variables rtc_auto_continue and rtc_error_log_file_name (see Setting dbx Environment Variables).
If rtc_auto_continue is set to on, runtime checking continues to find errors and keeps running automatically. It redirects errors to the file named by the dbx environment variable rtc_error_log_file_name. (See Setting dbx Environment Variables.) The default log file name is /tmp/dbx.errlog.uniqueid. To redirect all errors to the terminal, set the rtc_error_log_file_name environment variable to /dev/tty.
By default, rtc_auto_continue is set to off.
After error checking has been enabled for a program and the program is run, one of the following errors may be detected:
librtc.so and dbx version mismatch; Error checking disabled
This error can occur if you are using runtime checking on an attached process and have set LD_PRELOAD to a version of librtc.so other than the one shipped with your Sun ONE Studio dbx image. To fix this, change the setting of LD_PRELOAD.
patch area too far (8mb limitation); Access checking disabled
Runtime checking was unable to find patch space close enough to a loadobject for access checking to be enabled. See "Runtime Checking's 8 Megabyte Limit" next.
The 8 megabyte limit described below no longer applies on hardware based on UltraSPARC processors, on which dbx has the ability to invoke a trap handler instead of using a branch. The transfer of control to a trap handler is up to 10 times slower but does not suffer from the 8 megabyte limit. Traps are used automatically, as necessary, as long as the hardware is based on UltraSPARC processors. You can check your hardware by using the system command isalist and checking that the result contains the string sparcv8plus. The rtc -showmap command (see rtc -showmap Command) displays a map of instrument types sorted by address.
When access checking is enabled, dbx replaces each load and store instruction with a branch instruction that branches to a patch area. This branch instruction has an 8 megabyte range. This means that if the debugged program has used up all the address space within 8 megabytes of the particular load or store instruction being replaced, no place exists to put the patch area.
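The reachability condition can be sketched as a simple distance test. This is an illustration only: the function name patch_reachable() is hypothetical, and treating the range as a symmetric window of just under 8 megabytes on either side of the instruction is a simplifying assumption, not a statement about how dbx chooses patch space.

```c
#include <stdint.h>

#define BRANCH_RANGE (8L * 1024 * 1024)   /* 8 megabytes, per the text */

/* Returns 1 if a patch area at address patch is close enough to the
   instruction at address insn to be reached by the branch, else 0. */
int patch_reachable(uintptr_t insn, uintptr_t patch)
{
    uintptr_t distance = (insn > patch) ? insn - patch : patch - insn;

    return distance < (uintptr_t)BRANCH_RANGE;
}
```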
If runtime checking cannot intercept all loads and stores to memory, it cannot provide accurate information and so disables access checking completely. Leak checking is unaffected.
dbx internally applies some strategies when it runs into this limitation and continues if it can rectify this problem. In some cases dbx cannot proceed; when this happens, it turns off access checking after printing an error message.
If you encounter this 8 megabyte limit, try the following workarounds.
1. Try using 32-bit SPARC-V8 instead of 64-bit SPARC-V9.
2. Try adding patch area object files.
3. Try dividing the large load object into smaller load objects.
4. Try adding a "pad" .so file.
Errors reported by runtime checking generally fall into two categories: access errors and leaks.
When access checking is turned on, runtime checking detects and reports the following types of errors.
Problem: Attempt to free memory that has never been allocated.
Possible causes: Passing a non-heap data pointer to free() or realloc().
char a[4];
char *b = &a[0];
free(b); /* Bad free (baf) */
Problem: Attempt to free a heap block that has already been freed.
Possible causes: Calling free() more than once with the same pointer. In C++, using the delete operator more than once on the same pointer.
char *a = (char *)malloc(1);
free(a);
free(a); /* Duplicate free (duf) */
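One common defensive idiom, sketched here under a hypothetical name release(), is to clear the pointer as soon as the block is freed. A repeated release through the same variable then passes NULL to free(), which the C standard defines as a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees *pp and clears the caller's pointer. A second call on the same
   variable then passes NULL to free(), which does nothing. */
void release(char **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}
```

This protects only the variable passed in. Other copies of the same pointer are unaffected and can still cause a duplicate free.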
Problem: Attempt to free a misaligned heap block.
Possible causes: Passing an improperly aligned pointer to free() or realloc(); changing the pointer returned by malloc.
char *ptr = (char *)malloc(4);
ptr++;
free(ptr); /* Misaligned free (maf) */
Problem: Attempt to read data from an address without proper alignment.
Possible causes: Reading 2, 4, or 8 bytes from an address that is not half-word-aligned, word-aligned, or double-word-aligned, respectively.
char *s = "hello world";
int *i = (int *)&s[1];
int j;
j = *i; /* Misaligned read (mar) */
Problem: Attempt to write data to an address without proper alignment.
Possible causes: Writing 2, 4, or 8 bytes to an address that is not half-word-aligned, word-aligned, or double-word-aligned, respectively.
char *s = "hello world";
int *i = (int *)&s[1];
*i = 0; /* Misaligned write (maw) */
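When data really does sit at an unaligned address, for example inside a packed buffer, one portable way to avoid misaligned reads and writes is to copy the bytes with memcpy() instead of dereferencing a cast pointer. The helper names load_int() and store_int() below are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Reads an int from p, which need not be word-aligned. */
int load_int(const char *p)
{
    int v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* byte-wise copy, no alignment requirement */
    return v;
}

/* Writes an int to p, which need not be word-aligned. */
void store_int(char *p, int v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```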
Problem: Attempt to allocate memory beyond physical memory available.
Cause: Program cannot obtain more memory from the system. Useful in locating problems that occur when the return value from malloc() is not checked for NULL, which is a common programming mistake.
char *ptr = (char *)malloc(0x7fffffff);
/* Out of Memory (oom), ptr == NULL */
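The mistake described above is avoided by testing the result of malloc() before using it. A minimal sketch follows; the function name make_buffer() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a zero-filled buffer of n bytes, or NULL if the system cannot
   supply the memory. The caller must test the result before using it. */
char *make_buffer(size_t n)
{
    char *ptr;

    assert(n > 0);
    ptr = (char *)malloc(n);
    if (ptr == NULL)
        return NULL;        /* report failure instead of writing through NULL */
    memset(ptr, 0, n);
    return ptr;
}
```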
Problem: Attempt to read from nonexistent, unallocated, or unmapped memory.
Possible causes: A stray pointer, overflowing the bounds of a heap block or accessing a heap block that has already been freed.
char c, *a = (char *)malloc(1);
c = a[1]; /* Read from unallocated memory (rua) */
Problem: Attempt to read from uninitialized memory.
Possible causes: Reading local or heap data that has not been initialized.
void foo(void)
{
    int i, j;
    j = i; /* Read from uninitialized memory (rui) */
}
Problem: Attempt to write to read-only memory.
Possible causes: Writing to a text address, writing to a read-only data section (.rodata), or writing to a page that mmap has made read-only.
void foo(void)
{
    int *foop = (int *) foo;
    *foop = 0; /* Write to read-only memory (wro) */
}
Problem: Attempt to write to nonexistent, unallocated, or unmapped memory.
Possible causes: A stray pointer, overflowing the bounds of a heap block, or accessing a heap block that has already been freed.
char *a = (char *)malloc(1);
a[1] = '\0'; /* Write to unallocated memory (wua) */
With leak checking turned on, runtime checking reports the following types of errors.
Problem: A possible memory leak. There is no reference to the start of an allocated block, but there is at least one reference to an address within the block.
Possible causes: The only pointer to the start of the block is incremented.
char *ptr;
int main(void)
{
    ptr = (char *)malloc(4);
    ptr++; /* Address in Block (aib) */
    return 0;
}
Problem: A possible memory leak. An allocated block has not been freed, and no reference to the block exists anywhere in program memory, but a reference exists in a register.
Possible causes: This can occur legitimately if the compiler keeps a program variable only in a register instead of in memory. The compiler often does this for local variables and function parameters when optimization is turned on. If this error occurs when optimization has not been turned on, it is likely to be an actual memory leak. This can occur if the only pointer to an allocated block goes out of scope before the block is freed.
if (i == 0) {
char *ptr = (char *)malloc(4);
/* ptr is going out of scope */
}
/* Memory Leak or Address in Register */
Problem: An allocated block has not been freed, and no reference to the block exists anywhere in the program.
Possible causes: Program failed to free a block no longer used.
ptr = (char *)malloc(1);
ptr = 0;
/* Memory leak (mel) */
Copyright © 2003, Sun Microsystems, Inc. All rights reserved.