Performance Tuning (SPARC)
This appendix describes performance tuning on SPARC platforms.
G.1 Limits
Some parts of the C library cannot be optimized for speed, even though doing so would benefit most applications. Some examples:
- Integer arithmetic routines--Current SPARC V8 processors support integer multiplication and division instructions. However, if standard C library routines were to use these instructions, programs running on V7 SPARC processors would either run slowly due to kernel emulation overhead, or might break altogether. Hence, integer multiplication and division instructions cannot be used in the standard C library routines.
- Doubleword memory access--Block copy and move routines, such as memmove() and bcopy(), could run considerably faster if they used SPARC doubleword load and store instructions (ldd and std). Some memory-mapped devices, such as frame buffers, do not support 64-bit access; nevertheless, these devices are expected to work correctly with memmove() and bcopy(). Hence, ldd and std cannot be used in the standard C library routines.
- Memory allocation algorithms--The C library routines malloc() and free() are typically implemented as a compromise between speed, space, and insensitivity to coding errors in old UNIX programs. Memory allocators based on "buddy system" algorithms typically run faster than the standard library version, but tend to use more space.
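The doubleword trade-off described above can be sketched in C. This is only an illustration, not the library's implementation: the function names copy_bytes and copy_doublewords are hypothetical, and the sketch assumes that both buffers are 8-byte aligned and that the compiler translates each 64-bit assignment into a single doubleword load and store (ldd and std on SPARC), which it is not required to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies one byte per iteration. Every access is
 * 8 bits wide, so it does not depend on the target supporting wider
 * accesses, but it needs one load/store pair per byte. */
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: copies one 64-bit doubleword per iteration.
 * Assumes dst and src are 8-byte aligned and do not overlap. It moves
 * eight bytes per load/store pair, but is only correct if the memory
 * behind the pointers accepts 64-bit accesses. */
void copy_doublewords(uint64_t *dst, const uint64_t *src, size_t ndwords)
{
    for (size_t i = 0; i < ndwords; i++)
        dst[i] = src[i];
}
```

For ordinary memory the two functions produce the same result, and the doubleword version needs one eighth as many loop iterations. For a device that rejects 64-bit accesses, only the byte version is safe, which is the constraint the standard library routines have to respect.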
G.2 libfast.a Library
The library libfast.a provides speed-tuned versions of standard C library functions. Because it is an optional library, it can use algorithms and data representations that may not be appropriate for the standard C library, even though they improve the performance of most applications.
Use profiling to determine whether the routines in the following checklist are important to the performance of your application, then use the checklist to decide whether libfast.a would improve that performance:
- Do use libfast.a if performance of integer multiplication or division is important, even if a single binary version of the application must run on both V7 and V8 SPARC platforms. The important routines are: .mul, .div, .rem, .umul, .udiv, and .urem.
- Do use libfast.a if performance of memory allocation is important, and the size of the most commonly allocated blocks is close to a power of two. The important routines are: malloc(), free(), and realloc().
- Do use libfast.a if performance of block move or fill routines is important. The important routines are: bcopy(), bzero(), memcpy(), memmove(), and memset().
- Do not use libfast.a if the application requires user-mode, memory-mapped access to an I/O device that does not support 64-bit memory operations.
- Do not use libfast.a if the application is multithreaded.
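To judge whether an application's common allocation sizes are "close to a power of two", a rough estimate of the space a buddy-style allocator would leave unused can help. The sketch below is an assumption-laden illustration: the names round_up_pow2 and wasted_bytes are hypothetical, and it assumes that the allocator rounds each request up to the next power of two and ignores any per-block header. The actual policy of libfast.a is not described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= n.
 * Returns 0 if n is 0 or if the result would not fit in size_t. */
size_t round_up_pow2(size_t n)
{
    if (n == 0 || n > SIZE_MAX / 2 + 1)
        return 0;

    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Hypothetical helper: bytes left unused if a request of `request`
 * bytes is served from a block of round_up_pow2(request) bytes.
 * Assumes request is in the range round_up_pow2() accepts. */
size_t wasted_bytes(size_t request)
{
    return round_up_pow2(request) - request;
}
```

Under these assumptions, a 120-byte request fits a 128-byte block with 8 bytes unused, while a 130-byte request needs a 256-byte block and leaves 126 bytes unused. Requests just below a power of two therefore cost little extra space, and requests just above one cost the most.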
When linking the application, add the option -lfast to the cc command used at link time. The cc command links the routines in libfast.a ahead of their counterparts in the standard C library.
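One rough way to compare a build linked with -lfast against a build linked without it is to time a routine from the checklist in both. The following is a minimal sketch, not a replacement for profiling the real application: the function name time_memcpy is hypothetical, the buffer size and iteration count are left to the caller, and the result is only meaningful if the compiler emits a real call to memcpy() instead of expanding it inline.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: returns the CPU time, in seconds, spent copying
 * an nbytes-byte buffer `iterations` times with memcpy().
 * Returns -1.0 if nbytes is 0 or if memory cannot be allocated. */
double time_memcpy(size_t nbytes, int iterations)
{
    if (nbytes == 0)
        return -1.0;

    unsigned char *src = malloc(nbytes);
    unsigned char *dst = malloc(nbytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, nbytes);

    /* Reading one copied byte into a volatile object on each pass
     * discourages the compiler from discarding the copies. */
    volatile unsigned char sink = 0;

    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        memcpy(dst, src, nbytes);
        sink ^= dst[(size_t)i % nbytes];
    }
    clock_t end = clock();

    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_memcpy() with the same arguments from two builds of the same test program, one linked with -lfast and one without, gives a first indication of whether the block move routines in libfast.a help for the buffer sizes the application actually uses.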
C User's Guide
817-6697-10
Copyright © 2004, Sun Microsystems, Inc. All rights reserved.