CHAPTER 1

Introduction

Sun Performance Library is a set of optimized, high-speed mathematical subroutines for solving linear algebra and other numerically intensive problems. Sun Performance Library is based on a collection of public domain applications available from Netlib at http://www.netlib.org. Sun has enhanced these public domain applications and bundled them as the Sun Performance Library.

The Sun Performance Library User’s Guide explains the Sun-specific enhancements to the base applications available from Netlib. Reference material describing the base routines is available from Netlib and the Society for Industrial and Applied Mathematics (SIAM).


1.1 Libraries Included With Sun Performance Library

Sun Performance Library contains enhanced versions of the following standard libraries:



Note - LINPACK has been removed from Sun Performance Library. LAPACK version 3.1.1 supersedes LINPACK and all previous versions of LAPACK. If the LINPACK routines are still needed, the LINPACK library and documentation can be obtained from http://www.netlib.org.


Sun Performance Library is available in both static and dynamic library forms. There are optimized SPARC versions for V8, sparcvis, sparcvis2, and sparcfmaf architectures on Solaris 10 operating systems. There are also optimized versions for x86/x64 architectures on Solaris 10 systems, along with SuSE Linux Enterprise Server 9 and Red Hat Enterprise Linux 4 systems. All versions support parallel programming on multiprocessor platforms. (See the Sun Studio release notes for details.)

Sun Performance Library LAPACK routines have been compiled with a Fortran 95 compiler and remain compatible with the Netlib LAPACK version 3.1.1 library. The Sun Performance Library versions of these routines perform the same operations as the Fortran callable routines and have the same interface as the standard Netlib versions.

LAPACK contains driver, computational, and auxiliary routines. Sun Performance Library does not support the auxiliary routines, because auxiliary routines can change or be removed from LAPACK without notice. Because the auxiliary routines are not supported, they are not documented in the Sun Performance Library User’s Guide or the section 3P man pages.

Many auxiliary routines contain LA as the second and third characters in the routine name; however, some do not. Appendix B of the LAPACK Users’ Guide contains a list of auxiliary routines.
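The naming convention described above can be turned into a quick first-pass filter. The following is a minimal sketch in C, not part of Sun Performance Library: the function name looks_like_lapack_auxiliary is hypothetical, and the test is only a heuristic, because some auxiliary routines do not follow the pattern. Appendix B of the LAPACK Users' Guide remains the authoritative list.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not a Sun Performance Library routine).
 * Returns true when the second and third characters of `name`
 * are "LA", ignoring case, for example "DLASWP" or "dlamch".
 * A false result does NOT prove the routine is supported:
 * some auxiliary routines do not follow this naming pattern.
 */
bool looks_like_lapack_auxiliary(const char *name)
{
    if (name == NULL || strlen(name) < 3) {
        return false;
    }
    return toupper((unsigned char)name[1]) == 'L' &&
           toupper((unsigned char)name[2]) == 'A';
}
```

With this sketch, DLASWP and DLAMCH are flagged, the driver routine DGESV is not, and a name such as XERBLA (the LAPACK error handler) is not flagged either, which illustrates why the check cannot replace the published list.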

1.1.1 Netlib

Netlib is an online repository of mathematical software, papers, and databases maintained by AT&T Bell Laboratories, the University of Tennessee, Oak Ridge National Laboratory, and professionals from around the world.

Netlib provides many libraries in addition to the libraries used in Sun Performance Library. While some of these libraries can appear similar to libraries used with Sun Performance Library, they can be different from, and incompatible with, Sun Performance Library.

Using routines from other libraries can produce compatibility problems, not only with Sun Performance Library routines, but also with the base Netlib LAPACK routines. When using routines from other libraries, refer to the documentation provided with those libraries.

For example, Netlib provides a CLAPACK library, but the CLAPACK interfaces differ from the C interfaces included with Sun Performance Library. A LAPACK 90 library package is also available on Netlib. The LAPACK 90 library contains interfaces that differ from the Sun Performance Library Fortran 95 interfaces and the Netlib LAPACK version 3.1.1 interfaces. If using LAPACK 90, refer to the documentation provided with that library.

For the base libraries supported by Sun Performance Library, Netlib provides detailed information that can supplement this user’s guide. The LAPACK Users’ Guide, Third Edition describes LAPACK algorithms and how to use the routines, but it does not describe the Sun Performance Library extensions made to the base routines.


1.2 Sun Performance Library Features

Sun Performance Library routines can increase application performance on both serial and multiprocessor (MP) platforms, because the serial speed of many Sun Performance Library routines has been increased, and many routines have been parallelized. Sun Performance Library routines also have SPARC-, AMD-, and Intel-specific optimizations that are not present in the base Netlib libraries.

Sun Performance Library provides the following optimizations and extensions to the base Netlib libraries:


1.3 Mathematical Routines

The Sun Performance Library routines are used to solve the following types of linear algebra and numerical problems:


1.4 Compatibility With Previous LAPACK Versions

The Sun Performance Library routines that are based on LAPACK support the expanded capabilities and improved algorithms in LAPACK 3.1.1, but are completely compatible with both LAPACK 1 and LAPACK 2.0. Maintaining compatibility with previous LAPACK versions:


1.5 Getting Started With Sun Performance Library

This section shows the most basic compiler options used to compile an application that uses the Sun Performance Library routines.

To use the Sun Performance Library, type one of the following commands.


my_system% f95 -dalign my_file.f -xlic_lib=sunperf

or


my_system% cc -xmemalign=8s my_file.c -xlic_lib=sunperf

or


my_system% CC my_file.cpp -xmemalign=8s -library=sunperf

Because Sun Performance Library routines are compiled with -dalign (-xmemalign=8s for C routines), the -dalign option should be used for compilation of all files if any routine in the program makes a Sun Performance Library call. On SPARC platforms, if -dalign cannot be used, enabling Trap 6, described in the section Enabling Trap 6 on SPARC Platforms, is a low-performance workaround that allows misaligned data. While there are no data alignment restrictions on x86/x64 platforms, misaligned data might require extra instructions to properly handle memory transfers, which in turn can cause poor performance.
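To make the alignment requirement concrete, the following minimal sketch in C checks whether a buffer starts on an 8-byte boundary before it is handed to a library routine. It assumes that the relevant boundary is 8 bytes, as the 8 in -xmemalign=8s suggests. The helper names is_aligned_8 and bytes_to_next_8 are hypothetical and are not part of Sun Performance Library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not Sun Performance Library routines).
 * Assumption: data passed to the library should start on an
 * 8-byte boundary.
 */

/* Returns true when `p` is aligned on an 8-byte boundary. */
bool is_aligned_8(const void *p)
{
    return ((uintptr_t)p % 8u) == 0;
}

/* Returns the number of bytes (0 to 7) from `p` to the next
 * 8-byte boundary; 0 means `p` is already aligned. */
size_t bytes_to_next_8(const void *p)
{
    return (size_t)((8u - ((uintptr_t)p % 8u)) % 8u);
}
```

A check like this is mainly useful for buffers carved out of larger byte arrays or received from other code. Memory returned by malloc is normally aligned suitably for double-precision data already.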

The -xlic_lib=sunperf option (for F95 and C) and the -library=sunperf option (for C++) include additional compiler and system libraries (for example, the Fortran run-time and micro-tasking libraries) and set run-time search paths for the resulting executable or shared library.

To summarize, use:

See "Compiling" and "Parallel Processing" for additional options that optimize application performance.

1.5.1 Enabling Trap 6 on SPARC Platforms

On SPARC platforms, if an application cannot be compiled using -dalign, enable Trap 6 to provide a handler for misaligned data. To enable Trap 6 on SPARC platforms, do the following:

1. Place this assembly code in a file called trap6_handler.s.


	.global trap6_handler_
	.text
	.align 4
trap6_handler_:
	retl
	ta    6

2. Assemble trap6_handler.s.


my_system% fbe trap6_handler.s

The first parallelizable subroutine invoked from Sun Performance Library calls a routine named trap6_handler_. If no trap6_handler_ is supplied, Sun Performance Library calls a default handler that does nothing, and any access to misaligned data then causes a fatal trap. (fbe(1) is the Solaris assembler for SPARC platforms.)

3. Include trap6_handler.o on the command line.


my_system% f95 any.f trap6_handler.o -xlic_lib=sunperf