Sun Studio 12: Sun Performance Library Readme

Updated 2007/05/31

Sun[tm] Studio 12: Sun Performance Library[tm] Readme

Contents

Introduction

About the Sun Performance Library

New and Changed Features

Software Corrections

Problems and Workarounds

Limitations and Incompatibilities

Documentation Errors

A. Introduction

This document contains information about the Sun Performance Library[tm] provided with the Sun Studio compilers and tools.

This document describes the software corrections, known problems, limitations, and incompatibilities of this release of the Sun Performance Library[tm].

Product Documentation

Release Notes: Available on the developer portal at http://developers.sun.com/sunstudio/documentation/ss12/release_notes.html. Information in the release notes updates and extends information in all readme files.

Sun Studio Documentation: Product man pages, HTML versions of readmes, and manuals can be accessed from /installation_directory/docs/index.html. The default installation directory is /opt/SUNWspro on Solaris, and /opt/sun/sunstudio12/ on Linux platforms.

IDE Documentation: Online help for all components of the Sun Studio IDE can be accessed from the Help menu in the IDE.

Developer Resources Portal: For technical articles, code samples, documentation, and a knowledge base, see the developer portal at http://developers.sun.com/sunstudio.

Note - If your Sun Studio compilers and tools have not been installed in the default /opt directory, ask your system administrator for the equivalent path on your system.

B. About the Sun Performance Library

This release of Sun Performance Library is available on the Solaris[tm] Operating System, versions 9 and 10. It is also available on several Linux operating environments.

Sun Performance Library is a set of optimized, high-speed mathematical subroutines for solving linear algebra and other numerically intensive problems. Sun Performance Library is based on a collection of public domain subroutines available from Netlib at http://www.netlib.org. Sun has enhanced these public domain subroutines and bundled them as the Sun Performance Library.

Sun Performance Library contains enhanced versions of the following standard libraries:

LAPACK version 3.0 for solving linear algebra problems.

BLAS1 (Basic Linear Algebra Subprograms) for performing vector-vector operations.

BLAS2 for performing matrix-vector operations.

BLAS3 for performing matrix-matrix operations.

Netlib Sparse-BLAS for performing sparse vector operations.

NIST Fortran Sparse BLAS version 0.5 for performing fundamental sparse matrix operations.

SuperLU version 3.0 for solving sparse linear systems of equations.

Sun Performance Library includes the following additional routines:

Fast Fourier transform (FFT) routines

Direct Sparse Solver routines

Interval BLAS routines

Compatibility

The LAPACK 3.0 routines in Sun Performance Library are compatible with the user routines from previous versions of LAPACK, including 1.x and 2.0, and with all routines in LAPACK 3.0. However, due to internal changes in LAPACK 3.0, compatibility with internal routines cannot be guaranteed. Internal routines that might be incompatible are called auxiliary routines in the LAPACK source code available from Netlib. Some information on auxiliary routines is included in the LAPACK Users' Guide, available from the Society for Industrial and Applied Mathematics (SIAM) at http://www.siam.org.

Because the user interfaces to the LAPACK auxiliary routines can change from release to release of LAPACK, the user interfaces to the LAPACK auxiliary routines in Sun Performance Library can change as well. Auxiliary routines compatible with LAPACK 3.0 are generally available for users to call; however, the auxiliary routines are not specifically documented, tested, or supported. Be aware that the user interfaces for the LAPACK auxiliary routines can change in future releases of Sun Performance Library, so that the user interfaces comply with the version of LAPACK supported by that version of the Sun Performance Library.

Documentation

The following Sun Performance Library documentation is available:

Man pages (section 3p) for each function and subroutine in the library

Interval BLAS man pages (section 3pi) for each interval BLAS routine

Sun Performance Library User's Guide describes and shows examples for:

Using Sun Performance Library routines

Using the Fortran and C interfaces

Using optimization and parallelization options

Using SPSOLVE and SuperLU sparse solver packages

Using the FFT routines

Sun Performance Library Reference Manual contains all the section 3p man pages.

For additional reference information, see the LAPACK Users' Guide 3rd ed., by Anderson, E. and others, SIAM, 1999, which is available from the Society for Industrial and Applied Mathematics (SIAM) or your local bookstore. The LAPACK Users' Guide is the official reference for the base LAPACK 3.0 routines available on Netlib and provides mathematical descriptions of the LAPACK 3.0 routines.

C. New and Changed Features

This section describes new and changed features in this release of Sun Performance Library.

For x86-based systems:

Libraries are available on 32-bit and 64-bit systems running the SuSE Linux Enterprise Server 9 or Red Hat Enterprise Linux 4 operating environment.

Routines with 64-bit integer parameters are now available. For example, both DAXPY() and DAXPY_64() are included in all versions of the Sun Performance Library.
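
As a rough illustration of what the two entry points mean, the sketch below uses two hypothetical stand-ins, ref_daxpy() and ref_daxpy_64(), written in portable C. They are not the library's routines or its exact prototypes. The sketch only assumes that the _64 variant takes 64-bit integers for the length and stride arguments where the standard variant takes 32-bit integers, and that both compute y := alpha*x + y.

```c
#include <stdint.h>

/* Hypothetical stand-in for a DAXPY-style routine whose integer
 * arguments are 64 bits wide. Computes y := alpha * x + y over n
 * elements, using the usual BLAS rule for negative strides. */
void ref_daxpy_64(int64_t n, double alpha, const double *x, int64_t incx,
                  double *y, int64_t incy)
{
    if (n <= 0 || alpha == 0.0) {
        return; /* nothing to do */
    }
    /* With a negative stride, BLAS starts at the far end of the vector. */
    int64_t ix = (incx < 0) ? (1 - n) * incx : 0;
    int64_t iy = (incy < 0) ? (1 - n) * incy : 0;
    for (int64_t i = 0; i < n; ++i) {
        y[iy] += alpha * x[ix];
        ix += incx;
        iy += incy;
    }
}

/* Hypothetical stand-in for the variant with 32-bit integer arguments.
 * It widens its arguments and forwards to the 64-bit version, so the
 * two differ only in the integer type of n, incx, and incy. */
void ref_daxpy(int32_t n, double alpha, const double *x, int32_t incx,
               double *y, int32_t incy)
{
    ref_daxpy_64(n, alpha, x, incx, y, incy);
}
```

With stand-ins like these, both variants give the same result for the same arguments. The wider integer type matters only when a length or stride exceeds the 32-bit range. For the library's actual prototypes, see the section 3p man pages.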

A serial version of the sparse solver package SuperLU is available and can be called from C drivers or through the existing Fortran-based sparse solver in the Library.

At this time, quad-precision routines (dqdoti, dqdota) are not available.

Interval BLAS routines are available for the Solaris OS and Linux OS on x86 systems that support SSE2 or later.

For SPARC Processors:

BLAS and FFT routines have been improved for the UltraSPARC IV+ and UltraSPARC IV processors.

Support for SPARC64VI CPUs is available. This version of Sun Performance Library uses the floating point multiply-add instruction to achieve the best performance possible on SPARC64VI CPUs. To link with this library, use the -xtarget=sparcfmaf flag.

A serial version of the sparse solver package SuperLU is available and can be called from C drivers or through the existing Fortran-based SPSOLVE sparse solver in the Library.

D. Software Corrections

There is no new information at this time.

E. Problems and Workarounds

Under certain conditions, calling the F95 interface of the LAPACK routine SPGVD from the 64-bit libraries (by compiling with the -m64 flag) can result in a run-time failure and core dump. The conditions are:

The calling routine is compiled with -m64
The optional integer work array is omitted
The JOBZ argument is set to 'N'

F95 interfaces of other LAPACK routines with similar argument lists are also affected. Workarounds include explicitly passing an integer work array or calling the specific F77 interface.

Complex-to-complex FFT (forward and inverse) is computed incorrectly by routines cfftf(), cfftb(), zfftf() and zfftb() when the problem size N is a power of 2 and any of the following conditions holds:

N is greater than 131072 on a SPARC platform, or
N is greater than 16384 in double precision on an x86/x64 platform, or
N is greater than 32768 in single precision on an x86/x64 platform

Routines cfftc() and zfftz(), which also compute the forward and inverse complex-to-complex FFT, are not affected.
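
The conditions above can be checked before choosing an FFT routine. The following is a minimal sketch in C, assuming a hypothetical helper fft_size_is_affected() and hypothetical enum names for the platform and precision. None of these names come from the library. Only the thresholds and the power-of-2 rule are taken from this readme.

```c
#include <stdbool.h>

/* Hypothetical labels for the cases listed in this readme. */
enum fft_platform { FFT_PLATFORM_SPARC, FFT_PLATFORM_X86 };
enum fft_precision { FFT_SINGLE, FFT_DOUBLE };

/* True when n is a nonzero power of 2. */
static bool is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Returns true when a complex-to-complex FFT of size n falls under the
 * conditions listed for cfftf(), cfftb(), zfftf() and zfftb(). */
bool fft_size_is_affected(unsigned long n, enum fft_platform platform,
                          enum fft_precision precision)
{
    if (!is_power_of_two(n)) {
        return false; /* only power-of-2 sizes are listed */
    }
    if (platform == FFT_PLATFORM_SPARC) {
        return n > 131072UL; /* same threshold for both precisions */
    }
    /* x86/x64: the threshold depends on the precision. */
    return (precision == FFT_DOUBLE) ? n > 16384UL : n > 32768UL;
}
```

A caller could use such a check to route affected sizes to cfftc() or zfftz(), which this readme lists as unaffected.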

C interfaces of LAPACK routines cgelsd(), dgelsd(), cvbrsm(), zvbrsm(), cgels(), zgels(), dsbtrd() and dstedc() do not work correctly.

For updates or patches, check the updated information at http://developers.sun.com/sunstudio/support/.

F. Limitations and Incompatibilities

This section discusses limitations and incompatibilities with systems or other software.

There is no new information at this time.

For last-minute information, see the release notes at http://developers.sun.com/sunstudio/documentation/ss12/release_notes.html.

G. Documentation Errors

There is a documentation bug that affects man pages of LAPACK routines dopmtr, dormtr, zunmtr and zupmtr.

The workspace size specified in the man pages of routines cfftc3() and zfftz3() is incorrect. The correct value is (MAX(N1,N2,N3) + 16 * N3) * 2 * NCPUS.
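
The corrected formula can be written as a small C helper. This is a sketch only: the function name fft3_workspace_size() is hypothetical, and it assumes the result counts elements of the work array and that NCPUS is the number of processors the routine will use. The readme gives the formula but not the units. The sketch does not guard against integer overflow for very large dimensions.

```c
#include <stddef.h>

/* Largest of three values. */
static size_t max3(size_t a, size_t b, size_t c)
{
    size_t m = (a > b) ? a : b;
    return (m > c) ? m : c;
}

/* Hypothetical helper: evaluates the corrected workspace formula
 * (MAX(N1,N2,N3) + 16 * N3) * 2 * NCPUS from this readme. */
size_t fft3_workspace_size(size_t n1, size_t n2, size_t n3, size_t ncpus)
{
    return (max3(n1, n2, n3) + 16 * n3) * 2 * ncpus;
}
```

For example, with N1 = 64, N2 = 32, N3 = 16 and NCPUS = 4, the formula gives (64 + 256) * 2 * 4 = 2560.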

Copyright © 2007 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms.