C H A P T E R  6

Sparse Computation

The Sun Performance Library has two software packages, SPSOVLE and SuperLU, that can be used to factor and solve sparse linear systems of equations.

Mainly written in Fortran, SPSOLVE is a collection of routines that solve symmetric, structurally symmetric, and unsymmetric coefficient matrices, using one of several ordering methods, including a user-specified ordering. In previous releases, SPSOLVE was referred to as the sparse solver package. It contains interfaces for FORTRAN 77; Fortran 95 and C interfaces are not currently provided. To use SPSOLVE routines from Fortran 95, use the FORTRAN 77 interfaces. To call SPSOLVE from C, append an underscore to the routine name (dgssin_(), dgssor_(), and so on), pass arguments by reference, and use one-based array indexing. (See Unsymmetric Sparse Matrices for an example of one-based and zero-based array indexing. For information on how to call Fortran routines from C, see the Fortran Programming Guide.)

The SuperLU package in the Sun Performance Library is the sequential version (version 3.0) of the public domain application that solves general unsymmetric sparse systems. While it is sequential, SuperLU does make use of several level 2 and level 3 BLAS routines that are parallelized. For detailed documentation of its algorithm, routines and data structures, see [5, 6, 7]. SuperLU is written in C; therefore, array indexing must be zero-based regardless of whether its routines are being called from Fortran-based SPSOLVE or a C driver program. (See SuperLU Interface for more detail and examples.)


6.1 Sparse Matrices

Sparse matrices are usually represented in formats that minimize storage requirements. By taking advantage of the sparsity and not storing zeros, considerable storage space can be saved. The storage format used by SPSOLVE and SuperLU is the compressed sparse column (CSC) format (also called the Harwell-Boeing format).

The CSC format represents a sparse matrix with two integer arrays and one floating point array. The integer arrays (colptr and rowind) specify the location of the nonzeros of the sparse matrix, and the floating point array (values) is used for the nonzero values.

The column pointer (colptr) array consists of n+1 elements where colptr(i) points to the beginning of the ith column, and colptr(i+1)-1 points to the end of the ith column. The row indices (rowind) array contains the row indices of the nonzero values. The values arrays contains the corresponding nonzero numerical values.

The following matrix data formats exist for a sparse matrix of neqns equations and nnz nonzeros:

Currently, SuperLU only supports unsymmetric matrices. The most efficient data representation often depends on the specific problem. The following sections show examples of sparse matrix data formats.

6.1.1 Symmetric Sparse Matrices

A symmetric sparse matrix is a matrix where a(i, j) = a(j, i) for all i and j. Because of this symmetry, only the lower triangular values need to be passed to the solver routines. The upper triangle can be determined from the lower triangle.

An example of a symmetric matrix is shown below. This example is derived from A. George and J. W-H. Liu. “Computer Solution of Large Sparse Positive Definite Systems.”


symmetric matrix

To represent A in CSC format:

6.1.2 Structurally Symmetric Sparse Matrices

A structurally symmetric sparse matrix has nonzero values with the property that if a(i, j) not equal 0, then a(j, i) not equal 0 for all i and j. When solving a structurally symmetric system, the entire matrix must be passed to the solver routines.

An example of a structurally symmetric matrix is shown below.


struturally symmetric matrix

To represent A in CSC format:

6.1.3 Unsymmetric Sparse Matrices

An unsymmetric sparse matrix does not have a(i, j) = a(j, i) for all i and j. The structure of the matrix does not have an apparent pattern. When solving an unsymmetric system, the entire matrix must be passed to the solver routines. An example of an unsymmetric matrix is shown below.


unsymmetric matix

To represent A in CSC format:


6.2 Sun Performance Library Sparse BLAS

The Sun Performance Library sparse BLAS package is based on the following two packages:

Refer to the following sources for additional sparse BLAS information.

The Netlib Sparse BLAS and NIST Fortran Sparse BLAS Library routines each use their own naming conventions, as described in the following sections.

 

6.2.1 Netlib Sparse BLAS

Each Netlib Sparse BLAS routine has a name of the form Prefix-Root-Suffix where the:

TABLE 6-1 lists the naming conventions for the Netlib Sparse BLAS vector routines.


TABLE 6-1 Netlib Sparse BLAS Naming Conventions

Operation

Root of Name

Prefix and Suffix

Dot product

-DOT-

S-I D-I C-UI Z-UI C-CI Z-CI

Scalar times a vector added to a vector

-AXPY-

S-I D-I C-I Z-I

Apply Givens rotation

-ROT-

S-I D-I

Gather x into y

-GTHR-

S- D- C- Z- S-Z D-Z C-Z Z-Z

Scatter x into y

-SCTR-

S- D- C- Z-


The prefix can be one of the following data types:

The I, CI, and UI suffixes denote sparse BLAS routines that are direct extensions to dense BLAS routines.

6.2.2 NIST Fortran Sparse BLAS

Each NIST Fortran Sparse BLAS routine has a six-character name of the form XYYYZZ where:

TABLE 6-2 shows the values for X, Y, and Z.


TABLE 6-2 NIST Fortran Sparse BLAS Routine Naming Conventions

X: Data Type

X

S: single precision

D: double precision

C: complex

Z: double complex

 

YYY: Sparse Storage Format

YYY

Single entry formats:

 

COO: coordinate

CSC: compressed sparse column

CSR: compressed sparse row

DIA: diagonal

ELL: ellpack

JAD: jagged diagonal

SKY: skyline

 

Block entry formats:

 

BCO: block coordinate

BSC: block compressed sparse column

BSR: block compressed sparse row

BDI: block diagonal

BEL: block ellpack

VBR: block compressed sparse row

ZZ: Operation

ZZ

MM: matrix-matrix product

SM: solution of triangular system (supported for all formats except COO)

RP: right permutation (for JAD format only)



6.3 SPSOLVE Interface

SPSOLVE computes the solution of a sparse system through a sequence of steps: Initialization, ordering to reduce fill-in, symbolic factorization, numeric factorization, and triangular solve. A user code may call individual routines or make use of a one-call interface to perform these steps.

6.3.1 SPSOLVE Routines

Listed in the table below are user-accessible routines in SPSOLVE and their functions.


TABLE 6-3 SPSOLVE Sparse Solver Routines

Routine

Function

DGSSFS() 

One call interface to sparse solver

DGSSIN() 

Sparse solver initialization

DGSSOR() 

Fill reducing ordering and symbolic factorization

DGSSUO()

Sets user-specified ordering permutation and performs symbolic factorization (called in place of DGSSOR)

DGSSFA() 

Matrix value input and numeric factorization

DGSSSL() 

Triangular solve

Utility Routine

Function

DGSSRP() 

Returns permutation used by solver.

DGSSCO() 

Returns condition number estimate of coefficient matrix.

DGSSDA() 

De-allocates sparse solver.

DGSSPS() 

Prints solver statistics.


Matrices with the same structure but with different numerical values can be solved by calling SPSOLVE routines in the order shown below:


call dgssin() ! initialization, input coefficient matrix structure
call dgssor() ! fill-reducing ordering, symbolic factorization
              ! (or call dgssuo() to specify a user ordering, 
              ! and perform symbolic factorization)
 
do m = 1, number_of_structurally_identical_matrices
   call dgssfa() ! input coefficient matrix values, numeric                  ! factorization
   do r = 1, number_of_right_hand_sides
      call dgsssl() ! triangular solve
   enddo
enddo

The one-call interface is not as flexible as the regular interface, but it covers the most common case of factoring a single matrix and solving some number right-hand sides. Additional calls to dgsssl() are used to solve for additional right-hand sides, as shown in the following example.


call dgssfs() ! initialization, input coefficient matrix structure
              ! fill-reducing ordering, symbolic factorization
              ! input coefficient matrix values, numeric factorization
              ! triangular solve
do r = 1, number_of_right_hand_sides
    call dgsssl() ! triangular solve
enddo

6.3.2 Routine Calling Order

To use SPSOLVE, its routines must be called in the order shown below:

1. One Call Interface: For solving single matrix

a. DGSSFS() - Initialize, order, factor, solve

b. DGSSSL() - Additional solves (optional): repeat DGSSSL() as needed

c. DGSSDA() - Deallocate working storage

2. Regular Interface: For solving multiple matrices with the same structure

a. DGSSIN() - Initialize

b. DGSSOR() or DGSSUO() - Order and symbolically factor

c. DGSSFA() - Factor

d. DGSSSL() - Solve: repeat DGSSFA() or DGSSSL() as needed

e. DGSSDA() - Deallocate working storage

6.3.3 SPSOLVE Examples

The following examples show solving a symmetric system using the one-call interface, and solving a symmetric system using the regular interface. In CODE EXAMPLE 6-1, the one-call interface is used to solve a symmetric system, and in CODE EXAMPLE 6-2, individual routines are called to solve a symmetric system. CODE EXAMPLE 6-5 shows how the Fortran SPSOLVE interface can be called from a C program. (For more information on how to call Fortran routines from C programs, see the Fortran Programming Guide.)


CODE EXAMPLE 6-1 Solving a Symmetric System-One-Call Interface

my_system% cat example_1call.f
      program example_1call
c
c  This program is an example driver that calls the sparse solver.
c    It factors and solves a symmetric system, by calling the
c    one-call interface.
c
      implicit none
 
      integer           neqns, ier, msglvl, outunt, ldrhs, nrhs
      character         mtxtyp*2, pivot*1, ordmthd*3
      double precision  handle(150)
      integer           colstr(6), rowind(9)
      double precision  values(9), rhs(5), xexpct(5)
      integer           i
c
c  Sparse matrix structure and value arrays.  From George and Liu,
c  page 3.
c    Ax = b, (solve for x) where:
c
c      4.0   1.0   2.0   0.5   2.0       2.0       7.0
c      1.0   0.5   0.0   0.0   0.0       2.0       3.0
c  A = 2.0   0.0   3.0   0.0   0.0   x = 1.0   b = 7.0
c      0.5   0.0   0.0   0.625 0.0      -8.0      -4.0
c      2.0   0.0   0.0   0.0  16.0      -0.5      -4.0
c
      data colstr / 1, 6, 7, 8, 9, 10 /
      data rowind / 1, 2, 3, 4, 5, 2, 3, 4, 5 /
      data values / 4.0d0, 1.0d0, 2.0d0, 0.5d0, 2.0d0, 0.5d0, 3.0d0,
     &              0.625d0, 16.0d0 /
      data rhs    / 7.0d0, 3.0d0, 7.0d0, -4.0d0, -4.0d0 /
      data xexpct / 2.0d0, 2.0d0, 1.0d0, -8.0d0, -0.5d0 /
c
c  set calling parameters
c
      mtxtyp= 'ss'
      pivot = 'n'
      neqns  = 5
      nrhs   = 1
 
      ldrhs  = 5
      outunt = 6
      msglvl = 0
      ordmthd = 'mmd'
c
c  call single call interface
c
      call dgssfs ( mtxtyp, pivot,  neqns , colstr, rowind,
     &              values, nrhs  , rhs,    ldrhs , ordmthd,
     &              outunt, msglvl, handle, ier             )
      if ( ier .ne. 0 ) goto 110
c
c  deallocate sparse solver storage
c
      call dgssda ( handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  print values of sol
c
      write(6,200) 'i', 'rhs(i)', 'expected rhs(i)', 'error'
      do i = 1, neqns
        write(6,300) i, rhs(i), xexpct(i), (rhs(i)-xexpct(i))
      enddo
      stop
  110 continue
c
c call to sparse solver returns an error
c
      write ( 6 , 400 )
     &      ' example: FAILED sparse solver error number = ', ier
      stop
 
  200 format(a5,3a20)
 
  300 format(i5,3d20.12) ! i/sol/xexpct values
 
  400 format(a60,i20) ! fail message, sparse solver error number
 
      end
 
my_system% f95 -dalign example_1call.f -xlic_lib=sunperf
my_sytem% a.out
    i              rhs(i)     expected rhs(i)               error
    1  0.200000000000D+01  0.200000000000D+01 -0.528466159722D-13
    2  0.200000000000D+01  0.200000000000D+01  0.105249142734D-12
    3  0.100000000000D+01  0.100000000000D+01  0.350830475782D-13
    4 -0.800000000000D+01 -0.800000000000D+01  0.426325641456D-13
    5 -0.500000000000D+00 -0.500000000000D+00  0.660582699652D-14
 


CODE EXAMPLE 6-2 Solving a Symmetric System-Regular Interface

my_system% cat example_ss.f
      program example_ss
c
c  This program is an example driver that calls the sparse solver.
c  It factors and solves a symmetric system.
 
      implicit none
 
      integer           neqns, ier, msglvl, outunt, ldrhs, nrhs
      character         mtxtyp*2, pivot*1, ordmthd*3
      double precision  handle(150)
      integer           colstr(6), rowind(9)
      double precision  values(9), rhs(5), xexpct(5)
      integer           i
c
c  Sparse matrix structure and value arrays.  From George and Liu, 
c  page 3.
c    Ax = b, (solve for x) where:
c
c      4.0   1.0   2.0   0.5   2.0       2.0       7.0
c      1.0   0.5   0.0   0.0   0.0       2.0       3.0
c  A = 2.0   0.0   3.0   0.0   0.0   x = 1.0   b = 7.0
c      0.5   0.0   0.0   0.625 0.0      -8.0      -4.0
c      2.0   0.0   0.0   0.0  16.0      -0.5      -4.0
c
      data colstr / 1, 6, 7, 8, 9, 10 /
      data rowind / 1, 2, 3, 4, 5, 2, 3, 4, 5 /
      data values / 4.0d0, 1.0d0, 2.0d0, 0.5d0, 2.0d0, 0.5d0, 
     &             3.0d0, 0.625d0, 16.0d0 /
      data rhs    / 7.0d0, 3.0d0, 7.0d0, -4.0d0, -4.0d0 /
      data xexpct / 2.0d0, 2.0d0, 1.0d0, -8.0d0, -0.5d0 /
 
c
c  initialize solver
c
      mtxtyp= 'ss'
      pivot = 'n'
      neqns  = 5
      outunt = 6
      msglvl = 0
c
c  call regular interface
c
      call dgssin ( mtxtyp, pivot,  neqns , colstr, rowind,
     &              outunt, msglvl, handle, ier             )  
      if ( ier .ne. 0 ) goto 110
c
c  ordering and symbolic factorization
c
      ordmthd = 'mmd'
      call dgssor ( ordmthd, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  numeric factorization
c
      call dgssfa ( neqns, colstr, rowind, values, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  solution
c
      nrhs   = 1
      ldrhs  = 5
      call dgsssl ( nrhs, rhs, ldrhs, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  deallocate sparse solver storage
c
      call dgssda ( handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  print values of sol
c
      write(6,200) 'i', 'rhs(i)', 'expected rhs(i)', 'error'
      do i = 1, neqns
        write(6,300) i, rhs(i), xexpct(i), (rhs(i)-xexpct(i))
      enddo
      stop
 
  110 continue
c
c call to sparse solver returns an error
c
      write ( 6 , 400 )
     &      ' example: FAILED sparse solver error number = ', ier
      stop
 
  200 format(a5,3a20)
 
  300 format(i5,3d20.12) ! i/sol/xexpct values
 
  400 format(a60,i20) ! fail message, sparse solver error number
      
      end
my_system% f95 -dalign example_ss.f -xlic_lib=sunperf
my_sytem% a.out
    i              rhs(i)     expected rhs(i)               error
    1  0.200000000000D+01  0.200000000000D+01 -0.528466159722D-13
    2  0.200000000000D+01  0.200000000000D+01  0.105249142734D-12
    3  0.100000000000D+01  0.100000000000D+01  0.350830475782D-13
    4 -0.800000000000D+01 -0.800000000000D+01  0.426325641456D-13
    5 -0.500000000000D+00 -0.500000000000D+00  0.660582699652D-14
 


CODE EXAMPLE 6-3 Solving a Structurally Symmetric System With Unsymmetric Values-Regular Interface

my_system% cat example_su.f
      program example_su
c
c  This program is an example driver that calls the sparse solver.
c    It factors and solves a structurally symmetric system
c    (w/unsymmetric values).
c
      implicit none
 
      integer           neqns, ier, msglvl, outunt, ldrhs, nrhs
      character         mtxtyp*2, pivot*1, ordmthd*3
      double precision  handle(150)
      integer           colstr(5), rowind(8)
      double precision  values(8), rhs(4), xexpct(4)
      integer           i
 
c
c  Sparse matrix structure and value arrays.  Coefficient matrix
c    has a symmetric structure and unsymmetric values.
c    Ax = b, (solve for x) where:
c
c      1.0   3.0   0.0   0.0       1.0        7.0
c      2.0   4.0   0.0   7.0       2.0       38.0
c  A = 0.0   0.0   6.0   0.0   x = 3.0   b = 18.0
c      0.0   5.0   0.0   8.0       4.0       42.0
c
      data colstr / 1, 3, 6, 7, 9 /
      data rowind / 1, 2, 1, 2, 4, 3, 2, 4 /
      data values / 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0, 6.0d0, 7.0d0,
     &              8.0d0 /
      data rhs    / 7.0d0, 38.0d0, 18.0d0, 42.0d0 /
      data xexpct / 1.0d0, 2.0d0, 3.0d0, 4.0d0 /
c
c  initialize solver
c
      mtxtyp= 'su'
      pivot = 'n'
      neqns  = 4
      outunt = 6
      msglvl = 0
c
c  call regular interface
c
      call dgssin ( mtxtyp, pivot,  neqns , colstr, rowind,
     &              outunt, msglvl, handle, ier             )  
      if ( ier .ne. 0 ) goto 110
c
c  ordering and symbolic factorization
c
      ordmthd = 'mmd'
      call dgssor ( ordmthd, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  numeric factorization
c
      call dgssfa ( neqns, colstr, rowind, values, handle, ier )
      if ( ier .ne. 0 ) goto 110
 
c
c  solution
c
      nrhs   = 1
      ldrhs  = 4
      call dgsssl ( nrhs, rhs, ldrhs, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  deallocate sparse solver storage
c
      call dgssda ( handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  print values of sol
c
      write(6,200) 'i', 'rhs(i)', 'expected rhs(i)', 'error'
      do i = 1, neqns
        write(6,300) i, rhs(i), xexpct(i), (rhs(i)-xexpct(i))
      enddo
      stop
  110 continue
c
c call to sparse solver returns an error
c
      write ( 6 , 400 )
     &      ' example: FAILED sparse solver error number = ', ier
      stop
 
  200 format(a5,3a20)
 
  300 format(i5,3d20.12)     ! i/sol/xexpct values
 
  400 format(a60,i20)   ! fail message, sparse solver error number
 
      end
my_system% f95 -dalign example_su.f -xlic_lib=sunperf
my_system% a.out
    i              rhs(i)     expected rhs(i)               error
    1  0.100000000000D+01  0.100000000000D+01  0.000000000000D+00
    2  0.200000000000D+01  0.200000000000D+01  0.000000000000D+00
    3  0.300000000000D+01  0.300000000000D+01  0.000000000000D+00
    4  0.400000000000D+01  0.400000000000D+01  0.000000000000D+00
 


CODE EXAMPLE 6-4 Solving an Unsymmetric System-Regular Interface

my_system% cat example_uu.f
      program example_uu
c
c  This program is an example driver that calls the sparse solver.
c    It factors and solves an unsymmetric system.
c
      implicit none
 
      integer           neqns, ier, msglvl, outunt, ldrhs, nrhs
      character         mtxtyp*2, pivot*1, ordmthd*3
      double precision  handle(150)
      integer           colstr(6), rowind(10)
      double precision  values(10), rhs(5), xexpct(5)
      integer           i
c
c  Sparse matrix structure and value arrays.  Unsummetric matrix A.
c    Ax = b, (solve for x) where:
c
c      1.0   0.0   0.0   0.0   0.0       1.0        1.0
c      2.0   6.0   0.0   0.0   9.0       2.0       59.0
c  A = 3.0   0.0   7.0   0.0   0.0   x = 3.0   b = 24.0
c      4.0   0.0   0.0   8.0   0.0       4.0       36.0
c      5.0   0.0   0.0   0.0  10.0       5.0       55.0
c
      data colstr / 1, 6, 7, 8, 9, 11 /
      data rowind / 1, 2, 3, 4, 5, 2, 3, 4, 2, 5 /
      data values / 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0, 6.0d0, 7.0d0,
     &              8.0d0, 9.0d0, 10.0d0 /
      data rhs    / 1.0d0, 59.0d0, 24.0d0, 36.0d0, 55.0d0 /
      data xexpct / 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /
c
c  initialize solver
c
      mtxtyp= 'uu'
      pivot = 'n'
      neqns  = 5
      outunt = 6
      msglvl = 3
      call dgssin ( mtxtyp, pivot,  neqns , colstr, rowind,
     &              outunt, msglvl, handle, ier             )  
      if ( ier .ne. 0 ) goto 110
 
c
c  ordering and symbolic factorization
c
      ordmthd = 'mmd'
      call dgssor ( ordmthd, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  numeric factorization
c
      call dgssfa ( neqns, colstr, rowind, values, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  solution
c
      nrhs   = 1
      ldrhs  = 5
      call dgsssl ( nrhs, rhs, ldrhs, handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  deallocate sparse solver storage
c
      call dgssda ( handle, ier )
      if ( ier .ne. 0 ) goto 110
c
c  print values of sol
c
      write(6,200) 'i', 'rhs(i)', 'expected rhs(i)', 'error'
      do i = 1, neqns
        write(6,300) i, rhs(i), xexpct(i), (rhs(i)-xexpct(i))
      enddo
      stop
  110 continue
c
c call to sparse solver returns an error
c
      write ( 6 , 400 )
     &      ' example: FAILED sparse solver error number = ', ier
      stop
 
  200 format(a5,3a20)
 
  300 format(i5,3d20.12)     ! i/sol/xexpct values
 
  400 format(a60,i20)    ! fail message, sparse solver error number
      end
 
my_system% f95 -dalign example_uu.f -xlic_lib=sunperf
my_system% a.out
  i              rhs(i)     expected rhs(i)               error
    1  0.100000000000D+01  0.100000000000D+01  0.000000000000D+00
    2  0.200000000000D+01  0.200000000000D+01  0.000000000000D+00
    3  0.300000000000D+01  0.300000000000D+01  0.000000000000D+00
    4  0.400000000000D+01  0.400000000000D+01  0.000000000000D+00
    5  0.500000000000D+01  0.500000000000D+01  0.000000000000D+00
 


CODE EXAMPLE 6-5 Calling SPSOLVE Routines From C
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <sys/time.h>
#include <sunperf.h>
 
int main() {
/*
 Sparse matrix structure and value arrays. Coefficient matrix
 is a general unsymmetric sparse matrix.
 
 Ax = b, (solve for x) where:
 
     1.0  0.0  7.0   9.0   0.0        1.0           17.0
     2.0  4.0  0.0   0.0   0.0        1.0            6.0
 A = 0.0  5.0  8.0   0.0   0.0    x = 1.0       b = 13.0
     0.0  0.0  0.0  10.0  11.0        1.0           21.0
     3.0  6.0  0.0   0.0  12.0        1.0           21.0
*/
/* Array indices must be one-based for calling SPSOLVE routines */
int colstr[]    = {1, 4, 7, 9, 11, 13};
int rowind[]    = {1, 2, 5, 2, 3, 5, 1, 3, 1, 4, 4, 5};
double values[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0,
                   7.0, 8.0, 9.0, 10.0, 11.0, 12.0};
double rhs[]    = {17.0, 6.0, 13.0, 21.0, 21.0};
double xexpct[] = {1.0, 1.0, 1.0, 1.0, 1.0};
 
    int n = 5, nnz = 12, nrhs = 1, msglvl = 0, outunt = 6, ierr,
          i,j,k, int_ierr;
    double t[4], handle[150];
    char type[] = "uu", piv = ’n’;
 
/* Last two parameters in argument list indicate lengths of
 * character arguments type and piv
 */
    dgssin_(type, &piv, &n, colstr, rowind, &outunt, &msglvl,
               handle, &ierr,2,1);
    if (ierr != 0) {
      int_ierr = ierr;
      printf("dgssin err = %d\n", int_ierr);
      return -1;
    }
 
    char ordmth[] = "mmd";
    dgssor_(ordmth, handle, &ierr, 3);
    if (ierr != 0) {
      int_ierr = ierr;
      printf("dgssor err = %d\n", int_ierr);
      return -1;
    }
 
    dgssfa_(&n, colstr, rowind, values, handle, &ierr);
    if (ierr != 0) {
      int_ierr = ierr;
      printf("dgssfa err = %d\n", int_ierr);
      return -1;
    }
 
    dgsssl_(&nrhs, rhs, &n, handle, &ierr);
    if (ierr != 0) {
      int_ierr = ierr;
      printf("dgsssl err = %d\n", int_ierr);
      return -1;
    }
    printf("i   computed solution      expected solution\n");
    for (i=0; i<n; i++)
      printf("%d         %lf            %lf\n", i,rhs[i], 1.0);
}
 
my_system%  cc -m32 -dalign dr.c -xlic_lib=sunperf
my_system%  ./a.out
i   computed solution      expected solution
0         1.000000            1.000000
1         1.000000            1.000000
2         1.000000            1.000000
3         1.000000            1.000000
4         1.000000            1.000000


6.4 SuperLU Interface

SuperLU has two driver routines, simple and expert, that can be called to completely solve a general unsymmetric sparse system in a similar manner to the one-call interface in SPSOLVE. These and other SuperLU user-callable routines are available in single precision, double precision, complex and double complex data types. Single precision names of all external routines are listed the following tables. Man pages (section 3P) are available for these routines. Also see the man page of SuperMatrix for a description of the sparse matrix data structure that is used in the application.


TABLE 6-4 SuperLU Computational Routines

Routine

Function

sgstrf

Computes factorization

sgssvx

Factorizes and solves (expert driver)

sgssv

Factorizes and solves (simple driver)

sgstrs

Computes triangular solve

sgsrfs

Improves computed solution; provides error bounds

slangs

Computes one-norm, Frobenius-norm, or infinity-norm

sgsequ

Computes row and column scalings

sgscon

Estimates reciprocal of condition number

slaqgs

Equilibrates a general sparse matrix



TABLE 6-5 SuperLU Utility Routines

Routine

Function

LUSolveTime

Returns time spent in solve stage

LUFactTime

Returns time spent in factorization stage

LUFactFlops

Returns number of floating point operations in factorization stage

LUSolveFlops

Returns number of floating point operations in solve stage

sQuerySpace

Returns information on the memory statistics

sp_ienv

Returns specified machine dependent parameter

sPrintPerf

Prints statistics collected by the computational routines

set_default_options

Sets parameters that control solver behavior to default options

StatInit

Allocates and initializes structure that stores performance statistics

StatFree

Frees structure that stores performance statistics

Destroy_Dense_Matrix

Deallocates a SuperMatrix in dense format

Destroy_SuperNode_Matrix

Deallocates a SuperMatrix in supernodal format

Destroy_CompCol_Matrix

Deallocates a SuperMatrix in compressed sparse column format

Destroy_CompCol_Permuted

Deallocates a SuperMatrix in permuted compressed sparse column format

Destroy_SuperMatrix_Store

Deallocates actual storage that stores matrix in a SuperMatrix

sCopy_CompCol_Matrix

Copies a SuperMatrix in compressed sparse column format

sCreate_CompCol_Matrix

Allocates a SuperMatrix in compressed sparse column format

sCreate_Dense_Matrix

Allocates a SuperMatrix in dense format

sCreate_CompRow_Matrix

Allocates a SuperMatrix in compressed sparse row format

sCreate_SuperNode_Matrix

Allocates a SuperMatrix in supernodal format

sp_preorder

Permutes columns of original sparse matrix

sp_sgemm

Multiplies a SuperMatrix by a dense matrix


6.4.1 Calling from C

SuperLU routines are written in C. Therefore, column- and row-related indices must be zero-based. In the following example, double precision simple driver dgssv is called to compute factors L and U and to solve for the solution matrix.
Running the above example:


CODE EXAMPLE 6-6 SuperLU Simple Driver
#include <stdio.h>
#include <sunperf.h>
 
#define M 5
#define N 5
 
int main(int argc, char *argv[])
{
    SuperMatrix A, L, U, B1, B2;
    int      perm_r[M];  /* row permutations from partial pivoting */
    int      perm_c[N];  /* column permutation vector */
    int      info, i;
    superlu_options_t options;
    SuperLUStat_t stat;
    trans_t  trans = NOTRANS;
 
    printf("Example code calling SuperLU simple driver to factor a \n");
    printf("general unsymmetric matrix and solve two right-hand-side matrices\n");
 
   /* the matrix in Harwell-Boeing format. */
    int m = M;
    int n = M;
    int nnz = 12;
    double *dp;
   /* nonzeros of A, column-wise */
    double a[] = {1.0, 2.0, 3.0,  4.0, 5.0,  6.0,
                  7.0, 8.0, 9.0, 10.0, 11.0, 12.0};
   /* row index of nonzeros */
    int asub[] = {0, 1, 4, 1, 2, 4, 0, 2, 0, 3, 3, 4};
   /* column pointers */
    int xa[]   = {0, 3, 6, 8, 10, 12};
 
    /* Create Matrix A in the format expected by SuperLU */
    dCreate_CompCol_Matrix(&A, m, n, nnz, a, asub, xa, SLU_NC, SLU_D, SLU_GE);
 
    int nrhs = 1;
    double rhs1[] = {17.0, 6.0, 13.0, 21.0, 21.0};
    double rhs2[] = {17*.3, 6*.3, 13*.3, 21*.3, 21*.3};
 
    /* right-hand side matrix B1, B2 */
    dCreate_Dense_Matrix(&B1, m, nrhs, rhs1, m, SLU_DN, SLU_D, SLU_GE);
    dCreate_Dense_Matrix(&B2, m, nrhs, rhs2, m, SLU_DN, SLU_D, SLU_GE);
 
    /* set options that control behavior of solver to default parameters */
    set_default_options(&options);
    options.ColPerm = NATURAL;
 
    /* Initialize the statistics variables. */
    StatInit(&stat);
    /* factor input matrix and solve the first right-hand-side matrix */
    dgssv(&options, &A, perm_c, perm_r, &L, &U, &B1, &stat, &info);
 
    printf("\nsolution matrix B1:\n");
    dp = (double *) (((NCformat *)B1.Store)->nzval);
    printf("    i    rhs[i]     expected\n");
    for (i=0; i<M; i++)
      printf("%5d   %7.4lf     %7.4lf\n", i, dp[i], 1.0);
    printf("Factor time  = %8.2e sec\n", stat.utime[FACT]);
    printf("Solve time   = %8.2e sec\n\n\n", stat.utime[SOLVE]);
 
    /* solve the second right-hand-side matrix */
    dgstrs(trans, &L, &U, perm_c, perm_r, &B2, &stat, &info);
 
    printf("solution matrix B2:\n");
    dp = (double *) (((NCformat *)B2.Store)->nzval);
    printf("    i    rhs[i]     expected\n");
    for (i=0; i<M; i++)
      printf("%5d   %7.4lf     %7.4lf\n", i, dp[i], 0.3);
    printf("Solve time   = %8.2e sec\n", stat.utime[SOLVE]);
 
    StatFree(&stat);
    Destroy_CompCol_Matrix(&A);
    Destroy_SuperMatrix_Store(&B1);
    Destroy_SuperMatrix_Store(&B2);
    Destroy_SuperNode_Matrix(&L);
    Destroy_CompCol_Matrix(&U);
}


my_system% cc -dalign simple.c -xlic_lib=sunperf
my_system% a.out
 
Example code calling SuperLU simple driver to factor a
general unsymmetric matrix and solve two right-hand-side matrices
 
solution matrix B1:
    i    rhs[i]     expected
    0    1.0000      1.0000
    1    1.0000      1.0000
    2    1.0000      1.0000
    3    1.0000      1.0000
    4    1.0000      1.0000
Factor time  = 5.43e-02 sec
Solve time   = 6.76e-03 sec
 
 
solution matrix B2:
    i    rhs[i]     expected
    0    0.3000      0.3000
    1    0.3000      0.3000
    2    0.3000      0.3000
    3    0.3000      0.3000
    4    0.3000      0.3000
Solve time   = 6.76e-03 sec


CODE EXAMPLE 6-7 SuperLU Expert Driver
#include <stdio.h>
#include <sunperf.h>
 
#define M    5
#define N    5
#define NRHS 1
 
int main(int argc, char *argv[])
{
    SuperMatrix A, L, U, B, X;
    int      perm_r[M];  /* row permutations from partial pivoting */
    int      perm_c[N];  /* column permutation vector */
    int      etree[N];   /* elimination tree */
    double   ferr[NRHS]; /* estimated forward error bound */
    double   berr[NRHS]; /* component-wise relative backward error */
    double   C[N], R[M]; /* column and row scale factors */
    double   rpg, rcond;
    char     equed[1];   /* Specifies the form of equilibration that was done */          
    double   *work, *dp; /* user-supplied workspace */
    int      lwork = 0;  /* 0 for workspace to be allocated by system malloc */
    int      info, i;
    superlu_options_t options;
    SuperLUStat_t stat;
    mem_usage_t    mem_usage;
 
    printf("Example code calling SuperLU expert driver\n\n");
 
    /* the matrix in Harwell-Boeing format. */
    int m = M;
    int n = M;
    int nnz = 12;
   /* nonzeros of A, column-wise */
    double a[] = {1.0, 2.0, 3.0,  4.0, 5.0,  6.0,
                  7.0, 8.0, 9.0, 10.0, 11.0, 12.0};
    /* row index of nonzeros */
    int asub[] = {0, 1, 4, 1, 2, 4, 0, 2, 0, 3, 3, 4};
    /* column pointers */
    int xa[]   = {0, 3, 6, 8, 10, 12};
    int nrhs = NRHS;
    double rhs[] = {17.0, 6.0, 13.0, 21.0, 21.0};
 
    /* Create Matrix A in the format expected by SuperLU */
    dCreate_CompCol_Matrix(&A, m, n, nnz, a, asub, xa, SLU_NC, SLU_D, SLU_GE);
 
    /* right-hand-side matrix B */
    dCreate_Dense_Matrix(&B, m, nrhs, rhs, m, SLU_DN, SLU_D, SLU_GE);
 
    /*  solution matrix X */
    dCreate_Dense_Matrix(&X, m, nrhs, rhs, m, SLU_DN, SLU_D, SLU_GE);
    set_default_options(&options);
    options.ColPerm = NATURAL;
 
    /* Initialize the statistics variables. */
    StatInit(&stat);
 
    dgssvx(&options, &A, perm_c, perm_r, etree, equed, R, C, &L, &U, work, lwork,
           &B, &X, &rpg, &rcond, ferr, berr, &mem_usage, &stat, &info);
    dp = (double *) (((NCformat *)X.Store)->nzval);
    printf("    i    rhs[i]     expected\n");
    for (i=0; i<M; i++)
      printf("%5d   %7.4lf     %7.4lf\n",
             i, dp[i], 1.0);
    printf("Factor time  = %8.2e sec\n", stat.utime[FACT]);
    printf("Solve time   = %8.2e sec\n", stat.utime[SOLVE]);
 
    StatFree(&stat);
    Destroy_CompCol_Matrix(&A);
    Destroy_SuperMatrix_Store(&B);
    Destroy_SuperNode_Matrix(&L);
    Destroy_CompCol_Matrix(&U);
}

Running the above example:


my_system% cc -dalign expert.c -xlic_lib=sunperf
my_system% a.out
Example code calling SuperLU expert driver
 
    i    rhs[i]     expected
    0    1.0000      1.0000
    1    1.0000      1.0000
    2    1.0000      1.0000
    3    1.0000      1.0000
    4    1.0000      1.0000
Factor time  = 1.25e-03 sec
Solve time   = 1.70e-04 sec

6.4.2 Calling from Fortran

The simplest way to call SuperLU from Fortran is through the SPSOLVE interface. SuperLU can be selected to solve an unsymmetric coefficient matrix through input argument MTXTYP of routine DGSSIN(), which is the initialization routine in SPSOLVE. The same argument also exists in the one-call interface routine DGSSFS(). Valid options for MTXTYP are listed in the following table. To invoke SuperLU, select ’s0’ or ’S0’ as matrix type. Since SPSOLVE is Fortran-based, all column and row indices associated with the input matrix should be one-based. However, if SuperLU is invoked through DGSSIN() or DGSSFS() (by setting MTXTYP = ’s0’ or ’S0’), these indices must be zero-based.


TABLE 6-6 Matrix Type Options for DGSSIN() and DGSSFS()

Option

Type of Matrix

Solver

’sp’ or ’SP’

symmetric structure, positive-definite values

SPSOLVE

’ss’ or ’SS’

symmetric structure, symmetric values

SPSOLVE

’su’ or ’SU’

symmetric structure, unsymmetric values

SPSOLVE

’uu’ or ’UU’

unsymmetric structure, unsymmetric values

SPSOLVE

’s0’ or ’S0’

unsymmetric structure, unsymmetric values

SuperLU


A call to routine DGSSOR() must follow DGSSIN() to perform fill-reduce ordering and symbolic factorization. A character argument (ORDMTHD) is used to select the desired ordering method. This argument also exists in the one-call interface routine DGSSFS(). Valid ordering methods for SPSOLVE and SuperLU are listed in the following table. The user may also provide a particular ordering to the solver by calling DGSSUO() in place of DGSSOR(). The input permutation array must be zero-based.


TABLE 6-7 Matrix Ordering Options for DGSSOR() and DGSSFS()

Option

Ordering Method

Solver

’nat’ or ’NAT’

natural ordering (no ordering)

SPSOLVE, SuperLU

’mmd’ or ’MMD’

minimum degree on A’*A (default)

SPSOLVE, SuperLU

’gnd’ or ’GND’

general nested dissection

SPSOLVE

’spm’ or ’SPM’

Minimum degree ordering on A’+A

SuperLU

’sam’ or ’SAM’

Approximate minimum degree column

SuperLU


As shown above, the general nested dissection method is not available in SuperLU. On the other hand, the minimum degree ordering on A’+A and approximate minimum degree column ordering are not available in SPSOLVE.

6.4.3 Examples

The following code examples show how SuperLU can be selected through the regular interface and the one-call interface of SPSOLVE to factorize and solve a general unsymmetric system of equations.

 

CODE EXAMPLE 6-8 Invoking SuperLU through SPSOLVE Interface
      program SLU
 
c  This program is an example driver that calls the regular interface of SPSOLVE
c  to invoke SuperLU to factor and solve a general unsymmetric system.
 
      implicit none
      integer           neqns, ier, msglvl, outunt, ldrhs, nrhs, i
      character         mtxtyp*2, pivot*1, ordmthd*3
      double precision  handle(150)
      integer           colstr(6), rowind(12)
      double precision  values(12), rhs(5), xexpct(5)
 
c  Sparse matrix structure and value arrays.  Coefficient matrix
c    is a general unsymmetric sparse matrix.
c    Ax = b, (solve for x) where:
 
c       1.0   0.0   7.0   9.0   0.0      1.0       17.0
c       2.0   4.0   0.0   0.0   0.0      1.0        6.0
c  A =  0.0   5.0   8.0   0.0   0.0  x = 1.0   b = 13.0
c       0.0   0.0   0.0  10.0  11.0      1.0       21.0
c       3.0   6.0   0.0   0.0  12.0      1.0       21.0
 
c  Array indices must be zero-based for calling SuperLU
      data colstr / 0, 3, 6, 8, 10, 12 /
      data rowind / 0, 1, 4, 1, 2, 4, 0, 2, 0, 3, 3, 4 /
      data values / 1.0, 2.0, 3.0,  4.0, 5.0,  6.0,
     $              7.0, 8.0, 9.0, 10.0, 11.0, 12.0 /
      data rhs    / 17.0, 6.0, 13.0, 21.0, 21.0 /
      data xexpct / 1.0d0, 1.0d0, 1.0d0, 1.0d0, 1.0d0 /
 
c  initialize solver
      mtxtyp= ’s0’
      pivot = ’n’
      neqns  = 5
      outunt = 6
      msglvl = 0
 
c  call regular interface
      call dgssin(mtxtyp, pivot,  neqns , colstr, rowind,outunt, msglvl,
     &            handle, ier)
      if ( ier .ne. 0 ) goto 110
 
c  ordering and symbolic factorization
      ordmthd = ’mmd’
      call dgssor(ordmthd, handle, ier)
      if ( ier .ne. 0 ) goto 110
 
c  numeric factorization
      call dgssfa ( neqns, colstr, rowind, values, handle, ier )
      if ( ier .ne. 0 ) goto 110
 
c  solution
      nrhs   = 1
      ldrhs  = 5
      call dgsssl ( nrhs, rhs, ldrhs, handle, ier )
      if ( ier .ne. 0 ) goto 110
 
c  deallocate sparse solver storage
      call dgssda ( handle, ier )
      if ( ier .ne. 0 ) goto 110
 
c  print values of sol
      write(6,200) ’i’, ’rhs(i)’, ’expected rhs(i)’, ’error’
      do i = 1, neqns
        write(6,300) i, rhs(i), xexpct(i), (rhs(i)-xexpct(i))
      enddo
      stop
 
  110 continue
c call to sparse solver returns an error
      write ( 6 , 400 )
     &      ’ example: FAILED sparse solver error number = ’, ier
      stop
 
  200 format(4x,a1,3x,a6,3x,a15,4x,a6)
  300 format(i5,3x,f5.2,7x,f5.2,8x,e10.2)     ! i/sol/xexpct values
  400 format(a60,i20)   ! fail message, sparse solver error number
      end

Running the above example:


my_system% f95 -dalign slu.f -xlic_lib=sunperf
my_system% a.out
 
    i   rhs(i)   expected rhs(i)     error
    1    1.00        1.00          0.00E+00
    2    1.00        1.00         -0.33E-15
    3    1.00        1.00          0.22E-15
    4    1.00        1.00         -0.11E-15
    5    1.00        1.00          0.22E-15
 


CODE EXAMPLE 6-9 Invoking SuperLU through One-Call SPSOLVE Interface
program SLU_SINGLE
c  This program is an example driver that calls the regular interface of SPSOLVE
c  to invoke SuperLU to factor and solve a general unsymmetric system.
 
      implicit none
      integer           neqns, ier, msglvl, outunt, ldrhs, nrhs, i
      character         mtxtyp*2, pivot*1, ordmthd*3
      double precision  handle(150)
      integer           colstr(6), rowind(12)
      double precision  values(12), rhs(5), xexpct(5)
 
c  Sparse matrix structure and value arrays.  Coefficient matrix
c    is a general unsymmetric sparse matrix.
c    Ax = b, (solve for x) where:
 
c       1.0   0.0   7.0   9.0   0.0      1.0       17.0
c       2.0   4.0   0.0   0.0   0.0      1.0        6.0
c  A =  0.0   5.0   8.0   0.0   0.0  x = 1.0   b = 13.0
c       0.0   0.0   0.0  10.0  11.0      1.0       21.0
c       3.0   6.0   0.0   0.0  12.0      1.0       21.0
 
c  Array indices must be zero-based for calling SuperLU
      data colstr / 0, 3, 6, 8, 10, 12 /
      data rowind / 0, 1, 4, 1, 2, 4, 0, 2, 0, 3, 3, 4 /
      data values / 1.0, 2.0, 3.0,  4.0, 5.0,  6.0,
     $              7.0, 8.0, 9.0, 10.0, 11.0, 12.0 /
      data rhs    / 17.0, 6.0, 13.0, 21.0, 21.0 /
      data xexpct / 1.0d0, 1.0d0, 1.0d0, 1.0d0, 1.0d0 /
 
c  initialize solver
      mtxtyp= ’s0’
      pivot = ’n’
      neqns  = 5
      outunt = 6
      msglvl = 0
      ordmthd = ’mmd’
      nrhs = 1
      ldrhs = 5
 
c  One-call routine of SPSOLVE
      call dgssfs (mtxtyp, pivot, neqns , colstr, rowind,
     &             values, nrhs , rhs, ldrhs , ordmthd,
     &             outunt, msglvl, handle, ier)
      if ( ier .ne. 0 ) goto 110
 
c  deallocate sparse solver storage
      call dgssda ( handle, ier )
      if ( ier .ne. 0 ) goto 110
 
c  print values of sol
      write(6,200) ’i’, ’rhs(i)’, ’expected rhs(i)’, ’error’
      do i = 1, neqns
        write(6,300) i, rhs(i), xexpct(i), (rhs(i)-xexpct(i))
      enddo
      stop
 
  110 continue
c call to sparse solver returns an error
      write ( 6 , 400 )
     &      ’ example: FAILED sparse solver error number = ’, ier
      stop
 
  200 format(4x,a1,3x,a6,3x,a15,4x,a6)
  300 format(i5,3x,f5.2,7x,f5.2,8x,e10.2)     ! i/sol/xexpct values
  400 format(a60,i20)   ! fail message, sparse solver error number
      end

Running the above example:


my_system% f95 -dalign slu_single.f -xlic_lib=sunperf
my_system% a.out
 
    i   rhs(i)   expected rhs(i)     error
    1    1.00        1.00          0.00E+00
    2    1.00        1.00         -0.33E-15
    3    1.00        1.00          0.22E-15
    4    1.00        1.00         -0.11E-15
    5    1.00        1.00          0.22E-15
 


6.5 References

The following books and papers provide additional information for the sparse BLAS and sparse solver routines.

1. D.S. Dodson, R.G. Grimes, and J.G. Lewis, Sparse Extensions to the Fortran Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software, June 1991, Vol 17, No. 2.

2. A. George and J. W-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall Inc., Englewood Cliffs, New Jersey, 1981.

3. E. Ng and B. W. Peyton, Block Sparse Cholesky Algorithms on Advanced Uniprocessor Computers, SIAM M. Sci Comput., 14:1034-1056, 1993.

4. Ian S. Duff, Roger G. Grimes and John G. Lewis, User’s Guide for the Harwell-Boeing Sparse Matrix Collection (Release I), Technical Report TR/PA/92/86, CERFACS, Lyon, France, October 1992.

5. J. W. Demmel, J. R. Gilbert, and X. S. Li, SuperLU User’s Guide, Technical report LBNL-44289.

6. X. S. Li, An Overview of SuperLU: Algorithms, Implementation, and User Interface, ACM Transactions on Mathematical Software, 2004.

7. J. W. Demmel, S. C. Eisenstat, J. R. Gilbert, X. S. Li, J. W. H. Liu, A supernodal approach to sparse partial pivoting, SIAM J. Matrix Analysis and Applications, Vol 20, No. 3, 1999, pp. 720-755.