Sun S3L 3.0 Programming and Reference Guide

Gaussian Elimination for Dense Systems

`S3L_lu_factor`

Description

For each M x N coefficient matrix A of a, S3L_lu_factor computes the LU factorization using partial pivoting with row interchanges.

The factorization has the form A = P x L x U, where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if M > N), and U is upper triangular (upper trapezoidal if M < N). L and U are stored in A.
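The storage scheme described above (L and U sharing the space of A) can be illustrated with a small serial sketch. This is not S3L code and says nothing about how S3L distributes the work; the names lu_factor_serial and piv, the row-major layout, and the return convention are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Magnitude of a double, written out to keep the sketch free of libm. */
static double mag(double v) { return v < 0.0 ? -v : v; }

/* Serial sketch (hypothetical, not part of S3L): in-place LU factorization
 * with partial pivoting of an m x n row-major matrix a.
 * On return, the strictly lower part of a holds L (unit diagonal implied),
 * the upper part holds U, and piv[k] is the row exchanged with row k.
 * Returns 0 on success, or k+1 for the first k where U[k][k] is exactly 0. */
int lu_factor_serial(double *a, size_t m, size_t n, size_t *piv)
{
    assert(a != NULL && piv != NULL);
    size_t steps = (m < n) ? m : n;
    int info = 0;

    for (size_t k = 0; k < steps; k++) {
        /* Partial pivoting: pick the largest-magnitude entry in column k. */
        size_t p = k;
        for (size_t i = k + 1; i < m; i++) {
            if (mag(a[i * n + k]) > mag(a[p * n + k]))
                p = i;
        }
        piv[k] = p;

        if (a[p * n + k] == 0.0) {
            /* Column is zero from row k down: U is singular. */
            if (info == 0)
                info = (int)(k + 1);
            continue;
        }

        /* Row interchange over the full row, so stored multipliers move too. */
        if (p != k) {
            for (size_t j = 0; j < n; j++) {
                double t = a[k * n + j];
                a[k * n + j] = a[p * n + j];
                a[p * n + j] = t;
            }
        }

        /* Store multipliers (column k of L) and update the trailing block. */
        for (size_t i = k + 1; i < m; i++) {
            a[i * n + k] /= a[k * n + k];
            for (size_t j = k + 1; j < n; j++)
                a[i * n + j] -= a[i * n + k] * a[k * n + j];
        }
    }
    return info;
}
```

For the 2 x 2 matrix with rows (1, 2) and (3, 4), this sketch exchanges the two rows and leaves U = (3, 4; 0, 2/3) with the single multiplier 1/3 stored below the diagonal.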

In general, S3L_lu_factor performs most efficiently when the array is distributed using the same block size along each axis.

S3L_lu_factor behaves somewhat differently for 3D arrays, however. In this case, it applies nodal LU factorization on each M x N coefficient matrix across the instance axis. This factorization is performed concurrently on all participating processes.

You must call S3L_lu_factor before calling any of the other LU routines. The S3L_lu_factor routine operates on the preallocated parallel array and returns a setup ID. You must supply this setup ID in subsequent LU calls, as long as you are working with the same set of factors.

Be sure to call S3L_lu_deallocate when you have finished working with a set of LU factors. See "S3L_lu_deallocate" for details.

The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.

Syntax

The C and Fortran syntax for S3L_lu_factor is shown below.

C/C++ Syntax

Example 8-23

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_factor(a, row_axis, col_axis, setup_id)
    S3L_array_t               a
    int                       row_axis
    int                       col_axis
    int                       *setup_id
    char                      *dfmt

F77/F90 Syntax

Example 8-24

include 's3l/s3l-f.h'
include 's3l/s3l_errno-f.h'
subroutine
S3L_lu_factor(a, row_axis, col_axis, setup_id, ier)
    integer*8               a
    integer*4               row_axis
    integer*4               col_axis
    integer*4               setup_id
    integer*4               ier

Input

a - Parallel array of rank greater than or equal to 2. This array contains one or more instances of a coefficient matrix A to be factored. Each A is assumed to be dense with dimensions M x N with rows counted by axis row_axis and columns counted by axis col_axis.
row_axis - Scalar integer variable. Identifies the axis of a that counts the rows of each matrix A. For C program calls, row_axis must be >= 0 and less than the rank of a; for Fortran program calls, it must be >= 1 and not exceed the rank of a. In addition, row_axis and col_axis must not be equal.
col_axis - Scalar integer variable. Identifies the axis of a that counts the columns of each matrix A. For C program calls, col_axis must be >= 0 and less than the rank of a; for Fortran program calls, it must be >= 1 and not exceed the rank of a. In addition, row_axis and col_axis must not be equal.
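The axis constraints above differ between the C and Fortran bindings only in the index base. A minimal sketch of that check follows; the helper names are hypothetical and not part of S3L, and the sketch only mirrors the rules stated in this section.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of S3L: checks that both axes lie in
 * [base, base + rank - 1] and are distinct, for an array of rank >= 2. */
static bool lu_axes_valid(int rank, int row_axis, int col_axis, int base)
{
    if (rank < 2)
        return false;
    if (row_axis < base || row_axis >= base + rank)
        return false;
    if (col_axis < base || col_axis >= base + rank)
        return false;
    return row_axis != col_axis;
}

/* C callers number axes from 0. */
bool lu_axes_valid_c(int rank, int row_axis, int col_axis)
{
    return lu_axes_valid(rank, row_axis, col_axis, 0);
}

/* Fortran callers number axes from 1. */
bool lu_axes_valid_f(int rank, int row_axis, int col_axis)
{
    return lu_axes_valid(rank, row_axis, col_axis, 1);
}
```

For a rank-2 array, the pair (0, 1) is valid under the C rule and the pair (1, 2) under the Fortran rule.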

Output

This function uses the following arguments for output:

a - Upon successful completion, each matrix instance A is overwritten with data giving the corresponding LU factors.

setup_id - Scalar integer variable returned by S3L_lu_factor. It can be used when calling other LU routines to reference the LU-factored array.

ier (Fortran only) - When called from a Fortran program, this function returns error status in ier.

Error Handling

On success, S3L_lu_factor returns S3L_SUCCESS.

S3L_lu_factor performs generic checking of the validity of the arrays it accepts as arguments. If an array argument contains an invalid or corrupted value, the function terminates and returns an error code indicating which value was invalid. See Appendix A of this manual for a detailed list of these error codes.

The following conditions will cause the function to terminate and return the associated error code:

S3L_ERR_ARG_RANK - Invalid rank; must be >= 2.

S3L_ERR_ARG_BLKSIZE - Invalid blocksize; must be >= 1.

S3L_ERR_ARG_DTYPE - Invalid data type. It must be real or complex (single- or double-precision).

S3L_ERR_ARG_NULL - Invalid array. a must be preallocated.

S3L_ERR_ARG_AXISNUM - row_axis or col_axis is invalid. This condition can be caused by either an out-of-range axis number (see row_axis and col_axis argument definitions) or row_axis equal to col_axis.

S3L_ERR_FACTOR_SING - A singular factor U is returned. If it is used by S3L_lu_solve, division by zero will occur.

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_deallocate(3)
S3L_lu_invert(3)
S3L_lu_solve(3)

`S3L_lu_invert`

Description

S3L_lu_invert uses the LU factorization generated by S3L_lu_factor to compute the inverse of each square (M x M) matrix instance A of the parallel array a. This is done by inverting U and then solving the system A^-1L = U^-1 for A^-1, where A^-1 and U^-1 denote the inverse of A and U, respectively.
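The relation between this system and the factorization A = P x L x U described under S3L_lu_factor can be written out as follows. This is a sketch of the underlying algebra only. X is a symbol introduced here for illustration, and nothing in this section says how S3L applies the permutation internally.

```latex
\begin{aligned}
A      &= P L U \\
A^{-1} &= U^{-1} L^{-1} P^{-1} = U^{-1} L^{-1} P^{T} \\
X      &:= A^{-1} P \;\Longrightarrow\; X L = U^{-1} \\
A^{-1} &= X P^{T}
\end{aligned}
```

When P is the identity (no row interchanges), X is A^-1 itself and the third line reduces to the system A^-1L = U^-1 given above. Otherwise, multiplying X by P^T amounts to interchanging columns of X.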

In general, S3L_lu_invert performs most efficiently when the array is distributed using the same block size along each axis.

For arrays with rank > 2, the nodal inversion is applied on each of the 2D slices of a across the instance axis and is performed concurrently on all participating processes.

The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.

Syntax

The C and Fortran syntax for S3L_lu_invert is shown below.

C/C++ Syntax

Example 8-25

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_invert(a, setup_id)
    S3L_array_t               a
    int                       setup_id

F77/F90 Syntax

Example 8-26

include 's3l/s3l-f.h'
include 's3l/s3l_errno-f.h'
subroutine
S3L_lu_invert(a, setup_id, ier)
    integer*8               a
    integer*4               setup_id
    integer*4               ier

Input

a - Parallel array that was factored by S3L_lu_factor, where each matrix instance A is a dense M x M square matrix. Supply the same value a that was used in S3L_lu_factor.
setup_id - Scalar integer variable. Use the value returned by the corresponding S3L_lu_factor call for this argument.

Output

This function uses the following arguments for output:

a - Upon successful completion, each matrix instance A is overwritten with data giving the corresponding inverse.

ier (Fortran only) - When called from a Fortran program, this function returns error status in ier.

Error Handling

On success, S3L_lu_invert returns S3L_SUCCESS.

S3L_lu_invert performs generic checking of the validity of the arrays it accepts as arguments. If an array argument contains an invalid or corrupted value, the function terminates and returns an error code indicating which value was invalid. See Appendix A of this manual for a detailed list of these error codes.

The following conditions will cause the function to terminate and return the associated error code:

S3L_ERR_ARG_NULL - Invalid array; a must be the same array that was supplied to S3L_lu_factor.

S3L_ERR_ARG_SETUP - Invalid setup_id.

S3L_ERR_FACTOR_SING - a contains singular factors; its inverse could not be computed.

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_deallocate(3)
S3L_lu_factor(3)
S3L_lu_solve(3)

`S3L_lu_solve`

Description

For each square coefficient matrix A of a, S3L_lu_solve solves a system of distributed linear equations AX = B, with a general M x M square matrix instance A, using the LU factorization computed by S3L_lu_factor.

Note -

Throughout these descriptions, L^-1 and U^-1 denote the inverse of L and U, respectively.

A and B are corresponding instances within a and b, respectively. To solve AX = B, S3L_lu_solve performs forward elimination:

Let UX = C
A = LU implies that AX = B is equivalent to C = L^-1B

followed by back substitution:

X = U^-1C = U^-1(L^-1B)

To obtain this solution, the S3L_lu_solve routine performs the following steps:

Applies L^-1 to B.

Applies U^-1 to L^-1B.

Upon successful completion, each B is overwritten with the solution to AX = B.
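The two steps above can be sketched serially for a single right-hand side vector. This is not S3L code; the name lu_solve_serial, the pivot array piv, and the packed row-major layout (L below the diagonal with an implied unit diagonal, U on and above it) are assumptions made for illustration. Because the factorization uses row interchanges, the sketch first reorders b to match before applying L^-1 and U^-1.

```c
#include <assert.h>
#include <stddef.h>

/* Serial sketch (hypothetical, not part of S3L): solves A x = b in place,
 * given the packed LU factors of an n x n row-major matrix and the pivot
 * rows recorded during factorization (piv[k] = row exchanged with row k).
 * Assumes U has no zero on its diagonal. On return, b holds x. */
void lu_solve_serial(const double *lu, const size_t *piv, size_t n, double *b)
{
    assert(lu != NULL && piv != NULL && b != NULL);

    /* Apply the recorded row interchanges to b, in factorization order. */
    for (size_t k = 0; k < n; k++) {
        size_t p = piv[k];
        if (p != k) {
            double t = b[k];
            b[k] = b[p];
            b[p] = t;
        }
    }

    /* Forward elimination: C = L^-1 B (L has a unit diagonal). */
    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; j < i; j++)
            b[i] -= lu[i * n + j] * b[j];
    }

    /* Back substitution: X = U^-1 C. */
    for (size_t i = n; i-- > 0; ) {
        for (size_t j = i + 1; j < n; j++)
            b[i] -= lu[i * n + j] * b[j];
        b[i] /= lu[i * n + i];
    }
}
```

With the packed factors (3, 4; 1/3, 2/3) and pivots (1, 1), which correspond to the matrix with rows (1, 2) and (3, 4), the right-hand side (5, 11) is overwritten with the solution (1, 2).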

In general, S3L_lu_solve performs most efficiently when the array is distributed using the same block size along each axis.

S3L_lu_solve behaves somewhat differently for 3D arrays, however. In this case, the nodal solve is applied on each of the 2D systems AX=B across the instance axis of a and is performed concurrently on all participating processes.

The input parallel arrays a and b must be distinct.

The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.

Syntax

The C and Fortran syntax for S3L_lu_solve is shown below.

C/C++ Syntax

Example 8-27

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_solve(b, a, setup_id)
    S3L_array_t               b
    S3L_array_t               a
    int                       setup_id

F77/F90 Syntax

Example 8-28

include 's3l/s3l-f.h'
include 's3l/s3l_errno-f.h'
subroutine
S3L_lu_solve(b, a, setup_id, ier)
    integer*8               b
    integer*8               a
    integer*4               setup_id
    integer*4               ier

Input

b - Parallel array of the same type (real or complex) and precision as a. Must be distinct from a. The instance axes of b must match those of a in order of declaration and extents. The rows and columns of each B must be counted by axes row_axis and col_axis, respectively (from the S3L_lu_factor call). For the two-dimensional case, if b consists of only one right-hand side vector, you can represent b as a vector (an array of rank 1) or as an array of rank 2 with the number of columns set to 1 and the elements counted by axis row_axis.
a - Parallel array that was factored by S3L_lu_factor, where each matrix instance A is a dense M x M square matrix. Supply the same value a that was used in S3L_lu_factor.
setup_id - Scalar integer variable. Use the value returned by the corresponding S3L_lu_factor call for this argument.

Output

This function uses the following arguments for output:

b - Upon successful completion, each matrix instance B is overwritten with the solution to AX = B.

ier (Fortran only) - When called from a Fortran program, this function returns error status in ier.

Error Handling

On success, S3L_lu_solve returns S3L_SUCCESS.

S3L_lu_solve performs generic checking of the validity of the arrays it accepts as arguments. If an array argument contains an invalid or corrupted value, the function terminates and returns an error code indicating which value was invalid. See Appendix A of this manual for a detailed list of these error codes.

The following conditions will cause the function to terminate and return the associated error code:

S3L_ERR_ARG_NULL - Invalid array. b must be preallocated, and a must be the same array that was supplied to S3L_lu_factor.

S3L_ERR_ARG_RANK - Invalid rank. For cases where rank >= 3, rank(b) must equal rank(a). For the two-dimensional case, rank(b) must be either 1 or 2.

S3L_ERR_ARG_DTYPE - Invalid data type; must be real or complex (single- or double-precision).

S3L_ERR_ARG_BLKSIZE - Invalid block size; must be >= 1.

S3L_ERR_MATCH_EXTENTS - Extents of a and b are mismatched along the row or instance axis.

S3L_ERR_MATCH_DTYPE - Unmatched data type between a and b.

S3L_ERR_ARRNOTSQ - Invalid matrix size; each coefficient matrix must be square.

S3L_ERR_ARG_SETUP - Invalid setup_id value. It does not match the value returned by S3L_lu_factor.

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_deallocate(3)
S3L_lu_factor(3)
S3L_lu_invert(3)

`S3L_lu_deallocate`

Description

S3L_lu_deallocate invalidates the specified setup ID and deallocates the memory that was set aside by the S3L_lu_factor call associated with that ID. Attempts to use a deallocated setup ID will result in errors.

When you finish working with a set of factors, be sure to use S3L_lu_deallocate to free up the associated memory. Repeated calls to S3L_lu_factor without deallocation can cause you to run out of memory.

Syntax

The C and Fortran syntax for S3L_lu_deallocate is shown below.

C/C++ Syntax

Example 8-29

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_deallocate(setup_id)
    int                setup_id

F77/F90 Syntax

Example 8-30

include 's3l/s3l-f.h'
include 's3l/s3l_errno-f.h'
subroutine
S3L_lu_deallocate(setup_id, ier)
    integer*4          setup_id
    integer*4          ier

Input

setup_id - Scalar integer variable. Use the value returned by the corresponding S3L_lu_factor call for this argument.

Output

This function uses the following argument for output:

ier (Fortran only) - When called from a Fortran program, this function returns error status in ier.

Error Handling

On success, S3L_lu_deallocate returns S3L_SUCCESS.

The following condition will cause the function to terminate and return the associated error code:

S3L_ERR_ARG_SETUP - Invalid setup_id value. It does not match the value returned by S3L_lu_factor.

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_factor(3)
S3L_lu_solve(3)
S3L_lu_invert(3)