Sun S3L 3.0 Programming and Reference Guide

Gaussian Elimination for Dense Systems

S3l_lu_factor

Description

For each M x N coefficient matrix A of a, S3L_lu_factor computes the LU factorization using partial pivoting with row interchanges.

The factorization has the form A = P x L x U, where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if M > N), and U is upper triangular (upper trapezoidal if M < N). L and U are stored in A.

In general, S3L_lu_factor performs most efficiently when the array is distributed using the same block size along each axis.

S3L_lu_factor behaves somewhat differently for 3D arrays, however. In this case, it applies nodal LU factorization on each M x N coefficient matrix across the instance axis. This factorization is performed concurrently on all participating processes.

You must call S3L_lu_factor before calling any of the other LU routines. The S3L_lu_factor routine performs on the preallocated parallel array and returns a setup ID. You must supply this setup ID in subsequent LU calls, as long as you are working with the same set of factors.

Be sure to call S3L_deallocate_lu when you have finished working with a set of LU factors. See "S3l_lu_deallocate " for details.

The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.

Syntax

The C and Fortran syntax for S3L_lu_factor are shown below.

C/C++ Syntax


Example 8-23

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_factor(a, row_axis, col_asix, setup_id)
    S3L_array_t               A
    int                       row_axis
    int                       col_axis
    int                       *setup_id
    S3L_data_type             type
    char                      *fname
    char                      *dfmt

F77/F90 Syntax


Example 8-24

include `s3l/s3l-f.h'
include `s3l/s3l_errno-f.h'
subroutine
S3L_lu_factor(a, row_axis, col_asix, setup_id, ier)
    integer*8               a
    integer*4               row_axis
    integer*4               col_axis
    integer*4               setup_id
    integer*4               ier

Input

Output

This function uses the following arguments for output:

Error Handling

On success, S3L_lu_factor returns S3L_SUCCESS.

S3L_lu_factor performs generic checking of the validity of the arrays it accepts as arguments. If an array argument contains an invalid or corrupted value, the function terminates and returns an error code indicating which value was invalid. See Appendix A of this manual for a detailed list of these error codes.

The following conditions will cause the function to terminate and return the associated error code:

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_deallocate(3)
S3L_lu_invert(3)
S3L_lu_solve(3)

S3l_lu_invert

Description

S3L_lu_invert uses the LU factorization generated by S3L_lu_factor to compute the inverse of each square (M x M) matrix instance A of the parallel array a. This is done by inverting U and then solving the system A-1L = U-1 for A-1, where A-1 and U-1 denote the inverse of A and U, respectively.

In general, S3L_lu_invert performs most efficiently when the array  is distributed using the same block size along each axis.

For arrays with rank > 2, the nodal inversion is applied on each of the 2D slices of a across the instance axis and is performed concurrently on all participating processes.

The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.

Syntax

The C and Fortran syntax for S3L_lu_invert are shown below.

C/C++ Syntax


Example 8-25

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_invert(a, setup_id)
    S3L_array_t               a
    int                       setup_id

F77/F90 Syntax


Example 8-26

include `s3l/s3l-f.h'
include `s3l/s3l_errno-f.h'
subroutine
S3L_lu_invert(a, setup_id, ier)
    integer*8               a
    integer*4               setup_id
    integer*4               ier

Input

Output

This function uses the following arguments for output:

Error Handling

On success, S3L_lu_invert returns S3L_SUCCESS.

S3L_lu_invert performs generic checking of the validity of the arrays it accepts as arguments. If an array argument contains an invalid or corrupted value, the function terminates and returns an error code indicating which value was invalid. See Appendix A of this manual for a detailed list of these error codes.

The following conditions will cause the function to terminate and return the associated error code:

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_factor(3)
S3L_lu_invert(3)
S3L_lu_solve(3)

S3l_lu_solve

Description

For each square coefficient matrix A of a, S3L_lu_solve solves a system of distributed linear equations AX = B, with a general M x M square matrix instance A, using the LU factorization computed by S3L_lu_factor.


Note -

Throughout these descriptions, L-1 and U-1 denote the inverse of L and U, respectively.


A and B are corresponding instances within a and b, respectively. To solve AX = B, S3L_lu_solve performs forward elimination:

Let UX = C
A = LU implies that AX = B is equivalent to C = L-1B

followed by back substitution:

X = U-1C = U-1(L-1B)

To obtain this solution, the S3L_lu_solve routine performs the following steps:

  1. Applies L-1 to B.

  2. Applies U-1 to L-1B.

Upon successful completion, each B is overwritten with the solution to AX = B.

In general, S3L_lu_solve performs most efficiently when the array is distributed using the same block size along each axis.

S3L_lu_solve behaves somewhat differently for 3D arrays, however. In this case, the nodal solve is applied on each of the 2D systems AX=B across the instance axis of a and is performed concurrently on all participating processes.

The input parallel arrays a and b must be distinct.

The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.

Syntax

The C and Fortran syntax for S3L_lu_solve are shown below.

C/C++ Syntax


Example 8-27

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_solve(b, a, setup_id)
    S3L_array_t               b
    S3L_array_t               a
    int                       setup_id

F77/F90 Syntax


Example 8-28

include `s3l/s3l-f.h'
include `s3l/s3l_errno-f.h'
subroutine
S3L_lu_solve(b, a, setup_id, ier)
    integer*8               b
    integer*8               a
    integer*4               setup_id
    integer*4               ier

Input

Output

This function uses the following arguments for output:

Error Handling

On success, S3L_lu_solve returns S3L_SUCCESS.

S3L_lu_solve performs generic checking of the validity of the arrays it accepts as arguments. If an array argument contains an invalid or corrupted value, the function terminates and returns an error code indicating which value was invalid. See Appendix A of this manual for a detailed list of these error codes.

The following conditions will cause the function to terminate and return the associated error code:

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_deallocate(3)
S3L_lu_factor(3)
S3L_lu_invert(3)

S3l_lu_deallocate

Description

S3L_lu_deallocate invalidates the specified setup ID, which deallocates the memory that has been set aside for the S3L_lu_factor routine associated with that ID. Attempts to use a deallocated setup ID will result in errors.

When you finish working with a set of factors, be sure to use S3L_lu_deallocate to free up the associated memory. Repeated calls to S3L_lu_factor without deallocation can cause you to run out of memory.

Syntax

The C and Fortran syntax for S3L_lu_deallocate are shown below.

C/C++ Syntax


Example 8-29

#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_lu_deallocate(setup_id)
    int                setup_id

F77/F90 Syntax


Example 8-30

include `s3l/s3l-f.h'
include `s3l/s3l_errno-f.h'
subroutine
S3L_lu_deallocate(setup_id,
ier)
    integer*4          setup_id
    integer*4          ier

Input

Output

This function uses the following argument for output:

Error Handling

On success, S3L_lu_deallocate returns S3L_SUCCESS.

The following condition will cause the function to terminate and return the associated error code.

Examples

../examples/s3l/lu/lu.c
../examples/s3l/lu/ex_lu1.c
../examples/s3l/lu/ex_lu2.c
../examples/s3l/lu-f/lu.f
../examples/s3l/lu-f/ex_lu1.f

Related Functions

S3L_lu_factor(3)
S3L_lu_solve(3)
S3L_lu_invert(3)