For each general M x M square matrix instance A of a, S3L_lu_solve solves the distributed system of linear equations AX = B, using the LU factorization computed by S3L_lu_factor.
Throughout these descriptions, $L^{-1}$ and $U^{-1}$ denote the inverses of L and U, respectively.
A and B are corresponding instances within a and b, respectively. To solve AX = B, S3L_lu_solve performs forward elimination:
Let $UX = C$. Then $A = LU$ implies that $AX = B$ is equivalent to
$$C = L^{-1}B$$
followed by back substitution:
$$X = U^{-1}C = U^{-1}(L^{-1}B)$$
To obtain this solution, the S3L_lu_solve routine first applies $L^{-1}$ to B and then applies $U^{-1}$ to the result.
Upon successful completion, each B is overwritten with the solution to AX = B.
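The two-phase solve can be illustrated for a single instance with a short serial sketch in Python. It is not the S3L implementation and does not call the S3L API: the function name lu_solve_in_place is hypothetical, the factors are plain nested lists rather than distributed arrays, L is assumed to be unit lower triangular, and no pivoting is modeled. It shows only how forward elimination and back substitution overwrite B with X.

```python
def lu_solve_in_place(L, U, B):
    """Overwrite B (M x nrhs) with the solution X of (L U) X = B.

    Assumptions for this sketch: L is unit lower triangular, U is upper
    triangular with a nonzero diagonal, and no row pivoting is involved.
    """
    m = len(L)
    nrhs = len(B[0])

    # Forward elimination: B becomes C = L^-1 B.
    for i in range(m):
        for k in range(i):
            for j in range(nrhs):
                B[i][j] -= L[i][k] * B[k][j]

    # Back substitution: B becomes X = U^-1 C.
    for i in range(m - 1, -1, -1):
        for k in range(i + 1, m):
            for j in range(nrhs):
                B[i][j] -= U[i][k] * B[k][j]
        for j in range(nrhs):
            B[i][j] /= U[i][i]
    return B


# Hypothetical 3 x 3 factors, so that A = L U.
L = [[1.0, 0.0, 0.0],
     [2.0, 1.0, 0.0],
     [-1.0, 3.0, 1.0]]
U = [[2.0, 1.0, 1.0],
     [0.0, 1.0, -1.0],
     [0.0, 0.0, 4.0]]
A = [[sum(L[i][k] * U[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]

# Build B = A X for a known X, then recover X in place.
X_true = [[1.0], [2.0], [3.0]]
B = [[sum(A[i][k] * X_true[k][0] for k in range(3))] for i in range(3)]
lu_solve_in_place(L, U, B)
print(B)
```

After the call, B holds the solution and the original right-hand side is no longer available, which mirrors the overwrite behavior described above.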
In general, S3L_lu_solve performs most efficiently when the array is distributed using the same block size along each axis.
S3L_lu_solve behaves somewhat differently for 3D arrays, however. In this case, the nodal solve is applied to each of the 2D systems AX = B across the instance axis of a, and the solves are performed concurrently on all participating processes.
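The 3D case can be pictured as the same two-phase solve repeated once per instance. The Python sketch below is an illustration under stated assumptions, not the library's method: solve_instance and solve_all_instances are hypothetical names, each instance is a plain pair of nested-list factors (L unit lower triangular, no pivoting), and the loop runs serially, whereas the text describes the instance solves as running concurrently across processes.

```python
def solve_instance(L, U, B):
    """Overwrite one right-hand side B with the X that solves (L U) X = B."""
    m = len(L)
    nrhs = len(B[0])
    # Forward elimination with unit lower triangular L.
    for i in range(m):
        for k in range(i):
            for j in range(nrhs):
                B[i][j] -= L[i][k] * B[k][j]
    # Back substitution with upper triangular U.
    for i in range(m - 1, -1, -1):
        for k in range(i + 1, m):
            for j in range(nrhs):
                B[i][j] -= U[i][k] * B[k][j]
        for j in range(nrhs):
            B[i][j] /= U[i][i]
    return B


def solve_all_instances(factors, rhs):
    """Apply the 2D solve to every (L, U) / B pair along the instance axis.

    The instances are independent, so this serial loop stands in for
    solves that could run at the same time.
    """
    for (L, U), B in zip(factors, rhs):
        solve_instance(L, U, B)
    return rhs


# Two hypothetical 2 x 2 instances stacked along an instance axis.
factors = [
    ([[1.0, 0.0], [0.5, 1.0]], [[4.0, 2.0], [0.0, 3.0]]),
    ([[1.0, 0.0], [-2.0, 1.0]], [[1.0, 3.0], [0.0, 2.0]]),
]
# Right-hand sides built from the known solutions [1, -1] and [2, 0.5].
rhs = [
    [[2.0], [-2.0]],
    [[3.5], [-6.0]],
]
solve_all_instances(factors, rhs)
print(rhs)
```

Because no instance reads another instance's data, the iterations can be distributed freely, which is what makes a concurrent per-instance solve possible.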
The input parallel arrays a and b must be distinct.
The internal variable setup_id is required for communicating information between the factorization routine and the other LU routines. The application must not modify the contents of this variable.