Sun S3L provides six outer product routines which compute one or more instances of an outer product of two vectors. For each instance, the outer product routines perform the operations listed in Table 8-11.
In these descriptions, yT and yH denote y transpose and y Hermitian (conjugate transpose), respectively.
Routine | Operation | Data Type |
---|---|---|
S3L_outer_prod | A = A + xyT | real or complex |
S3L_outer_prod_noadd | A = xyT | real or complex |
S3L_outer_prod_addto | A = B + xyT | real or complex |
S3L_outer_prod_c2 | A = A + xyH | complex only |
S3L_outer_prod_c2_noadd | A = xyH | complex only |
S3L_outer_prod_c2_addto | A = B + xyH | complex only |
In elementwise notation, for each instance S3L_outer_prod computes
A(i,j) = A(i,j) + x(i) * y(j)
and S3L_outer_prod_c2 computes
A(i,j) = A(i,j) + x(i) * conj[y(j)]
where conj[y(j)] denotes the conjugate of y(j).
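As a minimal sketch of the elementwise definitions above, the following plain C loops compute a single instance of each operation on ordinary in-memory arrays. This is not the S3L implementation and involves no S3L array handles: the function names outer_prod_ref and outer_prod_c2_ref are hypothetical, and the row-major storage of A is an assumption made only for illustration.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference loop: A(i,j) = A(i,j) + x(i) * y(j).
   A is an m-by-n matrix stored row-major (an assumption of this sketch). */
void outer_prod_ref(double complex *A, const double complex *x,
                    const double complex *y, size_t m, size_t n)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            A[i * n + j] += x[i] * y[j];
}

/* Hypothetical reference loop: A(i,j) = A(i,j) + x(i) * conj[y(j)]. */
void outer_prod_c2_ref(double complex *A, const double complex *x,
                       const double complex *y, size_t m, size_t n)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            A[i * n + j] += x[i] * conj(y[j]);
}
```

The two loops differ only in the conj() applied to y(j), which is why the c2 variants are meaningful for complex data only.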
The C and Fortran syntax for the S3L_outer_prod routines is shown below.
#include <s3l/s3l-c.h>
#include <s3l/s3l_errno-c.h>
int
S3L_outer_prod(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis)
S3L_outer_prod_noadd(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis)
S3L_outer_prod_addto(A, x, y, B, row_axis, col_axis, x_vector_axis, y_vector_axis)
S3L_outer_prod_c2(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis)
S3L_outer_prod_c2_noadd(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis)
S3L_outer_prod_c2_addto(A, x, y, B, row_axis, col_axis, x_vector_axis, y_vector_axis)
S3L_array_t A
S3L_array_t x
S3L_array_t y
S3L_array_t B
int row_axis
int col_axis
int x_vector_axis
int y_vector_axis

include 's3l/s3l-f.h'
include 's3l/s3l_errno-f.h'
subroutine
S3L_outer_prod(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis, ier)
S3L_outer_prod_noadd(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis, ier)
S3L_outer_prod_addto(A, x, y, B, row_axis, col_axis, x_vector_axis, y_vector_axis, ier)
S3L_outer_prod_c2(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis, ier)
S3L_outer_prod_c2_noadd(A, x, y, row_axis, col_axis, x_vector_axis, y_vector_axis, ier)
S3L_outer_prod_c2_addto(A, x, y, B, row_axis, col_axis, x_vector_axis, y_vector_axis, ier)
S3L_array_t A
S3L_array_t x
S3L_array_t y
S3L_array_t B
int row_axis
int col_axis
int x_vector_axis
int y_vector_axis
int ier
A - Array handle for an S3L parallel array of rank greater than or equal to 2. Two S3L outer product routines, S3L_outer_prod and S3L_outer_prod_c2, add the outer product of x and y to the existing contents of this array. All outer product routines use A as the destination array, as described in the Output section.
x - Array handle for an S3L parallel array of rank one less than that of A. It contains one or more instances of the first source vector, x, embedded along axis x_vector_axis.
Axis x_vector_axis of x must have the same length as axis row_axis of A. The remaining axes of x must match the instance axes of A in length and order of declaration. Thus, each vector in x corresponds to a vector in A.
y - Array handle for an S3L parallel array of rank one less than that of A. It contains one or more instances of the second source vector, y, embedded along axis y_vector_axis.
Axis y_vector_axis of y must have the same length as axis col_axis of A. The remaining axes of y must match the instance axes of A in length and order of declaration. Thus, each vector in y corresponds to a matrix in A.
Note: The argument y can be identical to the argument x.
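The shape rules for x and y can be illustrated with a small conformance check. The sketch below is an assumption-laden illustration, not part of S3L: the function vector_conforms and its parameters are hypothetical, it works on plain arrays of axis lengths rather than on S3L array handles, and it uses zero-based (C-style) axis numbers. It tests that the vector array has rank one less than A, that its vector axis matches the chosen matrix axis of A (row_axis for x, col_axis for y), and that its remaining axes match the instance axes of A in length and order.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if a vector array with extents v_ext
   conforms to matrix array A along mat_axis, and 0 otherwise.
   Pass mat_axis = row_axis when checking x, col_axis when checking y.
   All axis numbers are zero-based. */
int vector_conforms(const size_t *a_ext, int a_rank,
                    int row_axis, int col_axis, int mat_axis,
                    const size_t *v_ext, int v_rank, int v_axis)
{
    if (a_rank < 2 || v_rank != a_rank - 1)
        return 0;                       /* rank must be one less than A */
    if (row_axis < 0 || row_axis >= a_rank ||
        col_axis < 0 || col_axis >= a_rank || row_axis == col_axis)
        return 0;                       /* bad matrix axes */
    if (mat_axis != row_axis && mat_axis != col_axis)
        return 0;
    if (v_axis < 0 || v_axis >= v_rank)
        return 0;                       /* bad vector axis */
    if (v_ext[v_axis] != a_ext[mat_axis])
        return 0;                       /* vector length mismatch */

    /* Walk the remaining axes of the vector array and the instance
       axes of A (those that are neither row_axis nor col_axis) in
       declaration order; the lengths must agree pairwise. */
    int ai = 0;
    for (int vi = 0; vi < v_rank; vi++) {
        if (vi == v_axis)
            continue;
        while (ai < a_rank && (ai == row_axis || ai == col_axis))
            ai++;
        if (ai >= a_rank || v_ext[vi] != a_ext[ai])
            return 0;
        ai++;
    }
    return 1;
}
```

For example, with A of extents 4 x 3 x 5, row_axis 0, and col_axis 1, there are five matrix instances along axis 2. An x of extents 4 x 5 with x_vector_axis 0 conforms, as does a y of extents 5 x 3 with y_vector_axis 1.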
B - Parallel array of the same shape as A. It contains one or more embedded matrices B defined by axes row_axis (which counts the rows) and col_axis (which counts the columns). The remaining axes must match the instance axes of A in length and order of declaration. Thus, each matrix in B corresponds to a matrix in A.
This argument is used only in the S3L_outer_prod_addto and S3L_outer_prod_c2_addto calls, which add each outer product to the corresponding matrix within B and place the result in the corresponding matrix within A. The contents of B are not changed by the operation (unless B and A are the same variable).
Note: For S3L_outer_prod_addto and S3L_outer_prod_c2_addto, the argument B can be identical to the argument A.
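The effect of passing the same array as both B and A can be seen in a plain C sketch of the addto operation for a single real-valued instance. The function outer_prod_addto_ref is hypothetical, the row-major layout is an assumption, and the loop only mirrors the documented result A = B + xyT; it says nothing about how S3L computes it internally.

```c
#include <stddef.h>

/* Hypothetical reference loop: A(i,j) = B(i,j) + x(i) * y(j).
   A and B are m-by-n, row-major. Each element of B is read before the
   matching element of A is written, so B may alias A. */
void outer_prod_addto_ref(double *A, const double *x, const double *y,
                          const double *B, size_t m, size_t n)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            A[i * n + j] = B[i * n + j] + x[i] * y[j];
}
```

When B and A are distinct, B is left unchanged. When they are the same array, the result is the same as the accumulate form A = A + xyT.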
row_axis - Scalar integer variable. The axis of A and B that counts the rows of the embedded matrix or matrices. For C/C++ programs, this argument must be nonnegative and less than the rank of A. For F77/F90 programs, it must be greater than zero and less than or equal to the rank of A.
col_axis - Scalar integer variable. The axis of A and B that counts the columns of the embedded matrix or matrices. For C/C++ programs, this argument must be nonnegative and less than the rank of A. For F77/F90 programs, it must be greater than zero and less than or equal to the rank of A.
x_vector_axis - Scalar integer variable that specifies the axis of x along which the elements of the embedded vectors lie. For C/C++ programs, this argument must be nonnegative and less than the rank of x. For F77/F90 programs, it must be greater than zero and less than or equal to the rank of x.
y_vector_axis - Scalar integer variable that specifies the axis of y along which the elements of the embedded vectors lie. For C/C++ programs, this argument must be nonnegative and less than the rank of y. For F77/F90 programs, it must be greater than zero and less than or equal to the rank of y.
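The axis-number ranges for the two language bindings differ only in their base: zero-based for C/C++ and one-based for F77/F90. A minimal sketch of the two range checks, using hypothetical helper names that are not part of S3L:

```c
/* Hypothetical helpers illustrating the documented axis ranges. */

/* C/C++ convention: 0 <= axis < rank. */
int axis_valid_c(int axis, int rank)
{
    return axis >= 0 && axis < rank;
}

/* F77/F90 convention: 1 <= axis <= rank. */
int axis_valid_fortran(int axis, int rank)
{
    return axis >= 1 && axis <= rank;
}
```

For a rank-3 array, the valid C/C++ axis numbers are 0, 1, and 2, while the valid F77/F90 axis numbers are 1, 2, and 3.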
These functions use the following arguments for output:
A - Array handle for an S3L parallel array of rank greater than or equal to 2, which contains one or more instances of the destination matrix A, defined by axes row_axis (which counts the rows) and col_axis (which counts the columns). Upon successful completion, each matrix instance is overwritten by the result of the outer product call.
ier (Fortran only) - When called from a Fortran program, these functions return error status in ier.
On success, the S3L_outer_prod routines return S3L_SUCCESS.
The S3L_outer_prod routines perform generic checking of the validity of the arrays they accept as arguments. If an array argument contains an invalid or corrupted value, the function terminates and an error code indicating which value of the array handle was invalid is returned. See Appendix A of this manual for a detailed list of these error codes.
In addition, the following conditions will cause these functions to terminate and return the associated error code:
S3L_ERR_MATCH_RANK - The parallel arrays do not have the same rank.
S3L_ERR_MATCH_EXTENTS - The lengths of corresponding axes do not match.
S3L_ERR_MATCH_DTYPE - The arguments are not all of the same data type and precision.
S3L_ERR_ARG_AXISNUM - row_axis and/or col_axis contains a bad axis number. For C/C++ program calls, each of these parameters must be nonnegative and less than the rank of A. For F77/F90 calls, they must be greater than zero and less than or equal to the rank of A.
S3L_ERR_CONJ_INVAL - Conjugation was requested, but the data supplied was not of type S3L_complex_t or S3L_dcomplex_t.
S3L_ERR_ARG_RANK - Rank of A is less than 2.
../examples/s3l/dense_matrix_ops/outer_prod.c ../examples/s3l/dense_matrix_ops-f/outer_prod.f
S3L_inner_prod(3) S3L_2_norm(3) S3L_mat_vec_mult(3) S3L_mat_mult(3)