Multiple-Instance Inner Product - Sun S3L provides six multiple-instance inner product routines, all of which compute one or more instances of the inner product of two vectors embedded in two parallel arrays. The operations performed by the multiple-instance inner product routines are shown in Table 8-3.
Table 8-3 S3L Multiple-Instance Inner Product Operations
Routine | Operation | Data Type |
---|---|---|
S3L_inner_prod | z = z + x^T y | real or complex |
S3L_inner_prod_noadd | z = x^T y | real or complex |
S3L_inner_prod_addto | z = u + x^T y | real or complex |
S3L_inner_prod_c1 | z = z + x^H y | complex only |
S3L_inner_prod_c1_noadd | z = x^H y | complex only |
S3L_inner_prod_c1_addto | z = u + x^H y | complex only |
For these multiple-instance operations, array x contains one or more instances of the first vector in each inner-product pair. Likewise, array y contains one or more instances of the second vector in each pair.
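The operations listed in Table 8-3 can be written out element-wise. The following is only a notational sketch: it assumes the usual definitions of the transpose product and the conjugate-transpose product, and the index names i (position along the vector axis) and j (label of one instance, that is, one vector pair) are introduced here for illustration and do not appear in the S3L interface.

```latex
% i runs along the vector axis; j labels one instance (one vector pair)
\begin{aligned}
x_j^{T} y_j &= \sum_{i} x_{i,j}\, y_{i,j}, \\
x_j^{H} y_j &= \sum_{i} \overline{x_{i,j}}\, y_{i,j}, \\
z_j &\leftarrow z_j + x_j^{T} y_j \quad \text{(for example, S3L\_inner\_prod)}.
\end{aligned}
```

The noadd and addto variants differ only in the last line: the right-hand side becomes the inner product alone, or u_j plus the inner product.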
The array arguments x, y, and so forth actually represent array handles that describe S3L parallel arrays. For convenience, however, this discussion ignores that distinction and refers to them as if they were the arrays themselves.
x and y must be at least rank 1 arrays, must be of the same rank, and their corresponding axes must have the same extents. Additionally, x and y must both be distributed arrays--that is, each must have at least one axis that is nonlocal.
Array z, which stores the results of the multiple-instance inner product operations, must be of rank one less than that of x and y. Its axes must match the instance axes of x and y in length and order of declaration and it must also have at least one axis that is nonlocal. This means each vector pair in x and y corresponds to a single destination value in z.
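The shape rule for z can be illustrated with a small serial helper. This is a sketch only: instance_extents is a hypothetical function, not part of Sun S3L, and it assumes the zero-based axis numbering of the C interface. It drops the vector axis from the extents of x and keeps the remaining (instance) axes in their declared order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an S3L routine): given the extents of x and the
 * zero-based vector axis, fill z_extents with the extents that z must have.
 * Returns the rank of z, which is rank - 1.
 */
size_t instance_extents(const size_t *x_extents, size_t rank,
                        size_t vector_axis, size_t *z_extents)
{
    assert(rank >= 1 && vector_axis < rank);

    size_t z_rank = 0;
    for (size_t axis = 0; axis < rank; ++axis) {
        if (axis != vector_axis) {
            /* Instance axes are copied in their original order. */
            z_extents[z_rank++] = x_extents[axis];
        }
    }
    return z_rank;
}
```

For example, if x has extents 4 x 5 x 6 and the vectors lie along axis 1, the helper reports that z must be a rank-2 array with extents 4 x 6.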
For S3L_inner_prod and S3L_inner_prod_c1, z is also used as the source for a set of values, which are added to the inner products of the corresponding x and y vector pairs.
Finally, x, y, and z must match in data type and precision.
Two scalar integer variables, x_vector_axis and y_vector_axis, specify the axes of x and y along which the constituent vectors in each vector pair lie.
When specifying values for x_vector_axis and y_vector_axis, keep in mind that Sun S3L functions employ zero-based array indexing when they are called via the C/C++ interface and one-based indexing when called via the F77/F90 interface.
The array handle u describes an S3L parallel array that is used by S3L_inner_prod_addto and S3L_inner_prod_c1_addto. These routines add the values contained in u to the inner products of the corresponding x and y vector pairs.
Upon successful completion of S3L_inner_prod or S3L_inner_prod_c1, the inner product of each pair of corresponding vectors in arrays x and y is added to the corresponding value in z.
Upon successful completion of S3L_inner_prod_noadd or S3L_inner_prod_c1_noadd, the inner product of each pair of corresponding vectors in arrays x and y overwrites the corresponding value in z.
Upon successful completion of S3L_inner_prod_addto or S3L_inner_prod_c1_addto, the inner product of each pair of corresponding vectors in arrays x and y is added to the corresponding value in u, and each resulting sum overwrites the corresponding value in z.
If the instance axes of x and y--that is, the axes along which the inner product will be taken--each contains only a single vector, either declare the axes to have an extent of 1 or use the comparable single-instance inner product routine, as described below.
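The difference between the add, noadd, and addto behaviors described above can be shown with a serial stand-in. This is a minimal sketch, not S3L code: multi_inner_prod and the accumulate_mode names are hypothetical, the arrays are ordinary row-major C arrays rather than distributed parallel arrays, and each row of x and y is assumed to hold one vector (vector axis 1 in zero-based numbering).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode names for this sketch; they are not S3L identifiers. */
enum accumulate_mode {
    MODE_ADD,   /* z = z + x^T y, the behavior described for S3L_inner_prod       */
    MODE_NOADD, /* z = x^T y,     the behavior described for S3L_inner_prod_noadd */
    MODE_ADDTO  /* z = u + x^T y, the behavior described for S3L_inner_prod_addto */
};

/*
 * x and y hold n_inst vectors of length len, one vector per row.
 * z and u hold one value per instance; u is read only in MODE_ADDTO.
 */
void multi_inner_prod(double *z, const double *x, const double *y,
                      const double *u, size_t n_inst, size_t len,
                      enum accumulate_mode mode)
{
    assert(z != NULL && x != NULL && y != NULL);
    assert(mode != MODE_ADDTO || u != NULL);

    for (size_t i = 0; i < n_inst; ++i) {
        /* Inner product of the i-th vector pair. */
        double dot = 0.0;
        for (size_t k = 0; k < len; ++k) {
            dot += x[i * len + k] * y[i * len + k];
        }

        switch (mode) {
        case MODE_ADD:   z[i] += dot;       break;
        case MODE_NOADD: z[i] = dot;        break;
        case MODE_ADDTO: z[i] = u[i] + dot; break;
        }
    }
}
```

Calling the function with n_inst set to 1 corresponds to the single-vector case mentioned above, where the instance axis has an extent of 1.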
Single-Instance Inner Product - Sun S3L also provides six single-instance inner product routines, all of which compute the inner product over all the axes of two parallel arrays. The operations performed by the single-instance inner product routines are shown in Table 8-4.
Table 8-4 S3L Single-Instance Inner Product Operations
Routine | Operation | Data Type |
---|---|---|
S3L_gbl_inner_prod | a = a + x^T y | real or complex |
S3L_gbl_inner_prod_noadd | a = x^T y | real or complex |
S3L_gbl_inner_prod_addto | a = b + x^T y | real or complex |
S3L_gbl_inner_prod_c1 | a = a + x^H y | complex only |
S3L_gbl_inner_prod_c1_noadd | a = x^H y | complex only |
S3L_gbl_inner_prod_c1_addto | a = b + x^H y | complex only |
For these single-instance functions, x and y are S3L parallel arrays of rank 1 or greater and with the same data type and precision.
a is a pointer to a scalar variable of the same data type as x and y. This variable stores the results of the single-instance inner product operations.
For S3L_gbl_inner_prod and S3L_gbl_inner_prod_c1, a also supplies the value that is added to the inner product of x and y.
b is also a pointer to a scalar variable of the same data type as x and y. It contains the value that S3L_gbl_inner_prod_addto and S3L_gbl_inner_prod_c1_addto add to the inner product of x and y.
Upon successful completion of S3L_gbl_inner_prod or S3L_gbl_inner_prod_c1, the global inner product of x and y is added to a.
Upon successful completion of S3L_gbl_inner_prod_noadd or S3L_gbl_inner_prod_c1_noadd, the global inner product of x and y overwrites a.
Upon successful completion of S3L_gbl_inner_prod_addto or S3L_gbl_inner_prod_c1_addto, the global inner product of x and y is added to b, and the resulting sum overwrites a.
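The single-instance operations, including the conjugated (c1) forms, can be sketched serially as follows. This is an illustration under stated assumptions, not the S3L implementation: cplx, gbl_dot, and cplx_add are hypothetical names, the arrays are treated as flat sequences of n elements (all axes combined), and x^H y is taken to mean that the elements of x are conjugated before multiplication.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one complex element. */
typedef struct {
    double re;
    double im;
} cplx;

/*
 * Inner product over all n elements of x and y.
 * conjugate == 0: sum of x[i] * y[i]        (x^T y)
 * conjugate != 0: sum of conj(x[i]) * y[i]  (x^H y)
 */
cplx gbl_dot(const cplx *x, const cplx *y, size_t n, int conjugate)
{
    assert(x != NULL && y != NULL);

    cplx sum = {0.0, 0.0};
    for (size_t i = 0; i < n; ++i) {
        /* Conjugation flips the sign of the imaginary part of x[i]. */
        double x_im = conjugate ? -x[i].im : x[i].im;
        sum.re += x[i].re * y[i].re - x_im * y[i].im;
        sum.im += x[i].re * y[i].im + x_im * y[i].re;
    }
    return sum;
}

/* Complex addition, used to form a + (inner product) or b + (inner product). */
cplx cplx_add(cplx p, cplx q)
{
    cplx r = {p.re + q.re, p.im + q.im};
    return r;
}
```

In these terms, the add form corresponds to a = cplx_add(a, gbl_dot(x, y, n, 0)), the noadd form to a = gbl_dot(x, y, n, 0), and the addto form to a = cplx_add(b, gbl_dot(x, y, n, 0)); passing a nonzero conjugate flag gives the c1 counterparts.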
Array variables must not overlap.
The C and Fortran syntax for the S3L_inner_prod and S3L_gbl_inner_prod routines is shown below.
    #include <s3l/s3l-c.h>
    #include <s3l/s3l_errno-c.h>

    int
    S3L_inner_prod(z, x, y, x_vector_axis, y_vector_axis)
    S3L_inner_prod_noadd(z, x, y, x_vector_axis, y_vector_axis)
    S3L_inner_prod_addto(z, x, y, *u, x_vector_axis, y_vector_axis)
    S3L_inner_prod_c1(z, x, y, x_vector_axis, y_vector_axis)
    S3L_inner_prod_c1_noadd(z, x, y, x_vector_axis, y_vector_axis)
    S3L_inner_prod_c1_addto(z, x, y, *u, x_vector_axis, y_vector_axis)
    S3L_gbl_inner_prod(a, x, y)
    S3L_gbl_inner_prod_noadd(a, x, y)
    S3L_gbl_inner_prod_addto(a, x, y, b)
    S3L_gbl_inner_prod_c1(a, x, y)
    S3L_gbl_inner_prod_c1_noadd(a, x, y)
    S3L_gbl_inner_prod_c1_addto(a, x, y, b)
        S3L_array_t  z
        S3L_array_t  x
        S3L_array_t  y
        S3L_array_t  u
        S3L_array_t  a
        S3L_array_t  b
        int          x_vector_axis
        int          y_vector_axis
    include 's3l/s3l-f.h'
    include 's3l/s3l_errno-f.h'

    subroutine
    S3L_inner_prod(z, x, y, x_vector_axis, y_vector_axis, ier)
    S3L_inner_prod_noadd(z, x, y, x_vector_axis, y_vector_axis, ier)
    S3L_inner_prod_addto(z, x, y, *u, x_vector_axis, y_vector_axis, ier)
    S3L_inner_prod_c1(z, x, y, x_vector_axis, y_vector_axis, ier)
    S3L_inner_prod_c1_noadd(z, x, y, x_vector_axis, y_vector_axis, ier)
    S3L_inner_prod_c1_addto(z, x, y, *u, x_vector_axis, y_vector_axis, ier)
    S3L_gbl_inner_prod(a, x, y, ier)
    S3L_gbl_inner_prod_noadd(a, x, y, ier)
    S3L_gbl_inner_prod_addto(a, x, y, b, ier)
    S3L_gbl_inner_prod_c1(a, x, y, ier)
    S3L_gbl_inner_prod_c1_noadd(a, x, y, ier)
    S3L_gbl_inner_prod_c1_addto(a, x, y, b, ier)
        S3L_array_t  z
        S3L_array_t  x
        S3L_array_t  y
        S3L_array_t  u
        S3L_array_t  a
        S3L_array_t  b
        int          x_vector_axis
        int          y_vector_axis
        int          ier
z - Array handle for an S3L parallel array, which S3L_inner_prod and S3L_inner_prod_c1 use as a source of values to be added to the inner products of the corresponding x and y vector pairs. z is also used for output; see the Output section for details.
x - Array handle for an S3L parallel array that contains the first vector in each vector pair for which an inner product will be computed.
y - Array handle for an S3L parallel array that contains the second vector in each vector pair for which an inner product will be computed.
u - Array handle for an S3L parallel array whose rank is one less than that of x and y. S3L_inner_prod_addto and S3L_inner_prod_c1_addto add the contents of u to the inner products of the corresponding vector pairs of x and y.
a - Pointer to a scalar variable, which S3L_gbl_inner_prod and S3L_gbl_inner_prod_c1 use as the source of the value to be added to the inner product of x and y. a is also used for output; see the Output section for details.
b - Pointer to a scalar variable, which S3L_gbl_inner_prod_addto and S3L_gbl_inner_prod_c1_addto use as the source of the value to be added to the inner product of x and y.
x_vector_axis - Scalar variable. Identifies the axis of x along which the vectors lie.
y_vector_axis - Scalar variable. Identifies the axis of y along which the vectors lie.
These functions use the following arguments for output:
z - Array handle for the S3L parallel array that will contain the results of the multiple-instance inner product routines.
a - Pointer to a scalar variable, which is the destination for the single-instance inner product routines.
ier (Fortran only) - When called from a Fortran program, these functions return error status in ier.
On success, S3L_inner_prod and S3L_gbl_inner_prod return S3L_SUCCESS.
S3L_inner_prod and S3L_gbl_inner_prod perform generic checking of the validity of the arrays they accept as arguments. If an array argument contains an invalid or corrupted value, the function terminates and an error code indicating which value of the array handle was invalid is returned. See Appendix A of this manual for a detailed list of these error codes.
In addition, the following conditions will cause the function to terminate and return the associated error code:
S3L_ERR_MATCH_RANK - x and y do not have the same rank.
S3L_ERR_MATCH_EXTENTS - Axes of x and y do not have the same extents.
S3L_ERR_MATCH_DTYPE - The arguments are not all of the same data type and precision.
S3L_ERR_CONJ_INVAL - Conjugation was requested, but data supplied was not of type S3L_complex_t or S3L_dcomplex_t.
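The four conditions listed above can be mirrored in a small argument check. The sketch below is hypothetical throughout: the DEMO_ status values, the demo_array descriptor, and check_inner_prod_args are invented for illustration and do not reflect the real S3L error values or array handle layout, and the order in which the checks run is an assumption rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; the numeric values of the real S3L codes differ. */
enum demo_status {
    DEMO_SUCCESS,
    DEMO_ERR_MATCH_RANK,    /* x and y differ in rank                    */
    DEMO_ERR_MATCH_EXTENTS, /* some axis of x and y differs in extent    */
    DEMO_ERR_MATCH_DTYPE,   /* x and y differ in data type or precision  */
    DEMO_ERR_CONJ_INVAL     /* conjugation requested on non-complex data */
};

enum demo_dtype { DEMO_FLOAT, DEMO_DOUBLE, DEMO_COMPLEX, DEMO_DCOMPLEX };

/* Hypothetical array descriptor standing in for an S3L array handle. */
struct demo_array {
    size_t rank;
    const size_t *extents;
    enum demo_dtype dtype;
};

enum demo_status check_inner_prod_args(const struct demo_array *x,
                                       const struct demo_array *y,
                                       int conjugate)
{
    assert(x != NULL && y != NULL);

    if (x->rank != y->rank) {
        return DEMO_ERR_MATCH_RANK;
    }
    for (size_t axis = 0; axis < x->rank; ++axis) {
        if (x->extents[axis] != y->extents[axis]) {
            return DEMO_ERR_MATCH_EXTENTS;
        }
    }
    if (x->dtype != y->dtype) {
        return DEMO_ERR_MATCH_DTYPE;
    }
    if (conjugate && x->dtype != DEMO_COMPLEX && x->dtype != DEMO_DCOMPLEX) {
        return DEMO_ERR_CONJ_INVAL;
    }
    return DEMO_SUCCESS;
}
```

A caller would run such a check before doing any arithmetic and return the first failing status, which matches the documented behavior of terminating and returning an error code.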
Example programs: ../examples/s3l/dense_matrix_ops/inner_prod.c (C) and ../examples/s3l/dense_matrix_ops-f/inner_prod.f (Fortran).
Related functions: S3L_2_norm(3), S3L_outer_prod(3), S3L_mat_vec_mult(3), S3L_mat_mult(3)