The Sun S3L core routines consist of:
Inner product - Compute the global inner product over all axes of two source parallel arrays. The inner product is added to the destination. A routine that takes the conjugate of the second operand is provided for complex data.
Outer product - Compute one or more instances of an outer product of two vectors. The result is added to the destination. For complex data, a routine that takes the conjugate of the second operand is provided.
Matrix-vector multiplication - Compute one or more instances of a matrix-vector product. The result is added to the destination, or is added to a second parallel array. For complex data, a routine that takes the conjugate of the matrix is provided.
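The accumulate-into-destination behavior described for the inner and outer product routines can be sketched in a few lines. This is only an illustration on plain Python lists, not the Sun S3L interface: the function names and the `conjugate_second` flag are hypothetical, and a real parallel array would be distributed across processes.

```python
def inner_product_accumulate(dest, x, y, conjugate_second=False):
    """Return dest + sum(x[i] * y[i]); optionally conjugate y[i] first.

    Hypothetical helper for illustration; not an S3L routine.
    """
    total = dest
    for xi, yi in zip(x, y):
        # int, float and complex all provide .conjugate() in Python
        total += xi * (yi.conjugate() if conjugate_second else yi)
    return total


def outer_product_accumulate(dest, x, y, conjugate_second=False):
    """Return a new matrix with dest[i][j] + x[i] * y[j] in each cell."""
    return [
        [
            dest[i][j] + x[i] * (y[j].conjugate() if conjugate_second else y[j])
            for j in range(len(y))
        ]
        for i in range(len(x))
    ]
```

For real data the conjugating and non-conjugating forms give the same result; they differ only for complex operands.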
LU-factorization and LU-solve routines
LU-factorization routine - For each M x N coefficient matrix A of the parallel array a, computes the LU factorization using partial pivoting with row interchanges.
LU-solve routine - Uses the L and U factors produced by the LU-factorization routine to produce solutions to the system AX=B. B may represent one or more right-hand sides for each instance of the systems of equations.
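The factor-then-solve split described above can be sketched as follows. This is a minimal serial illustration, assuming a single square, nonsingular matrix stored as a list of rows; the names `lu_factor` and `lu_solve` are hypothetical and do not reflect the Sun S3L calling sequence, which operates on parallel arrays and also covers M x N matrices.

```python
def lu_factor(a):
    """Factor a square matrix with partial pivoting (row interchanges).

    Returns (lu, perm): lu holds the unit-lower factor L below the diagonal
    and U on and above it; row k of lu came from original row perm[k].
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Pick the row with the largest magnitude in column k as the pivot.
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]          # multiplier, stored in L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b for one right-hand side using the stored factors."""
    n = len(lu)
    x = [float(b[perm[i]]) for i in range(n)]   # apply the row interchanges
    for i in range(n):                          # forward substitution (L)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):                # back substitution (U)
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

Because the factors are kept, several right-hand sides can be handled by calling `lu_solve` repeatedly without refactoring the matrix.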
Setup and deallocation of FFT handles - Initialize and deallocate FFT handles for both complex and real data types. Separate routines are used for the two data types.
Simple complex-to-complex, mixed-radix, forward and inverse FFT routines - Performs the forward or inverse fast Fourier transform of a parallel array of type complex or double complex. Supports both power-of-two and arbitrary radix parameters.
Detailed complex-to-complex FFT routine - Allows independent specification along each data axis of the transform direction in a complex-to-complex FFT. Can improve performance over the simple FFT in some cases.
Structured solvers
Tridiagonal solver - Solves collections of tridiagonal linear systems of equations using Gaussian elimination with pivoting.
Banded solver - Solves collections of banded linear systems of equations using Gaussian elimination with pivoting.
Dense symmetric eigenvalue solver - Computes selected eigenvalues and, optionally, eigenvectors of Hermitian matrices.
Dense Singular Value Decomposition (SVD) - Computes the singular value decomposition of an M x N matrix and, optionally, the left and right singular vectors.
Sparse routines
Declare array handle for a sparse matrix.
Read data from a file into a distributed matrix, with support for both COO and CSR sparse storage formats.
Compute the product of a sparse matrix with a dense vector.
Iterative solver - Solves a general sparse linear system of equations using iterative methods, with or without preconditioning.
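For readers unfamiliar with the two storage formats named above, the sketch below converts coordinate (COO) triples to compressed sparse row (CSR) arrays and multiplies the result by a dense vector. It is a serial illustration only: the function names are hypothetical, indices are assumed to be zero-based, and nothing here reflects how Sun S3L lays out or distributes sparse data.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triples (rows[k], cols[k], vals[k]) to CSR arrays."""
    counts = [0] * n_rows
    for r in rows:
        counts[r] += 1
    # row_ptr[r] .. row_ptr[r + 1] delimits the entries of row r.
    row_ptr = [0] * (n_rows + 1)
    for r in range(n_rows):
        row_ptr[r + 1] = row_ptr[r] + counts[r]
    next_slot = row_ptr[:-1]          # slicing copies; next free slot per row
    col_idx = [0] * len(vals)
    data = [0.0] * len(vals)
    for r, c, v in zip(rows, cols, vals):
        k = next_slot[r]
        col_idx[k] = c
        data[k] = v
        next_slot[r] += 1
    return row_ptr, col_idx, data


def csr_matvec(row_ptr, col_idx, data, x):
    """Return y = A x for a CSR matrix A and a dense vector x."""
    return [
        sum(data[k] * x[col_idx[k]] for k in range(row_ptr[r], row_ptr[r + 1]))
        for r in range(len(row_ptr) - 1)
    ]
```

COO is convenient for reading entries in arbitrary order, while CSR groups each row's entries contiguously, which suits row-by-row products such as the one above.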
Convolution/Deconvolution
Convolve - Computes 1D or 2D convolution of one array with another.
Deconvolve - Deconvolves an array into a vector.
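As a reminder of what a 1D convolution computes, here is a direct-summation sketch. It assumes a full linear (non-cyclic) convolution of two plain Python sequences; the function name is hypothetical, and the text does not say which algorithm or boundary convention the Sun S3L routine uses.

```python
def convolve_1d(a, b):
    """Full linear convolution: out[n] = sum over i of a[i] * b[n - i]."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj   # each input pair contributes to lag i + j
    return out
```

The direct form costs on the order of len(a) * len(b) operations; FFT-based approaches are a common alternative for long inputs.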
Iterative eigensolver - Computes selected eigenpairs of dense or sparse matrices, with optional specification of eigenpair properties.
Autocorrelation - Computes 1D or 2D autocorrelation of a signal.
Sort and grade - Sort and grade arrays.
Parallel random number generators
Fibonacci RNG setup and deallocation - Initializes and deallocates the state table of a lagged Fibonacci random number generator (LFG).
LCG RNG setup - Defines the parameters used in the Sun S3L linear congruential random number generator (LCG).
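The two generator families named above follow simple recurrences, sketched below as serial Python generators. The multiplier, increment, modulus, and lag values are illustrative assumptions chosen for the example, not the parameters used by Sun S3L, and the function names are hypothetical.

```python
from collections import deque


def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Linear congruential generator: x_n = (a * x_{n-1} + c) mod m.

    The default constants are illustrative only.
    """
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state


def lagged_fibonacci(seed_table, short_lag=5, long_lag=17, m=2**32):
    """Additive lagged Fibonacci generator:
    x_n = (x_{n - short_lag} + x_{n - long_lag}) mod m.

    seed_table is the initial state table and must hold long_lag values.
    """
    if len(seed_table) != long_lag:
        raise ValueError("seed_table must contain exactly long_lag values")
    table = deque(seed_table, maxlen=long_lag)   # last long_lag outputs
    while True:
        value = (table[-short_lag] + table[-long_lag]) % m
        table.append(value)                      # oldest entry drops off
        yield value
```

The state table is what makes setup and deallocation necessary for the lagged Fibonacci generator, whereas the LCG needs only its parameters and a single state value.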
Zero array elements - Replaces all elements in an array with zero.