PLASMA
Parallel Linear Algebra Software for Multicore Architectures
Functions | |
int | plasma_cgels (plasma_enum_t trans, int m, int n, int nrhs, plasma_complex32_t *pA, int lda, plasma_desc_t *T, plasma_complex32_t *pB, int ldb) |
void | plasma_omp_cgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request) |
int | plasma_dgels (plasma_enum_t trans, int m, int n, int nrhs, double *pA, int lda, plasma_desc_t *T, double *pB, int ldb) |
void | plasma_omp_dgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request) |
int | plasma_sgels (plasma_enum_t trans, int m, int n, int nrhs, float *pA, int lda, plasma_desc_t *T, float *pB, int ldb) |
void | plasma_omp_sgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request) |
int | plasma_zgels (plasma_enum_t trans, int m, int n, int nrhs, plasma_complex64_t *pA, int lda, plasma_desc_t *T, plasma_complex64_t *pB, int ldb) |
void | plasma_omp_zgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request) |
int plasma_cgels (plasma_enum_t trans, int m, int n, int nrhs, plasma_complex32_t *pA, int lda, plasma_desc_t *T, plasma_complex32_t *pB, int ldb)
Solves overdetermined or underdetermined linear systems involving an m-by-n matrix A, or its conjugate-transpose, using a QR or LQ factorization of A. It is assumed that A has full rank. The following options are provided:
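trans = PlasmaNoTrans and m >= n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A*X ||.
trans = PlasmaNoTrans and m < n: find the minimum norm solution of an underdetermined system A * X = B.
trans = Plasma_ConjTrans and m >= n: find the minimum norm solution of an underdetermined system A^H * X = B.
trans = Plasma_ConjTrans and m < n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A^H*X ||.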
Several right-hand side vectors B and solution vectors X can be handled in a single call; they are stored as the columns of the m-by-nrhs right-hand side matrix B and the n-by-nrhs solution matrix X.
[in] | trans | PlasmaNoTrans: the linear system involves A; Plasma_ConjTrans: the linear system involves A^H. |
[in] | m | The number of rows of the matrix A. m >= 0. |
[in] | n | The number of columns of the matrix A. n >= 0. |
[in] | nrhs | The number of right hand sides, i.e., the number of columns of the matrices B and X. nrhs >= 0. |
[in,out] | pA | On entry, pointer to the m-by-n matrix A. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_cgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_cgelqf. |
[in] | lda | The leading dimension of the array A. lda >= max(1,m). |
[out] | T | On exit, auxiliary factorization data. Matrix of T is allocated inside this function and needs to be destroyed by plasma_desc_destroy. |
[in,out] | pB | On entry, pointer to the m-by-nrhs matrix B of right-hand side vectors, stored columnwise. On exit, if return value = 0, B is overwritten by the solution vectors, stored columnwise: if trans = PlasmaNoTrans and m >= n, rows 1 to n of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements n+1 to m in that column; if trans = PlasmaNoTrans and m < n, rows 1 to n of B contain the minimum norm solution vectors; if trans = Plasma_ConjTrans and m >= n, rows 1 to m of B contain the minimum norm solution vectors; if trans = Plasma_ConjTrans and m < n, rows 1 to m of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements m+1 to n in that column. |
[in] | ldb | The leading dimension of the array B. ldb >= max(1,m,n). |
PlasmaSuccess | successful exit |
< 0 | if -i, the i-th argument had an illegal value |
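The case distinction and the leading-dimension constraints listed above can be checked before the call. The following is a minimal sketch in plain C that does not use PLASMA itself: `gels_kind_t`, `gels_kind`, `gels_min_lda`, and `gels_min_ldb` are hypothetical helper names introduced here for illustration, and the `notrans` flag is an assumed stand-in for comparing `trans` against PlasmaNoTrans.

```c
#include <assert.h>

/* Hypothetical helper type: the kind of problem selected by trans and shape. */
typedef enum { GELS_LEAST_SQUARES, GELS_MINIMUM_NORM } gels_kind_t;

/* notrans != 0 stands for trans = PlasmaNoTrans (assumption of this sketch);
 * notrans == 0 stands for the (conjugate-)transposed case. */
gels_kind_t gels_kind(int notrans, int m, int n)
{
    assert(m >= 0 && n >= 0);
    if (notrans)
        return (m >= n) ? GELS_LEAST_SQUARES : GELS_MINIMUM_NORM;
    return (m >= n) ? GELS_MINIMUM_NORM : GELS_LEAST_SQUARES;
}

/* Smallest admissible lda: max(1, m). */
int gels_min_lda(int m)
{
    return (m > 1) ? m : 1;
}

/* Smallest admissible ldb: max(1, m, n). B must hold both the right-hand
 * sides and the solution, so it needs max(m, n) rows. */
int gels_min_ldb(int m, int n)
{
    int ld = (m > n) ? m : n;
    return (ld > 1) ? ld : 1;
}
```

For a 5-by-3 matrix with trans = PlasmaNoTrans, `gels_kind(1, 5, 3)` selects the least squares case and `gels_min_ldb(5, 3)` gives 5.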
void plasma_omp_cgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request)
Solves an overdetermined or underdetermined linear system of equations using the tile QR or the tile LQ factorization. May return before the computation is finished. Allows for pipelining of operations at runtime.
[in] | trans | PlasmaNoTrans: the linear system involves A; Plasma_ConjTrans: the linear system involves A^H. |
[in,out] | A | Descriptor of matrix A stored in the tile layout. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_cgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_cgelqf. |
[out] | T | Descriptor of matrix T. Auxiliary factorization data, computed by plasma_cgeqrf or plasma_cgelqf. |
[in,out] | B | Descriptor of matrix B. On entry, right-hand side matrix B in the tile layout. On exit, solution matrix X in the tile layout. |
[in] | work | Workspace for the auxiliary arrays needed by some coreblas kernels. For QR/LQ factorizations used in GELS, it contains preallocated space for tau and work arrays. Allocated by the plasma_workspace_create function. |
[in] | sequence | Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). |
[out] | request | Identifies this function call (for exception handling purposes). |
void | Errors are returned by setting sequence->status and request->status to error values. The sequence->status and request->status should never be set to PlasmaSuccess (the initial values) since another async call may be setting a failure value at the same time. |
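The error convention described above, where a status only ever moves from its initial success value to an error value, can be sketched with plain integers. This is an illustration only and not PLASMA's internal code: `status_holder_t`, `STATUS_SUCCESS`, `record_error`, and `finish_call` are hypothetical names, and the sketch is single-threaded, so it omits the synchronization a real asynchronous runtime would need.

```c
#include <assert.h>
#include <stddef.h>

#define STATUS_SUCCESS 0   /* hypothetical stand-in for PlasmaSuccess */

/* Hypothetical stand-in for the status field of a sequence or a request. */
typedef struct { int status; } status_holder_t;

/* Record a failure. Only error values are ever written here. */
void record_error(status_holder_t *h, int error_code)
{
    assert(h != NULL && error_code != STATUS_SUCCESS);
    h->status = error_code;
}

/* A call that finishes without error leaves the status untouched, so a
 * failure recorded by an earlier or concurrent call is never erased. */
void finish_call(status_holder_t *h, int error_code)
{
    if (error_code != STATUS_SUCCESS)
        record_error(h, error_code);
}
```

Because success is never written back, checking the status once after the whole sequence has completed is enough to learn whether any call in it failed.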
int plasma_dgels (plasma_enum_t trans, int m, int n, int nrhs, double *pA, int lda, plasma_desc_t *T, double *pB, int ldb)
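Solves overdetermined or underdetermined linear systems involving an m-by-n matrix A, or its transpose, using a QR or LQ factorization of A. It is assumed that A has full rank. The following options are provided:
trans = PlasmaNoTrans and m >= n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A*X ||.
trans = PlasmaNoTrans and m < n: find the minimum norm solution of an underdetermined system A * X = B.
trans = PlasmaTrans and m >= n: find the minimum norm solution of an underdetermined system A^T * X = B.
trans = PlasmaTrans and m < n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A^T*X ||.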
Several right-hand side vectors B and solution vectors X can be handled in a single call; they are stored as the columns of the m-by-nrhs right-hand side matrix B and the n-by-nrhs solution matrix X.
[in] | trans | PlasmaNoTrans: the linear system involves A; PlasmaTrans: the linear system involves A^T. |
[in] | m | The number of rows of the matrix A. m >= 0. |
[in] | n | The number of columns of the matrix A. n >= 0. |
[in] | nrhs | The number of right hand sides, i.e., the number of columns of the matrices B and X. nrhs >= 0. |
[in,out] | pA | On entry, pointer to the m-by-n matrix A. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_dgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_dgelqf. |
[in] | lda | The leading dimension of the array A. lda >= max(1,m). |
[out] | T | On exit, auxiliary factorization data. Matrix of T is allocated inside this function and needs to be destroyed by plasma_desc_destroy. |
[in,out] | pB | On entry, pointer to the m-by-nrhs matrix B of right-hand side vectors, stored columnwise. On exit, if return value = 0, B is overwritten by the solution vectors, stored columnwise: if trans = PlasmaNoTrans and m >= n, rows 1 to n of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements n+1 to m in that column; if trans = PlasmaNoTrans and m < n, rows 1 to n of B contain the minimum norm solution vectors; if trans = PlasmaTrans and m >= n, rows 1 to m of B contain the minimum norm solution vectors; if trans = PlasmaTrans and m < n, rows 1 to m of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements m+1 to n in that column. |
[in] | ldb | The leading dimension of the array B. ldb >= max(1,m,n). |
PlasmaSuccess | successful exit |
< 0 | if -i, the i-th argument had an illegal value |
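To make the least squares problem minimize || B - A*X || concrete, the sketch below solves it for a single right-hand side in plain C. It deliberately swaps in a different technique: an unnormalized modified Gram-Schmidt orthogonalization, chosen because it is short and needs no square roots. It is not the tile QR factorization that PLASMA uses, and its output layout differs from the one documented for pB. The name `ls_gram_schmidt` and the extra `R` and `x` arguments are hypothetical.

```c
#include <assert.h>

/* Solve min ||b - A*x||_2 for a full-rank m-by-n matrix A with m >= n.
 * A is column-major with leading dimension lda.
 * R is caller-provided workspace of n*n doubles, x receives the n solution
 * entries. On exit the columns of A are mutually orthogonal and b holds the
 * residual b - A*x. */
void ls_gram_schmidt(int m, int n, double *A, int lda, double *b,
                     double *R, double *x)
{
    assert(n >= 0 && m >= n && lda >= (m > 1 ? m : 1));

    for (int k = 0; k < n; k++) {
        const double *q = &A[k * lda];          /* k-th orthogonal vector */
        double d = 0.0;                         /* d = q^T q */
        for (int i = 0; i < m; i++)
            d += q[i] * q[i];
        assert(d > 0.0);                        /* full rank is assumed */

        /* Remove the q component from every later column of A. */
        for (int j = k + 1; j < n; j++) {
            double *a = &A[j * lda];
            double dot = 0.0;
            for (int i = 0; i < m; i++)
                dot += q[i] * a[i];
            R[k + j * n] = dot / d;             /* unit upper triangular R */
            for (int i = 0; i < m; i++)
                a[i] -= R[k + j * n] * q[i];
        }

        /* Same projection for the right-hand side. */
        double dot = 0.0;
        for (int i = 0; i < m; i++)
            dot += q[i] * b[i];
        x[k] = dot / d;
        for (int i = 0; i < m; i++)
            b[i] -= x[k] * q[i];
    }

    /* Back substitution with the unit upper triangular factor. */
    for (int k = n - 2; k >= 0; k--)
        for (int j = k + 1; j < n; j++)
            x[k] -= R[k + j * n] * x[j];
}
```

For the 3-by-2 matrix with columns (1, 0, 1) and (0, 1, 1) and right-hand side (1, 1, 1), the solution is x = (2/3, 2/3) and the residual sum of squares is 1/3.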
void plasma_omp_dgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request)
Solves an overdetermined or underdetermined linear system of equations using the tile QR or the tile LQ factorization. May return before the computation is finished. Allows for pipelining of operations at runtime.
[in] | trans | PlasmaNoTrans: the linear system involves A; PlasmaTrans: the linear system involves A^T. |
[in,out] | A | Descriptor of matrix A stored in the tile layout. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_dgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_dgelqf. |
[out] | T | Descriptor of matrix T. Auxiliary factorization data, computed by plasma_dgeqrf or plasma_dgelqf. |
[in,out] | B | Descriptor of matrix B. On entry, right-hand side matrix B in the tile layout. On exit, solution matrix X in the tile layout. |
[in] | work | Workspace for the auxiliary arrays needed by some coreblas kernels. For QR/LQ factorizations used in GELS, it contains preallocated space for tau and work arrays. Allocated by the plasma_workspace_create function. |
[in] | sequence | Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). |
[out] | request | Identifies this function call (for exception handling purposes). |
void | Errors are returned by setting sequence->status and request->status to error values. The sequence->status and request->status should never be set to PlasmaSuccess (the initial values) since another async call may be setting a failure value at the same time. |
int plasma_sgels (plasma_enum_t trans, int m, int n, int nrhs, float *pA, int lda, plasma_desc_t *T, float *pB, int ldb)
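Solves overdetermined or underdetermined linear systems involving an m-by-n matrix A, or its transpose, using a QR or LQ factorization of A. It is assumed that A has full rank. The following options are provided:
trans = PlasmaNoTrans and m >= n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A*X ||.
trans = PlasmaNoTrans and m < n: find the minimum norm solution of an underdetermined system A * X = B.
trans = PlasmaTrans and m >= n: find the minimum norm solution of an underdetermined system A^T * X = B.
trans = PlasmaTrans and m < n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A^T*X ||.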
Several right-hand side vectors B and solution vectors X can be handled in a single call; they are stored as the columns of the m-by-nrhs right-hand side matrix B and the n-by-nrhs solution matrix X.
[in] | trans | PlasmaNoTrans: the linear system involves A; PlasmaTrans: the linear system involves A^T. |
[in] | m | The number of rows of the matrix A. m >= 0. |
[in] | n | The number of columns of the matrix A. n >= 0. |
[in] | nrhs | The number of right hand sides, i.e., the number of columns of the matrices B and X. nrhs >= 0. |
[in,out] | pA | On entry, pointer to the m-by-n matrix A. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_sgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_sgelqf. |
[in] | lda | The leading dimension of the array A. lda >= max(1,m). |
[out] | T | On exit, auxiliary factorization data. Matrix of T is allocated inside this function and needs to be destroyed by plasma_desc_destroy. |
[in,out] | pB | On entry, pointer to the m-by-nrhs matrix B of right-hand side vectors, stored columnwise. On exit, if return value = 0, B is overwritten by the solution vectors, stored columnwise: if trans = PlasmaNoTrans and m >= n, rows 1 to n of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements n+1 to m in that column; if trans = PlasmaNoTrans and m < n, rows 1 to n of B contain the minimum norm solution vectors; if trans = PlasmaTrans and m >= n, rows 1 to m of B contain the minimum norm solution vectors; if trans = PlasmaTrans and m < n, rows 1 to m of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements m+1 to n in that column. |
[in] | ldb | The leading dimension of the array B. ldb >= max(1,m,n). |
PlasmaSuccess | successful exit |
< 0 | if -i, the i-th argument had an illegal value |
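The pB description says that, for trans = PlasmaNoTrans and m >= n, the residual sum of squares of each solution column can be read from elements n+1 to m of that column. A minimal sketch of that computation on a column-major float array follows. The function name `gels_residual_sum_of_squares` is hypothetical, and the array is assumed to be the B returned by a successful call.

```c
#include <assert.h>
#include <stddef.h>

/* Residual sum of squares for column `col` of B after a least squares solve
 * with trans = PlasmaNoTrans and m >= n. Rows n+1..m in the 1-based wording
 * of the documentation are indices n..m-1 in 0-based C indexing. */
float gels_residual_sum_of_squares(const float *B, int ldb,
                                   int m, int n, int col)
{
    assert(B != NULL && n >= 0 && m >= n && ldb >= m && col >= 0);

    const float *column = B + (size_t)col * (size_t)ldb;
    float sum = 0.0f;
    for (int i = n; i < m; i++)
        sum += column[i] * column[i];
    return sum;
}
```

With m = 4, n = 2 and a column holding (1, 2, 3, 4), the first two entries are the solution and the residual sum of squares is 3*3 + 4*4 = 25.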
void plasma_omp_sgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request)
Solves an overdetermined or underdetermined linear system of equations using the tile QR or the tile LQ factorization. May return before the computation is finished. Allows for pipelining of operations at runtime.
[in] | trans | PlasmaNoTrans: the linear system involves A; PlasmaTrans: the linear system involves A^T. |
[in,out] | A | Descriptor of matrix A stored in the tile layout. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_sgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_sgelqf. |
[out] | T | Descriptor of matrix T. Auxiliary factorization data, computed by plasma_sgeqrf or plasma_sgelqf. |
[in,out] | B | Descriptor of matrix B. On entry, right-hand side matrix B in the tile layout. On exit, solution matrix X in the tile layout. |
[in] | work | Workspace for the auxiliary arrays needed by some coreblas kernels. For QR/LQ factorizations used in GELS, it contains preallocated space for tau and work arrays. Allocated by the plasma_workspace_create function. |
[in] | sequence | Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). |
[out] | request | Identifies this function call (for exception handling purposes). |
void | Errors are returned by setting sequence->status and request->status to error values. The sequence->status and request->status should never be set to PlasmaSuccess (the initial values) since another async call may be setting a failure value at the same time. |
int plasma_zgels (plasma_enum_t trans, int m, int n, int nrhs, plasma_complex64_t *pA, int lda, plasma_desc_t *T, plasma_complex64_t *pB, int ldb)
Solves overdetermined or underdetermined linear systems involving an m-by-n matrix A, or its conjugate-transpose, using a QR or LQ factorization of A. It is assumed that A has full rank. The following options are provided:
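trans = PlasmaNoTrans and m >= n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A*X ||.
trans = PlasmaNoTrans and m < n: find the minimum norm solution of an underdetermined system A * X = B.
trans = Plasma_ConjTrans and m >= n: find the minimum norm solution of an underdetermined system A^H * X = B.
trans = Plasma_ConjTrans and m < n: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A^H*X ||.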
Several right-hand side vectors B and solution vectors X can be handled in a single call; they are stored as the columns of the m-by-nrhs right-hand side matrix B and the n-by-nrhs solution matrix X.
[in] | trans | PlasmaNoTrans: the linear system involves A; Plasma_ConjTrans: the linear system involves A^H. |
[in] | m | The number of rows of the matrix A. m >= 0. |
[in] | n | The number of columns of the matrix A. n >= 0. |
[in] | nrhs | The number of right hand sides, i.e., the number of columns of the matrices B and X. nrhs >= 0. |
[in,out] | pA | On entry, pointer to the m-by-n matrix A. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_zgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_zgelqf. |
[in] | lda | The leading dimension of the array A. lda >= max(1,m). |
[out] | T | On exit, auxiliary factorization data. Matrix of T is allocated inside this function and needs to be destroyed by plasma_desc_destroy. |
[in,out] | pB | On entry, pointer to the m-by-nrhs matrix B of right-hand side vectors, stored columnwise. On exit, if return value = 0, B is overwritten by the solution vectors, stored columnwise: if trans = PlasmaNoTrans and m >= n, rows 1 to n of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements n+1 to m in that column; if trans = PlasmaNoTrans and m < n, rows 1 to n of B contain the minimum norm solution vectors; if trans = Plasma_ConjTrans and m >= n, rows 1 to m of B contain the minimum norm solution vectors; if trans = Plasma_ConjTrans and m < n, rows 1 to m of B contain the least squares solution vectors, and the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements m+1 to n in that column. |
[in] | ldb | The leading dimension of the array B. ldb >= max(1,m,n). |
PlasmaSuccess | successful exit |
< 0 | if -i, the i-th argument had an illegal value |
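A negative return value -i points at the i-th argument. The sketch below maps such a value back to a parameter name, assuming that i counts positions in the prototype shown for this function, starting at 1. The table `gels_args` and the function `gels_illegal_argument` are hypothetical helpers and not part of PLASMA.

```c
#include <assert.h>
#include <stddef.h>

/* Parameter names of plasma_zgels in call order, copied from the prototype. */
static const char *const gels_args[] = {
    "trans", "m", "n", "nrhs", "pA", "lda", "T", "pB", "ldb"
};

/* Returns the name of the argument flagged by a return value of -i,
 * or NULL if info is not in the range -9..-1. */
const char *gels_illegal_argument(int info)
{
    assert(sizeof gels_args / sizeof gels_args[0] == 9);
    if (info >= 0 || info < -9)
        return NULL;
    return gels_args[-info - 1];
}
```

A return value of -6 would therefore be reported as a problem with `lda`, and -9 as a problem with `ldb`.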
void plasma_omp_zgels (plasma_enum_t trans, plasma_desc_t A, plasma_desc_t T, plasma_desc_t B, plasma_workspace_t work, plasma_sequence_t *sequence, plasma_request_t *request)
Solves an overdetermined or underdetermined linear system of equations using the tile QR or the tile LQ factorization. May return before the computation is finished. Allows for pipelining of operations at runtime.
[in] | trans | PlasmaNoTrans: the linear system involves A; Plasma_ConjTrans: the linear system involves A^H. |
[in,out] | A | Descriptor of matrix A stored in the tile layout. On exit, if m >= n, A is overwritten by details of its QR factorization as returned by plasma_zgeqrf; if m < n, A is overwritten by details of its LQ factorization as returned by plasma_zgelqf. |
[out] | T | Descriptor of matrix T. Auxiliary factorization data, computed by plasma_zgeqrf or plasma_zgelqf. |
[in,out] | B | Descriptor of matrix B. On entry, right-hand side matrix B in the tile layout. On exit, solution matrix X in the tile layout. |
[in] | work | Workspace for the auxiliary arrays needed by some coreblas kernels. For QR/LQ factorizations used in GELS, it contains preallocated space for tau and work arrays. Allocated by the plasma_workspace_create function. |
[in] | sequence | Identifies the sequence of function calls that this call belongs to (for completion checks and exception handling purposes). |
[out] | request | Identifies this function call (for exception handling purposes). |
void | Errors are returned by setting sequence->status and request->status to error values. The sequence->status and request->status should never be set to PlasmaSuccess (the initial values) since another async call may be setting a failure value at the same time. |