Polar decomposition QDWH #2
…oid having a clean up tile in the middle of the W matrix
…ts to slate calls
… != q. Changed dd computing in qdwh for-now and minor changes
Check for warnings, i.e., add |
…trcondest. Reduced the condition number of the tested matrix
First pass through. Probably more changes later on.
…or update on gflops count
…from geqrf and geqrf_qdwh_full
All tests passed, except one failure for gels using cholqr.
lapack::Gflop<scalar_t>::potrf(m) +
blas::Gflop<scalar_t>::trsm(slate::Side::Left, m, n) );

double gflop_compute_H = blas::Gflop<scalar_t>::her2k(n, m);
Currently, this is really `gemm`, but eventually it should be `herk` (i.e., `herkx`) instead of `her2k`.
(I'll fix.)
The QDWH polar decomposition of a general matrix A is A = U * H, where U is the orthogonal polar factor and H is the Hermitian polar factor.
QDWH relies on Cholesky-based and QR-based iterations to compute the orthogonal polar factor U.
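To make the two kinds of iterations concrete, here is a minimal scalar sketch of the dynamically weighted Halley recurrence as it is usually stated in the QDWH literature. It is not the code in this PR: the names `qdwh_weights`, `qdwh_update_bound`, `qdwh_use_qr`, and `qdwh_count_iterations` are hypothetical, and the switch threshold of 100 between the QR-based and Cholesky-based forms is an assumption taken from common practice. Here `l` is a lower bound on the smallest singular value of the scaled iterate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Weights (a, b, c) of one dynamically weighted Halley step.
struct QdwhWeights { double a, b, c; };

// Compute the weights from l, a lower bound on the smallest singular
// value of the current (scaled) iterate, with 0 < l <= 1.
QdwhWeights qdwh_weights(double l) {
    assert(l > 0.0 && l <= 1.0);
    double l2 = l * l;
    double d = std::cbrt(4.0 * (1.0 - l2) / (l2 * l2));
    double s = std::sqrt(1.0 + d);
    double a = s + 0.5 * std::sqrt(8.0 - 4.0 * d + 8.0 * (2.0 - l2) / (l2 * s));
    double b = (a - 1.0) * (a - 1.0) / 4.0;
    double c = a + b - 1.0;
    return {a, b, c};
}

// Each singular value x is mapped to x (a + b x^2) / (1 + c x^2);
// applying the same map to l gives the bound for the next iterate.
// The result is clamped to 1 to guard against rounding.
double qdwh_update_bound(double l, const QdwhWeights& w) {
    double l2 = l * l;
    return std::min(1.0, l * (w.a + w.b * l2) / (1.0 + w.c * l2));
}

// Assumed switch: use the QR-based form while c is large,
// and the cheaper Cholesky-based form otherwise.
bool qdwh_use_qr(const QdwhWeights& w) { return w.c > 100.0; }

// Count the steps needed until the bound reaches 1 within tol.
int qdwh_count_iterations(double l0, double tol = 1e-12, int max_iter = 50) {
    double l = l0;
    int it = 0;
    while (std::fabs(1.0 - l) > tol && it < max_iter) {
        l = qdwh_update_bound(l, qdwh_weights(l));
        ++it;
    }
    return it;
}
```

For a well-conditioned iterate (`l` close to 1) the weights approach a = 3, b = 1, c = 3, which is the plain Halley iteration; for small `l` the weight `c` is large, which is where the QR-based form is typically preferred.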
For the QR-based iterations, new customized routines geqrf_qdwh_full and unmqr_qdwh_full are included; they take advantage of the identity structure of the matrix involved in those iterations.
A 2-norm estimate (norm2est) of the original matrix is required; norm2est is implemented using power iteration and is called in QDWH.
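As an illustration of the idea only, a 2-norm estimate by power iteration can be sketched on a small dense row-major matrix as follows. This is not the SLATE norm2est routine: the function name `norm2est_sketch`, the all-ones starting vector, the tolerance, and the iteration cap are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Estimate ||A||_2 for an m-by-n row-major matrix by power iteration
// on A^T A. The estimate ||A x|| with ||x|| = 1 never exceeds the
// true 2-norm and grows toward it as x aligns with the dominant
// right singular vector.
double norm2est_sketch(const std::vector<double>& A, std::size_t m,
                       std::size_t n, int max_iter = 100,
                       double tol = 1e-10) {
    assert(A.size() == m * n);
    // Assumed starting vector: normalized all-ones.
    std::vector<double> x(n, 1.0 / std::sqrt(static_cast<double>(n)));
    std::vector<double> y(m), z(n);
    double est = 0.0;
    for (int it = 0; it < max_iter; ++it) {
        // y = A x
        for (std::size_t i = 0; i < m; ++i) {
            y[i] = 0.0;
            for (std::size_t j = 0; j < n; ++j) y[i] += A[i * n + j] * x[j];
        }
        double ny = 0.0;
        for (double v : y) ny += v * v;
        ny = std::sqrt(ny);
        if (ny == 0.0) return est;  // x lies in the null space of A
        // z = A^T y
        for (std::size_t j = 0; j < n; ++j) {
            z[j] = 0.0;
            for (std::size_t i = 0; i < m; ++i) z[j] += A[i * n + j] * y[i];
        }
        double nz = 0.0;
        for (double v : z) nz += v * v;
        nz = std::sqrt(nz);
        if (nz == 0.0) return ny;
        // Normalize the next iterate.
        for (std::size_t j = 0; j < n; ++j) x[j] = z[j] / nz;
        // Stop once the estimate has stagnated.
        bool converged = std::fabs(ny - est) <= tol * ny;
        est = ny;
        if (converged) break;
    }
    return est;
}
```

A fixed starting vector can be nearly orthogonal to the dominant singular vector for some matrices, so a production routine would normally choose the start more carefully; QDWH only needs a rough estimate to scale the initial iterate.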
The following figure presents the performance of SLATE_QDWH on Summit using various numbers of nodes.
[Figure: performance of SLATE_QDWH on Summit]