ci: test with mpirun -np 4, 1 GPU per rank #48
dac55ca to 66923f6
08a7b1d to f7fc7af
Commit 3c4cdcc has 3 failures. CUDA has 2 failures, one in Cholesky QR and one in SVD:
AMD has 1 failure, only in the CMake build:
I can't reproduce the AMD error on histamine inside Docker. Bewildering. Unfortunately, I can't run Docker on dopamine right now.
2 routines FAILED: hegv, gesvd
On CUDA,
On ROCm,
b4cc33f to 22dec52
Current failures:
The `gpu_bind.sh` script avoids oversubscribing GPUs, which can be detrimental, e.g., on a DGX with 4 ranks, each using all 8 GPUs! However, with this binding the CI no longer tests the multi-GPU-per-MPI-rank code.
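For readers unfamiliar with per-rank GPU binding, here is a minimal sketch of how such a wrapper can work. It is not the `gpu_bind.sh` from this PR, whose contents are not shown here. The function names `pick_gpu` and `local_rank_from_env` and the `NUM_GPUS` variable are hypothetical. The sketch assumes the launcher exports one of the node-local rank variables `OMPI_COMM_WORLD_LOCAL_RANK` (Open MPI), `MPI_LOCALRANKID` (MPICH/Hydra), or `SLURM_LOCALID` (Slurm).

```shell
#!/bin/sh
# Hypothetical sketch of a per-rank GPU binding wrapper.
# NOT the project's gpu_bind.sh; names below are illustrative only.

# Map a node-local rank to a single GPU index, wrapping around
# when there are more ranks than GPUs on the node.
pick_gpu() {
    local_rank=$1
    num_gpus=$2
    echo $(( local_rank % num_gpus ))
}

# Read the node-local rank from whichever launcher variable is set;
# fall back to 0 when none is present (e.g., a non-MPI run).
local_rank_from_env() {
    echo "${OMPI_COMM_WORLD_LOCAL_RANK:-${MPI_LOCALRANKID:-${SLURM_LOCALID:-0}}}"
}

# Only bind and launch when a command was passed to the wrapper.
if [ "$#" -gt 0 ]; then
    # NUM_GPUS is an assumed variable giving the GPU count per node.
    gpu=$(pick_gpu "$(local_rank_from_env)" "${NUM_GPUS:-1}")
    export CUDA_VISIBLE_DEVICES="$gpu"   # restricts the CUDA runtime
    export ROCR_VISIBLE_DEVICES="$gpu"   # restricts the ROCm runtime
    exec "$@"
fi
```

Under these assumptions, it would be invoked as something like `NUM_GPUS=4 mpirun -np 4 ./bind_sketch.sh ./tester`, so each rank sees exactly one device. That is also why the multi-GPU-per-rank code path goes unexercised.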