Revisiting this briefly --- we have found that `MPI.has_cuda()` only tests Open MPI, so it is not a general solution for determining whether CUDA-aware MPI is available.

However, there are other possible solutions. For example, we can write a test like
```julia
using MPI
using CUDA

MPI.Init()

function sendrecv_works(grid)
    arch = architecture(grid)
    comm = arch.communicator
    rank = arch.local_rank
    size = MPI.Comm_size(comm)

    # Each rank sends to the next rank and receives from the previous one.
    dst = mod(rank + 1, size)
    src = mod(rank - 1, size)

    N = 4
    FT = eltype(grid)
    send_mesg = CuArray{FT}(undef, N)
    recv_mesg = CuArray{FT}(undef, N)
    fill!(send_mesg, FT(rank))

    CUDA.synchronize()

    try
        MPI.Sendrecv!(send_mesg, dst, 0, recv_mesg, src, 0, comm)
        return true
    catch err
        @warn "MPI.Sendrecv test failed." exception=(err, catch_backtrace())
        return false
    end
end
```

adapted from https://gist.github.com/luraess/0063e90cb08eb2208b7fe204bbd90ed2

Originally posted by @glwagner in #3883 (comment)
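For context, `sendrecv_works` assumes a distributed grid whose architecture carries `communicator` and `local_rank` fields, as in Oceananigans' `Distributed` architecture. A hypothetical invocation might look like the following; the constructor arguments are illustrative assumptions, not taken from the thread:

```julia
# Hypothetical usage sketch; the Distributed(GPU()) constructor and the
# grid arguments below are assumptions for illustration.
using Oceananigans

arch = Distributed(GPU())
grid = RectilinearGrid(arch, size=(16, 16, 16), extent=(1, 1, 1))

if sendrecv_works(grid)
    @info "CUDA-aware MPI appears to be available."
else
    @warn "CUDA-aware MPI test failed; device buffers must be staged through host memory."
end
```

Launched under `mpiexec` with two or more ranks, each rank exchanges a small `CuArray` with its neighbors, so the `Sendrecv!` call exercises the device-buffer code path directly.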
Sounds good. We should also test that the message is received correctly; that way, the test cannot pass if MPI is configured incorrectly (for example, when erroneously `size == 1` and `dst == src`, as in @francispoulin's case in #3981, which would make this pass even without CUDA-aware MPI).
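A minimal sketch of that content check, reusing the buffers and the `Sendrecv!` call form from `sendrecv_works` above; the helper name and the `size > 1` guard are illustrative additions, not from the thread:

```julia
# Sketch of the suggested content check (the helper name is hypothetical).
# Each rank fills its send buffer with its own rank, so rank r should
# receive the value of src = mod(r - 1, size).
function sendrecv_delivers_correct_message(grid)
    arch = architecture(grid)
    comm = arch.communicator
    rank = arch.local_rank
    size = MPI.Comm_size(comm)

    dst = mod(rank + 1, size)
    src = mod(rank - 1, size)

    N = 4
    FT = eltype(grid)
    send_mesg = CuArray{FT}(undef, N)
    recv_mesg = CuArray{FT}(undef, N)
    fill!(send_mesg, FT(rank))
    CUDA.synchronize()

    try
        MPI.Sendrecv!(send_mesg, dst, 0, recv_mesg, src, 0, comm)
    catch err
        @warn "MPI.Sendrecv test failed." exception=(err, catch_backtrace())
        return false
    end

    # Copy to host and verify contents: every element should equal src's rank.
    received_correctly = all(Array(recv_mesg) .== FT(src))

    # Extra (assumed) safeguard against the size == 1 misconfiguration noted
    # above: a self-send can succeed even without CUDA-aware MPI.
    return size > 1 && received_correctly
end
```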