-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New collcommobj #329
base: master
Are you sure you want to change the base?
New collcommobj #329
Conversation
Where possible an ifdef was added. So, depending on the cmake flag for ENABLE_PERSISTENT, the code is compiled differently.
Signed-off-by: Christoph Niethammer <[email protected]>
…onfig Move cmake configuration for persistent collectives into the mpi module
This updated wrapper has an improved interface and is more compact due to the environment class being internal.
This function does not clear the forces in the SoA structure.
This reverts commit 4ba2dff.
This reverts commit 813b6ec.
This reverts commit 3adc768.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive template magic :D
Can you explain in the main text of the PR why all of this is "just" behind an ifdef and why we can't simply get rid of the old communicator class?
class Coll_Comm_Obj | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(most of) the rest of the code places the opening {
not on a new line. As soon as #342 will be finalized this will be reformatted so we might as well do it now upon introducing new code.
template<int tag, int Fn, typename Op, typename Root, typename Comm, typename... Ts> | ||
class Coll_Comm_Obj |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add doxygen documentation in javadoc style for all classes, functions, and members.
e.g. here:
template<int tag, int Fn, typename Op, typename Root, typename Comm, typename... Ts> | |
class Coll_Comm_Obj | |
/** | |
* Some brief description | |
* | |
* Some more verbose extensive description. Maybe just the large block above. | |
* | |
* @tparam tag explanation | |
* @tparam Fn explanation | |
* ... | |
*/ | |
template<int tag, int Fn, typename Op, typename Root, typename Comm, typename... Ts> | |
class Coll_Comm_Obj |
// this function requires that the MPI_Comm and MPI_Op have been setup in the constructor | ||
template<typename T = Comm, typename U = Op> | ||
auto allreduce() | ||
-> std::enable_if_t<std::is_same_v<T, MPI_Comm> && (std::is_same_v<U, add_struct> || std::is_same_v<U, max_struct> || std::is_same_v<U, min_struct>)> | ||
{ | ||
MPI_Allreduce( MPI_IN_PLACE, _buffer.data(), 1, _mpi_members->get_type(), _mpi_members->get_op(), _mpi_comm ); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a reminder to add javadoc to all functions as described above.
Also, please explain a bit of the technicalities of the template magic. There are mostly non-computer scientists working on this code ;)
class MPI_Env_Wrapper | ||
{ | ||
public: | ||
static auto init_environment(int* argc, char*** argv) | ||
{ | ||
static MPI_Environment mpi_env(argc, argv); | ||
_mpi_env = &mpi_env; | ||
} | ||
static auto init_thread_environment(int* argc, char*** argv, int required) | ||
{ | ||
static MPI_Environment mpi_env(argc, argv, required); | ||
_mpi_env = &mpi_env; | ||
} | ||
private: | ||
// class to manage MPI_Init and MPI_Finalize | ||
class MPI_Environment | ||
{ | ||
public: | ||
MPI_Environment() = default; | ||
|
||
MPI_Environment(int* argc, char*** argv) | ||
{ | ||
MPI_Init(argc, argv); | ||
} | ||
MPI_Environment(int* argc, char*** argv, int required) | ||
{ | ||
int provided; | ||
MPI_Init_thread(argc, argv, required, &provided); | ||
if (provided < required) | ||
{ | ||
std::cerr << "Cannot provide requested level of MPI thread support." << std::endl; | ||
MPI_Abort(MPI_COMM_WORLD, 1); | ||
} | ||
} | ||
|
||
~MPI_Environment() | ||
{ | ||
MPI_Finalize(); | ||
} | ||
}; | ||
|
||
// storing the class until the end of the program | ||
static inline MPI_Environment* _mpi_env; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere? If not I'd rather not add it to the master branch.
#ifndef COLLECTIVECOMMUNICATION_PERSISTENT_HELPER_H_ | ||
#define COLLECTIVECOMMUNICATION_PERSISTENT_HELPER_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer #pragma once
but ls1 isn't consistent with this so 🤷
// check if we were given a communicator | ||
if constexpr( std::is_same_v<Comm, MPI_Comm> ) | ||
_mpi_comm = comm; | ||
// check if we were can cerate a persistent communication request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// check if we were can cerate a persistent communication request | |
// check if we should create a persistent communication request |
std::array<int, sizeof...(Ts)> b; | ||
b.fill(1); | ||
std::array<MPI_Aint, sizeof...(Ts)> d; | ||
std::array<MPI_Datatype, sizeof...(Ts)> t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use more expressive variable names ;)
static inline std::vector<MPI_Request*> _allocated_requests; | ||
static inline std::vector<MPI_Op*> _allocated_ops; | ||
static inline std::vector<MPI_Datatype*> _allocated_types; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand this correctly, these vectors accumulate all MPI requests, ops, and types created throughout the simulation. Doesn't this mean that in a very long-running simulation, this can accumulate a lot of trash if the persistent communication isn't reusing all of this perfectly? Could this lead to problems? Can we somehow build safeguards / checks against this?
template<int T = Fn> | ||
auto persistent() | ||
-> std::enable_if_t< T != 0 > | ||
{ | ||
MPI_Start( &_mpi_members->get_request() ); | ||
MPI_Wait( &_mpi_members->get_request(), MPI_STATUS_IGNORE ); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of this name. It sounds like it makes the CollCom object persistent (whatever that would mean). Maybe something along executePersistentRequests
or shorter.
For consistency, the next two functions then might need renaming as well.
unsigned long nNumMolsGlobalEnergyGlobal; | ||
double UkinGlobal; | ||
double UkinTransGlobal; | ||
double UkinRotGlobal; | ||
collComm.get(nNumMolsGlobalEnergyGlobal, UkinGlobal, UkinTransGlobal, UkinRotGlobal); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already talked about this but it would be really cool if we could do something like this:
unsigned long nNumMolsGlobalEnergyGlobal; | |
double UkinGlobal; | |
double UkinTransGlobal; | |
double UkinRotGlobal; | |
collComm.get(nNumMolsGlobalEnergyGlobal, UkinGlobal, UkinTransGlobal, UkinRotGlobal); | |
auto [nNumMolsGlobalEnergyGlobal, UkinGlobal, UkinTransGlobal, UkinRotGlobal] = collComm.get(); |
Although looking at the implementation behind get()
I'm not sure how to easily do that 😬 . So if it would be too complicated leave it be.
Description
Implemented persistent communication class. This class can also be used to replace most of the already present collective communication in the code.
Related Pull Requests
Resolved Issues
How Has This Been Tested?
Tested with Spinodal Decomposition example and unit tests.