Enable Parallel Communication Through MDI #6

taylor-a-barnes · 2018-09-13T14:26:58Z

If possible, provide a means for two codes to communicate in parallel through the MDI interface. This will be a challenging project and will require considerable planning. One of the main complications is that the two codes might lay out their data structures in entirely different ways. One possible solution is to define a standard data structure layout for MDI. Each production code can then define a mapping of its own data structure onto the MDI layout for each relevant send or receive command. For example, the production code might define a mapping for the COORD data, which is used whenever it receives the COORD command. Once it knows the mappings, MDI is responsible for correctly shuttling information between the production code and the driver.
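To make the mapping idea concrete, here is a minimal sketch in C. It assumes a map is nothing more than a list of positions in the standardized layout, one per locally held value. The type `map_sketch` and the functions `pack_to_standard` and `unpack_from_standard` are hypothetical names for illustration and are not part of MDI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical map: entry i is the position in the standardized MDI
   layout of the i-th value held locally by the production code. */
typedef struct {
    size_t n_local;              /* number of locally held values */
    const size_t *global_index;  /* array of length n_local */
} map_sketch;

/* Copy local values into their slots in a standard-layout buffer. */
void pack_to_standard(const map_sketch *map, const double *local,
                      double *standard, size_t n_standard)
{
    for (size_t i = 0; i < map->n_local; i++) {
        assert(map->global_index[i] < n_standard);
        standard[map->global_index[i]] = local[i];
    }
}

/* Inverse operation: read this code's values back out of a
   standard-layout buffer into the code's own ordering. */
void unpack_from_standard(const map_sketch *map, const double *standard,
                          double *local, size_t n_standard)
{
    for (size_t i = 0; i < map->n_local; i++) {
        assert(map->global_index[i] < n_standard);
        local[i] = standard[map->global_index[i]];
    }
}
```

In a real design the map would probably need to describe strides or blocks rather than one index per value, but the index list keeps the idea visible.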

taylor-a-barnes · 2019-08-23T18:39:06Z

One possible approach is described here.

Every message sent through MDI is prefaced with an integer, recv_dist, which is a flag that indicates whether a map is going to be sent before the remainder of the message.
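The framing described above could look roughly like the following sketch. It assumes, purely for illustration, that the message is a buffer of ints and that a map is serialized as a length followed by its entries; neither detail is specified in this proposal. The functions `write_header` and `read_header` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the recv_dist flag, and the map if one is supplied, at the
   start of buf. Returns the number of ints written, which is also the
   offset at which the remainder of the message begins.
   Assumed layout: [recv_dist] or [recv_dist][map_len][map entries]. */
size_t write_header(int *buf, size_t cap, const int *map, size_t map_len)
{
    size_t need = (map != NULL) ? 2 + map_len : 1;
    assert(cap >= need);
    buf[0] = (map != NULL);  /* recv_dist: 1 if a map follows, else 0 */
    if (map != NULL) {
        buf[1] = (int)map_len;
        memcpy(buf + 2, map, map_len * sizeof(int));
    }
    return need;
}

/* Read the header back. On return, *map points into buf (or is NULL
   when recv_dist is 0). Returns the offset of the remaining message. */
size_t read_header(const int *buf, const int **map, size_t *map_len)
{
    if (buf[0] == 0) {
        *map = NULL;
        *map_len = 0;
        return 1;
    }
    *map_len = (size_t)buf[1];
    *map = buf + 2;
    return 2 + *map_len;
}
```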

In order for a driver or an engine to communicate in a distributed manner, one or both of them must first call:

MDI_Create_Map(&my_elements, &map_descriptor)
MDI_Set_Map("COMMAND", &map_descriptor, comm)

When a DRIVER does the above, it sends a special >MAP command through comm, which provides the engine with the driver's map for that command. When an ENGINE does the above, MDI stores the map, but does not send anything to the driver.

When the driver sends a command that requests information (a command starting with "<"): On the engine side, MDI sets the recv_dist flag in its message to the driver, and prepends the engine's map to the remainder of the message. On the driver side, MDI sees that the recv_dist flag has been set, and reads in the engine's map before the rest of the message. On both the engine side and the driver side, MDI has enough information to determine the sender and receiver of each element of the message, and can call MPI_Alltoallw to communicate the data.

When the driver sends a command that gives information to the engine (a command starting with ">"): On the driver side, MDI sets the recv_dist flag, but does not prepend the driver's map to the remainder of the message (the engine already has this information). The driver-side MDI knows how many MPI ranks are associated with the engine, and sends equal chunks of the message to each rank of the engine in a standardized distribution. On the engine side, MDI sees that the recv_dist flag has been set and receives the message in the standardized distribution through an MPI_Alltoallw. It then uses the map previously set by the engine code to perform a second MPI_Alltoallw to convert from the standardized distribution to the engine's local data distribution.
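As an illustration of the standardized distribution, the sketch below computes which engine rank holds which elements when a message of n elements is split into near-equal chunks. How a remainder is handled is not stated above; this sketch assumes the first n % nranks ranks each take one extra element. All function names are hypothetical. The per-rank counts produced by `count_per_owner` are the kind of information a second MPI_Alltoallw would need, though no MPI calls are made here.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements held by rank r in the standardized distribution. */
size_t chunk_count(size_t n, size_t nranks, size_t r)
{
    assert(nranks > 0 && r < nranks);
    return n / nranks + (r < n % nranks ? 1 : 0);
}

/* Index of the first element held by rank r. */
size_t chunk_offset(size_t n, size_t nranks, size_t r)
{
    assert(nranks > 0 && r < nranks);
    size_t base = n / nranks, extra = n % nranks;
    return r * base + (r < extra ? r : extra);
}

/* Rank that holds element g in the standardized distribution. */
size_t chunk_owner(size_t n, size_t nranks, size_t g)
{
    assert(nranks > 0 && g < n);
    size_t base = n / nranks, extra = n % nranks;
    size_t big = extra * (base + 1);  /* elements in the larger chunks */
    if (g < big)
        return g / (base + 1);
    return extra + (g - big) / base;  /* base > 0 whenever g >= big */
}

/* For the elements listed in an engine rank's map, count how many
   arrive from each rank of the standardized distribution. */
void count_per_owner(const size_t *global_index, size_t n_local,
                     size_t n, size_t nranks, int *counts)
{
    for (size_t r = 0; r < nranks; r++)
        counts[r] = 0;
    for (size_t i = 0; i < n_local; i++)
        counts[chunk_owner(n, nranks, global_index[i])]++;
}
```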

In order to avoid the need for this second MPI_Alltoallw, it might be possible to have an MDI_Update_Maps() function that synchronizes the maps of all of the codes. The engine-side MDI would then need to keep track of whether the driver-side MDI has the most up-to-date map. If not, the engine-side MDI receives the data in a manner consistent with the most recent map sent to the driver, and then does a second MPI_Alltoallw to convert from the old map to the new map.
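The bookkeeping this would require on the engine side can be sketched with a pair of version counters. This is only one possible way to track whether the driver has the latest map; the struct and function names below are hypothetical, and `update_maps` merely stands in for the effect an MDI_Update_Maps() call would have on this state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical engine-side record of map versions. */
typedef struct {
    unsigned engine_version;  /* bumped whenever the engine sets a new map */
    unsigned driver_version;  /* version most recently sent to the driver */
} map_sync_sketch;

/* The engine code registers a new map; the driver's copy is now stale. */
void engine_set_map(map_sync_sketch *s)
{
    assert(s != NULL);
    s->engine_version++;
}

/* Effect of synchronizing maps: the driver now holds the current map. */
void update_maps(map_sync_sketch *s)
{
    assert(s != NULL);
    s->driver_version = s->engine_version;
}

/* True when data arrives laid out for an old map, so a second
   redistribution from the old map to the new map is still needed. */
bool needs_second_redistribution(const map_sync_sketch *s)
{
    assert(s != NULL);
    return s->driver_version != s->engine_version;
}
```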

taylor-a-barnes modified the milestones: 1.1, 2.0 Jan 3, 2019
