
Does Open MPI support end-to-end data reliability in MPI message passing?

April 26, 2017

1 Answer

The current release of Open MPI does not support end-to-end data reliability in message passing beyond what the underlying network already guarantees. Future releases of Open MPI will include explicit data reliability support (i.e., more functionality than the underlying network provides).

Specifically, the data reliability (“dr”) PML component (available on the trunk, but not yet in a stable release) assumes that the underlying network is unreliable. It can drop and restart connections, retransmit corrupted or lost data, etc. The end effect is that data sent through MPI API functions will be guaranteed to be reliable.

For example, if you’re using TCP as a message transport, the chance of data corruption is fairly low. However, other interconnects do not guarantee that data will be uncorrupted when traveling across the network. Additionally, there is a nonzero possibility that data can be corrupted while traversing PCI buses, etc. (some corruption errors at this level can be caugh