When I submit my jobs to run on the Myrinet nodes, why do they continually fail to start?


For single-processor jobs, never restrict the job to “myrinet” nodes. Myrinet is most useful for multi-processor, multi-node jobs that benefit from faster communication between nodes than Ethernet provides. Because such jobs can run only on the Myrinet nodes, we want to avoid filling those nodes with jobs that do not need Myrinet while other nodes are available. A single-processor job runs on one machine and gains nothing from faster inter-node communication, so specifying “myrinet” in its PBS script is simply not allowed.

If you are running code compiled with Myrinet, it should be parallel code. If it is not, you need to recompile your code without any Myrinet options.

N.B.: even though a single-processor job may not explicitly request “myrinet” in its PBS script, the pool of x86 Myrinet nodes is still available to you as long as your code is compiled to run on the x86
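As a rough illustration of the rule above, a single-processor PBS script might look like the sketch below. It assumes a Torque/OpenPBS-style `-l nodes=...` resource list. The job name and the program name are hypothetical, and the exact resource syntax depends on how your site has configured the scheduler.

```sh
#!/bin/sh
# Sketch of a single-processor job script (assumes Torque/OpenPBS-style directives).
# Request one processor on one node and do NOT name the "myrinet" node property.
#PBS -l nodes=1:ppn=1
#PBS -N serial_job
# "serial_job" above is a hypothetical job name.

# Hypothetical executable, compiled without any Myrinet options:
# ./my_serial_program
```

Under the same assumptions, a parallel multi-node job would name the property in its resource request instead, for example `#PBS -l nodes=4:ppn=2:myrinet`. The node and processor counts here are placeholders.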
