Desmond and Jaguar parallel jobs fail when sent to an LSF queueing system. What can I do?


MPI parallel jobs may fail when using the LSF queueing system, giving an error in the log file similar to:

installation/mmshare-v19103/lib/Linux-x86_64/openmpi/bin/orterun: symbol lookup error: installation/mmshare-v19103/lib/Linux-x86_64/openmpi/lib/openmpi/mca_plm_lsf.so: undefined symbol: lsb_init

where installation is the directory that contains your Schrödinger software installation.

This failure is due to a bug in Open MPI that causes problems for tight integration with LSF. The problem has been fixed in Schrödinger Suite 2010, with a patch to the version of Open MPI in the Schrödinger software distribution. If you are using another version of Open MPI and would like to recompile it with the patch included, send email to help@schrodinger.com.

Otherwise, the simplest workaround is to disable the tight integration with the LSF queue. Your jobs should still run, but it will be harder for LSF to clean up after certain types of job failures (which should be rare). To disable the tight integr
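
To confirm that a failed job hit this particular problem rather than some other MPI failure, the error signature quoted above can be searched for in the job log. The following is only a minimal sketch in Python: the function name has_lsf_symbol_error and the pattern name are hypothetical helpers introduced here for illustration, not part of the Schrödinger software, Open MPI, or LSF, and the pattern assumes the log line has the same shape as the example error.

```python
import re

# Pattern for the error quoted in the answer: orterun cannot resolve the
# LSF symbol lsb_init while loading the launcher plugin mca_plm_lsf.so.
# The exact wording in a real log is assumed to match the example above.
LSF_SYMBOL_ERROR = re.compile(
    r"orterun: symbol lookup error: .*mca_plm_lsf\.so: undefined symbol: lsb_init"
)


def has_lsf_symbol_error(log_text: str) -> bool:
    """Return True if any line of log_text contains the LSF symbol lookup error."""
    return any(LSF_SYMBOL_ERROR.search(line) for line in log_text.splitlines())


if __name__ == "__main__":
    # Sample line copied from the error message in the answer.
    sample = (
        "installation/mmshare-v19103/lib/Linux-x86_64/openmpi/bin/orterun: "
        "symbol lookup error: installation/mmshare-v19103/lib/Linux-x86_64/"
        "openmpi/lib/openmpi/mca_plm_lsf.so: undefined symbol: lsb_init"
    )
    print(has_lsf_symbol_error(sample))
```

The check works on text that has already been read from the log, so it does not assume any particular log file name or location.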
