LSFCluster may be overly specific? #328

@mrocklin

I was trying out dask-jobqueue on the Summit supercomputer at Oak Ridge National Laboratory. I ran into a number of problems with our current configuration, all of which seem to come from special-cased behavior. I propose un-special-casing these, but I would like to get feedback from others who use LSF first.

bsub command

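We currently do some odd things with bsub: the submit command is "bsub <", and _submit_job runs it through a shell with stderr redirected to /dev/null. This made submission fail for me on a login node (although it did work on a compute node). Removing this special-cased behavior made things work well for me in both places.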

diff --git a/dask_jobqueue/lsf.py b/dask_jobqueue/lsf.py
index 95042a8..12126ee 100644
--- a/dask_jobqueue/lsf.py
+++ b/dask_jobqueue/lsf.py
@@ -53,7 +53,7 @@ class LSFJob(Job):
     """,
         4,
     )
-    submit_command = "bsub <"
+    submit_command = "bsub"
     cancel_command = "bkill"

     def __init__(
@@ -134,10 +134,6 @@ class LSFJob(Job):

         logger.debug("Job script: \n %s" % self.job_script())

-    def _submit_job(self, script_filename):
-        piped_cmd = [self.submit_command + " " + script_filename + " 2> /dev/null"]
-        return self._call(piped_cmd, shell=True)

cc'ing @raybellwaves and @guillaumeeb, who show up under git blame for this code
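
For anyone comparing the two invocation styles: "bsub < script" depends on a shell redirect, which is why the current _submit_job needs shell=True. If some sites turn out to require the stdin form, one possible middle ground is to attach the script to stdin directly and skip the shell. The sketch below is only an illustration of that idea, not what the diff above does; submit_via_stdin is a hypothetical helper that does not exist in dask-jobqueue, and unlike the current code it keeps stderr instead of discarding it.

```python
import subprocess


def submit_via_stdin(command, script_filename):
    """Run `command` with the job script attached to stdin, without a shell.

    Hypothetical helper for illustration only. `command` is an argument
    list such as ["bsub"]; no shell redirect or shell=True is involved.
    """
    with open(script_filename, "rb") as script:
        result = subprocess.run(
            command,
            stdin=script,            # equivalent of "command < script_filename"
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,  # kept for debugging rather than sent to /dev/null
            check=True,              # raise CalledProcessError on a non-zero exit
        )
    return result.stdout.decode()
```

On a machine with LSF installed this would be called as submit_via_stdin(["bsub"], "job.sh"). Whether a given LSF deployment reads #BSUB directives from a filename argument, from stdin, or both is something I would want confirmed by other LSF users.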

-R "span[hosts=1]"

My particular deployment didn't like these lines. I don't know how useful they are in general, though, and I can work around them.

            if ncpus > 1:
                # span[hosts=1] _might_ affect queue waiting
                # time, and is not required if ncpus==1
                header_lines.append('#BSUB -R "span[hosts=1]"')

cc @louisabraham (also pointed to by git blame)
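
If the span[hosts=1] request is useful on some clusters but rejected on others, one option would be to make it opt-out rather than unconditional. The sketch below shows the idea in isolation; lsf_resource_lines and the use_span_hosts parameter are hypothetical names for illustration and are not part of the current LSFJob code.

```python
def lsf_resource_lines(ncpus, use_span_hosts=True):
    """Return the #BSUB resource lines for a job (hypothetical helper).

    `use_span_hosts` is an assumed opt-out flag; the existing code always
    adds the span[hosts=1] request when ncpus > 1.
    """
    header_lines = []
    if ncpus > 1 and use_span_hosts:
        # span[hosts=1] _might_ affect queue waiting
        # time, and is not required if ncpus==1
        header_lines.append('#BSUB -R "span[hosts=1]"')
    return header_lines
```

With the default the behavior matches the snippet above; passing use_span_hosts=False would drop the line on deployments that reject it.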
