Skip to content

OAR: add async to _submit_job method#380

Merged
lesteve merged 2 commits into
dask:masterfrom
msimonin:oar/async
Feb 5, 2020
Merged

OAR: add async to _submit_job method#380
lesteve merged 2 commits into
dask:masterfrom
msimonin:oar/async

Conversation

@msimonin

@msimonin msimonin commented Feb 5, 2020

Copy link
Copy Markdown
Contributor

_submit_job is awaited by the Job.start.

msimonin and others added 2 commits February 5, 2020 15:26
_submit_job is awaited by the `Job.start`.
@lesteve

lesteve commented Feb 5, 2020

Copy link
Copy Markdown
Member

Thanks a lot for fixing this! This bug was very likely added with the SpecCluster refactor (#307) and not noticed since the 0.7 release ...

I added an entry in the changelog, so merging this!

@lesteve lesteve merged commit 3f4cd0c into dask:master Feb 5, 2020
@lesteve

lesteve commented Feb 5, 2020

Copy link
Copy Markdown
Member

As discussed IRL, if you ever get the chance to add CI for OAR, it would be super appreciated!

I would also make OAR more of a first citizen in dask-jobqueue (as SLURM, SGE and PBS) and help avoid similar problems in the future. We use docker-compose for your CI, you can have a look how it is done for SLURM, SGE and PBS: https://github.com/dask/dask-jobqueue/tree/master/ci.

While I am at it, one of the limitation of OARCluster: in the OAR cluster I have access to, oarsub is not installed on the compute nodes. My understanding is that this is the default setup in OAR.

If you create your OARCluster on a compute node (rather than on the login node, which is easier but not completely best practice compliant), a work-around as the one mentioned #333 (comment) is probably necessary.

If it turns out that oarsub is not installed on the compute nodes on your OAR cluster either, we could add a parameter to OARCluster e.g. login_node_host or something similar to make OARCluster easier to use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants