Dear dask community,
I am taking my first steps with Dask and distributed, and I would like to bring up a couple of ideas and suggestions.
The first one is about improving the integration with cluster job schedulers.
Dask workers are usually requested in bulk.
When the scheduler allows it, e.g. with SGE, I think it is good practice to submit them as a job (task) array rather than as multiple independent jobs.
On heavily loaded systems this would make the scheduler's work much easier.
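
To make the idea concrete, here is a minimal sketch of the difference between the two submission styles. It only builds the `qsub` command lines and does not run them. The function name `build_worker_submissions` and the scheduler address are made up for illustration, and this is not how the existing code submits workers. It assumes SGE's `qsub`, where `-t 1-N` requests an array of N tasks and `-b y` submits a command instead of a script.

```python
import shlex


def build_worker_submissions(n_workers, scheduler_address, use_array=True):
    """Return the qsub command lines needed to request n_workers Dask workers.

    Hypothetical helper for illustration only; it builds argument lists
    and never calls qsub itself.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")

    # Command each SGE task would run: start one worker and connect it
    # to the (assumed) scheduler address.
    worker_cmd = ["dask-worker", scheduler_address]
    base = ["qsub", "-b", "y"]

    if use_array:
        # One submission: a job array with task ids 1..n_workers.
        return [base + ["-t", f"1-{n_workers}"] + worker_cmd]

    # n_workers separate submissions, one independent job per worker.
    return [base + worker_cmd for _ in range(n_workers)]


if __name__ == "__main__":
    address = "tcp://192.0.2.10:8786"  # placeholder scheduler address
    for cmd in build_worker_submissions(100, address):
        print(shlex.join(cmd))
```

With `use_array=True` the batch system receives a single submission for all 100 workers, while `use_array=False` produces 100 separate ones.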
I don't know the code well enough yet to come up with a complete solution, and I think the code needs changes in multiple places, so I would like to be sure it's a good idea before working on it.
What do you think?