Configure Dask workers to contact scheduler on a specific address #548

Description

@etejedor

At CERN we have a Jupyter notebook service that we are now integrating with HTCondor resources, and we would like to use those resources via Dask.

The setup is the following: users log in to the notebook service and get a user session, which runs in a Docker container. Inside their session, users should be able to create a Dask HTCondorCluster to deploy Dask workers on our HTCondor pool. The problem we have is that the address the scheduler binds to can't be the same as the address workers use to contact the scheduler. The scheduler runs inside the container and should listen on an address:port of the container's private network. However, the workers (which run in another network, in the HTCondor pool) should contact the scheduler on an address:port of the node that hosts the user container, for which we would set up port forwarding to reach the container.

It looks like there is currently no way for the workers to receive a scheduler address different from the one the scheduler binds to. We found dask/distributed#2963, but that only allows specifying a different address for the client to contact the scheduler (i.e. the scheduler must still bind to the same address that the workers receive).

Would it be interesting to support a use case like the one I just described? How could it be implemented? Perhaps via a new scheduler option that specifies which address workers should use to connect to it. The naming should be clear, to avoid confusion with the existing external_address (added in dask/distributed#2963).
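To make the request concrete, here is a minimal sketch of the behaviour we have in mind. It is plain Python and does not use any Dask API: the class `SchedulerAddresses`, the field `worker_contact_address`, and both addresses are hypothetical names and placeholders chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchedulerAddresses:
    """Hypothetical model of the two addresses; not part of the Dask API."""

    # Address the scheduler binds to, inside the container's private network.
    listen_address: str
    # Hypothetical option: address handed to workers (the forwarded port on
    # the node hosting the container). None means "same as listen_address".
    worker_contact_address: Optional[str] = None

    def address_for_workers(self) -> str:
        # Fall back to the listen address when no separate address is given,
        # so the default stays identical to the current behaviour.
        return self.worker_contact_address or self.listen_address


# Placeholder values, assumed for illustration only.
addrs = SchedulerAddresses(
    listen_address="tcp://172.17.0.2:8786",
    worker_contact_address="tcp://host-node.example.org:31000",
)
print(addrs.address_for_workers())
```

The idea is that leaving the new option unset would change nothing for existing users, while setting it would only affect the address passed to the worker jobs, not the address the scheduler binds to.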

Pinging @oshadura as she had a proposal for such a patch.

(Previously discussed in:
https://dask.discourse.group/t/dask-scheduler-in-a-docker-container-workers-as-htcondor-jobs)
