Dask client connects to PBS workers, then rapidly loses them

The title is intentionally analogous to #20 as I have the feeling the explanation for the observed behavior is similar.

I'm on a PBS cluster where each node has 2 CPUs with 14 cores each (28 cores per node).

I was initially calling:
```
from dask_jobqueue import PBSCluster

cluster = PBSCluster(queue='mpi_1', local_directory=local_dir, interface='ib0', walltime='24:00:00',
                     threads=4, processes=7, memory='10GB', resource_spec='select=1:ncpus=28:mem=100g',
                     death_timeout=100)
```

The workers were created, but they died shortly after starting.

The following choice seems to fix the issue:
```
threads=14, processes=2, memory='50GB', 
```

The following page describes dask workers and may be useful to readers with similar issues:
http://distributed.readthedocs.io/en/latest/worker.html

Note that the relationship between the cluster architecture and the options that can be passed to PBSCluster is still not entirely clear to me.
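To make that relationship a bit more concrete, here is a rough sanity check that compares the two layouts above with what `resource_spec='select=1:ncpus=28:mem=100g'` requests. It is only a sketch: the helper `summarize_job` is hypothetical, and it assumes that `threads` counts threads per worker process and that `memory` is the limit per worker process, which may depend on the dask-jobqueue version.

```python
def summarize_job(threads, processes, memory_gb, node_cores, node_memory_gb):
    """Compare a worker layout against the resources requested for one PBS job.

    Assumptions (not confirmed for every dask-jobqueue version):
    `threads` is the number of threads per worker process and
    `memory_gb` is the memory limit per worker process.
    """
    total_threads = threads * processes
    total_memory_gb = memory_gb * processes
    return {
        "total_threads": total_threads,
        "total_memory_gb": total_memory_gb,
        "fits_cores": total_threads <= node_cores,
        "fits_memory": total_memory_gb <= node_memory_gb,
    }


# Both layouts are checked against select=1:ncpus=28:mem=100g
initial = summarize_job(threads=4, processes=7, memory_gb=10,
                        node_cores=28, node_memory_gb=100)
fixed = summarize_job(threads=14, processes=2, memory_gb=50,
                      node_cores=28, node_memory_gb=100)

print("initial:", initial)
print("fixed:  ", fixed)
```

Under these assumptions both layouts use all 28 cores and stay within the 100 GB request, so this arithmetic alone does not explain why the first one failed. What it does show is that the working layout lines up with the hardware: 2 processes with 14 threads each matches the 2 CPUs with 14 cores each.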

So my issue seems to be fixed, but I wanted to make this experience visible to people who may encounter similar issues.
