
trouble setting dask distributed worker to log level debug #6124

Description

@lastephey

What happened:

Dear Dask devs,

I am unable to set only the dask distributed worker log level to 'DEBUG' when manually starting a cluster. I have seen that there have been several issues and PRs on this subject, but I apologize: it's still not totally clear to me what the current state is or what the correct syntax is.

#2937
#2419
#2952
#4642

What you expected to happen:

I was trying to follow the advice in the dask distributed docs. I copy/pasted the logging yaml example into the configuration conversion utility to try to generate the right environment variables. The output it generated was:

export DASK_LOGGING__DISTRIBUTED="debug"
export DASK_LOGGING__DISTRIBUTED.CLIENT="debug"
export DASK_LOGGING__DISTRIBUTED.WORKER="debug"

I saw though in this comment that the dot in the key should translate to a double underscore in the environment variable, so I have changed that in my test below.
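To make the naming rule concrete, here is a rough sketch of how a double-underscore environment variable name could map onto nested configuration keys. The helper `env_to_nested` is hypothetical and written only for illustration; it is not Dask's own implementation, and it assumes the `DASK_` prefix is stripped, names are lowercased, and each `__` opens one nesting level.

```python
def env_to_nested(env, prefix="DASK_"):
    """Hypothetical helper: map PREFIX_A__B__C=value to {'a': {'b': {'c': value}}}."""
    result = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated environment variables
        # Strip the prefix, lowercase, and treat each "__" as one nesting level.
        keys = name[len(prefix):].lower().split("__")
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


# Assumption: these two variables are set one at a time, as in the tests below.
nested = env_to_nested({"DASK_LOGGING__DISTRIBUTED__SCHEDULER": "debug"})
flat = env_to_nested({"DASK_LOGGING__DISTRIBUTED": "debug"})

print(nested)  # {'logging': {'distributed': {'scheduler': 'debug'}}}
print(flat)    # {'logging': {'distributed': 'debug'}}
```

Under this assumed rule, the value stored under `logging.distributed` is a string in one case and a nested mapping in the other, which is a different shape from a YAML key that contains a literal dot such as `distributed.scheduler`.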

Minimal Complete Verifiable Example:

My ultimate goal is to set the workers and maybe client to log level debug, but here I'll demonstrate with the scheduler since it shows the same behavior.

Based on the output of the config generator, I first tried:

DASK_LOGGING__DISTRIBUTED__SCHEDULER=debug \
python -m distributed.cli.dask_scheduler \
    --protocol ucx \
    --interface hsn0 \
    --scheduler-file $scheduler_file &

although this syntax is apparently invalid. It gives the error:

TypeError: Level not an integer or a valid string: {'scheduler': 'debug'}
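The same kind of message can be reproduced with the standard library alone, which suggests the value reaching the logger was a mapping rather than a level name. This is only a minimal sketch: the logger name `example.scheduler` and the helper `try_set_level` are made up for illustration, and it does not show where inside distributed the call actually happens.

```python
import logging

logger = logging.getLogger("example.scheduler")  # hypothetical logger name


def try_set_level(level):
    """Return None on success, or the message of the exception raised by setLevel."""
    try:
        logger.setLevel(level)
        return None
    except (TypeError, ValueError) as exc:
        return str(exc)


ok = try_set_level("DEBUG")                  # a valid level name
bad = try_set_level({"scheduler": "debug"})  # a mapping instead of a level

print(ok)
print(bad)
```

The second call fails because the standard library's `Logger.setLevel` accepts only an integer or a known level name, so a dictionary is rejected with a `TypeError`.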

Based on this comment I also tried:

DASK_LOGGING__DISTRIBUTED=debug \
python -m distributed.cli.dask_scheduler \
    --protocol ucx \
    --interface hsn0 \
    --scheduler-file $scheduler_file &

which does work today, even if it didn't 2 years ago (which is confusing). This turns on debug logging for everything, which is good, but of course it's really verbose. I was hoping I could turn it on only for particular parts of Dask distributed.
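For comparison, this is the per-logger behavior I am after, sketched with the standard library only. It assumes the relevant loggers are named `distributed.worker` and `distributed.scheduler` (inferred from the keys in the yaml example), and `set_debug` is a hypothetical helper. It does not use any Dask configuration, so it would have to be executed inside each process whose logging should change.

```python
import logging


def set_debug(logger_name):
    """Set one named logger to DEBUG without changing its parent or siblings."""
    logging.getLogger(logger_name).setLevel(logging.DEBUG)


# Assumed logger names; the parent stays at INFO, only the worker goes to DEBUG.
logging.getLogger("distributed").setLevel(logging.INFO)
set_debug("distributed.worker")

worker = logging.getLogger("distributed.worker")
scheduler = logging.getLogger("distributed.scheduler")

print(worker.getEffectiveLevel() == logging.DEBUG)    # True
print(scheduler.getEffectiveLevel() == logging.INFO)  # True
```

The scheduler logger has no level of its own here, so it inherits INFO from its parent, while the worker logger is set explicitly. Whether debug records actually appear also depends on the handlers attached to those loggers.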

tl;dr I only stumbled onto a partially working solution. I would really appreciate it if you could clarify what should/shouldn't work, and perhaps document it somewhere like the logging page.

Anything else we need to know?:

Environment:

  • Dask version:
dask                      2022.3.0           pyhd8ed1ab_1    conda-forge
dask-core                 2022.3.0           pyhd8ed1ab_0    conda-forge
dask-cuda                 22.06.00a220401         py38_13    rapidsai-nightly
dask-cudf                 22.04.00a220401 cuda_11_py38_ga02b7c2b44_301    rapidsai-nightly
  • Python version: 3.8.5

  • Operating System: Ubuntu 20.04, running inside Shifter (a custom container implementation)

  • Install method (conda, pip, source): conda

Cluster Dump State:

Thank you all very much for your work on Dask,
Laurie
