Logging config from default distributed.yaml config has no effect #2839

Description

@chrish42

I have the pyyaml package installed, and am trying to change the logging config for the Distributed components via the config file. Distributed installs a default YAML config file (from distributed/distributed.yaml), which looks like this:

distributed:
  version: 2
  # logging:
  #   distributed: info
  #   distributed.client: warning
  #   bokeh: critical
  #   # http://stackoverflow.com/questions/21234772/python-tornado-disable-logging-to-stderr
  #   tornado: critical
  #   tornado.application: error

  scheduler:
    allowed-failures: 3  
# SNIP - file continues

This places the logging config dict inside the top-level "distributed" dict. However, uncommenting and changing those lines has no effect on the logging config of the various distributed components. This is because distributed.config configures logging from dask.config.config.get('logging'), so it expects the "logging" dict to be at the top level rather than inside "distributed".
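As a minimal illustration of the mismatch, here is a sketch that uses a plain dict as a stand-in for dask.config.config (so it assumes nothing about dask beyond the layout shown above):

```python
# Plain dict standing in for dask.config.config; no dask import needed.
nested_config = {
    "distributed": {
        "version": 2,
        "logging": {"distributed": "debug"},
    }
}

# What the logging setup effectively looks up: a top-level "logging" key.
top_level = nested_config.get("logging", {})

# Where the default distributed.yaml layout actually puts the setting.
nested = nested_config.get("distributed", {}).get("logging", {})

print(top_level)  # {}
print(nested)     # {'distributed': 'debug'}
```

The top-level lookup comes back empty, so only the built-in defaults would be applied.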

Here's the code from distributed.config:

# imports here...
config = dask.config.config

# a bunch of code here

def initialize_logging(config):
    if "logging-file-config" in config:
        if "logging" in config:
            raise RuntimeError(
                "Config options 'logging-file-config' and 'logging' are mutually exclusive."
            )
        _initialize_logging_file_config(config)
    else:
        log_config = config.get("logging", {})
        if "version" in log_config:
            # logging module mandates version to be an int
            log_config["version"] = int(log_config["version"])
            _initialize_logging_new_style(config)
        else:
            _initialize_logging_old_style(config)

initialize_logging(dask.config.config)

Notice that it essentially calls dask.config.config.get("logging", {}) and uses the result to decide which _initialize_logging_*_style function to call. The _initialize_logging_*_style functions do the same lookup. For example:

def _initialize_logging_old_style(config):
    loggers = {  # default values
        "distributed": "info",
        "distributed.client": "warning",
        "bokeh": "critical",
        "tornado": "critical",
        "tornado.application": "error",
    }
    loggers.update(config.get("logging", {}))
    # SNIP - more code here.

It's being passed the dask.config.config object, and does a .get("logging", {}) on it.

Here's some code that reproduces the problem too:

import logging

from dask import config
from distributed.config import initialize_logging

initialize_logging(config.config)
logging.getLogger("distributed")

# Prints out: <Logger distributed (INFO)>

log_doesnt_work = {'distributed': {
  'logging': {'distributed': 'debug',
   'distributed.client': 'debug',
   'distributed.worker': 'debug',
   'bokeh': 'critical',
   'tornado': 'critical',
   'tornado.application': 'error'}}}
config.update(config.config, log_doesnt_work)

initialize_logging(config.config)
logging.getLogger("distributed")

# Prints out: <Logger distributed (INFO)>

log_works = {'logging': {'distributed': 'debug',
  'distributed.client': 'debug',
  'distributed.worker': 'debug',
  'bokeh': 'critical',
  'tornado': 'critical',
  'tornado.application': 'error'}}
config.refresh()
config.update(config.config, log_works)

initialize_logging(config.config)
logging.getLogger("distributed")

# Prints out: <Logger distributed (DEBUG)>

There are a couple of ways to fix this. Some possibilities:

  1. Change the default distributed.yaml config file to have the "logging" dict at the top level. (But then it might conflict with settings in dask.yaml from Dask. Maybe that's a problem, or maybe not.)
  2. Change the initialize_logging call at the bottom of distributed.config to initialize_logging(dask.config.config.get("distributed", {})) instead.
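A rough sketch of what option 2 could look like, again with plain dicts in place of dask.config.config. The helper name logging_section is hypothetical and not part of distributed, and the fallback to the top-level layout is an assumption added here so that existing top-level configs would keep working:

```python
def logging_section(config):
    """Return the mapping that would be handed to initialize_logging.

    Hypothetical helper, not part of distributed: prefer the nested
    "distributed" section, and fall back to the top level (an assumption
    made here to keep existing top-level configs working).
    """
    distributed = config.get("distributed", {})
    if "logging" in distributed or "logging-file-config" in distributed:
        return distributed
    return config


nested = {"distributed": {"logging": {"distributed": "debug"}}}
top_level = {"logging": {"distributed": "debug"}}

# Both layouts resolve to the same logging dict.
print(logging_section(nested).get("logging", {}))
print(logging_section(top_level).get("logging", {}))
```

With something like this, the call at the bottom of distributed.config could pass logging_section(dask.config.config) instead of the whole config, but whether the fallback is wanted is a design decision for the maintainers.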

If you let me know what would be the best way to fix this, I can submit a pull request with the fix. I also assume that there's no test checking for this... What would be the best test to add to make sure that this doesn't break again? Thanks!
