I have the pyyaml package installed, and am trying to change the logging config for the Distributed components via the config file. Distributed installs a default YAML config file (from distributed/distributed.yaml), which looks like this:
distributed:
  version: 2
  # logging:
  #   distributed: info
  #   distributed.client: warning
  #   bokeh: critical
  #   # http://stackoverflow.com/questions/21234772/python-tornado-disable-logging-to-stderr
  #   tornado: critical
  #   tornado.application: error
  scheduler:
    allowed-failures: 3
  # SNIP - file continues
This has the logging config dict contained within the "distributed" top-level dict. However, uncommenting and changing those lines has no effect on the logging config of the various distributed components. This is because distributed.config configures the logging from dask.config.config.get('logging'), and so expects the "logging" dict to be at the top level of the config, not nested within "distributed".
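To make the mismatch concrete, here is a minimal sketch that uses a plain dict as a stand-in for dask.config.config. The dict contents are illustrative and are not the real loaded config; only the shape matters.

```python
# Plain-dict stand-in for the config loaded from distributed.yaml with the
# logging lines uncommented (illustrative values, not the real config).
loaded = {
    "distributed": {
        "version": 2,
        "logging": {"distributed": "debug"},
    }
}

# What the initializer effectively looks up: a top-level "logging" key.
top_level = loaded.get("logging", {})

# Where the YAML file actually puts it: nested under "distributed".
nested = loaded.get("distributed", {}).get("logging", {})

print(top_level)  # {}
print(nested)     # {'distributed': 'debug'}
```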
Here's the code from distributed.config:
# imports here...
config = dask.config.config

# a bunch of code here

def initialize_logging(config):
    if "logging-file-config" in config:
        if "logging" in config:
            raise RuntimeError(
                "Config options 'logging-file-config' and 'logging' are mutually exclusive."
            )
        _initialize_logging_file_config(config)
    else:
        log_config = config.get("logging", {})
        if "version" in log_config:
            # logging module mandates version to be an int
            log_config["version"] = int(log_config["version"])
            _initialize_logging_new_style(config)
        else:
            _initialize_logging_old_style(config)

initialize_logging(dask.config.config)
Notice that it is basically doing dask.config.config.get("logging", {}) and using the result to figure out which _initialize_logging_*_style function to call. The _initialize_logging_*_style functions do the same lookup. For example:
def _initialize_logging_old_style(config):
    loggers = {  # default values
        "distributed": "info",
        "distributed.client": "warning",
        "bokeh": "critical",
        "tornado": "critical",
        "tornado.application": "error",
    }
    loggers.update(config.get("logging", {}))
    # SNIP - more code here.
It's being passed the dask.config.config object, and does a .get("logging", {}) on it.
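One consequence worth spelling out is that the miss is silent: because of the {} fallback, the defaults are simply left unchanged and nothing is raised. The sketch below isolates just the merge step shown above. The name merged_levels is hypothetical and the elided part of the real function is not reproduced.

```python
# Defaults copied from the snippet above.
DEFAULT_LEVELS = {
    "distributed": "info",
    "distributed.client": "warning",
    "bokeh": "critical",
    "tornado": "critical",
    "tornado.application": "error",
}


def merged_levels(config):
    """Hypothetical helper: only the defaults-plus-update step, nothing else."""
    loggers = dict(DEFAULT_LEVELS)
    loggers.update(config.get("logging", {}))  # {} when "logging" is nested
    return loggers


nested = {"distributed": {"logging": {"distributed": "debug"}}}
top_level = {"logging": {"distributed": "debug"}}

print(merged_levels(nested)["distributed"])     # info
print(merged_levels(top_level)["distributed"])  # debug
```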
Here's some code that reproduces the problem too:
import logging

from dask import config
from distributed.config import initialize_logging

initialize_logging(config.config)
logging.getLogger("distributed")
# Prints out: <Logger distributed (INFO)>

# "logging" nested under "distributed", as in the default distributed.yaml
log_doesnt_work = {'distributed': {
    'logging': {'distributed': 'debug',
                'distributed.client': 'debug',
                'distributed.worker': 'debug',
                'bokeh': 'critical',
                'tornado': 'critical',
                'tornado.application': 'error'}}}
config.update(config.config, log_doesnt_work)
initialize_logging(config.config)
logging.getLogger("distributed")
# Prints out: <Logger distributed (INFO)>

# "logging" at the top level of the config
log_works = {'logging': {'distributed': 'debug',
                         'distributed.client': 'debug',
                         'distributed.worker': 'debug',
                         'bokeh': 'critical',
                         'tornado': 'critical',
                         'tornado.application': 'error'}}
config.refresh()
config.update(config.config, log_works)
initialize_logging(config.config)
logging.getLogger("distributed")
# Prints out: <Logger distributed (DEBUG)>
There are a couple of ways to fix this. Some possibilities:
- Change the default distributed.yaml config file to have the "logging" dict at the top-level (But then it might conflict with stuff in dask.yaml from Dask. Maybe that's a problem, or maybe not.)
- Change the initialize_logging call at the bottom of distributed.config to do
initialize_logging(dask.config.config.get("distributed", {})) instead.
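As a rough illustration of the second option, the sketch below uses a simplified stand-in for the initializer (initialize_logging_sketch is hypothetical, with only two of the default levels) and a plain dict in place of dask.config.config. It only shows how the call-site change would alter which levels get applied. It is not the real distributed code.

```python
import logging

# Two of the defaults from the issue; the rest are omitted for brevity.
DEFAULT_LEVELS = {"distributed": "info", "distributed.client": "warning"}


def initialize_logging_sketch(config):
    """Hypothetical stand-in: merge defaults with config["logging"], set levels."""
    levels = dict(DEFAULT_LEVELS)
    levels.update(config.get("logging", {}))
    for name, level in levels.items():
        logging.getLogger(name).setLevel(level.upper())


# Plain-dict stand-in for dask.config.config with "logging" nested.
full_config = {"distributed": {"logging": {"distributed": "debug"}}}

# Current call shape: the whole config is passed, so the nested dict is missed.
initialize_logging_sketch(full_config)
before = logging.getLogger("distributed").level

# Proposed call shape: pass only the "distributed" sub-dict.
initialize_logging_sketch(full_config.get("distributed", {}))
after = logging.getLogger("distributed").level

print(before == logging.INFO)   # True
print(after == logging.DEBUG)   # True
```

One trade-off of this option is that a top-level "logging" dict, which works today, would then be ignored unless both locations are checked.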
If you let me know what would be the best way to fix this, I can submit a pull request with the fix. I also assume that there's no test checking for this... What would be the best test to add to make sure that this doesn't break again? Thanks!
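For the test question, one possible shape is a check that feeds a config with "logging" nested under "distributed" to whatever entry point receives the full config, and then asserts on the resulting logger level. The sketch below is only an assumption about how such a test could look: check_nested_logging_is_applied, reads_top_level and reads_nested are hypothetical names, and the two stand-in initializers exist only so the sketch runs without distributed installed. Since the real initialization happens at import time, an actual test may need to run in a fresh process, which is not shown here.

```python
import logging


def check_nested_logging_is_applied(initialize, logger_name="distributed"):
    """Return True if `initialize` honours levels nested under "distributed"."""
    logger = logging.getLogger(logger_name)
    previous = logger.level
    try:
        logger.setLevel(logging.WARNING)  # start from a known level
        initialize({"distributed": {"logging": {logger_name: "debug"}}})
        return logger.level == logging.DEBUG
    finally:
        logger.setLevel(previous)  # avoid leaking state into other tests


def _apply(levels):
    # Shared helper for the stand-ins: set each named logger's level.
    for name, level in levels.items():
        logging.getLogger(name).setLevel(level.upper())


def reads_top_level(config):
    """Stand-in mirroring the behaviour described in this issue."""
    _apply(config.get("logging", {}))


def reads_nested(config):
    """Stand-in mirroring the behaviour the YAML layout implies."""
    _apply(config.get("distributed", {}).get("logging", {}))


assert not check_nested_logging_is_applied(reads_top_level)
assert check_nested_logging_is_applied(reads_nested)
```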