fix(mqtt): split reconnect loop handler to reduce log noise #137

Merged
cayossarian merged 1 commit into main from fix/mqtt-reconnect-log-noise
Apr 17, 2026

Conversation

@cayossarian
Member

Summary

  • Splits the single except Exception handler in SpanMqttClient._reconnect_loop into two branches: expected transient failures (OSError family — ConnectionRefusedError, socket.gaierror, TimeoutError, ssl.SSLError) log a one-line WARNING with the exception repr; unexpected exceptions retain the full traceback via exc_info=True.
  • The common "panel offline" case (e.g. ConnectionRefusedError: [Errno 61]) no longer floods logs with paho/stdlib stack frames that carry no diagnostic signal — the exception type and errno are the full diagnostic.
  • Unknown exceptions still surface full tracebacks so support tickets remain actionable.
  • Bumps version to 2.6.2.

Rationale

Pre-change log on panel disconnect:

WARNING ... Reconnect failed, retrying in 60.0s
Traceback (most recent call last):
  File ".../connection.py", line 409, in _reconnect_loop
    await self._loop.run_in_executor(None, self._client.reconnect)
  File ".../paho/mqtt/client.py", line 1598, in reconnect
  File ".../paho/mqtt/client.py", line 4609, in _create_socket
  File ".../paho/mqtt/client.py", line 4640, in _create_socket_connection
  File ".../socket.py", line 870, in create_connection
  File ".../socket.py", line 855, in create_connection
ConnectionRefusedError: [Errno 61] Connection refused

Every frame is paho/stdlib internals that are identical on every such failure. The last line is the only signal.

Post-change log:

WARNING ... Reconnect failed (ConnectionRefusedError(61, 'Connection refused')), retrying in 60.0s

Note: ssl.SSLError is an OSError subclass and falls into the transient branch — deliberate, since SSL misconfiguration fails at setup (async_setup_entry), not at reconnect. A reconnect-time SSL failure is treated as transient alongside the other network errors.
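
The claim that all four exception types land in the transient branch can be checked against the standard-library hierarchy. The snippet below is an illustrative check only; TRANSIENT_EXAMPLES and is_transient are hypothetical names that do not appear in the change.

```python
import socket
import ssl

# The exception types named in this PR as expected transient failures.
TRANSIENT_EXAMPLES = (
    ConnectionRefusedError,
    socket.gaierror,
    TimeoutError,
    ssl.SSLError,
)


def is_transient(exc: BaseException) -> bool:
    """Return True if ``exc`` would be caught by an ``except OSError`` branch."""
    return isinstance(exc, OSError)


# Every listed type is an OSError subclass, so a single ``except OSError``
# clause covers all of them.
all_covered = all(issubclass(t, OSError) for t in TRANSIENT_EXAMPLES)
```

A ValueError or KeyError, by contrast, is not an OSError and would fall through to the traceback-preserving branch.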

Test plan

  • All 343 existing tests pass (pre-commit coverage gate green at 96.16%)
  • All lint hooks pass (ruff, black, prettier, mypy, pylint, bandit, vulture, markdownlint, uv lock check)
  • Manual verification: take panel offline, observe single-line WARNING without paho/stdlib frames
  • Manual verification: force an unexpected exception path, confirm traceback still present

Expected transient failures (OSError family — ConnectionRefusedError,
socket.gaierror, TimeoutError, ssl.SSLError) now log a one-line
WARNING with the exception repr instead of a full paho/stdlib
traceback. Unexpected exceptions retain exc_info=True so support
tickets stay actionable.

Bumps version to 2.6.2.
@cayossarian cayossarian merged commit 99f174f into main Apr 17, 2026
6 checks passed
@cayossarian cayossarian deleted the fix/mqtt-reconnect-log-noise branch April 17, 2026 04:34