fix(telegram): resolve connection pool exhaustion causing silent message loss#8320
Open
whatevertogo wants to merge 2 commits into
Open
fix(telegram): resolve connection pool exhaustion causing silent message loss#8320whatevertogo wants to merge 2 commits into
whatevertogo wants to merge 2 commits into
Conversation
…age loss Configure ApplicationBuilder with proper HTTP timeouts for long-polling: - get_updates_read_timeout 60s (must exceed Telegram's ~30s long-poll) - get_updates_connect_timeout 15s, pool_timeout 10s (allow pool recovery) - General API read_timeout 30s, connect_timeout 10s, pool_timeout 5s Also add asyncio.wait_for timeout shield around updater.stop() in shutdown to prevent indefinite hang when pool is already exhausted. Resolves AstrBotDevs#8314
Contributor
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The new HTTP timeout and pool settings in
_build_applicationare hard-coded; consider wiring them to adapter config (with current values as defaults) so they can be tuned for different environments without code changes. - In
_shutdown_application, the broadexcept Exception: passaroundawait asyncio.wait_for(updater.stop(), ...)will silently swallow unexpected errors; it would be safer to either reusecontextlib.suppressfor specific exception types or at least log non-timeout exceptions.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new HTTP timeout and pool settings in `_build_application` are hard-coded; consider wiring them to adapter config (with current values as defaults) so they can be tuned for different environments without code changes.
- In `_shutdown_application`, the broad `except Exception: pass` around `await asyncio.wait_for(updater.stop(), ...)` will silently swallow unexpected errors; it would be safer to either reuse `contextlib.suppress` for specific exception types or at least log non-timeout exceptions.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/telegram/tg_adapter.py" line_range="172-183" />
<code_context>
+ self._application_started = False
+
+ updater = self.application.updater
+ if updater is not None:
+ try:
+ await asyncio.wait_for(updater.stop(), timeout=10.0)
+ except asyncio.TimeoutError:
+ logger.warning(
+ "Telegram updater stop timed out; connection pool may be exhausted."
+ )
+ except Exception:
+ pass
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Swallowing all exceptions from `updater.stop()` may hide shutdown issues.
Since you’re now handling timeouts explicitly, consider logging other exceptions instead of `except Exception: pass`. If `updater.stop()` starts failing after a library or dependency change, those failures will be invisible and hard to debug. Emitting at least a warning (e.g. `logger.warning(..., exc_info=True)`) would keep shutdown robust while making real issues detectable.
```suggestion
self._application_started = False
updater = self.application.updater
if updater is not None:
try:
await asyncio.wait_for(updater.stop(), timeout=10.0)
except asyncio.TimeoutError:
logger.warning(
"Telegram updater stop timed out; connection pool may be exhausted."
)
except Exception:
logger.warning(
"Error while stopping Telegram updater; shutdown may be incomplete.",
exc_info=True,
)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Contributor
There was a problem hiding this comment.
Code Review
This pull request enhances the Telegram adapter's resilience by configuring explicit timeouts for long-polling and general API calls. It also adds a timeout mechanism when stopping the updater during shutdown to avoid resource exhaustion. Feedback suggests increasing the connection pool size, as a single stalled connection could still block the adapter despite the new timeout settings.
- Increase get_updates_connection_pool_size from 1 to 2 so a single stalled connection doesn't exhaust the pool entirely - Log non-timeout exceptions during updater.stop() instead of silently swallowing them (they may indicate real shutdown issues) Resolves AstrBotDevs#8314
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Resolves #8314
The Telegram adapter stops receiving messages after running for a while, requiring a manual adapter restart. The root cause is connection pool exhaustion in the
httpxclient used bypython-telegram-bot's long-polling mechanism.Root Cause
ApplicationBuilderdefaults create aget_updates_requestwith:read_timeout=5.0s— far shorter than Telegram's long-poll timeout (~30s), causing legitimate waiting connections to be forcibly droppedpool_timeout=1.0s— gives up immediately when the single connection pool slot is occupiedconnection_pool_size=1— only one connection for long-pollingWhen the single long-poll connection becomes stuck (network glitch, half-open TCP, proxy), the pool is exhausted. Subsequent
getUpdatescalls hitPoolTimeoutand the adapter silently stops receiving messages.On shutdown, the same exhaustion causes
updater.stop()to hang indefinitely as it tries a finalget_updatescall (visible in the issue's stack trace).Changes
_build_application()— Configure proper HTTP timeoutsget_updates_read_timeoutget_updates_connect_timeoutget_updates_pool_timeoutread_timeoutconnect_timeoutpool_timeout_shutdown_application()— Add timeout shield aroundupdater.stop()Wrap
updater.stop()inasyncio.wait_for(..., timeout=10.0)so shutdown doesn't hang indefinitely when the pool is already exhausted.Verification
ruff check— All checks passed ✅ruff format— Clean ✅Summary by Sourcery
Adjust Telegram adapter HTTP client timeouts and shutdown behavior to prevent long‑polling lockups and ensure clean termination.
Bug Fixes: