Skip to content

telegram bot reliability: silence noise, retry sends, multi-chat digest, persist nonces#130

Merged
0p3r4t0r44 merged 4 commits into
mainfrom
telegram-silence-third-party-commands
Apr 27, 2026
Merged

telegram bot reliability: silence noise, retry sends, multi-chat digest, persist nonces#130
0p3r4t0r44 merged 4 commits into
mainfrom
telegram-silence-third-party-commands

Conversation

@0p3r4t0r44

@0p3r4t0r44 0p3r4t0r44 commented Apr 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Four telegram-bot reliability fixes, stacked on the same branch because
each builds on the previous (multi-chat needs the retry layer's
`send_to`; persistent nonces use the same Redis path the rest of the
service already depends on).

1. Silence third-party raid/sniper commands in DMs (`9d294a8`)

PR #127 stopped "unknown command" replies from going to other bots and
from groups, but DMs still hit the hint when a user types a third-party
command like `/raid` or `/setup_raid` themselves.

  • New `THIRD_PARTY_COMMANDS` allowlist (raid bots, sniper bots,
    chat-mod bots, common slash-prefixed slang) — silenced regardless of
    chat type.
  • New `looks_like_our_command` heuristic — only replies with a hint
    when the unrecognized command is purely alphabetic ASCII between 2
    and 12 chars (the shape of every command we own). `/setup_raid`,
    `/raid123`, single letters all fall through silently.

2. Retry sends on 429 and transient 5xx (`88b17c8`)

Both telegram clients used to log and drop any non-2xx response. Digest
signals were silently lost on rate-limit; command replies vanished on
transient 5xx.

  • Shared `send_with_retry` helper in `telegram_format`, used by both
    the alerter-side `TelegramClient` and the polling-side client in
    `telegram_commands`.
  • Honours Telegram's `parameters.retry_after` on 429.
  • Bounded exponential backoff (1s, 2s, 4s, 8s, 16s, capped at 60s) for
    5xx and network errors.
  • 4xx other than 429 → permanent, no retry.
  • Pure `classify_response` function for testability.

3. Fan the digest out per-chat (`a0c6e36`)

The digest scheduler used to drain the bus and post one message to the
env `TELEGRAM_CHAT_ID`. Per-chat overrides (`/threshold`, `/mute`,
`/subscribe`, `/quiet`) wrote rows into `tg_chat_config` but the
alerter ignored them — users were typing commands that did nothing.

  • Resolves destinations as env chat ∪ all `tg_chat_config` rows.
  • Per chat: applies quiet window, subscribed_kinds, muted_markets,
    effective_threshold.
  • Per-chat cooldown via `HashMap<(i64, String), u64>` projected down
    to the existing `select_top_signals` shape — a market just sent to
    chat A doesn't suppress it for chat B.
  • New `TelegramClient::send_to(chat_id, ...)` lets the digest target
    arbitrary chats without changing the existing `send()` surface that
    other alerters depend on.

4. Persist /link nonces in Redis (`209bf92`)

The `/link` nonce store was deliberately in-memory: a HashMap on the
long-poll task. Every API restart wiped it, forcing every user mid-flow
to redo `/link`. With auto-deploy from main that meant a fresh nonce
request after every merge.

  • New `NonceStore` writes to Redis under `tg:link_pending:<chat_id>`
    with the configured `NONCE_TTL_SECS` as the key TTL.
  • Single-use semantics: `peek` is non-destructive so signature
    verification can run first, then `consume` deletes.
  • Redis hiccups fall back to an in-memory shadow map — `/link` keeps
    working through an outage, just without persistence.
  • The chat-keyed scheme also fixes the previous non-deterministic
    HashMap-ordering behavior when multiple wallets had pending /link in
    the same chat. Latest `/link` wins.

Test plan

  • `cargo test -p relay44-backend --lib` — 530 passed, no failures
  • Telegram-area tests: 24 `telegram_commands` + 18 `telegram_format` + 12 `digest_scheduler` = 54 passed (16 new across the four commits)
  • `cargo clippy -p relay44-backend --lib -- -D warnings` — clean
  • After deploy: `/raid` in DM → no reply; `/statu` in DM → hint;
    `/threshold 10` in a non-env chat → next digest respects it;
    restart api, then `/verify` an outstanding link nonce → still works
    (nonce survived); digest survives a forced 429 from the bot api (manual)

Notes

  • Chats with no `tg_chat_config` row keep today's behavior — env
    threshold, no mutes, all kinds, no quiet window.
  • The retry helper is invoked on every `sendMessage` call, so command
    replies and digest sends both benefit.
  • Redis was already a hard dep; no new infrastructure.
  • Out of scope: trade-from-DM execution, native `/raid` feature, the
    dormant direct-send paths in `probability_alert`/`new_market_alert`,
    digest idempotency under genuine retry-exhaust.

PR #127 stopped the bot from replying "unknown command" to other bots
and to unknown commands in groups, but DMs still hit the hint when a
user types a third-party command like /raid or /setup_raid because
those messages come from a real user, not a bot.

This adds two layers on top:

1. A vocabulary list of well-known third-party commands (raid bots,
   sniper bots, chat-mod bots, common slang typed with a slash) that
   we silently ignore regardless of chat type.
2. A "looks like ours" heuristic — only reply with a hint when the
   unknown command is purely alphabetic ascii between 2 and 12 chars,
   which matches the shape of every command we own. /setup_raid,
   /raid123, single-letter commands, and anything else falls through
   silently.

Together they stop the bot from ever responding "unknown command"
during raid setup, while still helping a user who types /statu in a
dm.
@0p3r4t0r44 0p3r4t0r44 force-pushed the telegram-silence-third-party-commands branch from 1f2304f to 9d294a8 Compare April 26, 2026 21:05
Both telegram clients (the alerter-side TelegramClient in telegram_format
and the polling-side client in telegram_commands) used to log and drop
any non-2xx response. Digest signals were silently lost on rate-limit;
command replies vanished on transient 5xx.

Adds a shared send_with_retry helper in telegram_format that:

- Honours Telegram's parameters.retry_after on 429 responses
- Backs off exponentially (1s, 2s, 4s, 8s, 16s, capped at 60s) for 5xx
  and network errors
- Treats 4xx other than 429 as permanent — no retry, single error returned
- Retries up to 5 attempts before giving up

A pure classify_response function makes the policy testable without a
live HTTP call. New unit tests cover 2xx/4xx/5xx/429 classification,
retry-after extraction, and the backoff schedule.

Both clients now route their sendMessage calls through the helper.
The polling-side getUpdates path is unchanged because long-poll errors
are already retried by the outer loop.
The digest scheduler used to drain the alert bus and post one message to
the env TELEGRAM_CHAT_ID. Per-chat overrides (/threshold, /mute,
/subscribe, /quiet) wrote rows into tg_chat_config but the alerter
ignored them — users were typing commands that did nothing.

This wires the digest tick to:

1. Drain the bus once per tick (preserves single-publisher semantics).
2. Resolve destination chats from the env default plus every row in
   tg_chat_config (deduped, env chat first).
3. For each destination, apply the per-chat filter pipeline:
   - quiet window — skip the chat for this tick
   - subscribed_kinds — drop signals the chat opted out of
   - muted_markets — drop signals for muted slug or market_key
   - effective_threshold — drop signals whose move_size is below the
     chat's threshold (env default if unset)
4. Run select_top_signals with a per-chat cooldown slice so a market
   that was just sent to chat A doesn't suppress it for chat B.
5. Send via TelegramClient::send_to(chat_id, ...) — a new method that
   targets an arbitrary chat without changing the existing send()
   surface that other alerters depend on.

Per-chat cooldowns are tracked in a single HashMap<(i64, String), u64>
on the spawn task, projected down to the per-chat shape select_top_signals
already expects via scoped_cooldowns. New unit tests cover the projection
and confirm cooldowns isolate properly between chats.

list_chat_ids in tg_chat_config returns every configured chat. DB
errors return an empty vec so the env-default chat continues to receive
alerts even when the table query fails.
@0p3r4t0r44 0p3r4t0r44 changed the title telegram: silence third-party raid/sniper commands in DMs telegram: silence noise, retry sends, fan digest out per-chat Apr 26, 2026
@0p3r4t0r44 0p3r4t0r44 changed the title telegram: silence noise, retry sends, fan digest out per-chat telegram bot reliability: silence noise, retry sends, multi-chat digest, persist nonces Apr 26, 2026
The /link nonce store was deliberately in-memory: a HashMap keyed on
(chat_id, wallet_lower) on the long-poll task. Every API restart wiped
it, forcing every user mid-flow to redo /link. With auto-deploy from
main that meant a fresh nonce request after each merge.

Replaces the type alias + free functions with a NonceStore struct that
writes to Redis under tg:link_pending:<chat_id> with the configured
NONCE_TTL_SECS as the key TTL. The value is a small JSON blob holding
the wallet lowercase and the nonce hex so /verify can resolve both
without a second key.

Single-use semantics live at the consume step: peek is non-destructive
so /verify can recover the wallet, run signature verification, and only
then delete. A losing concurrent /verify call would re-fetch the same
peek but the second tg_chat_config upsert is idempotent for the same
wallet, so the race is safe.

Redis hiccups fall back to an in-memory map shadow on the same task —
/link still works during a Redis outage, just without persistence.
The fallback is purged on every operation so expired entries do not
leak.

The chat-keyed scheme also cleans up the multi-pending-wallet case the
old HashMap had ambiguously: when a user re-runs /link with a different
wallet in the same chat, the latest one wins. Today the behavior was
whichever HashMap iteration returned first, which was already
non-deterministic across multiple pending wallets.

Tests cover the in-memory fallback path directly. The Redis glue is a
thin wrapper over RedisService::set/get/delete and is exercised in
integration only.
@0p3r4t0r44 0p3r4t0r44 force-pushed the telegram-silence-third-party-commands branch from 209bf92 to 8f5b8a9 Compare April 26, 2026 22:13
@0p3r4t0r44 0p3r4t0r44 enabled auto-merge (squash) April 26, 2026 22:14
@0p3r4t0r44 0p3r4t0r44 merged commit 3e9fccd into main Apr 27, 2026
11 of 12 checks passed
@0p3r4t0r44 0p3r4t0r44 deleted the telegram-silence-third-party-commands branch April 27, 2026 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant