Skip to content

Add connection rate limiting#1765

Draft
baelter wants to merge 5 commits intomainfrom
feature/connection-rate-limiting
Draft

Add connection rate limiting#1765
baelter wants to merge 5 commits intomainfrom
feature/connection-rate-limiting

Conversation

@baelter
Copy link
Member

@baelter baelter commented Feb 28, 2026

Summary

  • Add token bucket rate limiter that rejects new connections at the TCP level before the AMQP/MQTT handshake
  • Two config options: connection_rate_limit (global max new connections/sec) and connection_rate_limit_per_ip (per source IP), both default to 0 (unlimited)
  • Rate limiting runs synchronously in the accept loop before spawning a fiber, so rejected connections never allocate a fiber or enter the handshake
  • Rate-limited log messages (max 1/sec) to avoid log spam when rejecting

Motivation

When LavinMQ is CPU-saturated, handshake timeouts cause clients to disconnect and immediately reconnect, creating a death spiral. Each reconnection goes through the full AMQP handshake (confirm_header, start, authenticate, tune, open), consuming CPU that should serve existing connections. Rejecting at the TCP level is orders of magnitude cheaper.

Test plan

  • make test SPEC=spec/connection_rate_limit_spec.cr (11 specs)
  • Manual test: set connection_rate_limit = 10, connect many clients, verify excess connections are rejected immediately
  • Verify existing connection tests still pass

Token bucket rate limiter that rejects new connections exceeding
configurable global and per-IP thresholds by closing the TCP
socket before the AMQP/MQTT handshake, avoiding CPU waste.

Config options: connection_rate_limit, connection_rate_limit_per_ip
(both default to 0/unlimited).
@claude

This comment was marked as resolved.

Check per-IP limit before global limit so a rejected per-IP
connection doesn't consume a global token unfairly.

Cap tracked IPs at 100k to prevent unbounded hash growth
from large numbers of unique source IPs.
@claude

This comment was marked as resolved.

- Remove pointer-based consume_token, modify instance vars directly
- Evict oldest entry instead of rejecting when IP table is full
- Rate-limit the table-full log warning
- Remove dead @log_suppressed field
@claude

This comment was marked as resolved.

- Merge @per_ip_tokens and @per_ip_last_refill into a single hash
  with a PerIPState record to eliminate sync risk
- Use Hash#first_key for O(1) eviction instead of min_by O(n) scan
- Document the per-IP token consumption trade-off in allow? comment
@claude

This comment was marked as outdated.

Guard all mutable state with a Mutex so the limiter is safe when
listener fibers and the stats_loop fiber run on separate OS threads
under preview_mt.
@claude
Copy link

claude bot commented Mar 2, 2026

Code Review

No issues found. The implementation is solid:

  • Token bucket algorithm is correctly implemented with proper mutex synchronization for thread safety.
  • Rate limit checks happen synchronously in the accept loop (before spawning fibers), which correctly prevents fiber explosion under connection floods.
  • Per-IP and global limits compose correctly via short-circuit evaluation (per-IP checked first to avoid wasting global tokens).
  • Bounded memory usage via MAX_TRACKED_IPS cap with O(1) eviction, plus periodic cleanup_stale_entries from the stats loop.
  • PerIPState is a record (struct), avoiding heap allocations on state updates.
  • Moving remote_address/remote_addr outside the spawn blocks is safe — closure capture works correctly and the values are used properly in both the rate check and the spawned fiber.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants