feat: add sync_max_age_days to email adapter #547

Open
connorblack wants to merge 4 commits into spacedriveapp:main from connorblack:feat/email-sync-max-age

Conversation

connorblack commented Apr 6, 2026

Summary

Bound first-connect email backfill to a configurable recency window. Without this, the first poll after an email account is added imports every unread message in the configured folders, and the agent tries to respond to years of stale mail.

sync_max_age_days defaults to 0 (no limit) so existing setups are unchanged. When set, the poll query becomes UNSEEN SINCE <date> instead of just UNSEEN. The SINCE filter is evaluated server-side by the IMAP server, so the check is cheap on large mailboxes and no messages are fetched only to be discarded locally.

Works on the default email config and on per-instance configs.

Problem

Connecting an email account to a mailbox that has accumulated hundreds of unread emails from months (or years) ago floods the agent with stale messages. Each one looks like a new conversation, and the agent responds to mail it has no business responding to.

The existing email_search tool already supports a since_days filter (see build_imap_search_criterion in src/messaging/email.rs), so the IMAP path is well-trodden. The poll path was the gap.

Solution

Mirror the SINCE clause construction in poll_inbox_once and gate it on a new sync_max_age_days config field. Thread the field through the TOML schema, the loaded config types, and the adapter runtime config in the same way as the existing email knobs.

[messaging.email]
sync_max_age_days = 1  # cap backfill at one day of unread mail

# also works on named instances
[[messaging.email.instances]]
name = "support"
sync_max_age_days = 7

Semantics

IMAP SINCE (RFC 3501 §6.4.4) is inclusive and matches against whole dates at midnight in the server's local time, not rolling 24-hour windows. sync_max_age_days = 1 therefore bounds the oldest mail the poller will import to "received on or after yesterday's date", which can include mail up to ~48 hours old depending on the current time and the server's timezone. Treat the value as a backfill cap, not a literal "last N hours" window: the whole-date match can admit up to roughly one extra day of mail beyond N days, and larger values only widen the window, so choose the smallest value whose oldest admitted date is acceptable.
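The slack described above can be worked through numerically. The following is a minimal sketch, not code from this PR: `max_admitted_age_secs` is a hypothetical helper, and it assumes the cutoff date and the message dates are read on the same clock (a timezone offset between poller and server would add further slack).

```rust
/// Seconds in one calendar day (leap seconds ignored).
const SECS_PER_DAY: u64 = 86_400;

/// Hypothetical helper: upper bound, in seconds, on the age of a message that
/// `UNSEEN SINCE <today - N days>` still admits. `secs_since_midnight` is how
/// far into the current day "now" is. The oldest admitted message sits at
/// 00:00 on the cutoff date, so its age is N whole days plus the elapsed part
/// of today.
fn max_admitted_age_secs(sync_max_age_days: u64, secs_since_midnight: u64) -> u64 {
    sync_max_age_days
        .saturating_mul(SECS_PER_DAY)
        .saturating_add(secs_since_midnight)
}
```

With a value of 1, a poll at 00:00:00 admits mail up to 86,400 seconds (24 hours) old, while a poll at 23:59:59 admits mail up to 172,799 seconds old, just under 48 hours. Raising the value never narrows the window.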

Implementation notes

  • Extracted a shared build_since_date() helper used by both poll_inbox_once and build_imap_search_criterion. Single source of truth for the dd-MMM-YYYY format string and the overflow guard.
  • Extracted a build_poll_search_query() helper that assembles the UNSEEN vs UNSEEN SINCE <date> query string. The poll-time assembly is unit-tested in isolation from the IMAP session.
  • Added a MAX_SINCE_DAYS clamp (1_000_000) inside build_since_date. Without it, a config value past chrono's internal TimeDelta bound would panic the poll task; a value past i64::MAX would wrap to a negative ChronoDuration and produce a future SINCE date, silently excluding every message and re-introducing the exact bug the feature is supposed to prevent. Both failure modes are now unreachable: oversized inputs degrade to "no SINCE clause" and the poller behaves as if sync_max_age_days = 0.
  • Six unit tests cover the new helpers: zero/missing input, IMAP date format, large-input saturation, poll query assembly for zero/non-zero/oversized values, and the existing build_imap_search_criterion escaping behavior.
  • sync_max_age_days is included in the Debug impls for both EmailConfig and EmailInstanceConfig so config dumps reflect the actual polling state during debugging.
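For readers who want the shape of the two helpers without opening the diff, here is a rough standard-library-only sketch. It is not the PR's code: the real helpers use chrono and their exact signatures are not shown in this thread, so the `now_unix_secs` parameter, the hand-rolled `civil_from_days` conversion, and the four-digit-year guard are all assumptions of this sketch. Only the names `build_since_date`, `build_poll_search_query`, and the `MAX_SINCE_DAYS` value come from the description above.

```rust
// In the prelude since the 2021 edition; imported so older editions compile too.
use std::convert::TryFrom;

/// Upper bound on accepted day counts; the value matches the PR's MAX_SINCE_DAYS.
const MAX_SINCE_DAYS: u64 = 1_000_000;
const SECS_PER_DAY: i64 = 86_400;
const MONTHS: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

/// Convert a day count relative to 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar (the usual days-to-civil conversion).
fn civil_from_days(days_since_epoch: i64) -> (i64, usize, i64) {
    let z = days_since_epoch + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of 400-year era
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month as usize, day)
}

/// Sketch of the shared date helper: `None` means "emit no SINCE clause".
fn build_since_date(now_unix_secs: i64, days: u64) -> Option<String> {
    // Zero means "no limit"; oversized values degrade to "no limit" as well.
    if days == 0 || days > MAX_SINCE_DAYS {
        return None;
    }
    // Checked conversion and subtraction instead of a wrapping `as` cast.
    let days = i64::try_from(days).ok()?;
    let today = now_unix_secs.div_euclid(SECS_PER_DAY);
    let (year, month, day) = civil_from_days(today.checked_sub(days)?);
    // Sketch-only assumption: IMAP dates carry a four-digit year, so a cutoff
    // outside years 1..=9999 is treated as "no filter".
    if !(1..=9999).contains(&year) {
        return None;
    }
    Some(format!("{:02}-{}-{:04}", day, MONTHS[month - 1], year))
}

/// Sketch of the poll query assembly: UNSEEN, optionally narrowed by SINCE.
fn build_poll_search_query(now_unix_secs: i64, sync_max_age_days: u64) -> String {
    match build_since_date(now_unix_secs, sync_max_age_days) {
        Some(date) => format!("UNSEEN SINCE {date}"),
        None => "UNSEEN".to_string(),
    }
}
```

Passing the current time in as a parameter is a choice made for this sketch so the date arithmetic can be checked deterministically; for example, a timestamp of 1_775_476_800 (2026-04-06 12:00 UTC) with a value of 7 produces `UNSEEN SINCE 30-Mar-2026`.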

Files changed

| File | Change |
| --- | --- |
| src/messaging/email.rs | Add build_since_date and build_poll_search_query helpers, plumb sync_max_age_days through EmailPollConfig/EmailAdapter, use the helpers in poll_inbox_once, add 6 new unit tests |
| src/config/toml_schema.rs | Add sync_max_age_days: u64 to TomlEmailConfig and TomlEmailInstanceConfig with #[serde(default)] |
| src/config/types.rs | Add field to EmailConfig and EmailInstanceConfig; include in both Debug impls |
| src/config/load.rs | Thread field through default and per-instance config construction |
| src/config.rs | Update test fixture to set sync_max_age_days: 0 |
| docs/content/docs/(messaging)/email-setup.mdx | Document the new field, add example to the leading TOML block, add "Limiting backfill on first connect" section with a precise explanation of IMAP SINCE semantics |

Test plan

  • cargo check --all-targets — clean compile
  • cargo test --lib — 881 tests passing (875 baseline + 6 new)
  • cargo clippy --all-targets -- -D warnings — no warnings
  • cargo fmt --all -- --check — clean
  • just gate-pr-ci-fast — all gate checks pass
  • Manual test with a real IMAP mailbox containing old unread emails

Reviewer notes

  • MAX_SINCE_DAYS could be tightened to whatever value the team prefers; 1_000_000 days (~2739 years) was chosen as obviously past anything sane without being so small that it surprises users who think in decades.
  • The IMAP SINCE semantics are documented precisely in both the in-code doc comment on build_poll_search_query and the docs page. If the team would rather rename the field (e.g. backfill_max_age_days) or add a separate since_window_hours for users who need rolling windows, that's a small follow-up.

Out of scope (known follow-ups)

  • The poll loop has a pre-existing retry issue: a message that consistently fails to parse stays UNSEEN forever, and the next poll picks it up again. The SINCE filter narrows the scope of this bug but doesn't fix it. Worth a follow-up PR with a per-UID retry budget.
  • from_instance_config rebuilds a temporary EmailConfig to call Self::build. Every new email config field has to be copied in this temp struct. Consider a From<&EmailInstanceConfig> for EmailConfig impl.
  • search_mailbox in the same file hand-builds an EmailPollConfig literal; the new field is propagated there, but the broader divergence between search_mailbox's literal and build()'s initialization is a pre-existing maintenance hazard.
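The `From<&EmailInstanceConfig> for EmailConfig` follow-up could look roughly like the sketch below. The structs are trimmed-down stand-ins: `imap_host`, `folders`, and `name` are hypothetical fields chosen for illustration, and only `sync_max_age_days` is taken from this PR.

```rust
/// Hypothetical, trimmed-down stand-in for the real config type.
#[derive(Debug, Clone, PartialEq)]
struct EmailConfig {
    imap_host: String,
    folders: Vec<String>,
    sync_max_age_days: u64,
}

/// Hypothetical, trimmed-down stand-in for the per-instance config type.
#[derive(Debug, Clone, PartialEq)]
struct EmailInstanceConfig {
    name: String,
    imap_host: String,
    folders: Vec<String>,
    sync_max_age_days: u64,
}

impl From<&EmailInstanceConfig> for EmailConfig {
    fn from(instance: &EmailInstanceConfig) -> Self {
        // Every field is named and no `..` is used, so adding a field to
        // `EmailConfig` fails to compile here until it is mapped.
        Self {
            imap_host: instance.imap_host.clone(),
            folders: instance.folders.clone(),
            sync_max_age_days: instance.sync_max_age_days,
        }
    }
}
```

Under this arrangement a caller such as from_instance_config would write `EmailConfig::from(&instance)` rather than copying fields by hand, leaving a single place to update when a new email knob is added.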

When connecting an email account for the first time, every unread email
in the inbox gets imported and treated as new. This floods the agent
with stale messages it shouldn't respond to.

Add a `sync_max_age_days` config option (default: 0 / no limit) that
combines IMAP's UNSEEN flag with a SINCE date filter so only recent
unread emails are imported. The SINCE query is evaluated server-side
by the IMAP server, so it's efficient even on large mailboxes.

Supported on both the default email config and per-instance configs.
Copilot AI review requested due to automatic review settings April 6, 2026 19:32
coderabbitai Bot (Contributor) commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a4fa31c4-3e7a-41bf-b450-8bdb7eeb12c0

📥 Commits

Reviewing files that changed from the base of the PR and between 6531919 and ca32577.

📒 Files selected for processing (3)
  • docs/content/docs/(messaging)/email-setup.mdx
  • src/config/types.rs
  • src/messaging/email.rs
✅ Files skipped from review due to trivial changes (1)
  • docs/content/docs/(messaging)/email-setup.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/config/types.rs
  • src/messaging/email.rs

Walkthrough

A new sync_max_age_days parameter is added to TOML schema and runtime config types, mapped during config load, threaded into EmailAdapter/EmailPollConfig, and used to optionally add IMAP SINCE filtering; tests and docs were updated accordingly.

Changes

Email sync config and polling

| Layer / File(s) | Summary |
| --- | --- |
| TOML schema and public types: src/config/toml_schema.rs, src/config/types.rs | Added sync_max_age_days: u64 with #[serde(default)] to TomlEmailConfig and TomlEmailInstanceConfig, and added pub sync_max_age_days: u64 to EmailConfig and EmailInstanceConfig; updated Debug outputs and made instances serde-default. |
| Config loading wiring: src/config/load.rs | Mapped instance.sync_max_age_days into each constructed EmailInstanceConfig and email.sync_max_age_days into the top-level EmailConfig during config construction. |
| Email adapter polling, IMAP query, tests, and docs: src/messaging/email.rs, docs/content/docs/(messaging)/email-setup.mdx | Threaded sync_max_age_days into EmailAdapter and EmailPollConfig; added build_since_date(days) and MAX_SINCE_DAYS; updated IMAP search criterion to use UNSEEN SINCE {date} when helper returns a date; added unit tests for date handling and updated documentation describing the setting. |
| Validation test update: src/config.rs | Updated an unsupported-platform adapter validation test to include sync_max_age_days in its inline EmailConfig construction. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: add sync_max_age_days to email adapter' accurately and concisely summarizes the main change: adding a new configuration field to limit email backfill scope. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, clearly explaining the problem, solution, implementation details, and testing approach. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Copilot AI left a comment

Pull request overview

Adds a sync_max_age_days configuration option to the email adapter to limit IMAP polling to recent unread emails, preventing first-connect inbox floods from importing long-stale UNSEEN messages.

Changes:

  • Thread sync_max_age_days through TOML schema → loaded config types → adapter runtime config.
  • Update IMAP polling to use UNSEEN SINCE <date> when sync_max_age_days > 0.
  • Update config test fixture to include the new field.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/messaging/email.rs | Adds sync_max_age_days to polling config and builds UNSEEN + SINCE IMAP search queries during polling. |
| src/config/types.rs | Adds sync_max_age_days to EmailConfig and EmailInstanceConfig. |
| src/config/toml_schema.rs | Adds sync_max_age_days to TOML deserialization structs with default behavior. |
| src/config/load.rs | Threads sync_max_age_days through config loading for default and instance email configs. |
| src/config.rs | Updates config fixture to set sync_max_age_days: 0. |

Comment thread src/messaging/email.rs Outdated
Comment thread src/messaging/email.rs Outdated
Comment thread src/messaging/email.rs Outdated
Comment thread src/config/types.rs
Comment thread src/config/types.rs

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/messaging/email.rs`:
- Around line 723-726: The cast config.sync_max_age_days as i64 can wrap for
large u64 values; validate or safely convert before calling
ChronoDuration::days() to avoid producing a future SINCE date. Replace the blind
cast in the search_query construction by using
i64::try_from(config.sync_max_age_days) (or clamp to a sensible max) and handle
the Err by returning/config error or defaulting; ensure the conversion happens
before calling ChronoDuration::days() and reference config.sync_max_age_days,
ChronoDuration::days(), and the search_query/Utc::now() usage when locating the
change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b6b793cc-691d-47c5-85c2-9a862a5372dd

📥 Commits

Reviewing files that changed from the base of the PR and between fb5c0f3 and 3b80855.

📒 Files selected for processing (5)
  • src/config.rs
  • src/config/load.rs
  • src/config/toml_schema.rs
  • src/config/types.rs
  • src/messaging/email.rs

Comment thread src/messaging/email.rs Outdated
Extract the UNSEEN SINCE <date> construction into build_since_date() and
use it from both poll_inbox_once and build_imap_search_criterion. The
helper clamps day counts to MAX_SINCE_DAYS (1_000_000) and returns None
for values past chrono::TimeDelta's internal bounds, replacing the
original 'u64 as i64' cast in poll_inbox_once that would wrap to a
negative ChronoDuration for inputs past i64::MAX and produce a future
SINCE date that silently excluded every message.

Also adds 3 unit tests for build_since_date and updates the email-setup
docs to document sync_max_age_days on the default config and per-instance
configs.
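The wrap described in this commit message is easy to reproduce in isolation. The snippet below is a small illustration rather than the adapter's code; `days_lossy`, `days_checked`, and `cutoff_day` are hypothetical names, and day numbers stand in for the chrono date types.

```rust
// In the prelude since the 2021 edition; imported so older editions compile too.
use std::convert::TryFrom;

/// The original pattern: an `as` cast keeps the bit pattern, so any value
/// above `i64::MAX` comes out negative.
fn days_lossy(days: u64) -> i64 {
    days as i64
}

/// The checked alternative: out-of-range values become `None`.
fn days_checked(days: u64) -> Option<i64> {
    i64::try_from(days).ok()
}

/// Cutoff as a day number: today minus the look-back. A negative look-back
/// moves the cutoff into the future.
fn cutoff_day(today: i64, days_back: i64) -> Option<i64> {
    today.checked_sub(days_back)
}
```

With `u64::MAX` as input, the lossy cast yields -1 and the cutoff lands one day after today, so a SINCE query built from it would match nothing until tomorrow. The checked conversion returns `None` instead, which the caller can map to "no SINCE clause".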
@connorblack (Author)

@jamiepine — could you take a look when you get a chance?

I ran a front-to-back adversarial review before opening this up. The design is sound; the only blocking issue was a u64 as i64 cast that would wrap to a negative ChronoDuration for inputs past i64::MAX, producing a future SINCE date that silently excluded every message (re-introducing the exact bug this PR is supposed to fix). CodeRabbit flagged the same line. Pushed in 6531919d:

  • Extracted build_since_date() shared between poll_inbox_once and build_imap_search_criterion so the date format and the overflow guard stay in lockstep.
  • Clamped the day count to MAX_SINCE_DAYS = 1_000_000 (~2739 years), which is well past any realistic config and safely inside chrono's TimeDelta bounds. Anything larger degrades to "no SINCE clause" (i.e. the poller behaves as if sync_max_age_days = 0) instead of panicking or producing a future date.
  • Added 3 unit tests for the helper.
  • Updated email-setup.mdx to document the new field and added an example to the leading TOML block.

just gate-pr-ci-fast is clean: 878 tests pass (875 existing + 3 new), clippy/fmt clean.

Two pre-existing things I noticed but left out of scope — happy to follow up either as separate PRs or fold them in if you'd rather keep them in this slice:

  • Poll-loop retry budget: a message that consistently fails to parse stays UNSEEN forever; the next poll picks it up again. The SINCE filter narrows the scope but doesn't fix it.
  • from_instance_config rebuilds a temporary EmailConfig to call Self::build, and every new email field has to be copied in this temp. A From<&EmailInstanceConfig> for EmailConfig impl would remove the maintenance hazard.
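As a starting point for the retry-budget follow-up, one possible shape is sketched here. Nothing like this exists in the PR: `RetryBudget` and its methods are hypothetical, and the sketch assumes UIDs are tracked per mailbox and that the caller decides what "give up" means (for example, marking the message seen or logging and skipping it).

```rust
use std::collections::HashMap;

/// Hypothetical per-UID retry budget for messages that keep failing to parse.
struct RetryBudget {
    max_attempts: u8,
    failures: HashMap<u32, u8>,
}

impl RetryBudget {
    fn new(max_attempts: u8) -> Self {
        Self {
            max_attempts,
            failures: HashMap::new(),
        }
    }

    /// Record one failed parse for `uid`. Returns `true` once the budget is
    /// exhausted and the caller should stop retrying this message.
    fn record_failure(&mut self, uid: u32) -> bool {
        let count = self.failures.entry(uid).or_insert(0);
        *count = count.saturating_add(1);
        *count >= self.max_attempts
    }

    /// Forget a UID after it parses successfully.
    fn record_success(&mut self, uid: u32) {
        self.failures.remove(&uid);
    }
}
```

An in-memory map like this resets on restart, which may be acceptable for a poll loop; persisting the counts would be a separate design decision.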

… semantics

Addresses the remaining review comments on PR spacedriveapp#547:

- Extract build_poll_search_query() so the UNSEEN / UNSEEN SINCE <date>
  assembly is unit-testable independently of the IMAP session. Add
  three tests covering zero, non-zero, and absurd input.
- Add sync_max_age_days to the Debug impls for both EmailConfig and
  EmailInstanceConfig. The field is non-sensitive and omitting it made
  config dumps misleading when diagnosing polling behavior.
- Update docs and PR description to describe the actual IMAP SINCE
  semantics precisely: inclusive midnight boundary, not a rolling
  24h window. Treat the value as a backfill cap.

capy-ai Bot left a comment

Added 1 comment


Default is `0` (no limit), which preserves the original behavior.

**Semantics.** IMAP `SINCE` (RFC 3501 §6.4.4) is inclusive and matches against whole dates at midnight in the server's local time, not rolling 24-hour windows. `sync_max_age_days = 1` therefore bounds the *oldest* mail the poller will import to "received on or after yesterday's date", which can include mail up to ~48 hours old depending on the current time and the server's timezone. Treat the value as a backfill cap, not a literal "last N hours" window — pick a value that's comfortably larger than your real cutoff if you need to be strict (e.g. use 2 to approximate "last 24h").

[🟡 Medium] [🔵 Bug]

The new semantics note says users should pick a value "comfortably larger" than their real cutoff to be strict, but larger sync_max_age_days values include older messages and therefore make the cutoff less strict; this can cause operators to ingest more stale unread mail than intended when following the docs. Update the guidance to describe choosing the oldest acceptable date (or explicitly call out the extra slack) without implying larger values are stricter.

// docs/content/docs/(messaging)/email-setup.mdx
Default is `0` (no limit), which preserves the original behavior.
**Semantics.** IMAP `SINCE` (RFC 3501 §6.4.4) is inclusive and matches against whole dates at midnight in the server's local time, not rolling 24-hour windows. `sync_max_age_days = 1` therefore bounds the *oldest* mail the poller will import to "received on or after yesterday's date", which can include mail up to ~48 hours old depending on the current time and the server's timezone. Treat the value as a backfill cap, not a literal "last N hours" window — pick a value that's comfortably larger than your real cutoff if you need to be strict (e.g. use 2 to approximate "last 24h").
## Verify it's working
