Skip to content

[codex] Add rollout-backed thread content search#23519

Merged
fc-oai merged 13 commits into
mainfrom
fc/thread-search-rg
May 21, 2026
Merged

[codex] Add rollout-backed thread content search#23519
fc-oai merged 13 commits into
mainfrom
fc/thread-search-rg

Conversation

@fc-oai
Copy link
Copy Markdown
Contributor

@fc-oai fc-oai commented May 19, 2026

Summary

  • add experimental thread/search for local rollout-backed thread search using rg over JSONL rollouts
  • return search-specific result rows with optional previews instead of storing preview data on StoredThread or ordinary Thread responses
  • keep thread/list separate from full-content search and document the new app-server surface

Testing

  • cargo test -p codex-app-server-protocol
  • cargo test -p codex-app-server thread_search_returns_content_and_title_matches -- --nocapture

@fc-oai fc-oai changed the title Add rollout-backed thread content search [codex] Add rollout-backed thread content search May 19, 2026
@fc-oai fc-oai marked this pull request as ready for review May 19, 2026 22:08
Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7373d67e24

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

SortDirection::Asc => StoreSortDirection::Asc,
SortDirection::Desc => StoreSortDirection::Desc,
},
allowed_sources: compute_source_filters(source_kinds).0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply post-filtered source kinds to search results

When sourceKinds contains kinds that compute_source_filters cannot express as rollout SessionSources (for example exec, appServer, or any subagent variant), that helper intentionally returns an empty allowed_sources plus a post-filter, and thread/list applies source_kind_matches afterwards. thread/search only passes .0 here and never applies the post-filter to the returned rows, so those requests search with an empty source filter, which the store treats as “all sources”; a thread/search for sourceKinds: ["exec"] can therefore return CLI/VS Code matches.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator

@owenlin0 owenlin0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the current impl does this:

rg --json over the entire sessions/archive root
  -> collect all stdout
  -> parse every matching line event
  -> parse each matched JSONL line into a rollout item
  -> keep first valid snippet per file path
  -> separately page thread metadata
  -> intersect metadata rows with the snippet map

for a common search term it could be a lot of data to pull into memory

Here's codex's suggestion:

One performance concern with this search path is that rg --json emits every matching JSONL line and we collect all stdout before processing it. For broad terms this can get very large even though the API only returns one result per thread. In a local benchmark against ~/.codex/sessions (2,248 JSONL files / 1.5 GB), a common-ish query produced ~636 MB of rg --json stdout and took 2.3s.

I think this can be simpler and more aligned with the API contract by using rg -l as the first pass:

  1. Run rg -l --fixed-strings --no-ignore --glob '*.jsonl' -- <term> <sessions-root> to get matching rollout paths.
  2. Put those paths in a set.
  3. Page/list threads in the requested API sort order via the existing metadata path.
  4. For rows whose rollout path is in the match set, open that rollout JSONL and parse until the first valid user/assistant ThreadSearchPreview::ContentMatch snippet.
  5. If a path matched only internal/tool/metadata content and does not produce a valid semantic snippet, skip it and continue through ordered metadata rows until limit + 1 semantic results are found or the metadata listing is exhausted.
  6. Stop after limit + 1 semantic results so nextCursor can be computed.

This keeps SQLite/listing as the source of truth for API ordering and uses ripgrep only to answer “which rollout files contain the raw term?”. It also preserves the current semantic filter: a raw ripgrep hit is only a candidate; it becomes a search result only after rollout parsing confirms the match is in user/assistant-visible content.

This should avoid huge output for common terms because rg -l prints each path once instead of every matching line. In the same benchmark, global rg -l plus metadata ordering took ~63ms for the common-ish query, ~149ms for a rare query, and ~106ms for a no-match query.

thoughts? I had codex benchmark my own sessions directory with this approach

Results:

Common-ish query:
current rg --json:      ~2.321s, ~636 MB stdout
proposed rg -l:         ~0.063s, ~178 KB stdout

Rare query:
current rg --json:      ~0.163s, ~252 KB stdout
proposed rg -l:         ~0.149s, ~110 B stdout

No-match query:
current rg --json:      ~0.099s, ~257 B stdout
proposed rg -l:         ~0.106s, 0 B stdout

}

#[test]
fn thread_search_is_marked_experimental() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: codex likes to add these tests but not super high value IMO - can we remove?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably should update AGENTS.md to not add trivial tests like these and remove the ones below, but no worries about doing it in this PR

Copy link
Copy Markdown
Collaborator

@owenlin0 owenlin0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, can we switch to using InstallContext::rg_command() instead of invoking rg from the PATH directly? we do bundle ripgrep in our standalone codex installations

Copy link
Copy Markdown
Collaborator

@owenlin0 owenlin0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would love @wiltzius-openai to take a look too since this touches ThreadStore.

@wiltzius-openai for your context, @fc-oai would like to add an experimental thread/search API so the Codex App can expose a "thread search" feature. can you validate if this is something that we should eventually support in ThreadStore? would be nice to have a cloud-based solution for this as well eventually

@owenlin0 owenlin0 requested a review from wiltzius-openai May 20, 2026 17:23
let Some(local_store) = self
.thread_store
.as_any()
.downcast_ref::<LocalThreadStore>()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, we have an existing pattern for ThreadStore and throwing a ThreadStoreError::Unsupported error if unsupported on local or remote as needed. can we follow that?

thread/search should be modeled as a ThreadStore operation instead of downcasting to LocalThreadStore in the app-server processor.

Suggested shape:

  1. Add a search_threads(...) -> ThreadStoreResult<ThreadSearchPage> method to the ThreadStore trait.
  2. Give the trait default implementation:
    Err(ThreadStoreError::Unsupported { operation: "thread/search" })
  3. Implement it for LocalThreadStore by calling the existing local rollout search path.
  4. In app-server, call self.thread_store.search_threads(...) instead of as_any().downcast_ref::<LocalThreadStore>().
  5. Let the existing thread_store_list_error / unsupported_thread_store_operation mapping convert ThreadStoreError::Unsupported into the JSON-RPC error.

That keeps the API boundary honest: non-local stores report “unsupported” instead of returning an empty page, while future remote stores can add their own implementation without changing app-server dispatch.

@fc-oai fc-oai requested a review from owenlin0 May 20, 2026 17:55
@fc-oai fc-oai force-pushed the fc/thread-search-rg branch from 3b46e34 to 58156cd Compare May 20, 2026 18:55
Comment thread codex-rs/app-server-protocol/src/protocol/v2/thread_data.rs Outdated
@fc-oai fc-oai force-pushed the fc/thread-search-rg branch from 58156cd to 34c8b40 Compare May 20, 2026 20:57
Comment thread codex-rs/thread-store/src/store.rs Outdated
Copy link
Copy Markdown
Collaborator

@owenlin0 owenlin0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small note on search semantics.

Comment thread codex-rs/rollout/src/search.rs
Copy link
Copy Markdown
Contributor

@wiltzius-openai wiltzius-openai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General direction seems OK but I think there are some implementation improvements possible

For remote thread store I think it would be possible to support the search contract (it'll just require a lot of infra)

#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ThreadSearchResult {
pub thread: Thread,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm this will be a Thread object that does not contain Turns right, it isn't the full history? I think full history would be too expensive.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct. It's not full history. turns will be []

.map_err(thread_store_list_error)?;

for result in page.items {
let source = with_thread_spawn_agent_metadata(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some cruft in this search implementation. I think what's happening is:

  • it searches in thread store using a search term and some some additional parameters
  • it applies a post-filter based on source kind? Is there a reason this filter can't be applied internally?
  • then it has to re-loop the search since it no longer necessarily has a full page of results?

Comment thread codex-rs/app-server/src/request_processors/thread_processor.rs
Comment thread codex-rs/app-server/README.md Outdated
- `cwd` — restrict results to threads whose session cwd exactly matches this path, or one of these paths when an array is provided. Relative paths are resolved against the app-server process cwd before matching.
- `useStateDbOnly` — when `true`, return from the state DB without scanning JSONL rollouts to repair metadata. Omit or pass `false` to preserve the default scan-and-repair behavior.
- `searchTerm` — restrict results to threads whose extracted title contains this substring (case-sensitive).
- `searchTerm` — on `thread/list`, restrict results using the existing stored-thread search behavior. Use experimental `thread/search` when callers need persisted user/assistant content search plus matching snippet context.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"using the existing stored-thread search behavior." isn't super helpful, the old text is more precise -- I assume that's what was preserved (case-sensitive title search)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fc-oai can we revert this change to the README?

sort_direction,
)
.await?;
for item in page.items {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a little inefficient to ripgrep over the entire rollout history then enumerate all threads filtering to only those that match the ripgrep results. You already have the paths from the ripgrep search, so why not use that directly in the snippet generation (which takes a path)?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that approach, while seemingly a bit inefficient in theory, is actually faster when a search term has a lot of matches

The tradeoff is broad terms. The old SQLite-intersection approach is good when a term appears in lots of files, because SQLite can walk newest-first and stop once it has a page. The revised approach has to read metadata for every rg hit before it can sort and page, so common terms like "thread" regress badly.

let mut items = matching_items
.into_iter()
.filter_map(|item| {
stored_thread_from_rollout_item(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is returning a stored thread but with only a single rollout item as its history? A little funky but I guess that works (its just a "different" shape of thread than anywhere else I think)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious if there's a reason to include both the snippet and the item?

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct ThreadSearchPage {
/// Search results returned for this page.
pub items: Vec<StoredThreadSearchResult>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: slightly confusing to call this "items" since they're actually threads + snippets and the threads themselves contain "items" (if I'm reading correctly, a list of a single item?)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed but also this follows the current pattern for all of the *Page structs in the threadstore contract. wonder if we should call it data instead of items?

@fc-oai fc-oai force-pushed the fc/thread-search-rg branch from 267cc10 to 41f247d Compare May 21, 2026 16:53
@fc-oai fc-oai merged commit ac0bff2 into main May 21, 2026
31 checks passed
@fc-oai fc-oai deleted the fc/thread-search-rg branch May 21, 2026 18:52
@github-actions github-actions Bot locked and limited conversation to collaborators May 21, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants