feat(mcp): cap query results at 1000 rows #30

Merged: tianzhou merged 2 commits into main from feat/mcp-result-cap on Jun 25, 2026

@tianzhou

Contributor

Why

The MCP execution tools returned every row. A broad SELECT * FROM events would serialize the whole result into one JSON payload and ship it to the agent — blowing its context window and burning tokens. (Diagnosed while reviewing how desktop SQL clients handle large results: DBeaver caps at 200, DataGrip at 500, etc.)

What

Cap the MCP read path at MAX_RESULT_ROWS = 1000:

  • The result is fetched, then capped; the response carries truncated: true and the true rowCount (from CommandComplete, independent of capping), plus returnedRows and a note telling the agent to narrow with LIMIT/WHERE.
  • query gains an optional maxRows arg to request fewer (clamped to the hard cap).
  • write_data / run_ddl also cap their RETURNING rows, surfacing truncated only when it actually bit.
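
A minimal sketch of what the two pure helpers could look like, based only on the behavior described above and in the test list (pass-through, exact-cap boundary, truncation, default, clamp-down, rejection of invalid input). The signatures, the error message, and the argument shape are assumptions, not the code from server/mcp.ts.

```typescript
// Hard upper bound on rows returned to the agent (value taken from this PR).
const MAX_RESULT_ROWS = 1000;

// Assumed shape: keep at most `cap` rows and report whether any were dropped.
function capRows<T>(rows: readonly T[], cap: number): { rows: T[]; truncated: boolean } {
  const truncated = rows.length > cap;
  return { rows: truncated ? rows.slice(0, cap) : rows.slice(), truncated };
}

// Assumed shape: read the optional `maxRows` argument, default to the hard cap,
// clamp larger values down, and reject non-positive or non-integer input.
function readMaxRows(args: { maxRows?: unknown }): number {
  const raw = args.maxRows;
  if (raw === undefined || raw === null) return MAX_RESULT_ROWS;
  if (typeof raw !== "number" || !Number.isInteger(raw) || raw <= 0) {
    throw new Error("maxRows must be a positive integer");
  }
  return Math.min(raw, MAX_RESULT_ROWS);
}

// Usage: a 1500-row result with no maxRows is cut to the hard cap.
const sample = Array.from({ length: 1500 }, (_, i) => ({ id: i }));
const capped = capRows(sample, readMaxRows({}));
console.log(capped.rows.length, capped.truncated); // 1000 true
```

A result of exactly `cap` rows is not marked truncated in this sketch, which matches the "exact-cap boundary" case named in the tests.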

The agent has SQL, so — unlike a fixed-API tool that needs server-side jq filtering — the right lever is a safety cap plus a clear signal to refine the query.

Scope: MCP only (by decision)

The web UI route stays uncapped. It needs the full result set for CSV export (QueryResults.tsx exports client-side rows) and inline editing, and a human driving a virtualized grid doesn't have the agent's context-blowup problem. Capping the UI would silently truncate exports — so it's deliberately out of scope. The cap constant lives in mcp.ts and isn't imposed on the RPC path.

Tests

  • New unit tests for the pure helpers capRows / readMaxRows (pass-through, exact-cap boundary, truncation, default, clamp-down, reject non-positive/non-integer).
  • Full suite: tests/mcp.test.ts (39) + tests/execute-sql.test.ts (9) = 48 pass; tsc --noEmit clean; pnpm build:server succeeds.
  • Docs (docs/features/mcp-server.mdx) updated.

Follow-ups (noted, not in this PR)

  • Server-RAM bound via a postgres.js cursor (.cursor(cap+1)) so pathological SELECT * doesn't materialize fully server-side first — deferred because it rewires the result path and needs live-DB validation.
  • Per-cell byte guard for wide jsonb/bytea values within the row cap.
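
The cursor follow-up could be sketched roughly as below. This is an illustration under stated assumptions, not a planned implementation: `fetchCapped` and `fakeCursor` are hypothetical names, and the code is written against a generic `AsyncIterable` of row batches, which is the shape a postgres.js `.cursor(n)` query is expected to provide when iterated with `for await`. The stand-in generator keeps the example runnable without a database.

```typescript
type Row = Record<string, unknown>;

// Reads at most `cap` rows from a stream of row batches and stops early.
// With batches of cap + 1 rows, the first batch alone reveals whether more rows exist.
async function fetchCapped(
  batches: AsyncIterable<Row[]>,
  cap: number,
): Promise<{ rows: Row[]; truncated: boolean }> {
  const collected: Row[] = [];
  for await (const batch of batches) {
    for (const row of batch) {
      if (collected.length === cap) {
        // Returning from inside for-await asks the iterator to clean up,
        // so the remaining rows are never pulled.
        return { rows: collected, truncated: true };
      }
      collected.push(row);
    }
  }
  return { rows: collected, truncated: false };
}

// Hypothetical stand-in for a database cursor: yields `total` rows in batches.
async function* fakeCursor(total: number, batchSize: number): AsyncGenerator<Row[]> {
  for (let start = 0; start < total; start += batchSize) {
    const size = Math.min(batchSize, total - start);
    yield Array.from({ length: size }, (_, i) => ({ id: start + i }));
  }
}
```

One consequence worth noting: once the fetch stops early, the true total is no longer known from the rows alone, so `rowCount` would need another source in a cursor-based design.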

🤖 Generated with Claude Code

The MCP execution tools returned every row, so a broad SELECT could flood
an agent's context (and tokens). Cap reads at MAX_RESULT_ROWS (1000): the
result is fetched, then capped, and the response carries `truncated` plus
the true `rowCount` so the agent knows to narrow (LIMIT/WHERE) — it has
SQL, unlike a fixed API. `query` also takes an optional `maxRows` to
request fewer (clamped to the hard cap).

Scope is MCP-only by decision: the web UI route stays uncapped because it
needs the full result set for CSV export and inline editing, and a human
driving the grid (with virtualization) doesn't have the agent's
context-blowup problem.

Pure helpers (capRows, readMaxRows) are unit-tested; docs updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 25, 2026 17:23
@vercel

vercel Bot commented Jun 25, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
pgconsole Ready Ready Preview, Comment Jun 25, 2026 5:32pm

Copilot AI left a comment

Contributor

Pull request overview

Introduces a hard row cap for MCP SQL execution tools to prevent oversized result payloads from flooding the agent context window, while keeping the web UI query path uncapped for CSV export and inline editing.

Changes:

  • Add a hard cap (MAX_RESULT_ROWS = 1000) with truncated signaling and additional metadata (rowCount, returnedRows, note) on MCP query results.
  • Add maxRows option (clamped to the hard cap) and pure helpers capRows / readMaxRows.
  • Add unit tests for the new helpers and update MCP server documentation to describe the cap behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/mcp.test.ts | Adds unit tests covering capRows and readMaxRows behavior and validation. |
| server/mcp.ts | Implements MCP-side row capping, exposes maxRows, and enriches tool responses with truncation metadata. |
| docs/features/mcp-server.mdx | Documents the MCP-only cap and guidance for narrowing queries. |

Comment thread server/mcp.ts
Comment on lines 523 to 527
const result = await runAndAudit(principal, tool, connection, details, finalSql, rawSql)
// rowCount is the true total from the server (CommandComplete), independent of capping.
const rowCount = result.count ?? result.rows.length
const { rows, truncated } = capRows(result.rows, readMaxRows(args))
if (expectedPerm === 'read') {

Contributor Author

Fixed in 6f554dc. execute() now honors maxRows only when expectedPerm === 'read'; write_data/run_ddl always use the hard cap, so a maxRows smuggled past the (non-enforced) schema no longer changes their behavior.
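
The rule described in this reply could be sketched as follows. The names `effectiveCap` and `ToolPerm` are hypothetical, and the permission values other than `'read'` are assumptions made for illustration; only the `expectedPerm === 'read'` check comes from the thread.

```typescript
const MAX_RESULT_ROWS = 1000;

// Hypothetical permission union; the PR only shows the 'read' value.
type ToolPerm = "read" | "write" | "ddl";

// Honors a caller-supplied maxRows only on the read path.
// Write and DDL tools always get the hard cap, whatever the arguments contain.
function effectiveCap(expectedPerm: ToolPerm, args: { maxRows?: unknown }): number {
  if (expectedPerm !== "read") return MAX_RESULT_ROWS;
  const raw = args.maxRows;
  if (raw === undefined || raw === null) return MAX_RESULT_ROWS;
  if (typeof raw !== "number" || !Number.isInteger(raw) || raw <= 0) {
    throw new Error("maxRows must be a positive integer");
  }
  return Math.min(raw, MAX_RESULT_ROWS);
}

console.log(effectiveCap("read", { maxRows: 50 }));  // 50
console.log(effectiveCap("write", { maxRows: 50 })); // 1000
```

In this sketch a stray `maxRows` on a write or DDL call is ignored rather than rejected; whether the real code validates it there is not stated in the thread.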

@mintlify

mintlify Bot commented Jun 25, 2026

Copy link
Copy Markdown

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
pgconsole 🟢 Ready View Preview Jun 25, 2026, 5:27 PM

@greptile-apps

greptile-apps Bot commented Jun 25, 2026


Greptile Summary

Caps MCP execution tool results at 1000 rows to prevent agents from blowing their context windows on wide SELECT * queries, and adds an optional maxRows parameter to the query tool so agents can request a tighter limit.

  • capRows and readMaxRows are added as pure exported helpers; execute() runs them on the materialized result for all three tools (query, write_data, run_ddl) and returns truncated, returnedRows, and a guidance note on the read path.
  • The hard cap lives solely in mcp.ts (MAX_RESULT_ROWS = 1000) so the web-UI route remains uncapped, preserving full-result CSV export and inline editing.
  • Unit tests cover the helpers exhaustively; docs are updated to describe the cap and the maxRows override.

Confidence Score: 4/5

Safe to merge; the cap is MCP-only, the helpers are well-tested, and no existing behaviour is changed for the web-UI path.

The core logic is correct and well-tested. Two minor follow-up concerns: the truncation note can mislead agents into raising maxRows rather than using SQL predicates, and the rowCount fallback silently depends on full result materialization — a dependency that will need revisiting when the cursor optimization lands.

server/mcp.ts — the note text in the truncation response and the rowCount fallback comment are worth a second look before the cursor optimization is implemented.

Important Files Changed

| Filename | Overview |
| --- | --- |
| server/mcp.ts | Adds MAX_RESULT_ROWS cap, readMaxRows/capRows helpers, and maxRows arg to the query tool; execute() now caps all three tool paths before returning rows |
| tests/mcp.test.ts | Adds unit tests for capRows and readMaxRows covering pass-through, exact-cap boundary, truncation, default, clamp, and rejection of invalid values |
| docs/features/mcp-server.mdx | Documents the 1000-row cap and maxRows override for the query tool; notes the MCP-only scope of the cap |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[MCP tool call: query / write_data / run_ddl] --> B[Permission check + SQL analysis]
    B --> C[runAndAudit → full result materialized in memory]
    C --> D["readMaxRows(args)\nreturns min(maxRows ?? 1000, 1000)"]
    D --> E["capRows(result.rows, cap)"]
    E --> F{truncated?}
    F -- No --> G_read{expectedPerm == read?}
    F -- Yes --> H_read{expectedPerm == read?}
    G_read -- Yes --> I["{ rowCount, returnedRows, truncated:false, columns, rows }"]
    G_read -- No --> J["{ rowCount, rows: rows.length ? rows : undefined }"]
    H_read -- Yes --> K["{ rowCount, returnedRows, truncated:true, note, columns, rows }"]
    H_read -- No --> L["{ rowCount, returnedRows, truncated:true, rows }"]

Reviews (1): Last reviewed commit: "feat(mcp): cap query results at 1000 row..."

Comment thread server/mcp.ts Outdated
returnedRows: rows.length,
truncated,
...(truncated
? { note: `Showing the first ${rows.length} of ${rowCount} rows. Add LIMIT/WHERE, or set maxRows (≤${MAX_RESULT_ROWS}), to narrow.` }

P2 When an agent explicitly passes a smaller maxRows (e.g. maxRows: 50) and still gets truncated, the note says "set maxRows (≤1000), to narrow" — which reads as an invitation to raise the value toward 1000 rather than a prompt to reduce it further or use SQL predicates. An agent following this hint literally could request more rows than intended. Reflecting the effective cap in the note makes the guidance unambiguous.

Suggested change
- ? { note: `Showing the first ${rows.length} of ${rowCount} rows. Add LIMIT/WHERE, or set maxRows (≤${MAX_RESULT_ROWS}), to narrow.` }
+ ? { note: `Showing the first ${rows.length} of ${rowCount} rows. Add LIMIT/WHERE, or pass a smaller \`maxRows\` (minimum 1), to narrow.` }

Contributor Author

Reworded in 6f554dc, taking the spirit rather than the literal suggestion: the note now reads "Refine with WHERE/ORDER BY/LIMIT to target the rows you need." I dropped the maxRows mention from the narrowing hint entirely — lowering maxRows just shows fewer rows, not the right ones, so steering toward SQL predicates is the unambiguous guidance. maxRows stays documented in the tool schema for callers who simply want fewer.
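
For illustration, the read-path response with the reworded note might be assembled along these lines. `buildReadResponse` and the `ReadResponse` type are hypothetical, `columns` is left out for brevity, and keeping the "Showing the first N of M rows." prefix ahead of the new sentence is an assumption; only the quoted sentence about WHERE/ORDER BY/LIMIT comes from the reply above.

```typescript
type Row = Record<string, unknown>;

interface ReadResponse {
  rowCount: number;      // true total reported for the query
  returnedRows: number;  // rows actually included after capping
  truncated: boolean;
  note?: string;         // present only when rows were dropped
  rows: Row[];
}

// Builds the read-path payload; the note steers the agent toward SQL predicates
// and deliberately does not mention maxRows.
function buildReadResponse(rows: Row[], rowCount: number, truncated: boolean): ReadResponse {
  return {
    rowCount,
    returnedRows: rows.length,
    truncated,
    ...(truncated
      ? {
          note:
            `Showing the first ${rows.length} of ${rowCount} rows. ` +
            "Refine with WHERE/ORDER BY/LIMIT to target the rows you need.",
        }
      : {}),
    rows,
  };
}
```

Omitting the `note` key entirely when nothing was dropped keeps the untruncated payload small, which fits the PR's goal of saving agent tokens.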

Comment thread server/mcp.ts Outdated
Comment on lines +524 to +526
// rowCount is the true total from the server (CommandComplete), independent of capping.
const rowCount = result.count ?? result.rows.length
const { rows, truncated } = capRows(result.rows, readMaxRows(args))

P2 rowCount fallback relies on full in-memory materialization — the comment says rowCount is "from CommandComplete, independent of capping," but the fallback result.rows.length is the full in-memory result, not the CommandComplete tag directly. This is functionally correct today only because all rows are materialized before capping (as the PR itself notes in follow-ups). If the cursor optimization is later added without updating this line, result.rows.length could become the cursor batch size instead of the true total, making rowCount silently wrong when result.count is not set. A brief code comment here flagging this dependency would help future maintainers.

Contributor Author

Addressed in 6f554dc with a clarifying comment. You are right that the fallback is correct only under full materialization. The comment now explicitly flags that result.rows.length is the true total only because the result is materialized before capping, and that a future cursor-based fetch must revisit it.
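
A small sketch of the dependency being flagged, using a hypothetical `totalRowCount` helper and a simplified result type (the real result object in server/mcp.ts may differ):

```typescript
// Simplified stand-in for the driver result: `count` is the server-reported
// total when available, `rows` is whatever was fetched into memory.
interface QueryResultLike<T> {
  count?: number | null;
  rows: T[];
}

function totalRowCount<T>(result: QueryResultLike<T>): number {
  // Prefer the server-reported count. The rows.length fallback equals the true
  // total ONLY while the full result is materialized before capping; a
  // cursor-based fetch would make it the number of rows read so far.
  return result.count ?? result.rows.length;
}

console.log(totalRowCount({ count: 5000, rows: [1, 2, 3] })); // 5000
console.log(totalRowCount({ rows: [1, 2, 3] }));              // 3
```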

…comment

Address PR review:
- maxRows is a query-only knob, but execute() applied readMaxRows() for
  all execution tools; since MCP tool schemas aren't enforced at runtime,
  a maxRows smuggled into write_data/run_ddl was honored. Honor it only
  for reads (`expectedPerm === 'read'`); other tools always get the hard
  cap.
- Reworded the truncation note: it said "set maxRows (≤1000)", which read
  as an invitation to raise the value. Now points to WHERE/ORDER BY/LIMIT
  to target the needed rows.
- Tightened the rowCount comment to flag that the result.rows.length
  fallback is the true total only under full materialization, so a future
  cursor change must revisit it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tianzhou tianzhou merged commit 6c21cd5 into main Jun 25, 2026
4 checks passed
@tianzhou tianzhou deleted the feat/mcp-result-cap branch June 29, 2026 09:04