fetch: harden against SSRF (private/loopback/metadata block, scheme block, redirect revalidation) #4205
Open
JAE0Y2N wants to merge 1 commit into
…heme block, redirect revalidation)

The fetch tool previously passed any user-controlled URL directly to httpx.get with follow_redirects=True. Combined with the model-callable nature of MCP tools, this enabled an attacker who could influence a prompt to make the host fetch arbitrary local-network or cloud-metadata resources. On EC2 hosts this is enough to read the IAM-role token via http://169.254.169.254/latest/meta-data/iam/...; on developer laptops it exposes localhost services; on office networks it exposes RFC1918 internal services. The file:// scheme is also honored by httpx and lets the tool double as a local-file read primitive.

Defenses landed:

* assert_url_safe_or_raise() rejects non-http(s) schemes (file://, gopher://, dict://, ftp://, …) and resolves the hostname to its full IP set, rejecting the URL if any resolved IP falls in loopback / link-local / RFC1918 / multicast / reserved / unspecified ranges. "Any IP in the set is blocked" matters because httpx's connection pool may pick any A/AAAA record returned by DNS, so a single loopback entry is enough to win the race.
* fetch_url() now follows redirects manually with follow_redirects=False and per-hop revalidation, so a public URL that 302s to http://127.0.0.1/ is rejected at the second hop rather than silently followed. Redirect chains are bounded by MAX_REDIRECTS to prevent loops.
* New --allow-private-networks CLI flag (defaults off) for the legitimate use cases: developer-loop tooling and internal-network scraping behind a trusted egress allowlist. The scheme block is unconditional even when this flag is set.
* 22 new tests covering each rejection class (literal IP, hostname resolution, multi-A-record, DNS-failure, redirect revalidation, opt-in escape hatch). Ruff + pyright clean.

Known residual risk: DNS rebinding between resolution and connect. The hostname → IP check is best-effort. For higher assurance, run the fetch server behind a network egress filter that enforces the same policy.
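The validation gate described in the commit message can be illustrated with a minimal, stdlib-only sketch. This is not the code from the patch: the names `assert_url_safe_sketch`, `blocked_reason`, and `UnsafeUrlError`, as well as the injectable `resolve` parameter, are hypothetical and exist only to make the example testable without network access. It shows the three ideas from the description: an unconditional scheme allowlist, fail-closed handling of empty hostnames and DNS errors, and rejection when any resolved address falls in a blocked range.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


class UnsafeUrlError(ValueError):
    """Hypothetical error type; the PR reports rejections as INVALID_PARAMS."""


def blocked_reason(ip):
    """Return a short label if `ip` is in a blocked range, else None."""
    addr = ipaddress.ip_address(ip)
    # Judge IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) by their IPv4 part.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    checks = [
        ("unspecified", addr.is_unspecified),
        ("loopback", addr.is_loopback),
        ("link-local", addr.is_link_local),
        ("multicast", addr.is_multicast),
        ("reserved", addr.is_reserved),
        ("private", addr.is_private),
    ]
    for label, hit in checks:
        if hit:
            return label
    return None


def assert_url_safe_sketch(url, allow_private_networks=False,
                           resolve=socket.getaddrinfo):
    """Raise UnsafeUrlError unless `url` passes the scheme and IP checks."""
    parts = urlsplit(url)
    # Scheme allowlist: applied even when private networks are allowed.
    if parts.scheme not in ALLOWED_SCHEMES:
        raise UnsafeUrlError(f"scheme {parts.scheme!r} is not allowed")
    host = parts.hostname
    if not host:
        raise UnsafeUrlError("URL has no hostname")
    if allow_private_networks:
        return
    try:
        infos = resolve(host, None, type=socket.SOCK_STREAM)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        raise UnsafeUrlError(f"DNS resolution failed for {host!r}") from exc
    if not infos:
        raise UnsafeUrlError(f"no addresses found for {host!r}")
    # "Any" semantics: one blocked record is enough to reject the URL.
    for info in infos:
        ip = info[4][0].split("%")[0]  # drop an IPv6 zone id if present
        reason = blocked_reason(ip)
        if reason:
            raise UnsafeUrlError(f"target IP {ip} is a {reason} address")
```

Under these assumptions, `assert_url_safe_sketch("file:///etc/passwd")` raises on the scheme check, and a hostname that resolves to both a public address and `127.0.0.1` is rejected because of the loopback record. Like the check described above, the sketch is still exposed to DNS rebinding between validation and connect.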
The fetch tool currently passes any user-controlled URL straight to `httpx.get` with `follow_redirects=True`. Combined with the model-callable nature of MCP tools, this turns the server into a confused-deputy SSRF primitive whenever someone can influence a prompt: a malicious string fed to the model can be turned into a request against `http://127.0.0.1`, `http://169.254.169.254/latest/meta-data/iam/...`, RFC1918 ranges, or `file:///etc/passwd` (httpx honors `file://`). On EC2 deployments the IAM-credential exfiltration path is the most concerning; on developer laptops the local-services exposure is the most common; on office networks the internal-services exposure matters.

I have read the SECURITY.md note that the reference servers aren't bounty-eligible, so I'm submitting this as a hardening contribution rather than a vulnerability report. The patch keeps behavior identical for normal external-URL fetches and adds a backstop for the three concrete bypass surfaces: private, loopback, and metadata addresses; non-HTTP schemes; and redirects.
What landed
`assert_url_safe_or_raise(url, allow_private_networks)` is now called before every outbound request:

* The scheme must be in `{http, https}`. Everything else (`file://`, `gopher://`, `dict://`, `ftp://`, …) raises `INVALID_PARAMS`. This is unconditional: the opt-in flag below does not unlock it.
* The hostname is resolved with `socket.getaddrinfo` and every returned IP is checked. If any is loopback / link-local / RFC1918 / multicast / reserved / unspecified, the URL is rejected. The "any" semantics matter because `httpx`'s pool may pick any A/AAAA entry, so a single loopback record in the response is enough to win a race.
* Empty hostnames (`http:///path`) and DNS failures are treated as rejection, not fail-open.

`fetch_url` now follows redirects manually with `follow_redirects=False` and re-runs `assert_url_safe_or_raise` against each `Location` header. Without this, a public 302 to `http://127.0.0.1/` would silently bypass the gate. Redirect chains are bounded by `MAX_REDIRECTS = 10`.

`--allow-private-networks` is a new opt-in CLI flag for the legitimate use cases (developer-loop tooling, internal-network scraping behind a trusted egress proxy). When set, the IP-range check is skipped, but the scheme block stays.

Tests
22 new tests in `TestAssertUrlSafe` and `TestFetchUrlRedirectRevalidation` covering:

* `file://`, `gopher://`, `dict://` scheme rejection
* IPv4 loopback (`127.0.0.1` + `127.255.255.254`)
* IPv6 loopback (`[::1]`)
* cloud-metadata address (`169.254.169.254`, link-local)
* RFC1918 (`10.0.0.5`, `192.168.1.1`, `172.20.0.1`)
* unspecified address (`0.0.0.0`)
* … `127.0.0.1` is rejected
* … `MAX_REDIRECTS`

All 41 tests pass (19 existing + 22 new). `ruff check` clean, `pyright` clean.

Known residual
The hostname → IP check is best-effort against DNS rebinding: the IP seen at validation may differ from the IP that `httpx` connects to. For full assurance, run the fetch server behind a network egress filter that enforces the same policy at the connection layer. This is documented in the docstring.

Manual reproducer (for reviewers who want to verify)
Before the patch (with bogus env API keys), starting the fetch server and asking the model:
returns the EC2 metadata service response. After the patch the same request raises `INVALID_PARAMS` with the message "target IP 169.254.169.254 is a link-local address (covers cloud-metadata endpoints)".

Jaeyoung Yun
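For reviewers, the per-hop redirect revalidation described under "What landed" can be sketched without a network roughly as follows. This is an illustration, not the patch itself: `fetch_with_revalidation`, `RedirectError`, and the injected `send` and `validate` callables are hypothetical, and `send` is assumed to return a `(status, headers, body)` tuple with lower-cased header names. The point it shows is that every hop, including the first, is validated before a request is issued, and that the chain length is bounded.

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 10  # value taken from the PR description
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class RedirectError(ValueError):
    """Hypothetical error for malformed or overly long redirect chains."""


def fetch_with_revalidation(url, send, validate, max_redirects=MAX_REDIRECTS):
    """Follow redirects manually, validating each URL before requesting it.

    send(url) -> (status, headers, body); assumed not to follow redirects.
    validate(url) raises if the URL is unsafe.
    """
    # One initial request plus at most `max_redirects` follow-ups.
    for _ in range(max_redirects + 1):
        validate(url)  # runs before every outbound request
        status, headers, body = send(url)
        if status not in REDIRECT_STATUSES:
            return url, status, body
        location = headers.get("location")
        if not location:
            raise RedirectError(f"redirect from {url} has no Location header")
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    raise RedirectError(f"more than {max_redirects} redirects")
```

With httpx, `send` would presumably wrap a call such as `httpx.get(url, follow_redirects=False)` and adapt the response to the tuple shape assumed here. Because `validate` runs at the top of the loop, a public URL that redirects to `http://127.0.0.1/` is rejected before any request reaches the loopback address.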