Add --recurse mode to fedify lookup and fix separator output#608
Conversation
Added relationship-oriented recursion to `fedify lookup` with `--recurse` and `--recurse-depth`, including parser-level constraints. `--traverse` and `--recurse` are now mutually exclusive, and `--recurse-depth` requires `--recurse`. Implemented recursive chain fetching with visited tracking, depth limits, and best-effort behavior through `--suppress-errors`. Fixed separator rendering in lookup output paths so separators are printed as plain text without extra quotes in recurse/traverse and other multi-object flows. Expanded lookup tests to cover parser behavior, recursion semantics, cycle/depth handling, and error suppression, and updated CLI docs and changelog entries accordingly. Fixes fedify-dev#606 Co-Authored-By: OpenAI Codex <codex@openai.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
--recurse mode to fedify lookup and fix separator output
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable recursive lookup feature to the fedify lookup command and fixes an issue with separator rendering. The implementation is well-structured, with clear separation of concerns in the new helper functions like collectRecursiveObjects. The command-line options are thoughtfully designed to be mutually exclusive with traversal mode, and the addition of comprehensive tests ensures the new functionality is robust. The documentation updates are also clear and helpful.
There was a problem hiding this comment.
Pull request overview
Adds a new recursive lookup mode to fedify lookup for following object relationships (e.g., reply chains), and adjusts multi-object output so separators are emitted as plain text (unquoted).
Changes:
- Added
--recurse/--recurse-depthlookup mode with cycle protection, depth limiting, and best-effort behavior via--suppress-errors. - Reworked lookup-mode parsing to make
--traverseand--recursemutually exclusive and to require--recursewhen using--recurse-depth. - Updated output separator printing in recurse/traverse and other multi-object flows, plus docs/changelog and expanded CLI tests.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/cli/src/lookup.ts | Implements recurse mode, recursion helpers, and updates separator printing paths. |
| packages/cli/src/lookup.test.ts | Adds parser and recursion helper tests for recurse mode and error/cycle/depth behaviors. |
| packages/cli/src/config.ts | Extends lookup config schema with recurse and recurseDepth. |
| docs/cli.md | Documents --recurse, --recurse-depth, updated --suppress-errors, and usage notes. |
| CHANGES.md | Adds changelog entries for recurse feature and separator output fix. |
Codecov Report❌ Patch coverage is
... and 1 file with indirect coverage changes 🚀 New features to boost your workflow:
|
Align config validation with CLI option constraints for --recurse-depth. Ref: fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Update CLI documentation to match current implementation that accepts multiple inputs in traverse/recurse flows. Ref: fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Emit separators only before each printed object after the first to avoid duplicate separators in multi-object lookup output. Ref: fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Align config validation with CLI option constraints for --recurse-depth. fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Update CLI documentation to match current implementation that accepts multiple inputs in traverse/recurse flows. fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Emit separators only before each printed object after the first to avoid duplicate separators in multi-object lookup output. Ref: fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
6953933 to
ee07874
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a recursive lookup feature (--recurse) to the fedify lookup command, allowing users to traverse object relationships like reply chains, and also includes a fix for separator rendering in multi-object outputs. However, a critical security vulnerability exists in how output is handled when the --output option is used: the writeObjectToStream function truncates the output file on every call and leaks file descriptors, leading to data loss and potential resource exhaustion. It is recommended to refactor the output logic to open the file stream once and pass it to the writing function. Additionally, there are suggestions to improve code maintainability and simplify a helper function.
Open the output stream once in runLookup() and pass it to writeObjectToStream() so recursive and traversal flows do not truncate the file per object. Also close streams explicitly and add a regression test for stream reuse. fedify-dev#608 (comment) fedify-dev#608 (comment) fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Do not mark a recursive target as visited until the lookup succeeds. This keeps suppress-errors mode from blocking later attempts to the same target. Add a regression test for visited-set behavior when lookup fails in suppress mode. fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
Collapse getRecursiveTargetId() to a single return now that recurse properties are validated by the CLI parser. fedify-dev#608 (comment) Co-Authored-By: OpenAI Codex <codex@openai.com>
|
/gemini review |
Reject unsafe extension fragments containing path separators or parent-directory traversal markers before composing the temporary output path. This prevents path traversal through crafted image URLs in downloadImage. Also adds a regression test covering a traversal-style URL suffix. fedify-dev#608 (comment)
Handle quote recursion properties explicitly in getRecursiveTargetId instead of relying on an implicit non-reply fallback. This keeps the mapping self-documenting and returns null for unexpected values. Also adds a regression test that verifies unknown recurse properties fall back to null. fedify-dev#608 (comment)
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a significant new feature, --recurse, to the fedify lookup command, allowing for recursive traversal of object relationships. It also includes a fix for separator quoting in multi-object outputs and a crucial security enhancement that disallows private address lookups by default. The implementation is robust, with comprehensive tests covering new functionality, edge cases, and security considerations. My review includes suggestions to further improve security by adopting a safer default parameter, enhance maintainability by refactoring the process exit logic, and increase the clarity of the changelog.
Note: Security Review did not run due to the size of the PR.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
packages/cli/src/imagerenderer.ts:149
downloadImage()now validates URLs withvalidatePublicUrl()and manually follows up to 10 redirects (and will skip private/localhost targets and redirect chains). This is a user-visible behavior/security change for image rendering infedify lookup, but it isn’t mentioned in the PR description/CHANGES summary; please document it (or add a changelog entry) so users understand why some images may no longer render.
export async function downloadImage(url: string): Promise<string | null> {
try {
let targetUrl = url;
let response: Response | null = null;
for (let redirectCount = 0; redirectCount < 10; redirectCount++) {
await validatePublicUrl(targetUrl);
response = await fetch(targetUrl, { redirect: "manual" });
if (
response.status === 301 || response.status === 302 ||
response.status === 303 || response.status === 307 ||
response.status === 308
) {
const location = response.headers.get("location");
if (location == null) {
await response.body?.cancel();
return null;
}
await response.body?.cancel();
targetUrl = new URL(location, targetUrl).href;
continue;
}
break;
}
if (response == null) return null;
if (!response.ok) {
await response.body?.cancel();
return null;
}
const imageData = new Uint8Array(await response.arrayBuffer());
const extension = new URL(targetUrl).pathname.split(".").pop() || "jpg";
if (
extension.includes("/") || extension.includes("\\") ||
extension.includes("..")
) {
return null;
}
const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "fedify"));
const tempPath = path.join(tempDir, `image.${extension}`);
await fs.writeFile(tempPath, imageData);
return tempPath;
} catch (_error) {
return null;
}
Document that --allow-private-address only applies to explicit lookup/traverse targets, while recursive fetch steps always disallow private and localhost addresses. This updates both the lookup CLI docs and the @fedify/cli changelog entry for the recurse/private-address hardening. fedify-dev#608 (comment)
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a recursive lookup feature to fedify lookup and resolves a separator rendering bug. The implementation is robust, including new CLI options, updated documentation, and thorough tests. Security is well-addressed through a hardened image downloader and distinct document loaders for recursive fetches, which correctly restrict access to private addresses. The overall code quality is high, and the noted areas for maintainability enhancement, particularly by refactoring the large runLookup function and other readability improvements, remain valid.
Detect private/localhost URL validation failures and show actionable hints instead of always suggesting authorized fetch. - In lookup/traverse fetch failures, suggest -p/--allow-private-address when the failure is a private-address UrlError. - In recurse step failures, explain that recursive fetches always block private/localhost targets and suggest best-effort suppression or explicit non-recursive fetches. - Keep existing -a/--authorized-fetch guidance for non-UrlError cases. Also adds unit tests for the hint classification helper. fedify-dev#608 (comment) fedify-dev#608 (comment) fedify-dev#608 (comment)
Switch getRecursiveTargetId() to an explicit switch statement for recurseProperty handling. This keeps the mapping table-style and easier to extend as additional recurse properties are introduced. fedify-dev#608 (comment)
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new --recurse feature to the fedify lookup command, enabling users to recursively follow object relationships such as reply and quote chains up to a specified depth. It adds --recurse and --recurse-depth options, making them mutually exclusive with --traverse. The command's security is enhanced by disallowing private/localhost document loads by default, with a new -p/--allow-private-address option to re-enable this for explicit lookups, though recursive fetches strictly disallow private addresses. The --suppress-errors option now also applies to recursive operations. Code changes include updating configuration schemas to support new options, refactoring document loaders to incorporate allowPrivateAddress and cache isolation, and improving image downloading with redirect validation and path traversal prevention. The output handling for fedify lookup has been refactored for better stream management and error reporting, including specific hints for private address or authorization issues. Documentation and tests have been updated to reflect these new features and security measures.
UrlError can represent URL validation failures beyond private-address cases, so treating every UrlError as private produced incorrect hints. Also, printLookupFailureHint() suppressed all hints whenever authorized fetch was enabled, which hid private/localhost guidance. This change narrows private-address detection for UrlError and only suppresses the authorized-fetch hint when auth loading is already enabled. fedify-dev#608 (comment) fedify-dev#608 (comment) Co-Authored-By: Codex <codex@openai.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a powerful recursive lookup feature to the fedify lookup command and significantly hardens the CLI against Server-Side Request Forgery (SSRF) vulnerabilities. The implementation of --recurse and --recurse-depth is robust, with proper handling of depth limits and cycles. The security improvements, particularly in imagerenderer and the new default-deny policy for private addresses in lookup, are excellent. The code is well-structured, and the addition of comprehensive tests is commendable. I have one suggestion to improve security consistency across the CLI toolchain by changing a default parameter.
Change getDocumentLoader() to default allowPrivateAddress to false. This keeps commands that do not pass an explicit policy on the safer side by default and aligns the shared loader behavior with the hardening intent in this PR. fedify-dev#608 (comment) Co-Authored-By: Codex <codex@openai.com>
Use the last path segment to derive an extension and fall back to jpg for extensionless single-segment paths like /image. Keep rejecting unsafe or ambiguous paths by bailing out when no valid extension can be derived. This keeps the path-traversal safeguard intact while allowing the extensionless URL shape called out in review. fedify-dev#608 (comment) Co-Authored-By: Codex <codex@openai.com>
|
/gemini review |
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request introduces a recursive lookup feature (--recurse) to the fedify lookup command, allowing users to traverse object relationships like reply chains. It also includes significant security enhancements by preventing fetches to private/localhost addresses by default in both lookup and image downloading, with manual overrides for development. The implementation correctly handles redirects and validates URLs to prevent SSRF vulnerabilities. A bug related to quoted separator output in multi-object lookups has also been fixed. The changes are accompanied by thorough tests and documentation updates. The lookup command has been refactored to support different modes and manage output streams robustly.
Note: Security Review did not run due to the size of the PR.
In recurse and traverse error handling, authorized fetch previously suppressed all lookup hints and always fell back to the generic suppress-errors guidance. That hid actionable private/localhost explanations even though authorized fetch cannot bypass private-address blocking. Use hint classification to show suppress-errors guidance only for authorized-fetch failures with auth enabled, while still emitting private-address and recursive-private-address hints in auth mode. fedify-dev#608 (comment) fedify-dev#608 (comment) Co-Authored-By: Codex <codex@openai.com>
Some image URLs use nested extensionless paths such as /media/12345. Rejecting all such URLs made image rendering skip common CDN/proxy media endpoints. Derive the extension from the filename when present; otherwise, infer it from Content-Type (with a safe fallback). Keep blocking suspicious encoded traversal patterns in the pathname. fedify-dev#608 (comment) Co-Authored-By: Codex <codex@openai.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a powerful recursive lookup feature to fedify lookup and significantly hardens the CLI against security vulnerabilities like SSRF and path traversal. The changes are well-structured, with clear separation of concerns, robust error handling, and comprehensive tests. The addition of --recurse and --recurse-depth is well-documented, and the fix for separator output improves the user experience for multi-object lookups. The security enhancements, particularly the strict handling of private addresses in recursive fetches and image downloads, are excellent. The code quality is high, and the changes are a great addition to the project.
Note: Security Review did not run due to the size of the PR.
Summary
This PR implements recursive relationship lookup for fedify lookup and fixes separator rendering in multi-object output flows.
--recurse=<PROPERTY_OR_IRI>to recursively follow object relationships.replyTarget.https://www.w3.org/ns/activitystreams#inReplyTo.inReplyTo.--recurse-depth=<N>with a default depth of20.--recurse-depthrequires--recurse.--traverseand--recursecannot be used together.--suppress-errorsbehavior to recurse mode for best-effort chain traversal.Tests
Added and expanded tests in packages/cli/src/lookup.test.ts for parser behavior (
--recurse,--recurse-depthdependency,traverse/recurseconflict), accepted/rejected recurse property forms, recursion behavior (depth limit, cycle handling), error behavior with and without--suppress-errors, and recursive target resolution helper behavior.Manual verification