Skip to content

fix(cm-client): preserve broken symlinks when zipping repositories#1468

Merged
solaris007 merged 7 commits intomainfrom
fix/cm-client-broken-symlinks
Apr 15, 2026
Merged

fix(cm-client): preserve broken symlinks when zipping repositories#1468
solaris007 merged 7 commits intomainfrom
fix/cm-client-broken-symlinks

Conversation

@rpapani
Copy link
Copy Markdown
Contributor

@rpapani rpapani commented Mar 25, 2026

Summary

  • Replace zip-lib's archiveFolder with direct yazl usage in zipRepository to handle broken symlinks gracefully. zip-lib internally calls fs.stat() on symlink targets to determine file type, which throws ENOENT for broken (dangling) symlinks — causing the entire code import to fail.
  • The new approach uses lstat (never stat) to read symlink metadata and stores symlinks via yazl.addBuffer() with their original mode bits, preserving them exactly as-is in the zip regardless of whether the target exists.
  • Symlinks escaping the repository root still fail the import (security check unchanged).
  • Broken symlinks within the repo root now log a warning and are packaged as-is, ensuring the repository ZIP is an exact copy of the cloned ref.

Context

Some customer repos may have broken symlinks on certain branches, which blocks code zipping and fails the entire code import. This fix allows zipping to proceed with broken symlinks as long as there are no security concerns (symlinks stay within the repo root).

Test plan

  • Unit tests: 56 passing, 100% coverage
  • Round-trip verified against a repo with broken symlinks: zip → extract → confirmed broken symlink preserved as actual symlink with correct target
  • Deploy to dev and run code import for affected site
  • Verify prod code import succeeds after release

🤖 Generated with Claude Code

Replace zip-lib archiveFolder with direct yazl usage in zipRepository
to handle broken (dangling) symlinks. zip-lib calls fs.stat() on symlink
targets which throws ENOENT for broken symlinks, failing the entire code
import. The new approach uses lstat to preserve symlinks as-is in the zip.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rpapani rpapani requested review from ramboz and solaris007 and removed request for ramboz March 25, 2026 04:03
Copy link
Copy Markdown
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @rpapani,

Good fix - broken symlinks crashing the entire code import pipeline is a real production pain point, and dropping down from zip-lib to yazl with lstat is the right approach.

Strengths

  • Correct root cause fix (src/index.js:393-394): zip-lib's archiveFolder calls fs.stat() on symlink targets, which throws ENOENT for dangling symlinks. Using yazl directly with lstatSync solves this at the right layer.
  • Security boundary preserved (src/index.js:326-335): #validateSymlinks still runs before any zipping and still throws on symlinks escaping the repo root. The security gate is unchanged.
  • Clean separation of concerns (src/index.js:349-371): #addDirToZip does exactly one thing and delegates validation to #validateSymlinks. The two-pass approach keeps the security boundary clear.
  • Stream error handling (src/index.js:392-397): Both the yazl output stream and the write stream have error handlers wired to the rejection path. No silent failures.
  • Thorough test coverage: The broken-symlink test sets up a realistic nested directory structure and asserts on buffer content, metadata path, mode bits, and mtime. Four focused tests replacing two generic ones is a net positive.
  • Low supply chain risk: yazl 3.3.1 is a mature, widely-used library (~2M weekly npm downloads) with a single transitive dependency (buffer-crc32). No known vulnerabilities.

Issues

Important (Should Fix)

  1. Symlinks are stored as regular file content, not as ZIP symlink entries (src/index.js:360)

    addBuffer(Buffer.from(linkTarget), metadataPath, { mode: stat.mode }) writes the symlink target path as the file's content and sets mode to 0o120777. The ZIP format has no native symlink type - this relies on the extractor interpreting Unix external attributes to reconstruct symlinks. Standard ZIP extractors (including zip-lib's extract used in unzipRepository) will extract this as a regular file containing the target path string, not as an actual symlink.

    This is not a security issue (the content is just a short path string, and the boundary check prevents escaping), but the PR claims "preserving them exactly as-is" - that claim depends entirely on the downstream consumer. If consumers expect working symlinks after extraction (e.g., Apache Dispatcher enabled_farms -> available_farms patterns), they will get regular files instead.

    Suggestion: Confirm with the downstream team (Cloud Manager pipeline) that this encoding is handled correctly on extraction. At minimum, document this behavior in the zipRepository JSDoc so future engineers know the contract.

  2. Error test may be passing for the wrong reason (test/cloud-manager-client.test.js - throws when yazl zip fails)

    The throws when yazl zip fails and cleans up temp dir test configures the mock pipe to emit 'write failed', but it never configures readdirSyncStub for the test's clonePath. After beforeEach resets the stub, readdirSync() returns undefined. When #validateSymlinks iterates over undefined with for...of, it throws a TypeError before yazl is ever used - so the test may be passing because of a TypeError rather than the intended stream error.

    Fix: Add readdirSyncStub.withArgs(clonePath, { withFileTypes: true }).returns([]); at the top of this test case so the validation and walk phases complete cleanly and the test actually exercises the stream error path.

Minor (Nice to Have)

  1. Empty directories are silently dropped (src/index.js:352-370)

    #addDirToZip recurses into directories but never adds directory entries themselves. If a directory is empty, it will be absent from the zip. In practice this is unlikely to matter (git doesn't track empty directories), but since the stated goal is "exact copy of the cloned ref," consider calling zip.addEmptyDirectory() when a directory has zero children, or document that empty directories are intentionally omitted.

  2. Missing test for writable stream error path (src/index.js:393)

    The promise has two reject wires - output.on('error') and zip.outputStream.on('error'). Only the outputStream error path is tested. A disk-full or permission-denied scenario would trigger the output error handler. Consider adding a test for that path.

  3. zip-lib is now used only for extract (package.json:41)

    Not blocking, but the package now carries two ZIP libraries. Consider migrating unzipRepository to yauzl (yazl's companion) in a follow-up to drop zip-lib entirely.

Recommendations

  • Document the symlink encoding contract in the JSDoc: "Symlinks are stored as regular entries whose content is the link target and whose mode preserves S_IFLNK. Consumers must check mode bits to restore symlinks."
  • Consider combining #validateSymlinks and #addDirToZip into a single pass in a follow-up - this would eliminate duplicate readlinkSync/readdirSync calls and close the (theoretical) TOCTOU gap between validation and archiving.
  • Consider consolidating to yazl/yauzl and dropping zip-lib as a follow-up.

Assessment

Ready to merge? Yes, with minor fixes.

The core fix is correct and well-tested. The security boundary is maintained - symlink escape validation runs before zipping, lstat prevents symlink following, and stored content for symlinks is just the target path string (not file contents). The yazl dependency is low-risk. The main items to address are: (1) document or confirm the symlink-as-regular-file behavior with downstream consumers, and (2) fix the error test configuration so it exercises the intended code path. Neither is a merge blocker if you confirm the downstream behavior is acceptable.

Copy link
Copy Markdown

@ramboz ramboz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'd address the detailed PR feedback first though

Resolve conflicts in package.json (take newer dep versions + keep yazl),
src/index.js (keep both lstatSync and mkdirSync imports), and test file
(keep both stubs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

This PR will trigger a patch release when merged.

@fpsqdb
Copy link
Copy Markdown

fpsqdb commented Apr 14, 2026

Hi there!

I’m the maintainer of zip-lib. I happened to notice that your project is using zip-lib, and it's great to see more projects adopting it!

By looking through your recent commits and PRs, I noticed some of the challenges you encountered with the library. Based on that, I’ve just released zip-lib v1.3.2, which includes several updates:

  • Buffer Support: Added support for compressing to and decompressing from Buffers.
  • Broken Symlinks: Added support for handling broken symbolic links during compression.
  • Bug Fixes: Included various fixes and stability improvements.

Feel free to give the new version a try! If you find any other issues or have requests for missing features, please don't hesitate to open an issue or submit a PR on our repository.

Happy coding! 🚀

@solaris007 solaris007 added the bug Something isn't working label Apr 14, 2026
…symlinks

# Conflicts:
#	packages/spacecat-shared-cloud-manager-client/package.json
#	packages/spacecat-shared-cloud-manager-client/test/cloud-manager-client.test.js
Replace direct yazl usage with zip-lib v1.3.2 which now natively
handles broken symlinks when followSymlinks is false. This
simplifies the implementation - archiveFolder returns a Buffer
directly, eliminating the need for temp files and manual directory
walking.

The symlink security validation (escape-root check) and broken
symlink warning log are preserved.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extract method supports a Buffer as an input and the safeSymlinksOnly option.
By default (false), it only prevents path traversal (writing outside the directory via symlinks). If set to true, it strictly rejects any symlink whose target is outside the extraction directory. Using safeSymlinksOnly: true we can remove the manual #validateSymlinks check after extract.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @fpsqdb, thank you so much for reaching out - and for shipping v1.3.2 with these fixes! It's genuinely rare for a maintainer to notice a downstream workaround and proactively release a fix. We really appreciate it.

We've updated this PR to use zip-lib v1.3.2 throughout - both the buffer-returning archiveFolder and the buffer-accepting extract with safeSymlinksOnly. It simplified our code significantly (deleted ~200 lines of manual yazl wiring). Your suggestion on the extract API was spot on.

If we run into anything else, we'll open an issue on your repo. Keep up the great work!

Use zip-lib v1.3.2's buffer-accepting extract and safeSymlinksOnly
option to simplify unzipRepository:

- extract accepts Buffer directly (no temp zip file needed)
- safeSymlinksOnly: true replaces manual validateSymlinks on extract
- Removes ZIP_DIR_PREFIX constant (no longer needed)

Suggested-by: fpsqdb (zip-lib maintainer)
Copy link
Copy Markdown
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @rpapani,

All items from the original review are resolved by the switch to zip-lib v1.3.2 - symlink encoding, error test concerns, empty directories, dual-lib overhead - all gone. The implementation is significantly simpler now and the security boundary is preserved.

Ready to ship.

Relax exact fetch call count assertion (2) to at.least(2) -
CI timing can cause an extra retry, making the count 3.
The test still verifies the warmup delay and successful response.
@solaris007 solaris007 merged commit 943662a into main Apr 15, 2026
5 checks passed
@solaris007 solaris007 deleted the fix/cm-client-broken-symlinks branch April 15, 2026 13:05
solaris007 pushed a commit that referenced this pull request Apr 15, 2026
## [@adobe/spacecat-shared-cloud-manager-client-v1.1.3](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-cloud-manager-client-v1.1.2...@adobe/spacecat-shared-cloud-manager-client-v1.1.3) (2026-04-15)

### Bug Fixes

* **cm-client:** preserve broken symlinks when zipping repositories ([#1468](#1468)) ([943662a](943662a))
@solaris007
Copy link
Copy Markdown
Member

🎉 This PR is included in version @adobe/spacecat-shared-cloud-manager-client-v1.1.3 🎉

The release is available on:

Your semantic-release bot 📦🚀

solaris007 pushed a commit that referenced this pull request Apr 15, 2026
## [@adobe/spacecat-shared-tokowaka-client-v1.13.4](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-tokowaka-client-v1.13.3...@adobe/spacecat-shared-tokowaka-client-v1.13.4) (2026-04-15)

### Bug Fixes

* **cm-client:** preserve broken symlinks when zipping repositories ([#1468](#1468)) ([943662a](943662a))
@solaris007
Copy link
Copy Markdown
Member

🎉 This PR is included in version @adobe/spacecat-shared-tokowaka-client-v1.13.4 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working released

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants