Skip to content

test: add trailing-slash corner-case#187

Closed
rmg wants to merge 1 commit intoisaacs:masterfrom
rmg:handle-bad-luck
Closed

test: add trailing-slash corner-case#187
rmg wants to merge 1 commit intoisaacs:masterfrom
rmg:handle-bad-luck

Conversation

@rmg
Copy link
Copy Markdown
Contributor

@rmg rmg commented Jun 19, 2018

The logic for checking if an entry is marked as a file but has a
trailing '/' in the path trips over files that happen to be under a
99-character path that is the perfect length for the 100-byte truncated
path to look like it is has a trailing / when it really doesn't.

<1: header, link, "././@LongLink">
<2: "full-path-to-file-inside-dir/filename">
<3: header, file, "first-99-bytes-of-full-path/">
<4: body of file>

The parser appears to treat the above as:

  1. parse link entry as normal
  2. parse metadata frame
  3. parse file entry as directory entry
  4. assume next frame is next entry

This happens with tarballs generated by GNU Tar, but it is not
related to the gnu tar format.

The way BSD tar generates the above is:

<frame: header, file, "path-after-the-slash", prefix>
<frame: body of file>

It appears that the root cause is GNU Tar doesn't use the prefix field
of the POSIX tar header and instead relies on the link frame and assumes
that the path in the following "real" file entry is mostly advisory so
it doesn't matter that it is truncated at 100 bytes.

BSD Tar, on the other hand, makes use of the prefix field and instead
only puts the file portion of the path in the path field.

The take away here is that if you are not using npm to tar up your
bundles, at least use BSD Tar instead of GNU Tar, because even after
fixing this there will be a long tail of broken versions of npm to
support.

The logic for checking if an entry is marked as a file but has a
trailing '/' in the path trips over files that happen to be under a
99-character path that is the perfect length for the 100-byte truncated
path to _look_ like it is has a trailing / when it really doesn't.

```
<1: header, link, "././@LongLink">
<2: "full-path-to-file-inside-dir/filename">
<3: header, file, "first-99-bytes-of-full-path/">
<4: body of file>
```

The parser appears to treat the above as:
 1. parse link entry as normal
 2. parse metadata frame
 3. parse file entry as directory entry
 4. assume next frame is next entry

This happens with tarballs generated by GNU Tar, but it is **not**
related to the `gnu` tar format.

The way BSD tar generates the above is:

```
<frame: header, file, "path-after-the-slash", prefix>
<frame: body of file>
```

It appears that the root cause is GNU Tar doesn't use the prefix field
of the POSIX tar header and instead relies on the link frame and assumes
that the path in the following "real" file entry is mostly advisory so
it doesn't matter that it is truncated at 100 bytes.

BSD Tar, on the other hand, makes use of the prefix field and instead
only puts the file portion of the path in the path field.

The take away here is that if you are not using npm to tar up your
bundles, at least use BSD Tar instead of GNU Tar, because even after
fixing this there will be a long tail of broken versions of npm to
support.
@rmg
Copy link
Copy Markdown
Contributor Author

rmg commented Jun 19, 2018

@isaacs this is the reduced test case I was describing to you

@rmg
Copy link
Copy Markdown
Contributor Author

rmg commented Jun 19, 2018

Here's some hex-dumps of the troublesome tarball:

https://gist.github.com/rmg/922ce5256e535a77682d19037e8fa622

@isaacs
Copy link
Copy Markdown
Owner

isaacs commented Jun 19, 2018

Thank you! I'll take a look soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants