feat: select and validate manifest#288
Conversation
| // SelectValidManifest looks in the targetDir for manifests declared in the | ||
| // release, reads and validates those found. The selection ensures that all | ||
| // valid manifests found are identical, so only one is kept and returned. | ||
| // | ||
| // A file matching a manifest path declared in the release is considered valid | ||
| // if it is a zstd file, readable by manifest.Read (with a known schema | ||
| // version) and successfully validated by manifestutil.Validate. | ||
| // | ||
| // Not finding any manifest (valid or not) means the targetDir cannot be | ||
| // considered as previously produced by Chisel for the given release. | ||
| // | ||
| // Finding only manifests with unknown schema version means the targetDir may | ||
| // have been produced by Chisel, but possibly by a future/incompatible version. | ||
| // Chisel cannot safely proceed and so errors out. | ||
| // | ||
| // Finding multiple manifests, with at least one valid means Chisel can proceed, | ||
| // ignoring unknown ones. |
There was a problem hiding this comment.
[Note to reviewer] This reworked comment addresses the following review comment:
This is not a useful comment. Any misinformed person can tell that what "select valid manifest" does is to "return a valid manifest". As a rule, we always try to write comments that inform what is not obvious from the code, and if we cannot do that, we avoid the comment. That's not the case here, though.. there's a lot to be said about where the manifest is being selected, how it's being selected, what does it even mean to be valid, etc. Succinct is good, but it needs to be clear, terse, informative.
Please spend a moment or two thinking about this and digesting it so I don't need to bother you too much with comments in the future. :)
From #264 (comment)
There was a problem hiding this comment.
The whole text is much better, thanks.
That last sentence is both repeating what was already said, and being slightly misleading in that it contradicts what was said earlier in the very first sentence. So perhaps just drop it.
| // ignoring unknown ones. | ||
| func SelectValidManifest(targetDir string, release *setup.Release) (*manifest.Manifest, error) { | ||
| targetDir = filepath.Clean(targetDir) | ||
| if !filepath.IsAbs(targetDir) { |
There was a problem hiding this comment.
[Note to reviewer]: Previous review comment:
That feels super sketchy. We pretty much never take into account the current directory, except on the very entry point of the overall process. This entry point is the CLI main function, not a library function deep into the implementation. The kinds of bugs that result from this kind of logic tend to go to the news.
From #264 (comment)
There was a problem hiding this comment.
This is replicating the piece of code at the beginning of Run, which is the only other place we take the current working dir into account. SelectValidManifest is part of the API exposed by slicer and is called at the same level as Run. SelectValidManifest is also expected to be called before Run. We also want both to target the same directory.
I wanted to avoid refactoring Run in this PR but what do you think of extracting this logic of "correcting" the targetDir in a function so it can be done in a consistent way in the 2 places we need to do it?
Otherwise, this logic could be extracted out of Run and moved in cmd_cut.go. However it would change the behavior of Run and would make it riskier to use I think.
From #264 (comment)
There was a problem hiding this comment.
This is just attempt to explain why the code looks like this without actually addressing the concern from the original comment:
That feels super sketchy. We pretty much never take into account the current directory, except on the very entry point of the overall process. This entry point is the CLI main function, not a library function deep into the implementation. The kinds of bugs that result from this kind of logic tend to go to the news.
This is also the third time I'm saying this in reviews or in our calls. Is the plan to push this for review until it passes? :)
If you've seen this being done elsewhere, then let's also fix that elsewhere, rather than making it worse.
There was a problem hiding this comment.
I have reworked it, with an hopefully safer approach.
The caller, for now only Execute in cmd_cut.go, is now responsible for doing the path modification if the received target dir is not absolute. To limit risks of cutting in an ambiguous/unexpected target dir, slicer.Run is still checking the received target dir is absolute and returns an internal error if not. Out of consistency SelectValidManifest is doing the same even though it is not writing anything, so this check is not strictly necessary.
| // | ||
| // Wildcard characters can only appear at the end as **, and the path before | ||
| // those wildcards must be a directory. | ||
| func validateManifestPath(path string, info setup.PathInfo) (string, error) { |
There was a problem hiding this comment.
[Note to reviewer]: This addresses the previous review comment:
This is too loose. Imagine what happens if path is "/hmm/oops**", for instance, or just "/oops".
We need to be strict about how a manifest path looks like. Hopefully this is already the case elsewhere so there's already a pattern to follow. If not, please let me know.
From #264 (comment)
There was a problem hiding this comment.
I added a validateManifestPath function verifying that the received path matches our definition of a valid manifest path before doing any manipulation on it. This function is now used in FindPaths and FindPathsInRelease. I also added a couple of tests to make sure the issue you pointed out was handled properly.
From #264 (comment)
| return pkgArchive, nil | ||
| } | ||
|
|
||
| type NoManifestError struct { |
There was a problem hiding this comment.
[Note to reviewer] This error type is returned in SelectValidManifest in the two situations where the caller can recover from the error, as it is done in cmd_cut.go.
I am conflicted on the usefulness of the message field. On one hand in the only call site of SelectValidManifest, the message is ignored. On the other hand, I think it makes sense for future consumers of this API to be able to display these different error messages.
What do you think?
(discussion initially started in #264 (comment))
There was a problem hiding this comment.
The field would be okay if we were making good use of it, but all this is being used for is two variations of "cannot find manifest file", so that's indeed not useful right now.
There was a problem hiding this comment.
Thanks. I replaced it with a named error.
| // ignoring unknown ones. | ||
| func SelectValidManifest(targetDir string, release *setup.Release) (*manifest.Manifest, error) { | ||
| targetDir = filepath.Clean(targetDir) | ||
| if !filepath.IsAbs(targetDir) { |
There was a problem hiding this comment.
This is just attempt to explain why the code looks like this without actually addressing the concern from the original comment:
That feels super sketchy. We pretty much never take into account the current directory, except on the very entry point of the overall process. This entry point is the CLI main function, not a library function deep into the implementation. The kinds of bugs that result from this kind of logic tend to go to the news.
This is also the third time I'm saying this in reviews or in our calls. Is the plan to push this for review until it passes? :)
If you've seen this being done elsewhere, then let's also fix that elsewhere, rather than making it worse.
| return pkgArchive, nil | ||
| } | ||
|
|
||
| type NoManifestError struct { |
There was a problem hiding this comment.
The field would be okay if we were making good use of it, but all this is being used for is two variations of "cannot find manifest file", so that's indeed not useful right now.
| // version) and successfully validated by manifestutil.Validate. | ||
| // | ||
| // Not finding any manifest (valid or not) means the targetDir cannot be | ||
| // considered as previously produced by Chisel for the given release. |
There was a problem hiding this comment.
We can drop this part. Not finding a valid manifest simply means we cannot find a valid and useful manifest to continue. We don't need to attempt to explain what it means otherwise. The manifest might have been produced by an old version of Chisel that we do not support anymore, or whatever else.
| // SelectValidManifest looks in the targetDir for manifests declared in the | ||
| // release, reads and validates those found. The selection ensures that all | ||
| // valid manifests found are identical, so only one is kept and returned. | ||
| // | ||
| // A file matching a manifest path declared in the release is considered valid | ||
| // if it is a zstd file, readable by manifest.Read (with a known schema | ||
| // version) and successfully validated by manifestutil.Validate. | ||
| // | ||
| // Not finding any manifest (valid or not) means the targetDir cannot be | ||
| // considered as previously produced by Chisel for the given release. | ||
| // | ||
| // Finding only manifests with unknown schema version means the targetDir may | ||
| // have been produced by Chisel, but possibly by a future/incompatible version. | ||
| // Chisel cannot safely proceed and so errors out. | ||
| // | ||
| // Finding multiple manifests, with at least one valid means Chisel can proceed, | ||
| // ignoring unknown ones. |
There was a problem hiding this comment.
The whole text is much better, thanks.
That last sentence is both repeating what was already said, and being slightly misleading in that it contradicts what was said earlier in the very first sentence. So perhaps just drop it.
| } | ||
| manifestPaths := manifestutil.FindPathsInRelease(release) | ||
| if len(manifestPaths) == 0 { | ||
| return nil, &NoManifestError{message: "no manifest generated for the release"} |
There was a problem hiding this comment.
This error doesn't sound right.. we are not generating manifests here.
| if foundUnknownSchema { | ||
| return nil, fmt.Errorf("cannot select a manifest: all manifests found use unknown schema") | ||
| } else { | ||
| return nil, &NoManifestError{message: "no valid manifest found in directory"} |
There was a problem hiding this comment.
"cannot find valid manifest file"
| } | ||
| if selected == nil { | ||
| if foundUnknownSchema { | ||
| return nil, fmt.Errorf("cannot select a manifest: all manifests found use unknown schema") |
There was a problem hiding this comment.
"cannot select a manifest: schema version is unknown"
| var selectedHash string | ||
| var selectedPath string | ||
| foundUnknownSchema := false | ||
| for _, manifestPath := range manifestPaths { |
There was a problem hiding this comment.
This loop might be more readable if the logic for a single file gets moved to its own function.
There was a problem hiding this comment.
I extracted it to a tryLoadManifest function, making the logic of SelectValidManifest a bit easier to reason about.
This PR is the first step to enable Chisel recuting a rootfs. It adds the ability to
select a manifest from the given rootfs based on the release. If a manifest is
found, the slices it lists are added to the slices requested by the user. The rest
of the cutting process is unchanged for now.
This work was split from #264.