fix(anonymizer): truncate output files on open to prevent stale JSON#8248
fix(anonymizer): truncate output files on open to prevent stale JSON#8248Retr0-XD wants to merge 1 commit intojaegertracing:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes the anonymizer/uiconv tooling so repeated runs targeting the same output path don’t leave trailing bytes from previous runs (which can corrupt JSON output).
Changes:
- Add
os.O_TRUNCwhen opening the captured spans output file. - Add
os.O_TRUNCwhen opening the anonymized spans output file. - Add
os.O_TRUNCwhen opening the UI converter output file.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| cmd/anonymizer/app/writer/writer.go | Truncate captured/anonymized output files on open to prevent stale trailing JSON. |
| cmd/anonymizer/app/uiconv/extractor.go | Truncate UI output file on open to prevent stale trailing JSON. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cmd/anonymizer/app/writer/writer.go
Outdated
| cf, err := os.OpenFile(config.CapturedFile, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.ModePerm) | ||
| if err != nil { |
There was a problem hiding this comment.
os.OpenFile(..., os.ModePerm) creates the output JSON files with mode 0777 (subject to umask), which can unintentionally set executable bits and make captured/anonymized trace data world-readable/writable. Consider using a more appropriate default like 0o600 (safer for trace data) or 0o644 (typical for JSON output).
cmd/anonymizer/app/writer/writer.go
Outdated
| cf, err := os.OpenFile(config.CapturedFile, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.ModePerm) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("cannot create output file: %w", err) | ||
| } |
There was a problem hiding this comment.
The error returned on failure to open the captured file is always cannot create output file, which makes it hard to tell which path failed when multiple files are opened here. Include config.CapturedFile (and similarly for the anonymized file) in the error message to simplify troubleshooting.
| // newExtractor creates extractor. | ||
| func newExtractor(uiFile string, traceID string, reader *spanReader, logger *zap.Logger) (*extractor, error) { | ||
| f, err := os.OpenFile(uiFile, os.O_CREATE|os.O_WRONLY, os.ModePerm) | ||
| f, err := os.OpenFile(uiFile, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.ModePerm) |
There was a problem hiding this comment.
os.OpenFile(..., os.ModePerm) creates the UI JSON output file with mode 0777 (subject to umask), which can set executable bits and expose trace data more broadly than intended. Consider using a tighter/default mode such as 0o600 (safer) or 0o644.
| f, err := os.OpenFile(uiFile, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.ModePerm) | |
| f, err := os.OpenFile(uiFile, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o600) |
| f, err := os.OpenFile(uiFile, os.O_CREATE|os.O_WRONLY, os.ModePerm) | ||
| f, err := os.OpenFile(uiFile, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.ModePerm) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("cannot create output file: %w", err) |
There was a problem hiding this comment.
The error message cannot create output file doesn't include the uiFile path, which can make diagnosing failures harder when this tool is wired into scripts. Consider including the filename in the error message.
| return nil, fmt.Errorf("cannot create output file: %w", err) | |
| return nil, fmt.Errorf("cannot create output file %q: %w", uiFile, err) |
When running the anonymizer tool multiple times pointing to the same output paths, the files were opened with O_CREATE|O_WRONLY but without O_TRUNC. This means any previous content is kept in place and new data is written on top. If the new output happens to be shorter than the old one, the leftover bytes at the end produce malformed JSON that cannot be parsed. Added os.O_TRUNC to all three os.OpenFile calls across writer.go and extractor.go so the file is always zeroed out before writing begins. Also tightened the file creation mode from os.ModePerm (0777) to 0o600 so output files containing potentially sensitive trace data are not world-readable, and included the file path in each error message to make failures easier to diagnose. Fixes: jaegertracing#8231 Signed-off-by: Sakthi Harish <sakthi.harish@edgeverve.com>
e6be8ea to
a72cc6f
Compare
|
there is already an open PR, closing as a dupe. |
What does this PR do?
When you run the anonymizer tool more than once and point it to the same output files, the old content isn't cleared before writing starts. The files are opened with
O_CREATE|O_WRONLYbut withoutO_TRUNC, so if a previous run wrote more data than the current one, the leftover bytes stick around at the end and produce invalid JSON.This adds
os.O_TRUNCto all threeos.OpenFilecalls — two inwriter.go(for the captured and anonymized output files) and one inextractor.go(for the UI converter output). That ensures the file is always empty when we start writing.Which issue(s) does this PR fix?
Fixes #8231
Checklist
t.TempDir()which always starts with an empty dir, so no test assertions are affected)Signed-off-by: Sakthi Harish sakthi.harish@edgeverve.com