differential timestamp compression when serializing processed profiles#5033
Conversation
|
I think we'll have add some kind of upgrader support for the serialized format if we rely on more of its structure. At the moment, the pipeline is: JSON.parse -> convert to unserialized -> run processed upgrader -> use processed format |
af75e6b to
9e7ffe2
Compare
mstange
left a comment
There was a problem hiding this comment.
I think this is a good change; JSON size reductions (gzipped or not) are welcome.
Please bump the processed format version, add an empty upgrader with a comment describing the change (and the fact that it only applies to the serialized format, and that the upgrader runs after unserialization and doesn't see a difference), and also update the markdown file that describes the format changes.
| function _serializeSamples({ time, ...restOfSamples }): any { | ||
| let lastTime; | ||
| return { | ||
| timeDeltas: time.map((t, i) => { | ||
| const timeDelta = i === 0 ? t : t - lastTime; | ||
| lastTime = t; | ||
| return timeDelta; | ||
| }), | ||
| ...restOfSamples, | ||
| }; | ||
| } | ||
|
|
||
| function _unserializeSamples({ timeDeltas, time, ...restOfSamples }): any { | ||
| let lastTime; | ||
| return { | ||
| time: | ||
| time || | ||
| timeDeltas.map((t, i) => { | ||
| return (lastTime = i === 0 ? t : lastTime + t); | ||
| }), | ||
| ...restOfSamples, | ||
| }; | ||
| } |
There was a problem hiding this comment.
In both _serializeSamples and _unserializeSamples you can initialize lastTime to zero and remove the i === 0 check.
| return { | ||
| time: | ||
| time || | ||
| timeDeltas.map((t, i) => { |
There was a problem hiding this comment.
please rename the t argument to delta
| time?: Milliseconds[], | ||
| timeDeltas: Milliseconds[], |
There was a problem hiding this comment.
Hmm, if you put an optional time field here, then I think it would make sense for the timeDeltas field to also be optional.
| @@ -1750,13 +1785,25 @@ export function serializeProfile(profile: Profile): string { | |||
| */ | |||
| function _unserializeProfile({ | |||
There was a problem hiding this comment.
Ok, so as long as the code in this function handles both the old and the new format, we don't need to add an upgrading step prior to unserialization. That seems ok for now but it won't scale forever.
| time?: Milliseconds[], | ||
| timeDeltas: Milliseconds[], |
There was a problem hiding this comment.
Please add a comment saying "Starting with version XX of the processed format, the time column may optionally be serialized as a timeDeltas column instead. Both formats are supported for unserialization."
| time: | ||
| time || | ||
| timeDeltas.map((t, i) => { | ||
| return (lastTime = i === 0 ? t : lastTime + t); |
There was a problem hiding this comment.
I'm not a big fan of having an assignment inside a return statement, but you can keep it if you really want it.
9e7ffe2 to
6670371
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #5033 +/- ##
=======================================
Coverage 88.41% 88.42%
=======================================
Files 304 304
Lines 27552 27574 +22
Branches 7450 7456 +6
=======================================
+ Hits 24361 24383 +22
Misses 2963 2963
Partials 228 228 ☔ View full report in Codecov by Sentry. |
|
|
||
| ### Version 50 | ||
|
|
||
| The serialized format now stores sample and counter sample times as time deltas instead of absolute timestamps to reduce the JSON size. The unserialized version is unchanged. |
There was a problem hiding this comment.
If I am somebody who outputs profiles, do I have to change my output?
We need to make a decision about what we want to say in this documentation: Should version 50 require the new format, or should version 50 say "both formats are supported"?
| // The serialized format now stores sample and counter sample times as | ||
| // time deltas instead of absolute timestamps to reduce the JSON size. | ||
| // The unserialized version is unchanged, and because the upgraders run | ||
| // after unserialization they see no difference. |
There was a problem hiding this comment.
Same here - does version 50 require the delta format? The three places where this is documented should be consistent.
6670371 to
60e5770
Compare
* main: (128 commits) differential timestamp compression when serializing processed profiles (firefox-devtools#5033) Add selectedMarker to the console APIs Fix the styling of the console message Fix the extra spaces in the console friendly message Review comments and some more tweaks Update tests Add a utility to save to disk the data Add a window console utility to retrieve the profile attached to the current tab Extract the function that unwraps a BrowserConnection from the BrowserConnectionStatus item Do not hardcode the user agent when creating the browser connection Send a UserAgent header to the symbolication server that identifies this request as coming from the Firefox Profiler. Bump micromatch from 4.0.7 to 4.0.8 (firefox-devtools#5101) Update all Yarn dependencies (2024-08-28) (firefox-devtools#5100) ⬆️ Update webpack to version 5.94.0 (firefox-devtools#5099) Run prettier after last change Update src/components/app/ZipFileViewer.js zip file -> archive Move the getTabToThreadIndexesMap selector logic into the profile-logic folder Refactor getInnerWindowIDToTabMap to not lookup for the top most parent unnecessarily Add some tests for the new getTabToThreadIndexesMap selector ...
[Florian Quèze] differential timestamp compression when serializing processed profiles (#5033) [joshuaobrien] Zip file viewer: omit profile extension if it is .json (#5079) [Paul Adenot] Allow typing '?' in `contenteditable` elements without having the help menu pop in (#5124) [Julien Wajsberg] Enable the Friulian locale (#5127) Thanks also to our localizers! fur: Fabio Tomat nl: simonmeulenbeek tr: Grk, Nazım Can Altınova
I was confused in this comment and in other comments on this PR. The processed format upgraders run before the conversion from the serialized format. The upgraders see timeDeltas, stringArray etc. I'll put up a PR to remove the misleading comment. Edit: PR #5285 |
|
(Nothing is broken, as far as I can tell. I was just confused on this point.) |
I asked for this comment in firefox-devtools#5033 but I was confused. The upgraders run before we convert the profile from the serialized format to the in-memory format.
I asked for this comment in firefox-devtools#5033 but I was confused. The upgraders run before we convert the profile from the serialized format to the in-memory format.
Just a WIP to share the code and have a deploy preview for feedback. Flow and tests don't pass (yet).
For https://profiler.firefox.com/public/k56pyq04p6yey31fxez6gex1ab53avx14ghmsb8/ (a power profile coming from the gecko profiler on Android) the gzipped size drops from 82.2MB to 77.9MB (5% win)
For https://profiler.firefox.com/public/qpws4tqwgah43y1q22p2e9nhyc3j6ma3wwe38nr/ (a power profile from a USB power meter with µs timestamp precision on the samples) the gzipped size drops from 134MB to 93.5MB (30% win).
Note: without the PR, I could serialize and compress 4h of this profile (resulting in a 165.6MB file I can download but not upload. Longer timer ranges failed with OOM errors). With the PR I had OOM errors when serializing 4h and I reduced the range a bit. 3h20min is the maximum time range of this profile that can be serialized with the PR without running into OOM errors.
For https://profiler.firefox.com/public/83t636whpa0h6vbcvxr32nz29q3wy116arvtfp0/ (a power profile from a smart power plug, with a perfectly consistent 1s sampling rate) the gzipped size drops from 61.8kB to 8.38kB (86% win).