Add the ability to specify frame alignment #20
I wouldn't be surprised if there are bugs in this implementation. This PR is meant to be more of a conversation-starter than something that can be merged as-is.

Very interesting, thanks. Do you have a use case for aligned frames? I like the idea but suspect that it will be rarely useful in practice.

The idea is based on the fact that files are not really random-access: the kernel reads in page-sized units, so in effect you can only seek to a multiple of 4 KiB. So if the frame you want isn't 4 KiB-aligned then the kernel will end up reading some useless data from the end of the previous frame. This doesn't matter much if you seek rarely, so this
This PR is a proof-of-concept which adds a `--align=<n>` flag which forces all frames to be aligned to an `n`-byte boundary in the resulting file. It does this by inserting skippable frames as padding.

Demo
This is the result of running with `--align=1M`. The odd-numbered frames are the regular ones (the ones you'd want to read from). Notice how they all start on 1 MiB boundaries. The even-numbered frames are skippable "padding" frames.

Motivation
In theory, setting `--align=4K` should reduce read amplification and page cache usage when seeking. The effect would be more noticeable with smaller frames. E.g. if you wanted a perfectly seekable file, you'd want 4 KiB frames. In that case, a misaligned frame would straddle two pages, so reading it would pull in 8 KiB instead of 4 KiB: 100% amplification. I haven't measured anything though, so I don't really know how big the effect is in practice.
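
To make the arithmetic concrete, here is a rough Python sketch of the two ideas above: how large a padding frame has to be to reach the next boundary, and how many pages a read touches. It is not the code from this PR. It assumes the zstd skippable-frame layout (4-byte magic in the range `0x184D2A50`..`0x184D2A5F`, then a 4-byte little-endian payload size), and the names `padding_frame` and `pages_touched` are made up for illustration.

```python
import struct

SKIPPABLE_MAGIC = 0x184D2A50  # assumed: zstd skippable-frame magic (low nibble is free)
HEADER_SIZE = 8               # 4-byte magic + 4-byte little-endian payload size


def padding_frame(offset: int, align: int) -> bytes:
    """Return a skippable frame that advances `offset` to a multiple of `align`.

    Hypothetical helper: returns b"" if `offset` is already aligned.
    """
    gap = -offset % align
    if gap == 0:
        return b""
    # A skippable frame can't be smaller than its 8-byte header, so if the
    # gap is too small we pad up to a later boundary instead.
    while gap < HEADER_SIZE:
        gap += align
    payload = gap - HEADER_SIZE
    return struct.pack("<II", SKIPPABLE_MAGIC, payload) + bytes(payload)


def pages_touched(offset: int, length: int, page: int = 4096) -> int:
    """Number of `page`-sized pages covered by a read of `length` bytes at `offset`."""
    first = offset // page
    last = (offset + length - 1) // page
    return last - first + 1


if __name__ == "__main__":
    print(len(padding_frame(5000, 4096)))  # 3192, since 5000 + 3192 == 8192
    print(pages_touched(4096, 4096))       # 1: an aligned 4 KiB frame
    print(pages_touched(5000, 4096))       # 2: a misaligned 4 KiB frame
```

The small-gap case is the only subtle part: when fewer than 8 bytes remain before the boundary, the sketch skips to a later boundary rather than emitting a truncated header. Whether the PR handles that case the same way is not something the description says.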