more: constant memory initialization overhead#7765
Conversation
442ddb7 to
7649214
Compare
|
waiting for #7680 to be merged |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
4f212a6 to
a981d4c
Compare
|
GNU testsuite comparison: |
| const USAGE: &str = help_usage!("more.md"); | ||
| const BELL: &str = "\x07"; | ||
| const BELL: char = '\x07'; | ||
| const MULTI_FILE_TOP_PROMPT: &str = "\r::::::::::::::\n\r{}\n\r::::::::::::::\n"; |
There was a problem hiding this comment.
as a bit cryptic, could you please add a comment for this? thanks
There was a problem hiding this comment.
Will do @sylvestre. Please note that this PR is still WIP and not ready for full review. I'm awaiting my previous PR to be merged. Once that is done, I can better compare changes to main.
I will also do a writeup on the changes made, benchmarking, and add comments in the code.
|
GNU testsuite comparison: |
|
@tertsdiepraam @sylvestre here is how I plan to refactor
Would be great to hear your feedback! |
61ddf17 to
148d6a2
Compare
|
GNU testsuite comparison: |
148d6a2 to
310d807
Compare
|
Benchmark < /dev/urandom base64 | fold -w 120 | head -c 100M > bigfile
< /dev/urandom base64 | fold -w 120 | head -c 1M > smallfile
cargo build --no-default-features --features more --release
# file
/usr/bin/time ./target/release/coreutils more bigfile
/usr/bin/time ./target/release/coreutils more smallfile
# pipe
/usr/bin/time bash -c 'cat bigfile | ./target/release/coreutils more'
/usr/bin/time bash -c 'cat smallfile | ./target/release/coreutils more'
# in `main`
# bigfile: 0.37user 0.11system 0:01.81elapsed 26%CPU (0avgtext+0avgdata 126796maxresident)k 0inputs+0outputs (0major+31065minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.21elapsed 0%CPU (0avgtext+0avgdata 4540maxresident)k 0inputs+0outputs (0major+452minor)pagefaults 0swaps
# bigfile(pipe): 0.37user 0.15system 0:01.97elapsed 26%CPU (0avgtext+0avgdata 126832maxresident)k 0inputs+0outputs (0major+31430minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.01system 0:01.63elapsed 1%CPU (0avgtext+0avgdata 4344maxresident)k 0inputs+0outputs (0major+812minor)pagefaults 0swaps
# in `more-mem-constant`
# bigfile: 0.00user 0.00system 0:01.04elapsed 0%CPU (0avgtext+0avgdata 3084maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.05elapsed 0%CPU (0avgtext+0avgdata 3224maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 3528maxresident)k 0inputs+0outputs (0major+488minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.13elapsed 0%CPU (0avgtext+0avgdata 3592maxresident)k 0inputs+0outputs (0major+491minor)pagefaults 0swaps
# system
# bigfile: 0.00user 0.00system 0:00.95elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.11elapsed 0%CPU (0avgtext+0avgdata 3584maxresident)k 0inputs+0outputs (0major+494minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.10elapsed 0%CPU (0avgtext+0avgdata 3524maxresident)k 0inputs+0outputs (0major+492minor)pagefaults 0swapsGreatly reduced time to first paint and memory usage (41x) for large files. Verified that the overhead is constant with respect to input size. |
|
GNU testsuite comparison: |
310d807 to
686f4b3
Compare
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
347c086 to
022ed0f
Compare
|
GNU testsuite comparison: |
022ed0f to
f138ac5
Compare
| pub fn uumain(args: impl uucore::Args) -> UResult<()> { | ||
| let _guard = TerminalGuard; | ||
|
|
||
| // Disable raw mode before exiting if a panic occurs | ||
| set_hook(Box::new(|panic_info| { | ||
| terminal::disable_raw_mode().unwrap(); | ||
| print!("\r"); | ||
| println!("{panic_info}"); | ||
| })); |
There was a problem hiding this comment.
logic for modifying the terminal has been moved from uumain to more.
|
GNU testsuite comparison: |
|
@tertsdiepraam could you please review this PR? |
|
GNU testsuite comparison: |
|
Also, can you please squash the multiple commits in just one or two implementation steps ? |
Could the PR be squashed into one commit during merge? |
dd0da28 to
81ca89a
Compare
|
GNU testsuite comparison: |
81ca89a to
3c91ece
Compare
|
GNU testsuite comparison: |
3c91ece to
37a94e1
Compare
|
GNU testsuite comparison: |
|
I am facing some issues with running tests locally even though they pass in the test runner. Please give me some more time to fix it. |
37a94e1 to
bfcfb04
Compare
|
GNU testsuite comparison: |
a9d4a8f to
d4bc6b3
Compare
|
I have "fixed" the tests by migrating them to |
| enum OutputType { | ||
| Tty(Stdout), | ||
| Pipe(Box<dyn Write>), | ||
| #[cfg(test)] | ||
| Test(Vec<u8>), | ||
| } |
There was a problem hiding this comment.
This enum is used to distinguish between stdout types and provide a way for tests to verify output.
|
GNU testsuite comparison: |
Do you have the results of re-running the benchmark with your latest changes? |
yes, I've updated it in the previous comment here: #7765 (comment) |
|
I've just finished reviewing. I did not see any code smell, it's very readable imo and However, this is a big refactor, and I have a pretty low confidence level in my own competence to review this. |
There was a problem hiding this comment.
Most of this looks really good. Thanks! You've crammed a lot into this PR (and basically 1 commit), which makes it hard to review. I've got some comments, but we can probably merge this after those have been fixed. Next time, please split it up over multiple PRs.
| (Some(n), _) | (None, Some(n)) if n > 0 => Some(n + 1), | ||
| _ => None, // Use terminal height |
There was a problem hiding this comment.
Next time, please separate cleanup from actual changes. That'll make it much easier to review.
src/uu/more/src/more.rs
Outdated
| print!("\r"); | ||
| stdout.flush().unwrap(); | ||
| #[cfg(test)] | ||
| impl Deref for OutputType { |
There was a problem hiding this comment.
This seems to involved for deref, I'd rather have a unwrap_test method or something more explicit like that. Only implementing this for cfg(test) also makes me a bit worried that the behaviour might change inside and outside of tests.
There was a problem hiding this comment.
I've moved this into mod tests so it should not change the original implementation. Besides, the unreachable macro should catch errors if not used appropriately.
| } | ||
| } | ||
|
|
||
| #[cfg(target_os = "fuchsia")] |
There was a problem hiding this comment.
Did you remove this intentionally? Shouldn't we keep it?
There was a problem hiding this comment.
You're right, I've added it back.
src/uu/more/src/more.rs
Outdated
| lines: Vec<String>, // Lines read from the input | ||
| cumulative_line_sizes: Vec<u64>, // Cumulative byte sizes of lines | ||
| upper_mark: usize, // Current line at the top of the screen | ||
| content_rows: usize, // Number of rows that fit on the screen | ||
| eof_reached: bool, | ||
| lines_squeezed: usize, // Number of lines squeezed out in the current view |
There was a problem hiding this comment.
These comments should probably be docstrings.
| reset_term()?; | ||
| std::process::exit(0); |
There was a problem hiding this comment.
This is for another PR, but it might be nice to never exit explicitly, so that we always reset via the drop guard.
| /// Process user input events until exit | ||
| fn process_events(&mut self, options: &Options) -> UResult<()> { | ||
| loop { | ||
| if !event::poll(Duration::from_millis(100))? { |
There was a problem hiding this comment.
Why did you change the number of millis?
There was a problem hiding this comment.
The original 10ms polling rate seems excessive for manual navigation. Wdyt?
| time = { version = "0.3.36" } | ||
| unicode-segmentation = "1.11.0" | ||
| unicode-width = "0.2.0" | ||
| utf-8 = "0.7.6" |
There was a problem hiding this comment.
Was this already unused? I can't find where you removed the usage of it.
There was a problem hiding this comment.
When I added the tempfile crate into uumore's Cargo file using cargo add, it automatically removed the utf-8 crate (along with selinux-sys) from the main Cargo file. I assumed it was cleaning up unused crates, but I can add it back if needed.
src/uu/more/src/more.rs
Outdated
| // Ensure we have enough lines loaded for display | ||
| self.read_until_line(self.upper_mark + self.content_rows)?; |
There was a problem hiding this comment.
Does this take the squeezed lines into account?
There was a problem hiding this comment.
No, it does not. That is a good point. I will move this into the loop
src/uu/more/src/more.rs
Outdated
| // Display lines until we've filled the screen | ||
| let mut lines_printed = 0; | ||
| let mut index = self.upper_mark; | ||
| while lines_printed < self.content_rows && index < self.lines.len() { |
There was a problem hiding this comment.
This looks like it could be a for loop over the index.
There was a problem hiding this comment.
The main loop condition is lines_printed < self.content_rows. We may not always read until self.lines.len(). The index variable is used after the loop to check for EOF.
There was a problem hiding this comment.
I've refactored the loop; hope it is clearer now.
d4bc6b3 to
7f5e2b9
Compare
7f5e2b9 to
c8c4f52
Compare
|
GNU testsuite comparison: |
close #6397
This PR improves
uu_moreby achieving constant memory overhead during initialization. Previously (#7680), we read the entire file once and store byte positions. We now do so incrementally using the following techniques:InputTypeenum shown above.Stdinalready implementsBufReadand we need the underlyingFiletype to get the file size for displaying the status.Pagerobject, we scan the only up to the start line (options.from_line). We store the lines read inlines, and the byte positions incumulative_line_sizesto display the file navigation status.from_linetakes precedence over pattern which is in line withmore. This means that if the specified pattern only exists beforefrom_line, it will not be found.self.lines. This means we scan the entire file at most once.self.upper_mark + self.content_rowshave been read into memory. Then, we iterate over the relevant indexes inself.linesto print.test_more.rstomore.rssince we cannot easily extract stdout from crossterm'sAlternateScreen.less.