I/O: `.lines()` iterator is slower than a manual loop over `.read_line()` due to allocations

[`BufRead::lines()`](https://doc.rust-lang.org/stable/std/io/trait.BufRead.html#method.lines) is a convenient way to read input line by line. However, because it is an `Iterator` whose items must be owned (which is what allows `.collect()`ing them), it yields every line as a separate, newly heap-allocated `String`.

A manual loop over [`BufRead::read_line()`](https://doc.rust-lang.org/stable/std/io/trait.BufRead.html#method.read_line) is significantly faster (about 1.36x in the benchmark below) because it can reuse a single buffer for all lines, at the cost of being more verbose.

Code using `.lines()`:

```rust
    for line in reader.lines() {
        let line = line?; // each item is an `io::Result<String>` holding a freshly allocated `String`
        // TODO: process the line
    }
```

Manual `read_line` loop reusing the same buffer:

```rust
    let mut line = String::new();
    // `read_line` appends to the buffer and returns the number of bytes read; 0 means end of file
    while reader.read_line(&mut line)? != 0 {
        // TODO: process the line (unlike with `.lines()`, the trailing newline is still included)
        line.clear(); // clear the buffer after each line, or we'll end up with the whole file in memory!
    }
```
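
For anyone who wants to try both approaches side by side, here is a minimal self-contained sketch. The function names `total_len_with_lines` and `total_len_with_read_line` are made up for illustration and are not taken from the benchmark code linked below. Summing the line lengths is just a stand-in for real processing. The second function also strips the line terminator by hand, since `read_line` keeps it while `.lines()` removes it.

```rust
use std::io::{self, BufRead};

/// Sums the lengths of all lines using the `.lines()` iterator.
/// Every iteration produces a freshly allocated `String`.
fn total_len_with_lines<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut total = 0;
    for line in reader.lines() {
        let line = line?; // propagate I/O and UTF-8 errors
        total += line.len();
    }
    Ok(total)
}

/// Sums the lengths of all lines using `read_line` and one reused buffer.
fn total_len_with_read_line<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut total = 0;
    let mut line = String::new();
    while reader.read_line(&mut line)? != 0 {
        // `read_line` keeps the terminator; strip "\n" or "\r\n" to match `.lines()`.
        let content = match line.strip_suffix('\n') {
            Some(rest) => rest.strip_suffix('\r').unwrap_or(rest),
            None => line.as_str(), // last line without a trailing newline
        };
        total += content.len();
        line.clear(); // keeps the allocated capacity, drops the contents
    }
    Ok(total)
}
```

Both functions accept anything implementing `BufRead`, so they can be called with a `BufReader` wrapping a `File`, or with an in-memory `std::io::Cursor` for quick experiments.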

Benchmark results on the plain text version of [this public domain book](https://www.gutenberg.org/ebooks/71375), with its contents repeated 50 times to make a larger input file:

```
Benchmark 1: target/release/lines ~/repeated_book
  Time (mean ± σ):     132.0 ms ±  10.8 ms    [User: 122.9 ms, System: 9.0 ms]
  Range (min … max):   118.6 ms … 161.0 ms    100 runs
 
Benchmark 2: target/release/read_line ~/repeated_book
  Time (mean ± σ):      97.2 ms ±  10.4 ms    [User: 87.9 ms, System: 9.2 ms]
  Range (min … max):    90.5 ms … 131.8 ms    100 runs
 
Summary
  'target/release/read_line ~/repeated_book' ran
    1.36 ± 0.18 times faster than 'target/release/lines ~/repeated_book'
```

Exact code used for benchmarks: 
https://github.com/Shnatsel/fast-io-cookbook/blob/82ae9bd106002c4d7ffeb6f263b9ae8c05ffe95e/src/bin/lines.rs
https://github.com/Shnatsel/fast-io-cookbook/blob/82ae9bd106002c4d7ffeb6f263b9ae8c05ffe95e/src/bin/read_line.rs
