Skip to content

branchless .filter(_).count()#39107

Merged
bors merged 4 commits into
rust-lang:masterfrom
llogiq:branchless_filter_count
Feb 5, 2017
Merged

branchless .filter(_).count()#39107
bors merged 4 commits into
rust-lang:masterfrom
llogiq:branchless_filter_count

Conversation

@llogiq

@llogiq llogiq commented Jan 16, 2017

Copy link
Copy Markdown
Contributor

I found that the branchless version is only slower if we have little to no branch misses, which usually isn't the case. I notice speedups between -5% (perfect prediction) and 60% (real world data).

@rust-highfive

Copy link
Copy Markdown
Contributor

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

Comment thread src/libcore/iter/mod.rs Outdated
// branchless count
c += (&mut predicate)(&x) as usize;
}
c

@bluss bluss Jan 17, 2017

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one doesn't want to argue style for trivial things, but I find the code cramped when simple should work well:

fn count(mut self) -> usize {
    // Attempt to produce branchless count if possible
    let mut count = 0;
    for elt in self.iter {
        count += (self.predicate)(&elt) as usize;
    }
    count
}

The surrounding methods use .by_ref() which seems archaic (just &mut self.iter would be fine if mutable reference only was required)

@llogiq llogiq Jan 17, 2017

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will change shortly (perhaps also change the other methods en passant).

@bluss bluss left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implementation looks good to me. Will wait for others to see if they agree with this special case.

@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jan 17, 2017
@sfackler

Copy link
Copy Markdown
Member

Thoughts @alexcrichton @aturon? I feel somewhat conflicted.

@alexcrichton

Copy link
Copy Markdown
Member

Seems fine to me, although it'd be nice to have some tests specifically for this combinator now that we're specializing the implementation.

@BurntSushi

Copy link
Copy Markdown
Member

Only one of the jobs failed, but the error looks related. Strange.

error[E0369]: binary operation `%` cannot be applied to type `&&{integer}`
   --> /checkout/src/libcore/../libcoretest/iter.rs:197:37
    |
197 |     assert_eq!(xs.iter().filter(|x| x % 2 == 0).count(), 5);
    |                                     ^
    |
note: an implementation of `std::ops::Rem` might be missing for `&&{integer}`
   --> /checkout/src/libcore/../libcoretest/iter.rs:197:37
    |
197 |     assert_eq!(xs.iter().filter(|x| x % 2 == 0).count(), 5);
    |       

@BurntSushi

Copy link
Copy Markdown
Member

I think my concern with this is not the specific change, but rather, what's to stop someone from removing this specialization in the future because their particular data runs faster without it? Is a comment documenting it enough?

@bluss

bluss commented Jan 24, 2017

Copy link
Copy Markdown
Contributor

Only one job actually compiles rust and runs the tests, so that's expected.

@BurntSushi Do you mean that if this is added, that it's part of guaranteed behaviour? I don't think it should be, and it should be possible to reevaluate changes like this.

@BurntSushi

Copy link
Copy Markdown
Member

@bluss No, I mean, what's to prevent someone, say, a year from now submitting a PR that removes this specialization for similar reasons that @llogiq provided? I'm not referring to the behavior, but rather, the state of adding/removing optimizations like this?

I suppose the best answer to my concern is an executable benchmark, but I'm not sure we have the infrastructure in place for that?

(And maybe this concern just doesn't matter much in practice.)

@alexcrichton

Copy link
Copy Markdown
Member

@BurntSushi optimizations like this can be removed/added as we see fit, I don't think we're making a binding contract to implement this method exactly this way, so we're still open to future improvements if necessary.

In any case this looks good to go, so

@bors: r+

@bors

bors commented Feb 4, 2017

Copy link
Copy Markdown
Collaborator

📌 Commit bfabe81 has been approved by alexcrichton

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Feb 5, 2017
…lexcrichton

branchless .filter(_).count()

I found that the branchless version is only slower if we have little to no branch misses, which usually isn't the case. I notice speedups between -5% (perfect prediction) and 60% (real world data).
frewsxcv added a commit to frewsxcv/rust that referenced this pull request Feb 5, 2017
…lexcrichton

branchless .filter(_).count()

I found that the branchless version is only slower if we have little to no branch misses, which usually isn't the case. I notice speedups between -5% (perfect prediction) and 60% (real world data).
bors added a commit that referenced this pull request Feb 5, 2017
@bors bors merged commit bfabe81 into rust-lang:master Feb 5, 2017
@bors

bors commented Feb 5, 2017

Copy link
Copy Markdown
Collaborator

⌛ Testing commit bfabe81 with merge 9c8cdb2...

@llogiq llogiq deleted the branchless_filter_count branch February 5, 2017 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants